Abstract
Background: As urban populations grow, pollution increases with them. Air pollution is one of the most challenging environmental issues facing smart cities.
Objective: Real-time monitoring of air quality can help administrators make appropriate decisions in time. Advances in Internet of Things (IoT) based sensors have changed how air quality is monitored.
Methods: In this paper, we apply a two-stage regression approach. In the first stage, ten regression algorithms (Decision Tree, Random Forest, Elastic Net, AdaBoost, Extra Tree, Linear Regression, Lasso, XGBoost, Light GBM, and Multi-Layer Perceptron) are applied. In the second stage, the best four algorithms are selected and stacking ensemble algorithms are applied to them, using Python, to predict the PM2.5 concentration in the air. Datasets for five Chinese cities (Beijing, Chengdu, Guangzhou, Shanghai, and Shenyang) are considered, and the models are compared on MAE (Mean Absolute Error), RMSE (Root Mean Square Error), and R² (coefficient of determination).
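The two-stage procedure described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the five-city dataset, includes only the scikit-learn subset of the ten regressors, and ranks candidates by RMSE (the selection criterion is an assumption). All hyperparameters and the names `evaluate`, `candidates`, and `top4` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the meteorological features and PM2.5 target (assumption).
X, y = make_regression(n_samples=400, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def evaluate(model):
    """Fit on the training split; return (MAE, RMSE, R2) on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mae = mean_absolute_error(y_test, pred)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return mae, rmse, r2_score(y_test, pred)


# Stage 1: score each candidate regressor individually.
candidates = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesRegressor(n_estimators=50, random_state=0),
    "adaboost": AdaBoostRegressor(random_state=0),
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),
    "elastic_net": ElasticNet(alpha=0.1),
}
stage1 = {name: evaluate(model) for name, model in candidates.items()}

# Stage 2: keep the four candidates with the lowest RMSE and stack them.
# For brevity the held-out split is reused for selection; a separate
# validation split would avoid an optimistic bias.
top4 = sorted(stage1, key=lambda name: stage1[name][1])[:4]
stack = StackingRegressor(
    estimators=[(name, candidates[name]) for name in top4],
    final_estimator=LinearRegression(),  # meta-learner choice is an assumption
    cv=5,
)
stack_scores = evaluate(stack)

for name in top4:
    print(name, stage1[name])
print("stacked", stack_scores)
```

`StackingRegressor` clones the base estimators and trains the meta-learner on their cross-validated predictions, so the stage-1 fits are not reused. XGBoost, Light GBM, and a Multi-Layer Perceptron could be added to `candidates` in the same way if those libraries are installed.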
Results: Of the ten regression algorithms applied, the Extra Tree algorithm performed best on all five datasets, and stacking improved performance further.
Conclusion: Feature importance for Shenyang and Beijing was computed using three regression algorithms; the four most important features were humidity, wind speed, wind direction, and dew point.
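One way to compute feature importance with several regressors is to average the impurity-based importances of tree models, as sketched below. The abstract does not name the three algorithms used, so the three tree-based regressors here are an assumption, as are the `temperature` and `pressure` columns. The data are synthetic and built so that the first four columns drive the target; the output therefore illustrates the mechanics only and does not reproduce the paper's finding.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Column names: the first four come from the abstract; the last two are
# hypothetical fillers for illustration.
feature_names = ["humidity", "wind_speed", "wind_direction",
                 "dew_point", "temperature", "pressure"]

# Synthetic data: by construction only the first four columns affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(feature_names)))
y = (3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.5 * X[:, 2] + 1.0 * X[:, 3]
     + rng.normal(scale=0.1, size=500))

# Three tree-based regressors (an assumed choice, not the paper's).
models = [
    ExtraTreesRegressor(n_estimators=100, random_state=0),
    RandomForestRegressor(n_estimators=100, random_state=0),
    DecisionTreeRegressor(random_state=0),
]

# Each model's impurity-based importances sum to 1; average them across models.
importances = np.mean([m.fit(X, y).feature_importances_ for m in models], axis=0)

# Rank features from most to least important.
ranking = [feature_names[i] for i in np.argsort(importances)[::-1]]
top4 = ranking[:4]
print(dict(zip(feature_names, importances.round(3))))
print("top four:", top4)
```

Impurity-based importances can favor high-cardinality features; permutation importance on a held-out split is a common cross-check.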
Keywords: AQI, regression, deep learning, imputation techniques, PM2.5, machine learning, ensemble learning, IoT, smart city.