Robust Machine Learning Algorithmic Rules for Detecting Air Pollution in the Lower Parts of the Atmosphere

Open Access
|Sep 2025

References

  1. Anser, M.K., Ali, S., Mansoor, A., ur Rahman, S., Lodhi, M.S., Naseem, I. and Zaman, K. (2024) ‘Deciphering the dynamics of human-environment interaction in China: Insights into renewable energy, sustainable consumption patterns, and carbon emissions’, Sustainable Futures, 7, p. 100184. Available at: 10.1016/j.sftr.2024.100184
  2. Cai, W., Xu, X., Cheng, X., Wei, F., Qiu, X. and Zhu, W. (2020) ‘Impact of “blocking” structure in the troposphere on the wintertime persistent heavy air pollution in northern China’, Science of The Total Environment, 741, p. 140325. Available at: 10.1016/j.scitotenv.2020.140325
  3. Chapmann, J. (2017) Machine Learning: Fundamental Algorithms for Supervised and Unsupervised Learning With Real-World Applications (Advanced Data Analytics). CreateSpace Independent Publishing Platform.
  4. Cheng, C., Messerschmidt, L., Bravo, I., Waldbauer, M., Bhavikatti, R., Schenk, C., Grujic, V., Model, T., Kubinec, R. and Barceló, J. (2024) ‘A general primer for data harmonization’, Scientific data, 11(1), p. 152. Available at: 10.1038/s41597-024-02956-3
  5. Chi, Y., Fan, M., Zhao, C., Yang, Y., Fan, H., Yang, X., Yang, J. and Tao, J. (2022) ‘Machine learning-based estimation of ground-level NO2 concentrations over China’, Science of the Total Environment, 807, p. 150721. Available at: 10.1016/j.scitotenv.2021.150721
  6. Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) ‘Maximum likelihood from incomplete data via the EM algorithm’, Journal of the Royal Statistical Society: Series B (Methodological), 39(1), pp. 1–22. Available at: 10.1111/j.2517-6161.1977.tb01600.x
  7. Diaz-de Arcaya, J., Garcia-Perez, A., Bonilla, L., Miñón, R. and Torre-Bastida, A.I. (2025) ‘Data harmonization as a keystone for data spaces: Challenges, techniques, and future trends’, in 2025 10th International Conference on Smart and Sustainable Technologies (SpliTech), IEEE, pp. 1–6. Available at: 10.23919/SpliTech65624.2025.11091719
  8. Du, J., Qiao, F., Lu, P. and Yu, L. (2022) ‘Forecasting ground-level ozone concentration levels using machine learning’, Resources, Conservation and Recycling, 184, p. 106380. Available at: 10.1016/j.resconrec.2022.106380
  9. Edo, G.I., Itoje-akpokiniovo, L.O., Obasohan, P., Ikpekoro, V.O., Samuel, P.O., Jikah, A.N., Nosu, L.C., Ekokotu, H.A., Ugbune, U., Oghroro, E.E.A. et al. (2024) ‘Impact of environmental pollution from human activities on water, air quality and climate change’, Ecological Frontiers, 44(5), pp. 874–889. Available at: 10.1016/j.ecofro.2024.02.014
  10. Feng, H., Zou, B., Wang, J. and Gu, X. (2019) ‘Dominant variables of global air pollution-climate interaction: Geographic insight’, Ecological Indicators, 99, pp. 251–260. Available at: 10.1016/j.ecolind.2018.12.038
  11. Feng, X., Wei, S. and Wang, S. (2020) ‘Temperature inversions in the atmospheric boundary layer and lower troposphere over the Sichuan Basin, China: Climatology and impacts on air pollution’, Science of the Total Environment, 726, p. 138579. Available at: 10.1016/j.scitotenv.2020.138579
  12. Global Alliance on Health and Pollution (2023) ‘Global alliance on health and pollution’. Available at: https://www.gahp.org/
  13. Goncharenko, L.P., Harvey, V.L., Liu, H. and Pedatella, N.M. (2021) ‘Sudden stratospheric warming impacts on the ionosphere–thermosphere system: A review of recent progress’, Ionosphere dynamics and applications, pp. 369–400. Available at: 10.1002/9781119815617.ch16
  14. Hirschfeld, H.O. (1935) ‘A connection between correlation and contingency’, Mathematical Proceedings of the Cambridge Philosophical Society, 31(4), pp. 520–524. Available at: 10.1017/S0305004100013517
  15. Hsu, C.Y., Soo, J.C., Lin, S.L., Wu, C.D., Chi, K.H., Hsu, W.C., Tseng, C.C. and Chen, Y.C. (2023) ‘Using cluster algorithms with a machine learning technique and PMF models to quantify local-specific origins of PM2.5 and associated metals in Taiwan’, Environmental Pollution, 316, p. 120652. Available at: 10.1016/j.envpol.2022.120652
  16. Janches, D., Berezhnoy, A.A., Christou, A.A., Cremonese, G., Hirai, T., Horányi, M., Jasinski, J.M. and Sarantos, M. (2021) ‘Meteoroids as one of the sources for exosphere formation on airless bodies in the inner solar system’, Space Science Reviews, 217, 50, pp. 1–41. Available at: 10.1007/s11214-021-00827-6
  17. Kogan, J. (2007) Introduction to Clustering Large and High-Dimensional Data. Cambridge University Press.
  18. Komorowski, M., Marshall, D.C., Salciccioli, J.D. and Crutain, Y. (2016) ‘Exploratory data analysis’, Secondary Analysis of Electronic Health Records, pp. 185–203. Available at: 10.1007/978-3-319-43742-2_15
  19. Laštovička, J. (2023) ‘Progress in investigating long-term trends in the mesosphere, thermosphere, and ionosphere’, Atmospheric Chemistry and Physics, 23(10), pp. 5783–5800. Available at: 10.5194/acp-23-5783-2023
  20. Li, G. and Jung, J.J. (2023) ‘Deep learning for anomaly detection in multivariate time series: Approaches, applications, and challenges’, Information Fusion, 91, pp. 93–102. Available at: 10.1016/j.inffus.2022.10.008
  21. Li, Z., Zhu, Y. and Van Leeuwen, M. (2023) ‘A survey on explainable anomaly detection’, ACM Transactions on Knowledge Discovery from Data, 18(1), pp. 1–54. Available at: 10.1145/3609333
  22. Liang, R., Huang, C., Zhang, C., Li, B., Saydam, S. and Canbulat, I. (2023) ‘The fusion of data visualisation and data analytics in the process of mining digitalisation’, IEEE Access, 11, pp. 40608–40628. Available at: 10.1109/ACCESS.2023.3267813
  23. Lin, C., Labzovskii, L.D., Mak, H.W.L., Fung, J.C., Lau, A.K., Kenea, S.T., Bilal, M., Vande, H.J.D., Lu, X. and Ma, J. (2020) ‘Observation of PM2.5 using a combination of satellite remote sensing and low-cost sensor network in Siberian urban areas with limited reference monitoring’, Atmospheric Environment, 227, p. 117410. Available at: 10.1016/j.atmosenv.2020.117410
  24. Liu, X., Lu, D., Zhang, A., Liu, Q. and Jiang, G. (2022a) ‘Data-driven machine learning in environmental pollution: gains and problems’, Environmental Science & Technology, 56(4), pp. 2124–2133. Available at: 10.1021/acs.est.1c06157
  25. Liu, Y., Tong, D., Cheng, J., Davis, S.J., Yu, S., Yarlagadda, B., Clarke, L.E., Brauer, M., Cohen, A.J., Kan, H. et al. (2022b) ‘Role of climate goals and clean-air policies on reducing future air pollution deaths in China: a modelling study’, The Lancet Planetary Health, 6(2), pp. e92–e99. Available at: https://www.osti.gov/servlets/purl/1855837
  26. Lloyd, S.P. (1957) ‘Least squares quantization in PCM’, Technical Report RR-5497, Bell Laboratories. Available at: https://www.stat.cmu.edu/~rnugent/PCMI2016/papers/LloydKMeans.pdf
  27. Lu, Q., Fu, H., Wang, R. and Lu, S. (2022) ‘Collisionless magnetic reconnection in the magnetosphere’, Chinese Physics B, 31(8), p. 089401. Available at: 10.1088/1674-1056/ac76ab
  28. MacQueen, J.B. (1967) ‘Some methods for classification and analysis of multivariate observations’, Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, 1, pp. 281–297.
  29. Mak, H.W.L. and Lam, Y.F. (2021) ‘Comparative assessments and insights of data openness of 50 smart cities in air quality aspects’, Sustainable Cities and Society, 69, p. 102868. Available at: 10.1016/j.scs.2021.102868
  30. Mak, H.W.L., Laughner, J.L., Fung, J.C.H., Zhu, Q. and Cohen, R.C. (2018) ‘Improved satellite retrieval of tropospheric NO2 column density via updating of air mass factor (AMF): Case study of southern China’, Remote Sensing, 10. Available at: 10.3390/rs10111789
  31. Mak, H.W.L. and Ng, D.C.Y. (2021) ‘Spatial and socio-classification of traffic pollutant emissions and associated mortality rates in high-density Hong Kong via improved data analytic approaches’, International Journal of Environmental Research and Public Health, 18(12), p. 6532. Available at: 10.3390/ijerph18126532
  32. Mumuni, A., Mumuni, F. and Gerrar, N.K. (2024) ‘A survey of synthetic data augmentation methods in machine vision’, Machine Intelligence Research, 21(5), pp. 831–869. Available at: 10.1007/s11633-022-1411-7
  33. Mwitondi, K., Al Sadig, I., Hassona, R., Taylor, C. and Yousef, A. (2018a) ‘Statistical estimate of radon concentration from passive and active detectors in Doha’, Data, 3(3). Available at: 10.3390/data3030022
  34. Mwitondi, K., Munyakazi, I. and Gatsheni, B. (2018b) ‘Amenability of the United Nations Sustainable Development Goals to big data modelling’, International Workshop on Data Science-Present and Future of Open Data and Open Science, 12–15 Nov 2018, Joint Support Centre for Data Science Research, Mishima Citizens Cultural Hall, Mishima, Shizuoka, Japan.
  35. Mwitondi, K., Munyakazi, I. and Gatsheni, B. (2020) ‘A robust machine learning approach to SDG data segmentation’, Journal of Big Data, 7(97). Available at: 10.1186/s40537-020-00373-y
  36. Mwitondi, K.S., Moustafa, R.E. and Hadi, A.S. (2013) ‘A data-driven method for selecting optimal models based on graphical visualisation of differences in sequentially fitted ROC model parameters’, Data Science Journal, 12, pp. WDS247–WDS253. Available at: 10.2481/dsj.WDS-045
  37. Mwitondi, K.S. and Said, R.A. (2013) ‘A data-based method for harmonising heterogeneous data modelling techniques across data mining applications’, Journal of Statistics Applications & Probability, 2(3), pp. 293–305. Available at: 10.12785/jsap/020312
  38. Mwitondi, K.S. and Said, R.A. (2021) ‘Dealing with Randomness and Concept Drift in Large Datasets’, Data, 6(7). Available at: 10.3390/data6070077
  39. Mwitondi, K.S. and Zargari, S.A. (2018) ‘An iterative multiple sampling method for intrusion detection’, Information Security Journal: A Global Perspective, 27(4), pp. 230–239. Available at: 10.1080/19393555.2018.1539790
  40. Newby, D.E., Mannucci, P.M., Tell, G.S., Baccarelli, A.A., Brook, R.D., Donaldson, K., Forastiere, F., Franchini, M., Franco, O.H., Graham, I. et al. (2015) ‘Expert position paper on air pollution and cardiovascular disease’, European Heart Journal, 36(2), pp. 83–93. Available at: 10.1093/eurheartj/ehu458
  41. Omrani, N.E., Keenlyside, N., Matthes, K., Boljka, L., Zanchettin, D., Jungclaus, J.H. and Lubis, S.W. (2022) ‘Coupled stratosphere-troposphere-atlantic multidecadal oscillation and its importance for near-future climate projection’, NPJ Climate and Atmospheric Science, 5(1), p. 59. Available at: 10.1038/s41612-022-00275-1
  42. Pan, Q., Harrou, F. and Sun, Y. (2023) ‘A comparison of machine learning methods for ozone pollution prediction’, Journal of Big Data, 10(1), p. 63. Available at: 10.1186/s40537-023-00748-x
  43. Pika, A., ter Hofstede, A.H., Perrons, R.K., Grossmann, G., Stumptner, M. and Cooley, J. (2021) ‘Using big data to improve safety performance: An application of process mining to enhance data visualisation’, Big Data Research, 25, p. 100210. Available at: 10.1016/j.bdr.2021.100210
  44. Ridzuan, F. and Zainon, W.M.N.W. (2022) ‘Diagnostic analysis for outlier detection in big data analytics’, Procedia Computer Science, 197, pp. 685–692. Available at: 10.1016/j.procs.2021.12.189
  45. Shafiev, T. (2024) ‘Development of a mathematical model and an efficient computational algorithm for predicting atmospheric pollution in industrial regions’, in AIP Conference Proceedings, AIP Publishing. Available at: 10.1063/5.0199817
  46. Sillmann, J., Aunan, K., Emberson, L., Büker, P., Van Oort, B., O’Neill, C., Otero, N., Pandey, D. and Brisebois, A. (2021) ‘Combined impacts of climate and air pollution on human health and agricultural productivity’, Environmental Research Letters, 16(9), p. 093004. Available at: 10.1088/1748-9326/ac1df8
  47. Soares, P.H., Monteiro, J.P., Gaioto, F.J., Ogiboski, L. and Andrade, C.M.G. (2023) ‘Use of association algorithms in air quality monitoring’, Atmosphere, 14(4), p. 648. Available at: 10.3390/atmos14040648
  48. Sun, Y. (2016) ‘The changing role of China in global environmental governance’, Rising Powers Quarterly, 1(1), pp. 43–53.
  49. United Nations (2015) ‘Sustainable Development Goals’. Available at: https://www.un.org/sustainabledevelopment/sustainable-development-goals/
  50. United Nations Environment Programme (UNEP) (2023) ‘Annual report: Keeping the promise’. Available at: https://www.unep.org/annualreport/2023
  51. Vardoulakis, S., Valiantis, M., Milner, J. and ApSimon, H. (2007) ‘Operational air pollution modelling in the UK: street canyon applications and challenges’, Atmospheric Environment, 41(22), pp. 4622–4637. Available at: 10.1016/j.atmosenv.2007.03.039
  52. Wang, Z., Wang, P., Liu, K., Wang, P., Fu, Y., Lu, C.T., Aggarwal, C.C., Pei, J. and Zhou, Y. (2024) ‘A comprehensive survey on data augmentation’, arXiv preprint arXiv:2405.09591. Available at: 10.48550/arXiv.2405.09591
  53. World Health Organization (2021) Global Quality Guidelines: Particulate Matter (PM2.5 & PM10), Ozone, Nitrogen Dioxide, Sulfur Dioxide and Carbon Monoxide, World Health Organisation, p. 290. Available at: https://www.who.int/publications/i/item/9789240034228
  54. Wu, X., Wen, Q. and Zhu, J. (2024) ‘Association rule mining with a special rule coding and dynamic genetic algorithm for air quality impact factors in Beijing, China’, PloS one, 19(3), p. e0299865. Available at: 10.1371/journal.pone.0299865
  55. Xu, M., Tian, W., Zhang, J., Screen, J.A., Zhang, C. and Wang, Z. (2023) ‘Important role of stratosphere-troposphere coupling in the arctic mid-to-upper tropospheric warming in response to sea-ice loss’, npj Climate and Atmospheric Science, 6(1), p. 9. Available at: 10.1038/s41612-023-00333-2
  56. Xu, Y., Ho, H.C., Wong, M.S., Deng, C., Shi, Y., Chan, T.C. and Knudby, A. (2018) ‘Evaluation of machine learning techniques with multiple remote sensing datasets in estimating monthly concentrations of ground-level PM2.5’, Environmental Pollution, 242, pp. 1417–1426. Available at: 10.1016/j.envpol.2018.08.029
  57. Yan, H., Cordier, M. and Uehara, T. (2024) ‘Future projections of global plastic pollution: Scenario analyses and policy implications’, Sustainability, 16(2), p. 643. Available at: 10.3390/su16020643
  58. Zhang, B., Rong, Y., Yong, R., Qin, D., Li, M., Zou, G. and Pan, J. (2022a) ‘Deep learning for air pollutant concentration prediction: A review’, Atmospheric Environment, 290, p. 119347. Available at: 10.1016/j.atmosenv.2022.119347
  59. Zhang, L. and Yang, G. (2022) ‘Cluster analysis of PM2.5 pollution in China using the frequent itemset clustering approach’, Environmental Research, 204, p. 112009. Available at: 10.1016/j.envres.2021.112009
  60. Zhang, Q., Meng, X., Shi, S., Kan, L., Chen, R. and Kan, H. (2022b) ‘Overview of particulate air pollution and human health in China: Evidence, challenges, and opportunities’, The Innovation, 3(6). Available at: 10.1016/j.xinn.2022.100312
  61. Zheng, S. and Kahn, M.E. (2017) ‘A new era of pollution progress in urban China?’, Journal of Economic Perspectives, 31(1), pp. 71–92. Available at: 10.1257/jep.31.1.71
  62. Zhou, X., Hu, Y., Liang, W., Ma, J. and Jin, Q. (2021) ‘Variational LSTM enhanced anomaly detection for industrial big data’, IEEE Transactions on Industrial Informatics, 17(5), pp. 3469–3477. Available at: 10.1109/TII.2020.3022432
  63. Zusman, E., Elder, M. and Sussman, D.D. (2020) A Clean Air Sustainable Development Goal (SDG). Singapore: Springer Nature Singapore, pp. 1–12. Available at: 10.1007/978-981-15-2527-8_50-1
Language: English
Submitted on: Nov 29, 2024 | Accepted on: Aug 28, 2025 | Published on: Sep 24, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Kassim Mwitondi, Hugo Wai Leung Mak, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.