Improving Deterministic Air Quality Forecasts Using Supervised Machine Learning: A Feasibility Study

By: Lech Łobocki
Open Access | Dec 2025

Figures & Tables

Figure 1.

Location of the air-quality monitoring stations used in this study. The model grid is marked with grey lines. Station codes are explained in Table 1.

Figure 2.

GEM-AQ predictions vs. observations at six air quality monitoring stations in Warsaw, during the test period in 2024

Figure 3.

The hybrid deterministic-SML predictions vs. observations at six air quality monitoring stations in Warsaw, during the test period in 2024. All the features listed in Section II were used for training

Figure 4.

Histograms of residuals (errors) of the hybrid deterministic-SML predictions at six air-quality monitoring stations in Warsaw during the test period in 2024

Figure 5.

Histograms of residuals (errors) of the deterministic (GEM-AQ) predictions at six air-quality monitoring stations in Warsaw during the test period in 2024
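
A residual histogram of this kind can be sketched as follows. This is a minimal illustration, not the author's plotting code: the function name `residual_histogram`, the sign convention (prediction minus observation), the 1-unit bin width, and the sample values are all assumptions made for the example.

```python
import numpy as np

def residual_histogram(obs, pred, bin_width=1.0):
    """Bin residuals (pred - obs, an assumed sign convention) into fixed-width bins."""
    res = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    # Snap the outer edges to multiples of the bin width.
    lo = np.floor(res.min() / bin_width) * bin_width
    hi = np.ceil(res.max() / bin_width) * bin_width
    if hi == lo:  # all residuals identical: keep at least one bin
        hi = lo + bin_width
    edges = np.arange(lo, hi + bin_width / 2, bin_width)
    counts, edges = np.histogram(res, bins=edges)
    return counts, edges

# Hypothetical concentrations, for illustration only.
obs = [10.0, 12.0, 15.0, 20.0]
pred = [11.5, 12.2, 13.0, 20.4]
counts, edges = residual_histogram(obs, pred)
```

A narrow histogram centred on zero indicates small, unbiased errors; a shifted or skewed one points to systematic over- or under-prediction.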

Figure 6.

Quantile-Quantile comparison of distributions of the hybrid deterministic-SML predictions vs. observations at six air quality monitoring stations in Warsaw, during the test period in 2024

Figure 7.

Quantile-Quantile comparison of distributions of the deterministic GEM-AQ predictions vs. observations at six air quality monitoring stations in Warsaw, during the test period in 2024
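
The quantile-quantile comparisons in Figures 6 and 7 pair equal-probability quantiles of the predicted and observed concentrations. The sketch below shows one way to compute such pairs. The function name `qq_points`, the choice of 99 percentiles, and the synthetic data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def qq_points(obs, pred, n_quantiles=99):
    """Return matching quantiles of observed and predicted values."""
    probs = np.linspace(0.01, 0.99, n_quantiles)
    q_obs = np.quantile(np.asarray(obs, dtype=float), probs)
    q_pred = np.quantile(np.asarray(pred, dtype=float), probs)
    return probs, q_obs, q_pred

# Synthetic example: predictions that compress the observed range by half.
rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=8.0, size=1000)
pred = 0.5 * obs
probs, q_obs, q_pred = qq_points(obs, pred)
# Plotting q_pred against q_obs gives the Q-Q curve; points on the 1:1 line
# mean the two distributions agree at that quantile.
```

In this synthetic case the points fall on a line of slope 0.5, which is the signature of a forecast that underestimates the spread of the observations.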

Figure 8.

Observed and predicted PM2.5 concentrations during the first 300 hours of the test period. OBS, observations; P-ML, prediction using hybrid deterministic-SML; P-D, deterministic forecast only

Location of air quality monitoring stations used in this study*

| Station Code | Address | Longitude [°E] (WGS84) | Latitude [°N] (WGS84) |
|---|---|---|---|
| PL0140A | Warszawa, Al. Niepodległości 227/233 | 21.004724 | 52.219298 |
| PL0141A | Warszawa, ul. Wokalna 1 | 21.033819 | 52.160772 |
| PL0143A | Warszawa, ul. Kondratowicza 8 | 21.042458 | 52.290864 |
| PL0308A | Warszawa, ul. Tołstoja 2 | 20.933018 | 52.285073 |
| PL0717A | Warszawa, ul. Bajkowa 17/21 | 21.176233 | 52.188474 |
| PL0739A | Warszawa, ul. Chróścickiego 16/18 | 20.906073 | 52.207742 |

Performance metrics of statistical forecasts using RF regression model, without consideration of meteorological variables in RF*

| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 0.0592 | 0.0099 | 0.99996 | 0.999982 |
| PL0141A | ul. Wokalna | 0.0397 | 0.0066 | 0.99998 | 0.999991 |
| PL0143A | ul. Kondratowicza | 0.0287 | 0.0059 | 0.99999 | 0.999995 |
| PL0308A | ul. Tołstoja | 0.0382 | 0.0089 | 0.99999 | 0.999995 |
| PL0717A | ul. Bajkowa | 0.0903 | 0.0179 | 0.99995 | 0.999975 |
| PL0739A | ul. Chróścickiego | 0.2305 | 0.0161 | 0.99937 | 0.999693 |
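
The four scores reported in the performance tables on this page (RMSE, MAE, coefficient of determination, Pearson correlation coefficient) can be computed as in the sketch below. It uses the standard textbook definitions; whether the paper used exactly these formulas or a library routine is an assumption, and the function name `forecast_metrics` and the sample values are hypothetical.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """Standard-definition RMSE, MAE, R² and Pearson r (assumed, not confirmed by the source)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    ss_res = float(np.sum(err ** 2))                  # residual sum of squares
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))   # total sum of squares
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": 1.0 - ss_res / ss_tot,
        "pcc": float(np.corrcoef(obs, pred)[0, 1]),
    }

# Hypothetical data: a forecast that is perfectly correlated but biased.
obs = np.array([10.0, 20.0, 30.0, 40.0])
pred = 0.5 * obs + 20.0
metrics = forecast_metrics(obs, pred)
```

Under these definitions R² penalises bias and scale errors while r does not, so the example yields r = 1 but R² = 0.3. This may be one reason why, in the GEM-AQ table, the reported R² values are well below the square of the corresponding r.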

Performance metrics of deterministic-statistical forecasts using the GB regression model*

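| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 0.133 | 0.085 | 0.9998 | 0.9999 |
| PL0141A | ul. Wokalna | 0.115 | 0.069 | 0.9998 | 0.9999 |
| PL0143A | ul. Kondratowicza | 0.111 | 0.070 | 0.9999 | 0.9999 |
| PL0308A | ul. Tołstoja | 0.148 | 0.099 | 0.9999 | 0.9999 |
| PL0717A | ul. Bajkowa | 0.155 | 0.098 | 0.9999 | 0.9999 |
| PL0739A | ul. Chróścickiego | 0.186 | 0.080 | 0.9996 | 0.9998 |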

Performance metrics of statistical forecasts using RF regression model, based upon the deterministic meteorological forecast (without consideration of the model-predicted PM2.5 concentration)*

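| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 9.12 | 6.70 | 0.154 | 0.461 |
| PL0141A | ul. Wokalna | 8.29 | 6.19 | 0.198 | 0.497 |
| PL0143A | ul. Kondratowicza | 8.91 | 6.40 | 0.070 | 0.471 |
| PL0308A | ul. Tołstoja | 11.64 | 8.59 | 0.129 | 0.464 |
| PL0717A | ul. Bajkowa | 11.39 | 7.90 | 0.198 | 0.547 |
| PL0739A | ul. Chróścickiego | 8.890 | 6.43 | 0.066 | 0.496 |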

Performance metrics of the deterministic forecasts done using GEM-AQ*

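| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 9.15 | 6.19 | 0.0585 | 0.5152 |
| PL0141A | ul. Wokalna | 8.39 | 5.86 | 0.0855 | 0.5236 |
| PL0143A | ul. Kondratowicza | 8.92 | 5.94 | 0.0325 | 0.5497 |
| PL0308A | ul. Tołstoja | 11.5 | 8.11 | 0.0978 | 0.5343 |
| PL0717A | ul. Bajkowa | 9.11 | 5.94 | 0.2719 | 0.6071 |
| PL0739A | ul. Chróścickiego | 8.30 | 5.54 | 0.0205 | 0.6077 |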

Performance metrics of deterministic-statistical forecasts using the XGBoost regression model*

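| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 0.432 | 0.137 | 0.9981 | 0.9991 |
| PL0141A | ul. Wokalna | 0.395 | 0.122 | 0.9982 | 0.9991 |
| PL0143A | ul. Kondratowicza | 0.472 | 0.117 | 0.9974 | 0.9987 |
| PL0308A | ul. Tołstoja | 0.444 | 0.151 | 0.9987 | 0.9994 |
| PL0717A | ul. Bajkowa | 0.951 | 0.295 | 0.9944 | 0.9972 |
| PL0739A | ul. Chróścickiego | 0.510 | 0.189 | 0.9969 | 0.9985 |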

Performance metrics of deterministic-statistical forecasts using the RF regression model*

| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 0.0712 | 0.0123 | 0.99995 | 0.99997 |
| PL0141A | ul. Wokalna | 0.0574 | 0.0096 | 0.99996 | 0.99998 |
| PL0143A | ul. Kondratowicza | 0.0362 | 0.0079 | 0.99998 | 0.99999 |
| PL0308A | ul. Tołstoja | 0.0534 | 0.0132 | 0.99998 | 0.99999 |
| PL0717A | ul. Bajkowa | 0.0936 | 0.0216 | 0.99995 | 0.99997 |
| PL0739A | ul. Chróścickiego | 0.2685 | 0.0199 | 0.99915 | 0.99958 |

Performance metrics of deterministic-statistical forecasts using the SVR model*

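| Station ID | Location | RMSE | MAE | CoD (R²) | PCC (r) |
|---|---|---|---|---|---|
| PL0140A | Al. Niepodległości | 1.339 | 0.471 | 0.9818 | 0.9913 |
| PL0141A | ul. Wokalna | 1.445 | 0.503 | 0.9755 | 0.9885 |
| PL0143A | ul. Kondratowicza | 1.219 | 0.432 | 0.9826 | 0.9914 |
| PL0308A | ul. Tołstoja | 2.060 | 0.675 | 0.9727 | 0.9867 |
| PL0717A | ul. Bajkowa | 1.791 | 0.660 | 0.9802 | 0.9909 |
| PL0739A | ul. Chróścickiego | 1.614 | 0.525 | 0.9692 | 0.9853 |
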
DOI: https://doi.org/10.2478/oszn-2025-0018 | Journal eISSN: 2353-8589 | Journal ISSN: 1230-7831
Language: English
Page range: 15 - 29
Published on: Dec 31, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Lech Łobocki, published by National Research Institute, Institute of Environmental Protection
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.