Abstract
Sophisticated data-intensive approaches have been widely applied to air pollution problems, with applications ranging from remote sensing quantification of ground-level concentrations of atmospheric pollutants to associating particulate matter with atmospheric CO2. The biggest challenge to such applications, however, remains model optimisation, a problem that derives from inherent randomness in training, validation and test data. A standard approach to addressing data randomness hinges on data harmonisation and data augmentation, two concepts that naturally appeal to the 17 highly “non-orthogonal” Sustainable Development Goals (SDGs). This paper proposes a novel approach with built-in robust mechanisms for generating the “most parsimonious model”, one with potential “global representativeness”, to highlight data-driven solutions to regional and global environmental challenges. The proposed approach is powered by two algorithms that sequentially estimate, maximise and optimise parameters from thirty thousand ground-level air pollution data points obtained from different locations in southern China, generate statistical associations among the pollutants, and present interpretable visual outputs. The algorithms balance the power of data, machine learning techniques and underlying domain knowledge to enhance problem identification and solution development. The results show optimal associations between spatio-temporal attributes and relevant pollutants, and thus provide useful insights into the state of pollution in southern China. The findings also indicate robustness of features that exhibit great potential for building analytical bridges across disciplines and sectors. This research is expected to contribute to our understanding of how pollutants are spatially distributed within the lower part of the atmosphere, potentially leading to improved model performance and innovation. It is also expected to contribute to the design of methods for dealing with the challenges posed by the “non-orthogonality” of the socio-economic, technical and environmental attributes of the SDGs.
