Constructing transferable and interpretable machine learning models for black carbon concentrations

Black carbon (BC) has received increasing attention from researchers due to its adverse health effects. However, in-situ BC measurements are often not included as a regulated variable in air quality monitoring networks. Machine learning (ML) models have been studied extensively as virtual sensors to complement the reference instruments. This study evaluates and compares three white-box (WB) and four black-box (BB) ML models for estimating BC concentrations, with a focus on their transferability and interpretability. We train the models with long-term air pollutant and weather measurements at an urban background site in Barcelona, and test them at other European urban and traffic sites. Despite the differences in geographical location and measurement site, BC correlates most strongly with the particle number concentration of the accumulation mode (PNacc, r = 0.73-0.85) and nitrogen dioxide (NO2, r = 0.68-0.85), and most weakly with meteorological parameters. Owing to this similarity in correlation behaviour, the ML models trained in Barcelona perform well at the traffic site in Helsinki (R2 = 0.80-0.86; mean absolute error MAE = 3.90-4.73 %) and at the urban background site in Dresden (R2 = 0.79-0.84; MAE = 4.23-4.82 %). WB models appear to explain less of the variability in BC than BB models, among which the long short-term memory (LSTM) model outperforms the rest. In terms of interpretability, we adopt several methods for the individual models to quantify and normalize the relative importance of each input feature. The overall static relative importance commonly used for WB models gives results that differ from the dynamic values used to show local contributions for BB models. PNacc and NO2 on average have the strongest absolute static contribution; however, they affect the estimation both positively and negatively at different sites. This comprehensive analysis demonstrates the possibility of transferring these interpretable air pollutant ML models across space and time.
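The transfer test described in the abstract (fit at one site, score at another with R2 and MAE, then normalize feature importance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the numbers are synthetic toy values, not data from the study; the one-feature least-squares line stands in for the study's WB and BB models; the helper names are hypothetical; and MAE is computed in plain concentration units, since the abstract does not state how its percentage MAE is defined.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Mean of |observed - estimated|, in the units of y."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def normalised_importance(raw):
    """Scale absolute importance scores so that they sum to 1."""
    total = sum(abs(v) for v in raw.values())
    return {name: abs(v) / total for name, v in raw.items()}


# Synthetic "source site" data (hypothetical values, not from the study).
source_no2 = [10.0, 20.0, 30.0, 40.0]
source_bc = [0.5, 1.0, 1.5, 2.0]

# Synthetic "target site" data used only for testing the transferred model.
target_no2 = [15.0, 25.0, 35.0]
target_bc = [0.8, 1.2, 1.8]

# Train at the source site, then apply the fitted line unchanged elsewhere.
slope, intercept = fit_line(source_no2, source_bc)
target_pred = [slope * x + intercept for x in target_no2]

r2 = r_squared(target_bc, target_pred)
mae = mean_absolute_error(target_bc, target_pred)

# Hypothetical signed importance scores; the sign is dropped on normalization.
importance = normalised_importance({"PNacc": 0.6, "NO2": -0.3, "temperature": 0.1})

print(f"R2 = {r2:.3f}, MAE = {mae:.3f}")
print(importance)
```

Taking absolute values before normalizing mirrors the abstract's distinction between the strength of a contribution and its sign: a feature can rank highly in absolute importance while pushing the estimate up at one site and down at another.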

Bibliographic Details
Main Authors: Fung, Pak Lun, Savadkoohi, Marjan, Zaidan, Martha Arbayani, Niemi, Jarkko V, Timonen, Hilkka, Pandolfi, Marco, Alastuey, Andrés, Querol, Xavier, Hussein, Tareq, Petäjä, Tuukka
Other Authors: European Commission
Format: Article
Language: English
Published: Elsevier 2024-01-22
Subjects: Virtual sensors; BC estimation; Neural network; Relative importance; SHAP; Traffic emission; Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation; Make cities and human settlements inclusive, safe, resilient and sustainable; Responsible Consumption and Production
Online Access: http://hdl.handle.net/10261/346140
http://dx.doi.org/10.13039/501100000780
https://api.elsevier.com/content/abstract/scopus_id/85183508183
id dig-idaea-es-10261-346140
record_format koha
institution IDAEA ES
collection DSpace
country Spain
countrycode ES
component Bibliographic
access Online
databasecode dig-idaea-es
tag library
region Southern Europe
libraryname Biblioteca del IDAEA España
language English
spelling dig-idaea-es-10261-346140 2024-05-18T20:37:21Z
0000-0002-3707-2601 0000-0002-5453-5495 0000-0002-6549-9899 0000-0002-0241-6435 0000-0002-1881-9044
This study is supported by the RI-URBANS project (Research Infrastructures Services Reinforcing Air Quality Monitoring Capacities in European Urban & Industrial Areas, European Union’s Horizon 2020 research and innovation program, Green Deal, European Commission, contract 101036245). RI-URBANS is implementing the ACTRIS (http://actris.eu) strategy for the development of services for improving air quality in Europe. The authors would also like to acknowledge support from the “Agencia Estatal de Investigación” of the Spanish Ministry of Science and Innovation under the projects CAIAC (PID2019-108990RB-I00) and AIRPHONEMA (PID2022-142160OB-I00), the Generalitat de Catalunya (AGAUR, SGR-447), the Technology Industries of Finland Centennial Foundation for the Urban Air Quality 2.0 project, Research Council of Finland Flagship funding (project numbers: 337549, 337552), Research Council of Finland Research Fellowship funding (project number: 355330) and the European Commission via Non-CO2 Forcers And Their Climate, Weather, Air Quality And Health Impacts (FOCI, project number: 101056783). P.L. Fung would like to acknowledge Artificial Intelligence for Urban Low-Emission Autonomous Traffic (AIforlessAuto), funded under the Green and Digital transition call from the Research Council of Finland (project numbers: 347197, 347198), for its support. M. Savadkoohi would like to thank the Spanish Ministry of Science and Innovation for her FPI grant (PRE-2020-095498). The authors also take the opportunity to thank Dr. Susanne Bastian from the Saxon State Office For Environment for contributing to data collection in the study. Open access funded by Helsinki University Library.
Peer reviewed
2024-02-08T09:27:31Z
2024-01-22
article
http://purl.org/coar/resource_type/c_6501
Environment International 184: 108449 (2024)
01604120
http://hdl.handle.net/10261/346140
10.1016/j.envint.2024.108449
http://dx.doi.org/10.13039/501100000780
38286044
2-s2.0-85183508183
https://api.elsevier.com/content/abstract/scopus_id/85183508183
en
info:eu-repo/grantAgreement/EC/H2020/101036245
Environment international
Publisher's version https://doi.org/10.1016/j.envint.2024.108449
Yes
open
Elsevier