Concurrent Spatiotemporal Daily Land Use Regression Modeling and Missing Data Imputation of Fine Particulate Matter Using Distributed Space-Time Expectation Maximization



Authors: Taghavishahri SM (1, 2); Fasso A (3); Mahaki B (1); Amini H (4, 5, 6)

Affiliations:
  1. Department of Epidemiology and Biostatistics, School of Health, Isfahan University of Medical Sciences, Isfahan, Iran
  2. Student Research Committee, School of Health, Isfahan University of Medical Sciences, Isfahan, Iran
  3. Department of Management, Information and Production Engineering, University of Bergamo, Bergamo, Italy
  4. Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, United States
  5. Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute, Basel, Switzerland
  6. University of Basel, Basel, Switzerland

Source: Atmospheric Environment. Published: 2020


Abstract

In this study, we developed a spatiotemporal land use regression (LUR) model using the distributed space-time expectation maximization (D-STEM) software. We trained the model on daily mean ambient fine particulate matter (PM2.5, particles ≤2.5 μm) data derived from hourly measurements taken in 2015 at 30 regulatory monitoring network stations within the megacity of Tehran, Iran. Because a substantial share of the measured data was missing (48% of the total number of daily PM2.5 observations), we used D-STEM to impute the missing data and compared imputation performance across different fitted models and the mean substitution method. We used the h-block cross-validation (h-block CV) method to account for spatial autocorrelation in model building and validation. In the imputation of missing data, the D-STEM LUR model had a mean absolute percentage error (MAPE) of 25.3%, outperforming the mean substitution method, which resulted in a MAPE of 28.3%. The spatiotemporal R-squared was 0.73, and the average CV R-squared of the 2-block and 5-block cross-validations was 0.60. The corresponding values were 0.68 and 0.47 for the spatial aspect of the LUR model, and 0.995 and 0.992 for the temporal aspect. This study demonstrated the competence of the D-STEM software in spatiotemporal modeling, missing data imputation, and mapping of daily ambient PM2.5 at a very high spatial resolution (20 m × 20 m). These estimates are available for future research, especially for epidemiological studies on short- and/or long-term health effects of ambient PM2.5. Overall, we found D-STEM to be a promising tool for spatiotemporal LUR modeling of ambient air pollution, especially for models that rely on regulatory monitoring network stations with a considerable amount of missing data. © 2020 Elsevier Ltd
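
To make the imputation comparison concrete, the following is a minimal Python sketch of how a MAPE score for the mean substitution baseline could be computed. It is not the D-STEM software and not the authors' code: the function names, the single-station setup, the toy PM2.5 values, and the choice of which days to mask are all assumptions made for illustration. The abstract does not describe how observations were held out.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def mean_substitute(series):
    """Fill missing entries (None) with the mean of the observed entries."""
    observed = [v for v in series if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in series]


# Hypothetical daily PM2.5 values (ug/m3) for one station; not data from the study.
truth = [20.0, 40.0, 25.0, 50.0, 30.0]

# Assumption: days 2 and 4 are masked to imitate missing observations.
masked = [20.0, None, 25.0, None, 30.0]

imputed = mean_substitute(masked)

# Score the imputation only on the days that were masked.
held_out = [i for i, v in enumerate(masked) if v is None]
score = mape([truth[i] for i in held_out], [imputed[i] for i in held_out])

print(f"Mean substitution MAPE: {score:.2f}%")
```

The same `mape` function could score any other imputation method on the same masked days, which is the kind of like-for-like comparison behind the 25.3% versus 28.3% figures reported above.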