Synergy of AERONET and MODIS AOD products in the estimation of PM2.5 concentrations in Beijing

Satellite aerosol optical depth (AOD) is widely used to estimate particulate matter with aerodynamic diameter ≤2.5 µm (PM2.5) mass concentrations. Polar orbiting satellite retrieval 1–2 times each day is frequently affected by cloud, snow cover or misclassification of heavy pollution. Novel methods are therefore required to improve AOD sampling. Sunphotometer provides much more AODs than satellite at a fixed point. Furthermore, much of the aerosol pollution is regional. Both factors indicate that sunphotometer has great potential for PM2.5 concentration estimation. The spatial representativeness of the Aerosol Robotic Network (AERONET) AOD at Beijing site is investigated by linear regression analysis of 13-year daily paired AODs at each grid from Moderate Resolution Imaging Spectroradiometer (MODIS) on Aqua and Beijing AERONET. The result suggests a good correlation for the whole Beijing Administrative region, with regional mean correlation coefficient exceeding 0.73. Pixel AODs are then estimated from AERONET AOD using linear equations, which are verified to have the same accuracy as that of MODIS AOD. Either AOD from MODIS retrieval or estimation from AERONET AOD in the absence of MODIS pixel AOD is finally used to predict PM2.5 concentration. Daily AOD sampling in average is enhanced by 59% in winter when MODIS AODs are very limited. More importantly, synergy of AERONET and MODIS AOD is able to improve the estimation of regional mean PM2.5 concentrations, which indicates this method would play a significant role in monitoring regional aerosol pollution.

southeastern United States. Hence, solving the under sampling problem is one of fundamental requirements for the improvement of PM 2.5 estimation from space 10 . Several methods have been made to solve this issue. Kloog et al. 11 estimated daily PM 2.5 for grid cell without AOD data by using the mean PM 2.5 levels from nearby grid cells. A combined MODIS and MISR AOD has been used to improve AOD sampling 3 . Lu et al. proposed a method to estimate missing AODs by assuming a linear relationship between PM 2.5 and AOD 12 . The methods mentioned above require either surface PM 2.5 measurements to constrain the AOD filling or more satellite products.
Different from space-borne sensors, Aerosol Robotic Network (AERONET) has been providing robust AOD measurements in good temporal resolution for nearly two decades. More importantly, the boundaries of the region resembling AOD temporal variability as that at AERONET sites vary between 200 and 500 km depending on their specific locations, which indicates temporal variation of AOD at AERONET sites could be representative for a larger region [13][14][15] . This deduction is reasonable since the temporal variation of AOD is overwhelming determined by weather conditions that are able to lead to a coherent variation of AOD in a fairly large area. For example, a stable stagnant condition favors for a regional haze, on the contrary, a cold front always disperses large-spread haze dramatically. Both phenomena are often observed in North China Plain (NCP). Therefore, it is not surprising that a high agreement between spring AOD at Beijing and Xianghe has been reported 16 . A close connection between spatial distribution of AOD and the circulation types has also been shown 15,17 . Changes of temporal and spatial emissions should have played a minor role in the AOD representation of one location relative to weather conditions. This opens opportunities to enhance PM 2.5 estimation from AOD if we can establish a robust synergy between spatial converge from satellite and temporal coverage from AEROENT for estimating PM 2.5 .
Beijing, the capital of the largest developing country in the world, China, has been suffering from heavy air pollution in recent years, especially in winter. A persistent regional air pollution episode occurred in winter of 2013 as recorded by a regional air quality monitoring network 18,19 , however, very few MODIS AOD retrievals are available for the estimation of a fine spatial variation of PM 2.5 . Heavy aerosol pollution is probably misclassified into clouds by the MODIS cloud discrimination algorithm since aerosol signal is to some extent close to that of clouds. Figure 1 presents examples of MODIS AOD retrievals under different conditions in NCP. Missing MODIS retrievals are likely due to clouds (on February 15) or misclassification of heavy haze to clouds (on February 14,16,18). On the contrary, AERONET AODs are available due to its high temporal resolution, especially on polluted days. Thus, AERONET AODs show their potential in the estimation of PM 2.5 under these conditions.
Here we evaluate the representative spatial boundaries of AERONET-derived AODs at Beijing station (39.97°N, 116.38°E) and investigate its potential for the estimation of PM 2.5 concentration at a regional scale. MODIS/Aqua daily level-2 AOD products in Beijing area from 2002 to 2014 are firstly interpolated into a regular gridded product with a spatial resolution of 0.1°. Linear equations have then been established from simultaneous daily AERONET AODs at Beijing and gridded MODIS AODs in the Beijing Administrative area, i.e., we create a distinct linear equation between Beijing AEORNET AODs and MODIS AODs at each grid. These equations are then used to fill missing MODIS AODs from AERONET AODs. Spatial distribution of PM 2.5 concentration is finally estimated from a mixed effects model. Validation shows that this method is robust in the Beijing Administrative area that suggests a great potential of AERONET AOD products for monitoring PM 2.5 concentration, especially in heavily polluted regions.

Result
AOD sampling enhancement by the synergy of AERONET and MODIS. Figure 2 shows spatial distribution of correlation coefficients (R) between MODIS gridded AODs in the entire Beijing administrative area and Beijing AERONET AODs. Seasonal mean R values are 0.73 ± 0.14, 0.76 ± 0.09, 0.78 ± 0.11, 0.74 ± 0.14 for spring, summer, autumn and winter, respectively. As expected, R values decrease as a function of distance from the site. Meanwhile, R presents a decreasing gradient from eastern to western region (especially in winter), likely due to their differences in topography, land use and transport path. Spatial variation of paired data points should also contribute to this result. Figure 3 presents the performance of data fusion method during the winter of 2013. Before data fusion, MODIS retrieves AOD at approximately 50% of probability (regional mean). The retrieval percentage shows a spatial variation, ranging from about 20% in north to about 70% in south. After data fusion, regional mean AOD sampling over entire area increases to 81%. More specifically, AOD sampling substantially increases by ~50% in west and by ~40% in north.
MODIS and fused AODs were compared with independent sunphotometer measurements at Xianghe and SDZ (Fig. 4). MODIS AOD at the grid closest to the station is used to compare with sunphotometer daily-mean AOD. MODIS works very well in the retrieval of AOD at Xianghe and SDZ, with R of 0.91 (data points of 440) and 0.86 (data points of 191), respectively. 70.5% and 48.7% of MODIS AODs falling within the expected uncertainty of ±0.05 ± 20% × AOD at Xianghe and SDZ. MODIS AODs are closer to ground truth at Xianghe than at SDZ. This is likely associated with complex terrain at SDZ. The accuracy of AOD estimation from the synergy of AERONET and MODIS AOD is close to that of the MODIS retrieval. The mean prediction error (MPE) and root mean square error (RMSE) fused versus sunphotometer AOD are similar as those between MODIS and sunphotometer AOD. Fusion of AERONET-derived pixel AOD and MODIS AOD results in an increase of AOD sampling by 65% at Xianghe and 93% at SDZ in the winter (Fig. 4), which definitely would be expected to improve PM 2.5 estimation from AOD.  (Fig. 5). Averaged satellite-derived PM 2.5 concentrations over the entire area were 95.5 ± 67.8 μg m −3 and 104.3 ± 74.6 μg m −3 for these two datasets. PM 2.5 values in southwest and north estimated from the fused AOD are larger than those from MODIS AOD by >20 μg m −3 , which is mainly because AOD sampling increases substantially in these sub-regions by the fusion method. This result indicates that PM 2.5 is probably underestimated if MODIS only AODs are used due to its under-sampling of AODs. Improvement of PM 2.5 estimation as a result of AOD sampling enhancement by fusion of AERONET AODs is clearly shown in Fig. 6, the scatter plot of station PM 2.5 measurements and estimations at 25 stations. To evaluate the performance of both AOD datasets, we adopt the cross validation (CV) method. Here we collect collocate PM 2.5 and AOD data at 25 stations. Only data at 24 stations are used to train the model while the data at the remaining station are used to evaluate the model each time. This leave-one-out process was repeated for each of the 25 sites, which follows the same procedure as previous study for cross-validation 10 . R between measured and MODIS    Figure 7 presents the histogram of three PM 2.5 datasets at 25 stations, namely, ground measurement, estimations from fused and MODIS AOD. Compared with MODIS AOD derived PM 2.5 , Fused-AOD derived PM 2.5 shows a histogram of PM 2.5 much closer to that of ground measurements. The correlation coefficient and the mean absolute difference between ground measurements and fused-AOD derived PM 2.5 are 0.90 and 3.59%, respectively, which are less than those between ground observations and estimations from MODIS AOD (0.88 and 3.90%).
The method is also applied in other three seasons. PM 2.5 estimation has also been improved as a result of enhancement of AOD sampling by the AOD fusion, although the improvements are less than that in the winter. PM 2.5 concentrations estimated from the fused AOD are all closer to ground measurements at 25 stations than those from MODIS AOD only (Table 1). Both mean PM 2.5 and its standard deviation (temporal variability) estimated from fused AOD increase to some extent to approach to that of ground measurements.

Discussion and Conclusions
We use a statistical analysis to compare the AOD products from MODIS and AERONET Beijing between 2002 and 2014. The correlation analyses indicate that AOD at AERONET site can be used as representative of temporal variability for a larger region around its location. Grid AOD is then estimated from AERONET AOD at Beijing based on a linear regression analysis. The fused-AOD dataset provides a relatively higher temporal coverage in the winter (81%) instead of 50% days by MODIS only retrievals. PM 2.5 concentration estimation using MODIS only AOD data resulted in an underestimation of PM 2.5 . PM 2.5 concentrations calculated by the mixed effects model based on improved AOD sampling increased by 0.8, 6.1, 2.7, 6.5 μg m −3 in the spring, summer, autumn and winter, respectively.
The method in this study to fill missing MODIS AOD can supply more AOD data into chemistry models and model assimilations, provide good spatial and temporal coverage of PM 2.5 concentrations based on increasing AOD-PM 2.5 matchups, and offer better estimations of PM 2.5 variability for epidemiological studies. Although only MODIS/Aqua data (13:30 local standard time) are used to generate the correlation map, using this map to calculate AOD at other times of the day may also be promising since the temporal variation of AOD is small. For example, Mishra et al. 13 used  It should be noted that this method is highly dependent on the spatial representativeness of ground site and thereby optimal deployments of ground observations can enlarge the application of data fusion 14 . Besides, changes in spatial emissions over the domain in the past years may also play a role in the spatial correlation relationships that needs further study.  (Fig. 8) and daily-mean PM 2.5 concentrations are calculated from hourly measurements within a day. Automated monitoring systems are installed at each site to measure ambient concentration of SO 2 , NO 2 , O 3 , CO, and PM 2.5 and PM 10 according to China Environmental Protection Standards. PM 2.5 concentrations are measured by the Tapered Element Oscillating Microbalance method (TEOM). The TEOM's filter is heated to avoid particle-bound water that may result in a slight underestimation of PM 2.5 mass concentration owing to volatilization of semi-volatile material 20 . Inter-comparison of PM 2.5 concentrations from the Beijing U.S. Embassy and the nearby Ministry of Environmental Protection site indicated that these two data sets were in good agreement in the temporal variation but the former was slightly higher than the latter since the beta attenuation monitor was used at the Beijing U.S. Embassy 21 .
Aeronet Aod. AERONET is a ground-based internationally federated, globally distributed network of sun photometers. AERONET AOD is derived from direct beam solar measurements at wavelengths from ultraviolet to infrared 24 . We used the cloud-screened and quality checked level 2.0 AOD product (http://aeronet.gsfc.nasa.gov/) 25 .   ; summer (June-July-August); autumn (September-October-November) and winter (December-January-February). Pearson coefficient maps are derived from linear correlation analysis between two variables above (Fig. 2). A threshold value of correlation coefficient (R ≥ 0.5) is set to determine whether AERONET AOD can be used in the estimation of regional PM 2.5 . For grids with R ≥ 0.5, we use the linear-fit AOD values based on AERONET Beijing to fill missing values of MODIS AOD retrievals.
The mixed effects Model. A mixed effects model to investigate the AOD-PM 2.5 relationship is as follows.
where PM i,j represents PM 2.5 value at site i on day j; α and β represent fixed intercept and slope respectively; u j and v j are the random intercept and slope. s i ~ N (0, σ s 2 ) and ε i,j ~ N (0, σ 2 ) represent the random intercept of site i and the error term at site i on day j. s 2 σ and σ 2 denote the variances for s i and ε i,j . ∑ is the variance-covariance matrix for the day-specific random effects 7 . We select the site-specific satellite AOD values for each surface site where it falls within a 10 × 10 km 2 grid to collocate PM 2.5 concentrations. If there are more than one site within a single 10 × 10 km 2 grid, the PM 2.5 values of those sites are averaged. With this process, there remain 25 pairs of AOD and PM 2.5 data for the model development.
Data availability. The datasets generated during and/or analyzed in the current study are available from the corresponding author on reasonable request.