Estimation of Ground PM2.5 Concentrations using a DEM-assisted Information Diffusion Algorithm: A Case Study in China

When estimating national PM2.5 concentrations, the results of traditional interpolation algorithms are unreliable due to a lack of monitoring sites and heterogeneous spatial distributions. PM2.5 spatial distribution is strongly correlated to elevation, and the information diffusion algorithm has been shown to be highly reliable when dealing with sparse data interpolation issues. Therefore, to overcome the disadvantages of traditional algorithms, we proposed a method combining elevation data with the information diffusion algorithm. Firstly, a digital elevation model (DEM) was used to segment the study area into multiple scales. Then, the information diffusion algorithm was applied in each region to estimate the ground PM2.5 concentration, which was compared with estimation results using the Ordinary Kriging and Inverse Distance Weighted algorithms. The results showed that: (1) reliable estimate at local area was obtained using the DEM-assisted information diffusion algorithm; (2) the information diffusion algorithm was more applicable for estimating daily average PM2.5 concentrations due to the advantage in noise data; (3) the information diffusion algorithm required less supplementary data and was suitable for simulating the diffusion of air pollutants. We still expect a new comprehensive model integrating more factors would be developed in the future to optimize the interpretation accuracy of short time observation data.

one factor for calculation 23 . Therefore, it is difficult to combine the factors affecting PM 2.5 spatial distribution with the algorithms. These shortcomings restrict the accuracy of traditional interpolation algorithms in ground PM 2.5 concentration estimation.
In recent years, the information diffusion algorithm, a normal diffusion interpolation model based on the fuzzy mapping philosophy, has shown a certain advantage in its application to small-sample and nonlinear interpolation 24,25 . The information diffusion algorithm transforms the original information to value samples of the fuzzy set, and then assigns information of single-value samples to different discrete points based on the diffusion functions 26,27 . In this way, reliable results can be produced even if the underlying physical processes are not fully understood 28 . This algorithm has been widely applied to various fields such as risk assessment 29,30 and sea depth estimation 31 . However, the heterogeneous spatial distribution of monitoring data still affects the estimation accuracy of the information diffusion algorithm. Wang et al. 32 improved this by calculating the optimal window width in inhomogeneous samples (that is, using the monitoring data of areas adjacent to points for estimation, instead of all monitoring data, for interpolation calculation). The work of Dong et al. 33 indicated that PM 2.5 spatial distribution is correlated with elevation, and that collinearity exists between elevation and multiple factors characterized by human activities 1 . Hence, this study used the information diffusion algorithm to estimate ground PM 2.5 concentrations by assuming the segmentation results of a digital elevation model (DEM) as the interpolation window widths. We exploited the advantages of the information diffusion algorithm to combine elevations affecting PM 2.5 spatial distribution with the algorithm. As a result, the problem of a lack of data samples was removed, supplementary data acquisition was simplified, and model improvement was simpler than when using traditional algorithms.
This study estimated ground PM 2.5 concentrations in a study area in China based on both ground-measured PM 2.5 data and DEM data using the information diffusion algorithm to address the data incompleteness issue. The purpose of this study was to overcome the shortcomings of traditional interpolation algorithms, and provide a reliable method for rapidly monitoring ground PM 2.5 concentrations on a large scale. The main goals were to: (1) verify the effectiveness of the DEM-assisted information diffusion algorithm for estimating ground PM 2.5 concentrations; (2) compare estimation accuracy between the DEM-assisted information diffusion algorithm, the Inverse Distance Weighted method, and the Ordinary Kriging method to verify the reliability of the information diffusion algorithm; and (3) analyze the applicability of the DEM-assisted information diffusion algorithm for monitoring data on different time scales.

Study area and data
Study area. The study area lies between 19.98° N and 44.78° N, and 99.53° E and 129.33° E in mainland China. It covers a land area of 4,870,000 km 2 , 51% of China's total area (Fig. 1). The study area contains the eastern coastal area, which boasts the most rapidly developing economy, the highest levels of industrialization and urbanization, and the densest cluster of urban agglomerations in the country. In the past few years, the air quality has severely deteriorated in all provinces and municipalities of the study area, greatly affecting the lives of residents 34 . This phenomenon is caused by anthropogenic emission sources from urban expansion, industrial emissions, and excessive use of vehicles 3,35 . Therefore, this area is an important site for studying ground PM 2.5 concentrations, aided by the gradual implementation of a ground PM 2.5 concentration monitoring network since 2013. According to segmentation result, the study site was divided into seven geographical subareas with different range of elevation (Table 1). High altitude areas with plateau and mountain are mainly concentrated in three regions, followed by Tibetan Plateau area (average altitude 3300 m), Yunnan-Guizhou Plateau area (average altitude 1712m), Inner Mongolia-Loess Plateau area (average altitude 1279 m). There are some hills in Northeastern area (average altitude 499 m), Southeastern coastal area (average altitude 311 m) and Central area (average altitude 743 m), while the plain mainly distributed in Eastern coastal area (average altitude 63 m). Therefore, you may find the study area has a large altitude range and diverse landscape types, making it suitable for analyzing the relationship between ground PM 2.5 concentration and elevation.
PM2.5 monitoring site data. We used hourly PM 2.5 monitoring data provided by the China Environment Monitoring Center (http://106. 37.208.233:20035). Data was collected from 763 monitoring sites from January to December 2015 (Fig. 1), and PM 2.5 concentrations were measured using the Tapered Element Oscillating Microbalance (TEOM) method 36 . Based on hourly PM 2.5 monitoring data, daily, monthly, and quarterly average data was obtained to test the applicability of the information diffusion algorithm on different time scales. DEM data. To consider the impact of elevation on PM 2.5 spatial distribution, we obtained the DEM product generated during the Shuttle Radar Topography Mission (SRTM) from the United States Geological Survey (USGS) (ftp://e0srp01u.ecs.nasa.gov/srtm/version2/SRTM3). This product was used by the National Aeronautics and Space Administration (NASA) to measure PM 2.5 concentrations in over 80% of the land area between latitudes 56° S and 60° N, with a spatial resolution of 90 m. Here, we used ArcGIS10.2 to bridge single-frame product data and obtain DEM data for the whole study area.
Methodology. The multi-scale segmentation algorithm was combined with the information diffusion algorithm to estimate ground PM 2.5 concentrations in the study area using ground-measured PM 2.5 data, based on the assumption that ground PM 2.5 spatial distribution is highly correlated with elevation. The main methods were as follows: (1) 10-fold cross-validation was used to assess the accuracy of models, and the monitoring data set was randomly split into ten equal-sized data sets, with one group as the validation set and the remaining nine groups together forming the training set; (2) multi-scale segmentation based on the DEM, where the study area was segmented into multiple regions of homogeneous elevation, generating interpolation windows for the information diffusion algorithm; (3) in each segmented region, the information diffusion algorithm was used to estimate ground PM 2.5 concentrations based on daily, monthly, and quarterly average PM 2.5 data from ground monitoring sites; (4) the Ordinary Kriging and Inverse Distance Weighted algorithms were used to estimate ground PM 2.5 concentrations based on the same training set; (5) the experiments were performed ten times, and accuracy assessment indexes of different algorithms were calculated using the validation set in each experiment; and (6) the mean values of the accuracy assessment indexes were calculated, and the differences between the information diffusion algorithm and traditional interpolation algorithms were explained. The flowchart of the full analysis procedure is illustrated in Fig. 2.
Multi-scale segmentation. A recent study showed that the different spatial scale of characteristic variables could be affecting the PM 2.5 concentration estimation performance of the interpolation algorithm 37 . Thus, it is necessary to find the optimized spatial scale. Because DEM data was included to consider the impact of elevation on PM 2.5 spatial distribution, remote-sensing image segmentation was used to determine the optimized spatial scale and ensure the degree of automation of the entire procedure. Multi-scale segmentation is one of the most popular remote-sensing image segmentation algorithms 38,39 . It is a bottom-up region integration process starting from the pixel layer. Image objects are integrated into larger image objects layer by layer, producing segmentation results of different segmentation scales. Based on this approach, this study used the eCognition 9.0 multi-scale segmentation algorithm to perform data segmentation based on the obtained DEM data. Segmentation divided the study area into multiple segments with small changes in internal elevation. During interpolation, the segments were used as windows to select sample points and integrate elevation data with the information diffusion algorithm.

Information diffusion.
Information diffusion is a mathematical model proposed to address the information incompleteness issue during assessment of small-sample events such as natural disasters 27 . Based on this approach, this study regarded PM 2.5 monitoring data as incomplete sample data, used the diffusion function to binarize the fuzzy sets of PM 2.5 data, and performed interpolation to obtain homogeneously distributed grid point data. Data of ground PM 2.5 concentration variability was then obtained for the whole study area. The following elaborates on the information diffusion algorithm procedure.
(1) Constructing the monitoring spaces To effectively construct the monitoring space, the minimum step is calculated for each variable in the information diffusion model, and the range of monitoring values is discretized to an equal-interval discrete space, namely, the monitoring space.
where x i , y i , and z i are coordinates of Sample Point i, and x j , y j , and z j are coordinates of Sample Point j.
According to the minimum steps given by the above equations, the monitoring space can be constructed as The default values of u 0 , v 0 , and w 0 are 0. We assumed that the minimum steps of the monitoring space in three dimensions are simple coefficient h can be given by the averaging model 24 , as follows: − − > h b a n b a n b a n b a n b a n b a n b a n n where = a a a a , , x y z ; n is the total number of samples; and h is the diffusion factor. (2) Calculating the information diffusion of sample points Assuming that the diffusion information of the PM 2.5 monitoring point x y z ( , , ) where h x , h y , and h z can be given by Equation (4); l represents the number of sample points; and u v w , , The original information matrix Q can be given by calculating the information diffusion of all monitoring points in the monitoring space: l i j k can be obtained by Equation (5) and l represents the number of sample points. The original information matrix ijk m t s can be calculated using Equation (6). (4) Calculating the fuzzy relationship matrix A causal fuzzy relationship matrix can be obtained from the original information matrix Q. In this study, we transformed the original information matrix The fuzzy set centroid can be calculated to obtain the estimated ground PM 2.5 concentration: and r ijk can be given by Equation (7), in which = … i m 1, 2, and = … j t 1, 2, . To smooth PM 2.5 concentration variations in the information diffusion algorithm results, we constructed a buffer up to 100 km long (this was verified as the maximum distance over which the interpolation method could generate a better result than remote-sensing based methods 18 ) in each segmented region to increase the number of sample points in each region.
Inverse Distance Weighted algorithm. The Inverse Distance Weighted algorithm is based on the proximity-similarity principle, i.e., the closer two objects are to one another, the more similar their properties, and the further they are from one another, the more different their properties. The algorithm assumes that the impact of unknown points on to-be-estimated points diminishes as the distance increases. The general formula of the interpolation algorithm is as follows: where Ẑ S ( ) 0 is the estimated value of the to-be-estimated point S 0 ; N is the number of sample points around the to-be-estimated points that are required in the estimation process;λ i is the weight of all sample points in the calculation process; and Z S ( ) i is the sample value of S i . The weight can be given by: where p is a parameter. The optimal value can be determined by acquiring the minimum value of the root-mean-square error (RMSE). In addition, the Inverse Distance Weighted algorithm was implemented by the land statistic module of ArcGIS10.2.
Ordinary Kriging interpolation. The Ordinary Kriging algorithm is the best linear unbiased prediction (BLUP) for to-be-estimated points based on the structured characteristics of sample points 40 . If the structured variable Z is not constant and the mathematical expectation is unknown, the Ordinary Kriging algorithm can be used for interpolation calculation. It is implemented as follows: Accuracy assessment. This study used the accuracy verification method based on ground monitoring sites to assess the validity of spatial interpolation. We extracted the estimation results of validation points and then compared monitoring values with estimation results. The main indicators used for accuracy assessment included the absolute error (AE), root mean squared error (RMSE), and range of absolute error (RAE). AE represents the absolute deviation between the estimation result and the monitoring data. RMSE represents the mean deviation between estimated and monitored values, which indicates the reliability of the interpolation model. Because the experiments were performed ten times, the mean values of the accuracy assessment indicators were assessed for the different algorithms. The above three indicators are calculated as follows: where O represents the monitoring data and S represents the estimation data; E max and E min represent the maximum and minimum AEs, respectively; and n represents the total number of samples. Because the number of validation points in each experiment was more than one, the mean value of AE (called MAE below) was calculated for each experiment. Following the above steps, this study used the information diffusion, Ordinary Kriging, and Inverse Distance Weighted algorithms to estimate ground PM 2.5 concentrations. Then, a comparative analysis was performed between the estimates and the monitoring results of the validation points. To study the applicability of different algorithms for different time scales, calculations were based on the daily average data of January 2, 2015, monthly average data of December 2015, and average data of the winter (January to February) of 2015.

Segmentation of homogeneous-elevation regions.
Using the multi-scale segmentation technique, segmentation scales were continuously optimized to obtain segmentation results at different scales. Figure 3 shows that the number of segments in the study area diminished as the segmentation scale increased. When a small segmentation scale was selected, there were many small segments, and segment homogeneity was high. As the segmentation scale increased, the number of segments decreased, segments with a large variation in elevation characteristics remained approximately the same, and small segments were combined. When the segmentation scale was less than 400 (dimensionless unit), some segments did not have sample points (Fig. 3). When the segmentation scale was between 400 and 800, correlations between the PM 2.5 concentration estimates increased (from 0.77 to 0.79). When the segmentation scale reached 800, the correlation was 0.83. When the segmentation scale exceeded 800, the correlations of results decreased. To ensure sufficient sample points for calculation in each segment, and to achieve high accuracy, we selected 800 as the optimal segmentation scale based on the above analysis and the calculation requirements. According to Fig. 3, the study area was divided into seven regions of homogeneous elevation ( Table 1). The elevations varied slightly in each segmented region and their elevation characteristics were similar, providing appropriate calculation windows for an analysis of ground PM 2.5 concentration estimates. Table 2 shows that the correlations of estimates derived from the information diffusion algorithm were more accurate than those of the Ordinary Kriging and Inverse Distance Weighted algorithms based on daily, monthly, and quarterly average data. The information diffusion algorithm showed a slight advantage for global correlation between estimation results (daily average: 0.80; monthly average: 0.83; quarterly average: 0.82), followed by the Ordinary Kriging algorithm (0.79, 0.82, and 0.82, respectively), and finally the Inverse Distance Weighted algorithm (0.78, 0.81, and 0.80, respectively).

Ground PM 2.5 concentration estimation. Accuracy analysis.
Estimates using the information diffusion algorithm showed RMSE (mean value of RMSE) values that were 0.73 µg/m 3 to 1.17 µg/m 3 smaller than those of the Inverse Distance Weighted algorithm ( Table 2). The RMSE values of daily average and quarterly average results of information diffusion algorithm were larger than those of Ordinary Kriging, but the differences were relatively small. According to the MAE and RAE analysis results, all results except daily average data showed that information diffusion algorithm results were more reliable. Specifically, information diffusion results based on monthly average data showed higher accuracy.
Due to the impacts of algorithm principles, the estimation results of the three algorithms varied in some details. In general, using the information diffusion algorithm, estimated values beyond a PM 2.5 concentration interval of 35-115 µg/m 3 were smaller than monitoring values, while estimates within the interval of 35-115 µg/ m 3 were larger than monitoring values (Table 3). Moreover, the distribution of estimation results in each interval varied with time scale for the other two algorithms (Table 3). Estimation errors of the information diffusion algorithm typically fluctuated within a smaller range, showing higher stability than the other two methods (Tables 2  and 3).
Analysis of PM 2.5 spatial distribution. Ground PM 2.5 concentrations estimated by the above three algorithms are illustrated in Fig. 4. The daily, monthly, and quarterly average PM 2.5 concentrations obtained by different algorithms showed similar spatial distributions. In the daily average results, PM 2.5 concentrations were higher than 145 µg/m 3 in central Hebei, southeast Shandong, north Ningxia, Chongqing, and east Sichuan. In the monthly average results, PM 2.5 concentrations were high in central and south Hebei, west Shandong, north Henan, south Shaanxi, and central Liaoning. The spatial distribution of high PM 2.5 concentrations using the quarterly average results was similar to that using the monthly average results. In the estimation results for all time scales, high PM 2.5 concentrations were similarly spatially distributed, occurring in northeastern and eastern coastal areas of China. In addition, PM 2.5 concentrations were high in densely populated and urban built-up regions such as Sichuan Basin and Guanzhong Plain, and were relatively low in high-altitude regions such as the Tibetan Plateau and the Yunnan-Guizhou Plateau. PM 2.5 concentrations were also relatively low in southeastern coastal areas with abundant average annual precipitation, despite this including the densely populated and built-up urban region of the Pearl River Delta.
Whilst similar, the spatial distribution results of the three algorithms showed certain differences. Due to the impacts of algorithm principles, the results obtained by the Inverse Distance Weighted algorithm indicated laminar diffusion from the monitoring sites in the center to their surroundings. The estimation results of the Ordinary Kriging algorithm showed this phenomenon to a lesser extent; however, this algorithm considered that only one factor affected PM 2.5 spatial distribution, i.e. the relationships between monitoring data, thus leading to a concentric diffusion of estimation results. The information diffusion algorithm corrected such phenomena to some extent; therefore, variations in estimation results were smoother than the Inverse Distance Weighted algorithm, and did not show any clear concentric diffusion. In addition, the extreme values in the estimation results of the information diffusion algorithm were less than those of the other two algorithms, indicating that our proposed method had a smoothing effect on extreme values.

Features of the DEM-assisted information diffusion algorithm.
The results showed that PM 2.5 concentrations in high-altitude regions (Tibetan Plateau, Yunnan-Guizhou Plateau, north Sichuan, northwest Hubei, and southwest Hubei) were lower than those of surrounding areas (Fig. 4), indicating a clear negative correlation between PM 2.5 concentration and elevation.
Using the information diffusion algorithm, this study considered the impact of elevation on PM 2.5 spatial distribution, and segmented the study area based on elevation variations. Based on estimation results, this algorithm revealed a greater impact of elevation than the other two algorithms. We performed a comparative analysis on south Shaanxi and peripheral areas of the Qinling Mountains, shown in Fig. 5 as the rectangular area called Qinling area. The highest altitude in this area exceeds 3400 m (the red area in Fig. 5a, called Qinling Mountain). The northern area is Guanzhong Plain (Guanzhong urban agglomeration, where cities and people are clustered) and the southern area is Ankang Basin, which have different altitudes. The blue line is the segmentation boundary. Monitoring values in Qinling Mountain were relatively low, and were higher in Guanzhong Plain. Estimates based on the three methods showed similar patterns generally in global area (Fig. 4), but spatial distributions of ground PM 2.5 concentrations varied greatly in local area of complex terrain (Fig. 5). Estimates were relatively high in Guanzhong Plain but in high-altitude regions (Qinling Mountain), the information diffusion algorithm estimated significantly lower PM 2.5 estimates, while other methods that did not consider the effect of terrain still estimated relatively high PM 2.5 concentrations. Qinling Mountains play an important role in preventing PM 2.5 diffusion, resulting in no high PM 2.5 concentration areas. This suggests that the information diffusion algorithm estimates based on DEM segmentation more accurately reflect the real spatial distribution of PM 2.5 than traditional interpolation algorithms.

Discussion
By combining the advantages of DEM data and the information diffusion algorithm, this study developed a DEM-assisted information diffusion algorithm, which we have proved is a powerful tool for estimating ground PM 2.5 concentrations. The DEM-assisted information diffusion algorithm reduced the algorithm dependence on the distribution characteristics of monitoring sites, and improved the performance of the estimation algorithm. According to our results, estimates obtained by the information diffusion algorithm were more accurate than those obtained by traditional interpolation algorithms.

Comparison between the information diffusion algorithm and traditional algorithms.
Although the ground PM 2.5 monitoring network has become increasingly sophisticated, the inhomogeneous spatial distribution of monitoring sites and lack of data affect the estimation accuracy of traditional interpolation algorithms 23 .
As traditional interpolation algorithms only consider the relationships between sample data, the estimates obtained by such algorithms have obvious defects and usually display concentric spatial distributions, which do not comply with PM 2.5 diffusion rules 26 . This phenomenon shows that traditional algorithms are dependent on the distribution (site density and mutual distance) of monitoring sites 15 . Thus, the information diffusion algorithm is more reliable when processing small-sample data 24 . In addition, DEM multi-scale segmentation divides the study area into different segments, and a buffer is built for each segment, reducing abnormal phenomena such as abrupt variations and defects in PM 2.5 concentrations. Thus, the method proposed in this study reduces the dependence of the algorithm on the distribution characteristics of monitoring sites. Subsequently, an accurate estimation and mapping of PM2.5 concentrations could benefit more from the proposed method, which is able to delineate the distribution pattern of PM2.5 concentrations finely by high resolution interpolation at local region (e.g., 1 km resolution in this study). The information diffusion algorithm was slightly superior to traditional interpolation algorithms based on the accuracy assessment of whole monitoring sites (Table 2 and Fig. 4). The information diffusion algorithm error was relatively small, the correlation was 0.01-0.02 higher than other algorithms slightly, and the R 2 of the estimated monthly average ground PM 2.5 concentrations reached 0.83. Furthermore, the proposed method considering the DEM has a big advantage at local region of complex terrain, while the R 2 for the proposed method is significantly better than that of other both methods, respectively 0.81 for the proposed method, 0.73 for Ordinary Kriging method and 0.68 for inverse Distance Weighted method (Fig. 5). This is mainly attributed the subareas divided by segmenting the DEM that they pay more attention to the monitoring points within themselves, while it has been proved that PM2.5 distribution is related to landscape and is region-dependence 41 . Thus, PM 2.5 concentrations estimated by the DEM-assisted information diffusion algorithm are highly reliable. Previous studies have suggested that ground PM 2.5 concentrations are linked to multiple factors, and the impact factor varies with location 22 . However, in most studies, ground PM 2.5 concentrations are highly correlated to elevation 22,42 , while elevation is evidently collinear with economic development level 1 . This study used DEM segmentation results to represent the economic development level of each segmented region, and regarded the edges of these regions as hard break lines during calculation, overcoming the difficulty in spatial quantification of economic and social data. Compared with traditional interpolation algorithms, the DEM-assisted information diffusion algorithm proposed in this study only requires commonly available DEM data, to obtain the high resolution interpolation PM2.5 concentrations, especially for local area. This algorithm can obtain more accurate estimation results without acquiring optimal parameter values or transforming the normal distribution of data, making it more applicable than the other two algorithms compared in this study.
Analysis of estimation results based on different time scales. The estimation accuracy obtained by three methods differed significantly with time scale. The accuracy of estimates using quarterly average data was better than that of estimates using daily and monthly average data ( Table 2). It could be attributed to the data quality improvement due to the average of long time series monitoring data. On the other hand, it seemed that information diffusion algorithm perform better for daily average data, even though the accuracies of three methods using daily data are all low. The main reason for this might be the smoothing effect of the information diffusion algorithm on extreme values, while the variation of RAE for the information diffusion method is superior obviously to other both methods (Table 2). Furthermore, monthly average data not only reduced the influence of extreme values on estimation results, but also kept the data from being over-average. With extending the time scale, the advantages of the proposed method become unapparent, especially for quarterly average data. Thus, this study suggests that the information diffusion algorithm is more applicable for estimating daily/monthly average monitoring data. Subsequently, our method could contribute more to the mapping of high temporal resolution monitoring data.
Analysis of PM 2.5 spatial distribution patterns. PM 2.5 concentrations showed significant spatial heterogeneity (Fig. 4), which might be related to emission factors and dispersal conditions 41 . In winter, regions with high ground PM 2.5 concentrations are the northeastern area, eastern coastal area, and Guanzhong urban aggolmeration 22 . In the densely populated and built-up urban area of Sichuan Basin, the ground PM 2.5 concentration is much higher than in peripheral areas. Low precipitation, heavy pollution due to coal burning, developed industries, high urbanization, and excessive use of motor vehicles could explain the poor air conditions 4,43,44 . In the middle of Inner Mongolia and Shanxi province, despite relatively high altitudes, dust originating from the Loess plateau and the mixed barren soil often degrades air quality 17,45 . Coal burning could also be a dominant cause of air pollution in these areas, especially during winter. Additionally, dispersal conditions related to the topography could affect the distribution of cold and hot points 1 . Meanwhile, in the southeastern coastal area with abundant precipitation, and high-altitude regions such as the Tibetan Plateau and the Yunnan-Guizhou Plateau, ground PM 2.5 concentrations are low. In summary, due to the imperfect ground PM 2.5 monitoring network, common ground PM 2.5 concentration inversion models require significant supplementary data to obtain higher inversion accuracy (for example, terrain, weather, and economic development data) 4,13 . The high spatial heterogeneity of supplementary data makes it very difficult to apply and develop these models 13 . The information diffusion algorithm only requires DEM data; therefore, its maturity and wide application reduces difficulties relating to algorithm development. Nevertheless, we still expect that the new models would be developed to integrate all relevant factors (e.g., land use, landscape pattern, population, meteorological variables), while it is already proved further that PM 2.5 concentration is related to land use and the corresponding landscape, especially for high resolution mapping at regional scale 44,46 .

Conclusions
To address the common spatial heterogeneity issue of monitoring data used in atmospheric science studies, this research applied the information diffusion algorithm to ground PM 2.5 concentration estimation. The results showed that, compared with traditional interpolation algorithms, the DEM-assisted information diffusion algorithm focuses on the impact of elevation factors on PM 2.5 spatial distribution, and is not restricted by the lack of monitoring sites and their inhomogeneous spatial distribution. Therefore, the estimation results of this algorithm are more accurate and more appropriately distributed than those of traditional interpolation algorithms, to contribute to the high temporal resolution mapping. Using DEM segmentation results as supplementary information, estimates using this algorithm are more compliant with the diffusion rules of air pollutants. However, the proposed method only considered the effect of topographic characteristic. In the future research, we expect that an advanced method is developed to consider more factors when conducting the interpretation based on the station-based PM 2.5 monitoring data, since the spatial distribution of PM2.5 concentrations is strongly impacted by many factors, for example, landscape type, economic development level, and so on.