Introduction

Aerosols are fine liquid or solid particles suspended in air. They are often observed as dust, smoke or haze and play a crucial role in global climatic fluctuations and regional environments1. Aerosols not only exert direct and indirect radiative forcing on the climate system2,3, but also influence air quality and human health4. In recent years, aerosols have been measured and quantified by ground-based and satellite-borne instrumentation to improve the characterization of their physical-chemical behaviors and effects on radiative transfer and climate change5,6,7,8. Aerosol optical thickness (AOT or τ), defined as the integrated extinction coefficient over a vertical column of unit cross section, is an important physical parameter for characterizing aerosols and evaluating aerosol-induced atmospheric pollution and climatic effects9. Spatio-temporal variations of AOT have attracted worldwide attention. Meij et al.10 analyzed global and regional AOT trends from 2000 to 2009 using AOT data from MODIS and MISR (Multi-angle Imaging SpectroRadiometer). They showed a decrease in AOT over Europe and North America and an increase over South and East Asia. Koukouli et al.11 showed a decrease in AOT over the Southern Balkan/Eastern Mediterranean region, with the highest rate of decrease (4.09%) recorded in summer and the lowest (2.55%) in winter. In China, Luo et al.12 used AOT values retrieved from direct solar radiation data at 47 stations from 1961 to 1990. They found that AOT increased over most regions, with particularly rapid increases in the southwest and in the middle and lower reaches of the Yangtze River. Slight decreases were observed in the western region of Xinjiang and parts of Yunnan. Jiang et al.13 analyzed the spatio-temporal distribution of AOT in the Pearl River Delta (PRD) from 2007 to 2009 using MODIS AOT data and the derived fine-mode aerosol, and indicated that AOT over the PRD region was largest in spring, followed by autumn and summer, and smallest in winter.
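
The definition of AOT as a column-integrated extinction coefficient can be written compactly. The following is a sketch only: the symbols σ_ext (aerosol extinction coefficient), z (height above the surface) and z_TOA (top of the atmosphere) are introduced here for illustration and are not taken from the cited works.

```latex
\tau(\lambda) = \int_{0}^{z_{\mathrm{TOA}}} \sigma_{\mathrm{ext}}(\lambda, z)\,\mathrm{d}z
```

With σ_ext in m⁻¹ and z in m, τ is dimensionless. The quantity written τ0.55 in this paper corresponds to λ = 0.55 μm (550 nm).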

Atmospheric aerosols arise from natural sources as well as anthropogenic activities14. Natural sources include soil dust, sea spray, volcanic dust and biomass burning, while anthropogenic sources are primarily fuel combustion, industrial emissions, transportation and nonindustrial fugitive sources (e.g. construction work). Anthropogenic sources typically produce fine particles that remain suspended for many days and play significant roles in aerosol formation. Rapid urbanization has been found to cause heavy urban traffic and increased industrial emissions, and consequently an increase in the organic component of aerosols15,16. Kaufman et al.17 argued that the spatial and temporal distributions of aerosols are closely related to terrain, climate, population density and socio-economic activities. Vegetation also affects aerosol distribution, positively through biomass burning or negatively through absorption and deposition. In practice, aerosols generally result from the interaction between natural and anthropogenic sources. Topography, vegetation coverage, industrial development, construction and population growth are all possible contributors to aerosol distribution.

As the most developed province in China, Guangdong has experienced unprecedented economic growth and a population explosion over the past three decades, with concomitant severe air pollution. Increasing haze days and deteriorating air quality, which greatly impair public health and visibility over the region, have drawn widespread attention from the government and the public18,19,20,21. What determines the aerosol distribution in Guangdong? Are the driving factors natural, anthropogenic or both? These questions are still under debate. To explore the driving factors of aerosols in Guangdong, this paper analyzes the spatio-temporal variation of AOT over Guangdong Province using MODIS-retrieved aerosol data from 2010 to 2012. It models the correlation between AOT and elevation, the normalized difference vegetation index (NDVI) and several typical socio-economic indices using Self-organizing Maps (SOM) and linear regression methods, aiming to demonstrate the spatial distribution and temporal evolution of AOT in Guangdong, as well as the natural and socio-economic factors that influence atmospheric aerosols.

Guangdong province (109°45′ to 117°20′ E longitude and 20°09′ to 25°31′ N latitude) (Figure 1) is located in the southernmost part of China. Its surface area is approximately 179,800 km2, about 2% of China's land area. As the province with the largest economy and fastest development in China, Guangdong accounted for 11% of China's total Gross Domestic Product (GDP) in 201222. The topography of Guangdong consists of highlands along the eastern, western and northern borders, hills and terraces in the middle and plains in the southern estuarine region. It has a humid subtropical climate with ample sunlight, high temperature and abundant precipitation. Such a climate increases the likelihood of photochemical reactions and thus the formation of photochemical smog, further aggravating haze conditions. Air quality in the province has deteriorated sharply owing to its complex topography and climate, rapid industrial development, population explosion and heavy traffic pressure. Situated in south-central Guangdong is the world-famous urban agglomeration, the Pearl River Delta (PRD), comprising 11 mega-cities with Guangzhou, Hong Kong and Macau at its core. The PRD region boasts the fastest economic growth and population increase in the world. It is also one of the areas facing the severest air pollution. Industrial emissions, a complicated traffic network and basin topography are the reasons for haze lingering over the region23.

Figure 1

Map showing the location of the 21 administrative divisions (cities) in Guangdong province, China.

The map was produced by Li using SuperMap 6.0.

Five data sources were used in this study: 1) MODIS-retrieved level-2 C051 aerosol data with 10 km spatial resolution at 550 nm wavelength (τ0.55) from 2010 to 2012, obtained from the Level 1 Atmosphere Archive and Distribution System (LAADS) (http://lpdaac.usgs.gov/lpdaac/get_data). The derived aerosol optical thickness (AOT) has been validated against ground measurements, with an accuracy of 0.05 ± 0.2 AOT over land and 0.03 ± 0.05 AOT over oceans24,25; 2) MODIS Normalized Difference Vegetation Index (NDVI) level-3 monthly data at 1 km spatial resolution in the Sinusoidal projection, an indicator of vegetation canopy greenness and composite properties of leaf area; 3) MODIS land cover type data at 500 m resolution, classified by the IGBP global vegetation classification scheme; 4) the Shuttle Radar Topography Mission (SRTM) data set (90 × 90 m) provided by the CGIAR Consortium for Spatial Information (http://srtm.csi.cgiar.org/); 5) several socio-economic indices for each city in Guangdong, obtained from the Guangdong Statistical Yearbook22.

Results

Aerosol spatio-temporal variation

The geographical distribution of the 3-year averaged AOT over Guangdong province is shown in Figure 2(a). The highest values of AOT (τ0.55 > 0.7) occur over the Pearl River Delta region and eastern Guangdong, whereas the lowest values (τ0.55 < 0.4) mainly occur over the north and northeast. The maximum AOT occurs in Foshan, Zhongshan, Jiangmen, Guangzhou, Dongguan and Shenzhen. This maximum is closely related to fine-particle aerosols and is attributed to large emissions from industrial pollution and urban traffic and to secondary aerosol generated by strong photochemical reactions18,26. According to 2012 statistics from the PRD air quality monitoring network (http://www.gdep.gov.cn/hjjce/kqjc/), most of the highest hourly PM10 observations in these cities greatly exceeded the daily mean limit of the national standard (150 μg/m3). A localized pocket in the eastern part of the province, which includes the two important import and export harbors Chaozhou and Shantou, also has high values of τ0.55. Their high GDP growth (second only to cities in the PRD) and industrial development might be one reason, in addition to intense transport activity.

Figure 2

(a) Spatial distribution of 3-year average aerosol optical thickness at 550 nm wavelength over Guangdong; (b) Topographic map of Guangdong obtained from SRTM. The AOT data was downloaded from NASA Level 1 Atmosphere Archive and Distribution System (LAADS) (http://lpdaac.usgs.gov/lpdaac/get_data) and the maps were produced by Li using SuperMap 6.0.

The spatial distribution of AOT in different seasons over Guangdong province is displayed in Figure 3. It is similar to that of the yearly average AOT, with the PRD region having the highest values and the north and northeast the lowest. Because of the coarse spatial resolution of the AOT data and cloudy weather, some null values occur, especially in summer, when frequent cloud cover makes AOT rarely observable. The seasonally averaged AOT is 0.351 in spring, 0.416 in summer, 0.254 in autumn and 0.292 in winter. The maximum AOT appeared in summer in Foshan and the western PRD region, with τ0.55 values over 1.5. This is related to peak precipitation (Table 1) and the correspondingly abundant atmospheric water vapor in that season. Droplet growth in aerosols increases their scattering coefficient and total extinction coefficient, and hence the aerosol optical thickness of the whole atmospheric layer. In addition, photochemical reactions are more active in summer, producing more particles in the atmosphere. The second-highest AOT occurred in spring, owing to peak biomass burning activity in Southeast Asia during that season27. Values of AOT in winter and autumn are relatively low presumably due to the northeast monsoon, controlled by warm and moist air.
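
How the seasonal averages might be computed from gridded retrievals, with null (cloud-contaminated) pixels excluded, can be sketched as follows. This is an illustrative assumption, not the authors' processing chain: the function name `seasonal_mean_aot`, the NaN encoding of null values and the use of monthly grids as input are all hypothetical. Only the season definitions come from the Figure 3 caption.

```python
import numpy as np

# Season definitions follow the Figure 3 caption (month numbers 1-12).
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}


def seasonal_mean_aot(monthly_aot, months):
    """Average AOT grids by season, skipping NaN (no-retrieval) pixels.

    monthly_aot: array of shape (n_months, ny, nx), NaN where AOT is missing.
    months: month number (1-12) of each grid, same length as monthly_aot.
    Returns a dict mapping season name to a (ny, nx) array. A pixel that was
    never observed in a season stays NaN in that season's map.
    """
    monthly_aot = np.asarray(monthly_aot, dtype=float)
    months = np.asarray(months)
    result = {}
    for name, season_months in SEASONS.items():
        selected = monthly_aot[np.isin(months, season_months)]
        valid = ~np.isnan(selected)
        count = valid.sum(axis=0)                       # valid retrievals per pixel
        total = np.where(valid, selected, 0.0).sum(axis=0)
        result[name] = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return result


# Tiny made-up example: three monthly 1 x 2 grids (not data from the paper).
grids = np.array([[[0.4, np.nan]], [[0.6, np.nan]], [[0.2, 0.3]]])
maps = seasonal_mean_aot(grids, months=[6, 7, 12])
```

Counting valid retrievals per pixel, rather than averaging over all months, keeps sparsely observed summer pixels from being biased toward zero.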

Table 1 Total precipitation and temperature during 2010–2012 in Guangdong

Figure 3

Seasonal variations of multi-year average AOT over Guangdong during 2010–2012.

(a) Spring (March to May); (b) Summer (June to August); (c) Autumn (September to November); (d) Winter (December to February). The map was produced by Li using SuperMap 6.0.

Model analysis on factors influencing AOT

Unary linear regression

To identify the factors that contribute to aerosol optical thickness, we collected data for the 21 cities in Guangdong for 2010 and 2011. These data included elevation, vegetation index, the fraction of urbanized land and several socio-economic parameters, such as population, Gross Domestic Product (GDP), industrial and agricultural outputs and vehicle density. All the socio-economic parameters of the cities were first divided by the corresponding city area and then transformed with the natural logarithm to put them on a comparable scale for the statistical analysis. We analyzed their correlation with AOT using unary linear regression. As the results in Figure 4 show, the annual mean AOT in Guangdong correlates negatively with ln(elevation) and NDVI. It correlates positively with the percentage of urbanized land, population density, GDP, Secondary Industrial (SI) output, Tertiary Industrial (TI) output, industrial output, construction output, civil vehicle (CV) density and private vehicle (PV) density. It shows no significant correlation, however, with Primary Industrial (PI) output, i.e. agricultural activity.
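
The preprocessing and fitting steps described above (divide by area, take the natural logarithm, fit a straight line) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow, which used Origin: the helper names `density_log` and `unary_fit` are hypothetical and the four-city numbers are made up.

```python
import numpy as np


def density_log(values, area_km2):
    """Divide a city-level total by the city area, then take the natural log."""
    values = np.asarray(values, dtype=float)
    area_km2 = np.asarray(area_km2, dtype=float)
    return np.log(values / area_km2)


def unary_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus its R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)           # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)      # total variation
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up values for four hypothetical cities (not data from the paper).
gdp = [5000.0, 1200.0, 300.0, 80.0]           # arbitrary currency units
area = [2000.0, 3000.0, 6000.0, 16000.0]      # km^2
aot = [0.78, 0.61, 0.42, 0.27]                # annual mean tau_0.55

slope, intercept, r2 = unary_fit(density_log(gdp, area), aot)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.3f}")
```

The same `unary_fit` call applies unchanged to NDVI or ln(elevation) as the predictor; only the sign of the fitted slope is expected to differ.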

Figure 4

Relationship between τ0.55 and NDVI, elevation, urbanized land fraction and several socio-economic variables in the 21 Guangdong cities during 2010–2011.

The analysis was performed with Origin software.

The R2 values for NDVI and elevation with τ0.55 are 0.782 and 0.731, respectively. The annual mean AOT rises with decreasing elevation and NDVI. High terrain impedes the horizontal diffusion of air pollutants; for example, Shaoguan (417 m) has a τ0.55 of 0.26, whereas Foshan (26 m) and Zhongshan (23 m) have values of 0.80 and 0.77. In addition, areas with denser vegetation are less influenced by human activities and absorb more atmospheric particles, and thus suffer less severe air pollution.

We also calculated the relationship between the percentage of urbanized land in the 21 cities and τ0.55, obtaining an R2 of 0.43. This low correlation is caused by two cities, Jiangmen and Zhuhai, which have small urbanized land areas but high values of AOT. Located close to the sea, these two cities have a warm, wet climate which facilitates droplet formation. Intensive temperature inversion caused by increased energy consumption and changes in the underlying surface type, as well as decreasing wind speed owing to the increased urbanized area and density, hinders the dilution and dispersion of air pollutants within the city group28.

Correlations between socio-economic factors and AOT proved to be high, except for primary industrial output (R2 = 0.06). Such positive correlations suggest a significant influence of anthropogenic emissions on aerosols in Guangdong. R2 between AOT and population density is 0.588, while GDP and secondary industrial output are strongly related to AOT, with R2 values of 0.732 and 0.738, respectively. Fast economic development, especially in the metallurgy, manufacturing and energy sectors (R2 = 0.74 between industrial output and τ0.55) and in construction (R2 = 0.616), leads to heavy emissions of industrial gases and particulates, such as sulfur oxides, nitrogen oxides and smoke dust. The tertiary industry, involving transportation, post, hotels and catering services, is another important factor affecting aerosols, with an R2 of 0.682. Vehicle density is highly related to τ0.55, with R2 values of 0.732 and 0.735 for civil and private vehicle density, respectively.

SOM Analysis

A self-organizing map is used to describe the relationship between AOT and its contributing factors intuitively. NDVI, elevation, fraction of urbanized land and the socio-economic variables were chosen to make the component planes in Figure 5, with aerosol optical thickness included as one plane. The input is a 13 × 42 matrix (42 samples of 13 elements), which is mapped onto 13 component planes, each with 10 × 10 neurons. The learning rate, neighborhood function and neighborhood radius were left at their default values in the training process.

Figure 5

(a) SOM visualization of AOT and its contributing factors with 13 component planes. (b) Unified distance matrix (U-matrix) showing the distance between map neurons. Darker colors represent higher values in the weight planes and larger distances in the U-matrix. The map was produced with the SOM toolbox in MATLAB 7.0.

Figure 5 shows the weight plane for each variable and visualizes the weights that connect each input to each of the neurons, with darker colors representing higher values of the variables. Similar connection patterns between inputs suggest a high correlation. At the bottom right is the U-matrix, which indicates the distances between neighboring neurons and identifies the cluster structures of the map. Darker colors represent larger distances and lighter colors smaller ones. Low AOT values (<0.53) in the top left corner of the plane are linked to low values of ln(population density) (<6.8), ln(GDP/area) (<6), ln(SI/area) (<5.4), ln(TI/area) (<6), ln(industry/area) (<5.2), ln(construction/area) (<2.6), ln(civil vehicle density) (<4.13) and ln(private vehicle density) (<4), but to high NDVI (>0.58) and ln(elevation) (>4.59). Conversely, high AOT values in the bottom right are associated with low NDVI and elevation and high values of population density, economic output and vehicle density. The corresponding units in the U-matrix reflect the close distance between each unit and its neighbors. We conclude from Figure 5 that τ0.55 is negatively correlated with NDVI and elevation and positively correlated with population density, GDP, secondary industry, tertiary industry and civil and private vehicle density. In contrast, AOT does not show an obvious relationship with urbanized land fraction or primary industry.

Multiple linear regression

We carried out multiple linear regression (MLR) between AOT and NDVI, elevation and the socio-economic factors (excluding urbanized land fraction and primary industry). AOT is the dependent variable; NDVI and elevation are included as independent variables in every model, and one socio-economic factor at a time is added as a further independent variable. The Variance Inflation Factor (VIF) was also calculated to indicate multicollinearity among the independent variables. The results are displayed in Table 2. The regression coefficients are consistent with those from unary linear regression (ULR), with R2 rising to between 0.78 and 0.83. The VIF values of NDVI are around 10, while those of the other variables are less than 10. This means that NDVI is correlated to some extent with the other factors: elevation and human activities affect vegetation coverage, and areas with low elevation or high population density tend to have low values of NDVI. Since vegetation, elevation and human activities affect aerosols differently, all the variables are kept in the predictive models. The F tests show that the models have statistically significant predictive capability, with all of the corresponding P values less than 0.0001. The higher coefficients of GDP and vehicle density also indicate the greater effects of economic development and vehicles on aerosols. We evaluated the accuracy of the models by computing the predicted τ0.55 of the 21 cities in 2012; the root mean square error (RMSE) between the actual and predicted values ranges from 0.075 to 0.098, reflecting the stability of the models.
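
The three quantities used in this subsection (MLR coefficients, VIF and RMSE on held-out data) can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' procedure, which was carried out in Origin: the function names `ols`, `vif`, `predict` and `rmse` are hypothetical and the data are synthetic.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares with an intercept; returns (coefs, R^2).

    coefs[0] is the intercept; coefs[1:] follow the column order of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coefs
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coefs, r2


def vif(X):
    """Variance inflation factor per column: 1 / (1 - R_j^2), where R_j^2 is
    obtained by regressing column j on all other columns.
    Undefined (division by zero) if a column is perfectly collinear."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        _, r2_j = ols(np.delete(X, j, axis=1), X[:, j])
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)


def predict(coefs, X):
    """Apply fitted coefficients to new predictor rows."""
    return coefs[0] + np.asarray(X, dtype=float) @ coefs[1:]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-in data: columns play the role of NDVI, ln(elevation) and
# one ln(socio-economic density); 42 rows mimic 21 cities x 2 years.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(42, 3))
y_train = 0.5 - 0.3 * X_train[:, 0] - 0.05 * X_train[:, 1] + 0.04 * X_train[:, 2]
coefs, r2 = ols(X_train, y_train)

# Held-out rows stand in for the 2012 validation year.
X_test = rng.normal(size=(21, 3))
y_test = 0.5 - 0.3 * X_test[:, 0] - 0.05 * X_test[:, 1] + 0.04 * X_test[:, 2]
error = rmse(y_test, predict(coefs, X_test))
```

Fitting on one period and computing RMSE on another, as in the paragraph above, tests whether the coefficients generalize rather than merely reproduce the training data.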

Table 2 Multiple linear regression models of annual AOT and contributing factors

Discussion

The spatial distribution of multi-year average aerosol optical thickness over the Guangdong province shows some interesting features. The highest values of τ0.55 mainly occur over the rapidly developing and populated Pearl River Delta region, especially in Foshan, Zhongshan, Jiangmen, Guangzhou, Dongguan and Shenzhen, next to the eastern part of Guangdong, where the import and export hubs, Chaozhou and Shantou, are located. Lower values occur in the plains in the south and the lowest ones in the mountainous north and northeast. Inspection of the Guangdong topographic map in Figure 2(b) illustrates that elevation impacts the spatial distribution of AOT. The overall topography goes from sea level in the south to elevated terrain in the north. The AOT over the middle and southern areas correspondingly is much higher than that in the north. Low values (τ0.55 < 0.4) occur in mountains and high hills, higher values in the plains in the south and the highest ones primarily on terraces in the middle. Additionally the high humidity over the coastal regions leads to droplet formation and thus to a larger volume of fine particles and higher value of AOT29.

Seasonally, AOT in Guangdong reaches its maximum in summer (0.416) and its minimum in autumn (0.254). The distributions of AOT throughout the four seasons are spatially similar, with the PRD region consistently having the highest values and the mountainous north-northeast region the lowest. The differences in AOT between the seasons are related to precipitation, temperature and monsoon effects. Seasonal AOT variations do not run parallel to low visibility, which mostly occurs in autumn and winter. The reason is that AOT measures the extinction of the whole vertical aerosol column, whereas ground visibility represents the horizontal extinction of the aerosol close to the surface and is affected by boundary layer height.

Linear regression and self-organizing maps are applied to quantitatively analyze the factors contributing to aerosol formation. Unary linear regression in Figure 4 shows that aerosol optical thickness has a strong linear correlation with vegetation index (R2 = 0.782) and topography (R2 = 0.731), in accordance with previous studies in other regions30,31,32. The R2 relating urbanized land percentage to τ0.55 is only 0.43, mainly because of the data from Jiangmen and Zhuhai. R2 between AOT and population density is 0.588, with areas of high and low τ0.55 corresponding to densely populated cities (Shenzhen and Dongguan) and sparsely populated cities (Shaoguan and Heyuan), respectively. The R2 values between AOT and the socio-economic factors, with the exception of primary industrial output, are around 0.70. This result is consistent with the study by Gang et al.33, which shows that regions with high GDP, large populations and large urban areas tend to have high PM2.5 concentrations over China, especially in areas with fast economic development and population growth. Fast industrial and construction development leads to extensive air pollution and exhaust emissions, such as sulfur oxides, nitrogen oxides and smoke dust. These are the principal sources of secondary aerosols and cause new particulate pollution through photochemical reactions and other pathways. Cars and trucks also produce primary and secondary aerosols in the atmosphere, including particulate matter, hydrocarbons, nitrogen oxides, carbon monoxide and micrometer-sized solid particles34,35. Thus, the increase in private vehicles is accompanied by aerosol generation and a concomitant deterioration of air quality, which is also confirmed by previous research in the PRD showing that industrial, on-road mobile and power plant sources are major contributors to SO2, PM10 and PM2.536. However, primary industry, consisting of agriculture, forestry, animal husbandry and fishery, contributes insignificantly to aerosols. This implies that occasional biomass burning has little bearing on the distribution of multi-year average aerosols in Guangdong, as can also be inferred from the negative relation between NDVI and AOT.

The three most relevant socio-economic variables are GDP, industrial output and vehicle density, each with R2 of about 0.73 to 0.74, indicating the great influence of rapid industrial and vehicular growth on aerosols. The results of unary linear regression demonstrate the crucial role of anthropogenic emissions in aerosol formation in Guangdong province. These relationships are visualized and validated using SOM. The ln(GDP/area), ln(industry/area) and ln(vehicle density) planes in Figure 5 are more similar to the AOT pattern than those of the other variables, revealing that aerosol distribution is most closely related to GDP, industry and number of vehicles.

Multiple linear regressions between AOT and NDVI, elevation and the socio-economic factors show higher R2 values (0.78–0.83), suggesting that MLR is more suitable than ULR for modeling the relationship between the annual average AOT and its contributing factors, because the aerosol distribution is influenced by complicated interactions among a variety of elements. The root mean square errors between observed and predicted values are below 0.1, indicating the robustness of the analysis. The most critical contributors to aerosol distribution in Guangdong are NDVI, economic and industrial development and vehicle density, which points to the significance and necessity of reforestation, industrial pollution prevention and strict emission controls on vehicles.

Methods

Self-organizing Maps (SOM)

A Self-organizing Map (SOM) is a type of artificial neural network that uses unsupervised learning to produce a low-dimensional, discretized representation of the training samples. It uses a neighborhood function to preserve the topological properties of the input space37, and thus describes a mapping from a higher-dimensional input space to a lower-dimensional map. A vector is placed onto the map by finding the node whose weight vector is closest to the data-space vector. SOM has been applied to visualize simulation results and to identify the relationships between input parameters38,39,40. The component plane shows the value of a variable in each map unit. The unified distance matrix (U-matrix) shows the distance between map neurons, visualized by gray-shade scaling. In this study, the SOM toolbox for the MATLAB 7.0 computational environment, developed by the Laboratory of Computer and Information Science at Helsinki University of Technology, was used for SOM simulation. The toolbox contains functions for creating, visualizing and analyzing self-organizing maps. The input is a matrix stored as a .csv file. After the number of neurons, the learning rate, the neighborhood function and the neighborhood radius have been set, the network is trained to learn the topology and distribution of the input samples so that the component planes and neighbor distances can be obtained. The toolbox is available free of charge at http://www.cis.hut.fi/projects/somtoolbox/.
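
To make the training procedure concrete, the following is a minimal NumPy sketch of a SOM with sequential updates, a Gaussian neighborhood and a simplified U-matrix. It is not the SOM toolbox used in this study and will differ from it in details such as lattice shape, training algorithm, default parameters and U-matrix layout. The function names `train_som` and `u_matrix`, the decay schedules and the random input are all illustrative assumptions. Samples are stored in rows here, i.e. the transpose of the 13 × 42 input described in the Results.

```python
import numpy as np


def train_som(data, rows=10, cols=10, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM; returns weights of shape (rows, cols, n_features).

    weights[:, :, k] is the component plane of feature k.
    data: (n_samples, n_features) array, ideally scaled per feature beforehand.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    sigma0 = max(rows, cols) / 2.0 if sigma0 is None else sigma0
    # Initialise every neuron from a randomly chosen training sample.
    picks = rng.integers(n_samples, size=rows * cols)
    weights = data[picks].reshape(rows, cols, n_features).copy()
    grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood radius
        x = data[rng.integers(n_samples)]
        # Best-matching unit: the neuron whose weight vector is closest to x.
        dist = np.linalg.norm(weights - x, axis=2)
        br, bc = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighborhood on the map grid, centred on the BMU.
        h = np.exp(-((grid_r - br) ** 2 + (grid_c - bc) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights


def u_matrix(weights):
    """Mean distance from each neuron to its 4-connected grid neighbors."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            d = [np.linalg.norm(weights[r, c] - weights[rr, cc])
                 for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if 0 <= rr < rows and 0 <= cc < cols]
            u[r, c] = np.mean(d)
    return u


# Random stand-in for 42 samples of 13 variables (not data from the paper).
samples = np.random.default_rng(1).random((42, 13))
som_weights = train_som(samples)
umat = u_matrix(som_weights)
```

Two variables whose component planes `som_weights[:, :, i]` and `som_weights[:, :, j]` show similar patterns are likely to be positively correlated, which is how Figure 5 is read.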

Linear regression analysis

Linear regression analyzes the relationship between a dependent variable and one or more independent variables. We quantitatively analyzed the factors influencing the AOT distribution by unary linear regression (ULR) as well as multiple linear regression (MLR). The regression relationships between AOT and the contributing factors were computed and evaluated with Origin 7.5 software.
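
For reference, the two model forms and the associated diagnostics can be written as below. The notation is an assumption introduced here for illustration: x is any single predictor, E denotes the elevation term, s is one area-normalized, log-transformed socio-economic factor, and R_j^2 is the R2 obtained when predictor j is regressed on the other predictors. The exact parameterization used in Table 2 may differ.

```latex
\begin{align}
\text{ULR:}\quad \tau_{0.55} &= a + b\,x + \varepsilon \\
\text{MLR:}\quad \tau_{0.55} &= \beta_0 + \beta_1\,\mathrm{NDVI} + \beta_2\,E + \beta_3\,s + \varepsilon \\
R^2 &= 1 - \frac{\sum_{i}(\tau_i - \hat{\tau}_i)^2}{\sum_{i}(\tau_i - \bar{\tau})^2} \\
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2} \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\tau_i - \hat{\tau}_i)^2}
\end{align}
```

Here τ_i and τ̂_i are the observed and predicted AOT for sample i, τ̄ is the mean observed AOT and n is the number of samples.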