SO2 Emissions in China – Their Network and Hierarchical Structures

SO2 emissions have various harmful effects on the environment and human health. China contributes significantly to global SO2 emissions, so it is necessary to study SO2 emissions in China in great detail in order to lay the foundation for policymaking to improve environmental conditions. Network analysis is used to analyze the SO2 emissions from the power generation, industrial, residential and transportation sectors in China for 2008 and 2010, using recently available data from 1744 ground surface monitoring stations. The results show that SO2 emissions from the power generation sector were highly individualized, forming small clusters; emissions from the industrial sector underwent an integration process, with a large cluster containing 1674 places covering all industrial areas in China; emissions from the residential sector were not affected by time; and emissions from the transportation sector underwent significant integration. A hierarchical structure is obtained by further combining SO2 emissions from all four sectors. It is potentially useful for finding similar patterns of SO2 emissions, which can provide information for understanding the mechanisms of SO2 pollution and for designing different environmental measures to combat SO2 emissions.

industrialized and heavily populated cities are located. Ground surface SO2 observations are less subject to long-range transport, which usually takes place above 2000 meters in the free troposphere 46.
Few studies have addressed SO2 emissions in China beyond 2008 30,47. The year 2008 is a turning point in this regard: coal consumption rose from 1271 Mt in 2000 to 2740 Mt, growing at 10.1% annually before 2008, and power generation contributed around 65% of it 47. In 2008, the estimated total anthropogenic SO2 emissions in China were about 31.3 Tg 43. On the other hand, the Beijing Olympic Games in 2008 had a large impact on reducing SO2 emissions 48,49.
Chinese government statistical reports and yearbooks carry SO2 emissions from both fuel combustion and non-combustion sources 47. Recently, very detailed ground surface observations of SO2 concentrations became available as MIX, the mosaic Asian anthropogenic emission inventory for 2008 and 2010 50. MIX documents the emissions from 40.125° E to 179.875° E and from 20.125° S to 89.875° N on a 0.25° × 0.25° (∼ 25 km × 25 km) grid, including 2168 monitoring stations in China that collect monthly emission data from residential, industrial, power generation, transportation and agricultural sources 50.
These two periods in MIX are important because (i) the Chinese government required all coal-fired power plants to install flue gas desulfurization (FGD) devices in 2005 51, so an analysis of these two periods should reflect the situation after this regulation was implemented; for example, FGD reached an operation rate of 97% in July 2007 in Jiangsu province 32; (ii) the proportion of FGD systems reached 81.7% in 2008 52, although FGD penetration had been planned to reach 71% in 2008 and 73% in 2010 53; and (iii) it was estimated that half of the 3.1 billion tons of coal consumed in China in 2010 was attributed to power generation 54.
To date, a number of models have been applied to studying SO2 emissions, whereas network analysis has not yet been used, to the best of our knowledge. Nevertheless, this approach looks promising, not only because each model has its own advantages and disadvantages and a network can analyze spatial and temporal relationships simultaneously, but also because the SO2 concentration reflects the emission level in its surrounding area 47. Additionally, network analysis is suited to studying SO2 emissions because it represents interrelationships as graph nodes and edges. For instance, in a transport network nodes can be cities and edges the roads between them; in a protein interaction network nodes can be proteins and edges their interactions; and in a social network nodes can be people and edges their friendships. In the context of SO2 emissions, we define a node as an observation station, and an edge between two nodes as the correlation between their two SO2 emission profiles. The use of correlation to define an edge between nodes is common in other research fields, such as gene co-expression networks 55. Important characteristics of SO2 include: (i) SO2 has a shorter lifetime than sulfate, (ii) SO2 aloft in the free troposphere has a longer lifetime than SO2 at the ground surface 42, and (iii) SO2 emission from the transportation sector is a mobile source emission 56. These characteristics lay the foundation for network analysis of SO2 emissions. Because the correlation between two SO2 emission profiles must be meaningful, a short-lived pollutant will give a more sensible correlation than a stagnant pollutant. Hence, this study applies network analysis to explore SO2 emissions in China in 2008 and 2010.
Based on the results of network analysis, we finally build the hierarchical structure of SO2 emissions from the power generation, industrial, residential and transportation sectors in order to obtain an integrated view.

Results and Discussion
SO2 emission from power generation sector. Figure 1 shows the network of SO2 emission from the power generation sector in 2008 (upper panel) and 2010 (lower panel). In figures of this type, a symbol represents a monitoring station with its code, and 31 colors denote the 22 provinces, 4 municipalities and 5 autonomous regions in China. A line between two symbols indicates that the SO2 emission profiles at the two monitoring stations are well correlated. A cluster groups symbols that are densely connected to each other within the cluster but sparsely connected to symbols in other clusters. At first glance, network analysis reveals many isolated nodes, which occupy the middle and lower parts of both panels in Fig. 1: 782 and 816 SO2 emission profiles have no good correlation with those of any other place in 2008 and 2010, respectively.
Because the SO2 emission profiles in isolated places resemble no other SO2 emission profile, the higher number of isolated places in 2010 would suggest an effect of FGD implementation, which could in theory eliminate common patterns in SO2 emission profiles and lead local characteristics to dominate them 32,51-54. Figure 2 shows the network of SO2 emissions from the power generation sector without those isolated places in 2008 (upper panel) and 2010 (lower panel). Technically, Fig. 2 is a subset of Fig. 1 for better visualization. An important feature in Fig. 2 is that symbols of the same color do not gather in a single cluster but spread over two or more clusters. For example, the lime green symbols at the upper-right corner of the upper panel represent places in Fujian province; however, a small cluster with lime green symbols can also be found in the middle of the upper panel. As a result, the SO2 emissions from the power generation sector in Fujian can be primarily divided into two clusters, indicating that each cluster has its own characteristics and requires different measures to control the emission, even within the same province.
In social network analysis, the node with the most edges is the central point, from which information propagates. Applying this concept to Fig. 2, we found that the most connected nodes came from Sichuan province (the first left-middle cluster with green yellow symbols in the upper panel and the third left-upper cluster with green yellow symbols in the lower panel) and Guangdong province (the fourth right-upper cluster with blue symbols in the upper panel and the second left-upper cluster with blue symbols in the lower panel). Although these nodes do not represent major power generation places, their geographical locations could be the determining factor for their similar SO2 emission profiles.
SO2 emissions from industrial sector. Figure 3 illustrates the network of SO2 emissions from the industrial sector in 2008 (upper panel) and 2010 (lower panel). As can be seen, there are fewer isolated places in Fig. 3 than in Fig. 1, so the SO2 emissions from the industrial sector have more common features than those from the power generation sector. This implies that it is somewhat easier to implement a common measure to reduce the SO2 emissions from the industrial sector than those from the power generation sector.
For 2008, network analysis discovers 23 clusters. Among them, 33 isolated places are classified as a single cluster and presented at the bottom of the upper panel. The 8 clusters that have no connection with outside clusters are placed at the right-hand periphery of the upper panel. Clusters A to N then form one large cluster, which has no connection with any node in the peripheral clusters. Within this large cluster, the number of connections between clusters A to N varies greatly. For example, cluster M connects with cluster G through a single node (54284, Donggang) and with cluster B through a single node (54063, Fuyu), but there are many connections among clusters A, B, H and K. A cluster does not necessarily contain places from the same province, because network analysis examines the correlation between the SO2 emission profiles of any two places.
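The distinction drawn here between the large cluster and the peripheral clusters corresponds to what graph theory calls connected components: groups of nodes with no edge to any node outside the group. The following minimal Python sketch illustrates the idea on a toy graph. It is not the procedure used in the study, which relied on igraph and Pajek; the node names and edges are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Group nodes into components that share no edge with one another."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(sorted(component))
    # Largest component first, isolated nodes (size 1) last.
    return sorted(components, key=len, reverse=True)


# Hypothetical stations and correlation edges, for illustration only.
nodes = ["n1", "n2", "n3", "n4", "n5", "n6", "n7"]
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n5", "n6")]
components = connected_components(nodes, edges)
print(components)
```

In this toy graph, n1 to n4 play the role of the large cluster, n5 and n6 form a peripheral cluster, and n7 is an isolated place.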
Once again, we look at the node with the most edges. Strikingly, the places whose SO2 emission profiles correlate best with other places in both panels are in the northern part of Anhui province, although not many large industrial enterprises are located there. Because the northern part of Anhui province is the terminus of the Great North China Plain, SO2 could accumulate in this region owing to strong winds from North China. This explanation is reasonable because the long-range transport of SO2 takes place above 2000 meters in the free troposphere 46.
For 2010, network analysis discovers 18 clusters. Among them, 31 isolated places are considered a single cluster at the bottom of the lower panel, and 6 clusters are presented at the right-hand periphery of the lower panel. The remaining 11 clusters form one large cluster, which has no connection with outside nodes. This large cluster contains 1674 places across China, which is realistic because it includes almost all industrial areas across China. Naturally, a cluster does not exclusively include places from a single province. For example, cluster J includes not only places in Jilin province (black symbols) but also places in Heilongjiang province (teal blue symbols), so these places have similar emission patterns, which is plausible because the two provinces are adjacent.
Let us take a close look at two clusters. Cluster A is characterized as follows: (i) it contains 100% of the monitoring stations in Fujian, 98.51% in Jiangxi, 88.41% in Hunan and 73.44% in Guangdong, four geographically contiguous provinces; (ii) it contains 88.89% of the monitoring stations in Henan and 64.94% in Shanxi, two geographically contiguous provinces; (iii) it contains 22.22% of the monitoring stations in Ningxia and 20.59% in Gansu, which are also geographically contiguous; and (iv) it contains 25% of the monitoring stations in Shanghai, 15.79% in Anhui, 11.67% in Hubei, 5.1% in Hebei, 3.16% in Inner Mongolia, 2.6% in Shandong and 1.79% in Jiangsu, which form geographic corridors between the most represented provinces; for example, Anhui is located between Henan and Jiangsu, and between Henan and Zhejiang. Cluster B contains (i) 88.89% of the monitoring stations in Guizhou and 85.19% in Chongqing, which are geographically contiguous, together with 2.04% of the monitoring stations in Yunnan and 1.72% in Sichuan; (ii) 82.46% of the monitoring stations in Anhui and 77.78% in Zhejiang, which are geographically contiguous, together with 14.29% of the monitoring stations in Jiangsu, 1.49% in Jiangxi and 1.02% in Hubei; (iii) 66.67% of the monitoring stations in Hainan and 25% in Guangdong, which are geographically contiguous; and (iv) 20.9% of the monitoring stations in Shaanxi, 11.11% in Ningxia, 10.14% in Hunan, 2.94% in Gansu and 2.6% in Shanxi, five provinces that form a geographic belt. These clusters group places with similar patterns of SO2 emissions, suggesting that environmental measures could be designed with the composition of these clusters in mind.
SO2 emissions from residential sector. Figure […]
[…] In 2010, clusters A, B, C, D and F interweave, with many connections between clusters, implying a high level of transportation between them.
Indeed, these five clusters include 100% of the monitoring stations in Anhui, 100% in Beijing, 100% in Chongqing, 93.75% in Fujian, 94.12% in Gansu, 1 […]
SO2 emissions characterized from four sectors. To give a balanced overview, Figure 6 brings together the SO2 emissions from all four sectors in 2010, expressed as cluster membership, using a heatmap and hierarchical cluster analysis. The hierarchical cluster analysis further defines the patterns of SO2 emissions, because network analysis can stratify SO2 emissions according to their similarity but cannot define the hierarchical structure among clusters. Reading the right-hand side against the dendrogram on the left-hand side, we can see that the SO2 emissions from the residential and transportation sectors are the most similar; they then merge with the SO2 emissions from the industrial sector, and finally with those from the power generation sector. Clearly, the SO2 emission from the power generation sector differs from the others. Because 1744 monitoring stations are included in the analysis, the labels overlap at the bottom of the figure, but their hierarchical relationship is visible at the top of Fig. 6 (the hierarchical relationships of the 1744 monitoring stations can be found in Table A7 in Supplementary information files). For example, one initial hierarchical relationship begins with the merging of Dangshan (Anhui, 58015) and Funan (Anhui, 58202), followed by Mianchi (Henan, 57063). As another example, Runan (Henan, 57197) merges with Xiaoxian (Anhui, 58016), and then with Bozhou (Anhui, 58102), which itself comes from the merging of Guangshan (Henan, 57299) and Bozhou (Anhui, 58102) (Table A7 in Supplementary information files).
This hierarchical structure is potentially useful for finding similar patterns of SO2 emissions, which can provide information for understanding the mechanisms of SO2 pollution and for designing different environmental measures to combat SO2 emissions.
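To make the idea of merging stations by cluster membership concrete, the following Python sketch runs a small agglomerative clustering on a toy membership table. It is an illustration under stated assumptions, not the authors' procedure: the text only says that hierarchical cluster analysis was done in R, so the distance (fraction of sectors in which two stations carry different cluster labels), the average linkage, the station names P to S and the labels are all assumptions made here.

```python
from itertools import combinations


def mismatch(u, v):
    """Fraction of sectors in which two stations carry different cluster labels."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def average_distance(group_a, group_b, membership):
    """Average-linkage distance between two groups of stations."""
    dists = [mismatch(membership[a], membership[b]) for a in group_a for b in group_b]
    return sum(dists) / len(dists)


def agglomerate(membership):
    """Repeatedly merge the two closest groups; return the merge history."""
    groups = {name: [name] for name in membership}
    merges = []
    while len(groups) > 1:
        left, right = min(
            combinations(sorted(groups), 2),
            key=lambda pair: average_distance(groups[pair[0]], groups[pair[1]], membership),
        )
        height = average_distance(groups[left], groups[right], membership)
        merges.append((left, right, height))
        groups[f"{left}+{right}"] = groups.pop(left) + groups.pop(right)
    return merges


# Hypothetical cluster labels per station, ordered as
# (power generation, industrial, residential, transportation).
membership = {
    "P": ("a", "A", "A", "A"),
    "Q": ("a", "A", "A", "A"),
    "R": ("b", "A", "A", "B"),
    "S": ("c", "D", "C", "E"),
}
merges = agglomerate(membership)
for left, right, height in merges:
    print(f"{left} merges with {right} at height {height:.2f}")
```

The merge history is what a dendrogram draws. This naive loop is only suitable for a handful of stations; for 1744 stations a library routine would be the practical choice.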
In this study, we add hierarchical structure analysis to the study of SO2 emissions, which reveals the interrelationship between sectors and emissions. To some extent, SO2 emission networks are similar to PM2.5 emission networks, which is reasonable because PM2.5 formation is closely connected to SO2 emission. Therefore both studies can give us more general patterns of hazardous emissions in China.
Scientific Reports | 7:46216 | DOI: 10.1038/srep46216

Conclusions
To the best of our knowledge, this is the first study to analyze SO2 emissions in China using network analysis, and the results demonstrate the heterogeneity of SO2 emissions from different sectors and their dynamic changes. The obtained clusters and connectivity provide clear views of SO2 emission patterns from various places across China. Together with the hierarchical structure, we can trace similar emission patterns in detail, which sheds new light on the mechanisms of SO2 pollution. In particular, such analyses can help in making policy decisions for different regions according to their patterns of SO2 emissions.

Materials and Methods
Data. The grid in MIX is smaller than that used previously 52, so several monitoring stations may fall in the same grid cell. To keep a single measurement per grid cell, only one monitoring station per cell was selected for network analysis. Incomplete datasets were also excluded from our analysis.
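The selection step described here can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the text does not say which station was kept when several share a cell, so the sketch keeps the first complete one it encounters, and it assumes that cell edges lie on multiples of 0.25° and that "incomplete" means a profile with fewer than 12 monthly values or a missing value. Station codes and coordinates are hypothetical.

```python
from math import floor

GRID_DEG = 0.25  # MIX grid resolution in degrees, as stated in the text


def grid_cell(lon, lat, size=GRID_DEG):
    """Index of the grid cell containing a point (assumed edges on multiples of `size`)."""
    return (floor(lon / size), floor(lat / size))


def select_stations(stations, months=12):
    """Keep complete profiles only, and at most one station per grid cell."""
    chosen = {}
    for code, lon, lat, profile in stations:
        complete = len(profile) == months and all(v is not None for v in profile)
        if not complete:
            continue
        # Assumption: the first complete station seen in a cell represents that cell.
        chosen.setdefault(grid_cell(lon, lat), code)
    return sorted(chosen.values())


# Hypothetical records: (station code, longitude, latitude, monthly profile).
stations = [
    ("A001", 116.30, 39.90, [1.0] * 12),
    ("A002", 116.40, 39.95, [2.0] * 12),           # same cell as A001
    ("A003", 121.47, 31.23, [3.0] * 12),
    ("A004", 113.26, 23.13, [4.0] * 11 + [None]),  # incomplete profile
]
selected = select_stations(stations)
print(selected)
```

Here A002 is dropped because it shares a cell with A001, and A004 is dropped because its profile is incomplete.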

Analysis.
As mentioned above, an edge between two nodes denotes a relationship. We therefore use Pearson's correlation as the measure of whether the SO2 emission profiles obtained from two monitoring stations are related. In particular, we consider a Pearson's correlation larger than 0.95 significant, whose root is approximate to 0.92, as a criterion to evaluate a method 57. The igraph R package (http://igraph.org/) and Pajek 58 were used in network analysis. The hierarchical structure was built using hierarchical cluster analysis in R.
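The edge definition above can be illustrated with a short Python sketch. It is not the authors' implementation, which used igraph and Pajek; it only shows how thresholding pairwise Pearson correlations at 0.95 yields edges and isolated nodes. The station codes S1 to S4 and their monthly profiles are made up for illustration, and treating a flat profile as uncorrelated is an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.95):
    """Return (edges, isolated): an edge links two stations whose
    profiles correlate above the threshold."""
    edges = [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if pearson(profiles[a], profiles[b]) > threshold
    ]
    connected = {station for edge in edges for station in edge}
    isolated = sorted(set(profiles) - connected)
    return edges, isolated


# Hypothetical monthly emission profiles (12 values per station).
profiles = {
    "S1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "S2": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],  # S1 scaled by 2
    "S3": [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],        # S1 reversed
    "S4": [5, 1, 4, 2, 6, 3, 5, 1, 4, 2, 6, 3],           # irregular
}
edges, isolated = build_network(profiles)
print(edges)
print(isolated)
```

With these toy profiles only S1 and S2 are linked, while S3 and S4 remain isolated. Note that a strong negative correlation, as between S1 and S3, does not create an edge under this definition.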