Introduction

The ‘Rise of Central China’ Plan has become the cornerstone of Chinese economic development strategy1,2, following the earlier development of coastal regions3. Central areas of China, with their high population densities and developmental potential, are also very attractive sites for industrial development1,4. This transfer of industrial activity from developed to developing regions is already in progress. The scale, temporal-spatial pattern, process and intensity of industrial transfer vary significantly from area to area5,6. Water pollution inevitably accompanies the transfer of manufacturing enterprises between regions7. Because regulation and registration are limited in undeveloped areas, these regions host a large number of unlicensed or unregistered enterprises8,9. Therefore, economic surveys and economic geography, the usual methods for tracing industrial transfer, are not useful10,11.

We tackled the problem of accounting for pollution transfer by identifying the pollutant composition (fingerprint) in water bodies from various regions. The pollutant profiles, with appropriate analysis, reflect the current pollution status of particular water sources and allow the transfer process to be inferred. To ensure that water fingerprint samples were representative of transfer activities, we selected two areas to serve as prototypical export and acceptance regions (Fig. 1). The export region, located in the Yangtze Delta, includes the developed coastal area in Eastern China1. This region has transferred various enterprises in recent years to central areas, such as Henan province4. With the soaring economic development of the past two decades, water bodies have received significantly more organic pollutants12,13. In particular, parts of the Huai River basin in the Henan segment experienced increased organic pollution discharge12. These two areas belong to two main, relatively isolated river basins in China, and their geographic, climatic, economic, developmental and urban-rural differences make them well suited to illustrating the industrial (pollution) transfer process.

Figure 1

Industrial and pollution transfers from Yangtze Delta to Henan in China.

The elliptical broken line denotes the industrial acceptance regions belonging to the Huai River segment in Henan. The red dots A–E represent sampling sites in the HPR (highly polluted region), which correspond to HPRA-HPRE in Fig. 2. The other red dot indicates the sampling site in the LPR (low polluted region) used as the control. The triangular broken line shows a region of the Yangtze Delta, the typical industrial export region. The four grey arrows represent four episodes of industrial transfer from the Yangtze Delta to Henan (see Fig. 2).

Water fingerprints evolve by a process of descent with changes14. Pollutant components of water bodies in different regions and periods originate from the same or different pollution sources; similarities among water fingerprints reflect water-body relationships (Supplementary Information 1)15. The evolutionary process gives rise to diversities of fingerprints among various water bodies15. Similarly, in species evolving from ancestors, random mutations and variation caused by environmental pressure (insertion, substitution and deletion) of biological sequences produce the diversities of species (Supplementary Information 1)16,17. Reconstructing ancestral origin discloses complex relationships among species based on comparison of similarity and variation between existent biological sequences16,17. Tree topology, which powerfully reflects natural history and biological phylogeny18,19,20,21,22, effectively identifies the relationships of the pollution fingerprints in various water bodies23.

In this study, we developed an analytical framework combining water fingerprint and evolutionary analysis, examined the diversities of water fingerprints and temporal-spatial relationships among the water sources and explicitly traced industrial transfer processes and common features between different regions. In addition, we analyzed target compounds in the water sources for identifying possible types of industrial transfer.

Results

Fig. 2 shows the tree constructed for the water fingerprints in various regions. The red (highly polluted regions in Henan), purple (Shanghai), green (Zhejiang) and orange (Jiangsu) branches cluster together in main branches 2 and 3. The blue branches (low polluted regions in Henan) form a separate main branch 1. Thus, fingerprints of water samples in highly polluted regions of Henan were more similar to those of Eastern regions than to those of low polluted regions (control regions) in Henan. This observation confirms the pollution transfer between the industrial export and acceptance regions. The water bodies in industrial export regions therefore serve as a ‘source’ and those in the industrial acceptance regions as a ‘sink’.

Figure 2

Industrial transfer process reconstructed based on water pollution fingerprints.

Branches 1, 2 and 3 (the number labels are next to the relevant branches) denote the three major branches in the tree, which is colored by locality. Locations in the LPR (low polluted region) in Henan are in blue; locations (HPRA-HPRE) in the HPR (highly polluted region) are in red. Water samples from Shanghai, Zhejiang and Jiangsu are in purple, green and orange, respectively. T(a)–T(d) indicate the four estimated industrial transfers. The thicker line of T(d) indicates a faster transfer rate than that of the other three transfers. The red dots and values represent the inferred time points in the industrial transfer process.

In two main branches of the tree (branch 2 and branch 3), the red branches from highly polluted localities A, B, C, D and E in Henan clustered with the water sources from Shanghai, Zhejiang and Jiangsu in four clades: T(a), T(b), T(c) and T(d). Although localities A, B, C, D and E are relatively close to one another (Fig. 1), the water fingerprint samples from these five localities were more closely related to those from water sources in Eastern regions than to each other. The local water fingerprints in localities A–E belonged to four distinct lineages. Since the reconstructed relationships of water pollution fingerprints represent the pollution-origin relationships between different regions (Supplementary Information 1), these pollution lineages are suggestive of four pollution origin episodes. Additionally, each lineage showed the same or a similar pollution origin transferred from industrial export regions to acceptance regions. Therefore, these four episodes are likely to represent independent events, and T(a), T(b), T(c) and T(d) are likely to represent four episodes of pollution (industrial) transfer from Eastern to Central China (Fig. 2).

The transfer rate heterogeneity among these four pollution transfers was analyzed by the relative rate test between the lineages in the tree. T(d) had a relatively faster rate than the other three transfers; however, no significant rate differences were observed between any pair of T(a), T(b) and T(c) (Table 1 and Fig. 2). Therefore, the group [T(a), T(b) and T(c)] had different transfer features from T(d), indicating two different stages of industrial transfer from east to west.

Table 1 Results of relative rate tests for four industrial transfers

Our analysis of the target compounds in the water samples revealed that the water fingerprints from highly polluted regions in Henan and from the Yangtze Delta contained relatively high concentrations of plasticizers, polychlorinated biphenyls (PCBs), polybrominated diphenyl ethers (PBDEs) and sulfonamide or steroid compounds; these pollutants were not found in the low pollution area of Henan (Table 2).

Table 2 Quantitative results of main target compounds in the water fingerprints (μg/L)

Discussion

Based on the specific outcomes from the reconstructed ‘pollution cladogram’ and the rate heterogeneity test for transfer, we can now focus on the process of these four episodes of industrial transfer. In China, the industrial transfer began in 200424. Local economic statistics also indicated that the highly polluted regions in Henan accepted industries on a relatively large scale for the first time during 2004–2005 (Supplementary Information 2). Owing to the global financial crisis after 20084, China experienced significant adjustments and changes in industrial policies and economic strategies. During this period, the developed regions accelerated the pace of industrial transfer. Thus, we hypothesized that T(d), with its faster rate, happened in this more recent economic cycle and was the fourth transfer. The previous three industrial transfers, T(a) through T(c), occurred during 2004–2008, with a constant rate of one episode every sixteen months (Table 1).

The occurrence or evolution of pollution is now extremely rapid in this era of intensive industrialization. Our results also show that pollution transfers vary on a year-to-year scale in emerging economies. The ability to estimate timing of events in industrial transfer processes requires much smaller errors than in biological or linguistic evolution processes. Based on the inferred errors in the rate heterogeneity test (Table 1), each constant rate has an associated uncertainty of ±0.2 year (about two and a half months). This correction can provide a more precise estimation for transfer rates.

The analysis of the industrial transfer process illustrates the need to confirm the conclusions of the ‘pollution transfer tree’ by considering economic data and industrial cycle information. Although these demographic data are limited, they could refine ideas about the main periods of transfer and give valuable clues in tracing the industrial transfer process ongoing in Eastern China. Unlike biological evolutionary trees, which imply a single origin species and subsequent descent, our tree did not reveal any single origin of water fingerprints from these various sources. Nonetheless, because local pollutant releases are relatively steady within the time periods of our observations, the water pollutant compositions and the whole fingerprints provide an overview of pollution features.

Specific pollutant compounds can serve as index compounds or industrial tracers for inferring the nature of industries involved in the pollution. Recycling procedures for electronic wastes (E-waste) including computers, cellular phones and televisions25,26 release PCBs and PBDEs from capacitors or transformers. With the export of computers and electronic industry wastes from developed countries, Eastern China became one of the important E-waste recycling regions with widespread distribution of these compounds in various environmental media and into the general population27. Previously, the electronic industry in Henan was undeveloped and index pollutants for E-waste such as PCBs or PBDEs were uncommon in environmental media27. The transferred PCBs and PBDEs in sink type water bodies strongly suggest that the electronic industry has shifted various activities to the acceptance region.

Plasticizers appear in a wide range of products28. The plasticizers detected in the highly polluted regions of Henan indicated that the plastic industry has also transferred activities from the Yangtze Delta.

Sulfonamide and steroid compounds are intermediate products or metabolites for many pharmaceuticals29. Sulfonamides are a class of antibiotics, and both animals and humans produce and excrete various steroids. In many cases, these compounds are used as tracers for domestic or urban waste water rather than as industrial tracers30,31. However, in the control regions (also in Henan, with populations whose lifestyles are similar to those in the highly polluted regions), sulfonamides and steroids were not detected in water samples (Table 2). Additionally, in the upper reaches of the highly polluted regions in Henan, the target sulfonamides and steroids were not detected in surface or ground waters. These observations suggested that the target sulfonamides and steroids may also act as industrial tracers or index compounds in this particular case. Therefore, the sulfonamides and steroids detected in water samples implied that biomedical industries were also accepted by local areas in Henan.

These categories of industry (E-waste recycling, plastics and biomedicines) all previously existed in Henan. However, the detection of these compounds at relatively high levels in sink type water bodies suggested that these types of industry had expanded their presence in acceptance localities in Henan.

Industrial enterprises, with their associated pollution, have transferred from the developed regions and are invading inland rural areas in China. This demographic shift is inducing serious pressures on vulnerable rural ecology, the environment and public health. Our pollution-transfer-tree unraveled the complexities of the industrial transfer process and provided results to enhance the scientific bases for pollution control and environmental protection.

Based on projections of growth and westward movement of industries, we can predict future industrial transfer trends. For instance, it was estimated that Henan will experience unprecedented opportunities for development during the 12th Five-Year Plan (2011–2015)32. Based on our observation of faster transfer after 2008, we expect a wave of industrial (pollution) transfer from the Yangtze Delta to Henan at more than three times the rate in the coming four years. Another value of our tree analysis is the ability to trace industrial transfer processes and uncover transferred industry types. For example, two kinds of industrial transfer process can be traced separately: from developed countries to developed regions in China, and from China's developed regions to its undeveloped regions. Analyzing results in this manner can give valuable information for comparing the types, dynamics, effects and patterns of intra- and inter-country pollutant transfers.

The government could acquire sufficient pollution transfer information by monitoring water pollution fingerprints in a regular, long-term manner covering more regions. Such a process would outline a more integrated, macro-scale image of industrial transfer over a longer time scale within the whole country, and even between countries, independently of the economic survey. This expanded monitoring can supplement the economic survey and even play a dominant role if the monitoring data are more solid and convincing.

China, suffering from regional imbalance and experiencing unprecedented development in multiple geographic regions, provides a unique opportunity to evaluate the pollution consequences of industrialization. Our pollution tree framework is generic. Other regions in China and developing countries facing similar challenges can trace pollution transfer and monitor development through use of these tools.

Methods

Sampling strategy

We selected Shanghai, Zhejiang and Jiangsu provinces as study sites in the Yangtze Delta, the most typical and important industrial export region in China. Water samples were collected from the most representative water plants in the developed cities of the Yangtze Delta, as they can show the pollution fingerprints of urban water sources and drinking water. Henan province, one of the important regions accepting industries from the Yangtze Delta and experiencing soaring economic development in recent decades, was selected as the acceptance region. We selected five localities in highly polluted regions of Henan for water sampling (A, B, C, D and E; Fig. 1). These localities have significantly more industrial enterprises from transfer than from local investment or foreign countries (see Supplementary Information 2). The surface water samples were collected from the main rivers receiving the discharge of enterprises aggregated along the river. The ground water samples were collected in the villages neighboring the polluted rivers. To distinguish pollution caused by historical discharge from pollution caused by the recent period of industrial transfer, we also selected low pollution regions in Henan as historical ‘control regions’ (Fig. 1 and Supplementary Table S1). These ‘control regions’ were distant from the tributary of the Huai River. Since no irrigation canals with water sourced from polluted regions flow through this area, the river basin in this area has no connection to the highly polluted river basin. Additionally, the sampling drainage area contains fewer than 20 enterprises on average (Supplementary Table S1).

Water samples

Across all the selected sites, a total of 41 water samples were collected during 2009–2010 to cover the main types of local water bodies (Supplementary Table S2). Based on drainage basin features and surrounding industrial enterprises, we sampled both surface and ground water in the highly polluted and low polluted areas (control areas) of Henan. In Shanghai, Zhejiang and Jiangsu, we sampled water source, raw water and drinking water from the municipal water plants.

Whole-pollution-fingerprints analysis

To examine pollutant discharge, we analyzed the organic fingerprints of the water samples by gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS); these fingerprints traced the pollutant discharges and provided the data matrix for creating the evolutionary tree. Each water sample (40 L) was first filtered through a 0.45 μm glass fiber membrane and extracted by passing it through a 10 g XAD-2 cartridge. After elution and concentration, the whole pollutant fingerprint in each extract was detected by GC-MS23,33 and LC-MS/MS (detailed information in Supplementary Information 4 and Supplementary Information 5).

Common compound matching between water fingerprints

We developed programs and macro commands in SAS 9.234 for automatically matching the resolved spectra between every pair of water pollution samples. As the retention time of the peaks could not give sufficient information for identifying compounds, the resolved mass spectra were also used to ascertain that the same compound was represented by the same variable number in all samples. We evaluated the similarity between spectra according to several guidelines23,35,36. Finally, we calculated the integrated areas of the remaining resolved chromatograms. In the resulting data matrix (X), each row represented one sample and each column denoted one compound, the latter identified by its mean retention time. The number of detected peaks per water fingerprint across the 41 water samples ranged from 240 to 392. The total number of common compounds (those appearing in at least two samples) in all water pollution fingerprints was 1522, based on similarity matching among all resolved peaks using both retention time and mass spectrum information. On average, 148 common compounds (ranging from 115 to 192) were matched between every pair of water pollution fingerprints.
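
The matching step can be illustrated with a minimal Python sketch (the original work used SAS). It is not the authors' program: the cosine similarity measure, the retention-time tolerance `rt_tol`, the similarity cut-off `min_sim` and the example peaks are all assumptions chosen for illustration, since the actual matching guidelines are given only in the cited references.

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def same_compound(peak_a, peak_b, rt_tol=0.1, min_sim=0.9):
    """Treat two peaks as one compound if both retention time and spectrum agree.

    rt_tol (minutes) and min_sim are hypothetical thresholds, not values
    taken from the paper.
    """
    close_rt = abs(peak_a["rt"] - peak_b["rt"]) <= rt_tol
    similar = cosine_similarity(peak_a["spectrum"], peak_b["spectrum"]) >= min_sim
    return close_rt and similar

# Hypothetical peaks from two samples (retention time in minutes).
p1 = {"rt": 12.34, "spectrum": {149: 100, 167: 30, 279: 10}}
p2 = {"rt": 12.37, "spectrum": {149: 95, 167: 33, 279: 8}}
p3 = {"rt": 12.36, "spectrum": {91: 100, 106: 40}}

match_12 = same_compound(p1, p2)  # similar retention time and spectrum
match_13 = same_compound(p1, p3)  # similar retention time, different spectrum
```

The third peak shows why retention time alone is insufficient: it elutes at nearly the same time as the first but shares no fragment ions with it.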

Data matrix conversion23

After establishing a data matrix (X), the presence or absence of each compound was coded as ‘1’ or ‘0’, respectively, to produce a binary matrix (Y) of all matched and included compounds in X. Then the converted data matrix (Y) can be used for character-feature-based tree-building methods.
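
The conversion from X to Y can be sketched in a few lines of Python. The peak areas below are invented for illustration, and the `threshold` parameter is an assumption: the text only states that presence is coded as 1 and absence as 0.

```python
def to_binary_matrix(areas, threshold=0.0):
    """Code each compound as 1 (present) or 0 (absent) in each sample.

    A compound counts as present when its integrated peak area exceeds
    `threshold` (a hypothetical parameter, default 0).
    """
    return [[1 if area > threshold else 0 for area in row] for row in areas]

# Hypothetical peak-area matrix X: 3 samples (rows) x 4 compounds (columns).
X = [
    [1520.0, 0.0, 310.5, 0.0],
    [0.0, 88.2, 295.0, 0.0],
    [1490.3, 0.0, 0.0, 42.7],
]
Y = to_binary_matrix(X)
```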

Distance matrix establishment23

Using the compounds shared between every pair of water fingerprints, we defined and calculated a distance matrix based on the ‘intersection and union ratio (IUR)’ for all 41 water pollution samples. Detailed descriptions and formulae are in Supplementary Information 3 and Supplementary Table S3.
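
Because the exact IUR formula appears only in the supplementary material, the following Python sketch rests on an assumption: it takes the distance between two fingerprints to be one minus the ratio of shared compounds to all compounds present in either sample (a Jaccard-type distance). The published definition may differ, and the binary rows are invented.

```python
def iur_distance(a, b):
    """Assumed IUR distance: 1 - |intersection| / |union| of present compounds."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - intersection / union

def distance_matrix(binary_rows):
    """Pairwise distances between all samples (rows of a 0/1 matrix)."""
    n = len(binary_rows)
    return [[iur_distance(binary_rows[i], binary_rows[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical binary fingerprints: 3 samples x 4 compounds.
Y = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
D = distance_matrix(Y)
```

Under this assumed definition the matrix is symmetric with a zero diagonal, which is what distance-based tree-building methods expect as input.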

Tree construction

The neighbor-joining (NJ) method was used to construct trees of complex water fingerprints in MEGA 4.137 based on the IUR distance matrix. The binary data matrix (Y) was used for constructing ‘pollution trees’ by maximum parsimony (MP) and maximum likelihood (ML) methods in PAUP 4.038. Topological robustness was investigated using 10000 non-parametric bootstrap replicates.
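
For readers unfamiliar with neighbor-joining, the following Python sketch shows the core of the algorithm on a toy distance matrix. It is a bare illustration, not a substitute for MEGA: it omits bootstrapping, breaks ties by input order, and uses invented sample names and distances.

```python
def neighbor_joining(labels, dist):
    """Build an unrooted tree; returns edges as (node, neighbor, branch_length)."""
    nodes = list(labels)
    # Copy the matrix into a dict keyed by node names.
    d = {a: {b: dist[i][j] for j, b in enumerate(labels)}
         for i, a in enumerate(labels)}
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair minimising the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"n{next_id}"  # new internal node
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Distances from the new node to every remaining node.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges

# Toy additive distances for four hypothetical samples.
labels = ["A", "B", "C", "D"]
dist = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]
edges = neighbor_joining(labels, dist)
```

In this toy case the distances fit a tree exactly, so the algorithm recovers the pairs (A, B) and (C, D) joined by an internal branch of length 1.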

Relative rate test

The relative-rate tests between the lineages in the water fingerprint tree were performed with the proposed method39 using PAML 4.540. The binary converted matrix (Y) was used for the relative rate tests.
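
The idea behind a relative rate test can be shown with a minimal Python sketch of Tajima's chi-square test on binary characters. This is a swapped-in, simpler technique used for illustration only; it is not claimed to be the cited method or the PAML procedure, and the three fingerprints below are invented.

```python
import math

def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima-style relative rate test for two lineages and an outgroup.

    m1 counts characters unique to lineage 1 (seq2 agrees with the outgroup),
    m2 counts characters unique to lineage 2. Under equal rates, m1 and m2
    have the same expectation; the statistic is chi-square with 1 d.f.
    """
    m1 = sum(1 for a, b, o in zip(seq1, seq2, outgroup) if a != b and b == o)
    m2 = sum(1 for a, b, o in zip(seq1, seq2, outgroup) if a != b and a == o)
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value

# Hypothetical binary fingerprints over 20 compounds.
outgroup = [0] * 20
lineage_1 = [1] * 12 + [0] * 8            # 12 changes unique to lineage 1
lineage_2 = [0] * 12 + [1] * 4 + [0] * 4  # 4 changes unique to lineage 2

m1, m2, chi2, p_value = tajima_relative_rate(lineage_1, lineage_2, outgroup)
```

A small p-value suggests that one lineage has accumulated changes faster than the other, which is the kind of evidence used to separate a faster transfer from slower ones.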

Quantitative analysis for target compounds

The target compounds in the water fingerprints, such as plasticizers, PCBs, PBDEs, sulfonamides and steroidal compounds (see Supplementary Table S4 for the detailed compound list), were quantitatively investigated. The phthalate esters (PAEs), PCBs and PBDEs were detected by solid phase extraction (SPE) with GC-MS41,42,43,44. To detect the trace-level PCBs and PBDEs in water samples, a large volume (20 L) of water was used for extracting target compounds in each sample41,44. Sulfonamides and steroids were analyzed by SPE with liquid chromatography-tandem mass spectrometry (LC-MS/MS)31,45. Only the sulfonamides were quantified by external standard curves; all the other target compounds were quantified by internal standard calibration procedures. The correlation coefficients of the calibration curves for all target compounds were greater than 0.99 (Supplementary Information 5 and Supplementary Table S5, S7, S8 and S9). The limit of detection (LOD) was defined as a signal of 3 times the noise level. The LODs for PAEs (0.2–1.4 ng/L), PCBs (2–8 pg/L), PBDEs (0.08–6 pg/L), sulfonamides (0.2–0.8 ng/L) and steroids (0.02–0.06 ng/L) are shown in Supplementary Information 5 and Supplementary Table S5, S7, S8 and S9. For every batch of 10 samples, a solvent blank and a procedural blank were added. The recoveries of surrogate standards, spiked blank samples and spiked matrix blank samples were analyzed to evaluate the repeatability and accuracy of the analytical procedures. The sample preparation, parameters of instrumental analysis and results of quality assurance (QA) and quality control (QC) are shown in Supplementary Information (Supplementary Information 5 and Supplementary Table S5–S9).
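
The external standard quantification and the LOD definition can be illustrated with a short Python sketch. All numbers are invented; the linear calibration model and the conversion of the 3× noise signal into a concentration through the calibration slope are assumptions, since the text specifies only the correlation criterion (greater than 0.99) and the signal-to-noise definition.

```python
import math

def fit_calibration(conc, signal):
    """Least-squares line signal = slope * conc + intercept, plus Pearson r."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    syy = sum((y - mean_y) ** 2 for y in signal)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def quantify(signal, slope, intercept):
    """Convert a measured signal into a concentration using the curve."""
    return (signal - intercept) / slope

def limit_of_detection(noise, slope):
    """Concentration whose signal equals 3 times the noise level (assumed form)."""
    return 3.0 * noise / slope

# Hypothetical calibration standards (ng/L) and peak areas.
standards = [1.0, 2.0, 5.0, 10.0, 20.0]
areas = [105.0, 198.0, 510.0, 1000.0, 2010.0]

slope, intercept, r = fit_calibration(standards, areas)
sample_conc = quantify(750.0, slope, intercept)   # unknown sample, area 750
lod = limit_of_detection(noise=2.0, slope=slope)  # hypothetical noise level
```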