Introduction

Tropospheric ozone (O3) is an important ambient air pollutant that poses serious public health risks1. Epidemiological studies have shown a clear link between high ambient O3 concentrations and the risks of respiratory and cardiovascular mortality2. Severe O3 pollution has been alleviated in many cities of developed countries, but it has gradually turned into a hot spot in South and East Asia due to rapid industrialization and urbanization3,4,5. More recently, summertime O3 pollution has become an emerging concern in China6,7.

Globally, continuous efforts are being made to characterize tropospheric O3 pollution and its causes, sources, and impacts. As a secondary pollutant, O3 accumulates in the troposphere through the photochemical oxidation of volatile organic compounds (VOCs) in the presence of nitrogen oxides (NOx)8, the latter of which has proved expensive to control9. VOCs, by contrast, are crucial precursors for sustained O3 formation and active participants in radical cycling; they are emitted directly by anthropogenic and biogenic sources and have been a focus of air quality management7,10,11. Studies have shown that although high VOCs concentrations can be conducive to O3 accumulation12,13, the O3 formation rate responds nonlinearly to (relative) changes in observed VOCs concentrations14,15. In this work, the concentrations of VOCs and O3 even exhibited opposite trends over time (Fig. 1). This phenomenon has also been reported in previous studies. For example, Zou et al. showed that the seasonal variation of VOCs concentrations was opposite to that of O3 in Guangzhou16, and in Wuhan the seasonal variation of VOCs concentrations was negatively correlated with O3 concentrations17. However, most studies focus on describing observational data, and the actual reasons behind the phenomenon remain unclear and require further research. Effective prevention and control of O3 pollution therefore requires an explanation of this nonlinear relationship, as well as a detailed description of the sources and fates of VOCs.

Fig. 1: The concentrations of measured VOCs (M-VOCs) and O3 from January to December in 2018.

Bold solid lines and shaded sections show the observed mean and standard deviation, respectively. From summer to winter, the variations of VOCs and O3 show clear opposite trends. Annual mean concentration of M-VOCs was about 21.4 ± 26.8 ppbv, varying between 10.8 ± 9.3 ppbv in summer and 32.6 ± 26.1 ppbv in winter. The annual mean concentration of O3 was about 20.4 ± 0.1 ppbv, ranging from 11.4 ± 10.4 ppbv (winter) to 40.7 ± 25.3 ppbv (summer). The hourly concentrations of measured VOCs, four VOCs categories and O3 are shown in Supplementary Fig. 1.

The key obstacle to resolving this problem is that some VOCs species readily react with atmospheric oxidants such as hydroxyl radicals (OH), nitrate radicals (NO3), chlorine atoms, and O3 (refs. 10, 17, 18), and which species are more reactive remains a matter of debate19,20,21. Complex chemical reactions cause highly reactive VOCs species to be consumed to varying degrees between their sources and the observation sites. If we focus only on the VOCs species observed at high concentrations, can this play a vital role in O3 prevention and control? Undoubtedly, the photochemical consumption of different species makes it harder to reveal the real sources of VOCs and their impacts on O3 formation.

Thoroughly quantifying the photochemical consumption of VOCs and their sources is the key to capturing the real sources of the VOCs that affect O3. However, previous studies could not directly provide the sources of consumed VOCs (C-VOCs)21,22,23, which limited an intuitive understanding of the VOCs contribution to O3. Recently, the number of online VOCs monitoring sites worldwide has been gradually increasing, providing VOCs data with high time resolution. These data are a key basis for tracing VOCs sources. In this work, using chemical kinetics, environmental models, and theoretical calculations, we demonstrate that C-VOCs are the direct cause of the nonlinear relationship between O3 and VOCs, and that it is the C-VOCs that control O3 formation. We therefore developed an improved method, based on the observed VOCs dataset, for quantifying the source impacts of C-VOCs, which will help to develop more effective strategies to control atmospheric O3 pollution. We also quantified the source impacts of the C-VOCs that contribute to O3 production. This new method for source apportionment of C-VOCs will be a useful tool for better understanding VOCs sources and their impacts on O3 formation, thus aiding the development of O3 pollution control strategies in different locations.

Results

Opposite trends between measured VOCs and O3

The temporal variations in the concentrations of measured VOCs (M-VOCs; species are listed in Supplementary Table 1 and Supplementary Fig. 1) and O3, based on hourly online observations conducted in Tianjin during 2018, are illustrated in Fig. 1. The sampling site is in a community about 200 m from a major road; more information is presented in the Methods. Interestingly, M-VOCs and O3 exhibited opposite seasonal trends. M-VOCs concentrations were much lower in summer (10.8 ± 9.3 ppbv) than in winter (32.6 ± 26.1 ppbv). By contrast, O3 concentrations were high in summer (40.7 ± 25.3 ppbv) and low in winter (11.4 ± 10.4 ppbv). This inverse correlation between M-VOCs and O3 agrees with earlier findings16,24. Generally, high VOCs may facilitate the production of O3, but here low M-VOCs concentrations were accompanied by high O3 concentrations. Studies have shown that the O3 formation regime depends on the NOx/VOCs ratio25,26,27. In Supplementary Note 1, the O3-NOx-VOCs sensitivity is discussed using the empirical kinetic modeling approach and the NOx/VOCs ratio. The simulated O3 showed a clear dependence on VOCs, indicating that O3 production at our site was in the VOCs-limited regime (Supplementary Fig. 2), which is consistent with previous studies28,29. Thus, this paper mainly discusses the relationship between VOCs and O3. To explain this phenomenon, we need an in-depth understanding of the characteristics and reaction processes of VOCs, as well as the impact of VOCs consumption on O3 formation. Because VOCs are key precursors of O3, their photochemical reactions are crucial in the formation of ambient O3 (ref. 30).

Theoretically, the generation, transport and diffusion of pollutants are jointly affected by physical and chemical processes, which eventually result in the removal or transformation of pollutants in the atmosphere31,32. For example, changes in boundary layer evolution can affect the vertical diffusion and dilution of pollutants. Here, the monthly variations of both M-VOCs and O3 concentrations are only weakly correlated with boundary layer evolution (correlation coefficients (r) are 0.06 and –0.12, see Supplementary Fig. 1c). Moreover, the effect of physical dilution is recognized to be far smaller than that of chemical reaction10. For this reason, this study mainly focuses on consumed VOCs (C-VOCs), that is, the VOCs lost through photochemical reaction along the O3 formation pathway. The OH radical is the dominant atmospheric oxidant that reacts with VOCs to generate O3 (e.g., through substitution and addition reactions; details are given in Supplementary Note 2)10,33 and plays a leading role in the consumption of VOCs. It is therefore reasonable to hypothesize that the observed inverse correlation between M-VOCs and O3 (Fig. 1) can be adequately explained by OH consuming VOCs to form O3, as described in Eq. (1):

$${{{\mathrm{VOCs}}}} + {{{\mathrm{OH}}}} \to {{{\mathrm{RO}}}}_2 + {{{\mathrm{HO}}}}_2$$
(1)

Because of environmental factors such as solar radiation and ambient temperature (T), the reaction rates of different VOCs species with OH radicals differ drastically34. In addition, as photochemical age increases, the oxidation of VOCs may accelerate O3 accumulation13,34. These variables are frequently observed to correlate with the photochemical loss of VOCs and can be viewed as powerful indicators of chemical kinetics. Therefore, C-VOCs depend not only on the reactant (OH) but also on the other parameters of chemical kinetics.

Survivor bias in O3-VOCs system

To understand the opposite trends of O3 and M-VOCs, we need to analyze in depth the relationships among C-VOCs, M-VOCs and O3. In this study, VOCs are classified into three categories to better distinguish their reaction states: initial VOCs (In-VOCs, fresh VOCs emitted from sources without chemical consumption), C-VOCs and M-VOCs. Detailed definitions of the three are presented in the Methods. The analysis proceeded in two steps: (1) from the perspective of chemical kinetics, a method for estimating C-VOCs was established through theoretical calculations involving the reaction rate constant (k), the concentration of OH radicals and the photochemical consumption time (∆t, details in Methods) that govern VOCs loss in the atmosphere; (2) the relationship between C-VOCs and O3 was then analyzed to evaluate the contribution of C-VOCs to O3 formation potentials.

We applied kinetic equations (Eqs. (3)–(6) in Methods) to calculate VOCs consumption. For accuracy, kOH for each species was calculated with Eqs. (5) and (6)10,35 under real atmospheric conditions. Furthermore, a series of simulations with a box model (a type of model widely used in radical calculations36,37), the Framework for 0-D Atmospheric Modeling38, was performed to estimate the likely levels of OH, while observations and other methods were used to verify the reliability of the results (Supplementary Note 3.1). We also estimated ∆t for all detected VOCs species by applying a new method (more details are available in Supplementary Note 3.2) that takes into account transport, chemical reaction processes and daytime duration (Supplementary Table 2, based on solar radiation in Supplementary Fig. 3). The results in Supplementary Table 3 indicate that the photochemical age (ta) of any given species varied among months. Based on these results, ∆t of each species (Supplementary Table 4) was estimated. As shown in Supplementary Fig. 4, the OH radical was strongly influenced by light intensity, and its concentration was higher in summer and lower in winter (Supplementary Table 5). This seasonal trend is opposite to that of M-VOCs. These results suggest that photochemical aging is dominated by the reaction with OH, which is a key factor in the evolution of VOCs and their contribution to O3, especially in summertime.

The calculated annual average concentration of C-VOCs was 2.9 ppbv. As shown in Fig. 2a, C-VOCs accounted for a significant fraction of In-VOCs and exhibited a clear seasonal variation: the average level of C-VOCs peaked at 7.4 ppbv in June and was lowest in January at 0.9 ppbv. Based on the variations in the concentrations of C-VOCs, M-VOCs and In-VOCs, we conclude that source emissions would be greatly underestimated if substantial chemical losses were ignored, especially in summer and autumn (from June to September). During the day, as solar radiation and T increased, photochemical reactions were enhanced13, and C-VOCs (the gap between In-VOCs and M-VOCs) rose gradually, reached their peak of 7.6 ppbv at 17:00 and then slowly decreased (Fig. 2b). From 11:00 to 18:00, the concentrations of In-VOCs were 1.3–1.9 times higher than M-VOCs, which was due primarily to active photochemical reactions in daytime. At night (21:00–6:00), C-VOCs dropped to zero because the reactions with NO3 and Cl are slow (several orders of magnitude slower than with OH)10. Therefore, C-VOCs at nighttime were treated as negligible.
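As a rough consistency check, the first-order loss relation in Eq. (4) of the Methods can be inverted to see what the daytime In-VOCs/M-VOCs range implies. This is only a sketch: it reads "1.3–1.9 times" as the ratio In-VOCs/M-VOCs and treats the whole mixture as a single effective species, both of which are assumptions made here for illustration rather than steps taken in the study.

$$\frac{[\mathrm{VOC}]_{\mathrm{In}}}{[\mathrm{VOC}]_{\mathrm{M}}} = \exp\left(k[\mathrm{OH}]\Delta t\right) \;\Rightarrow\; k[\mathrm{OH}]\Delta t = \ln\frac{[\mathrm{VOC}]_{\mathrm{In}}}{[\mathrm{VOC}]_{\mathrm{M}}} \approx \ln 1.3 \text{ to } \ln 1.9 \approx 0.26 \text{ to } 0.64$$

$$\frac{[\mathrm{VOC}]_{\mathrm{In}} - [\mathrm{VOC}]_{\mathrm{M}}}{[\mathrm{VOC}]_{\mathrm{In}}} = 1 - \frac{[\mathrm{VOC}]_{\mathrm{M}}}{[\mathrm{VOC}]_{\mathrm{In}}} \approx 1 - \frac{1}{1.3} \text{ to } 1 - \frac{1}{1.9} \approx 23\% \text{ to } 47\%$$

Under these assumptions, roughly a quarter to a half of the initial VOCs would have been consumed before reaching the site during these hours.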

Fig. 2: Variation trends of VOCs concentrations in different reaction types and the correlation between O3 and consumed VOCs (C-VOCs).

Monthly variations of a C-VOCs and measured VOCs (M-VOCs) (average annual concentration was 21.4 ppbv), the sum of which were initial VOCs (In-VOCs) (average annual concentration was 24.3 ppbv); b average diurnal variation of In-VOCs and M-VOCs; c monthly variation of C-VOCs and O3; and d diurnal trends of C-VOCs and O3. a, b show that C-VOCs are consumed VOCs and M-VOCs are the survivable VOCs after consumption. In c, d, the high time-series consistency between C-VOCs and O3 confirms the key role of C-VOCs in O3 formation.

We found that C-VOCs were significantly correlated with O3 in both monthly and diurnal variations (correlation coefficients were 0.99 and 0.93 in Fig. 2c, d) for the whole dataset, whereas M-VOCs were negatively related to O3 (Supplementary Fig. 5). The excellent agreement between C-VOCs and O3 suggests that, in ambient air, M-VOCs are residues of photochemical reactions rather than participants in O3 formation. In addition, we estimated the ozone formation potentials (OFP) of C-VOCs (OFPC-VOCs) according to Eq. (10). The correlation coefficients between the daily and monthly average OFPC-VOCs (total) and O3 concentration were 0.57 and 0.98, respectively, far exceeding the correlation between OFPM-VOCs (calculated based on M-VOCs) and O3 (Fig. 3). This further indicates that C-VOCs, rather than M-VOCs, significantly affect O3 levels. To understand the impacts of regional transport, we also simulated in situ O3 production using a box model36,37. The good correlation of in situ O3 production with C-VOCs and with OFPC-VOCs (r = 0.68 and 0.66) suggests that it is appropriate to use C-VOCs to evaluate their key role in O3 production (Supplementary Fig. 5). In summer, C-VOCs are largely consumed between the source and the observation site to produce high O3, leaving a low level of survivable VOCs (M-VOCs) at the observation site; as a result, high O3 and low M-VOCs are observed in summer, and the opposite in winter. From the above analysis, we conclude that C-VOCs explain the opposite seasonal trends of M-VOCs and O3 and are also likely to be the direct cause of O3 generation in the ambient atmosphere. We interpret this as survivor bias in the O3-VOCs system.
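The survivor-bias idea can be illustrated with a minimal numerical sketch. It assumes two species emitted at the same initial concentration and lost only by first-order reaction with OH. The OH concentration, the consumption time and the equal emissions are hypothetical inputs chosen for illustration, and the rate constants are rounded, approximate 298 K values rather than the ones used in this study.

```python
import math

# Assumed, illustrative inputs (not taken from the study's dataset).
OH = 5.0e6    # OH concentration, molecule cm^-3 (assumed daytime value)
DT = 3600.0   # photochemical consumption time, s (assumed 1 h)

# Rounded, approximate 298 K OH rate constants (cm^3 molecule^-1 s^-1).
# Equal initial concentrations are a deliberate simplification.
species = {
    "isoprene": {"k_oh": 1.0e-10, "initial_ppbv": 5.0},
    "propane": {"k_oh": 1.1e-12, "initial_ppbv": 5.0},
}


def split_initial(initial_ppbv, k_oh, oh=OH, dt=DT):
    """Return (measured, consumed) in ppbv for first-order loss to OH."""
    measured = initial_ppbv * math.exp(-k_oh * oh * dt)
    return measured, initial_ppbv - measured


results = {
    name: split_initial(s["initial_ppbv"], s["k_oh"])
    for name, s in species.items()
}

for name, (measured, consumed) in results.items():
    print(f"{name:9s} measured = {measured:.2f} ppbv, consumed = {consumed:.2f} ppbv")
```

With these assumed inputs, the fast-reacting species ends up with the lower measured concentration but the larger consumed concentration, so ranking species by what survives to the monitor would invert their ranking by what actually reacted.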

Fig. 3: The correlation of OFPC-VOCs (nine), OFPC-VOCs (total), OFPM-VOCs and O3.

Time series of a diurnal variation, b monthly concentration, and c daily concentration of ozone formation potential (OFP) and O3. Note that OFPC-VOCs (nine) and OFPC-VOCs (total) (OFP calculated based on the top nine or all C-VOCs) are higher than the observed O3, especially in summer. This may be because OFPC-VOCs represents the maximum ozone formation potential while O3 consumption during the summer is substantial, because of the uncertainty of maximum incremental reactivity (MIR) when applied to the airshed conditions in Tianjin, China, in 2018, or because of the influence of physical processes (boundary layer height, etc.). Here, we mainly focus on the trends of OFP and O3. The correlation between OFPC-VOCs and O3 is clearly higher than that between OFPM-VOCs and O3, which further illustrates the important role of C-VOCs in O3 formation.

Role of C-VOCs on ozone formation

We further investigated the impact of C-VOCs on O3 formation. Because different C-VOCs species have widely varying molecular structures and chemical reactivities, their impacts on O3 generation are expected to be highly uneven. It is therefore necessary to identify which key C-VOCs species contribute significantly to O3 formation potentials, as well as the overall effects of C-VOCs on O3 during transport from source to receptor. The relative contributions of the four categories of VOCs showed different levels of temporal variation in daytime (Supplementary Fig. 6). Alkenes were the dominant C-VOCs, contributing ~85.2% of total C-VOCs, which is consistent with previous studies21,23. The remaining C-VOCs were alkanes, aromatics and alkynes, accounting for 7.8%, 6.5% and 0.5%, respectively. Eq. (10) was then used to evaluate the key reactive components of atmospheric C-VOCs during the study period.

The calculated OFPC-VOCs of all detected VOCs species are presented in Fig. 4a. The results were also validated with the LOH method (OH radical loss rate, see Supplementary Note 4 and Supplementary Fig. 7 for details), which provides a simple indicator of the relative contribution of specific VOCs to daytime photochemical reactions. Among C-VOCs, nine species (isoprene, cis-2-butene, trans-2-butene, propylene, ethylene, 1-butene, m/p-xylene, cis-2-pentene, and styrene) were identified as the key reactive species (Fig. 4b), together contributing 96% of total OFPC-VOCs. These reactive species are important precursors that should be the focus of atmospheric O3 pollution control. Moreover, the OFPC-VOCs of these nine species showed a significant correlation with O3 (Fig. 3), suggesting that the depletion (reaction) of key VOCs species plays an important role in the formation of O3. Therefore, for O3 control, more attention needs to be paid to the emission sources of these key C-VOCs species.
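The ranking of species by OFPC-VOCs can be sketched as follows. Eq. (10) is not reproduced in this section, so the sketch assumes the commonly used form in which the OFP of a species is its concentration multiplied by its maximum incremental reactivity (MIR). The concentrations and MIR values below are placeholders, not values from this study, and any unit conversion that Eq. (10) may include is omitted.

```python
# Hypothetical consumed concentrations (ppbv) and MIR values; both tables are
# placeholders for illustration only, not data from the study.
consumed_ppbv = {"isoprene": 1.2, "ethylene": 0.8, "propane": 0.1}
mir = {"isoprene": 10.6, "ethylene": 9.0, "propane": 0.5}


def ofp_by_species(conc, mir_table):
    """Assumed form of the OFP calculation: OFP_i = concentration_i * MIR_i."""
    return {name: conc[name] * mir_table[name] for name in conc}


def ranked_shares(ofp):
    """Sort species by OFP (largest first) and return each one's share of the total."""
    total = sum(ofp.values())
    ranked = sorted(ofp.items(), key=lambda item: item[1], reverse=True)
    return [(name, value / total) for name, value in ranked]


for name, share in ranked_shares(ofp_by_species(consumed_ppbv, mir)):
    print(f"{name:9s} {share:.1%} of total OFP")
```

Applying the same two functions to measured rather than consumed concentrations gives the OFPM-VOCs counterpart, which is how the two rankings can be compared.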

Fig. 4: OFPC-VOCs of different species were calculated.

a OFPC-VOCs (OFP calculated based on C-VOCs) importance ranking of all volatile organic compounds; b OFPC-VOCs of the top nine reactive species and their percentage contributions to OFPC-VOCs. This figure illustrates that reactive VOCs species should be valued for their participation in chemical reactions.

C-VOCs sources play roles in O3 formation

Given the chemical loss of VOCs, estimating their source contributions remains challenging. To intuitively understand the actual source impacts of C-VOCs on O3 formation, we proposed a new hybrid chemical kinetics and positive matrix factorization (CK-PMF) model (see Methods). The method is executed in two parts to account for sources that release VOCs slowly or are far away from observation sites. It is not designed for source apportionment at receptors located close to sources, in which case VOCs may be transported to the receptor with negligible loss (a detailed description is presented in Methods).

We identified six sources of C-VOCs and their contributions (Fig. 5a): biogenic emissions (BE, 43%), gasoline evaporation (GE, 17%), industrial emissions (IE, 14%), solvent usage (SU, 14%), liquid petroleum gas evaporation (LPG, 7%) and vehicle emissions (VE, 5%). BE was the largest source of C-VOCs, which is consistent with the fact that the most reactive species, isoprene, is the marker species for this source. GE was strongly affected by VOCs photochemical consumption, whereas VE was the least affected, which is also related to the amount of reactive species released by each source. For comparison, we also calculated the source contributions to M-VOCs and In-VOCs (Supplementary Fig. 8). Notably, the largest contributor to C-VOCs (BE) differs from the largest contributors to M-VOCs (GE, 28%) and In-VOCs (IE, 35%) (Supplementary Fig. 8), indicating that source apportionment yields different source contributions depending on which reaction state of the VOCs data is used. As discussed above, C-VOCs play a vital role in O3 formation; thus, their sources should be the real sources that contribute to O3 formation. Additionally, for each key reactive species, the sources that contribute significantly to its consumed concentration are summarized in Supplementary Fig. 9. In this work, isoprene was almost entirely from BE (99%), which is consistent with the abundant vegetation around the observation site. Ethylene mainly came from IE (84%) with a minor contribution from VE (7%); styrene mainly came from SU (64%) and IE (18%). Therefore, highly reactive C-VOCs species (such as isoprene, ethylene and styrene) deserve more attention when exploring the real VOCs sources that play an important role in O3 formation. However, the measured concentrations of these species are not high among M-VOCs. If we directly use the observed survivable VOCs data (M-VOCs) to track the sources of VOCs, the results will be biased to some degree (survivor bias).

Fig. 5: The main source contributions to VOCs and their potential impacts on O3 formation.

a Source apportionment to consumed VOCs (C-VOCs); and b contributions of different pollution sources to OFPC-VOCs (based on nine reactive species). The highest contributor to C-VOCs is different from the highest contributors to M-VOCs and In-VOCs (Supplementary Fig. 8). This figure shows that focusing on the sources of C-VOCs can capture the real source of VOCs’ effect on O3.

We further calculated the impact of C-VOCs sources on O3 formation based on the C-VOCs source apportionment results and OFPC-VOCs. Figure 5b shows that BE (49%) contributed the most to OFPC-VOCs, followed by GE (18%), IE (15%), SU (10%), LPG (4%) and VE (4%). We note that BE is the largest contributor to both C-VOCs and OFPC-VOCs (although its contribution to In-VOCs is small). This result indicates that VOCs emitted from vegetation have a significant impact on O3 formation, which is consistent with previous views39,40,41,42. More importantly, we quantified the source contribution of vegetation-emitted VOCs to O3 pollution. Isoprene is a key species of BE, so altering its conversion pathways (such as inhibiting the activity of isoprene synthase43) may provide new insights for future O3 control.

Therefore, we suggest that source emissions, key species and the chemical activity of VOCs should all be taken into account when investigating the contribution of VOCs to O3 formation. Supplementary Fig. 10 shows the diel variations of the concentrations contributed by different sources to C-VOCs. Note that the concentration of C-VOCs from VE does not follow the typical pattern of road traffic volume. This is because light intensity is low in the morning, so the chemical reaction of VOCs is slow and C-VOCs gradually peak at noon. By simulating the C-VOCs produced by different species under different Δt (Supplementary Note 5 and Supplementary Fig. 11), we further illustrated that C-VOCs are related to the concentration and Δt of each species. Thus, prioritizing the source control of key reactive VOCs species within a given time will help to reduce O3 pollution more effectively. For the mitigation of surface O3 pollution, VOCs control strategies inferred from the results of the CK-PMF model are expected to be effective.

Discussion

This study proposes a new method to quantify the source contributions of C-VOCs (e.g., BE, industry, SU) and their impacts on O3 formation by combining ground-based observations and model calculations. This approach not only offers new insights into the prioritization of VOCs control for mitigating O3 pollution, but also reduces the misjudgment of major VOCs sources (contributing to O3 formation) caused by survivor bias in observational data. In summary, the processes by which ambient VOCs generate O3 are complex because of VOCs reactivity and the behavior of source emissions. Figure 6 presents a concept map of survivor bias in the O3-VOCs system. We found that C-VOCs are the main precursors for O3 production, and M-VOCs are actually the survivable VOCs after photochemical reactions. In summer, a large share of the highly reactive C-VOCs species was consumed to produce O3, leaving fewer survivable M-VOCs; in winter, as photochemical reactions weakened, only a small part of C-VOCs was consumed, and the survivable M-VOCs gradually accumulated. For example, in summer, the observed concentration of survivable ethylene (one of the reactive species) is low, but its contribution to O3 is relatively large among all species. The idea of survivor bias explains why the concentrations of M-VOCs and O3 exhibit opposite trends, and clarifies the cause of this environmental problem.

Fig. 6: Schematic diagram.

This diagram explains the survivor bias for the O3-VOCs system. At the observation site, low VOCs concentrations in summer correspond to high O3 concentrations, while high VOCs concentrations in winter correspond to low O3 concentrations. As photochemical reactions intensify (in summer), highly reactive VOCs undergo photochemical reaction and participate in the accumulation of O3, resulting in fewer survivable M-VOCs. In contrast, only a small amount of VOCs is lost in winter, and the survivable M-VOCs gradually increase. Therefore, from the perspective of O3 pollution control, we should focus more on the C-VOCs sources.

Understanding the survivor bias can help us better capture the VOCs sources that actually impact O3 formation. This work provides evidence that sources of C-VOCs play an important role in O3 formation. Here we calculated the main sources of C-VOCs through the combined application of chemical kinetics and source apportionment technology. The differences in source contributions between C-VOCs and M-VOCs suggest that focusing only on the sources of observed species with high concentrations will bias O3 pollution control strategies. Of course, to reduce VOCs concentrations, it is necessary to focus on the main emission sources of M-VOCs. However, from the perspective of O3 pollution control, we should focus more on the VOCs sources that contribute the most to O3 formation, that is, the emission sources of C-VOCs. The results suggest that, to better prevent and control O3 precursors, it is necessary to control C-VOCs at their sources rather than relying solely on observational data. The findings here will support the development of more efficient O3 control strategies through VOCs chemical kinetics and photochemical depletion processes, and will also inform management of O3-VOCs systems by assessing the source impacts of C-VOCs on O3 formation. Future air pollution control strategies should pay more attention to C-VOCs and their sources.

In addition, the understanding of this mechanism of survivor bias in O3-VOCs system provided by this study is also applicable to other locations, and may be extended beyond air pollution to the management of volatile or highly active pollutants in a variety of environmental issues (e.g., water pollution), but may have limitations in the prevention and control of stable substances.

Methods

Measurements and instruments

Online VOCs and meteorological measurements with 1 h time resolution were conducted in Tianjin, a megacity in northern China, from January 1 to December 31, 2018. The sampling site is located in a rural area surrounded by rich vegetation, about 200 m from a major roadway with high automobile traffic volume. Multiple industrial parks and industrial furnaces are located about 40 km to the northwest and northeast. The industrial facilities are engaged in rubber manufacturing, plastics production and metal smelting. A gas chromatograph (GC955 611/811, Syntech Spectras Inc., Holland) with two detectors was used to measure ambient VOCs, including 27 alkanes, 10 alkenes, 1 alkyne (acetylene) and 16 aromatic hydrocarbons (Supplementary Table 1). These VOCs are on the list of important species that the USEPA recommends for monitoring at photochemical assessment monitoring stations. Hourly O3 concentrations were measured with an ultraviolet absorption instrument (Focused Photonics Inc., China). Meteorological parameters, such as T, were measured using an automatic meteorological observation system (LUFFT Inc., Germany). Solar radiation was recorded with a sun photometer (Kipp & Zonen Inc., the Netherlands). More details, including monitoring instruments and quality control, are described in Supplementary Note 6.

Receptor models

To determine the sources of C-VOCs, a receptor model from source apportionment technology is introduced. Receptor models generally assume that the chemical composition of pollutants is relatively stable and do not take into account dispersion in rough terrain, long-range transport, or wet and dry deposition44. The PMF model is a widely used receptor model for source apportionment of air pollutants27,45,46,47,48. The principle of PMF can be expressed by Eq. (2)49,50,51:

$$x_{ij} = \mathop {\sum }\limits_{k = 1}^p g_{ik}f_{kj} + e_{ij}$$
(2)

where xij (ppbv) is the concentration of the jth species in ith sample, gik (ppbv) is the contribution of the kth source to the ith sample, fkj (ppbv/ppbv) is the fraction of the jth species from the kth source; eij (ppbv) are the residuals, and p is the number of factors.
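The bilinear structure of Eq. (2) can be made concrete with a small forward calculation. The sketch below does not solve the PMF problem (it does not estimate g and f, and it is not PMF/ME2-SR or PTT-PMF); it only evaluates the modeled concentrations and residuals for given factors. The two-source, three-species matrices are hypothetical placeholders.

```python
# Hypothetical 2-sample, 2-source, 3-species example; all numbers are placeholders.
G = [[4.0, 1.0],   # g_ik: contribution of each source to sample 0 (ppbv)
     [2.0, 3.0]]   # ... and to sample 1
F = [[0.5, 0.3, 0.2],   # f_kj: species fractions in the source 0 profile (ppbv/ppbv)
     [0.1, 0.2, 0.7]]   # ... and in the source 1 profile
X = [[2.2, 1.4, 1.5],   # x_ij: "observed" concentrations (ppbv)
     [1.3, 1.2, 2.4]]


def modeled(G, F):
    """Return the matrix with entries sum_k g_ik * f_kj."""
    p = len(F)          # number of factors
    m = len(F[0])       # number of species
    return [[sum(row[k] * F[k][j] for k in range(p)) for j in range(m)] for row in G]


def residuals(X, G, F):
    """Return e_ij = x_ij - sum_k g_ik * f_kj."""
    M = modeled(G, F)
    return [[X[i][j] - M[i][j] for j in range(len(X[0]))] for i in range(len(X))]


for i, row in enumerate(residuals(X, G, F)):
    print(f"sample {i} residuals (ppbv):", [round(e, 3) for e in row])
```

A PMF solver searches for non-negative G and F that keep these residuals small relative to the measurement uncertainties; the forward step shown here is the part that Eq. (2) specifies.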

Major VOCs species were selected (the principles for species selection are described in Supplementary Note 7.2) to put into factorization models. Here, the PMF/ME2-SR (PMF/multilinear engine 2-species ratios) model was used to estimate source contributions by adding the ratios of VOCs characteristic species components (Supplementary Note 7.3) in the extraction process52,53. The advantage of this model is that the extracted factors have more physical significance. To evaluate the reliability of the source apportionment results of PMF/ME2-SR, we also applied partial target transformation-positive matrix factorization (PTT-PMF) to calculate the sources of VOCs. It establishes a normalized target source profile based on measured source profiles, selects and fixes the source markers of all sources, and finally realizes source apportionment54. Details on PMF/ME2-SR and PTT-PMF models are provided in Supplementary Note 7.

Quantitative calculation of C-VOCs

The calculation of C-VOCs is the key step of the proposed source apportionment method. Existing VOCs source apportionment methods are based on a common assumption: all VOC species have a common photochemical age, that is, all species have the same transport time from the source to the receptor. In fact, the concentration of VOCs in the atmosphere is the result of the mixture of disparate pollutant air masses discharged in different time and space, and the photochemical age of VOCs emitted by various pollution sources may also be significantly different. Here, the proposed method considers the diversity of VOCs species and estimates unique reaction rate constants, radical concentrations, and photochemical consumption times for each VOC species through chemical kinetics. Therefore, compared to previous methods, the source emission concentration of VOCs by this method is more representative of the actual atmospheric conditions.

Based on the conservation of species, the consumption of VOCs is related to reaction and deposition processes. However, this study considers only the consumption caused by chemical reaction, because the calculation is combined with the PMF model, which does not consider diffusion, for subsequent analysis. The photochemical consumption of VOCs (without considering dispersion) from the source to receptors can be described by Eq. (3):

$${\mathrm{C}}\!\!-\!\!{\rm{VOCs}} = {\mathrm{In}}\!\! - \!\!{\mathrm{VOCs}} - {\mathrm{M}}\!\!-\!\!{\mathrm{VOCs}}$$
(3)

In this study, C-VOCs are defined as the VOCs photochemically consumed by reaction with OH radicals during transport from the source to the observation site along the pathway of O3 formation (Eq. (3), unit: ppbv); In-VOCs are defined as the initial concentration of VOCs in the air released from pollution sources (or, effectively, the concentration at the receptor assuming no photochemical consumption; see Eq. (4), unit: ppbv); and M-VOCs are the concentration measured at the observation site (after photochemical loss, unit: ppbv). Their relationship is illustrated in Supplementary Fig. 12.

The dominant chemical reactions of VOCs in the troposphere are OH radical reactions during the daytime10; other removal paths, including deposition and reaction with NO3 radicals, are assumed to be negligible. As shown in Eq. (4)55, In-VOCs are calculated by assuming that all VOCs come from pollution sources and that there is no dispersion:

$$\left[ {{\mathrm{VOC}}_i} \right]_{{\mathrm{In}}} = \left[ {{\mathrm{VOC}}_i} \right]_{{\mathrm{M}}} \times {{{\mathrm{exp}}}}\left( {k_i\left[ {{{OH}}} \right]{\Delta}t} \right)$$
(4)

where \(\left[ {\rm{{VOC}}_i} \right]_{\rm{{In}}}\) denotes the concentration of VOCs species i as if no photochemical aging had taken place (unit: ppbv); ki represents the rate constant of the reaction between species i and OH radicals (unit: cm3·molecule−1·s−1) and is estimated based on Eqs. (5)–(6); [OH] is the concentration of OH radicals (the estimation methods are in Supplementary Note 3.1, unit: molecule·cm−3); and ∆t is the photochemical consumption time of VOCs species i reacting with OH radicals from sources to the receptor (the estimation method is in Supplementary Note 3.2, unit: s):

$$k_{OH}=AT^n{{{\mathrm{e}}}}^{ -B/T}$$
(5)
$$k_{OH}=A{{{\mathrm{e}}}}^{ -B/T}$$
(6)

where A is the Arrhenius constant (unit: cm3·molecule−1·s−1); B is the ratio of the apparent activation energy (Ea) to the molar gas constant (R) (unit: K); and n is a coefficient, generally n = 2 (ref. 10). For alkanes, the recommended temperature-dependent expression is the three-parameter expression (Eq. (5)), while for alkenes, alkynes, and aromatics, Eq. (6) is used56. The values of A, B, n, and kOH are provided in Supplementary Table 6.
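As an illustration of Eqs. (3)–(6), the following minimal Python sketch computes a rate constant, the initial concentration, and the photochemical consumption for a single species. The function names and all numerical inputs are hypothetical placeholders chosen for illustration; they are not values from this study or from Supplementary Table 6, and the sketch assumes [OH] and ∆t have already been estimated.

```python
import math

def k_oh(A, B, T, n=None):
    """OH rate constant in cm3 molecule-1 s-1.

    With n given, uses the three-parameter form of Eq. (5): A * T**n * exp(-B/T).
    With n omitted, uses the Arrhenius form of Eq. (6): A * exp(-B/T).
    """
    prefactor = A if n is None else A * T**n
    return prefactor * math.exp(-B / T)

def initial_voc(measured_ppbv, k, oh, dt):
    """Eq. (4): initial concentration (ppbv), i.e. M-VOCs with OH loss undone."""
    return measured_ppbv * math.exp(k * oh * dt)

def consumed_voc(measured_ppbv, k, oh, dt):
    """Eq. (3): C-VOCs = In-VOCs - M-VOCs (ppbv)."""
    return initial_voc(measured_ppbv, k, oh, dt) - measured_ppbv

# Placeholder inputs, NOT values from this study or Supplementary Table 6.
k_example = k_oh(A=1.0e-11, B=300.0, T=298.0)  # hypothetical A (cm3 molecule-1 s-1), B (K)
c_example = consumed_voc(measured_ppbv=2.0, k=k_example, oh=5.0e6, dt=3600.0)
```

Because the exponent k[OH]∆t is dimensionless, the units of the three inputs must be mutually consistent (cm3·molecule−1·s−1, molecule·cm−3, and s).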

Source apportionment of C-VOCs

Considering the reactivity of VOCs, it is not appropriate to use PMF/ME2-SR to analyze the sources of M-VOCs directly, because PMF is based on linear fitting and assumes no chemical loss of species. Here, the CK-PMF model was established by combining chemical kinetics with the PMF model to obtain the sources of VOCs consumption and their contributions.

Stage 1. To calculate the sources of C-VOCs, the source contributions of In-VOCs must first be quantified. The chemical kinetics method was used to calculate the relevant parameters and obtain In-VOCs. The PMF/ME2-SR model was then applied, with In-VOCs as input data, to determine the source categories of typical VOCs (31 species in this study), as shown in Eq. (7):

$${{ES}}_{{{In}} - {{VOCs}}\,ij} = {{ES}}_{1\,ij} + {{ES}}_{2\,ij} + \cdots + {{ES}}_{n\,ij}$$
(7)

where \({{ES}}_{{\rm{In - VOCs}}\,ij}\) represents the total contribution of the ith In-VOCs species emitted from all sources at the jth hour; ES1 ij, ES2 ij, and ESn ij represent the contributions of the ith In-VOCs species emitted from source 1, source 2, and so on up to source n at the jth hour, respectively.
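The bookkeeping in Eq. (7) can be sketched as follows, assuming the receptor model returns the standard PMF bilinear outputs: a contribution matrix (hours × sources) and a profile matrix (sources × species). The names `G`, `F`, and both functions are hypothetical and introduced only for illustration; the sketch does not perform the PMF/ME2-SR fit itself.

```python
import numpy as np

def source_contributions(G, F):
    """Return ES[n, i, j]: contribution of source n to species i at hour j.

    G: (hours, sources) factor contributions; F: (sources, species) profiles.
    """
    G = np.asarray(G, dtype=float)
    F = np.asarray(F, dtype=float)
    # ES[n, i, j] = F[n, i] * G[j, n]
    return np.einsum("ni,jn->nij", F, G)

def total_in_voc(ES):
    """Eq. (7): sum the per-source contributions over all sources n."""
    return ES.sum(axis=0)

# Hypothetical toy inputs: 2 hours, 2 sources, 3 species.
G = np.array([[1.0, 2.0], [3.0, 4.0]])
F = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])
ES = source_contributions(G, F)
total = total_in_voc(ES)  # shape (species, hours)
```

Summing over the source axis reproduces the modeled In-VOCs concentration of each species at each hour, which is the quantity on the left-hand side of Eq. (7).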

For PMF/ME2-SR modeling, we applied the species compositions (ratios) in VOCs source profiles (Supplementary Table 7) as constraints so that the extracted factors approach the actual VOCs source profiles. Six factors were identified (detailed source identification procedures are described in Supplementary Note 8), including IE (35%), VE (21%), SU (18%), BE (13%), GE (7%) and LPG (6%) (Supplementary Fig. 8). The results were consistent with independent modeling performed using the partial target transformation-PMF (PTT-PMF) model (Supplementary Fig. 13).

Stage 2. Based on chemical kinetics, Eq. (8) was derived from Eqs. (3) and (4) and used, together with the above results, to predict the sources of C-VOCs (chemical consumption in the daytime) and their contributions:

$${{CS}}_n = {{ES}}_n - {{ES}}_n \cdot e^{ - k_{ij} \cdot \left[ {{{OH}}} \right]_{ij} \cdot {\Delta}t_{ij}} = {{ES}}_n \cdot \left( {{{{\mathrm{1}}}} - e^{ - k_{ij} \cdot \left[ {{{OH}}} \right]_{ij} \cdot {\Delta}t_{ij}}} \right)$$
(8)

where CSn is the contribution of the nth source to the ith C-VOCs species at the jth time at the observation site.
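Eq. (8) can be sketched in Python as below. The function names, source labels, and input values are hypothetical placeholders rather than results of this study. The sketch relies on the fact that, in Eq. (8), k, [OH], and ∆t carry only the species and time indices, so the same consumed fraction applies to every source for a given species and hour.

```python
import math

def consumed_fraction(k, oh, dt):
    """Fraction of the emitted species lost to OH: 1 - exp(-k [OH] dt)."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is small.
    return -math.expm1(-k * oh * dt)

def consumed_by_source(es_by_source, k, oh, dt):
    """Eq. (8): CS_n = ES_n * (1 - exp(-k [OH] dt)) for each source n."""
    frac = consumed_fraction(k, oh, dt)
    return {source: es * frac for source, es in es_by_source.items()}

# Hypothetical In-VOCs contributions (ppbv) of one species at one hour.
es = {"source_1": 1.2, "source_2": 0.8}
cs = consumed_by_source(es, k=1.0e-11, oh=5.0e6, dt=3600.0)
```

Since every source is scaled by the same fraction, the relative source shares of a given species are unchanged between In-VOCs and C-VOCs at that hour; shares of total C-VOCs differ from those of total In-VOCs because the fraction varies across species and hours.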

Ozone formation potential

The OFP is used to reflect the ability of different VOCs species that participate in atmospheric chemical reactions to form O3 (refs. 34,57,58). It is estimated here using the maximum incremental reactivity (MIR) method9,57. The OFP of an individual VOC is calculated as follows:

$${\mathrm{OFP}}_i = \left[ {\mathrm{VOCs}} \right]_i \times {\mathrm{MIR}}_i$$
(9)

where \(\left[ {{\rm{VOCs}}} \right]_i\) represents the concentration of VOC species i (unit: ppbv) and \({{{\mathrm{MIR}}}}_i\) is the MIR coefficient of VOC species i (dimensionless), indicating the reference scale of the O3 forming ability. \({\mathrm{OFP}}_{i\,{\mathrm{C - VOCs}}}\) represents the maximum contribution of C-VOCs species i to O3 formation under idealized photochemical reaction conditions:

$${\mathrm{OFP}}_{i\,{\mathrm{C - VOCs}}} = \left( {\left[ {{\mathrm{VOC}}_i} \right]_{\mathrm{In}} - \left[ {{\mathrm{VOC}}_i} \right]_{\mathrm{M}}} \right) \times {\mathrm{MIR}}_i$$
(10)

The MIR coefficients were obtained from Carter et al.9 (Supplementary Table 6). Additionally, the OH radical loss rate (LOH) was estimated for verification purposes, as shown in Supplementary Note 4.
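Eqs. (9) and (10) can be sketched as follows. The function names and species labels are hypothetical, and the MIR coefficients must be supplied by the user (for example from Supplementary Table 6); no real MIR values are assumed here.

```python
def ofp(conc_ppbv, mir):
    """Eq. (9): OFP of one species from its concentration and MIR coefficient."""
    return conc_ppbv * mir

def ofp_consumed(in_ppbv, m_ppbv, mir):
    """Eq. (10): OFP of the photochemically consumed part, (In - M) * MIR."""
    return (in_ppbv - m_ppbv) * mir

def total_ofp_consumed(in_voc, m_voc, mir):
    """Sum Eq. (10) over all species; the three dicts share species keys."""
    return sum(ofp_consumed(in_voc[s], m_voc[s], mir[s]) for s in in_voc)

# Hypothetical placeholder inputs (ppbv and MIR), NOT values from this study.
in_voc = {"species_a": 5.0, "species_b": 3.0}
m_voc = {"species_a": 2.0, "species_b": 1.0}
mir = {"species_a": 4.0, "species_b": 0.5}
total = total_ofp_consumed(in_voc, m_voc, mir)
```

Keeping the per-species function separate from the sum makes it straightforward to rank species by their individual contribution before aggregating.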