Equal abundance of summertime natural and wintertime anthropogenic Arctic organic aerosols

Aerosols play an important yet uncertain role in modulating the radiation balance of the sensitive Arctic atmosphere. Organic aerosol is one of the most abundant, yet least understood, fractions of the Arctic aerosol mass. Here we use data from eight observatories that represent the entire Arctic to reveal the annual cycles in anthropogenic and biogenic sources of organic aerosol. We show that during winter, the organic aerosol in the Arctic is dominated by anthropogenic emissions, mainly from Eurasia, which consist of both direct combustion emissions and long-range transported, aged pollution. In summer, the decreasing anthropogenic pollution is replaced by natural emissions. These include marine secondary, biogenic secondary and primary biological emissions, which have the potential to be important to Arctic climate by modifying the cloud condensation nuclei properties and acting as ice-nucleating particles. Their source strength or atmospheric processing is sensitive to nutrient availability, solar radiation, temperature and snow cover. Our results provide a comprehensive understanding of the current pan-Arctic organic aerosol, which can be used to support modelling efforts that aim to quantify the climate impacts of emissions in this sensitive region.

was approximately one week but varied from six to nine days. The total amount of sampled air was typically between 330 and 500 m³ per sample. After sampling, the filters were stored in a fridge at the station and shipped to PSI after one to three months in a Styrofoam box with a cooler.
TIK: Aerosol measurements at the International Hydro-Meteorological Observatory (HMO) Tiksi (71°6'N, 128°9'E) were taken at the Clean Air Facility (CAF), located 500 m from the Laptev Sea coast and 5 km from the Tiksi settlement. The 20 m Tiksi meteorological flux tower is located around 300 m from the CAF. A total suspended particle (TSP) inlet was installed approximately 1.5 m above the CAF roof and 5 m above the ground. Aerosols were sampled at an air flow of ~45 L min⁻¹ and during the protocol times. TSP was collected on 47 mm quartz fiber (Pallflex) and Teflon (Zefluor) filters for subsequent analyses in the laboratory. The low concentrations of ambient aerosols necessitated sampling times ranging from one day in November up to three days in September, to allow the loading to exceed the detection limit for relevant aerosol chemistry analyses. Sampling was performed in September and November of 2014, in March, May-June, and September 2015, and in June and September 2016. Upon removal from the sampling system, the samples were wrapped in aluminum foil, sealed in a tightly closed plastic bag, and immediately placed in a deep freezer. For transportation, an additional plastic box with a tight lid was used. The duration of transportation was shorter than the duration of storage. More details are provided in Popovicheva et al. (2019).
UTQ: PM samples were collected on the North Slope of Alaska from June 2016 through September 2017, at the Climate Research Facility 7.4 km northeast of the village of Utqiaġvik (UTQ), Alaska (71°2'N, 156°4'W), 515 km north of the Arctic Circle. The site is approximately 1.6 km from the nearest coast.
TSP samples were collected on QFF (Tissuquartz Filters 2500 QAT-UP; 20 × 25 cm) using Hi-Q high volume samplers ~10 m above ground level. The sampling duration was on average one week at a flow rate of 1.2 m³ min⁻¹. Filters were removed from the sampler immediately after the sampling period had ended and were stored in a freezer on-site when not directly in use. QFFs were baked prior to sampling at 500 °C for 12 h and stored in aluminum foil packets and storage bags in a freezer before and after sampling. Field blanks were taken periodically throughout the sampling campaigns by placing an unsampled filter in a filter holder, placing it in the sampler momentarily, and then removing it and placing the filter in storage. Field blanks were treated in the same manner as sampled filters. Filters were shipped to and from the site in plastic bins in coolers cooled with blue ice packs.
VRS: Villum Research Station (VRS) is located in North Greenland (81°36'N, 16°40'W, 24 m asl). The atmospheric measurement site is located 2 km southeast of the Danish Military facility on a small peninsula (Princess Ingeborg peninsula). The region is characterized by a dry and cold climate with 188 mm of precipitation annually and an annual mean temperature of -16.9 °C. The dominant wind direction is from the southwest, and the observatory is upwind of the military outpost Station Nord most of the time. The annual average wind speed is 4 m s⁻¹. VRS is surrounded by sea ice, with bare ground occasionally present in the summer and appearing increasingly often in recent years. Polar sunrise is observed at the end of February, while polar night prevails from mid-October. A high-volume sampler (HVS; Digitel DHA-80) was operated in the air observatory at a flow rate of 500 L min⁻¹ (STP) and was regularly tested against a transfer standard and adjusted. The inlet head was heated to avoid condensation. The HVS itself is placed indoors in a temperature-controlled room.
The sample air passes through a PM10 head located just on top of the HVS and was collected on quartz fiber filters over one week, corresponding to ~5000 m³ of air. After sampling, the exposed filters are placed between two pieces of aluminum foil, placed in a rilsan bag and stored in the dark in a freezer at -20 °C. During transport by plane, which normally takes 2 days, the samples are still kept in the dark but at ambient temperature. After being received at Aarhus University, the samples are stored again in a freezer at -20 °C.
ZEP: Sixty-seven aerosol filter samples were collected at the Zeppelin Observatory (78°5'N, 11°5'E, 475 m asl) at Svalbard, Norway, between January 2017 and December 2018. The filter samples were collected using a Digitel high-volume sampler (PM10) with a flow rate of 40 m³ h⁻¹ and a filter face velocity of 72.2 cm s⁻¹. The sampling inlet was situated 2.5 m above the roof level of the observatory and 7 m above the ground level. Aerosol particles were collected on pre-fired (850 °C; 3 h) QFF (PALLFLEX Tissuequartz 2500QAT-UP; 150 mm in diameter) for 1 week, according to the quartz fiber filter behind quartz fiber filter (QBQ) setup, thus providing dynamic field blanks. Back and front filters were mounted in pre-cleaned filter holders, wrapped in preheated aluminum foil, and locked in two Zip-lock polyethylene bags, all of which took place in NILU's clean room. Shipments from NILU to the Zeppelin Observatory and vice versa were made in aluminum boxes, typically with ten filters in each parcel. During transport from NILU to the Zeppelin Observatory and back again, the parcel was kept in ambient air. At the Zeppelin Observatory, the filters were stored in a freezer (-18 °C) prior to and after being exposed. At NILU, exposed filter samples were stored in a freezer (-18 °C).
Thermal-optical analysis (TOA) was performed using the Sunset Lab OC/EC Aerosol Analyzer, using transmission for charring correction and operated according to the EUSAAR-2 temperature program 4 . Aliquots were cut from each of the exposed filters in our clean room, wrapped in preheated aluminum foil, locked in two Zip-lock polyethylene bags, and shipped to PSI for analysis. Upon combination of certain front filters, 60 ZEP samples were measured with offline AMS at PSI.
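The sampled-air volumes quoted for the high-volume samplers follow from flow rate multiplied by sampling time. The short Python sketch below reproduces that arithmetic for a nominal one-week sample at VRS, ZEP and UTQ, and checks the ZEP filter face velocity. The exposed filter diameter of 14 cm is an assumption introduced here for illustration (the text only gives the 150 mm filter diameter), and the function names are hypothetical.

```python
import math

WEEK_HOURS = 7 * 24  # nominal one-week sampling period

def sampled_volume_m3(flow_m3_per_h: float, hours: float) -> float:
    """Volume of air drawn through a filter at constant flow."""
    return flow_m3_per_h * hours

def face_velocity_cm_s(flow_m3_per_h: float, exposed_diameter_cm: float) -> float:
    """Mean air speed through the exposed (circular) filter area."""
    flow_cm3_per_s = flow_m3_per_h * 1e6 / 3600.0
    area_cm2 = math.pi * (exposed_diameter_cm / 2.0) ** 2
    return flow_cm3_per_s / area_cm2

# Flow rates taken from the station descriptions, converted to m3 per hour.
vrs_volume = sampled_volume_m3(500 * 60 / 1000, WEEK_HOURS)  # 500 L/min
zep_volume = sampled_volume_m3(40.0, WEEK_HOURS)             # 40 m3/h
utq_volume = sampled_volume_m3(1.2 * 60, WEEK_HOURS)         # 1.2 m3/min

# Assumed exposed diameter of 14 cm (not stated in the text).
zep_face_velocity = face_velocity_cm_s(40.0, 14.0)

print(round(vrs_volume), round(zep_volume), round(utq_volume))
print(round(zep_face_velocity, 1))
```

Under these assumptions the VRS weekly volume comes out close to the ~5000 m³ stated above, and a 14 cm exposed diameter is consistent with the reported face velocity of 72.2 cm s⁻¹.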
Text S2. Additional information on the AMS measurements
We tested Teflon filters vs. QFF, as Teflon filters were also available (alternating with QFF) from the Russian stations. However, the extraction efficiency of test Teflon filters in water was found to be significantly lower than that of the QFF, both filters being collected concurrently at PSI (to represent the same ambient sample, in order to assess only the potential effect of the substrate on extraction efficiency). In general, the filter substrate should be quartz fiber if the goal of the campaign includes OA monitoring: the only offline method available to quantify the OC loadings involves thermal decomposition of the OA, so a polymeric filter material could partially decompose, leading to OC artifacts.
The inorganic-salt artifact on the AMS CO2 and CO fragment ion signal 5 was also accounted for. We measured ammonium nitrate (AN) and ammonium sulfate (AS) standards (pure, without any filter extracts present) of three different concentrations (4, 12, 36 ppm) at the beginning and at the end of the campaign to correct data matrices for the inorganic AN/AS effects on both the CO and CO2 signals, as a function of loading and time. The resulting "b" parameters (slopes) used to modify the fragmentation table were, b-AN: 0.012 (CO2), -0.0032 (CO); b-AS: 0.0043 (CO2), -0.0026 (CO), with no significant variability between the start and end of the measurement campaign (duration ~5 consecutive days). These "b" parameters were significantly lower than the average/median values by Pieber et al. (2016) 5 , causing minor corrections.
Text S3. Auxiliary measurements
Additional offline analyses were carried out (using different punches/extracts than for the AMS measurements) to corroborate and validate the source apportionment results (Text S4), e.g. Fig. S12-S13 and Table S6. Elemental and organic carbon (Sunset-EC/OC) were quantified by thermal-optical analysis, following the EUSAAR2 protocol 4 (UTQ: NIOSH 5040; ALT: see next paragraph); water-soluble OC (WSOC) was measured by water extraction followed by catalytic oxidation and non-dispersive infrared detection of CO2 using a total organic carbon (TOC) analyzer 6 . We measured major ions (including methanesulfonic acid, MSA) in selected samples by ion chromatography (IC) 7 . Organic markers were determined for selected samples: sugar-alcohols and sugars were measured by high-performance liquid chromatography (HPLC) coupled with a fluorescence detector (LC 240 Perkin Elmer) and HPLC-pulsed amperometric detection 8 ; organic acids were determined by LC-MS. IC-based MSA from the same filters (selected samples) was also measured by other laboratories. The correlation between IC-based MSA and TOC+AMS/PMF MSA-OA at different stations is provided in Table S7.
The EC and OC concentrations of the filters collected at ALT were analyzed by a thermal evolution protocol, developed at ECCC as EnCan-Total-900 (ECT9), to quantify the amount of OC and EC in carbonaceous aerosol and their δ¹³C values 9,10 . The fractions were separated from each other according to their degree of refractoriness. Specifically, carbon fractions were released by the ECT9 protocol in three steps: (1) OC at 550 °C for 600 seconds in pure He; (2) pyrolyzed OC (PyOC) and carbonate carbon (CC) at 870 °C for 600 seconds in pure He; and (3) EC at 900 °C for 420 seconds in a mixture of 2 % O2 with 98 % He. All fractions were fully oxidized to CO2 by passing through a furnace containing MnO2 maintained at 870 °C. For concentration determination, the CO2 passed through a methanator at 500 °C, was converted to CH4, and was quantified with a flame ionization detector. Based on isotope measurements (¹⁴C and ¹³C), it was verified that the ECT9 protocol 11 effectively isolates OC or EC from complex mixtures of reference materials with an uncertainty of about 5 %.
Several environmental parameters were retrieved as well. Temperature data for TIK were obtained from Popovicheva et al. (2019) 12 . Temperature, solar radiation and snow depth data for VRS were obtained from the station website: https://villumresearchstation.dk/data/. Normal-climate average (1980-2010) snowfall data for UTQ were obtained from: http://akclimate.org/Climate/Normals; in June, July, and August the long-term absence of snow events at this site is evident, and PBOA concurrently increases significantly with decreasing snowfall, before becoming negligible starting from September. Electronic archive Arctic and Antarctic Research Institute (AARI) term meteorological and upper-air observations data at Research station "Ice Base Cape Baranova" 2013-2020 were obtained from: http://www.aari.ru/main.php?lg=1. Other data were retrieved from ebas (http://ebas.nilu.no) or measured within the current project. Data were averaged to match the time resolution of the filter sample composites measured by AMS.
Text S4. Additional information on the PMF analysis
Number of factors: Compared to n = 11, other factor solutions were less stable among the different random seed runs, for certain factors (Table S3): MSA-OA first appeared in the 8-factor solution but only in certain seed runs, while it was not identified in the 7-factor solution. Starting from n = 9, the time series of MSA-OA, BSOA, and PBOA became stable among the five random seed runs and were similar to those of the 11-factor solution in both absolute (slope close to 1.0) and relative terms (high R²). This was also the case for POA and haze but starting from n = 10. OOA and CHN-rich also became stable starting from n = 10, but with different absolute contributions than for n = 11. The Field blank-related factor (see "Retention of PMF factors related to ambient organic aerosols" subsection below) remained mixed with other factors for n ≤ 10 and therefore was the last one separated in the 11-factor solution. For n > 11, the solutions resembled the 11-factor solution, except for specific factor splits, e.g. for n = 13, the CHN-rich factor was split into three factors, or the CHN-rich and BSOA factors were split into two factors each. Even though these splits might indicate some inherent variability in these components, the robust separation of associated distinct features was not possible by PMF and/or for this specific dataset and/or sample size. Therefore, with regard to a partial exploration of the rotational ambiguity, the analysis described above showed that the 11-factor solution was the most robust.
Error/sensitivity analysis: The uncertainty analysis approach followed in the present study is described as follows: a preliminary uncertainty analysis was carried out by performing 21 free BS runs, of which 16 were similar to the 11-factor "base case" solution. The rejected runs were related to (temporal) co-variability between certain factors (e.g. of POA and haze) and to one sample from TIK (24.09.2014) representing a pollution episode (highest organic mass among all stations) likely not explained by any factor. Unlike all other samples, the Q/Qexp of this sample (only) remained high in the 11-factor solution. Therefore, this outlier sample was not included in any discussion/presentation. From the 16 retained runs, we therefore used the 11-factor average mass spectral profiles' relative fragment ion intensities within one standard deviation (1 SD) to proceed with running BS 100 times to obtain a modeling error estimate. By applying these constraints on the retrieved factor mass spectral profiles, we aimed at guiding the solution towards environmentally meaningful rotations 13 , but without forcing the profiles into too narrow intervals that could potentially result in unrealistic/biased relative errors. We also introduced a "block BS" approach in these 100 runs, using a block length (l) of 7 consecutive samples, according to the semi-empirical criterion 14,15 l = N^(1/3), where N was the sample size (~350). Therefore, 50 non-overlapping blocks were created (each block contained samples from one station). These were treated as single-sample blocks in the different resampling runs, which assisted in preserving the original time series structure to a certain extent, i.e. to account for the partial co-variability between e.g. haze and POA observed in the preliminary uncertainty analysis. Besides addressing co-variability issues, the blocked bootstrap strategy is also recommended when performing fewer than ~100 BS runs 16 .
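The block bootstrap described above can be sketched in a few lines of Python. This is a minimal illustration only: it treats the dataset as a single ordered series of 350 sample indices and ignores station boundaries, whereas in the study each block contained samples from one station. The function names are hypothetical, and the PMF run on each resampled matrix is not shown.

```python
import random

def block_length(n_samples: int) -> int:
    """Semi-empirical block length l = N^(1/3), rounded to the nearest integer."""
    return max(1, round(n_samples ** (1.0 / 3.0)))

def make_blocks(indices, length):
    """Split an ordered list of sample indices into non-overlapping blocks."""
    return [indices[i:i + length] for i in range(0, len(indices), length)]

def block_bootstrap(blocks, rng):
    """Draw len(blocks) blocks with replacement and concatenate their indices."""
    chosen = [rng.choice(blocks) for _ in blocks]
    return [idx for block in chosen for idx in block]

n = 350                                  # approximate sample size from the text
l = block_length(n)                      # block length in consecutive samples
blocks = make_blocks(list(range(n)), l)  # non-overlapping blocks
rng = random.Random(0)                   # fixed seed, for reproducibility only
resample = block_bootstrap(blocks, rng)  # row indices for one BS run

print(l, len(blocks), len(resample))
```

With N = 350 this gives a block length of 7 and 50 blocks, matching the numbers quoted in the text. Resampling whole blocks rather than single samples keeps neighbouring samples together, which is what preserves part of the time series structure.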
By following this approach, all 100 BS runs matched the base case solution with relative errors below 30 % on average for factor mass concentrations > 50 ng m⁻³, with MSA-OA clearly being the least uncertain factor (Fig. S6). On the basis of comparable absolute mass concentrations, our relative errors were generally lower than those reported for > 100 ng m⁻³ in Daellenbach et al. (2017) 17 .
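The relative errors referred to here are defined in the caption of Fig. S6 as 1 SD divided by the average, or half the IQR divided by the median, of the bootstrap estimates. A minimal Python sketch of both definitions follows. The choice of the sample standard deviation and of the default quartile method in the `statistics` module are assumptions, since the text does not specify them, and the input values are invented for illustration.

```python
import statistics

def relative_error_sd(values):
    """1 SD of the bootstrap estimates divided by their mean."""
    return statistics.stdev(values) / statistics.mean(values)

def relative_error_iqr(values):
    """Half the interquartile range divided by the median."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2.0 / statistics.median(values)

# Illustrative factor concentrations (ng m-3) for one sample across BS runs.
bs_estimates = [90.0, 100.0, 110.0]
err_sd = relative_error_sd(bs_estimates)
err_iqr = relative_error_iqr(bs_estimates)
```

The IQR-based form is less sensitive to a few outlying bootstrap runs, which may be why both variants are reported.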
In parallel, we assessed the sensitivity of the 11-factor solution by testing a complementary, independent approach conceptually similar to bootstrapping (not adopted as the main one eventually). This approach consisted of running PMF on randomly reduced datasets (< 350 samples in the input matrix) with variable sample size, i.e. 33 %, 50 %, 70 % and 85 % of the 350 samples, 5 times for each sample size (20 sensitivity runs in total), following the approach of Hedberg et al. (2005) 18 . The BSOA, MSA-OA and PBOA factors were identified (matching the base case) in all of these 20 sensitivity runs. At the same time, we observed similar features to those of the selected approach described before, i.e. one TIK sample affecting the output when randomly selected or not (e.g. misattribution of the CHN-rich, OOA and FB-related factors), and temporal co-variability, e.g. between the haze and POA factors, leading to imprecise factor identification. Overall, 9 runs from this approach matched the base case solution, with the acceptance-to-rejection ratio increasing with sample size (80 % solution acceptance for 85 % randomly reduced datasets). This is because reduced datasets may be explained by fewer than 11 factors (if constraints are not introduced) and indicates the importance of large PMF-input datasets in sufficiently capturing the variability in both the chemical composition and temporal trends. Without considering the high-loading TIK sample, 13/20 runs would instead be accepted. This exercise therefore provided further support of the observed stability of the solution, upon running BS with partial constraints on the retrieved mass spectral profiles, as eventually selected for this study and described above.
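The design of the reduced-dataset sensitivity test (four sample sizes, five random draws each) can be laid out as a short Python sketch. Only the selection of the sample subsets is shown; the PMF run on each subset and the comparison with the base case are outside its scope. The function name, the seed and the use of integer truncation for the subset size are assumptions made for illustration.

```python
import random

def reduced_datasets(n_samples, percentages, repeats, seed=0):
    """Return (percentage, sorted sample indices) for each sensitivity run."""
    rng = random.Random(seed)
    runs = []
    for pct in percentages:
        k = n_samples * pct // 100  # subset size, truncated to an integer
        for _ in range(repeats):
            subset = sorted(rng.sample(range(n_samples), k))
            runs.append((pct, subset))
    return runs

runs = reduced_datasets(350, [33, 50, 70, 85], repeats=5)

# Acceptance fractions implied by the counts reported in the text.
accepted_all = 9 / 20              # 0.45 with the high-loading TIK sample
accepted_without_outlier = 13 / 20  # 0.65 without it
```

Sampling without replacement distinguishes this test from bootstrapping, where the resampled matrix keeps the original size but contains repeated samples.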
We also compared the 11-factor solution that included all m/z (up to 191) vs. the solution with m/z up to 133 (i.e. base case). The high correlation coefficients indicate excellent agreement for all OA factor time series (Table S4). No difference in the obtained PMF result was therefore observed by considering fragments with m/z > 133 or not. We note that the day-to-day relative contribution of these larger fragments to the total AMS signal by HR organics was 1.5 ± 0.4 %, although many of them exhibited high SNR (in the samples compared to the AMS water-blanks).
Retention of PMF factors related to ambient organic aerosols: We observed significant decreases in the fCO2 after fumigation for five selected samples with very high measured initial fCO2 (~0.75), which confirmed the presence of carbonate. By contrast, the decrease in fCO2 was not as significant for five other samples with lower carbonate-related factor content in relative terms (Fig. S3). We also found excellent agreement between the sample mass spectra after fumigation vs. mathematical subtraction of the carbonate-related factor as retrieved by PMF from the respective original measured AMS spectra (Fig. S3), which provides strong evidence of the sufficient removal of inorganic carbonate via fumigation. These observations support the mathematical subtraction approach used in the present study, especially considering that chemical damage of the organic content can occur upon fumigation 19 . The absolute concentrations of the remaining factors (median and IQR from the 100 BS runs) were corrected/rescaled to measured WSOC using a relative ionization efficiency of 1.4 for all organics 20 vs. 1.16 for carbonate 21 . We performed test AMS/PMF runs by including the field blank samples in the input matrix. The contributions of each organic factor in the blanks were then compared to the respective contributions in the samples (Fig. S4-S5). The absolute concentrations of the field-blank (FB)-related factor were not statistically different between the field blanks and the samples (Fig. S4). Further, the FB-related factor exhibits higher relative contributions at stations with non-pre-baked filters (one third, 35 ± 17 %, vs. one fifth, 21 ± 13 %, of the total signal, i.e. sum of the 11 factors, for pre-baked filters).
Also, the FB-related factor profile from the base case solution correlated with one of the two factor profiles identified by applying PMF on the FB AMS spectra from various stations (scatter plot of 578 fragment relative intensities, R²: 0.99; slope: 1.00; both profiles had identical fCO2 = 0.43). These provided both quantitative and qualitative support of a systematic association of the FB-related factor and its mass spectral fingerprint with the organic mass on the filter substrate at the different stations.
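Profile comparisons of this kind reduce to an ordinary least-squares fit between two vectors of relative fragment intensities, reported as a slope and an R². The Python sketch below shows one way to compute both; it is an illustration rather than the procedure used in the study, and the two short profiles are synthetic placeholders (the real comparison used 578 fragments).

```python
def slope_and_r2(x, y):
    """Least-squares slope of y on x (with intercept) and squared Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Synthetic normalized profiles (fractions of total signal), illustration only.
profile_base_case = [0.43, 0.20, 0.15, 0.12, 0.10]
profile_blank_pmf = [0.43, 0.21, 0.14, 0.12, 0.10]

slope, r2 = slope_and_r2(profile_base_case, profile_blank_pmf)
```

A slope near 1 indicates agreement in absolute terms and a high R² agreement in relative terms, which is how the same pair of metrics is used for the time series comparisons in Table S3.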
We identified two non-interpretable factors; one was S-rich (F1) and the other was N-rich (F2). Their combined profile fingerprint appeared to be separated as a second factor when applying PMF on the field blanks, with minor relative contributions compared to the FB-related factor but enhanced for ALT. Their combined contributions in the samples were overall significantly lower than those of the FB-related factor, amounting to less than 5 % of the total signal in the base case solution (sum of 11 factor absolute mass concentrations over all stations), with elevated contributions in samples from ALT. In many samples these factors did not contribute with a real signal (Fig. S4), they exhibited relatively low absolute concentrations lacking temporal trend, and did not contain source-marker fragments nor correlated with available auxiliary data. Also, backward trajectory analysis did not provide any indication of specific source regions. These relatively minor factors were therefore not identified and thus were not considered for the discussion in the main text nor in the results presentation.
The CHN-rich factor (N:C ~0.11; Fig. S2) was rich in proteinaceous matter and dominated the variability of reduced N-containing fragments typically related to amino acids (Table S5). It exhibited yearly-average concentrations > 100 ng m⁻³ in BAR, GRU and TIK. While the composition of this factor is well-understood, links to natural or anthropogenic primary emissions remain elusive. We hypothesize an association with combustion emissions (trash burning, landfills at TIK), biological matter (terrestrial dust, phytoplankton production, bacteria and biological degradation) and/or sea salt aerosol arising from the marine microlayer 22,23,24 . The latter can be supported by its fair correlation with estimated sea salt concentrations at GRU (R²: 0.5), which could partially explain a lack of a clear temporal trend at the different stations. However, anthropogenic emissions cannot be excluded. This factor was relatively less defined and its contributions in the samples appeared to be distinguishable from the contributions in the blanks only for high mass concentrations (Fig. S4).
Final AMS/PMF result: Recovery analysis was performed using PMF, following a simplified version of the approach of D. Bhattu et al. (pers. comm.). Briefly, the analysis was carried out on 265 samples where Sunset-OC data were available. Fifty BS runs were performed in total, where the output time series were constrained using all 11 factors (normalized median concentrations) within their IQR, as obtained from the 100 BS runs performed on the water-soluble fraction. In these 50 runs, the water-insoluble OC (Sunset-OC minus WSOC) time series was used as an additional variable in the PMF input matrix (scaled to WSOA of the input matrix). The recovery was then defined as the water-soluble-to-total signal ratio of the output profile for each factor. The well-constrained resulting recoveries are shown in Fig. S9 for the six retained OA factors. The lowest water-solubility was ~60-80 % for POA, PBOA, haze and OOA, whereas MSA-OA and BSOA can be considered fully water-soluble. The median recoveries of the Carbonate-related, FB-related, F1, F2 and CHN-rich factors were 90 %, 100 %, 99 %, 77 %, and 100 %, respectively.
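Since the recovery is a water-soluble-to-total ratio, converting a water-soluble factor concentration to total OA amounts to dividing by the factor's median recovery. A minimal Python sketch of that conversion follows. The recovery values used are illustrative placeholders chosen within the ranges quoted above, not the fitted medians from Fig. S9, and the function name is hypothetical.

```python
def total_from_water_soluble(wsoa_ng_m3: float, recovery: float) -> float:
    """Scale a water-soluble factor concentration to total OA mass."""
    if not 0.0 < recovery <= 1.0:
        raise ValueError("recovery must be in the interval (0, 1]")
    return wsoa_ng_m3 / recovery

# Illustrative recoveries only: 0.7 lies in the ~60-80 % range quoted for
# POA, PBOA, haze and OOA; 1.0 represents a fully water-soluble factor.
haze_total = total_from_water_soluble(70.0, 0.7)
msa_total = total_from_water_soluble(50.0, 1.0)
```

A factor with lower water solubility is therefore scaled up more strongly, which changes the relative factor contributions between the WSOA and total OA figures.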
Source-marker AMS fragments: We provide here specific fragments identified in our dataset as characteristic of specific sources, which were also identified in previous studies. These fragments were selected based on their highest contribution to these factors and the dominant contribution of these factors to these fragments. The numbers next to elements correspond to subscripts in standard chemical formulas, while the number behind each fragment indicates its m/z value. All AMS fragments correlating with each AMS/PMF factor full-dataset time series are listed in Table S5.

Text S5. Additional information on the back-trajectory analysis
The aim of this analysis was to identify long-term inner-Arctic vs. distant (transported) OA source components by coupling AMS/PMF with CWT. Based on the overall obtained results from all factors at all stations, and considering which factors are expected to be transported, we focused on the presentation and discussion of obtained results for haze, POA, MSA-OA and BSOA, in order to further support their identification/interpretation, especially for haze. The atmospheric lifetime of pollutants is generally much longer in Arctic winter than at lower latitudes, due to slow dry/wet deposition. Tests were carried out with 10- vs. 14-d BTs for haze arriving at (remote) ALT (the component most likely to be the most aged across all sites), considering the large mean Arctic age of air in winter 42,43 , but the result remained unchanged. Therefore, we did not extend the 10-d BTs in the subsequent main runs. We note that back-trajectories on the order of 10 d may emphasize the importance of Eurasia for Arctic pollution, while we cannot exclude transport over longer timescales and the importance of more southerly sources, e.g. in Asia. For MSA-OA prevalent in summer, 5-d BTs proved sufficient in identifying potential source origins in our study, in line with the relatively shorter mean Arctic age of air in summer 42 . We are aware of potential inaccuracies and artifacts related to, for instance, sparse Arctic weather data 44 , complex orography around the station 45 , surface effects 46 and generally shallow planetary boundary layer heights (low inversion layers) 47 ; however, we have used long-term data and merged results (as detailed in the following) in an attempt to reduce their potential impact 48 on the main trends at a regional scale 49 .
The main observations from individual station results shown in Fig. S11 were the following: i) haze (10-d BTs; 5-d for Sub-Arctic PAL) is largely transported from Europe and mainland Russia to the different Arctic stations; ii) POA (10-d BTs unless otherwise noted) has mainly Eurasian potential source regions, possibly except for GRU and TIK where a more local source influence can be expected (only summer samples were available), in which cases a trajectory-based approach conceptually fails to provide meaningful information; iii) MSA-OA (5-d BTs) exhibited a clear marine origin at all stations (negligible influence in TIK); iv) a similar (distant) source region in central Siberia is found for BSOA arriving at both BAR and TIK (Russian stations; 10-d BTs); a marine distant source at UTQ (10-d BTs) is not ruled out; more local/regional influence is expected at PAL (5-d BTs), in which case a trajectory-based approach conceptually fails to provide meaningful information. BT results for the BSOA factor in other stations were not considered/interpreted due to lack of response to temperature (see Fig. S14).
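For readers unfamiliar with concentration-weighted trajectories, the basic idea can be sketched as follows: each grid cell is assigned the average of the receptor concentrations associated with the trajectories that crossed it, weighted by the time spent in the cell. The Python sketch below is a generic textbook form, assuming equally spaced trajectory endpoints (so that each endpoint counts as one unit of residence time) and a hypothetical 1° grid. It is not the ZeFir implementation used in the study, which may apply additional weighting or smoothing.

```python
import math
from collections import defaultdict

def cwt(trajectories, concentrations, cell_deg=1.0):
    """Concentration-weighted trajectory field on a regular lat/lon grid.

    trajectories: list of back-trajectories, each a list of (lat, lon) endpoints
    concentrations: receptor concentration associated with each trajectory
    Returns {(lat_index, lon_index): weighted mean concentration}.
    """
    weighted_sum = defaultdict(float)
    residence = defaultdict(float)
    for endpoints, conc in zip(trajectories, concentrations):
        for lat, lon in endpoints:
            cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            weighted_sum[cell] += conc  # one endpoint = one unit of time
            residence[cell] += 1.0
    return {cell: weighted_sum[cell] / residence[cell] for cell in residence}

# Two toy trajectories with invented receptor concentrations (ng m-3).
toy_trajectories = [[(70.5, 10.5), (65.5, 20.5)], [(70.5, 10.5)]]
toy_concentrations = [100.0, 300.0]
field = cwt(toy_trajectories, toy_concentrations)
```

Cells crossed only by trajectories that arrive with high factor concentrations obtain high values, which is why the resulting maps are read as potential source regions rather than as emission estimates.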
Supplementary Tables
Table S1. Filter sampling coverage.
Table S2. Polar night (wintertime absolute darkness) and midnight sun (summertime continuous sunlight) periods at the different Arctic stations. Note that these time periods vary slightly from one station to another due to their different latitude (e.g. earlier onset and longer duration of wintertime darkness at higher-latitude stations).
Table S3. Time series correlations between identified factors in various AMS/PMF solutions for different number of factors (from 7 to 13), and the respective factor time series from the base case 11-factor solution. Factors colored in grey (Carbonate, FB-related, F1, F2, CHN-rich) were interpreted to not be related to sampled organics or major OA sources.
Table S4. Same as Table S3, but for the 11-factor solution by including fragments up to m/z 133 vs. all m/z. The excellent agreement between the 11-factor base case solution and the median factor time series based on the 100 BS runs is also demonstrated.
Table S5. Correlation of 266 AMS fragment ion time series (used in the PMF input) with the corresponding full-dataset time series of seven PMF-output OA factors (identified "marker" fragments are indicated in bold). Fragments were classified according to the Pearson's r correlation coefficients, and then they were ordered by increasing m/z (25 HR fragments with SNR < 2.0 are indicated in italics). Fragments correlating with two factors are indicated with the color of the other factor.
showing that solutions with more than 11 factors did not further reduce the residuals significantly (<10 %). c) Scaled residuals (Q/Qexp; color scale on the right) of the base case 11-factor solution, for the full dataset (variables: HR fragment relative intensity).
Figure S2. All 11-factor AMS/PMF mass spectra (profiles; shown as normalized fragment intensities) in HR with average atomic ratios, where the fragments are color-coded with the family. Factors with legends colored in grey (Carbonate-related, F1, F2, FB-related, CHN-rich) were interpreted to not be related to sampled organics or major OA sources. The spectra for m/z > 50 are magnified (right panel, ×10⁻³).

Figure S3. Comparison of normalized mass spectra, averaged for five samples with higher initial fCO2 and five other samples with lower initial fCO2, with fumigation vs. without fumigation before the AMS measurement, as well as with fumigation vs. "without fumigation minus Carbonate-related factor". The former comparison indicates a substantial decrease in the CO/CO2 signal upon fumigation in samples with larger initial fCO2, while the latter comparison supports the mathematical subtraction of the Carbonate-related factor from the PMF analysis to account for the presence of inorganic carbonate in our samples.
Figure S4. Normalized cumulative distribution functions (CDFs) for the water-extracted organic carbon mass concentration of the FB-related, F1, F2 and CHN-rich factors in the samples and in the blanks, for six stations (number of samples = 250) with available field blank filters. Open circles display actual data in thirty concentration bins (lines: fitted curves). Note that the same range of x and y variables is shown in all panels. Insets show normalized counts in log scale with different bin number for demonstration. The blank concentrations in each sample were estimated using station-specific (and season/year-specific, where applicable) blank-filter relative organic factor composition (from PMF) and the sample-specific m³ of sampled air per cm² of filter area. Similar ranges and distributions were found for the FB-related factor. F1 and F2 did not exhibit concentrations higher than 100 ng m⁻³. Together with CHN-rich, these factors did not have statistically different contributions in the samples from those in respective field blanks, i.e. [IQR] of day-to-day sample-to-blank mass ratio not statistically different from 1.0: FB-related, [0.7, 3]; F1, [1, 6]; F2, [1, 4]; and CHN-rich, [0.9, 3].
Figure S5. Normalized cumulative distribution functions (CDFs) for the mass concentrations of the six retained/interpretable WSOC factors in the samples and in the blanks, for six stations (number of samples = 250) with available field blank filters. Open circles display actual data with bin size 10 ng m⁻³ (lines: fitted curves). Insets show normalized counts in log scale with different bin size for demonstration. The blank concentrations in each sample were estimated based on station-specific (and season/year-specific, where applicable) blank-filter relative organic factor composition (from PMF) and the sample-specific m³ of sampled air per cm². In contrast to the factors discussed in Fig. S4, these six factors did not contribute significantly to the signal of the field blanks. Specifically, all six factor concentrations in the blanks resided in the first or first two concentration bins (0-20 ng m⁻³). Therefore, their detection limits were relatively low and their high concentrations in the samples can be considered real with high confidence. The full-dataset P99,sample-to-P99,blank mass ratio for POA, haze, MSA-OA, BSOA, PBOA and OOA is 7, 19, 34, 93, 12 and 10, respectively (P: percentile).
Figure S6. Scatter plot of the relative fragment intensities vs. their standard deviation (1 SD) for the six retained major WSOA factors from 100 BS runs, and the resulting time series of relative error (1SD/average or relative IQR/2) vs. the average or median factor concentrations. Black line shows the 1:1 line. Individual factor time series linear fits are shown with lines having the same color as the WSOA factor.
Figure S7. Station-specific seasonal absolute WSOA mass concentrations, sorted in descending order of the station annual-average OA, and the respective relative factor contributions to WSOA mass (before recovery corrections). The corresponding panels for total OA are found in the main text (Fig. 1).
Figure S8. Same as Fig. 2, but for WSOA factors.
Figure S9. Estimated AMS/PMF recoveries for the major OA sources. The median values were used to convert the water-soluble to total OA mass.
Figure S10. Time series of cumulative (median) absolute factor contributions to total OA mass at each station (median composite dates shown for the sampling period at each station from start to end).
Figure S11. Back-trajectory analysis results using ZeFir, based on concentration-weighted trajectories (CWT), where the entire time series of each WSOA factor mass from each station were used as input. The receptor site is indicated with a red circle. The heat maps indicate air parcels responsible for high measured factor concentrations arriving at a receptor site (label), and thus potential major source regions for the associated long-term datasets. Results for stations with very low year-long factor concentrations that are not omitted here should be interpreted with caution.
Figure S16. Same as Fig. 3, but for total OA (sum of 6 factors).