A global view on the effect of water uptake on aerosol particle light scattering

A reference dataset of multi-wavelength particle light scattering and hemispheric backscattering coefficients for different relative humidities (RH) between RH = 30 and 95% and wavelengths between λ = 450 nm and 700 nm is described in this work. Tandem-humidified nephelometer measurements from 26 ground-based sites around the globe, covering multiple aerosol types, have been re-analysed and harmonized into a single dataset. The dataset includes multi-annual measurements from long-term monitoring sites as well as short-term field campaign data. The result is a unique collection of RH-dependent aerosol light scattering properties, presented as a function of size cut. This dataset is important for climate and atmospheric model-measurement inter-comparisons, as a means to improve model performance, and may be useful for satellite and remote sensing evaluation using surface-based, in-situ measurements.


Methods
Sites. This study utilises data from 26 sites across the globe, operating both on a campaign basis and as part of long-term monitoring efforts. Table 1 gives an overview of the sites and defines the acronyms used throughout the manuscript.
A majority of the data come from the USA Department of Energy Atmospheric Radiation Measurements (DOE/ARM) deployments either at their long-term atmospheric observatory (SGP) or via the ARM Mobile Facility (AMF) campaigns (FKB, GRW, HFE, MAO, NIM, PGH, PVC, and PYE). Another large subset of the data was obtained during field campaigns in Europe (6 sites: CES, HYY, JFJ, MEL, MHD, and ZEP) performed by a research group from the Paul Scherrer Institute (PSI) in Switzerland. A detailed comparison of the PSI sites, recommendations for instrument operation and closure studies can be found in Zieger et al. 16 . More data was obtained from long-term monitoring sites in the USA National Oceanic and Atmospheric Administration Federated Aerosol Network (NOAA-FAN) (4 sites: APP, BRW (supported by DOE), THD and UGR) and shorter field campaign deployments by the NOAA-FAN research group (4 sites: CBG, GSN, HLM, and KCO). A few additional institutes like the University of Crete, the Chinese Academy of Meteorological Sciences (CAMS), and the USA National Park Service's Interagency Monitoring of Protected Visual Environments (IMPROVE) program have also provided data from their deployments of tandem nephelometer systems (FIK, LAN and YOS, respectively; see Table 1 for references).
This study presents, for the first time, the f(RH) results from the following sites: APP, BRW, FIK, GRW, HLM, MAO, NIM, PYE, and THD. Five of the sites (APP, BRW, GRW, SGP and THD) provide more than one year of continuous hygroscopicity measurements, enabling the investigation of annual cycles and climatologies in f(RH). Figure 1 shows data coverage for each site. More information about the measurement stations is provided below. The air sampling infrastructure at all DOE/ARM and NOAA sites utilizes the inlet system developed by NOAA/ESRL and follows GAW aerosol sampling protocols 18,22 . Other sites have individual characteristics which are briefly described below and in more detail in the provided references.
Appalachian State (APP), USA. The Appalachian Atmospheric Interdisciplinary Research Facility (APP) is situated at the highest point on the Appalachian State University campus (1080 m a.s.l.), in the heavily forested southern Appalachian Mountain region of North Carolina in the south-eastern USA. Although there are no major local aerosol sources (other than commuter and tourist traffic), the aerosol inlet is located 34 m above ground to minimize sampling of local sources. Secondary organic aerosol (largely isoprene-derived in summer) and sulphates dominate the sub-micron aerosol mass sampled at APP, along with a biomass burning influence during non-summer months 30 . More details about the site and the temporal variability of the light scattering coefficient can be found in Sherman et al. 31 .
www.nature.com/scientificdata www.nature.com/scientificdata/ Barrow (BRW), USA. The Barrow facility is a coastal Arctic site in northern Alaska operated by NOAA/ESRL. The station is surrounded by flat tundra, large lagoons and lakes, and is approximately 1 km from the Arctic Ocean. The predominant wind direction is from east-north-east from the Beaufort Sea with minimal anthropogenic pollution. Generally, the station can be described as having an Arctic maritime climate. A description of the site as well as statistics and temporal variability of light scattering coefficient (among other optical properties) can be found in Delene et al. 32 .
Chebogue Point (CBG), Canada. A short-term, ground-based field site was established as part of the International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) at Chebogue Point 33 . This coastal site was located at the south-west tip of Nova Scotia, Canada, 9 km south-south-west of the small town of Yarmouth. The Maine/New Brunswick coastline lies 130 km to the north-west across the Gulf of Maine. The cities of Boston and New York are 430 km and 730 km, respectively, to the south-west. Further site-specific information can be found in Ervens et al. 9 .
Cabauw (CES), Netherlands. The Cabauw Experimental Site for Atmospheric Research (CESAR 34 ) is located about 40 km from the North Sea at 0.7 m below sea level, while air is sampled at around 60 m a.s.l. The station's environment is typical for north-west Europe and can be described as background rural and maritime, depending on the wind direction and air mass influences. Further site information and previous results of f(RH) measurements and their link to hygroscopicity and remote-sensing data can be found in Zieger et al. 12 .
Finokalia (FIK), Greece. The remote coastal site of Finokalia, representative of the Eastern Mediterranean area, is located on the north-eastern coast of the island of Crete at the top of a hill at around 250 m a.s.l. While FIK is primarily a remote marine location, long-range transport episodes from Athens, central Europe, Asia, and North Africa can strongly affect the site. Aerosol measurements are conducted in a dedicated building at the station equipped with various aerosol inlets which sample at 4 m above ground level. More site-specific details can be found in Kalivitis et al. 35 .
Black Forest (FKB), Germany. This dataset was obtained within the context of the Convective and Orographic Induced Precipitation Study (COPS) field campaign in Heselbach, Germany. The site is located in a low mountain valley of the Black Forest surrounded by agricultural activity and is downwind of Stuttgart. Thus, the aerosol measured at FKB is primarily representative of rural continental air with occasional incursions from urban sources. The site has a typical mid-latitude moderate climate. Measurements reported here come from a deployment of the DOE/ARM Mobile Facility 36 . Fierz-Schmidhauser et al. 23 present the comparison between data measured by the NOAA and PSI systems, while in this study only data from the NOAA system have been analysed.
Graciosa (GRW), Portugal. The DOE/ARM Mobile Facility was deployed on the island of Graciosa to support the campaign Clouds, Aerosol and Precipitation in the Marine Boundary Layer (CAP-MBL). Graciosa is situated within the Azores archipelago in the eastern Atlantic Ocean. This marine site lies in the boundary between the subtropics and the mid-latitudes and experiences a wide range of meteorological conditions, ranging from undisturbed trade wind flow to cyclonic systems or extensive low-level stratus clouds. While the site is dominated by clean marine air masses, it can experience periodic episodes of polluted air masses from North America and Europe, and dust from the Saharan desert. More details on the GRW site can be found in Wood et al. 37 .
Gosan (GSN), South Korea. The Gosan supersite on Jeju Island, off the southern tip of South Korea, measured aerosol optical properties during the ACE-Asia campaign. Wintertime and spring flow is predominantly out of the north-west, carrying dust from the Loess regions, sea salt and pollution from coastal China. Local pollution includes burning and night-time fishing vessels. The station is at the top of a 72 m a.s.l. cliff and the inlet is at a height of 10 m above the ground. A site description and study of aerosol optical properties measured at Gosan are provided in Doherty et al. 38 .
Shouxian (HFE), China. The ARM-China campaign deployed the DOE/ARM Mobile Facility to Shouxian, in the Anhui province of China, located around 500 km west of Shanghai. The site lies within the rural region on Jiang-Huai between the Huai and Yangtze rivers. The site is located at the edge of a rural town and is largely surrounded by farmland. The weather is influenced by the East Asian monsoon system and the site is characterized by mixed agricultural, pollution and dust aerosol from road and building construction in nearby Nanjing. More details about the HFE aerosol sampling system can be found in Jefferson et al. 39 and previous results in Liu et al. 40 .
Holme Moss (HLM), UK. The Holme Moss site is located in Yorkshire in north-western England, approximately 30 km to the north-east of the city of Manchester and is characterized as a polluted rural site 41 . Aerosol measurements were made at HLM as part of a joint field campaign of the NOAA-FAN and University of Manchester research groups. The site is described in more detail in Liu et al. 41 .
Hyytiälä (HYY), Finland. The station SMEAR II is located in Hyytiälä, southern Finland 42,43 . This is an established long-term site surrounded by dense forests. The largest nearby city is Tampere, at around 60 km south-west. In this study the measurements from the campaign carried out as part of the EU-FP7 project PEGASOS are analysed. Details about instrument and campaign settings, previous results, including comparisons to aerosol mass spectrometer and airborne profile measurements, can be found in Zieger et al. 26 .
Jungfraujoch (JFJ), Switzerland. During the Cloud and Aerosol Characterization Experiments (CLACE) campaign, measurements were performed at the Jungfraujoch research station 44 . Due to its high altitude (3580 m a.s.l.), JFJ is situated in the free troposphere most of the time. Nevertheless, thermal convection transports air from the planetary boundary layer to the site (especially during summer), and long-range transport events such as African desert dust intrusions 13 or volcanic ash from Iceland can also be observed 45 . More information on the humidified nephelometer measurements at JFJ and the site in general can be found in Fierz-Schmidhauser et al. 46 , Zieger et al. 13 and Bukowiecki et al. 44 .
[...] airflow is from the Indian subcontinent. During the first half of the campaign (Feb 14-Mar 19, 1999) the air masses sampled originated from the Bay of Bengal and Calcutta region, while from Mar 10-Mar 28 winds were from the Arabian Sea. The measurements described here took place as part of the Indian Ocean Experiment (INDOEX) campaign and details can be found in Eldering et al. 47 and Ramanathan et al. 48 .
Lin'an (LAN), China. The Lin'an Regional Atmosphere Background Station is located in the center of the Yangtze River Delta, China. It is approximately 11 km north of the city of Lin'an, ~50 km west of Hangzhou, and ~210 km south-west of Shanghai. The Lin'an station is on the top of a small hill, in an area primarily covered by bamboo forests and rice paddies, and represents the polluted background conditions of the Yangtze River Delta. Previous results of the relative humidity dependence of aerosol light-scattering for LAN have been presented in Zhang et al. 49 . Previous results about the hygroscopicity measurements can be found in Fierz-Schmidhauser et al. 52 .
Niamey (NIM), Niger. The DOE/ARM Mobile Facility deployment in Niamey during 2006 was associated with two large international campaigns: the African Monsoon Multidisciplinary Analysis (AMMA) and the Geostationary Earth Radiation Budget (GERB) experiment. Niamey, the capital of Niger, is located in the south-west region of the country, next to the Niger River. Due to both its location and the local meteorology, the region experiences episodes of mineral dust (from the Sahara) and biomass burning aerosols in the dry season and deep tropical convection in the wet season. The measurement site was located near the Niamey Airport, close to runway traffic and jet exhaust plumes. A detailed description of the site and instrument set-up is given in Miller et al. 53 .
Nainital (PGH), India. Nainital is located in the foothills of the central Himalayas at an altitude of 1958 m a.s.l. The aerosol measurements were performed at the Aryabhatta Research Institute for Observational Sciences observatory at Manora Peak during the Ganges Valley Aerosol Experiment (GVAX). Nainital is impacted by both local and transported aerosol plumes. At specific time periods (winter time, early morning and late evening) the growth of the planetary boundary layer plays a major role in transporting aerosols from the valleys to the site, producing significant perturbations in aerosol properties. Measurements reported here come from a deployment of the DOE/ARM Mobile Facility during a 9-month campaign. Results of the f(RH) measurements for PGH have been previously presented in Dumka et al. 54 and Gogoi et al. 55 .
Cape Cod (PVC), USA. The measurements at Cape Cod were conducted by the DOE/ARM during the Two-Column Aerosol Project (TCAP) deployment at Cape Cod, Massachusetts. Cape Cod is a peninsula jutting out into the Atlantic Ocean in the easternmost portion of the state of Massachusetts, in the north-eastern USA. The deployment was located in the north-eastern part of the cape, inside the Cape Cod National Seashore, and relatively close to large urban agglomerations such as Providence and Boston. Due to its location, the site is subject to both clean maritime and polluted conditions. Titos et al. 29 [...]
[...] While the site is primarily a clean marine location, there are a number of dairy farms around the site and the area receives more than 2 million visitors annually. More information on this site can be found in Berkowitz et al. 56 .
Southern Great Plains (SGP), USA. The USA Department of Energy Atmospheric Radiation Measurement Southern Great Plains (SGP) facility is located in north central Oklahoma. The site is located in an agricultural region with mostly wheat, corn, alfalfa, and hay crops. The closest urban centres are Wichita, Kansas, 113 km north, and Oklahoma City, Oklahoma, 136 km south of the site. More details about the instrumentation and operation of the aerosol observing system can be found in Sheridan et al. 22 , and previous aerosol hygroscopicity results have been presented in Jefferson et al. 57 and Sheridan et al. 22 .
Trinidad Head (THD), USA. Trinidad Head, California, is located 320 km north of the San Francisco Bay area, 320 km south of Eugene, OR, and 0.5 km from the Pacific Ocean. This site is located relatively far from large local or regional sources of anthropogenic pollution. THD was established in 2002 at the start of a 1-month intensive field campaign (Intercontinental Transport and Chemical Transformation, ITCT 2K2 58 ). The objective of establishing this site was to study aerosol properties entering the USA before they were influenced by North American pollution sources. The site continued as a NOAA monitoring site after the ITCT 2K2 project, but over time instruments were progressively removed, and the site was closed in June 2017.

Granada (UGR), Spain. Granada is a medium-sized city in south-eastern Spain. It is situated in a valley surrounded by mountains. The sampling site is located at the Andalusian Institute for Earth System Research (IISTA-CEAMA, University of Granada) in the southern part of the city and it is less than 500 m away from a highway that surrounds the city. The main local aerosol source is road traffic, with influences from domestic heating and biomass burning sources during winter. Results of the relative humidity dependence of aerosol light-scattering at UGR have been reported by Titos et al. 25 .
Ny-Ålesund (ZEP), Norway. The Zeppelin observatory is located at 475 m a.s.l. on Zeppelin mountain close to the settlement of Ny-Ålesund on the island of Spitsbergen. It is a pristine site characterized by low levels of particle concentrations and typical Arctic aerosol. The clean conditions are dominant in an area where no local sources or long-range transport of aerosols are observed during the period of the year when the tandem nephelometer was deployed (July-October 2008). More details regarding instrumentation and previous results can be found in Zieger et al. 61 .
Overview of different instrumentation designs. All but one system considered in this study consisted of two integrating nephelometers, one operating under low-RH conditions (DryNeph) and the other operated downstream of a humidifier and thus measuring at programmable RH (WetNeph). The exception is Finokalia's system, in which the WetNeph measured at pseudo-ambient conditions rather than using a humidifier to control humidity conditions. Figure 2 shows a schematic view of the two most common tandem nephelometer designs (the 'NOAA design' which was deployed at 17 sites and the 'PSI design' which was deployed at 6 sites). Both instrument designs were compared at FKB and further information can be found in Fierz-Schmidhauser et al. 23 . For all sites, the reference nephelometer (DryNeph) is run at low RH to measure the dry particle light scattering coefficient as a reference, while the second nephelometer (WetNeph) measures σ sp at varying and elevated RH conditions (RH cycles or scans). The 'NOAA design' and the 'PSI design' are briefly described below. Additionally, three sites (UGR, FIK and YOS) developed their own tandem nephelometer designs and we provide relevant details of those as well. Table 1 indicates which type of system was deployed at each site and Table 2 shows information about the instrument design.
NOAA design. The tandem nephelometer was deployed at several NOAA-FAN sites (except UGR), DOE/ARM sites (SGP plus the various ARM Mobile Facility deployments) and also at LAN. These systems consist of the two nephelometers (DryNeph and WetNeph) connected in series with the humidifier between them (see Fig. 2a). Prior to passing through an aerosol impactor size cut, the sample air is dried via gentle heating as needed in order to maintain RH < 40%. The heating depends on the site and the season; typically, the temperature change from sample heating varies from 0 to 10 °C. At some sites with high aerosol loading, additional drying of the sample air was accomplished by diluting the sample with filtered dry air (see Table 2). In order to minimize transmission loss of coarse-mode particles, the system air flow was controlled to 30 lpm, making RH control of the high flow sample air a challenge.
After the impactors, the sample air flows through a reference nephelometer (DryNeph) where the σ sp (RH dry ) is measured. The sample air exiting the DryNeph then enters the humidifier which is used to expose the particles to a controlled and elevated RH environment typically between 40 and 85% RH. The humidified air stream then enters the second nephelometer (WetNeph) where σ sp (RH) is measured as a function of RH. One RH cycle (increasing and decreasing RH) is performed on an hourly basis with the inlet size cut alternating between 10 and 1 μm (aerodynamic diameter) over the course of the hourly cycle at different time intervals depending on the site (see Table 2).
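The two quantities measured in series above combine into the scattering enhancement factor, f(RH) = σ sp (RH)/σ sp (RH dry ). The following is a minimal sketch of that ratio for paired DryNeph and WetNeph samples. The function name, the sample values and the `min_dry` threshold (used to skip samples where the dry signal is too low for a stable ratio) are illustrative assumptions and are not taken from the processing described in this work.

```python
def enhancement_factor(sigma_wet, sigma_dry, min_dry=1.0):
    """Return f(RH) = sigma_sp(RH) / sigma_sp(RH_dry) for paired samples.

    Samples with a missing value, or with a dry scattering coefficient
    below `min_dry` (Mm^-1, an assumed illustrative threshold), yield None.
    """
    result = []
    for wet, dry in zip(sigma_wet, sigma_dry):
        if wet is None or dry is None or dry < min_dry:
            result.append(None)
        else:
            result.append(wet / dry)
    return result


# Hypothetical scattering coefficients in Mm^-1 (not measured data).
sigma_dry = [20.0, 21.0, 0.5, 19.5]
sigma_wet = [22.0, 31.5, 0.9, 39.0]
f_rh = enhancement_factor(sigma_wet, sigma_dry)
```

The same ratio applies to the backscattering coefficients where both nephelometers report them.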
The NOAA system exclusively used one nephelometer type (TSI Inc., Model 3563) for both DryNeph and WetNeph. This instrument measures light scattering and backscattering at λ = 450, 550 and 700 nm. The set-up has changed slightly over the years since the initial deployment of a NOAA design tandem nephelometer system at SGP in 1998 22 . One important change over the 20 years of NOAA tandem nephelometer operation was the placement of the RH sensor used to control the humidifier. Originally, this RH sensor was placed at the humidifier outlet. Because of the sharp temperature gradient at the humidifier outlet, this sensor was eventually moved to a more stable RH region at the WetNeph exit.
The NOAA design strategy balances ease of operation with minimal perturbation of the ambient aerosol characteristics. All of the NOAA sites were operated remotely 24/7 with minimal technical service. The humidifier operated in a hydration mode scanning from low to high RH. In order to avoid volatilization of weak acids, the sample heating was regulated to maintain a maximum RH of 40%. One disadvantage of the high flow (30 lpm) NOAA system is that the RH range of the humidifier scan is limited by the ability of the humidifying system to overcome the ambient dew point. At low dew point conditions the humidifier is unable to reach a high RH range. At high dew point conditions the humidifier RH does not extend low enough to capture a minimum in the DryNeph/WetNeph ratio. As described below, the software fits of the data compensated for some of these limitations by specifying boundary conditions to the RH range. More detailed descriptions of this design can be found in Carrico et al. 62 , Titos et al. 29 , Jefferson et al. 57 , and references therein.
PSI design. The PSI design has been deployed on a campaign basis at six different sites in Europe within the European Community (EC) projects EUSAAR and GEOmon. In this design, the two nephelometers are operated in parallel. The WetNeph, which is preceded by a humidifying and drying system (see Fig. 2b), measures the σ sp at humid conditions. This design allows measurement of both the lower (deliquescent) and upper (efflorescent) branches of the hysteresis curve. A complete humidogram or RH cycle usually took 3 hours: during the first 1.5 hours the aerosol is humidified and RH increases (hydration), and during the last 1.5 hours the aerosol is humidified and then actively dried (dehydration).
The PSI design utilises multiple calibrated RH sensors located at different points within the system, including inside the nephelometer. Additionally, a dew point sensor measures the dew point temperature. In this design, aerosols encounter the highest RH after passing through the humidifier. The RH is then lowered in the dryer and further lowered inside the nephelometer (due to a ~1 °C temperature increase caused by heating from the nephelometer lamp). A detailed description of this design can be found in Fierz-Schmidhauser et al. 23 and Zieger et al. 16 .
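To illustrate why a ~1 °C temperature increase inside the nephelometer matters, the sketch below estimates the resulting RH drop using the Magnus approximation for saturation vapour pressure over water. It assumes constant pressure and constant water vapour content between the two points; the function names and the example inlet conditions (90% RH at 20 °C) are hypothetical and are not values reported for the PSI system.

```python
import math


def saturation_vapour_pressure_hpa(temp_c):
    """Magnus approximation of saturation vapour pressure over water (hPa)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def rh_after_heating(rh_percent, temp_c, delta_t_c):
    """RH (%) after warming the sample by delta_t_c at constant vapour pressure."""
    # Actual vapour pressure is conserved when no water is added or removed.
    vapour_pressure = rh_percent / 100.0 * saturation_vapour_pressure_hpa(temp_c)
    return 100.0 * vapour_pressure / saturation_vapour_pressure_hpa(temp_c + delta_t_c)


# Hypothetical example: air at 90% RH and 20 degC warmed by 1 degC.
rh_inside = rh_after_heating(90.0, 20.0, 1.0)
```

Under these assumptions, a 1 °C warming lowers the RH by several percentage points at high RH, which is why the RH has to be known inside the sample volume rather than only at the humidifier outlet.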
Depending on the field site and general inlet conditions, the PSI system measured particles with an aerodynamic diameter lower than 10 μm (PM 10 ) or the whole sample air (no size cut), see Table 2. TSI Model 3563 nephelometers were used at all sample sites except HYY, where both nephelometers were replaced by newer LED-based instruments (Ecotech Pty Ltd., Aurora 3000). The Ecotech nephelometers measure at slightly different wavelengths (450, 525, 635 nm) and are less influenced by heat effects from the nephelometer lamp 26 . The internal Kalman filter setting of the instrument was only used during calibration of the nephelometer.
The PSI design includes some major improvements relative to some earlier designs. Firstly, the RH inside the WetNeph is measured by one of the calibrated RH sensors installed directly into the sample volume (as opposed to the manufacturer's internal T/RH sensor relied on in many other humidograph systems). Additionally, an air-cooled infra-red filter is placed in front of the nephelometer halogen light source to minimize changes in sample volume RH due to heating from the lamp. Another feature is that the PSI design can also be operated to measure the upper branch of the hysteresis curve 12 .
Other designs: UGR, FIK and YOS. The design at UGR consists of two nephelometers (TSI Inc., Model 3563) sampling from the same inlet, and measuring in parallel. There is no heater/dryer upstream of the instruments to ensure low RH in DryNeph. However, due to the arid conditions in Granada the RH in DryNeph was typically <40%, with a mean value of 20%. The sample air for WetNeph flows through a humidifier, which performs increasing/decreasing RH scans on a 30-min basis, before entering the instrument. There are four T/RH sensors, three associated with the WetNeph (located before the humidifier, after the humidifier and inside the nephelometer) and another sensor placed inside the DryNeph (although this last sensor was not operative over the entire measurement period). Further technical details on this system can be found in Titos et al. 25 .
For the FIK site, the University of Crete has performed several campaigns measuring particle light scattering as a function of RH since 2009. Two nephelometers are connected in series with a drier between them. The first nephelometer (Radiance Research Model 903, wavelength = 532 nm) serves as WetNeph and measures scattering at pseudo-ambient conditions, performing one cycle per day. The sample air then passes through a diffusion dryer to the second nephelometer (Ecotech Aurora 1000, wavelength = 525 nm) acting as DryNeph. The nephelometers were operated with the Kalman filter off since Finokalia is a remote marine site and rapid variations in the scattering coefficient are not expected to occur except during long range transport events. This study focused on the measurements in 2012 when both nephelometers measured with a PM 1 size cut.
The USA National Park Service utilized a tandem nephelometer system operating two Radiance Research M903 nephelometers (530 nm) as DryNeph and WetNeph in parallel. The DryNeph was dried to low RH using a drying system and the humidity inside the WetNeph was controlled by two sample air conditioners (which could act as humidifiers or driers) operated in series. In this system it takes between 2 and 3 hours to complete a full RH cycle (only one RH cycle is done each day). When the first conditioner was used as a drier and the second conditioner was used as a humidifier, the deliquescent f(RH) can be measured. Both nephelometers were fitted with 2.5 μm cyclone inlets. An in-depth description of this design can be found in Malm et al. 60 .

Calibration of instruments. Nephelometer calibrations.
There are three standard operating procedures used to ensure the quality of the nephelometer measurements 63 : (1) filtered air checks to obtain the background scattering which is then subtracted from the measured scattering automatically by the instrument; (2) calibration checks to measure instrument response on filtered air and CO 2 (or another gas with known scattering characteristics) in order to check that the current calibration is still valid; and (3) full instrument calibration which is similar to a calibration check but results in a change of the calibration coefficients in the nephelometer firmware.
NOAA systems performed 5-min filtered air checks on an hourly basis and calibration checks (filtered air and CO 2 ) on a weekly to monthly basis. Full calibrations with filtered air and CO 2 and instrument maintenance (cleaning, inspection, PMT voltage adjustment etc.) were only performed when an instrument scientist was present (i.e., typically on a semi-annual to annual basis).
The nephelometers in the PSI system were calibrated with filtered air and CO 2 at the beginning of each field campaign, while calibration checks were performed on an irregular basis during the campaign and at the end of the campaign. Filtered air checks were done at least on a daily basis and the nephelometers were also intercompared at dry conditions. For the UGR system, filtered air checks in both nephelometers were performed hourly. Full calibration with filtered air and CO 2 and maintenance of the nephelometers (including cleaning and inspection) was performed approximately 4 times per year. Intercomparison of the two nephelometers was performed periodically to check the consistency between the instruments.
The University of Crete in Finokalia performed calibrations (with CO 2 as a span gas) and checks every 6 months and filtered air checks on a weekly basis. During the 6-month checks, both nephelometers were intercompared while measuring in parallel at the same conditions. At YOS, filtered air and Freon 134a (a common refrigerant gas with known scattering characteristics 63 ) were used to perform full calibrations on an almost daily basis. These frequent calibrations made it unnecessary to carry out filtered air and calibration checks of the nephelometers as calibrations would not be expected to shift over the course of a day.
Hygroscopicity-related calibrations. The operation of a humidograph system requires attention to technical detail and calibration 16,23,27 . Specifically, the system RH sensors need to be calibrated frequently to assure that the RH in the system is well characterized. Additionally, optical closure calculations using lab-generated aerosols of known size and composition should be carried out to assess the performance of the system 16,23 . Finally, particle losses and instrumental differences at low RH conditions between the dry and wet nephelometers should be characterized for all the sites.
Table 3. Overview of data levels, applied corrections and corresponding products. σ sp (RH): wet particle light scattering coefficient, σ bsp (RH): wet particle light backscattering coefficient, σ sp (RH dry ): dry particle light scattering coefficient, σ bsp (RH dry ): dry particle light backscattering coefficient, f(RH): light scattering enhancement factor, f b (RH): light backscattering enhancement factor, QF: quality flag.
In the NOAA system, T/RH sensors were calibrated on a semi-annual to annual basis. Particle losses in the system were assessed by running the humidifier at low RH conditions and comparing scattering coefficients measured by the two nephelometers. No optical closure calculations were performed on the humidograph system (WetNeph) measurements, although successful optical closure (based on measured size distributions and assumed chemistry) has been performed for DryNeph measurements in several of these systems.
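One simple way to quantify the low-RH comparison between the two nephelometers is a least-squares slope through the origin of WetNeph against DryNeph scattering, where a slope below 1 points to particle losses or an instrumental offset. The sketch below is only an illustration under that assumption: the text does not state which statistic was used, and the function name and sample values are hypothetical.

```python
def low_rh_slope(sigma_dry, sigma_wet):
    """Least-squares slope through the origin of wet vs. dry scattering.

    Both inputs are scattering coefficients (Mm^-1) recorded while the
    humidifier is held at low RH, so the two instruments should agree.
    """
    numerator = sum(d * w for d, w in zip(sigma_dry, sigma_wet))
    denominator = sum(d * d for d in sigma_dry)
    if denominator == 0:
        raise ValueError("dry scattering values must not all be zero")
    return numerator / denominator


# Hypothetical low-RH samples in which the WetNeph reads 5% lower.
dry = [10.0, 20.0, 30.0, 40.0]
wet = [9.5, 19.0, 28.5, 38.0]
slope = low_rh_slope(dry, wet)
```

A slope obtained this way could then be tracked over time or by size cut to see whether the difference depends on aerosol type.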
The T/RH sensors of the PSI system were calibrated with unsaturated salt solutions, while the light scattering coefficients at prescribed RH in the PSI system were validated using monodisperse or polydisperse salt measurements of ammonium sulphate and sodium chloride generated in the laboratory or in the field (see Fierz-Schmidhauser et al. 23 for more details). The RH is calibrated by comparing the measured deliquescence and/or efflorescence RH with the values expected for salts of known composition. The light scattering coefficients at multiple RH values are compared with the theoretical light scattering coefficients calculated using Mie theory and the measured particle number size distribution 16,23,26 . Additionally, particle losses in the humidifier were characterized at low RH conditions. In the UGR system, T/RH sensors were calibrated frequently using unsaturated saline solutions of known RH at three calibration points (20, 60, 80%). Additionally, the T/RH sensors were periodically intercompared. The system RH was checked by generating and sampling salts of known deliquescence RH 25 . However, a closure study based on the salt measurements was not performed. Particle losses in the humidifier were characterized at low RH conditions and are dependent on the aerosol type, with the highest differences observed under desert dust intrusions, denoting higher losses for larger particles.
At FIK the T/RH sensors were not calibrated and losses in the drier were not determined. Closure studies were carried out in Kalivitis et al. 35 where both dry and ambient aerosol light scattering coefficients were reconstructed based on chemical composition (using ammonium sulphate and organic matter as the main components for days without dust events) and mass scattering efficiencies. Measured and reconstructed daily averages showed good agreement (R² ≥ 0.8).
In Yosemite, calibration and comparisons between the T/RH sensors were performed before and at the end of the field campaign. The dry and wet nephelometers were operated under dry conditions for extended time periods daily to assess differences in scattering coefficient between the two instruments. Comparisons were also made between the humidograph system nephelometers and ambient Optec nephelometers to assess particle losses through the inlets. Additional optical closure calibration was done using ammonium sulphate.
Data handling and harmonization of data sets. Below, we describe the general data handling procedure carried out to develop a harmonized data set of RH-dependent σ sp and f(RH). In Fig. 3, we show an example of the time series of the scattering coefficients and RH measurements at Manacapuru on the 18th of October 2014, as well as an example of one of the corresponding humidograms for this day (between 11 and 12 am) and the correlation of σ sp (RH dry ) and σ sp (RH wet ) for different RH values, to guide the reader through the data processing. The processing flow and the products corresponding to each data level are shown in Table 3.
Many of the nephelometers also measure the aerosol backscattering coefficient (σ bsp ). The backscattering coefficient provides an indication of the angular dependence of light scattering and can be used to derive parameters such as up-scatter fraction and asymmetry parameter 64 . Where this measurement was available in both the DryNeph and WetNeph, the same data handling procedure was followed to process the backscattering data and develop a harmonized data set of RH-dependent σ bsp . For clarity, we only use the term 'scattering' in what follows to encompass both total and backscattering.
The processing starts with the raw data provided by each site mentor/site manager to which we then apply standard corrections using identical methodology. The measurements during filtered air checks and calibration of the instruments are not included in our dataset. The raw data consist of high-frequency measurements on a 1 minute timebase. A first homogenization step is necessary in the case of ZEP and HLM, since these data are recorded at higher frequency (1 sec and 20 sec timebase, respectively) and need to be averaged to the 1 minute timebase. Additionally, while the dilution correction necessary for HFE, MAO and NIM is already incorporated into the raw data provided by the site operator, a dilution correction is applied to the LAN data set. The RH calibrations for the PSI systems are applied during this initial processing phase, too (the RH calibrations for other systems are already incorporated in the raw data). This preliminary, homogenized dataset corresponds to Level 0 data.
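The averaging of higher-frequency records onto the 1 minute timebase can be illustrated with a minimal Python sketch. The function name, the use of seconds since an arbitrary start as the time coordinate, and the use of None for missing samples are assumptions made for illustration; the authors' Matlab processing may differ in detail.

```python
from collections import defaultdict
from statistics import fmean


def average_to_minute(times_s, values):
    """Average samples into 1 minute bins.

    times_s: sample times in seconds since an arbitrary start (assumption).
    values: measurements of the same length; None marks a missing sample.
    Returns a sorted list of (bin_start_s, mean_value) tuples.
    """
    bins = defaultdict(list)
    for t, v in zip(times_s, values):
        if v is not None:
            # Each sample is assigned to the minute in which it was recorded.
            bins[int(t // 60) * 60].append(v)
    return [(start, fmean(vals)) for start, vals in sorted(bins.items())]
```

With a 20 sec timebase, for example, each output value is the mean of up to three samples.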
When the Level 0 dataset is finalised, in-depth data screening is carried out in order to obtain the Level 1 data. As a first step, this consists of removing data during invalid periods and during system malfunctions (i.e., as indicated in each site logbook or the editing directives from the data provider(s)). The time series of the dry and wet scattering coefficients as well as RH and T values for each site are then further inspected in order to identify possible outliers and additional questionable data periods that had not been flagged during the data provider's quality control processing. Valid measurements are flagged with the quality flag (QF) set to 0 and, for invalid measurements, the quality flag is set to 2. Periods at PSI sites when the humidograph is not scanning RH values, but rather operating at a constant high or low RH are included with QF = 0 in Level 1 data if no other problem is detected in the quality control.
After identifying the good (QF = 0) data, several corrections are applied. First, the nephelometers are corrected for angular truncation and illumination non-idealities. For the TSI and Radiance Research nephelometers, the correction scheme proposed by Anderson et al. 63 is used, while the correction scheme developed by Müller et al. 65 is applied to the Ecotech Aurora nephelometers. Next, an adjustment to standard temperature and pressure (STP, T = 273.15 K and P = 1013.25 hPa) is applied to all values of σ sp . Figure 3a shows the Level 1 time series of σ sp as an example.
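As an illustration of the STP adjustment, the following Python sketch scales a scattering coefficient with the ideal gas law. It assumes that the temperature and pressure of the nephelometer sample volume are available; the function and variable names are hypothetical and not taken from the authors' code.

```python
T_STP_K = 273.15      # standard temperature (K), as stated in the text
P_STP_HPA = 1013.25   # standard pressure (hPa), as stated in the text


def to_stp(sigma, temp_k, pres_hpa):
    """Adjust a scattering coefficient (Mm^-1) to STP.

    The coefficient scales with the number of particles per unit volume of
    air, so it is multiplied by the ratio of air density at STP to air
    density at the measurement conditions (ideal gas law).
    """
    return sigma * (P_STP_HPA / pres_hpa) * (temp_k / T_STP_K)
```

For a sample measured at 298.15 K and 900 hPa, the adjusted value is larger than the measured one because air is denser at STP.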
In order to account for potential particle losses within the instrument system and to identify discrepancies between the two nephelometer calibrations, the linear regression of σ sp in the DryNeph and σ sp in the WetNeph when both instruments are operating at similar low RH values (typically 20 < RH < 50%) is calculated. As an example, Fig. 3d shows σ sp (RH wet ) versus σ sp (RH dry ) measured in MAO colour-coded for RH to illustrate this point. The derived correction is then applied to the WetNeph σ sp and ranges between 5 and 15% for most of the sites, with the highest value in PYE of 18% (for total scattering and 550 nm). UGR is frequently affected by Saharan dust outbreaks, where we have observed higher losses in the humidifier for larger particles and therefore we have applied different correction factors for dust-free and dust conditions at this site. Finally, a 10 minute moving average (11 measurements) is applied to σ sp (RH dry ) in order to reduce the influence of noise and outliers. This averaging helps to minimize noise in the f(RH) calculations especially during periods of extremely low σ sp (RH dry ). This moving average is especially necessary for measurements at pristine sites with very low particle concentrations (e.g., ZEP, JFJ, BRW), but is applied to all data sets for consistency. The corrected Level 1 data is used for the calculations of the scattering enhancement factor (f(RH)) also provided in the Level 1 datafile.
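The low-RH comparison and the smoothing step can be sketched in Python as follows. This is a simplified illustration rather than the authors' procedure: it assumes a regression forced through the origin, a single correction factor per data set, and a centred 11-point window that shrinks at the edges of the series. All function names are hypothetical.

```python
def low_rh_correction_slope(dry, wet, rh_dry, rh_wet, rh_min=20.0, rh_max=50.0):
    """Least-squares slope (zero intercept, an assumption) of DryNeph versus
    WetNeph scattering, using only samples where both instruments are at
    low RH. Multiplying the WetNeph values by this slope brings them into
    agreement with the DryNeph under dry conditions."""
    num = den = 0.0
    for d, w, rd, rw in zip(dry, wet, rh_dry, rh_wet):
        if rh_min < rd < rh_max and rh_min < rw < rh_max:
            num += w * d
            den += w * w
    if den == 0.0:
        raise ValueError("no samples with both nephelometers at low RH")
    return num / den


def moving_average(values, window=11):
    """Centred moving average; the window is truncated at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

A slope of 1.10, for instance, would correspond to a 10% correction of the WetNeph scattering.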
The Level 2 data include the particle light scattering coefficients for RH values ranging from 30% to 95% at intervals of 5%. Each of these σ sp (RH) values is obtained by interpolating between the Level 1 scattering measurements obtained at the two closest RH values bracketing the desired 5% RH interval. The results in Level 2 are given in an averaged (1, 3, 6 or 12 hours) data file with up to 20 interpolated scattering values (one for each RH interval) representing the RH scan for each humidogram. Data points are set to the missing value code when measurements are not available for interpolation.
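The interpolation onto fixed RH values can be sketched in Python as below. Linear interpolation between the two bracketing measurements is an assumption (the text does not specify the interpolation scheme), the grid shown covers 30% to 95% in 5% steps, and None stands in for the missing value code.

```python
def interpolate_to_grid(rh, sigma, grid=range(30, 100, 5)):
    """Interpolate scattering measured at arbitrary RH onto fixed RH values.

    rh, sigma: RH (%) and scattering coefficient for one humidogram scan.
    Returns a dict {target_rh: value}; the value is None when the target
    is not bracketed by measurements.
    """
    pairs = sorted(zip(rh, sigma))
    result = {}
    for target in grid:
        below = [p for p in pairs if p[0] <= target]
        above = [p for p in pairs if p[0] >= target]
        if not below or not above:
            result[target] = None  # cannot interpolate: missing value
            continue
        (r0, s0), (r1, s1) = below[-1], above[0]
        if r1 == r0:
            result[target] = s0  # a measurement exists exactly at the target
        else:
            result[target] = s0 + (s1 - s0) * (target - r0) / (r1 - r0)
    return result
```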
Determining the sample volume RH. The RH inside the dry and wet nephelometers is a critical parameter for the precise determination of f(RH). Here we call these values RH dry and RH wet . For all sites the RH dry is always the RH measured by the manufacturer's sensor inside the DryNeph (or, in the case of Radiance Research nephelometers, at the exhaust of the DryNeph). RH wet is determined in different ways depending on the system design as described below. Table 2 lists the method used to calculate RH wet for each site.
For the PSI systems we utilise the additional installed and calibrated RH sensor inside the nephelometer sample volume as RH wet . Zieger et al. 16 emphasised the need for salt calibrations to determine the exact RH at the point of light scattering detection inside the WetNeph. In addition, one should keep in mind that the exact deliquescence RH measured by the WetNeph may not be the same as the thermodynamic deliquescence RH due to temperature differences between the humidifier and the subsequent nephelometer where the light scattering is being measured (see ref. 23 ). Like the PSI systems, the UGR system also relies on a calibrated RH sensor inserted in the WetNeph sample volume, rather than using the manufacturer's internal T/RH sensor. RH wet values for YOS and FIK are obtained from an external RH sensor downstream of the WetNeph. These two sites utilised Radiance Research nephelometers which are less subject to the lamp heating issues that occur with TSI nephelometers.
For the NOAA design systems, the determination of RH wet is less straightforward as two potential RH wet values are evaluated. The first approach is to use the RH measured by the manufacturer's internal T/RH sensor in the WetNeph sample volume. The second approach is to calculate the sample dew point temperature using a calibrated external T/RH sensor (placed upstream or downstream of the WetNeph depending on the site, see Table 2) and then use that dew point value to calculate the RH inside the nephelometer sample volume based on temperature measured by the manufacturer's internal T sensor. As Fig. 3b shows, discrepancies between the RH values calculated by these two methods may exist. One possible reason is drift in the manufacturer's calibration of the internal RH sensor 25 . Another reason is that the internal T/RH sensor is located in an instrument wall cavity outside of the central sample airflow. As such, the sensor is susceptible to the thermal inertia of the instrument wall as well as radial RH differences between the centre flow and wall. To assure the best choice for RH wet , time series of RH measured by the manufacturer's internal T/RH sensor and RH calculated with the dew point are analysed. If no problem appears throughout the entire measurement period and the RH values agree, the RH calculated based on dew point is selected as RH wet . For HLM, the T/RH sensor placed upstream of the WetNeph was affected by the humidifier and the RH values exhibit large variability. For THD, there is a period where the external T/RH sensor did not measure correctly. In these two cases, RH measured inside the WetNeph is selected as the best choice for the RH wet .
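The dew point approach can be illustrated with a short Python sketch. The text does not state which saturation vapour pressure formula was used; the Magnus approximation over liquid water with commonly used coefficients is assumed here, and the sketch further assumes that the water vapour content of the sample does not change between the external sensor and the sample volume. Function names are hypothetical.

```python
import math

# Magnus coefficients over liquid water (an assumption, see lead-in):
# saturation vapour pressure in hPa, temperature in degrees Celsius.
A_HPA, B, C_DEGC = 6.112, 17.62, 243.12


def saturation_vapour_pressure(temp_c):
    """Saturation vapour pressure (hPa) from the Magnus approximation."""
    return A_HPA * math.exp(B * temp_c / (C_DEGC + temp_c))


def dew_point(temp_c, rh_percent):
    """Dew point (degrees Celsius) from temperature and RH at the external sensor."""
    x = math.log(rh_percent / 100.0) + B * temp_c / (C_DEGC + temp_c)
    return C_DEGC * x / (B - x)


def rh_in_sample_volume(ext_temp_c, ext_rh_percent, int_temp_c):
    """RH (%) inside the nephelometer, from the dew point measured by the
    external sensor and the temperature of the internal sensor."""
    td = dew_point(ext_temp_c, ext_rh_percent)
    return 100.0 * saturation_vapour_pressure(td) / saturation_vapour_pressure(int_temp_c)
```

If the sample volume is warmer than the external sensor location, the calculated RH wet is lower than the externally measured RH, which is the expected effect of lamp heating.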

Determination of f(RH).
Using the corrected Level 1 data, the total and backscattering enhancement factors, f(RH) and f b (RH), can now be obtained using Eq. 1. Each humidogram, i.e., the f(RH) values as a function of RH wet for each individual scan in the Level 1 data, can be numerically parametrised using a variety of equations (for a summary see Titos et al. 27 ). The Level 2 data presented here (see example in Fig. 3c) use a variation of the most common fit equation initially introduced by Kasten et al. 66 . This is a two parameter fit equation where parameter a represents the intercept at RH = 0% and parameter γ is an indicator of aerosol hygroscopicity:
f(RH) = a (1 − RH/100)^(−γ)    (2)
Several sites are not able to maintain suitably dry conditions inside the DryNeph (i.e., the RH of the dry nephelometer is occasionally, or even frequently, higher than 40%). Calculated f(RH) values for time periods when RH dry > 40% are flagged as invalid (QF = 2) in the Level 1 data. Constraints are imposed on each humidogram in order to obtain valid fits. First, only those humidograms spanning an RH wet range larger than 30% in the WetNeph are included in the Level 2 data. Since most humidograms start at RH larger than 30-40%, this means that fits typically cover RH ranges at least up to 70% (and usually higher). Humidograms spanning a narrower RH interval are flagged as invalid in the Level 2 file. Additionally, a goodness-of-fit criterion is applied such that humidogram fits with an R-squared value less than 0.5 are also flagged as invalid in the Level 2 file. A stricter goodness-of-fit requirement is used for Hyytiälä and Jungfraujoch (the R-squared threshold was set to 0.7 and 0.8, respectively), where higher variability is observed in the RH scans, mostly during summer months due to the uplift of air masses and the atmospheric boundary layer.
At FIK, which used pseudo-ambient conditions rather than controlling RH, and at YOS, which scanned over long periods, 12 hours of measurements are taken into account for each humidogram to increase the number of scans meeting the RH wet range larger than 30% criterion. To avoid possible errors produced by sharp changes of air masses over the 12h period, the R-squared value threshold for FIK and YOS is set to 0.9 to select humidograms representing relatively constant air masses.
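A humidogram fit with the two validity constraints can be sketched in Python as follows. The sketch assumes the two-parameter form f(RH) = a (1 − RH/100)^(−γ) and fits it by ordinary least squares after taking logarithms, which is a choice made here for simplicity; the text does not state which fitting algorithm was used, and a non-linear fit would weight the points differently. R-squared is evaluated in linear f(RH) space. Function names are hypothetical.

```python
import math


def fit_humidogram(rh_wet, f_rh):
    """Fit f(RH) = a * (1 - RH/100)**(-gamma) to one humidogram.

    Uses the linearised form ln f = ln a + gamma * (-ln(1 - RH/100)).
    Returns (a, gamma, r_squared).
    """
    x = [-math.log(1.0 - rh / 100.0) for rh in rh_wet]
    y = [math.log(f) for f in f_rh]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gamma = sxy / sxx
    a = math.exp(mean_y - gamma * mean_x)
    fitted = [a * (1.0 - rh / 100.0) ** (-gamma) for rh in rh_wet]
    mean_f = sum(f_rh) / n
    ss_res = sum((f - g) ** 2 for f, g in zip(f_rh, fitted))
    ss_tot = sum((f - mean_f) ** 2 for f in f_rh)
    return a, gamma, 1.0 - ss_res / ss_tot


def is_valid_humidogram(rh_wet, r_squared, min_span=30.0, min_r2=0.5):
    """Apply the RH wet span and goodness-of-fit constraints.

    min_r2 would be raised to 0.7, 0.8 or 0.9 for the sites with stricter
    thresholds mentioned in the text.
    """
    return (max(rh_wet) - min(rh_wet)) > min_span and r_squared >= min_r2
```

Once a and γ are known, f(RH = 85%) follows by evaluating the fitted equation at RH = 85.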
In the Level 2 data file, f(RH = 85%/RH dry ) and f(RH = 85%/RH = 40%) are provided together with their relative uncertainties. The value of f(RH wet /RH dry ) has been calculated in two ways. First, and as main product, it is calculated with the measured reference scattering (DryNeph) represented by σ sp (RH dry ) where RH dry is the measured RH in the DryNeph (within the range of 0-40%). Second, for comparison, f(RH wet /RH = 40%) is derived with the reference scattering coefficient σ sp (RH = 40%) obtained from the interpolated Level 1 data at RH = 40% (WetNeph). This was done only for humidograms where the RH scan time was below 1.5 hours, therefore excluding sites like YOS and FIK. It should be noted that the second approach does not account for possible rapid changes in aerosol load. All these quantities are given for the three nephelometer wavelengths and for the total and backscattering coefficients.
Uncertainty analysis for σ sp (RH) and f(RH). The uncertainty associated with the total light scattering and backscattering coefficients has been calculated following the methodology developed in Sherman et al. 31 , explained in detail in their supplementary materials. Briefly, major sources of uncertainty in σ sp and σ bsp measured by the nephelometer are: instrumental noise, uncertainty in the nephelometer calibration, nephelometer calibration variability, uncertainty in the correction for nephelometer angular non-idealities, and uncertainty in correcting light scattering to standard temperature and pressure (STP) conditions.
In this study, in order to represent the range of aerosol conditions at the 26 sampling sites, calculations have been performed for different levels of aerosol loading, with σ sp values of 5, 50 and 200 Mm −1 (for σ bsp the loading values used were a factor of 10 lower), 1-minute averaging time, pressure values ranging from 700 to 1013 hPa, temperature values ranging from 293.15 to 303.15 K and differentiating between no size cut and PM 1 particles since the truncation correction uncertainty is different for these two subsets of particles 63 .
The uncertainty associated with σ sp (RH wet ) and σ bsp (RH wet ) can then be calculated by error propagation using Eq. 2, where the absolute uncertainty associated with the measurement of RH is assumed to be 3%, selected as an upper conservative threshold at high RH for the RH sensors commonly used in the different designs of the humidified tandem nephelometers. Uncertainties in σ sp (RH wet ), σ bsp (RH wet ) and f(RH) vary depending on aerosol load, RH and hygroscopicity of particles. Calculations are carried out considering RH ranging from 50% to 85% and particles with low and high hygroscopicity, assuming a gamma parameter ranging between 0.2 and 0.9. A summary of uncertainties in σ sp (RH dry ), σ sp (RH wet ) and f(RH) at λ = 550 nm is shown in Table 4. Since the observed influence of T and P on the uncertainty is small, results presented in Table 4 are given for T = 20 °C and P = 1013 hPa.
Titos et al. 27 calculated the uncertainty of f(RH) by a Monte Carlo technique, associating an uncertainty of 9.2% with both σ sp (RH dry ) and σ sp (RH wet ) and considering a range of aerosol loads and hygroscopic growth factors. The uncertainty obtained for weakly hygroscopic particles (γ = 0.2) increases with RH and varies between 10 and 15%. For highly hygroscopic particles (for example, γ = 0.9), the uncertainty ranged between 15 and 40% for increasing values of RH. Jefferson et al. 57 also obtained relative uncertainties for σ sp (RH) using error propagation and a Monte Carlo technique, finding uncertainties associated with the wet scattering coefficient between 19.2 and 25.3% for σ sp (RH dry ) = 10 Mm −1 (and between 9.6 and 18.7% for σ sp (RH dry ) = 100 Mm −1 ) and a reference RH dry = 40% for different values of γ and RH wet . Our results and those reported by Jefferson et al. 57 follow similar behaviour, showing a decrease in relative uncertainty of σ sp (RH) for increases in aerosol load, decreases of RH, and decreases in hygroscopic parameter γ. Therefore, the relative uncertainty of σ sp (RH) increases for low aerosol loads, which is important especially for polar, clean marine, or mountain sites with predominantly observed low aerosol concentrations.
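To give a feel for how a Monte Carlo estimate of this kind works, the following Python sketch perturbs the dry and wet scattering coefficients with independent Gaussian errors of 9.2% and evaluates the spread of their ratio. It is a deliberately reduced illustration and not a reproduction of the cited studies: it ignores the RH sensor uncertainty and the dependence on γ, so it yields only the scattering-related part of the f(RH) uncertainty. The function name, sample size and seed are arbitrary choices.

```python
import random
from statistics import fmean, stdev


def monte_carlo_f_uncertainty(sigma_dry, sigma_wet, rel_unc=0.092, n=20000, seed=0):
    """Relative uncertainty of f = sigma_wet / sigma_dry by Monte Carlo.

    Both coefficients are drawn from independent normal distributions whose
    standard deviation is rel_unc times the nominal value (an assumption).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    samples = []
    for _ in range(n):
        dry = rng.gauss(sigma_dry, rel_unc * sigma_dry)
        wet = rng.gauss(sigma_wet, rel_unc * sigma_wet)
        samples.append(wet / dry)
    return stdev(samples) / fmean(samples)
```

Under these assumptions the result is close to the first-order estimate √(0.092² + 0.092²) ≈ 13%; the larger published values at high RH and high γ reflect the additional RH contribution.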
There may be additional uncertainty related to the different configurations of the humidified nephelometer systems. This relates mainly to the order of the humidifier and dryer, the method of drying (active drying vs. heating), whether calibrations with monodisperse salts were performed, and the number of sensors used within the system to monitor RH and temperature. The uncertainty contributions related to these configuration differences are difficult to quantify (see Fierz-Schmidhauser et al. 23 for more details). In addition, further undocumented circumstances arising from field operation (e.g., changes in building heating/cooling, system leaks, etc.) may affect measurement reliability and uncertainty, as can happen at any long-term monitoring site.

Data Records
Data records are composed of 339 ASCII files (NASA Ames format) organized in three levels containing the products shown in Table 3. One file is provided per site, size cut, year, and data level.
The files are available on the EBAS Data Portal, accessible through the URL: http://ebas.nilu.no. Individual files are accessible via a search function, including visualization tools. Data has also been deposited to the ACTRIS Data Centre 67 under the following DOI: https://doi.org/10.21336/gen.4.
Each file has a number of lines with relevant metadata, followed by the corresponding data products. For metadata information please visit: https://ebas-submit.nilu.no/Submit-Data/Data-Reporting/Templates/Category/Aerosol/Integrating-Nephelometer-Data. Figure 4 shows the location of the different sites and the mean values of f(RH = 85%/RH dry ) (segregated by size cut when possible), while Fig. 5 shows the frequency of occurrence of f(RH = 85%/RH dry ) for different size cuts at the 26 sites. The mean, standard deviation and percentiles (25th, 50th and 75th) are given in Table 5 for total light scattering enhancement factors (f(RH = 85%/RH dry ) and f(RH = 85%/RH = 40%)) and in Table 6 for light backscattering enhancement factors (f b (RH = 85%/RH dry ) and f b (RH = 85%/RH = 40%)). In Table 6 some sites are missing due to the lack of backscattering coefficient measurements (like FIK and YOS) or because their measurements did not meet the quality criteria (like PGH and MAO).

Technical Validation
Overall, Arctic and marine sites exhibit the highest values of f(RH = 85%/RH dry ) (median values ranging between 1.5 and 3.0 for PM 10 ) and desert, urban and polluted sites the lowest f(RH = 85%/RH dry ) values (ranging from 1.1-1.7 for PM 10 ). Mountain and rural sites exhibit a wide range of values (spanning 1.4 to 2.7 for PM 10 ). These ranges are consistent with what has previously been reported for f(RH = 85%/RH dry ) as a function of aerosol type, e.g., Titos et al. 27 . Slightly higher values are reported for PM 1 than PM 10 , which is also consistent with previous works (e.g., Carrico et al. 6 ).
Where possible, results from specific sites in the benchmark dataset have been compared to previous findings reported in the literature for those same sites. Differences between the benchmark dataset values and literature values may occur for several reasons, such as (a) consideration of different measurement periods, (b) application of different and/or additional data screening, (c) different analysis procedures and/or (d) segregation by different types of air masses (not done in this benchmark dataset as it requires additional information). Additionally, how f(RH) is reported can differ. For example, some authors report f(RH = 85%), i.e., the wet scattering at a defined RH (e.g., RH wet = 85%) referenced to the dry scattering at the RH inside the dry nephelometer. Others may report f(RH) at defined wet and dry RH values (e.g., RH wet = 85%, RH dry = 40%). Overall, the differences found between our results and those found in literature are within the uncertainty.
Some other authors have also studied f(RH = 85%/RH dry ). For LAN, Zhang et al. 49 reported a mean value for PM 10 of 1.6 ± 0.1 which is the same as the value reported in this study. For PGH, Dumka et al. 54 obtained a mean value of 1.3 ± 0.1 for both PM 10 and PM 1 , similar to our value 1.4 ± 0.2. For the urban site UGR, a mean value (under urban atmospheric conditions) of 1.6 ± 0.3 has been reported by Titos et al. 25 , while in our study (for all atmospheric conditions) we find a median value of 1.7 ± 0.3. The rural polluted site of YOS was found to have mean values for PM 2.5 of 1.3 ± 0.2 60 , while in this study a mean value of 1.5 ± 0.3 is obtained.
Several studies give values for f(RH = 85%/RH = 40%). Doherty et al. 38 studied data from the polluted marine site of GSN and found mean values of 2.3 ± 0.6 for PM 10 and 2.4 ± 0.5 for PM 1 , which are similar to the values obtained in our study of 2.1 ± 0.4 for PM 10 and PM 1 . Liu et al. 40 presented results for the urban site of HFE giving a median value of 1.7 ± 0.2 for PM 10 , close to our value of 1.6 ± 0.3. The longest time series corresponds to the rural polluted site of SGP. Jefferson et al. 57 summarized these measurements and gave mean values for PM 10 and PM 1 of 1.8 ± 0.4 and 1.9 ± 0.4, respectively. An earlier study at SGP 22 found median f(RH = 85%/RH = 40%) values of 1.8 and 1.9 for PM 10 and PM 1 respectively, while in our study we obtain median values of 2.0 and 2.1(±0.6) for the same size cuts. For KCO, Clarke et al. 68 gave the fit parameters for Eq. 2 and PM 1 measurements. The retrieved mean value for f(RH = 85%/RH = 40%) using those parameters is 1.7 and the value obtained with the analysed data of this study is also 1.7 ± 0.1.
At PVC, Titos et al. 29 reported mean values of f(RH = 80%/RH dry ) for the whole campaign segregated by PM 10 and PM 1 . They found values of 1.9 ± 0.3 for PM 10 and 1.8 ± 0.4 for PM 1 , while in our study these values are 1.9 ± 0.4 and 2.0 ± 0.6, respectively. For FKB, a range of f(RH = 80%/RH dry ) values between 1.1 and 1.5 was given by Fierz-Schmidhauser et al. 23 . Our results show 25th and 75th percentile values of 1.3 and 1.7, with a mean value of 1.6 ± 0.3.
For FIK, the value of f(RH = 80%/RH dry ) obtained in our study (2.2 ± 0.2) is significantly lower than the values of f(RH = 80%/RH dry ) obtained by Stock et al. 69 , which ranged between 2.7 and 3.5. This discrepancy is likely due to (a) different methodology for obtaining f(RH) and (b) significant differences in measurement periods for the two studies. While our study used measurements of particle light scattering at dry and wet conditions to determine the enhancement factor, Stock et al. 69 simulated the value of scattering at ambient conditions by means of an optical model using measurements of dry scattering and particle number size distribution and estimates of the complex refractive index (derived from combining measured dry and wet size distributions and optical measurements) and ambient RH as input parameters. Additionally, our value represents the mean f(RH = 80%/RH dry ) for one year of measurements, while Stock et al. 69 only considered a three-day period with clean marine air masses.
To further assess the quality of our dataset, quality checks can be carried out in the form of closure studies at sites where the required additional measurements (chemical composition and aerosol size distribution) are available. This has been done previously for most PSI sites. For example, Fierz-Schmidhauser et al. 52 compared the measured f(RH) at MHD with the corresponding value simulated by Mie theory (using the measured aerosol size distribution and the complex refractive index determined from chemical composition as inputs). Zieger et al. 51 also performed closure studies at MEL to compare the measured and calculated dry scattering coefficient and the scattering enhancement factor. Similar closure studies were also done for HYY 26 , JFJ 46 (for a different time period) and CES 12 .
Another type of closure study that could be used to assess these data is to determine if surface scattering coefficients, adjusted to ambient RH are consistent with remotely sensed vertical profile data, as was done in Zieger et al. 13 , where nephelometer measurements of aerosol hygroscopicity were compared to lidar and MAX-DOAS observations.

Usage Notes
Researchers using this database can decide to use Level 0 or Level 1 light scattering coefficient measurements, the interpolated light scattering coefficients at a given RH, or the calculated scattering enhancement factors f(RH wet /RH dry ) and f(RH wet /RH = 40%) (Level 2). All these quantities are given for total and backscattering, but for different wavelengths depending on the site (see Table 2).
Level 0 light scattering coefficients may be useful if a user wants to carry out their own data processing and corrections. Light scattering coefficients from Level 1 may be useful if a user wants to perform their own fit and try a different equation than the one used in this study (Eq. 2). The user can also calculate their own f(RH = 85%/RH dry ) or f(RH = 85%/RH = 40%) and compare to our findings. This dataset can also be useful for closure studies, in which light scattering coefficients are compared to the outputs from Mie calculations obtained using size distribution or chemical data.
The entire dataset and metadata can be accessed via the ACTRIS Data Centre 67 , while the data for the individual sites can be accessed via the EBAS Data Centre (http://ebas.nilu.no/), which also includes further online visualization tools and other site specific services such as further atmospheric observational data and air mass trajectory calculations. The data files are provided in the NASA Ames format and EBAS provides a useful Python package for accessing and reading the files (see repository at https://git.nilu.no/ebas/ebas-io). Further reading routines (e.g. for Matlab) are provided by the authors upon request.

Code Availability
The Matlab code used to generate this dataset is available from the corresponding authors upon request. The repository contains 26 different scripts due to the site dependent characteristics. These scripts read the WetNeph and DryNeph raw files as well as the T/RH sensor files. The sample RH is established for each particular case as explained in this text, and flags for size cut and valid/invalid measurements are also set taking into consideration each particular case.
The user can find detailed explanations of the various corrections applied to the raw measurements in the literature as cited in this paper. The codes apply the corrections to the raw measurements, obtaining the resulting corrected quantities for σ sp (RH dry ), σ bsp (RH dry ), σ sp (RH wet ) and σ bsp (RH wet ).
The user can use this dataset to apply their own methodology for obtaining f(RH) and/or applying empirical regressions to the measured humidograms.