Introduction

Atmospheric aerosol particles (hereafter aerosols) offset some fraction of the global warming associated with increased greenhouse gas concentrations.1 In addition to directly scattering and absorbing insolation, aerosols impact cloud properties, further perturbing surface insolation and climate.2,3 Direct and indirect climate forcing from aerosol particles is a major source of uncertainty in climate projections at the global and regional scales. For the indirect (i.e., cloud mediated) effect, the uncertainty arises from a range of sources including the number concentration of particles, activation of those particles to form cloud droplets, and cloud thermodynamics. The effects are particularly important at the regional scale, and “warming holes” (regions of decreasing daily maximum air temperatures) in the eastern and central USA are at least partly attributed to aerosol-cloud brightening.4,5 New particle formation (NPF) has been observed to occur in numerous environmental conditions across the Earth and contributes significantly to aerosol number concentrations.6,7,8 However, the degree to which freshly nucleated aerosols can grow sufficiently to act as cloud condensation nuclei (CCN), increase cloud albedo, and result in negative radiative forcing, remains uncertain,9,10 and a full understanding of the complex cloud-aerosol interactions is lacking.11,12

A number of challenges confront attempts to quantify the impact of NPF on CCN number concentrations and cloud properties at the local, regional, and global scale. These include: (1) Uncertainties in the precise nucleation mechanism responsible for NPF and appropriate scaling parameters within NPF schemes. While previous numerical modeling and in situ measurement studies have suggested that a significant portion of global CCN originates from NPF,13,14,15 the impact is regionally variable, scale dependent, and dependent on the assumed nucleation pathway.16,17 (2) Uncertainties in, or challenges to, representation of the impact of NPF on gas phase concentrations, aerosol particle size distributions (PSD), and cloud droplet number concentrations (CDNC). Uncertainties in CDNC can result from uncertainties in the representation of aerosol-cloud interactions, including the supersaturation necessary for CCN activation18 and in-cloud updraft velocities.

NPF is not inevitably associated with increased cloud droplet concentrations, at least at the regional scale. For example, NPF can alter the condensational sink (CS) sufficiently to inhibit growth of pre-existing particles to CCN sizes,19 which is consistent with observations of increased contribution of NPF to CCN concentrations during less polluted conditions.20 Some experimental studies (ref.10 and references therein) have sought to estimate the contribution of NPF to CCN concentrations by comparing concentrations before and after NPF events, but are subject to large uncertainties due to violations of the stationarity assumption. Further, the actual activation of particles to generate cloud droplets is critically dependent on the supersaturation (i.e., excess water vapor pressure above that in equilibrium with a plane surface of pure water), and thus some analyses have sought to derive CCN estimates theoretically from aerosol number concentrations and assumed supersaturations that may not realistically represent true in-cloud conditions.21,22 The high degree of sensitivity to assumed water vapor supersaturation is illustrated by previous simulations over the Beijing region of China, which found that while NPF increased CCN at high supersaturations, it reduced CCN at lower supersaturations.23

Most global climate models predict a linearly proportional response in cloud optical depth to changes in aerosol optical depth because they do not fully treat aerosol populations, composition, and mixing state, and aerosol-cloud interactions, microphysics, and cloud dynamics.3 Thus more computationally efficient, limited area models offer an important avenue for more diagnostic analyses of the actual aerosol indirect effect at the regional scale. Recent increases in computing resources and complexity of available aerosol-cloud microphysics schemes24,25 enable simulation of the impact of NPF on aerosol PSD, while explicitly resolving convective clouds at high resolution and cloud droplet number distributions (i.e., double-moment microphysics), and thus permit evaluation of the resultant aerosol-cloud impacts.

Herein, we use the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem v3.6.1) with the NPF mechanism of Matsui et al. 2011 (ref. 23) to simulate and quantify the impact of NPF on cloud radiative properties during May 2008. WRF-Chem is applied to a domain of 800 × 800 km at 4 km resolution, which is sufficient to explicitly resolve convection,26 for the period 9–27 May 2008 (9–11 May are used as spin-up), using the Morrison double-moment microphysics scheme to describe cloud droplet concentrations with high fidelity (see Table S1 for full simulation settings). May was chosen as the simulation period because it is the month of the highest observed NPF frequency over the Midwestern USA.27 Further, during this specific month (May 2008) an intensive field experiment was conducted in southern Indiana, permitting detailed evaluation of simulated aerosol PSD along an 80 km transect across the middle of the domain extending from a small town, to an expansive forest (Morgan Monroe State Forest, MMSF), and to the major urban area of Indianapolis28 (Fig. 1a). Near-surface aerosol and gas phase concentrations are also evaluated using data from the US Environmental Protection Agency (EPA)’s Air Quality System (AQS) (see details in SI). Cloud properties are evaluated using satellite-based remote sensing measurements from the MOderate Resolution Imaging Spectroradiometer on the Aqua satellite (MODIS-Aqua) at 5 km nadir resolution.

Fig. 1

a Land use within the simulation domain from MODIS for 2012. Also shown are locations of the three in situ particle size distribution measurement sites (BMG, MMSF, and INDY). b Time series of hourly nucleation mode number concentrations from WRF-Chem (solid-lines; see legend in bottom panel) with various prefactors and NPF turned off, and from in situ measurements (dashed-lines). Note two instruments (an SMPS and an FMPS) were operated simultaneously at MMSF. c Comparison of the hourly average condensational sink (CS) from WRF-Chem (abscissa axis) and in situ measurements (ordinate axis). As CS did not vary greatly across the simulations, only the highest prefactor and no NPF are shown. Approximate Dp range used to compute the CS is shown in the upper left of each panel

Four sets of WRF-Chem simulations are conducted. Although the precise nucleation mechanism for NPF likely varies in space and time,29,30,31 most studies indicate that the production of ~1 nm diameter particles within the boundary layer scales with the sulfuric acid vapor concentration raised to a power n ([H2SO4]^n), multiplied by a prefactor.32,33,34,35 Further, in situ measurements within the domain indicate ultrafine particle number concentrations are strongly associated with [H2SO4].28 Thus, herein we implement an activation type nucleation mechanism23,36,37 with n = 1 and nucleation rates of 2 × 10−7, 2 × 10−6, and 2 × 10−5 × [H2SO4], and compare those with a simulation in which NPF is excluded. This suite of simulations is inclusive of nucleation prefactors used in prior studies and is evaluated in terms of the degree of agreement with PSD measured at three sites during the study period. Use of the four simulations permits both an analysis of the sensitivity of our results to the nucleation prefactor and clear quantification of the impact of NPF on CCN concentrations and cloud properties. Simulations such as those presented herein are highly computationally demanding (the four simulations of the 19 day period presented herein required ~350,000 core hours on a Cray supercomputer); thus previous studies have chosen to simulate relatively short periods (a few days or less),36 to simulate at comparatively low resolution requiring use of cumulus parameterizations,38 to employ a single prefactor,31 or to simplify the gas-particle chemical schemes (most frequently neglecting production of secondary organic aerosol mass).39 Herein, we chose to simulate a long period, evaluate the sensitivity to NPF prefactors, and conduct the simulations at convection-permitting scales with a high precision cloud microphysics scheme.
Because in situ measurements indicate that partitioning of semi-volatile products of oxidation of volatile organic compounds makes only a small contribution to the growth of nano-scale particles in this environment28 and that a proxy for cloud droplet concentrations exhibits little response to changes in biogenic emissions at MMSF,1 production of organic aerosol mass is excluded from the simulations for computational efficiency. Due to the limited duration of the simulations, there is no discrimination based on cloud height or phase.

Results

Evaluation of the simulations

Prior to use of the WRF-Chem output to quantify the impact of NPF on cloud properties, the simulations were subject to a detailed model performance evaluation (see SI). In brief, the results of this evaluation indicate that for all three observing sites, the activation scheme replicates the day-to-day variability in occurrence and intensity of NPF (Fig. 1b, Figures S1 and S2). The highest correlation between hourly measured and simulated nucleation mode concentrations (particle diameter (Dp) < 25 nm) occurred with a prefactor of 2 × 10^−5 (r = 0.20–0.48), and the lowest root mean squared error (RMSE) resulted from a prefactor of 2 × 10^−6 (RMSE = 8.7 × 10^3 to 3.9 × 10^4 cm−3). The RMSE is of the same magnitude as the inter-instrument uncertainty: the RMSE between the two instruments co-located at MMSF that operate with different PSD discretization and measurement principle (Scanning Mobility Particle Sizer (MMSF-SMPS) and Fast Mobility Particle Sizer (MMSF-FMPS)), calculated by treating one instrument as observation and the other as a predictor, is 7.8 × 10^3 cm−3. Thus the simulations closely match the frequency and intensity of NPF in the center of the domain.

For an appreciable climate impact, aerosol forcing must occur on relatively large (e.g., regional) scales. In situ measurements taken at spatially distributed sites have indicated that NPF is indeed a regional phenomenon, but exhibits important sub-regional heterogeneity due to local emissions and/or sulfur-rich plumes.40,41,42 Thus it is essential that the model reproduces the scales of NPF events. Using the definition that a NPF event day is one in which nucleation mode (Dp < 25 nm) number concentrations during one or more hours exceed the grid cell mean value by 0.5 standard deviations, 9 of 16 simulation days were classed as exhibiting nucleation events at Indianapolis (INDY) and Bloomington (BMG), and 10 of 16 days at MMSF. Using this definition, NPF is simulated, on average across all grid cells, on 76–78% of days on which a NPF event day is identified at each of the three in situ measurement sites (Fig. 2a). When isolating only particularly strong NPF days (i.e., days when nucleation mode particle number concentrations were one standard deviation above the mean), the spatial scale of coherence is smaller, but averaged across the entire domain, strong NPF still occurs on 62–66% of days on which NPF is simulated at each of the three in situ measurement sites (Fig. 2b). This implies that NPF is being observed and simulated on sufficiently large scales to potentially have an impact on regional cloud properties and thus regional climate.

Fig. 2

Fraction of modeled days on which NPF events were simultaneously predicted in the grid cell and those containing the three observational sites (INDY, MMSF, and BMG). An event day is declared if any hourly nucleation mode concentration is (a) more than one-half of a standard deviation above the local mean or (b) more than one standard deviation above the local mean. The classification in (a) most closely agrees with a subjective classification, while (b) identifies only intense NPF event days. The black bars on the color bars represent the coincident NPF days expected for a random field (1000 Monte-Carlo iterations)

The model output also exhibits similar values of the CS to near-surface observations, although the range is higher in the model except at the site in Indianapolis (INDY), due to the higher background aerosol concentration in the urban area (Fig. 1c). This is consistent with a positive bias in simulated near-surface aerosol mass (PM2.5 and PM10) relative to EPA measurements across the study domain,43 and may in part account for the underestimation of surface SO2 concentrations in the model (Figure S3). Conversely, ozone concentrations, and thus oxidant concentrations, exhibit good accord (mean error (ME) = 0–10 ppb) with EPA surface measurements. For brevity, the remainder of the study focuses primarily on output from the 2 × 10−5 × [H2SO4] and control (no NPF) simulations.

Cloud fraction is a dominant modulator of radiative fluxes, but capturing cloud presence at the correct location, at the correct time is one of the greatest challenges to atmospheric models.44,45 When compared with MODIS-Aqua retrievals on a pixel-to-pixel and hour-to-hour basis, 50% of cloud impacted pixels (i.e., grid cells with clouds in any vertical layer) are correctly simulated across the entire domain (Fig. 3b). Cloud top heights exhibit r > 0.6 over 20% of grid cells, and a mean (standard deviation across grid cells) r = 0.38 (0.30) and root mean square error (RMSE) = 369 hPa (64 hPa) for all grid cells between the simulation and MODIS (Fig. 3c,d). Thus, we conclude the simulations reproduce the observed cloud fields sufficiently well that they are adequate for evaluating differences in cloud properties between the control and NPF simulations.

Fig. 3

a Number of MODIS retrievals in each grid cell when a cloud was present in MODIS or WRF-Chem. b Fraction of retrievals for which WRF-Chem correctly simulated the presence of clouds (see Eq. (5) in Methods section). c Pearson’s correlation coefficient and d root mean squared error for cloud top heights (hPa) between MODIS and WRF-Chem. The first column shows results for no NPF simulations; the second column is with NPF turned on with a prefactor of 2 × 10−5 × [H2SO4]. The rippling is an artifact of the strict criteria for matching a MODIS retrieval with the model grid (i.e. overlapping assuming a 5 × 5 km footprint). In reality, the retrieval footprint is larger off-nadir, resulting in the pattern

Impact of NPF on top of atmosphere (TOA) radiative forcing

The impact of NPF on cloud albedo and radiative (climate) forcing is quantified by comparing the TOA outgoing shortwave (TOA-SW) radiation from the NPF simulations and the control simulation without NPF. For all prefactors, the NPF simulations have lower TOA-SW for all grid cells with clouds than in the control run. This indicates a decrease in cloud albedo due to the occurrence of NPF and the effect is of the greatest magnitude for simulations with the highest NPF prefactor (i.e., the strongest nucleation intensity).

Averaged over the entire simulation period (12–27 May 2008), this forcing is 10 W m−2 over the majority (50%) of grid cells when all hours are included, and exceeds 30 W m−2 over 8% of the domain when only cloudy hours are considered. When only daytime, cloudy pixels are considered, the median forcing across all grid cells is 25 W m−2. This indicates NPF substantially impacts cloud brightness and hence the regional radiative budget and climate (Fig. 4a and Figure S5). NPF does not drastically change simulated cloud top heights. Ninety percent of all grid cells and simulation hours exhibit a difference in cloud top heights of less than 30 hPa between the NPF simulation and the simulation without NPF. While NPF decreases TOA-SW over grid cells with clouds, there is also only a modest change in simulated cloud fraction. Ninety percent of grid cells exhibit little or no difference (between –6 and +2%) in the number of hours with cloud present in the simulations with and without NPF. While TOA-SW is generally lower when NPF is turned on, indicating lower cloud albedo and hence reduced reflection of solar radiation on days when NPF is simulated, this is not uniform in space or time. For example, the TOA-SW is lower in the NPF simulation for 12 May, but higher on 26 May than in the no NPF simulation (Fig. 4d). This illustrates the complexity of aerosol-cloud interactions and the challenges to making projections of regional climate response to changing anthropogenic aerosol and gas phase emissions.

Fig. 4

Difference in top of atmosphere shortwave radiative flux (W m−2) between the simulations with NPF on and the simulation with NPF off. a–c Mean difference across the entire study period with a prefactor of a 2 × 10−5, b 2 × 10−6, c 2 × 10−7 × [H2SO4]. d Daily mean differences for the 2 × 10−5 × [H2SO4] simulation. Dates are ordered sequentially in row-major order (i.e., upper left is 12 May, while the lower right shows 27 May) and NPF event days as simulated at MMSF are indicated by red stars in the upper right corner of each panel

For most days on which no NPF was observed in the NPF-enabled simulations, there is little difference in cloud presence (Fig. 3a) or TOA-SW (Fig. 4d) between the NPF and control simulations. This illustrates the numerical stability of the simulations and indicates that within the simulation domain cloud properties are primarily impacted by NPF on the day of an event.

NPF increases aerosol particle number concentrations, and with sufficient ambient condensable vapor concentrations, those new particles can grow sufficiently to increase CCN concentrations, and ultimately increase cloud albedo, resulting in a negative radiative forcing (i.e., cooling of the surface and troposphere). Prior research found that at supersaturations of 1% and 0.4%, approximately half of ~50 and 80 nm aerosols can act as CCN, respectively.21 However, the importance of NPF to CCN is very sensitive to the assumed supersaturation, and supersaturations rarely exceed 1% in nature.22 Consistent with prior studies, our simulations also indicate there is an increase in 50 nm (occasionally 80 nm) diameter aerosols following the majority of NPF events; however, the increase is often (~60% of events at MMSF) accompanied by a decrease in the concentrations of larger diameter aerosol particles (Fig. 5a,c). Therefore, while the number of aerosol particles that are potentially large enough to act as CCN increases, it is typically accompanied by a downward shift in aerosol diameter and thus a requirement for higher water vapor supersaturation. Further, consistent with prior studies that indicate CCN enhancement from NPF,10,14 the simulations presented herein indicate that the concentrations of aerosol particles with diameters >100 nm are up to two standard deviations above the mean on NPF event days (Fig. 5b). However, the enhancement of CCN concentrations occurs on those days even in the control simulation without NPF, indicating a non-NPF CCN source that coincides with conditions favorable for NPF (Fig. 5b). Additionally, on NPF days, this CCN enhancement is reduced and a shift to smaller particle diameters is simulated, potentially due to NPF reducing available condensable species (such as H2SO4, Fig. 5c) and thus inhibiting condensational growth of pre-existing aerosols.

Fig. 5

a Difference in particle number concentration (dN/dlogDp; cm−3) by particle diameter (colors) between output from the NPF on and NPF off simulations at the locations of the three in situ measurement sites (location of sites shown in Fig. 1). b Anomalies (standard deviations from mean) for the no NPF simulation (bottom panel), and differences (NPF minus control simulation) in anomalies of the particle number concentrations calculated by diameter across the study period at the MMSF grid cell for the various prefactors (top three panels). NPF event days are indicated by red asterisks on the MMSF panel in (a) and above the top panel in (b). c As in (a) for MMSF, but enlarged and annotated to highlight NPF events and increases in 51 nm (and 82 nm) particles at the expense of larger particles. d Ratio of sulfuric acid concentrations in the no NPF simulation relative to the simulation with NPF

Discussion

This is one of the first studies to explicitly resolve changes in CDNC and cloud albedo due to NPF. Contrary to our a priori expectation, in the limited study area and period presented here, NPF reduces cloud albedo, producing a positive radiative forcing of, on average, 10 W m−2 and of up to 50 W m−2 on individual days in a substantial number of grid cells. This implies that NPF actually leads to cloud dimming and thus regional warming at least during some periods of the year over the comparatively polluted Midwestern USA. It is worthy of note that for all of the NPF prefactor values simulated herein, the TOA-SW forcing is, on average, positive (Fig. 4a-c), though it is of smallest magnitude for the lowest prefactor. It is important to re-emphasize that May is the month with greatest frequency of NPF events in the Midwest and thus the indirect radiative forcing (IRF) from NPF reported herein cannot be scaled to give an estimate of the annual net radiative forcing. However, for comparison, this radiative forcing is opposite in sign to the mean annual aerosol indirect forcing of −2 W m−2 for the eastern USA derived from coarse-resolution global simulations.5 The discrepancy between our results and the global simulations is likely due to our focus on the calendar month of highest NPF frequency and the importance of direct versus parameterized convection and the microphysical scheme in dictating the cloud response to NPF. Further comparison can be made with the annual mean net radiative forcing of approximately −1 W m−2 associated with the observed transition of cropland to forests over the eastern USA during 1920–1996 (ref. 46).

The computational demand imposed by the complexity of resolving nucleation, convection, and aerosol-cloud interactions necessarily limited the spatial domain and temporal duration of the simulations presented herein and led to a decision to use a comparatively simple chemical scheme. Future work is needed to test the sensitivity of our results to the NPF pathway scheme used and/or inclusion of the production and partitioning of semi-volatile organic species to the particle phase and their contribution to aerosol growth and aerosol-cloud interactions; time of year and geographic location, and both future emissions and climate scenarios; and/or treatment of planetary boundary layer physics, particularly in light of evidence that, at least on some days, NPF may be focused at or near the residual layer formed by the nocturnal inversion.47 A further uncertainty in this analysis pertains to the quantity, size, and composition of primary particle emissions. In the 2011 US NEI (used for the simulations presented herein) particle emissions are described only by PM2.5 and PM10 particle mass, and in the current study are partitioned across the particle size bins (>10 nm) following Matsui et al. (ref. 23).

Building confidence in climate model simulations of indirect aerosol forcing (i.e., modification of cloud properties) and making robust projections of the impact of anthropogenic emission controls requires quantitative comparison of simulations under current conditions with a diverse suite of observations, as conducted herein. We show that NPF can have a significant positive or negative impact on regional aerosol IRF with the magnitude and even sign of the effect (warming or cooling) varying in space and time even within a region. This has significant implications for making robust regional climate projections. Implementation of the Clean Air Act and subsequent amendments in the USA has led to substantial reductions in emissions of NPF precursors (e.g., SO2) and primary particles, and thus has greatly modified aerosol particle populations.5 However research presented herein indicates anthropogenic emission reductions are unlikely to be associated with a linear response in regional aerosol indirect forcing, with the magnitude and sign of the forcing being dependent on the response in NPF frequency and intensity to changes in precursor emissions, changes in background aerosol populations, and the interaction between NPF and preexisting aerosols. Our research further highlights the importance of explicitly resolving complex cloud-aerosol processes in order to better understand and characterize aerosol IRF and specifically the contribution from NPF.

Methods

Model description and settings

Simulations were run from 9–27 May 2008 (9–11 May are used as model spin-up) on a 200 × 200 × 45 grid cell domain (the outer five grid cells were treated as a buffer zone and excluded from the analysis) centered on 40°N, 85°W using WRF-Chem version 3.6.1 with the activation NPF scheme of Matsui et al., 2011 (ref. 23). The simulations were run on a Cray XE6/XK7 supercomputer (Big Red II) for 28 days on 128 cores per simulation. To decrease overall runtime, the cores were distributed across 16 nodes (16-core Opteron/NVIDIA) using only 8 of the 16 cores per node. In total, the simulations presented herein required ~350,000 core hours.

The simulations were run with the following settings (see comprehensive overview of simulation settings in SI):

  • A time-step of 20 s (physics, chemistry, and photolysis) was used and output was saved once every hour.

  • Anthropogenic emissions were from the 2011 National Emissions Inventory (NEI); biogenic emissions were from the Model of Emissions of Gases and Aerosols from Nature (MEGAN); chemical initial and boundary conditions were from the Model for OZone and Related chemical Tracers (MOZART); and meteorological initial and boundary conditions were from the North American Mesoscale Forecast System (NAM).

  • No cumulus parameterization was enabled and the Morrison double-moment microphysics scheme was used with prognostic cloud droplet number. The Morrison scheme resolves mass and number concentrations for both liquid (cloud droplets and rain) and solid (ice, snow, and graupel) phase.48

  • The Rapid Radiative Transfer Model for GCMs (RRTMG) was used for both longwave and shortwave radiation at 10 min time-steps.

  • The Monin-Obukhov Similarity scheme was used for the surface-layer parameterization, the Yonsei University (YSU) scheme was used for the planetary boundary layer physics, and the Noah Land Surface Model was applied.

  • Chemistry and aerosols were simulated with the Carbon-Bond Mechanism version Z (CBMZ) and the Model for Simulating Aerosol Interactions and Chemistry (MOSAIC) using 20 sectional aerosol bins from 1 nm–10 μm with aqueous reactions, and aerosol optical properties were calculated using a volume approximation.

Evaluation methods

The appropriate prefactor for the activation nucleation mechanism used herein is subject to considerable uncertainty and may be spatially and temporally variable.23,37 Thus simulations were run using 2 × 10−7, 2 × 10−6, and 2 × 10−5 × [H2SO4] (Eq. (1)), and the aerosol number concentrations were compared to in situ measurements from either scanning mobility particle sizers (SMPS) or fast mobility particle sizers (FMPS) at three locations near the center of the study domain: Indianapolis, IN (INDY; FMPS Dp = 6–523 nm), MMSF, IN (MMSF; SMPS Dp = 6–110 nm and FMPS Dp = 6–523 nm), and Bloomington, IN (BMG; SMPS Dp = 10–414 nm). The comparisons of nucleation mode particle number concentrations and CS only include simulated aerosol number concentrations in the size bins matching the discretization of the instruments. The CS is calculated using Eqs. (2), (3), and (4) (ref. 49).

$$J = A \times \left[ \mathrm{H_2SO_4} \right]$$
(1)

where J is the nucleation rate of 1 nm particles, A is the prefactor, and [H2SO4] is the sulfuric acid vapor concentration.
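
As a minimal illustration of Eq. (1), the sketch below evaluates J for the three prefactors used here. The sulfuric acid concentration of 1 × 10^7 molecules cm−3 is an assumed, illustrative value rather than a simulated or measured one; the prefactor units of s−1 and the function name are likewise assumptions made for the example.

```python
# Prefactors A used in the three NPF simulations (units assumed to be s^-1).
PREFACTORS = (2e-7, 2e-6, 2e-5)


def activation_nucleation_rate(prefactor, h2so4):
    """Eq. (1): J = A * [H2SO4].

    h2so4 is in molecules cm^-3, so J is in particles cm^-3 s^-1.
    """
    if prefactor < 0 or h2so4 < 0:
        raise ValueError("prefactor and concentration must be non-negative")
    return prefactor * h2so4


# Assumed, illustrative sulfuric acid concentration (not a value from this study).
H2SO4_EXAMPLE = 1.0e7

rates = {a: activation_nucleation_rate(a, H2SO4_EXAMPLE) for a in PREFACTORS}
for a, j in rates.items():
    print(f"A = {a:.0e} s^-1 -> J = {j:.1f} cm^-3 s^-1")
```

Because n = 1, J scales linearly with the prefactor, so each tenfold increase in A gives a tenfold increase in the production rate of 1 nm particles at a fixed [H2SO4].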

$$\mathrm{CS} = 2\pi D \sum_i \beta_i D_{p,i} N_i$$
(2)

$$\beta_i = \frac{1 + \mathrm{Kn}_i}{1 + \left( \frac{4}{3\alpha} + 0.337 \right)\mathrm{Kn}_i + \frac{4}{3\alpha}\mathrm{Kn}_i^2}$$
(3)

$$\mathrm{Kn}_i = \frac{2\lambda}{D_{p,i}}$$
(4)

where D is the diffusion coefficient (0.06 cm2 s−1 herein; ref. 50), β_i is the transition regime correction factor and N_i is the aerosol number concentration for aerosols in size bin i, Kn is the Knudsen number, α is the sticking coefficient (unity herein), and λ is the mean free path of air (68 nm herein).
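
Equations (2)–(4) can be sketched in a few lines of code. The constants below are the values stated above (D = 0.06 cm2 s−1, λ = 68 nm, α = 1); the four-bin size distribution is a made-up example, not measured or simulated data, and the function names are illustrative only.

```python
import math

D_VAPOR = 0.06            # diffusion coefficient, cm^2 s^-1 (value stated in the text)
MEAN_FREE_PATH_CM = 68e-7  # mean free path of air, 68 nm expressed in cm
ALPHA = 1.0                # sticking coefficient (unity in the text)


def transition_correction(dp_cm, alpha=ALPHA, mfp_cm=MEAN_FREE_PATH_CM):
    """Transition regime correction factor (Eq. 3) using Kn = 2*lambda/Dp (Eq. 4)."""
    kn = 2.0 * mfp_cm / dp_cm
    denominator = 1.0 + (4.0 / (3.0 * alpha) + 0.337) * kn + (4.0 / (3.0 * alpha)) * kn ** 2
    return (1.0 + kn) / denominator


def condensational_sink(diameters_nm, number_conc_cm3):
    """Condensational sink (Eq. 2) in s^-1 from bin diameters (nm) and number concentrations (cm^-3)."""
    if len(diameters_nm) != len(number_conc_cm3):
        raise ValueError("diameters and concentrations must have the same length")
    total = 0.0
    for dp_nm, n in zip(diameters_nm, number_conc_cm3):
        dp_cm = dp_nm * 1e-7  # convert nm to cm so the units reduce to s^-1
        total += transition_correction(dp_cm) * dp_cm * n
    return 2.0 * math.pi * D_VAPOR * total


# Hypothetical four-bin size distribution, for illustration only.
example_diameters_nm = [20.0, 50.0, 100.0, 200.0]
example_numbers_cm3 = [5000.0, 2000.0, 1000.0, 200.0]
cs = condensational_sink(example_diameters_nm, example_numbers_cm3)
print(f"CS = {cs:.2e} s^-1")
```

When comparing with an instrument, only the bins within that instrument's diameter range would be passed in, mirroring the size matching described above.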

In situ surface trace gas and aerosol mass measurements used in the model performance assessment are from all available US EPA monitoring sites within the study domain. Trace gas measurements are daily mean and maximum values calculated from hourly average measurements for SO2, NO2, and carbon monoxide (CO), and 8-h running average measurements for ozone (O3); daily mean particulate matter with diameters <2.5 and 10 μm (PM2.5 and PM10) mass measurements are on a one-in-three or one-in-six day schedule depending on the site. The mean error (ME) is calculated for the nearest WRF-Chem grid cell centroid to each site. Additional model validation is provided in the supplemental information, including: The time series of the PSD at the three in situ measurements sites, the corresponding time series from the simulations, and the time series for each of the NPF prefactors at MMSF; and the domain mean and variability, as well as spatial distribution in surface trace gas and PM errors relative to the EPA measurements.
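
The site-to-grid matching and error statistics can be sketched as follows. This is a sketch under stated assumptions: the mean error is taken as the mean of model minus observation, the nearest centroid is found by great-circle distance, and all coordinates and concentrations are hypothetical values chosen for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_cell(site_lat, site_lon, cell_centroids):
    """Index of the grid cell centroid (lat, lon) closest to the monitoring site."""
    return min(
        range(len(cell_centroids)),
        key=lambda i: haversine_km(site_lat, site_lon, *cell_centroids[i]),
    )


def mean_error(simulated, observed):
    """Mean of (simulated - observed); the sign convention is an assumption."""
    return sum(s - o for s, o in zip(simulated, observed)) / len(observed)


def rmse(simulated, observed):
    """Root mean squared error between paired simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed))


# Hypothetical centroids, site location, and daily ozone values (ppb).
centroids = [(40.0, -85.0), (39.3, -86.4), (39.8, -86.2)]
cell_index = nearest_cell(39.32, -86.42, centroids)
sim_o3 = [42.0, 50.0, 38.0]
obs_o3 = [40.0, 45.0, 41.0]
print(cell_index, round(mean_error(sim_o3, obs_o3), 2), round(rmse(sim_o3, obs_o3), 2))
```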

Cloud properties used in the model evaluation are from the MODIS instrument on the Aqua satellite (equatorial overpass ~1330 local standard time) and have a spatial resolution of 5 × 5 km at nadir. For all retrievals over the domain, comparison is made to the nearest WRF-Chem output hour for the nearest grid cell within 10 km of the retrieval centroid. As no cumulus parameterization is used, grid cells are defined as either cloud or no cloud, and a location is defined as cloudy if clouds are present anywhere in the vertical column; MODIS pixels with a cloud fraction > 0 are defined as cloudy. The percentage of correctly simulated cloud pixels is quantified as shown in Eq. (5). The simulation cloud top height is defined as the highest model level with clouds present and is also evaluated relative to observational estimates from MODIS using the Pearson correlation. It is noted that cloud top heights are not Gaussian; thus Pearson’s correlation coefficients should be interpreted cautiously.

$$\mathrm{Percent\ Correct} = \frac{h}{m + f + h}$$
(5)

where h is the number of correct forecasts (cloud in both MODIS and WRF-Chem), m is the number of missed forecasts (cloud in MODIS, but not WRF-Chem), and f is the number of false alarms (cloud in WRF-Chem, but not MODIS).
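
A minimal sketch of Eq. (5) for paired cloud flags is given below. The inputs are assumed to be already matched in space and time (one MODIS retrieval per model grid cell and hour), the example flags are invented, and the function returns a fraction that would be multiplied by 100 to express a percentage.

```python
def percent_correct(modis_cloudy, model_cloudy):
    """Eq. (5): h / (m + f + h) for paired boolean cloud flags.

    Pairs with no cloud in either MODIS or the model do not enter the score.
    Returns NaN if no pair has cloud in either data set.
    """
    if len(modis_cloudy) != len(model_cloudy):
        raise ValueError("inputs must be paired and of equal length")
    hits = misses = false_alarms = 0
    for obs, sim in zip(modis_cloudy, model_cloudy):
        if obs and sim:
            hits += 1          # cloud in both MODIS and WRF-Chem
        elif obs:
            misses += 1        # cloud in MODIS only
        elif sim:
            false_alarms += 1  # cloud in WRF-Chem only
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


# Invented example: 2 hits, 1 miss, 2 false alarms, 1 clear-sky agreement.
modis_flags = [True, True, False, True, False, False]
model_flags = [True, False, True, True, False, True]
score = percent_correct(modis_flags, model_flags)
print(f"fraction correct = {score:.2f}")
```

Because clear-sky agreement is excluded from the denominator, the score is not inflated by grid cells and hours in which neither data set contains cloud.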

As there are 190 × 190 grid cells in the horizontal dimension, an automated method is used to classify each day as either a NPF event or non-event day. A day is defined as a NPF event day if the nucleation mode (Dp < 25 nm) number concentration during any hour was greater than 0.5 standard deviations (σ) above the mean value for that grid cell over the entire simulation period. This threshold is selected to replicate the NPF frequency derived using a subjective classification of the PSD time series at MMSF. Additionally, a NPF event day was considered ‘intense’ if the nucleation mode number concentration was greater than 1 σ above the local mean. The scale of spatial coherence in NPF occurrence is quantified by calculating the fraction of days on which NPF occurred at each grid cell for all days on which NPF was simulated in the grid cells containing the three in situ measurement sites. The frequency of co-occurrence of NPF event days between each site and a random field is computed using a 1000-iteration Monte Carlo simulation. For each iteration a random value is given to each day, and if this value is 0.5 (1) σ above the mean for all days it is classified as a NPF event (intense event) day.
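
The classification and the random-field baseline can be sketched as below. Several details are assumptions made for the example and are not specified above: the population standard deviation is used, the random daily values are drawn from a standard normal distribution, and the hourly concentrations are invented.

```python
import random
import statistics


def classify_event_days(hourly_by_day, n_sigma=0.5):
    """Flag a day as a NPF event day if any hourly nucleation mode concentration
    exceeds the period mean by more than n_sigma standard deviations."""
    all_hours = [value for day in hourly_by_day for value in day]
    threshold = statistics.mean(all_hours) + n_sigma * statistics.pstdev(all_hours)
    return [max(day) > threshold for day in hourly_by_day]


def co_occurrence_fraction(site_events, cell_events):
    """Fraction of site event days that are also event days in another grid cell."""
    site_days = [i for i, is_event in enumerate(site_events) if is_event]
    if not site_days:
        return float("nan")
    return sum(1 for i in site_days if cell_events[i]) / len(site_days)


def random_field_fraction(site_events, n_sigma=0.5, iterations=1000, seed=0):
    """Mean co-occurrence expected by chance (Gaussian daily values are assumed)."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(iterations):
        values = [rng.gauss(0.0, 1.0) for _ in site_events]
        threshold = statistics.mean(values) + n_sigma * statistics.pstdev(values)
        random_events = [v > threshold for v in values]
        fractions.append(co_occurrence_fraction(site_events, random_events))
    return statistics.mean(fractions)


# Invented hourly nucleation mode concentrations (cm^-3): 4 days x 3 hours.
hourly = [[100, 120, 110], [100, 900, 150], [90, 100, 95], [110, 1000, 130]]
events = classify_event_days(hourly)            # n_sigma=0.5: event days
intense = classify_event_days(hourly, n_sigma=1.0)  # n_sigma=1.0: intense event days
baseline = random_field_fraction(events)
print(events, intense, round(baseline, 2))
```

Applying `classify_event_days` to every grid cell and passing each result to `co_occurrence_fraction` with a site's event days would give the kind of map shown in Fig. 2, with `random_field_fraction` supplying the chance baseline.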

Radiative forcing is quantified using TOA upward shortwave (SW) radiation. The difference between the NPF and control simulations is calculated using all simulation hours in all grid cells with clouds simulated in both the NPF and no NPF simulations. Thus, the difference includes nighttime hours when no insolation is present and there is no difference in TOA-SW between the NPF and control simulations, and therefore the radiative forcing estimates presented here should be viewed as conservative estimates.
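
A minimal sketch of this calculation for a single grid cell follows. The sign convention (control minus NPF upward flux, so that reduced reflection gives a positive forcing) is an assumption inferred from the text, and the fluxes and cloud flags are invented for illustration.

```python
def cloudy_mean_forcing(sw_up_npf, sw_up_control, cloudy_npf, cloudy_control):
    """Mean TOA forcing (W m^-2) over hours that are cloudy in both simulations.

    Forcing is taken as control minus NPF upward shortwave flux, so lower
    reflection in the NPF run gives a positive (warming) value.
    """
    differences = [
        control - npf
        for npf, control, c_npf, c_ctl in zip(sw_up_npf, sw_up_control, cloudy_npf, cloudy_control)
        if c_npf and c_ctl
    ]
    return sum(differences) / len(differences) if differences else float("nan")


# Invented hourly values for one grid cell; hour 0 is at night (zero flux in both runs).
sw_npf = [0.0, 300.0, 280.0, 0.0]
sw_control = [0.0, 330.0, 300.0, 0.0]
cloud_npf = [True, True, True, False]
cloud_control = [True, True, False, True]

forcing = cloudy_mean_forcing(sw_npf, sw_control, cloud_npf, cloud_control)
print(f"mean forcing = {forcing:.1f} W m^-2")
```

In this example the cloudy nighttime hour contributes a zero difference and halves the mean relative to the single daytime cloudy hour, which illustrates why including nighttime hours yields a conservative estimate.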

Data availability

The observational datasets and simulation output presented herein are available on request. EPA data are available from: https://www.epa.gov/outdoor-air-quality-data; MODIS data are available from: https://disc.sci.gsfc.nasa.gov/; WRF-Chem is available from: http://www2.mmm.ucar.edu/wrf/users/.