Background & Summary

Although it is well understood in the scientific community that extreme hydrometeorological events in cold climates are often related to snow processes, i.e., snowmelt and rain-on-snow (ROS)1,2,3,4,5,6, they have largely been ignored or under-represented in hydrologic design that relies largely on traditional precipitation-based intensity-duration-frequency curves (PREC-IDF) to estimate design basis extreme events (e.g., 100-year 24-hour event). PREC-IDF such as the National Oceanic and Atmospheric Administration (NOAA) Atlas 147 assumes that precipitation is in the form of rainfall that is immediately available for rainfall-runoff processes. This assumption has obvious shortcomings, especially in snow-dominated regions where winter precipitation is primarily snowfall. At locations where runoff is released slowly from accumulated snowpack, PREC-IDF can lead to infrastructure overdesign and incur unnecessary costs. Conversely, underdesign will occur where the snow-driven runoff rate is higher than the rate of precipitation. This was confirmed by previous research8,9, which demonstrated that PREC-IDF underestimates the 100-year, 24-hour extreme events in 45% of the Snow Telemetry (SNOTEL) stations examined, and the resulting peak design flood could be underestimated by up to 324%.

The Oroville Dam failure in February 2017 that required $1.5 billion to repair is a notable example of costly infrastructure damages resulting from floods driven by ROS events10. The exclusion of snow processes in the PREC-IDF technique is likely to cause greater errors in estimating extremes in a warming climate and present higher infrastructure design risk. For example, increased frequency and intensity of atmospheric rivers in the future are anticipated to cause more extreme orographic precipitation and subsequently increase flood risk along the U.S. West Coast11,12,13,14.

Given the limitations in PREC-IDF and implications for design risk, the Next-Generation IDF (NG-IDF) technique8 was developed to estimate extreme events based on the amount of water reaching the land surface (W) during rain, snowmelt, and ROS events. By including snow processes, NG-IDF provides a systematic and consistent technique for all environments from rain-dominated, transitional, to snow-dominated locations. Despite the marked advantages of NG-IDF over PREC-IDF, its wide adoption by engineers and planners is hindered by rather limited snow observations for estimating W, especially relative to widely available precipitation products15,16,17. Using physics-based models to produce reasonable simulations of W, while feasible, is rather challenging, given the significant requirements of expert knowledge in model calibration and considerable cost of computational resources.

To support the broad adoption of the NG-IDF approach, we developed a suite of datasets that characterize extreme events relevant to hydrologic design over 1951‒2013 at a 1/16th-degree (~6 km) resolution across the conterminous United States (CONUS). For over 200,000 locations in the CONUS, the datasets provide: the magnitude and dominant driving mechanism (i.e., rain, melt, and ROS) of extreme events for different durations (24‒72 hours) and return periods (2‒500 years) derived from the NG-IDF curves; the magnitude of design extreme events associated with different hydrometeorological drivers; trend and seasonality of annual maximum W events (AMW) over 1951‒2013; and infrastructure design risk associated with the PREC-IDF technique. These datasets were developed based on sub-daily simulations of W by a well-validated physics-based hydrological model (Distributed Hydrology Soil Vegetation Model, DHSVM18). To examine the accuracy of the simulations, the simulated snow water equivalent (SWE) was evaluated against long-term observations of SWE from 246 SNOTEL stations distributed across the Western U.S. These datasets intend to offer spatially distributed, quantitative measures of extreme events that are readily usable by the science and engineering communities to understand the climatology and driving mechanism(s) of extreme events, identify potential infrastructure design risk associated with the standard PREC-IDF method, and improve estimation of extremes.

Methods

Overview of NG-IDF datasets development

The method to develop the NG-IDF datasets is illustrated in Fig. 1. In contrast to PREC-IDF that estimates extreme events based on the total amount of precipitation (implicitly assumed to be in the form of rain), NG-IDF curves are developed based on the amount of water available for runoff (W) for the bare ground condition with no canopy cover, which can be represented mathematically by:

$$W=P-\Delta SWE+S$$
(1)

where P is precipitation, ΔSWE is the change in ground snowpack water content, and S indicates condensation (positive) or evaporation/sublimation (negative) of snowpack. In this development, as described in more detail in the following sections, P was taken from the gridded, gauge-based meteorological dataset, ΔSWE and S were simulated by the DHSVM snow model. The resulting W data is a 3-hourly time series for 1950–2013 at 207,173 grid cells covering the land surface of the CONUS at a 1/16th-degree resolution.

Fig. 1
figure 1

Schematic view of the NG-IDF datasets.

As observations suggested that snowfall events largely last up to 72 hours19, NG-IDF curves were developed for W events with 24-, 48- and 72-hour durations, respectively. We first developed series of annual maximum W (AMW) for years 1951–2013 based on subdaily estimations of W aggregated over different durations. For example, AMW with a 72-hour duration is the maximum of the 72-hour moving sum of W over a given year. Given variations in precipitation seasonality across the CONUS, we defined AMW events for both water year (October 1st through September 30th) and calendar year (January 1st through December 31st). AMW based on calendar year is more appropriate for locations with extreme precipitation occurring during the summer season such as the Southwest U.S., and AMW based on water year is better for locations with extreme winter precipitation such as the Western U.S. Given the focus on snow-driven extreme events, we used AMW defined by water years for IDF curves.

We applied the nonparametric Mann-Kendall test20,21 on the AMW series. Where trends were significant at the 5% confidence level, we detrended the time series using Sen’s slope22 while maintaining the long-term average of the time series. As suggested by NOAA Atlas 14, the generalized extreme value (GEV) distribution was fitted to AMW across all locations using the L-moments statistics7,8. Here we used the same GEV distribution for analysis of extreme events with different durations across the CONUS so that direct comparisons can be made in frequency estimates across durations or between locations. To quantify the implications of using PREC-IDF for infrastructure design risk, we also developed PREC-IDF curves following the same approach based on annual maximum precipitation (P). The design risk was quantified by comparing the NG-IDF and PREC-IDF values of extreme events (e.g., 100-year 24-hour event) over the CONUS.

A Monte Carlo (MC) simulation procedure23 and NOAA Atlas 147 was used to consider sample data uncertainty in frequency analysis. For each location, we generated 1,000 MC synthetic annual maximum series. We then fitted the GEV distribution to each MC series using the L-moments statistics and estimated the associated NG-IDF values. The uncertainties in NG-IDF curves for each location were quantified using the 5% and 95% quantiles (i.e., the 90% confidence interval) of ensemble members.

Gauge-based gridded meteorological data

For simulations of snow processes, the model requires subdaily input of P, air temperature, relative humidity, downward shortwave and longwave radiation, and wind speed. The meteorological input used here was derived from disaggregating the gridded (~6 km), daily land surface meteorological dataset24 over the CONUS using the Mountain Microclimate Simulation Model (MTCLIM) algorithm25. Distinct from reanalysis products, this dataset is one of the few gauge-based gridded datasets with long-term continuous records. It includes daily records of P, maximum and minimum air temperature, and wind speed spanning the period 1950–2013. The disaggregation algorithms25 estimate subdaily air temperatures with a 3rd-order polynomial fit based on daily temperatures range. Radiation and relative humidity are estimated based on daily temperature range, P, and solar geometry25. P is assumed to occur at a uniform rate throughout the day.

Observational snow data

Daily SWE measurements were retrieved from SNOTEL stations for snow model parameterization and evaluation. Among the 785 SNOTEL stations, we selected 246 stations that shared the longest common period (2007‒2013) of bias-corrected and quality-controlled (BCQC) daily SWE records. Briefly, standard quality control procedures8,26 were applied to remove stations with missing data, outliers, and problematic SWE values (e.g., peak SWE > accumulated winter precipitation). The BCQC procedures and the resulting BCQC datasets27 are available to the public at https://www.pnnl.gov/data-products.

CONUS-scale snow modeling and parameter development

The physics-based, snow submodel of DHSVM18 was implemented at the point scale to simulate snowpack dynamics under a bare ground and flat terrain condition. The model was run at the 3-hourly time step from 1950–2013 for 207,173 point locations at a 1/16th-degree grid spacing that coincides with the center of the meteorological grids. Model output includes 3-hourly times series of SWE and S, which are used together with observed P for calculating W in Eq. 1.

DHSVM simulates ground snowpack accumulation and melt using a two-layer mass and energy balance ground snowpack module. The mass balance components consist of P, S, changes in SWE, and melt from the snowpack. The partition of P into rain and snow is based on air temperature thresholds:

$$\left\{\begin{array}{cc}R=P & {T}_{a}\ge {T}_{R}\\ R=P({T}_{R}-{T}_{a})/({T}_{R}-{T}_{S}) & {T}_{S} < {T}_{a} < {T}_{R}\\ R=0 & {T}_{a}\le {T}_{S}\end{array}\right.$$
(2)

where Ta is air temperature, TS and TR is the temperature threshold for P to be completely snowfall and rain, respectively. If Ta ≥ TS, 100% of the P is rain (R); if Ta < TR, 100% of the P is snow; if Ta falls between the two thresholds, rain and snow are proportionally allocated to represent mixed rain and snow events. Energy balance at the snow surface is driven by net radiation, sensible and latent heat, and advected heat by rain. Energy and mass exchange between the thin surface layer and deep snowpack layer occurs via the exchange of meltwater. When liquid water in the deep snowpack exceeds its holding capacity, excess water is released to the underlying soil column. Detailed descriptions of the DHSVM snow model physics and governing algorithms can be found in a large body of literature18,28,29,30.

Prior calibration of snow models is performed typically at relatively local scales; thus, there are no calibrated, spatial snow parameter sets that can be readily applied for the CONUS-domain snow modeling. In support of this work as well as future large-domain snow modeling, here we developed spatially distributed snow parameters for the CONUS domain. Based on previous research29 that documented the robustness of regionally coherent snow parameters in modeling snowpack dynamics, we developed snow parameters for five spatial clusters covering the CONUS (Fig. 2). Given strong correlations between winter climate and key aspects of snowpack dynamics31,32,33, we determined the clusters using the k-means clustering machine learning technique34 based on the grid-level climatological mean of P, maximum and minimum air temperature, and wind speed from November through March during 1950–2013. Different numbers of clusters were tested and we selected the optimal five clusters based on the inertia elbow method and our previous work29.

Fig. 2
figure 2

Evaluation of cluster snow parameters in snow modeling. (a) Five clusters grouped by the mean winter precipitation, air temperature, and wind speed at the 1/16° resolution in the CONUS. Numbers in the parenthesis following each cluster name indicate the number of SNOTEL stations (black dots) within the cluster used for parameter development; (b) boxplots showing the model performance against SWE observations from SNOTEL stations, which was measured by NSE, PEAK.ERR and PDATE.ERR and grouped by clusters.

Parameter development focused on four snow parameters that were identified by previous work29 to be crucial for capturing daily SWE dynamics29: (1) TS (defined in Eq. 2), (2) fresh snow albedo (amax), (3) albedo decay coefficient during snow accumulation (λA) and (4) snowmelt (λM). The last three parameters are applied in the snow albedo decay curve of Laramie & Schaake35 for estimating snow surface albedo evolution, given by:

$$\begin{array}{lll}{\alpha }_{A} & = & {\alpha }_{max}\cdot {\lambda }_{A}^{{d}^{{\rm{0}}{\rm{.58}}}}\\ {\alpha }_{M} & = & {\alpha }_{max}\cdot {\lambda }_{M}^{{d}^{{\rm{0}}{\rm{.46}}}}\end{array}$$
(3)

where αA and αM are the snow surface albedo during the accumulation and melting seasons, respectively, and d is the number of days since the last snowfall. For other model parameters, we used the default values. The cluster-based snow parameters (Table 1) were developed as follows: (1) we produced prior ensemble parameters, consisting of 10,000 sets of the four snow parameters drawn uniformly from their physically plausible ranges, using the Latin Hypercube Sampling algorithm; (2) snow simulations were conducted at each SNOTEL location for every prior parameter set; (3) the posterior ensemble parameters were resampled from the prior ensemble if they met the threshold values of objective functions with observations. Here, we used three metrics (Eqs. 46): Nash-Sutcliffe Efficiency (NSE) of daily SWE ≥ 0.6, the bias in the mean annual peak SWE (PEAK.ERR) within ± 25%, and the bias in the timing of peak SWE (PDATE.ERR) ≤ 14 days.

$$NSE=1-\frac{{\sum }_{i=1}^{t}{({Y}_{i}-{O}_{i})}^{2}}{{\sum }_{i=1}^{t}{({O}_{i}-\bar{O})}^{2}}$$
(4)

where Oi and Yi are the observed and predicted daily SWE at day i, respectively; t is the total number of days for which model simulations were performed; \(\bar{O}\) is the observed mean daily SWE over the simulation period.

$${PEAK.ERR}={\sum }_{k=1}^{ny}\frac{({Y}_{k}^{P}-{O}_{k}^{P})}{{Y}_{k}^{P}}/ny$$
(5)

where \({O}_{k}^{P}\) and \({Y}_{k}^{P}\) are the observed and predicted peak SWE for the kth water year, respectively; ny is the total number of years of simulation.

$${PDATE.ERR}={\sum }_{k=1}^{ny}\left({Y}_{k}^{D}-{O}_{k}^{D}\right)/ny$$
(6)

where \({Y}_{k}^{D}\) and \({O}_{k}^{D}\) are Julian dates of observed and predicted peak SWE for the kth water year, respectively. After step (3), 58 stations with no qualified posterior parameter values were removed from subsequent analyses. Most of these stations are located in high-latitude areas of Montana and Wyoming, where model skill is challenged by the lack of representation of wind effects on snow redistribution, or the maritime Pacific Northwest where snowmelt tends to be sensitive to errors in modeled energy balance and precipitation partitioning when temperature is near freezing29; (4) for each cluster, final parameter values were calculated as the ensemble mean over the posterior parameter space of all stations within the cluster. For the Southern cluster (C3), no SNOTEL observations are available for parameter development. Because snowfall is very limited for most of the Southern cluster where snow parameters have negligible effects on extreme events, we used the parameter values of the maritime cluster for the Southern cluster given their commonality of warm winter. The cluster parameter values are presented in Table 1.

Table 1 Cluster snow parameters developed for the CONUS.

Driving mechanism of extreme events

For each location, the driving mechanism was determined for extreme W events with different durations and return periods. Table 2 presents the classification approach used for determining the mechanism based on subdaily information of P and SWE, which include:

  1. 1.

    Rainfall only (R): precipitation on snow-free ground;

  2. 2.

    Snowmelt only (M): decreasing SWE with no concurrent precipitation;

  3. 3.

    Rain-on-snow (ROS): decreasing SWE with concurrent precipitation. Given the interest in flood potential, a ROS event is further refined as one with at least 10 mm rainfall per day falling on a snowpack with at least 10 mm SWE over the selected duration, and the sum of rain and snowmelt contains at least 20% of snowmelt36,37,38.

Table 2 Classification of the driving mechanism of W extremes.

Annual maximum series resulting distinctively from each driving mechanism were determined, based on which IDF curves were developed following the same approach as described for NG-IDF. The mechanism producing the largest IDF value was identified as the dominant mechanism of extreme events.

Seasonality of extreme events

The seasonality of AMW at the 24-, 48- and 72-hour durations was represented by the seasonality index (SI) and the mean date (MD) relative to October 1st due to the use of water year. They were calculated using the circular statistics1,39,40, given by:

$$SI=\sqrt{{\bar{x}}^{2}+{\bar{y}}^{2}}$$
(7)
$$MD{=\tan }^{-1}(\bar{y}/\bar{x})\cdot 365/2\pi $$
(8)

where

$${\theta }_{i}=D\cdot 2{\rm{\pi }}/365$$
$$\bar{x}={\sum }_{i=1}^{n}\cos ({\theta }_{i})/n$$
$$\bar{y}=\mathop{\sum }\limits_{i=1}^{n}{\rm{\sin }}({\theta }_{i})/n$$

For a given water year denoted by i, D is the day of the AMF occurrence relative to October 1st (i.e., D = 1 if the event occurred on October 1st); n is the total number of water years used in the analysis. SI, ranging from 0 to 1, measures the temporal variability of the occurrence of events. A smaller SI suggests weaker seasonality, and the associated MD is therefore less reflective of the actual timing of the extreme events.

Data Records

The NG-IDF datasets41 are available to the public through an unrestricted repository at https://doi.org/10.5281/zenodo.5827028 in comma-separated value (.csv) format. Table 3 provides a summary of the folder structures, description of data files, output variables in each file and the format.

Table 3 Description of the NG-IDF datasets.

Technical Validation

The accuracy of estimated W and all related datasets depends primarily on the accuracy of daily SWE simulations given that P was observational (see Eq. 1). As there exists no data for direct evaluation of NG-IDF curves, we validated SWE simulations against daily SWE observations from 246 SNOTEL stations. Three model performance metrics that compare the simulated and observed SWE were applied: (1) NSE, (2) PEAK.ERR, and (3) PDATE.ERR, which measured the overall goodness-of-fit, peak SWE, and the timing of peak SWE, respectively. Model evaluations (Fig. 2) showed that NSE of daily SWE was greater than 0.6 at 75% of all stations, PEAK.ERR was within ± 25% at 67% of the stations, and PDATE.ERR was within two weeks at 67% of the stations. Overall, the simulations were able to reproduce the observed SWE dynamics at most stations using the cluster-based snow parameters.

Usage Notes

The NG-IDF datasets listed in Table 3, with no additional data analysis, can be used for a wide variety of applications over spatial scales ranging from local, regional to the CONUS scales. Overall, the presented estimates of extreme events and their characteristics based on long-term observational and simulation records are crucial for understanding flood potential, particularly for cold regions where infrastructure design risk exists from using PREC-IDF curves or NOAA Atlas 14.

For hydrologic engineering designs and analyses, one can obtain for any location(s) of interest the magnitude of extreme W events (Fig. 3a) and their dominant mechanism (Fig. 3c). Through comparing the NG-IDF and PREC-IDF values of extreme events for the same locations, one can determine the magnitude of bias or design risk related to the use of PREC-IDF (Fig. 3b). Information related to the flood seasonality is also key for understanding the generating mechanisms of floods and supporting future flood management. For instance, one can obtain the seasonality index (SI) and the mean timing of AMW events for any location of interest. For locations with a stronger seasonality (i.e., a higher SI), there is less inter-annual variability in the timing of AMW occurrence, and thus the mean date represents better the occurrence dates of AMW from year to year. At broader spatial scales, the datasets can be used for prioritizing locations for flood management and adaptation. As shown in Fig. 3, there is substantial spatial variability in the magnitude of 100-year 24-hour W events, which are typically higher in the ROS-dominated Pacific Northwest mountain ranges, and rain-dominated Gulf coastal plains. The dominant mechanism exhibits greater heterogeneity in topographically complex mountainous regions, and there is a shift in the dominant mechanism from ROS to rain or melt for events with a longer duration (72-hour versus 24-hour) (Fig. 3c,d). The presented datasets can also be used for estimating catchment-scale flood responses by coupling the NG-IDF curves with a rainfall-runoff model as demonstrated in prior work16.

Fig. 3
figure 3

Characteristics of extreme events over the CONUS based on the NG-IDF technique. (a) 100-year 24-h extreme events (unit: mm); (b) design risk associated with PREC-IDF indicated by the biases of PREC-IDF estimated 100-year 24-h event relative to NG-IDF (underdesign if <−25% and overdesign if >25%); (c) dominant mechanism (rain, ROS, or melt) of 24-h extreme events; (d) dominant mechanism of 72-h extreme events.

Lastly, it should be noted that the datasets are subject to a few limitations:

  1. (1)

    Diurnal variability is not represented in the precipitation data used here to force the snow model and construct IDF curves. As a result, our estimates of extreme events are limited to daily or longer durations. While the daily temporal resolution is mostly sufficient to capture snow-related extreme events, short-duration IDF curves based on high-resolution rainfall data are more appropriate for capturing short-duration extremes (e.g., flash floods) or floods in small catchments with fast response times (<24 hours).

  2. (2)

    Although generally smaller compared to sample data uncertainty8, the choice of probability distribution can contribute to uncertainties in estimates of extremes42. Depending on the usage, one can apply the distribution of choice or ensemble distributions for analysis of extreme W events using provided AMW datasets.

  3. (3)

    Greater uncertainties in NG-IDF estimates are expected for locations with lower snow model skill, such as the maritime Pacific Northwest and locations exposed to high winds during winter.

  4. (4)

    Stationarity assumption. Based on the Mann-Kendall test, about 10% of the CONUS shows a statistically significant trend in AMW with any duration of 24, 48, or 72 hours. Hence, the IDF curve estimates based on the stationary assumption are valid for about 90% of the CONUS. For the remaining locations showing a significant trend in AMW, we recommend applying a nonstationary approach for constructing NG-IDF curves. Among a variety of nonstationary approaches17,43,44, here we demonstrate the application of the Non-stationary Extreme Value Analysis (NEVA) Software45 to construct the NG-IDF curve based on the 24-hour AMW at a location in the Oregon State, where AMW has the highest positive trend of 1.12 mm/decade over the CONUS (Supplementary Fig. 1). In this particular case, the analysis suggests that the stationary assumption may lead to the underestimation of extreme events into the future. With the provided AMW series, one can develop nonstationary IDF curves using the approach of choice.

  5. (5)

    Consideration of land cover. The NG-IDF datasets provided in this study are developed for open conditions (as opposed to forest conditions). Given strong forest-snow interactions33 and their implications for streamflows30, incorporating different land cover types into the development of NG-IDF datasets is an undergoing research effort. The datasets presented here are used as the baseline for understanding the vegetation impacts or land use change (e.g., urbanization) impacts on NG-IDF curves.