An ensemble of bias-adjusted CMIP6 climate simulations based on a high-resolution North American reanalysis

Lavoie, Juliette; Bourgault, Pascal; Smith, Trevor James; Logan, Travis; Leduc, Martin; Caron, Louis-Philippe; Gammon, Sarah; Braun, Marco

doi:10.1038/s41597-023-02855-z

Data Descriptor
Open access
Published: 11 January 2024

Scientific Data volume 11, Article number: 64 (2024)

Abstract

ESPO-G6-R2 v1.0 is a set of statistically downscaled and bias-adjusted climate simulations based on the Coupled Model Intercomparison Project 6 (CMIP6) models. The dataset is composed of daily timeseries of three variables: daily maximum temperature, daily minimum temperature and daily precipitation. Data are available from 1950 to 2100 over North America. The simulation ensemble comprises 14 models driven by two emissions scenarios (SSP2-4.5 and SSP3-7.0). In this paper, we describe the workflow used for the bias-adjustment, which relies on the detrended quantile mapping method and the Regional Deterministic Reforecast System (RDRS) v2.1 reference dataset. Using the framework defined in the VALUE project, we show the improvements made by the bias-adjustment on marginal, temporal and multivariate aspects of the data. We also verify that the bias-adjusted climate data have a climate change signal similar to that of the original climate model simulations. Finally, we provide guidance to users on how to use this dataset.

Background & Summary

The need to adapt to climate change is present in a growing number of sectors, leading to an increase in the demand for authoritative, easily accessible, quality controlled climate information. In order to meet this growing demand and to support numerous Vulnerability, Impact, and Adaptation (VIA) studies, Ouranos, a Quebec-based climate change adaptation research consortium, produced a set of operational multipurpose bias-adjusted climate simulations referred to as Ensemble de Simulations Post-traitées d’Ouranos (ESPO). This paper presents the ESPO-G6-R2 v1.0 dataset. The “-G6” suffix refers to the sixth version of the Global (G) Coupled Model Intercomparison Project (CMIP)¹ models, used as inputs, while the “-R2” suffix refers to the Regional Deterministic Reforecast System (RDRS) v2.1 reanalysis², used as an observational reference for the bias-adjustment. The ESPO-G6-R2 dataset serves multiple purposes: it is needed internally for climate change adaptation projects, supports the organization’s climate portal Portraits Climatiques (https://portraits.ouranos.ca), and is additionally made available to collaborators and external users through the PAVICS (https://pavics.ouranos.ca) platform (a hosted JupyterLab analysis environment with an associated THREDDS data server). This dataset is an updated version of a previous dataset based on CMIP5 simulations³.

To build the ESPO-G6-R2 v1.0 dataset, the original CMIP6 simulations were statistically downscaled and bias-adjusted using a variant of the detrended quantile mapping method and the RDRS v2.1 dataset. This reanalysis product was created by Environment and Climate Change Canada (ECCC) using the Regional Deterministic Reforecast System (RDRS) to downscale the Global Deterministic Reforecast System (GDRS) initialized with ERA-Interim⁴. The system is also coupled with the Canadian Land Data Assimilation System (CaLDAS)^5,6,7 and Canadian Precipitation Analysis (CaPA)^8,9,10,11. RDRS v2.1 and, consequently, the ESPO-G6-R2 dataset, covers North America (Fig. 1) at a resolution of 0.09° on a rotated uniform latitude–longitude grid. While RDRS v2.1 provides data over both land and ocean, ESPO-G6-R2 adopts the same domain but was only produced over land with a narrow buffer along the coasts. Ocean grid points were masked prior to bias-adjustment. The dataset includes daily minimum temperature (tasmin), daily maximum temperature (tasmax) and daily precipitation (pr) for the 1950–2100 period for the emissions scenarios SSP2-4.5 and SSP3-7.0¹². Only original simulations produced with models having a Transient Climate Response (TCR) within the Intergovernmental Panel on Climate Change (IPCC)-defined likely range (1.4–2.2 °C¹³) were used to create the ensemble (Table 1), as was suggested by Hausfather et al.¹⁴. A single member per model is used to create the ensemble.

Table 1 Climate simulations included in ESPO-G6-R2 v1.0.

In this paper, we begin by describing the bias-adjustment method used to create this dataset. Next, we provide information on the data and how to acquire it. We subsequently validate the dataset, in part by following the VALUE project framework¹⁵. Finally, we provide recommendations for effective usage of the data.

Methods

Data extraction

The original CMIP6 simulations¹⁶ (Table 1) were downloaded from the Earth System Grid Federation (ESGF) archive using the synda Python library (now replaced by esgpull; https://esgf.github.io/esgf-download) and the North American domain was extracted. The reference dataset (RDRS v2.1 reanalysis²) was downloaded from CaSPAr (https://caspar-data.ca)¹⁷ under ECCC Data Servers End-use Licence (https://eccc-msc.github.io/open-data/licence/readme_en/). As we are not interested in data over the ocean, we selected all grid cells that have a sea area fraction less than one and we added a buffer of one grid cell along coastlines.
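The land selection described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used to build the dataset: the function name `land_mask_with_buffer`, the toy grid, and the choice of an 8-cell neighbourhood for the one-cell buffer are all assumptions made for the example.

```python
def land_mask_with_buffer(sea_fraction):
    """Return a boolean grid that is True where a cell is kept.

    A cell is kept if its sea area fraction is below 1, or if it touches
    such a cell (8-neighbourhood, an assumption of this sketch).
    """
    n_rows, n_cols = len(sea_fraction), len(sea_fraction[0])
    land = [[frac < 1.0 for frac in row] for row in sea_fraction]
    kept = [row[:] for row in land]
    for i in range(n_rows):
        for j in range(n_cols):
            if land[i][j]:
                continue
            # Look at the neighbours of an ocean cell for any land cell.
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n_rows and 0 <= jj < n_cols and land[ii][jj]:
                        kept[i][j] = True
    return kept


# Toy 4 x 5 grid: 1.0 is pure ocean, anything lower contains some land.
sea_fraction = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 0.6, 0.0],
    [1.0, 1.0, 1.0, 0.2, 0.0],
]
mask = land_mask_with_buffer(sea_fraction)
```

In this toy case the four cells containing land are kept, together with the ring of ocean cells that touch them; the remaining ocean cells are masked.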

Regridding

In order to perform the bias-adjustment, all simulation land and ocean cells were interpolated onto the masked RDRS v2.1 grid using the bilinear method. Because of the large difference in resolution between the simulations and RDRS v2.1, the regridding is done in cascade, from the original grid to a 1° regular grid, to a 0.5° regular grid, to the final RDRS v2.1 rotated 0.09° resolution grid.

Bias-adjustment

The ESPO-G6-R2 v1.0 bias-adjustment procedure uses a variant of the detrended quantile mapping method, as provided by xclim (https://xclim.readthedocs.io)¹⁸ and described by many authors^19,20. The procedure is univariate (applied to each variable individually), acts independently on the trends and the anomalies, and is applied iteratively on each day of the year as well as at each grid point.

Variables

Adjustments are applied separately for each of the three variables. Because adjusting tasmax and tasmin independently can lead to physical inconsistencies in the final data (i.e., cases with tasmin > tasmax^21,22), we instead applied the bias correction to the daily temperature range (or amplitude; dtr = tasmax - tasmin) in addition to tasmax and pr. tasmin was reconstructed after the bias-adjustment.
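As a small illustration of why adjusting dtr avoids inversions, the reconstruction step can be written in a few lines. The function name and the sample values are hypothetical; the only property relied upon is that the adjusted dtr is non-negative.

```python
def reconstruct_tasmin(tasmax_adj, dtr_adj):
    """Rebuild tasmin from bias-adjusted tasmax and dtr (same units)."""
    return [tx - d for tx, d in zip(tasmax_adj, dtr_adj)]


tasmax_adj = [271.3, 268.9, 275.0]  # K, illustrative values
dtr_adj = [8.2, 0.4, 11.5]          # K, non-negative after adjustment
tasmin_adj = reconstruct_tasmin(tasmax_adj, dtr_adj)
```

Because dtr is kept at or above zero by the multiplicative adjustment, every reconstructed tasmin is at most equal to the corresponding tasmax.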

While tasmax has no physical bounds in practice, this is not the case for pr and dtr which are bounded by zero. To apply this constraint in practice, the adjustment process is additive for tasmax and multiplicative for pr and dtr²³. The multiplicative approach prevents values from dropping below zero.

Grouping and calendar

The bias-adjustment is applied to each day of the year and each grid point independently. To render the procedure more robust, a window of 31 days centred on the current day of the year was used for the calibration (training step). For example, the adjustment for February 1 was calibrated using data from January 17 to February 16 (15 days on either side), over the 30 years of the reference period. In order to avoid having four times fewer data points for the 366th day of the year during leap years, we converted all inputs to a “noleap” calendar by removing data on the 29th of February. For simulations using the “360_day” calendar, the simulations were untouched, but the RDRS v2.1 data were converted to that calendar by removing days at regular intervals.
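The grouping can be illustrated with a short sketch that maps a date to its day of year on a “noleap” calendar and lists the days of the year falling inside a centred 31-day window. The helper names are hypothetical, and wrapping the window around the end of the year is an assumption of this example rather than a documented detail of the workflow.

```python
from datetime import date


def noleap_dayofyear(d):
    """Day of year (1..365) on a noleap calendar; February 29 is dropped."""
    if d.month == 2 and d.day == 29:
        raise ValueError("February 29 does not exist in the noleap calendar")
    doy = d.timetuple().tm_yday
    is_leap = d.year % 4 == 0 and (d.year % 100 != 0 or d.year % 400 == 0)
    if is_leap and d.month > 2:
        doy -= 1  # shift days after the removed February 29
    return doy


def window_days(doy, width=31, year_length=365):
    """Days of year inside a centred window, wrapping across the new year."""
    half = width // 2
    return [(doy - 1 + k) % year_length + 1 for k in range(-half, half + 1)]


feb1 = noleap_dayofyear(date(2000, 2, 1))
feb1_window = window_days(feb1)  # 15 days before to 15 days after
jan1_window = window_days(1)     # starts in late December of the previous year
```

With 30 years in the reference period, each day of the year is therefore calibrated on 30 × 31 = 930 values per grid point.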

Detrending

We first computed the averages (denoted by an overbar) and anomalies (denoted by a prime, ′) of the RDRS v2.1 reference data (Y_r) and regridded simulations over the 1989–2018 reference period (X_r), the most recent 30-year period in the RDRS v2.1 dataset, for each day of the year and each grid point. The subscripts indicate the period (in this case, reference). As mentioned above, anomalies are computed either additively or multiplicatively, depending on the variable:

$${Y}_{r}=\begin{cases}{\bar{Y}}_{r}+{Y}_{r}^{\prime} & \text{for tasmax}\\ {\bar{Y}}_{r}\cdot {Y}_{r}^{\prime} & \text{for dtr, pr}\end{cases}$$

(1)

and similarly for X, $\bar{X}$ and X′.

Instead of a simple moving mean, the simulation was detrended with a locally weighted regression (LOESS²⁴) over the full 1950–2100 simulation period (X_s). We chose this method for the slightly heavier weights given to the centre of the moving window, thus reducing the impacts of abrupt inter-annual changes on the trend and anomalies. It also performs better near the edge of the timeseries. The LOESS window had a 30-year width and a tricube shape, while the local regression was of degree 0 and only one robustness iteration was performed. The LOESS detrending was applied on each day of the year after averaging over a 31-day window, yielding the trend ${\bar{X}}_{s}$ and the residuals ${X}_{s}^{{\prime} }$. Here again, the process can be additive or multiplicative.
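To make the detrending step more concrete, here is a simplified sketch of a degree-0 local regression with tricube weights, which reduces to a weighted moving average. It is an illustration under stated assumptions, not the xclim implementation: it uses a fixed 30-point window, omits the robustness iteration and the 31-day averaging, and runs on a synthetic annual series.

```python
def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1.0 - u**3) ** 3 if u < 1.0 else 0.0


def loess_degree0(values, width):
    """Local constant fit with tricube weights (no robustness iteration)."""
    half = width / 2.0
    trend = []
    for i in range(len(values)):
        num = den = 0.0
        for j, v in enumerate(values):
            w = tricube((j - i) / half)
            num += w * v
            den += w
        trend.append(num / den)
    return trend


years = list(range(1950, 2101))
# Synthetic series: linear warming plus alternating year-to-year noise.
series = [0.03 * (y - 1950) + (0.5 if y % 2 else -0.5) for y in years]
trend = loess_degree0(series, width=30)
residuals_additive = [x - t for x, t in zip(series, trend)]
```

For a multiplicative variable, the residuals would be obtained by dividing the series by the trend instead of subtracting it.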

Adjustment of the residuals

With ${F}_{{Y}_{r}^{{\prime} }}$ and ${F}_{{X}_{r}^{{\prime} }}$ the empirical cumulative distribution functions (CDF) of ${Y}_{r}^{{\prime} }$ and ${X}_{r}^{{\prime} }$ respectively, an adjustment factor function was first computed:

$${A}_{+}(q):={F}_{{Y}_{r}^{{\prime} }}^{-1}\left(q\right)-{F}_{{X}_{r}^{{\prime} }}^{-1}\left(q\right)\quad \quad \quad {A}_{\times }(q):=\frac{{F}_{{Y}_{r}^{{\prime} }}^{-1}\left(q\right)}{{F}_{{X}_{r}^{{\prime} }}^{-1}\left(q\right)}$$

(2)

where q is a quantile (in range [0, 1]), A₊(q) is the additive function used with tasmax and A_×(q) the multiplicative function used with pr and dtr. The CDFs were estimated from the thirty (one for each year) 31-day windows. In the implementation, during the training step, maps of A were saved to disk by sampling q with 50 values, going from 0.01 to 0.99 by steps of 0.02. The adjustment for each day was then as follows:

$${X}_{s}^{* {\prime} }={X}_{s}^{{\prime} }+{A}_{+}\left({F}_{{X}_{r}^{{\prime} }}\left({X}_{s}^{{\prime} }\right)\right)\quad \quad \quad {X}_{s}^{* {\prime} }={X}_{s}^{{\prime} }\cdot {A}_{\times }\left({F}_{{X}_{r}^{{\prime} }}\left({X}_{s}^{{\prime} }\right)\right),$$

(3)

where ${X}_{s}^{* }$ is the bias-adjusted simulation over the simulation period (1950–2100). Nearest neighbour interpolation was used to map ${F}_{{X}_{r}^{{\prime} }}({X}_{s}^{{\prime} })$ to the 50 values of q. Constant extrapolation was used for values of ${X}_{s}^{{\prime} }$ outside the range of ${X}_{r}^{{\prime} }$.
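The additive case of Eqs. 2 and 3 can be sketched in plain Python as follows. The function names, the linear interpolation used for the empirical quantiles, and the synthetic anomalies are assumptions of this example; the 50 quantile nodes, the nearest-neighbour lookup and the constant extrapolation follow the description above.

```python
import bisect

# 50 quantile nodes: 0.01, 0.03, ..., 0.99
Q = [0.01 + 0.02 * k for k in range(50)]


def quantile(sorted_vals, q):
    """Linearly interpolated empirical quantile of an ascending list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def train_additive(ref_anom, sim_anom):
    """Adjustment factors A_+(q) at the 50 nodes (Eq. 2, additive case)."""
    ref, sim = sorted(ref_anom), sorted(sim_anom)
    return [quantile(ref, q) - quantile(sim, q) for q in Q]


def adjust_additive(x, sim_sorted, factors):
    """Apply Eq. 3 to one anomaly x of the simulation."""
    # Empirical CDF of the reference-period simulation, evaluated at x.
    rank = bisect.bisect_right(sim_sorted, x) / len(sim_sorted)
    # Nearest node; values outside the training range fall on the first or
    # last node, which gives constant extrapolation.
    k = min(range(len(Q)), key=lambda i: abs(Q[i] - rank))
    return x + factors[k]


# Synthetic case: the reference anomalies are twice as variable as the
# simulated ones, so the adjustment should roughly double each anomaly.
sim_anom = [i / 10 for i in range(-50, 51)]
ref_anom = [2.0 * v for v in sim_anom]
factors = train_additive(ref_anom, sim_anom)
sim_sorted = sorted(sim_anom)
adjusted = adjust_additive(3.0, sim_sorted, factors)
```

The multiplicative case has the same structure, with the difference in `train_additive` replaced by a ratio and the sum in `adjust_additive` replaced by a product.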

Adjustment of the trend

In the training step, a simple scaling or offset factor was computed from the averages:

$${C}_{+}={\bar{Y}}_{r}-{\bar{X}}_{r}\quad \quad \quad {C}_{\times }=\frac{{\bar{Y}}_{r}}{{\bar{X}}_{r}}$$

(4)

This factor was then applied to the trend in the adjustment step:

$${\bar{X}}_{s}^{* }={\bar{X}}_{s}+{C}_{+}\quad \quad \quad {\bar{X}}_{s}^{* }={\bar{X}}_{s}\cdot {C}_{\times }$$

(5)

Finally, the bias-adjusted timeseries ${X}_{s}^{* }$ for a given day of the year, grid point, and variable is simply the sum or product of these two terms:

$${X}_{s}^{* }={\bar{X}}_{s}^{* }+{X}_{s}^{* {\prime} }\quad \quad \quad {X}_{s}^{* }={\bar{X}}_{s}^{* }\cdot {X}_{s}^{* {\prime} }$$

(6)
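The trend adjustment and the final recombination (Eqs. 4 to 6) amount to a few arithmetic operations, sketched here for both modes. The function names and the numbers are purely illustrative; the inputs are assumed to be the LOESS trend of the simulation, the already adjusted residuals, and the reference-period averages of the reference and the simulation.

```python
def recombine_additive(trend_sim, resid_adj, mean_ref, mean_sim):
    """Offset the trend by C_+ (Eqs. 4-5), then add the residuals (Eq. 6)."""
    c_plus = mean_ref - mean_sim
    return [(t + c_plus) + r for t, r in zip(trend_sim, resid_adj)]


def recombine_multiplicative(trend_sim, resid_adj, mean_ref, mean_sim):
    """Scale the trend by C_x (Eqs. 4-5), then multiply by the residuals."""
    c_times = mean_ref / mean_sim
    return [(t * c_times) * r for t, r in zip(trend_sim, resid_adj)]


# tasmax-like example (K): the simulation is 2 K too warm on average.
tasmax_adj = recombine_additive(
    trend_sim=[280.0, 281.0, 283.0],
    resid_adj=[0.5, -1.0, 0.2],
    mean_ref=279.0,
    mean_sim=281.0,
)

# pr-like example (mm/d): the simulation is too dry by a factor of 1.5.
pr_adj = recombine_multiplicative(
    trend_sim=[2.0, 2.2],
    resid_adj=[0.5, 1.5],
    mean_ref=3.0,
    mean_sim=2.0,
)
```

Since the offset or scaling factor is constant in time, the long-term change of the simulated trend is carried over to the adjusted series.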

Pre-processing for multiplicative bias-adjustment

Two extra steps are added in the multiplicative adjustment procedure. First, it should be noted that the multiplicative mode is prone to division by zero, especially with precipitation where values of zero are quite common. This problem was resolved by modifying the inputs of the calibration step, where the zeros of precipitation were replaced by random values between zero (excluded) and 0.01 mm/d. Even though in dtr values close to 0 °C are rare, the dtr timeseries was also modified for values under 0.0001 °C with random values above 0 °C.
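This first pre-processing step could look like the sketch below. The function name and the fixed seed are hypothetical, and the upper bound used for dtr (0.0001 °C) is an assumption, since only the lower bound of the random values is stated above.

```python
import random


def jitter_small_values(values, below, upper, rng):
    """Replace values <= `below` by uniform random draws in (0, upper)."""
    out = []
    for v in values:
        if v <= below:
            u = rng.random()
            while u == 0.0:  # keep zero excluded
                u = rng.random()
            v = u * upper
        out.append(v)
    return out


rng = random.Random(0)  # fixed seed so that the example is reproducible

pr = [0.0, 3.2, 0.0, 0.0, 12.5]  # mm/d
pr_jittered = jitter_small_values(pr, below=0.0, upper=0.01, rng=rng)

dtr = [5.0, 0.00005, 7.3]  # degrees Celsius
dtr_jittered = jitter_small_values(dtr, below=0.0001, upper=0.0001, rng=rng)
```

Only the calibration inputs are modified in this way, so the denominators in the multiplicative adjustment factors are never exactly zero.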

Second, Themeßl et al.²⁵ observed that models having higher dry-day frequency than the reference (here, RDRS v2.1) can produce additive adjustment factors that map dry days to wet days, resulting in a wet bias. With a multiplicative adjustment and the injection of very small random values in the first pre-processing step, this problem transforms into the generation of large adjustment factors for those extra dry days, where a realistic (wet) RDRS v2.1 quantile is divided by a very small (dry) simulation quantile (see A_× in Eq. 2). These aberrant factors generate unphysical values in the adjustment procedure. To remove this bias, a second pre-processing step was applied. The frequency adaptation method, as proposed by Themeßl et al.²⁵, finds the fraction of “extra” dry days:

$$\Delta {P}_{dry}=\frac{{F}_{{X}_{r}}(D)-{F}_{{Y}_{r}}(D)}{{F}_{{X}_{r}}(D)}$$

(7)

where D is the dry-day threshold, taken here to be 1 mm/d. During the training step, a fraction ΔP_dry of dry days was transformed into wet days by injecting random values taken in the interval $\left[D,{F}_{{Y}_{r}}^{-1}\left({F}_{{X}_{r}}(D)\right)\right]$ (the precipitation value in RDRS v2.1 at the first quantile with precipitation in the simulation). Hence, adjustment factors are calculated separately for dry days and wet days. Note that, in the inverse case, where RDRS v2.1 would have a higher dry-day frequency than the models, the small number is in the numerator and only a few wet days are mapped to dry days. This is not a problem.
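A worked example of Eq. 7 on two short, made-up precipitation samples is given below. The function names are hypothetical, and treating a day as dry when precipitation is strictly below D is an assumption of this sketch.

```python
def dry_frequency(values, dry_threshold):
    """Empirical probability of a day with precipitation below the threshold."""
    return sum(v < dry_threshold for v in values) / len(values)


def extra_dry_fraction(sim, ref, dry_threshold=1.0):
    """Fraction of simulated dry days to turn into wet days (Eq. 7)."""
    f_sim = dry_frequency(sim, dry_threshold)
    f_ref = dry_frequency(ref, dry_threshold)
    if f_sim <= f_ref:
        return 0.0  # the inverse case needs no adaptation
    return (f_sim - f_ref) / f_sim


# Ten days each (mm/d): six dry days in the simulation, four in the reference.
sim = [0.0, 0.0, 0.2, 0.5, 0.0, 0.9, 2.0, 5.0, 1.5, 8.0]
ref = [0.0, 0.3, 0.0, 0.8, 1.2, 3.0, 6.0, 2.5, 1.1, 10.0]

delta_p_dry = extra_dry_fraction(sim, ref)
n_dry_sim = sum(v < 1.0 for v in sim)
n_to_convert = round(delta_p_dry * n_dry_sim)
```

Here ΔP_dry = (0.6 − 0.4)/0.6 = 1/3, so two of the six simulated dry days would receive a random wet value during training.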

Both pre-processing functions were applied only on the calibration step inputs (Y_r and X_r) before the division between average and anomalies (Eq. 1). As such, only the adjustment factors were impacted while there were no explicitly injected precipitation values in the final bias-adjusted simulations.

Data Records

The reference for the ESPO-G6-R2 dataset is https://doi.org/10.5281/zenodo.7877330²⁶. The dataset is available through a Thematic Real-time Environmental Distributed Data Services (THREDDS) server at the following link: https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/birdhouse/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.html.

The dataset is stored in netCDF files. Each netCDF file (1 GB) contains 4 years of data for one model, one variable and one emissions scenario. Loaded together they create the full dataset containing timeseries from 1950 to 2100 for daily maximum temperature, daily minimum temperature and daily precipitation over North America at a 0.09° resolution on a rotated uniform latitude–longitude grid. Information about the grid is included in the attributes of the rotated_pole coordinate of each file.

The ensemble analysed in this paper includes the climate models listed in Table 1 for emissions scenarios SSP2-4.5 and SSP3-7.0. We also provide data for additional models, including models which have a TCR outside the likely range defined in the latest IPCC report, as well as simulations following the emissions scenario SSP5-8.5. All CMIP6 models are made available under the CC-BY 4.0 License.

Technical Validation

Here, we present a succinct evaluation of ESPO-G6-R2 v1.0.

Health checks

First, health checks were conducted to verify that the datasets were free of unphysical values. The five checks are:

1. No negative pr
2. No tasmin above tasmax
3. No tasmax above 60 °C
4. No tasmin below −70 °C
5. No pr above 1650 mm/d

All the bias-adjusted simulations in the ensemble passed the first three checks. The fourth check on minimum temperature is based on the minimum recorded temperature in the northern hemisphere (−69.6 °C in Klinck, Greenland²⁷) and revealed some issues. As CMIP6 models have very large grid cells, regions that are considered land on the RDRS v2.1 grid might correspond to an ocean grid cell in the model. In regions with sea ice, this can lead to issues with dtr being very small over water and larger over ice. Because the quantile mapping methodology is not designed to address these non-linear effects, this can lead to very large dtr and translate into non-physical tasmin values (very close to 0 K). This problem was detected near the coasts of Alaska and Greenland for the BCC-CSM2-MR and GFDL-ESM4 models. Instances of tasmin smaller than 100 K (−173.15 °C) have been replaced by NaN values.

There are still a few extremely rare cases (6 × 10⁻⁶% of all data points) where the minimum temperature is lower than the observed threshold, although we note that this threshold is also exceeded in the original simulations. We do not replace these extremes by NaN values. The last check is based on the highest recorded 1-day precipitation in the Northern Hemisphere (1633.98 mm in Isla Mujeres, Mexico)^28,29. This threshold is only exceeded in 1 × 10⁻⁷% of the dataset.
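The five checks and the masking of unphysical tasmin values can be expressed compactly, as in the following sketch. The function names, the choice to ignore NaN values in the checks, and the sample values are assumptions made for illustration.

```python
import math


def mask_unphysical_tasmin(tasmin_kelvin, floor=100.0):
    """Replace tasmin values below 100 K by NaN."""
    return [math.nan if t < floor else t for t in tasmin_kelvin]


def health_checks(pr, tasmin, tasmax):
    """Run the five checks; pr in mm/d, temperatures in degrees Celsius."""

    def ok(values, test):
        return all(test(v) for v in values if not math.isnan(v))

    pairs = [
        (tn, tx)
        for tn, tx in zip(tasmin, tasmax)
        if not (math.isnan(tn) or math.isnan(tx))
    ]
    return {
        "no negative pr": ok(pr, lambda p: p >= 0.0),
        "tasmin <= tasmax": all(tn <= tx for tn, tx in pairs),
        "tasmax <= 60 degC": ok(tasmax, lambda t: t <= 60.0),
        "tasmin >= -70 degC": ok(tasmin, lambda t: t >= -70.0),
        "pr <= 1650 mm/d": ok(pr, lambda p: p <= 1650.0),
    }


tasmin_k = mask_unphysical_tasmin([250.15, 3.0, 198.15])  # K
tasmin_c = [t - 273.15 for t in tasmin_k]                 # NaN stays NaN
tasmax_c = [-15.0, -20.0, -60.0]
pr = [0.0, 4.2, 0.1]
report = health_checks(pr, tasmin_c, tasmax_c)
```

In this made-up sample, the 3 K value is masked, while the −75 °C value is kept and makes the fourth check fail, mirroring the rare cases that are left in the dataset.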

Inspection of the projected change

The goal of our bias-adjustment, which implicitly includes statistical downscaling, is to combine the climate change signal from the original simulation with the statistical information from the present climate at the fine spatial scale available in the reference dataset, RDRS v2.1. In this section, we validate that the bias adjustment did not impact the multimodel median projected changes in mean temperature and mean precipitation, whereas in the next section, we compare the bias-adjusted simulations with RDRS v2.1 for the present climate.

We start by inspecting the annual time series of the three variables, expressed as the difference with respect to the 1989–2018 climatology, for a grid cell located near Montreal, Canada. We show results for SSP3-7.0 only, but they are similar for SSP2-4.5. Figure 2 shows that the median change of the original regridded simulations (X_r) and the median change of the regridded and bias-adjusted simulations (${X}_{r}^{* }$) are similar, confirming that the climate change signal is not impacted by the bias-adjustment. From this figure, we can also see that the ensemble spread is not impacted by the bias adjustment, as that spread (90th percentile - 10th percentile) is similar in both ensembles. F-tests confirm that the variances of the original and bias-adjusted distributions are not statistically different.

We now extend this analysis to the entire domain. Figure 3 shows the difference between the 2071–2100 and 1989–2018 climatological means for both the original and the bias-adjusted ensemble. For the bias-adjusted simulations (Fig. 3d,e), the average warming is 4.1 °C for tasmax and 4.3 °C for tasmin, with the northern region warming more than the southern region. The spread of the ensemble is also larger in the north (Fig. 4). These results are consistent with the signal obtained in the original simulations (Figs. 3a,b, 4a,b).

For precipitation, the bias-adjusted simulations project an increase in precipitation over most of the domain, except for some small regions across the Caribbean islands and southern Mexico, where models do not agree on the sign of the change (Fig. 3). Here, Approach B of the IPCC report is used to define model agreement (Cross-Chapter Box Atlas.1), wherein agreement is found if at least 80% of the models have the same sign for a given projected change³⁰. This is similar to the original simulations, except that the region without model agreement expands slightly. As noted previously, the spread of the ensemble is similar in the original and bias-adjusted simulations (Fig. 4c,f).
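The 80% agreement criterion is easy to state in code. The following sketch applies it to the projected changes of a 14-member ensemble at a single grid cell; the function name and the numbers are invented for the example.

```python
def sign_agreement(changes, threshold=0.8):
    """True if at least `threshold` of the models share the sign of the change."""
    n_pos = sum(c > 0 for c in changes)
    n_neg = sum(c < 0 for c in changes)
    return max(n_pos, n_neg) / len(changes) >= threshold


# Projected precipitation change (mm/d) for 14 models at one grid cell.
changes_agree = [0.1] * 12 + [-0.05] * 2   # 12 of 14 models are positive
changes_split = [0.1] * 10 + [-0.05] * 4   # 10 of 14 models are positive

agree = sign_agreement(changes_agree)
split = sign_agreement(changes_split)
```

With 14 models, the criterion requires at least 12 models of the same sign, since 11/14 is below 0.8.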

In general, the climate change signal is well-preserved in the bias-adjusted simulations. Nevertheless, since this dataset was generated using GCMs with spatial resolutions coarser or equal to 100 km, it is still essential to exercise caution when interpreting climate change patterns at finer scales.

Evaluation Through VALUE Diagnostics

In this section, we evaluate the capacity of our bias-adjustment method to improve the representation of the RDRS v2.1 reference climate with respect to the original simulations using the diagnostic framework developed in the VALUE project¹⁵. Each diagnostic is based on a property (called “indices” in the VALUE project) and a measure. Properties evaluate a dataset’s statistical characteristics by collapsing the time axis and are divided into three aspects: marginal (related to the distribution), temporal (related to the annual cycle, spells and transitions) and multivariate (related to the relation between variables). Measures evaluate the differences in a given property between two datasets.

We compute properties for RDRS v2.1 (Y_d), the original regridded simulations (X_d), and the regridded and bias-adjusted simulations (${X}_{d}^{* }$) over the 1981–2010 diagnostic (d) period. Measures are then calculated between the RDRS v2.1 reference and the original simulation, as well as between the RDRS v2.1 reference and the bias-adjusted simulation. The complete list of computed diagnostics is provided in Table 2. More details on the implementation of the properties and measures, including references and the code used to compute them, can be found within the modules xclim.sdba.properties and xclim.sdba.measures.

Table 2 Diagnostics used to assess the performance of ESPO-G6-R2 v1.0, including the full name of the property, the short name used in figures, the variable(s) on which the property is calculated, the measure associated with the property, and the aspect evaluated by the property.

The diagnostics are computed using daily time series over the 1981–2010 diagnostic period for each model. Note that the diagnostic period is different from the reference training period (1989–2018) in order to maximize the number of independent years used for validation. The diagnostics shown here are calculated over a subregion (green contour in Fig. 1) in order to reduce the computational load and to focus on the region of interest for the majority of Ouranos’ stakeholders. We will refer to this region as Magtogoek, after the Algonquin word for the St-Lawrence River^31,32. Additionally, to increase confidence in the dataset for the rest of the domain, we compute diagnostics on three smaller regions with distinct climates and show the results in the Supplementary Information. The regions are labelled Tlicho, Cree and Ute after traditional native territories³³ and are shown with blue contours on Fig. 1.

In order to summarize the analysis across all models, emissions scenarios and properties, Fig. 5 shows the fraction of improved grid cells (IMP) over the Magtogoek region. IMP is calculated as the fraction of grid cells that results in a better measure in the bias-adjusted simulation (e.g. Figure 6e) compared to the original simulation measure (e.g. Figure 6c), which means either a smaller bias or a ratio closer to 1. IMP is defined as

$$IMP=\frac{1}{N}\sum _{i,j}{I}_{i,j}\quad \quad \text{where}\quad \quad {I}_{i,j}=\begin{cases}\begin{cases}1 & \text{if }\left|{M}_{i,j}^{sim}\right| > \left|{M}_{i,j}^{scen}\right|\\ 0 & \text{if }\left|{M}_{i,j}^{sim}\right| < \left|{M}_{i,j}^{scen}\right|\end{cases} & \text{if M is a bias}\\ \begin{cases}1 & \text{if }\left|{M}_{i,j}^{sim}-1\right| > \left|{M}_{i,j}^{scen}-1\right|\\ 0 & \text{if }\left|{M}_{i,j}^{sim}-1\right| < \left|{M}_{i,j}^{scen}-1\right|\end{cases} & \text{if M is a ratio}\end{cases}$$

(8)

where M is the measure of the bias between the original (sim) or bias-adjusted (scen) simulation and the RDRS v2.1 reference and N is the number of grid cells (i, j) in the region. The advantage of this method is that it allows for the comparison of all properties using the same unit of measurement. Figure 5 shows that ESPO-G6-R2 v1.0 provides an improvement over the original simulations for most properties, as evidenced by the majority of values above 0.5. It is important to note that the underlying assumption of this analysis is that the RDRS v2.1 reference reflects the “ground truth”. As such, any biases that might be present in this dataset are passed along to the bias-adjusted simulations and are undetectable with this analysis. The quality of RDRS v2.1 itself is beyond the scope of this paper. We encourage users of the dataset to verify the performance of RDRS v2.1 for their region and application.
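Eq. 8 can be illustrated with a small sketch operating on flattened lists of grid-cell measures. The function names and the sample values are hypothetical; the first argument is taken to be the measure of the original simulation and the second that of the bias-adjusted simulation.

```python
def is_improved(m_sim, m_scen, kind):
    """True if the bias-adjusted measure is better than the original one."""
    if kind == "bias":
        return abs(m_sim) > abs(m_scen)          # closer to 0 is better
    if kind == "ratio":
        return abs(m_sim - 1) > abs(m_scen - 1)  # closer to 1 is better
    raise ValueError("kind must be 'bias' or 'ratio'")


def imp(sim_measures, scen_measures, kind):
    """Fraction of improved grid cells (Eq. 8)."""
    flags = [is_improved(a, b, kind) for a, b in zip(sim_measures, scen_measures)]
    return sum(flags) / len(flags)


# Four grid cells, measure expressed as a bias.
imp_bias = imp([2.0, -1.5, 0.3, 1.0], [0.5, -0.2, 0.4, -0.1], kind="bias")

# Two grid cells, measure expressed as a ratio.
imp_ratio = imp([1.4, 0.7], [1.1, 0.6], kind="ratio")
```

In the bias example three of the four cells improve, and in the ratio example one of the two cells improves, because 0.6 is farther from 1 than 0.7.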

The following sections further elaborate on the performance of the three aspects evaluated (marginal, temporal, multivariate). We show the maps of three properties for one model as illustrative examples. The results are not exactly the same for every model, but the general conclusion and explanations remain the same. The maps also demonstrate that the fine-scale spatial features are better represented in the bias-adjusted dataset. We still note that there can be inflation in the dataset as is the case for most quantile mapping methods that rely on a simple spatial downscaling³⁴. This means that a day of extreme precipitation in a GCM grid cell will lead to more extreme precipitation in all finer grid cells of the bias-adjusted simulation that were contained in the original grid cell, instead of only in a few grid cells. The Supplementary Information shows the impact on one grid cell as an example. We recommend caution using this dataset if spatial features of daily precipitation have a high impact on the user’s application. The use of regional climate models as the input instead of GCMs could help alleviate this problem as the resolution difference would be less important.

Finally, for each property, we compute the root-mean-square error (RMSE) between the RDRS v2.1 reference and the original simulation, as well as between the RDRS v2.1 reference and the bias-adjusted simulation, to help users better assess the error before and after bias-adjustment. Table 3 shows the model-average RMSE for each property and region. The RMSEs for the bias-adjusted simulations are nearly always smaller than for the original simulations.

Table 3 Ensemble mean RMSEs for each property and each region (Magtogoek, Tlicho, Ute and Cree) for SSP3-7.0 between the RDRS v2.1 reference and the original simulation (OR) and between the RDRS v2.1 reference and the bias-adjusted simulation (BA).

Marginal aspect

As expected, the detrended quantile mapping method performs much better than the original simulation for the marginal aspects of tasmax and pr. This is not surprising since this method adjusts each quantile separately. To illustrate, we show in Fig. 6 that the bias in the 95th percentile has been near-completely removed in the bias-adjusted simulation. This matches the corresponding model average IMP of 88%. There are also large improvements in RMSE.

The results are slightly different for tasmin, which was not adjusted directly. Indeed, in order to avoid temperature inversions (tasmax < tasmin), we adjusted dtr and reconstructed tasmin in a subsequent step. As a consequence, small errors from both dtr and tasmax can accumulate and degrade the representation of extremes in tasmin. Figure 7 shows that the bias-adjusted simulations (d) reproduce the spatial pattern of the RDRS v2.1 reference much better than the original simulation (b), even though there is a cold bias and, for about half the grid cells, the values of the original simulation are closer to RDRS v2.1 than the bias-adjusted values (see Fig. 5). We can also see that the RMSE is smaller for the bias-adjusted simulation than for the original. This result is representative of tasmin’s marginal properties in general.

Temporal aspect

Because the bias-adjustment method is applied to each day of the year independently, we expect the bias-adjusted simulations to accurately reproduce the RDRS v2.1 annual cycle. The average IMP of the amplitude of the annual cycle of maximum temperatures (aca_tasmax) is 93% over all models. There are also large improvements in the RMSE. For the relative amplitude of precipitation (aca_pr), the average IMP decreases to 72%. This difference can be explained by a weaker annual cycle in some regions compared to temperature.

On the other hand, the properties measuring sequences of days have not been explicitly corrected, but most of them nonetheless still perform reasonably well compared to the original simulations, with an average IMP of 81% for maximum length of warm spell, 73% for maximum length of dry spell, and 85% for wet-wet transition. The property with the lowest IMP was the dry-wet transition with 62%. Figure 8 shows that, for this property, there was very little change between the original and bias-adjusted simulations. This might be due to the second pre-processing step of bias-adjustment, which adapts the frequency of dry days. We recommend that users interested in dry-wet transitions use the original simulations directly as the RMSE is already very small.

Multivariate aspect

Our bias-adjustment method is univariate, in the sense that each variable is corrected independently. However, the workflow for each variable is not completely independent, as tasmin is reconstructed from tasmax and dtr. This could explain in part the mean IMP of 93% for the correlation between tasmax and tasmin. That said, the IMP of the correlation between tasmax and pr is also high (89%), even though they were not corrected together. From the RMSE, we can see that the bias-adjustment helps both properties, but the tasmax and tasmin correlation was already well represented in the original simulation.

Usage Notes

The dataset is available through a THREDDS Data Server (TDS). NetCDF files can be downloaded through the link provided above with the http server access. As the dataset contains many netCDF files with only 4 years of data and one variable, an easier way to access the data is through NcMLs: https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.html. NcMLs are aggregations of netCDF files that can be accessed using xarray (https://docs.xarray.dev)³⁵ via the OPeNDAP protocol. A general workflow might look like:

1. Select an NcML,
2. Select the OPeNDAP access,
3. Copy-paste the data URL (url) into your access call:

import xarray
ds = xarray.open_dataset(url, chunks=dict(time=1460, rlat=50, rlon=50))

We recommend using the xclim (https://xclim.readthedocs.io)¹⁸ and xscen (https://xscen.readthedocs.io)³⁶ packages to perform further analysis on the data, including data validation/quality assurance, computing indicators, climatologies, projected change, and ensemble statistics. Additional examples of data analysis are available on PAVICS (https://pavics.ouranos.ca) and on the ESPO-G GitHub repository (https://github.com/Ouranosinc/ESPO-G). For less technical users, a series of indicators computed from the ESPO-G6-R2 dataset over Quebec can be consulted on the Ouranos Portraits Climatiques website (https://portraits.ouranos.ca).

It is important to note that this dataset is meant to serve multiple purposes rather than for a specific application. Other bias-adjustment methods may be better suited for a given use case. It is the responsibility of the user to verify that the dataset best reflects their needs and their region of interest.

Code availability

The code to reproduce the ESPO-G6-R2 dataset and the figures from this paper is available in the release ESPO-G6-R2 v1.0.0 (https://github.com/Ouranosinc/ESPO-G/releases/tag/ESPO-G6-R2v1.0.0) of the ESPO-G GitHub repository (https://github.com/Ouranosinc/ESPO-G). The code works with xclim version 0.41.0 and xscen version 0.5.13.

References

Eyring, V. et al. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development 9, 1937–1958, https://doi.org/10.5194/GMD-9-1937-2016 (2016).
Gasset, N. et al. A 10 km North American precipitation and land surface reanalysis based on the GEM atmospheric model. Hydrology and Earth System Sciences https://doi.org/10.5194/hess-2021-41 (2021).
Logan, T., Gauvin St-Denis, B. & Bourgault, P. cb-oura-1.0: Generic climate scenarios from bias-adjusted CMIP5 global models. Zenodo https://doi.org/10.5281/zenodo.7682788 (2018).
Dee, D. P. et al. The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Quarterly Journal of the Royal Meteorological Society 137, 553–597, https://doi.org/10.1002/QJ.828 (2011).
Brasnett, B. A global analysis of snow depth for numerical weather prediction. Journal of Applied Meteorology 38, 726–740, https://doi.org/10.1175/1520-0450(1999)038<0726:agaosd>2.0.co;2 (1999).
Balsamo, G. et al. ERA-Interim/Land: a global land surface reanalysis data set. Hydrology and Earth System Sciences 19, 389–407, https://doi.org/10.5194/hess-19-389-2015 (2015).
Carrera, M. L., Bélair, S. & Bilodeau, B. The Canadian land data assimilation system (CaLDAS): Description and synthetic evaluation study. Journal of Hydrometeorology 16, 1293–1314, https://doi.org/10.1175/jhm-d-14-0089.1 (2015).
Mahfouf, J.-F., Brasnett, B. & Gagnon, S. A Canadian precipitation analysis (CaPA) project: Description and preliminary results. Atmosphere-Ocean 45, 1–17, https://doi.org/10.3137/ao.v450101 (2007).
Lespinas, F., Fortin, V., Roy, G., Rasmussen, P. & Stadnyk, T. Performance evaluation of the Canadian precipitation analysis (CaPA). Journal of Hydrometeorology 16, 2045–2064, https://doi.org/10.1175/jhm-d-14-0191.1 (2015).
Fortin, V., Roy, G., Donaldson, N. & Mahidjiba, A. Assimilation of radar quantitative precipitation estimations in the Canadian precipitation analysis (CaPA). Journal of Hydrology 531, 296–307, https://doi.org/10.1016/j.jhydrol.2015.08.003 (2015).
Fortin, V. et al. Ten years of science based on the Canadian precipitation analysis: A CaPA system overview and literature review. Atmosphere-Ocean 56, 178–196, https://doi.org/10.1080/07055900.2018.1474728 (2018).
Riahi, K. et al. The shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: An overview. Global Environmental Change 42, 153–168, https://doi.org/10.1016/J.GLOENVCHA.2016.05.009 (2017).
Forster, P. et al. The Earth’s energy budget, climate feedbacks, and climate sensitivity. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. https://doi.org/10.1017/9781009157896.009 (2021).
Hausfather, Z., Marvel, K., Schmidt, G. A., Nielsen-Gammon, J. W. & Zelinka, M. Climate simulations: recognize the ‘hot model’ problem. Nature 605, 26–29, https://doi.org/10.1038/d41586-022-01192-2 (2022).
Maraun, D. et al. VALUE: A framework to validate downscaling approaches for climate change studies. Earth’s Future 3, 1–14, https://doi.org/10.1002/2014EF000259 (2015).
Petrie, R. et al. Coordinating an operational data distribution network for CMIP6 data. Geoscientific Model Development 14, 629–644, https://doi.org/10.5194/GMD-14-629-2021 (2021).
Mai, J. et al. The Canadian Surface Prediction Archive (CaSPAr): A platform to enhance environmental modeling in Canada and globally. Bulletin of the American Meteorological Society 101, E341–E356, https://doi.org/10.1175/BAMS-D-19-0143.1 (2020).
Bourgault, P. et al. xclim: xarray-based climate data analytics. Journal of Open Source Software 8, 5415, https://doi.org/10.21105/joss.05415 (2023).
Gennaretti, F., Sangelantoni, L. & Grenier, P. Toward daily climate scenarios for Canadian Arctic coastal zones with more realistic temperature-precipitation interdependence. Journal of Geophysical Research: Atmospheres 120, 11,862–11,877, https://doi.org/10.1002/2015JD023890 (2015).
Cannon, A. J., Sobie, S. R. & Murdock, T. Q. Bias correction of GCM precipitation by quantile mapping: How well do methods preserve changes in quantiles and extremes? Journal of Climate 28, 6938–6959, https://doi.org/10.1175/JCLI-D-14-00754.1 (2015).
Thrasher, B., Maurer, E. P., McKellar, C. & Duffy, P. B. Technical note: Bias correcting climate model simulated daily temperature extremes with quantile mapping. Hydrology and Earth System Sciences 16, 3309–3314, https://doi.org/10.5194/HESS-16-3309-2012 (2012).
Agbazo, M. N. & Grenier, P. Characterizing and avoiding physical inconsistency generated by the application of univariate quantile mapping on daily minimum and maximum temperatures over Hudson Bay. International Journal of Climatology 40, 3868–3884, https://doi.org/10.1002/JOC.6432 (2020).
Li, H., Sheffield, J. & Wood, E. F. Bias correction of monthly precipitation and temperature fields from Intergovernmental Panel on Climate Change AR4 models using equidistant quantile matching. Journal of Geophysical Research: Atmospheres 115, 10101, https://doi.org/10.1029/2009JD012882 (2010).
Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74, 829–836, https://doi.org/10.1080/01621459.1979.10481038 (1979).
Themeßl, M. J., Gobiet, A. & Heinrich, G. Empirical-statistical downscaling and error correction of regional climate models and its impact on the climate change signal. Climatic Change 112, 449–468, https://doi.org/10.1007/s10584-011-0224-4 (2012).
Lavoie, J. et al. ESPO-G6-R2: Ensemble de scénarios polyvalents d’Ouranos - modèles globaux CMIP6 - RDRS v2.1/Ouranos multipurpose climate scenarios - global models CMIP6 - RDRS v2.1. Zenodo https://doi.org/10.5281/ZENODO.7877330 (2023).
Weidner, G. et al. WMO evaluation of northern hemispheric coldest temperature: −69.6 °C at Klinck, Greenland, 22 December 1991. Quarterly Journal of the Royal Meteorological Society 147, 21–29, https://doi.org/10.1002/QJ.3901 (2021).
World Meteorological Organization’s World Weather & Climate Extremes Archive. Northern hemisphere: Greatest twenty-four-hour (1 day) rainfall. https://wmo.asu.edu/content/northern-hemisphere-greatest-twenty-four-hour-1-day-rainfall (Accessed on October 5th, 2023).
Pasch, R. J., Blake, E. S., Cobb, H. D. I. & Roberts, D. P. Tropical cyclone report: Hurricane Wilma. https://www.nhc.noaa.gov/data/tcr/AL252005_Wilma.pdf (2006).
Gutiérrez, J. et al. Atlas. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press. 1927–2058 https://doi.org/10.1017/9781009157896.021 (2021).
Chassé, S., Bélanger, M. Gens du pays, gens du fleuve. Cap-aux-Diamants 26–30 https://id.erudit.org/iderudit/7361ac (2003).
Adam, A., Hatvany, M. G. Un générique unique: analyse identitaire, historique et toponymique de la notion de “fleuve” au Québec. https://dam-oclc.bac-lac.gc.ca/download?id=99cd6e56-25ff-45d8-a1f1-55857e0a7881&fileName=34285.pdf (2018).
Native Land Digital. Our home on native land. https://native-land.ca/ (2023).
Maraun, D. Bias Correction, Quantile Mapping, and Downscaling: Revisiting the Inflation Issue. Journal of Climate 26, 2137–2143, https://doi.org/10.1175/JCLI-D-12-00821.1 (2013).
Hoyer, S. & Hamman, J. xarray: N-D labeled arrays and datasets in Python. Journal of Open Research Software 5, https://doi.org/10.5334/jors.148 (2017).
Rondeau-Genesse, G. et al. Ouranosinc/xscen: v0.6.0 https://doi.org/10.5281/zenodo.7897543 (2023).
Dix, M. et al. CSIRO-ARCCSS ACCESS-CM2 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.2285 (2019).
Ziehn, T. et al. CSIRO ACCESS-ESM1.5 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.2291 (2019).
Xin, X. et al. BCC BCC-CSM2MR model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.1732 (2019).
Seferian, R. CNRM-CERFACS CNRM-ESM2-1 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.1395 (2019).
Lovato, T., Peano, D. & Butenschön, M. CMCC CMCC-ESM2 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.13168 (2021).
Li, L. CAS FGOALS-g3 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.2056 (2019).
John, J. G. et al. NOAA-GFDL GFDL-ESM4 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.1414 (2018).
Volodin, E. et al. INM INM-CM5-0 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.12322 (2019).
Byun, Y.-H. et al. NIMS-KMA KACE1.0-G model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.2242 (2019).
Shiogama, H., Abe, M. & Tatebe, H. MIROC MIROC6 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.898 (2019).
Schupfner, M. et al. DKRZ MPI-ESM1.2-HR model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.2450 (2019).
Schupfner, M. et al. DKRZ MPI-ESM1.2-LR model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.15349 (2021).
Yukimoto, S. et al. MRI MRI-ESM2.0 model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.638 (2019).
Seland, Y. et al. NCC NorESM2-LM model output prepared for CMIP6 ScenarioMIP https://doi.org/10.22033/ESGF/CMIP6.604 (2019).

Acknowledgements

We acknowledge the World Climate Research Programme (WCRP; https://www.wcrp-climate.org/), which, through its Working Group on Coupled Modelling, coordinated and promoted CMIP6. We thank the climate modelling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP6 and ESGF. We acknowledge Environment and Climate Change Canada (ECCC) as the source of the RDRS dataset. This article was made possible thanks to funding from the Gouvernement du Québec’s Ministère de l’Environnement, de la Lutte contre les changements climatiques, de la Faune et des Parcs (MELCCFP), through the 2030 Plan for a Green Economy (https://www.quebec.ca/en/government/policies-orientations/plan-green-economy). We want to thank Milena Dimitrijevic, Vincent Fortin, and Dikra Khedhaouiria from ECCC for their presentation on RDRS and for providing data for the land-sea mask.

Author information

These authors contributed equally: Juliette Lavoie, Pascal Bourgault.

Authors and Affiliations

Ouranos, Montreal, H3A 1B9, Canada
Juliette Lavoie, Pascal Bourgault, Trevor James Smith, Travis Logan, Martin Leduc, Louis-Philippe Caron, Sarah Gammon & Marco Braun

Contributions

All authors participated in discussions for the project. T.L. supervised the project. P.B., T.J.S. and T.L. conceived the method. T.J.S. handled the downloading of the raw data. Based on P.B.'s scripts, J.L. created and ran the workflow to build the dataset. J.L. created the figures and the code to analyze the data. J.L. led the writing of the manuscript based on reports from P.B. and with help from M.L. and L.-P.C. To create part of the dataset, the workflow was also run by S.G. and M.B. All authors reviewed the manuscript.

Corresponding author

Correspondence to Juliette Lavoie.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lavoie, J., Bourgault, P., Smith, T.J. et al. An ensemble of bias-adjusted CMIP6 climate simulations based on a high-resolution North American reanalysis. Sci Data 11, 64 (2024). https://doi.org/10.1038/s41597-023-02855-z

Received: 20 June 2023
Accepted: 12 December 2023
Published: 11 January 2024
DOI: https://doi.org/10.1038/s41597-023-02855-z
