Introduction

Global warming will result in an intensification of the water cycle1. An increase in rainfall extremes has already been observed in many regions of the world2,3,4,5,6, and research shows that extremes will increase further in the future, depending on the emission scenario7,8,9,10,11. Global climate models (GCMs) are the only available tools to study future daily rainfall extremes on the global domain, but they come with limitations. A major limitation is that GCMs do not resolve convective processes, which are important drivers of extreme precipitation12,13. Recent research demonstrates that GCMs included in the Coupled Model Intercomparison Project Phase 6 (CMIP6)14 have decent skill in modelling extreme rainfall in comparison to observations15. Yet, when absolute magnitudes or specific locations are of interest, a careful selection of models based on observations or advanced bias correction approaches is necessary16,17,18; these steps are less relevant when studying relative changes over time.

Studies investigating the simulation of rainfall extremes in global climate models typically focus on one of two types of extremes: (1) common and (2) rare. Climate indices focusing on “common” extremes typically have return periods of a year or less. Examples of such indices include annual maxima5,19, a percentile-based threshold, e.g. the 90th, 95th, 99th, or 99.9th percentile20,21,22, or indices like R20mm (the number of days per year in which precipitation depth exceeds 20 mm) as defined by “the expert team on climate change detection and indices”8,23,24. These indices are well studied on global and regional domains, and many regions expect a substantial increase in such common extremes9,19,20,22,23,25. The second type of extremes are the “rare” ones with multi-year or multi-decade return periods, which are important for infrastructure design10. In hydrology these are typically estimated based on extreme value theory (using a historical time series of the same location), but model-based (e.g.26,27) or spatial pooling-based approaches (e.g.28,29) also exist to increase the time series length. There are fewer studies on the effect of global warming on such rare extremes30,31,32, or on the differences in future changes between “common” and “rare” extremes15,27. The latter studies point to a possibly larger relative increase of the rare extremes.

The scientific debate regarding the effect of global warming on rainfall extremes has not yet fully addressed this difference in the expected change for common and rare extremes, nor whether it varies between climatic regions across the world. Here, we investigate the spatiotemporal patterns of a range of common to rare extremes using a large ensemble of precipitation estimates from the GCMs included in CMIP614.

We analyse the simulations of daily precipitation of 25 CMIP6 GCMs for both the historical late twentieth century period (1971–2000, referred to as “historical”) and the future late twenty-first century period (2071–2100, referred to as “future”) forced by four different scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0 and SSP5-8.5)33. To achieve robust multi-model ensemble statistics it is important to account for (a) model independence, i.e. the fact that some GCMs originate from similar development branches or share components, and (b) model performance, i.e. the models’ ability to simulate historical climate34. Here we use the Climate model Weighting by Independence and Performance method35,36,37,38,39 to weight models by both independence and performance (see “Methods” for details). The multi-model ensemble means shown in this manuscript are weighted using the described method, but we note that using unweighted estimates does not considerably affect the results and conclusions drawn in this study (see Supplementary Figs. S1–S3).

Extreme precipitation return levels ranging from common to rare were estimated using frequency and extreme value analyses (see “Methods”). We found the results to be largely independent of the statistical method, but here we mainly show the results obtained by using the metastatistical extreme value (MEV) distribution40,41 as it produces the smoothest spatial patterns42,43 and reduces uncertainty for the rarest extremes40.

This study is focused on relative changes in precipitation extremes in order to overcome the issues of systematic bias and different climate model resolutions. Moreover, relative changes allow for comparison between geographical regions with highly different precipitation amounts. As a reference for absolute values, we show the weighted mean precipitation depth for a precipitation event that would occur on average once every 100 years in Fig. 1 (see Supplementary Figs. S4 and S5 for the individual models). The highest model agreement is shown over the higher latitudes and arid regions, the lowest over the tropics, which most models have issues simulating correctly10,23,44 (see Supplementary Fig. S6).

Results

Future precipitation extremes for the climate scenarios are expected to increase in magnitude over land compared to historical extremes (Fig. 2 and Supplementary Figs. S7–S9). This increase has high model agreement, irrespective of the climate scenario or how rare the extremes are. Regions with the largest magnitude increase in future extremes are mainly located in areas around and just north of the equator, stretching from the Equatorial Pacific Ocean, via northwest South America, through the Sahara and Western, Central and Eastern Africa, the Arabian Peninsula and Arabian Sea to South Asia and the Tibetan Plateau. There are some locations over the subtropical Atlantic and South Pacific oceans where extreme precipitation is expected to decrease in the future, though more so for the common return levels. This is in agreement with findings by Pfahl et al.45, who demonstrated that the dynamic contribution of daily precipitation over subtropical oceans causes robust regional decreases in extreme precipitation. These regional patterns of increasing and decreasing precipitation extremes are similar to those of Li et al.15 (their Fig. 5). Furthermore, the areas with the largest increases and largest decreases overlap with the areas with the most positive and most negative scaling with dew point temperature46 (their Fig. 4).

The rarest precipitation extremes (i.e. the blue squares in Fig. 3) will increase relatively more than the more common ones (i.e. the green triangles in Fig. 3). As expected, all individual models predict a global median magnitude increase in extreme precipitation for each of the four SSPs. However, the finding that this increase is relatively larger for rarer extremes is, to the best of our knowledge, the novel part of the results. Technically, this implies that the tails of extreme value distributions42 become heavier in a future climate. This behaviour is consistent at the global domain across all 25 CMIP6 models analysed (Fig. 3) and statistically significant for all SSP scenarios (Supplementary Table S1), as well as for other statistical methods (Supplementary Figs. S10 and S11) and other GCM realisations (Supplementary Figs. S12–S22). Furthermore, these findings are statistically significant for most individual GCMs, particularly for the high emission scenarios (21 out of 25 GCMs for SSP3-7.0 and SSP5-8.5, Supplementary Table S2) and more so for the GCMs with the highest resolutions (Table 1).

Our main result, that the magnitude of rarer extremes is expected to increase relatively more, is also backed by earlier observation-based studies over Australia47 and over Europe and the USA22, by a single-model initial-condition large ensemble study over Western Europe27, and by a study of CMIP6 global climate models15. The relative magnitude increase is also stronger with higher emission scenarios (see also Table 2), underlining the importance of emission reduction for extreme precipitation hazards.

Figure 4 and Table 2 show where the magnitude of the rarest extremes increases relatively more than that of the common ones, expressed as the difference in relative changes in the 100-year and 1-year return level estimates (see Eq. (3) in “Methods”). Land regions with the largest relative magnitude increase of rare extremes with respect to the common ones are around the subtropics (Sahara and surroundings, Amazon and Central America, and Central and Northern Australia), and oceanic regions include the South Pacific, South Atlantic, South Indian Ocean, and to a lesser extent their Northern counterparts. A few regions are exceptions where the common extremes are instead expected to increase relatively more than the rare ones; these are around the Equatorial Pacific Ocean and the poles. For future low-emission SSP scenarios, the models show large spatial discrepancies, in contrast to the high model agreement for the highest emission scenarios, predominantly over the subtropics. At the high latitudes and tropics, however, the models show more disagreement, which can be explained by the generally larger model uncertainty of extreme precipitation over the tropics due to differences between the GCMs23.

Discussion

We showed that in the future rare daily precipitation extremes are expected to increase more than common extremes. The CMIP6 GCMs exhibit high model agreement for this finding in general, particularly for the highest emission scenario (Fig. 4), but some spatial differences exist. The higher the emission scenario, the larger the relative difference found between rare (100-year return level) and common extremes (1-year return level), and the higher the statistical significance (Table S1). In particular, we found that for the low emission scenario SSP1-2.6 and the high emission scenario SSP5-8.5, global (land and ocean) daily rainfall extremes will increase by 8.6% and 23.1%, respectively, for 1-year events (Table S3) and by 11.9% and 32.5% for 100-year events (Table S4) by the end of this century. Furthermore, regions are not affected equally: Africa and regions around and just north of the equator in particular will face a disproportionate increase in rare extreme precipitation hazards. This is notably the case for the higher emission scenarios, and the increase is much larger than in the regions most responsible for greenhouse gas emissions, which are often expected to have a smaller increase than the global mean (Fig. 2). There are also areas in the subtropical Atlantic and South Pacific oceans that show decreasing precipitation extremes in the future. It should be noted that while the large-scale pattern of rarer extremes increasing relatively more is quite robust, there are also some regions with model disagreement. For such regions in particular, when compiling future extreme rainfall intensity-frequency curves, a more careful selection and weighting of climate models based on regional observations and advanced bias-correction techniques is advisable.

Here we did not formulate a hypothesis of why we observe this behaviour of rare extremes increasing relatively more than common extremes under climate change. Yet, when looking at the changes in the parameters underlying the MEV-Weibull distribution (Eq. (4)), the statistics themselves give some indication about the processes (Supplementary Figs. S23–S25). It should be noted that the behaviour could be caused by either a decrease in the number of wet days N (combined with an increase in the scale parameter C)48 or a decrease in the shape parameter W, which may, for example, be the result of dynamical feedback processes related to latent heating49. It appears that both the rainfall frequency (Supplementary Fig. S23) and dynamical feedback processes (Supplementary Fig. S25) play a role. This may serve as a starting point for future research to further disentangle the processes behind this behaviour. Regardless of the underlying mechanisms, the results of this study have important implications for engineering design standards, as these are built on our knowledge of the frequency of precipitation events. If rare extreme precipitation events become more frequent in the future as suggested here, engineering design standards, such as those used for storm water drainage and other critical water system infrastructure, will need to be updated. Yet, it should be noted that bias correction methods ought to take into account the fact that the rarest quantiles of today’s climate are made up of different processes than the rarest quantiles in a future climate. Whereas the challenge of accurately predicting future changes in precipitation has been noted as one of the ’real holes in climate science’50, we think that the high model agreement should give confidence in the robustness of our climate models, and our own findings in particular, making that hole just a little bit smaller.

Methods

CMIP6 model data

Daily precipitation simulations from the Coupled Model Intercomparison Project Phase 6 (CMIP6) archive are analysed for the historical and future scenarios. The future late twenty-first century scenarios are Shared Socioeconomic Pathways (SSPs) coupled with the previous Representative Concentration Pathways (RCPs)33. We included in our analyses SSP1-2.6, SSP2-4.5, SSP3-7.0 and SSP5-8.5, ranging from the lowest to the highest emissions. The two time periods that are compared are (1) the simulated historical late twentieth century period 1971–2000, and (2) the late twenty-first century period 2071–2100 (or 2070–2099 or 2069–2098, depending on the available climate-model output and ensuring that the latest 30 years leading up to 2100 are used). We use all 25 GCMs that provide complete simulations for the two time periods and all analysed scenarios51. For the main analysis we used only one realisation per GCM, but we also analysed all the different realisations with complete simulations for all five scenarios (one historical and four SSP scenarios, see Supplementary Figs. S12–S22). We arbitrarily selected the first available realisation of each GCM; however, the similarity between different realisations (Supplementary Figs. S12–S22) suggests that this choice did not affect our main findings. An overview of the models is displayed in Table 1. As there are large differences in the resolution of the different GCMs, the analyses are performed on each model’s native grid. The results are then remapped to a 0.25° × 0.25° grid using the nearest neighbour interpolation method for the ensemble means. The results remain mostly unaffected by the remapping, as the 0.25° × 0.25° grid has a higher resolution than the native grid of each of the models. Specifically, the remapped grid-cells are 4 to 63 times smaller depending on the model resolution, and as we used nearest neighbour interpolation no spatial averaging takes place. Moreover, as the main results of this study are relative values instead of absolute values, the different resolutions of the models do not influence these results.

Model weighting

To account for GCM performance as well as model inter-dependencies in the used multi-model ensemble, we apply the Climate model Weighting by Independence and Performance (ClimWIP) method35,36,37,52. ClimWIP assigns a weight wi to each model to account for the models’ performance in simulating historical climate (Di) and independence from the other models (j = 1...M) in the ensemble (Sij):

$${w}_{i}=\frac{{e}^{-{\left(\frac{{D}_{i}}{{\sigma }_{D}}\right)}^{2}}}{1+\mathop{\sum }\nolimits_{j\ne i}^{M}{e}^{-{\left(\frac{{S}_{ij}}{{\sigma }_{S}}\right)}^{2}}},$$
(1)

with the shape parameters σD and σS determining the strength of the performance and independence weighting, respectively (see ref. 35 for more details). We use an implementation of ClimWIP within the Earth System Model Evaluation Tool (ESMValTool)53, version 2.3 (ref. 54).
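As an illustration of Eq. (1), the following minimal Python sketch computes the weights from a vector of performance distances and a matrix of model-to-model distances. It is not the ESMValTool implementation used in this study: the function name, the toy distances and the values of σD and σS are hypothetical, and the final normalisation to a unit sum is an assumption of the sketch.

```python
import math


def climwip_weights(D, S, sigma_D, sigma_S):
    """Return normalised weights following the form of Eq. (1).

    D: list of performance distances D_i, one per model.
    S: square list-of-lists with model-model distances S_ij.
    sigma_D, sigma_S: shape parameters (illustrative values only).
    """
    n = len(D)
    raw = []
    for i in range(n):
        # Numerator: models far from observations are down-weighted.
        performance = math.exp(-((D[i] / sigma_D) ** 2))
        # Denominator: models with close relatives (small S_ij) are down-weighted.
        similarity = sum(
            math.exp(-((S[i][j] / sigma_S) ** 2)) for j in range(n) if j != i
        )
        raw.append(performance / (1.0 + similarity))
    total = sum(raw)
    # Assumption of this sketch: weights are rescaled to sum to one.
    return [w / total for w in raw]


# Hypothetical three-model ensemble: models 0 and 1 are near-duplicates,
# model 2 is independent; all three perform equally well.
D = [0.5, 0.5, 0.5]
S = [
    [0.0, 0.1, 2.0],
    [0.1, 0.0, 2.0],
    [2.0, 2.0, 0.0],
]
weights = climwip_weights(D, S, sigma_D=0.5, sigma_S=0.5)
```

In this toy ensemble the two near-duplicate models share their influence, so the independent model receives roughly twice the weight of either of them despite identical performance.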

The independence weighting is based on model–model distances in 1979–2014 climatologies of temperature and sea level pressure in the same setup as used by Brunner et al.35 but updated for the 25 GCMs used in this study. These metrics have been shown to cluster models by known development families and account for dependencies35,52.

For model performance, we adapted the metrics used in ref. 35 to the target of global precipitation change. In contrast to other important climate variables, most prominently future warming35,55,56,57, emergent constraints58 for global precipitation changes have only recently been suggested59,60. Here, our main aim was to reduce the influence of models that simulate variables considered important for the representation of precipitation very differently from observations, rather than to apply a constraint that necessarily reduces model spread. Performance weights were, therefore, based on five metrics: (1) the temperature trend, which has been found to be an important constraint for temperature and precipitation changes alike55,59, (2) the temperature climatology, (3) the variability of temperature, (4) the precipitation climatology and (5) the variability of precipitation, all in the period 1979–2014. Models that perform poorly in one or more of these metrics received less weight in the calculation of multi-model statistics, as we trust their projections of future precipitation less. The strength of the weighting was established using a leave-one-out model-as-truth test35,37,61 on the target of global mean precipitation change. The resulting weights for each model are included in Table 1 and have a range comparable to recent studies, such as Brunner et al.35 (their Table S2 in the supplement). The weights of the models were used to create the multi-model weighted ensemble means for Figs. 1, 2, 4, and Supplementary Figs. S3, S7–S9, and S23–S25.

Changes in precipitation estimates

To study if the relative change in common extremes is different from the relative change in rare extremes, we use the following two equations:

$${C}_{\mathrm{rel},t,x}=\frac{{\mathrm{Tt}}_{\mathrm{SSPx}}-{\mathrm{Tt}}_{\mathrm{historical}}}{{\mathrm{Tt}}_{\mathrm{historical}}}$$
(2)
$${D}_{100-1,x}={C}_{\mathrm{rel},100,x}-{C}_{\mathrm{rel},1,x}=\frac{{\mathrm{T100}}_{\mathrm{SSPx}}-{\mathrm{T100}}_{\mathrm{historical}}}{{\mathrm{T100}}_{\mathrm{historical}}}-\frac{{\mathrm{T1}}_{\mathrm{SSPx}}-{\mathrm{T1}}_{\mathrm{historical}}}{{\mathrm{T1}}_{\mathrm{historical}}}$$
(3)

With Eq. (2) we estimate (Crel,t,x), which is the relative change between historical and future precipitation for each of the return levels (Tt) and for any SSP scenario (SSPx). We use Eq. (2) as input for Eq. (3), where D100-1,x stands for the difference in change of rare and common extremes, T100 stands for the 100-year return level and T1 for the 1-year return level. For the all-day percentile method the same formula applies, but T100 and T1 were substituted by T30 and T0.3.
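Eqs. (2) and (3) can be illustrated for a single grid-cell with a short Python sketch. The function names and the return level values (in mm per day) are hypothetical and serve only to show the arithmetic.

```python
def relative_change(future, historical):
    """Eq. (2): relative change of a return level between two periods."""
    return (future - historical) / historical


def rare_minus_common(t100_future, t100_hist, t1_future, t1_hist):
    """Eq. (3): relative change of T100 minus relative change of T1."""
    return relative_change(t100_future, t100_hist) - relative_change(
        t1_future, t1_hist
    )


# Hypothetical grid-cell: T100 rises from 100 to 130 mm (+30%),
# T1 rises from 40 to 48 mm (+20%).
d_100_1 = rare_minus_common(
    t100_future=130.0, t100_hist=100.0, t1_future=48.0, t1_hist=40.0
)
```

A positive value, here a difference of 10 percentage points, indicates that the rare extreme increases relatively more than the common one.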

Extreme precipitation estimates

In this study, we estimated the common and rare extreme precipitation at each model grid-cell using three different methods: (1) the Metastatistical Extreme Value (MEV) distribution, (2) the Generalised Extreme Value (GEV) distribution, and (3) quantiles directly obtained from the precipitation simulations of all models. By using different methods for the calculation of precipitation extremes, we show the robustness of our results and allow for comparison with other studies.

The rare precipitation extremes (with return periods of 10 and 100 years) we present in this paper are calculated using the first method: the MEV distribution41. As opposed to traditional extreme value distributions, MEV uses all available data and is, therefore, able to estimate extremes with return periods longer than the period of record with reduced uncertainty, provided the tail of the true distribution matches40,62,63,64. It also shows more consistent geographical patterns than traditional methods such as GEV42,43,62. Following the approach of Zorzetto et al.40, for each individual year the Weibull distribution is fitted to all days with a precipitation depth exceeding 1 mm. Years are grouped together if the number of events per year is lower than twenty, to allow for more accurate parameter estimation65. The Weibull parameters are fitted using probability-weighted moments66. The cumulative distribution function of MEV-Weibull is as follows:

$${\zeta }_{m}(x)=\frac{1}{M}\sum_{j=1}^{M}{\left\{1-\exp \left[-{\left(\frac{x}{{C}_{j}}\right)}^{{w}_{j}}\right]\right\}}^{{n}_{j}}$$
(4)

where j is the year (j = 1, 2, …, M), Cj > 0 is the Weibull scale parameter, wj > 0 is the Weibull shape parameter, and nj is the number of wet events in hydrological year j41.
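Because Eq. (4) has no closed-form inverse, a return level has to be obtained numerically. The sketch below evaluates Eq. (4) and inverts it by bisection for a non-exceedance probability of 1 − 1/T. It assumes that the yearly Weibull parameters have already been fitted; the function names, the three years of parameter values and the search bounds are hypothetical, and the inversion is only meaningful for return periods T longer than one year.

```python
import math


def mev_weibull_cdf(x, scales, shapes, n_events):
    """Eq. (4): mean over years of the yearly Weibull CDF raised to n_j."""
    total = 0.0
    for c, w, n in zip(scales, shapes, n_events):
        total += (1.0 - math.exp(-((x / c) ** w))) ** n
    return total / len(scales)


def mev_return_level(return_period, scales, shapes, n_events, hi=1e4, tol=1e-8):
    """Invert Eq. (4) by bisection; valid for return_period > 1 year."""
    target = 1.0 - 1.0 / return_period
    lo = 0.0
    # The CDF increases monotonically in x, so bisection converges.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mev_weibull_cdf(mid, scales, shapes, n_events) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters for three years (scale in mm, shape unitless).
scales = [8.0, 9.0, 10.0]
shapes = [0.80, 0.75, 0.85]
n_events = [100, 110, 90]

x10 = mev_return_level(10.0, scales, shapes, n_events)
x100 = mev_return_level(100.0, scales, shapes, n_events)
```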

The second method to calculate the rare extremes is using the traditional GEV distribution. Annual maxima are used to estimate the GEV parameters with the L-moments approach67. The cumulative distribution function of GEV is:

$$G(z)=\left\{\begin{array}{ll}\exp \left\{-{\left[1+\xi \left(\frac{z-\mu }{\sigma }\right)\right]}^{-\frac{1}{\xi }}\right\}, & \xi \ne 0\\ \exp \left\{-\exp \left[-\left(\frac{z-\mu }{\sigma }\right)\right]\right\}, & \xi =0\end{array}\right.$$
(5)

with location parameter μ ∈ (−∞, ∞), scale parameter σ > 0, and shape parameter ξ ∈ (−∞, ∞). The results for GEV are included in Supplementary Fig. S10.
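Unlike Eq. (4), Eq. (5) can be inverted analytically, so that the T-year return level follows directly from the three parameters. The Python sketch below shows this inversion together with the CDF itself; the function names and parameter values are hypothetical, and the parameter estimation with L-moments is not shown.

```python
import math


def gev_cdf(z, mu, sigma, xi):
    """Eq. (5): GEV cumulative distribution function."""
    if xi == 0.0:
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


def gev_return_level(return_period, mu, sigma, xi):
    """Solve G(z) = 1 - 1/T for z; valid for return_period > 1 year."""
    y = -math.log(1.0 - 1.0 / return_period)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


# Hypothetical parameters (mm per day) for one grid-cell.
z100 = gev_return_level(100.0, mu=30.0, sigma=10.0, xi=0.1)
z100_gumbel = gev_return_level(100.0, mu=30.0, sigma=10.0, xi=0.0)
```

With a positive shape parameter the tail is heavier than in the Gumbel case (ξ = 0), so the 100-year return level is larger for the same location and scale.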

The third method is to obtain the precipitation extremes directly from the precipitation estimates of each model, using all-day percentiles. The common extremes we present in this paper, the ones with a return period of 1 year (99.7262th percentile), are directly estimated from the precipitation time series for each grid-cell. This is because extreme value distributions are designed for return periods greater than the length of the time series. We also estimated the precipitation depths for all-day percentiles corresponding to the 0.3-year (approximately once every 109 days), 3-year, and 30-year (highest value in the 30-year time series) return levels: the 99.0874th, 99.9087th, and 99.9909th percentile, respectively. For the 30-year return level it is particularly uncertain whether the maximum simulated precipitation event actually represents an event with a 30-year return period, which is why we did not use this as a primary method. Yet, averaged over large regions or globally this approach can still be considered valid. The results for this method are included in Supplementary Fig. S11.
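The correspondence between return period and all-day percentile can be written as a one-line conversion, sketched here in Python. The function name is hypothetical, and the year length of 365.25 days is an assumption inferred from the percentiles quoted above; it is not stated explicitly in the text.

```python
def all_day_percentile(return_period_years, days_per_year=365.25):
    """All-day percentile exceeded on average once per return period.

    days_per_year = 365.25 is an assumption chosen to match the
    percentiles quoted in the text.
    """
    return 100.0 * (1.0 - 1.0 / (return_period_years * days_per_year))


percentiles = {
    period: round(all_day_percentile(period), 4)
    for period in (0.3, 1.0, 3.0, 30.0)
}
```

Under this assumption the conversion reproduces the values given above: 99.0874, 99.7262, 99.9087 and 99.9909 for the 0.3-, 1-, 3- and 30-year return levels.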

Statistical analysis

To determine whether the finding that the rarest extremes will increase more than the common ones (D100−1,x > 0, Eq. (3)) is statistically significant, we followed Livezey and Chen68. To account for spatial correlation, we first calculated the number of spherical harmonics that explain 95% of the observed variation in change in T1 (Crel,1,x, Eq. (2)) and change in T100 (Crel,100,x, Eq. (2)) for each of the 25 GCMs and SSP scenarios. The degrees of freedom in these data were set to be equal to this number of harmonics. Note that the lower the percentage used, the more conservative the test will be, with 95% being conservative. We then drew a number of random samples, equal to the degrees of freedom, from the change in T1 and T100 estimates for each GCM and SSP scenario, and calculated the median of these samples. The drawing excluded masked pixels (weighted mean of less than 3 events per year, see Section Mask dry areas) and was weighted according to the area of the pixels. This analysis was done as a Monte Carlo (mc) simulation and repeated 10,000 times to determine the cumulative distribution function of the median of samples of randomly chosen pixels, with the sample size equal to the degrees of freedom. Finally, the values that were exceeded by 95%, 98%, and 99% of the sample medians (i.e. the 5th, 2nd, and 1st percentiles) were taken as the 95%, 98%, and 99% confidence levels with which the null hypothesis that there was no increase could be rejected.
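The resampling step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the code used for the analysis: the function names, the synthetic pixel values and areas, and the fixed number of degrees of freedom are hypothetical; pixels are drawn with replacement, which the text does not specify; and the spherical-harmonics step that determines the degrees of freedom is not shown.

```python
import random
import statistics


def monte_carlo_medians(values, areas, dof, n_repeats=10_000, seed=0):
    """Medians of repeated area-weighted draws of `dof` pixels, sorted."""
    rng = random.Random(seed)
    medians = []
    for _ in range(n_repeats):
        # Assumption: draws are made with replacement, weighted by pixel area.
        sample = rng.choices(values, weights=areas, k=dof)
        medians.append(statistics.median(sample))
    medians.sort()
    return medians


def lower_quantile(sorted_values, fraction):
    """Entry below which approximately `fraction` of the values lie."""
    return sorted_values[int(fraction * len(sorted_values))]


# Synthetic, already unmasked pixels: a per-pixel change and its area.
data_rng = random.Random(1)
values = [data_rng.gauss(0.05, 0.10) for _ in range(500)]
areas = [data_rng.uniform(0.2, 1.0) for _ in range(500)]

# Fewer repeats than the 10,000 used in the text, to keep the example fast.
medians = monte_carlo_medians(values, areas, dof=40, n_repeats=2000)
q05 = lower_quantile(medians, 0.05)
reject_no_increase_at_95 = q05 > 0.0
```

If the value exceeded by about 95% of the sample medians lies above zero, the null hypothesis of no increase would be rejected at the 95% confidence level in this simplified setting.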

We also applied Spearman’s rank correlation to analyse whether the median change in the Monte Carlo 100-year return level (mc-Crel,100,x, Eq. (2)) is significantly larger than the median change in the Monte Carlo 1-year return level (mc-Crel,1,x). We assumed that each GCM is an independent experiment resulting in an estimate for mc-Crel,100,x and mc-Crel,1,x. The null hypothesis is that there is no statistical relationship between the change and being a member of either the mc-Crel,100,x or the mc-Crel,1,x family (i.e., mc-Crel,100,x = mc-Crel,1,x). We tested whether the changes in 100-year return levels are significantly larger than the changes in 1-year return levels at the 99%, 99.9%, and 99.99% confidence levels.