Introduction

Figure SPM.3 from the Intergovernmental Panel on Climate Change's Special Report on Extremes (IPCC SREX)1 depicts three simplified scenarios of how temperature extremes could shift as a result of climate change (Figure 1). The first, “Shifted Mean”, is an increase in the entire probability distribution of temperature, leading to an equivalent increase in the distribution of all temperature extremes. The second, “Increased Variability”, is a symmetric widening of the variability of temperature, leading to intensification of extremes in both tails. The third, “Changed Symmetry”, is asymmetric: the statistics of the lower tail temperatures would remain approximately at historical intensities while the distribution of the uppermost extremes would increase. We use this concept to motivate the framework and hypothesis of this study. Emerging literature2,3 suggests that non-Gaussian, power law tail distributions and hence weather or climate extremes may be generated from simplified physics-based models with state-dependent noise. The processes generating extremes within fully coupled Global Climate Models (GCMs) such as the CMIP5 suite, and indeed in the real climate system, are likely to be more complex. Thus, the adoption of even seemingly intuitive mechanistic explanations is useful but must be done with care. The work in this paper was motivated by the contrast between the simplicity of the IPCC depiction of temperature extremes and the underlying complexity. Here we analyze statistical properties of the tails under a changing climate with a 14-member ensemble of CMIP5 GCMs and reanalysis datasets.

Figure 1

IPCC SREX1 conceptual changes in the extremes of the temperature distribution are linked to exaggerated but tenable changes in GEV parameters.

The outer panel (a) shows how increases strictly in the location parameters for either tail would impact the distribution of extremes and similarly panels (b) and (c) show the same for scale and shape parameters. Changes in location parameters correspond to shifts in typical or average extreme events, scale to changes in the width of the distribution of extremes and shape to the behavior of the uppermost extremes. Baseline GEV distributions are shown in black and shifted distributions are shown in blue and red for simulated seasonal minima and maxima statistics, respectively. The SI gives details on the construction of the 6 side graphs, which are built with randomly simulated data from GEV models.

The growing awareness and salience4,5,6 of the occurrence, severity and societal impacts of weather extremes motivates detailed exploration of the degree to which recent observations may be attributed to climate change4,5,7,8 or, given assumed emissions trajectories, how the statistical attributes of the extremes may change over the next century9,10,11. Trends in temperature extremes in particular have recently been attributed to anthropogenic climate change with relatively high confidence12,13.

In addition, several initial examinations of the latest collection of GCMs, the CMIP5 suite14, have provided useful insights on aggregate projected and historically simulated statistics of extreme temperature events. Two studies9,10 analyzed a large subset of the CMIP5 repository in terms of 27 impacts-relevant temperature and precipitation extremes indices. The first9 compared CMIP5 and CMIP3 historical simulations to four reanalysis datasets and the HadEX2 gridded observational dataset. This study found that the CMIP5 models simulated extremes with skill comparable to CMIP3 as measured visually and via squared-error based metrics, with some modest improvements. The second10 subsequently explored CMIP5 and CMIP3 projections under several climate change scenarios. Results show that extremes indices based on daily minima are generally projected to increase more than maxima in terms of spatial-temporally aggregate intensity, duration and frequency, corroborating past work15,16.

Extreme value theory is an alternative framework from which to garner statistically rigorous insights into temperature tail behavior. One recent study11 performed an analysis with regard to projected changes in temperature and precipitation extremes in all available CMIP5 models using the Generalized Extreme Value (GEV) model. Findings resembled the group's previous similar work17 with the CMIP3 generation of models. Specifically, 20th century temperature extremes were found to be reasonably well simulated by multimodel median projections. Cold extremes were found to be more uncertain than warm ones in terms of multimodel variability, yet they still did not greatly differ from reanalysis datasets. This study focused primarily on aggregate global and regional projections of 20-year return levels as well as their uncertainty driven by GCM and RCP scenario differences.

Beyond aggregate (central or mean) behavior of the tails, it is of interest and in some cases of even more importance to explore their variability18. The IPCC SREX1 depictions discussed earlier (Figure 1) only capture this issue of tail variability conceptually and idealistically. In reality, multimodel projections could imply something in between or outside of these three scenarios. Given the importance of the variability of extremes18,19,20 to multiple impacts sectors21,22,23,24 that may have to prepare for the full range of potential extremes and not just average shifts, this work aims to explore more completely the projected changes in the distributions of temperature extremes.

Results

Figures 2 and 3 show multimodel ensemble projected changes in percentiles of spatial-temporally pooled seasonal extrema. Projections are broken down by all combinations of the following factors: (1) tail type - seasonal maxima versus minima, (2) season - summer versus winter, (3) terrain type - land versus ocean and (4) region - South Pole or SP (−90° to −60°), Southern Hemisphere Extratropics or SHEX (−60° to −30°), Tropics or TX (−30° to 30°), Northern Hemisphere Extratropics or NHEX (30° to 60°) and North Pole or NP (60° to 90°). Figure 2 shows results only over land and Figure 3 only over ocean. Multimodel averages are plotted along with surrounding uncertainty defined as the lower and upper bounds of ensemble changes.
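The regional breakdown listed above can be sketched as a simple latitude lookup. This is only an illustration: the name `region_of` and the choice of which band owns a shared edge (for example exactly 30°) are assumptions introduced here, since the text does not state how boundary latitudes are assigned.

```python
# Latitude bands as listed in the text; assigning a shared edge to the
# band that starts there is an assumption made for this illustration.
REGION_BOUNDS = [
    ("SP", -90.0, -60.0),
    ("SHEX", -60.0, -30.0),
    ("TX", -30.0, 30.0),
    ("NHEX", 30.0, 60.0),
    ("NP", 60.0, 90.0),
]


def region_of(lat_deg):
    """Return the region label for a latitude given in degrees north."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must lie in [-90, 90]")
    for name, lower, upper in REGION_BOUNDS:
        # Half-open bands [lower, upper).
        if lower <= lat_deg < upper:
            return name
    # Only lat_deg == 90 reaches this point; it belongs to the last band.
    return "NP"


print([region_of(lat) for lat in (-75.0, -45.0, 0.0, 45.0, 75.0)])
```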

Figure 2

Each panel shows changes in percentiles (in Kelvins) of seasonal temperature extrema statistics demarcated by region, season and extrema type (minima or maxima) over land only.

Thick green lines show multimodel ensemble average projections and green opaque bounds reflect the maximum and minimum of the 14-member ensemble at each percentile. Higher percentile statistics are always based on hotter events (e.g., the 95th percentile of minima is hotter than the 5th). Details on the computation of these statistics are described in the Methods section.

Figure 3

Like Figure 2 but only over ocean.

Thick blue lines show multimodel ensemble average projections and blue opaque bounds reflect the maximum and minimum of the 14-member ensemble at each percentile.

In general, all percentiles of extremes are projected to increase in almost all cases. However, in most cases, there is an upward trend in the change of the percentiles, showing that the highest percentiles of most extrema are projected to increase more than the lowest ones, for each respective extrema (summer or winter and minima or maxima) in isolation. This suggests that a metric of “asymmetry” in extrema projections can be constructed by subtracting the change in the 5th percentile of extrema from the change in the 95th percentile. Notably, we do not compare extremes across seasons or compare maxima events to minima events; all following inferences about asymmetry that hold generically are made within seasons and tails.

Figure 4 displays statistical maximum likelihood estimates of asymmetry for all combinations of the four factors described above, as averaged over all GCMs. This allows for an examination of the relationship between each of the four factors described above and multimodel ensemble variability with asymmetry through the lens of a linear mixed effect statistical model (Methods).

Figure 4

Linear mixed effect model estimates for asymmetry in extrema projections are made for each combination of extrema type (minima or maxima), season (summer, winter), terrain type (ocean, land) and region (NP, NHEX, TX, SHEX, SP).

Dark shaded bars indicate asymmetry for maxima events and light for minima events. Asymmetry is defined and quantified by subtracting the change in the 5th percentile of extrema from the change in the 95th (ΔP95 − ΔP05), as shown in Figures 2 and 3. Intervals are a measure of variability in the estimates attributed to systematic differences among GCMs measured as random effects in the statistical model; intervals are all 0.8 Kelvins in width. Bar plot heights, or asymmetry estimates, are numerically tabulated in Table 1.

In general, there is consensus: asymmetry is greater than zero for more than 80% of the data points used in the analysis. The most dramatic asymmetry is found for winter minima events, especially those over ocean. These patterns are most pronounced in the NHEX and NP regions. Over land, the asymmetry is particularly pronounced for summer maxima and winter minima. From an impacts perspective, these are important to note since they suggest a wider range of temperature extremes within each season. Over land, summer maxima events show more asymmetry than summer minima. Conversely, winter minima events exhibit larger asymmetry than winter maxima over land. The Discussion section briefly touches on plausible physical hypotheses for broad asymmetry patterns summarized here and in Figure 4.

In some cases, estimates of asymmetry may be relatively less robust than others due to smaller data samples. For instance, there is little land north of 70°N, lending a smaller effective sample size to the estimation of asymmetry for NP extrema over land. Similarly, in the SP region, there is little ocean south of 65°S. Overall, though, asymmetry results are robust and based on stable estimates of extrema percentiles for each combination of the four factors (see Methods for details on robustness).

Systematic differences among ensemble members explain a significant but small (5%) proportion of variance in the asymmetry metric. In total, all of the four factors in addition to these systematic GCM differences account for 30% of the variance in projected asymmetry. That a large portion of variance (~70%) in asymmetry is not accounted for by these factors suggests that other factors, such as fine spatial scale25, internal model variability and/or more complex tail processes2,3,26, likely also play roles.

We utilize extreme value theory not only to compute statistics of the extrema percentiles (see Methods) but also as a framework for examining the reliability with which the CMIP5 ensemble simulates tail properties. Figures 5 and 6, for land and ocean respectively, show that the broad latitudinal patterns simulated by historical CMIP5 runs are generally well aligned with reanalyses. GEV models fit location, scale and shape parameters to time series of seasonal extrema at each grid point. The three parameters respectively describe typical or average values of seasonal extrema, their variability and the behavior of the uppermost portion of the tail. Ensemble simulations of all three parameters generally follow latitudinal patterns of reanalyses, suggesting their ability to simulate the statistical properties of extremes with fidelity at aggregate scales. This realism provides additional confidence in projections of asymmetric changes in extremes. There are several notable outlying patterns, including for example the scale parameter behavior of the MIROC-ESM and MIROC-ESM-CHEM GCMs. In addition, there is larger ensemble spread in simulations of scale and shape parameters closer to the poles. The Supplementary Information (SI) provides further support for the fitness of the GEV models by testing the significance of their shape parameters (Figure S1). In addition, we reach similar conclusions regarding the historical reliability of the CMIP5 ensemble by conducting the same analysis as Figures 5 and 6 and Figure S1, respectively, using the alternative Generalized Pareto Distribution model (Figures S2–S4).

Figure 5

From top to bottom, latitudinal averages of grid-wise fitted GEV location, scale and shape parameters fit to historical seasonal extrema of all GCMs and reanalyses are displayed.

These respectively measure latitudinal aggregate reliability of historical GCM simulations of typical or average extrema, their variability and the behavior of the uppermost tails of extrema. From left to right, results are shown for summer maxima, summer minima, winter maxima and winter minima. Each latitudinal average value uses only grid cells over land.

Figure 6

The same results are displayed as in Figure 5 but using only ocean grid cells.

To further examine robustness of the asymmetric change insights, we conduct the full asymmetry analysis described in Methods but on deseasonalized data from three GCMs. Results are found to be extremely similar to those found here and can be found in the SI: Figure S5 and Table S5.

Discussion

While virtually all statistics of temperature extremes are projected to increase, GCMs consistently show asymmetry in projected changes of seasonal extrema. For each of summer and winter maxima and minima, the highest percentiles will increase significantly more than will the lowest. These insights most closely resemble the third panel of the cartoon IPCC SREX scenarios (Figure 1), suggesting a wider range of extreme temperature events across the globe in the future. In Figure 1, pure changes in location parameters correspond to shifts in typical or average extreme events, scale to changes in the width of the distribution of extremes and shape to the behavior of the uppermost extremes, or the heaviness of the tail. We do note, however, that there is an interpretative difference between our analysis and the SREX component of Figure 1. Namely, the SREX shows the full temperature distribution and how it could change conceptually and we explore only the tails in terms of summer and winter maxima and minima, all four in isolation. Our results suggest that the uppermost tail statistics of all types of extrema will increase more than the lowest tail statistics, which does not translate directly to the bottom panel of the SREX concept (“Changed Symmetry”). In this study, the effect is similar but is projected within all four classes of extremes (summer and winter maxima and minima): in each class, the most intense extremes (i.e., the 95th percentiles) will increase more than the least intense (i.e., the 5th percentiles).

The linkage between the GEV model parameters and the projections of asymmetric changes is intuitive and can be related to a large extent to the idealized concept of Figure 1. In many cases, asymmetry appears to be more reflective of a widening of total extremes variability (i.e., the scale parameter, Figure 1 b and e) and, in some other cases, of larger increases in the highest percentiles and less so the lowest ones (i.e., the shape parameter, Figure 1 c and f). We emphasize that the GEV itself does not create the asymmetric insight; rather, the nature of the asymmetry can be understood by examining the scale and shape GEV parameters that ultimately reflect the behavior of the extremes of GCM data outputs themselves. Details can be found in the SI: Figure S6 and Table S6.
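To make the link between GEV parameters and the asymmetry metric concrete, the following minimal sketch evaluates the standard GEV quantile function and compares how a change in each parameter alone alters ΔP95 − ΔP05. The parameter values are purely illustrative assumptions and are not taken from any GCM; the function names are hypothetical. The shape parameter here follows the common ξ convention, in which larger values mean a heavier upper tail.

```python
import math


def gev_quantile(p, loc, scale, shape):
    """Quantile of a GEV distribution; shape is xi in the common convention."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    y = -math.log(p)
    if abs(shape) < 1e-12:
        # Gumbel limit as the shape parameter tends to zero.
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)


def asymmetry(base, future):
    """Change in the 95th percentile minus change in the 5th percentile."""
    d95 = gev_quantile(0.95, *future) - gev_quantile(0.95, *base)
    d05 = gev_quantile(0.05, *future) - gev_quantile(0.05, *base)
    return d95 - d05


# Illustrative (loc, scale, shape) values in Kelvins; assumptions only.
baseline = (300.0, 2.0, -0.2)
scenarios = {
    "location only": (302.0, 2.0, -0.2),
    "scale only": (300.0, 2.5, -0.2),
    "shape only": (300.0, 2.0, -0.1),
}
for name, params in scenarios.items():
    print(f"{name}: asymmetry = {asymmetry(baseline, params):.3f} K")
```

Under these assumed values, a pure location shift moves both percentiles equally and yields no asymmetry, whereas increases in scale or shape both yield positive asymmetry, consistent with the interpretation given above.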

Literature on trends in the variability of temperature extremes in observations reveals a mixture of results. One study27 found decreasing trends in the variance of intraannual daily minima of observed 20th century temperature but a less clear signal for corresponding maxima. Similarly another study28 identified decreasing variability in observed record breaking (cold and hot) temperatures, more so in winter than in summer. On the other hand, one study19 points toward increasing variability in temperatures as a likely driver of the European heatwave of 2003, suggesting that increases purely in the location of temperature distributions were unlikely to have produced such events. Similarly, this study points to a wider range of temperature extremes in the future as a function of asymmetry in projected changes of their percentiles.

A broad hypothesis is that natural processes that have driven anomalously cool weather or dampened the intensity of extremes will continue to do so within a warming trend, while a climate change signal may exhibit relatively more influence over the events in the uppermost extremes of both tails. For example, literature has linked La Niña29 and the North Atlantic Oscillation30,31 with persistent anomalously cold extremes within a longer term warming trend. Although some literature points to observed and simulated decreases in cold air outbreaks often located downstream from atmospheric blocking events30,31, one study32 suggests they will persist and even increase in frequency in some regions in the future due to atmospheric circulation changes and natural variability that counters greenhouse effects. In addition, asymmetry could be explained by projections of increased interannual temperature variability19,20. Differences in asymmetric projections broken down by the four factors shown in Figure 4 could serve as a hypothesis generation tool for exploring potential physical mechanisms. A rudimentary physical basis for our results showing stronger asymmetric winter projections further north is the larger synoptic variability ultimately due to the greater equator-to-pole temperature gradient in the winter33. In considering the most prominent asymmetry specifically over the northern hemispheric ocean in winter, it may be useful to consider how variability in sea ice cover projections interacts with temperature extremes34,35. Recent research35 suggests that Arctic amplification of warming could lead to increases in the probability of both cold and heat waves in the mid-latitudes, driven by warming-influenced dynamics of atmospheric north-south Rossby waves.

Projections of a wider range of extreme temperature behavior may have consequences for many stakeholders. Although seasonal extrema studied here do not directly measure common impacts-relevant temperature extremes9,10, they may serve as a basis for extrapolating to likely impacts. For example, asymmetric changes may lead to increases in the intensity of heat waves and yet occasional persistence of cold waves, both of which have had significant effects on public health and mortality24. A wider spectrum of temperature extremes within seasons could also have implications for agricultural production yields21,22 and marine ecological stability23,36,37 since these impact areas can be sensitive to seasonal temperature thresholds. A wider spectrum of temperature extremes could also increase energy demand and potentially fossil fuel consumption38,39,40, helping to induce a positive warming feedback. Another potential consequence of asymmetry relates to human perception of the existence and danger of climate change conditional on firsthand experience with related natural hazards41. Persistent cold extremes42 and even long cooling periods43 do not necessarily imply a lack of global warming; however, literature41,44 suggests that people who have experienced extreme weather events that are typically associated with climate change may be more prone to perceive global warming as a significant threat. Regionally heterogeneous manifestations of relatively cold events or occasionally dampened extremes could conceivably create or change geographical and political patterns in public perception41,44 and hence impact policy.

Table 1 Bar plot heights from Figure 4 are tabulated for each combination of extrema type (minima or maxima), season (summer, winter), terrain type (ocean, land) and region (NP, NHEX, TX, SHEX, SP). Numbers represent the asymmetry metric in Kelvins, calculated as ΔP95 − ΔP05, or in other words, the increase in the 95th percentile minus the increase (decrease) in the 5th percentile of a given extremum.

Methods

Summer and winter maxima of daily maximum temperature are obtained from a 14-member CMIP5 ensemble of GCMs as well as three reanalysis surrogate observation datasets. Likewise, summer and winter minima of daily minimum temperature are obtained from the same datasets. Summer is defined as June-July-August (JJA) in the Northern Hemisphere and December-January-February (DJF) in the Southern Hemisphere and winter is defined as DJF and JJA in the Northern and Southern Hemispheres, respectively. All these data are obtained for historical (1970–1999) and, for GCMs only, future (2070–2099) periods. Future GCM data are obtained for the Representative Concentration Pathway 4.5 (RCP4.5) moderate greenhouse gas emissions trajectory scenario14. The climate models and reanalysis datasets are summarized in SI: Table S7.

GEV and GPD models45 are fit separately to each of the four classes (summer maxima, summer minima, winter maxima and winter minima) of extrema at each grid point for all GCMs and reanalyses. All minima are negated to enable the model-fitting process and transformed back afterward45. To assess ensemble skill in emulating historical statistical attributes of extremes, historical parameters from GEV and GPD fits are averaged over latitudinal bands, for land (Figure 5 and Figure S2) and ocean (Figure 6 and Figure S3). Shape parameters are almost always either significantly less than 0 or not statistically distinguishable from 0, and are almost never greater than 0, for both models (Figures S1 and S4).
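The negate-fit-transform step for minima can be sketched as follows, assuming SciPy's `genextreme` as the fitting tool (the paper does not name its software, so this is a swapped-in choice). The helper name `fit_gev_to_extrema` and the synthetic data are hypothetical. Note that SciPy's shape parameter `c` has the opposite sign to the common ξ convention.

```python
import numpy as np
from scipy.stats import genextreme


def fit_gev_to_extrema(extrema, is_minima=False):
    """Maximum likelihood GEV fit; returns (loc, scale, xi).

    Minima are negated before fitting so that the block-maxima GEV
    applies, and the location is mapped back to the original units.
    """
    x = np.asarray(extrema, dtype=float)
    if is_minima:
        x = -x
    c, loc, scale = genextreme.fit(x)
    xi = -c  # SciPy's c is the negative of the usual shape parameter xi
    if is_minima:
        loc = -loc  # undo the negation for the location parameter
    return loc, scale, xi


# Synthetic seasonal minima near 250 K (illustrative only). A large sample
# is used so the demonstration fit is stable; a single grid cell in the
# study would supply roughly 29 values.
rng = np.random.default_rng(0)
synthetic_minima = -genextreme.rvs(0.2, loc=-250.0, scale=3.0, size=500,
                                   random_state=rng)
loc, scale, xi = fit_gev_to_extrema(synthetic_minima, is_minima=True)
print(f"loc={loc:.2f} K, scale={scale:.2f} K, xi={xi:.2f}")
```

For minima, the fitted shape describes the cold tail, since the fit is carried out on the negated series.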

We perform a bootstrap resampling analysis similar to recent work17,46 with seasonal extrema data to obtain robust estimates of changes in the distributions of extrema. Specifically, at each grid cell, for each GCM and separately for each of summer and winter maxima and minima (generally stated, extrema), we treat future extrema data (29 data points) as a pool of approximately independent data to iteratively resample from with replacement. We draw 29 samples with replacement from this pool 100 times. In each of those 100 random samples of size 29, we fit GEV parameters via maximum likelihood45 and subsequently simulate a random realization from the fitted parameters. As such, we expect to obtain robust estimates of extrema percentiles where sufficient realizations exist. Rarely, unstable (i.e., large) estimates of shape parameters led to unrealistic bootstrap realizations. In these cases, where the resultant temperature was simulated to be above 375 Kelvins for maxima or below 180 Kelvins for minima, these realizations were discarded and replacement realizations were simulated. We note that the asymmetry results described next do not change materially if the bootstrap analysis is not employed; the bootstrap enhances confidence in the robustness of the particular extrema percentiles used subsequently.
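The bootstrap procedure for a single grid cell can be sketched as below for seasonal maxima, again assuming SciPy's `genextreme`. The function name, the `max_tries` guard and the synthetic input are hypothetical. The text does not say whether a discarded realization is replaced by redrawing from the same fitted parameters or by refitting; redrawing from the same fit is assumed here. Minima would be handled analogously after negation, with the 180 Kelvin floor.

```python
import numpy as np
from scipy.stats import genextreme


def bootstrap_gev_realizations(extrema, n_boot=100, upper_limit=375.0,
                               max_tries=50, seed=0):
    """Resample, refit a GEV and simulate one realization per resample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(extrema, dtype=float)
    realizations = np.empty(n_boot)
    for i in range(n_boot):
        # Resample the pool with replacement, keeping the original size.
        sample = rng.choice(x, size=x.size, replace=True)
        c, loc, scale = genextreme.fit(sample)
        for _ in range(max_tries):
            draw = genextreme.rvs(c, loc=loc, scale=scale, random_state=rng)
            # Discard physically unrealistic draws and simulate again
            # (assumption: redraw from the same fitted parameters).
            if draw <= upper_limit:
                break
        else:
            raise RuntimeError("no realistic realization was obtained")
        realizations[i] = draw
    return realizations


# Synthetic pool of 29 future summer maxima in Kelvins (illustrative only).
pool = genextreme.rvs(0.2, loc=305.0, scale=2.0, size=29,
                      random_state=np.random.default_rng(7))
future_realizations = bootstrap_gev_realizations(pool)
print(future_realizations.shape)
```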

Separately for each of summer and winter maxima and minima, we estimate spatial fields of historical climatology Z by averaging historical seasonal extrema at each grid cell over time. Then, for each grid cell, we subtract Z from each of the t = 1, …, T (T = 29) historical seasonal extrema spatial fields Ht, which yields 29 spatial fields ΔHt = Ht − Z. We perform the same operation for the s = 1, …, S (S = 100) bootstrapped future spatial fields of extrema, Fs, which yields 100 spatial fields ΔFs = Fs − Z. ΔF and ΔH represent the three-dimensional concatenation of 100 future and 29 historical spatial fields, respectively, and are considered deviations from the historical climatology Z, in Kelvins.

We define four factors, with their levels in parentheses: X1 - Tail type (minima or maxima), X2 - Season (summer or winter), X3 - Terrain type (land or ocean), X4 - Region (SP, SHEX, TX, NHEX, or NP). Over each possible combination of the four and for each of the 14 GCMs, denoted X5, we obtain percentiles (0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95) from each of ΔF and ΔH, defined as PF and PH, respectively, where subscripts F and H denote “Future” and “Historical”, respectively. As percentiles increase, events always increase in temperature (i.e., the 95th percentile of winter minima is hotter than the 5th). These percentiles are subtracted from each other to yield ΔP = PF − PH, which shows projected changes in the distribution of extrema deviations (Figures 2 and 3). We note that percentiles for each combination of factors may be obtained from effective sample sizes that differ substantially and as such especially the more extreme percentiles (0.05 and 0.95) may be relatively more robust with larger sample sizes. For example, closer to the poles, grid cells become smaller and as a result the effective sample size of those regions shrinks. Because of this and since the bootstrap also implies that estimates of more extreme percentiles (e.g., 0.01 and 0.99 and beyond) are less robust, we do not venture that far into the tails in the following asymmetry analysis.
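The anomaly and percentile steps described above can be sketched with NumPy as follows. The array shapes, the function name `percentile_changes` and the boolean `mask` (selecting the grid cells of one factor combination) are assumptions for illustration. Pooling is unweighted here, which is also an assumption: the text does not state whether grid cells are area weighted.

```python
import numpy as np

PERCENTILES = (5, 10, 25, 50, 75, 90, 95)


def percentile_changes(hist, future, mask):
    """Return PF minus PH at PERCENTILES for the cells selected by mask.

    hist:   (29, ny, nx) historical seasonal extrema, Kelvins
    future: (100, ny, nx) bootstrapped future extrema, Kelvins
    mask:   (ny, nx) boolean array selecting one region and terrain type
    """
    hist = np.asarray(hist, dtype=float)
    future = np.asarray(future, dtype=float)
    climatology = hist.mean(axis=0)    # Z, one value per grid cell
    d_hist = hist - climatology        # deviations of historical fields
    d_future = future - climatology    # deviations of future fields
    # Pool all selected cells and all fields, then take percentiles.
    p_hist = np.percentile(d_hist[:, mask], PERCENTILES)
    p_future = np.percentile(d_future[:, mask], PERCENTILES)
    return p_future - p_hist


# Synthetic fields (illustrative only): the future is warmer and wider.
rng = np.random.default_rng(42)
hist = 290.0 + rng.normal(0.0, 2.0, size=(29, 4, 5))
future = 292.0 + rng.normal(0.0, 3.0, size=(100, 4, 5))
mask = np.ones((4, 5), dtype=bool)

delta_p = percentile_changes(hist, future, mask)
asymmetry = delta_p[-1] - delta_p[0]   # change in P95 minus change in P05
print(np.round(delta_p, 2), round(float(asymmetry), 2))
```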

Next, we describe relationships between X1 through X5 and the asymmetry inferred visually from Figures 2 and 3. “Asymmetry”, or for short, Y, in extrema projections is constructed by subtracting the change in the 5th percentile of extrema (ΔP05) from the change in the 95th percentile (ΔP95), i.e., Y = ΔP95 − ΔP05. We fit a linear mixed effect model47 to obtain maximum likelihood estimates of the relationships between the four factors, X1 through X4, as well as their two-way interactions, and Y, and to estimate how much variability in Y can be explained by systematic GCM differences, X5. Those estimates, as well as their variability in terms of inter-GCM differences, are portrayed for each possible combination of X1 through X4 in Figure 4. More details on the linear model implementation and results can be found in the SI.
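One plausible way to write a model of the kind described above is given below as a sketch; the exact specification is in the SI and may differ. The Greek symbols, indices and the Gaussian assumptions are notation introduced here: i, j, k and l index the levels of X1 through X4, and m indexes the GCM (X5), which enters as a random intercept.

```latex
\begin{aligned}
Y_{ijklm} ={}& \mu + \alpha_i + \beta_j + \gamma_k + \delta_l \\
 &+ (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il}
  + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} \\
 &+ b_m + \varepsilon_{ijklm},
\qquad b_m \sim \mathcal{N}(0, \sigma_b^2), \quad
\varepsilon_{ijklm} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

In this form, the fixed effects carry the dependence of asymmetry on tail type, season, terrain type and region, while the variance of the random intercept b_m reflects systematic differences among GCMs.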

To further test the robustness of the asymmetry insight, we select three GCMs and repeat the entire GEV bootstrap-driven asymmetry analysis described above, but first remove a historical daily seasonal cycle from each grid cell of each GCM's historical and future output. This analysis ensures that the findings in this study are not an artifact of seasonal timing (e.g., seasonal maxima or minima that will tend to be repeatedly extracted from very specific times of the year). Indeed, we find that this analysis strongly supports the robustness of the asymmetric projections to seasonal timing of extremes (SI: Figure S5 and Table S5).

Similar analyses are performed using the GPD. When using the GPD, we choose grid-wise 99th percentiles as the location parameters and fit scale and shape parameters to data exceeding those thresholds. With the GPD, the location parameters for both 20th century and projected data are estimated directly as percentiles and therefore they are not associated with statistical model-fitting uncertainty. Of primary interest here is whether the GPD analysis reveals similar patterns compared to the GEV model, especially in terms of evaluating historical GCM runs against reanalysis statistics. Outputs related to the GPD can be found in the SI: Figures S2–S4.
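The threshold-based GPD fit described above can be sketched as follows, assuming SciPy's `genpareto` and a one-dimensional series at a single grid cell. The function name and the synthetic series are hypothetical, and treating the input as a daily series is an assumption. For minima, the series would be negated first, as with the GEV.

```python
import numpy as np
from scipy.stats import genpareto


def fit_gpd_tail(values, threshold_pct=99.0):
    """Fit GPD scale and shape to excesses over a percentile threshold."""
    x = np.asarray(values, dtype=float)
    # The threshold (location) is taken directly as a percentile, not fitted.
    threshold = np.percentile(x, threshold_pct)
    excesses = x[x > threshold] - threshold
    # floc=0 fixes the GPD location at zero for the excesses, so only the
    # shape and scale parameters are estimated by maximum likelihood.
    shape, _, scale = genpareto.fit(excesses, floc=0)
    return threshold, scale, shape


# Synthetic series: 30 seasons of 92 days each, in Kelvins (illustrative).
rng = np.random.default_rng(1)
series = 290.0 + 5.0 * rng.standard_normal(30 * 92)
threshold, gpd_scale, gpd_shape = fit_gpd_tail(series)
print(f"threshold={threshold:.2f} K, scale={gpd_scale:.2f} K, "
      f"shape={gpd_shape:.2f}")
```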