Introduction

Analysis of geochemical archives provides insight into Earth’s climate history through proxies of paleoclimate conditions1. Characterizing this history is critical for understanding the evolution of modern Earth and for constraining possible future responses to anthropogenic greenhouse gas emissions2. Estimates of cumulative emissions so far, remaining fossil fuel reservoirs, and the long-term sensitivity of climate to cumulative emissions3,4 indicate that humanity has the potential to perturb the climate system enough that the large changes in Earth’s paleorecords1 are relevant indicators of its potential response on millennial timescales. It is thus particularly important to determine how paleoclimatic variations may depend on baseline climate state because this is directly linked to the risk of a large long-term Earth system response to anthropogenic forcing. Variations in Cenozoic climate are studied using deep-sea benthic foraminiferal δ18O, which relates approximately inversely to global temperature and linearly to global ice volume such that low δ18O corresponds to warm climate states5. Much of the Cenozoic was a greenhouse climate state with minimal ice volume1, and so δ18O is used as an inverse linear proxy for global temperature6. Analogously, foraminiferal δ13C records past carbon cycle changes through isotopic fractionation during photosynthesis. The tremendous scientific effort has gone into producing, refining, and interpreting these records; it is a marvel that we can infer with some confidence so much about Earth’s climate tens of millions of years ago based on the isotopic composition of shells of protist algae that sink to and are preserved in the seabed7,8. Figure 1 shows the δ18O record from8 leveraging new methods and measurements, which we focus on here. Four phenomena are evident: (i) a long-term cooling trend, (ii) the emergence of periodic Pleistocene glacial–interglacial cycles at 2.6 million years ago (Ma), (iii) noisy sub-million-year fluctuations before then and superimposed on these recent periodic cycles, and iv) punctuations of the record by large, rapid, negative δ18O excursions corresponding to multimillennial timescale warming events, most notably the Paleocene-Eocene Thermal Maximum (PETM, 56Ma). The long-term cooling trend and Pleistocene glacial–interglacial cycles have been the subject of extensive study1,7, and the sub-million year noise has recently been shown to be consistent with multiplicative fluctuations9, potentially due to metabolic temperature-sensitivity of the biosphere10. The tendency for large negative δ18O excursions, perhaps the most concerning from a future climate perspective, has been noted9, and considerable investigation of individual events such as the PETM shows promise for providing useful constraints on Earth’s future climate11. However, these thermal extreme events (iv) have not been studied quantitatively and collectively, meaning a general explanation for these extremes and their magnitude is lacking, impairing our ability to use these extremes to make inferences about future climate. Here, following the lineage of stochastic climate modeling beginning with12 and most recently typified with respect to the Cenozoic by9, we study these extremes from a stochastic lens.

Fig. 1: δ18O over the Cenozoic (66Ma-present), from8.
figure 1

Inset is the same for the most recent 2.6 Ma.

The generalized extreme value (GEV) distribution is widely used to study such extremes in other settings13. Analogously to how the ubiquity of normal and log-normal phenomena in nature is explained by the central limit theorem14, the maxima of many natural phenomena tend to be GEV-distributed, which is explained by the extreme value theorem (Methods). The GEV distribution has three parameters μ, σ, and ξ, the last of which controls the weight of its upper tail13 (Methods). We show that the GEV distribution describes thermal extremes (i.e., δ18O minima) in the Cenozoic excellently, then utilize it to study how the magnitude of these extremes depends on baseline climate state, allowing us to project the increased likelihood of large ( > 3 standard deviations above baseline) thermal extremes as a function of cumulative emissions.

Results

The distribution of thermal extremes, as captured by standard (z−) scores of δ18O minima in blocks of consecutive δ18O values, is well-characterized as GEV-distributed (Fig. 2). The Kuiper statistic V quantifies the deviation between the theoretical and empirical distributions; here V = 0.0357, well below the threshold V5% = 0.0499 for significance at the 5% level for this sample size (a smaller V-value indicates a better correspondence between the null hypothesis of GEV distribution, and a V-value below V5% indicates a failure to reject the GEV distribution at the 5% significance level; Methods). The GEV distribution also applies for δ18O maxima (i.e., thermal minima, V = 0.0278), δ13C maxima (V = 0.0256), and δ13C minima (V = 0.0360). This result is also robust to the choice of block size (Methods). This excellent agreement suggests we can utilize the GEV distribution to characterize the rarity of individual events in terms of return levels and return periods, but more importantly, motivates the use of the GEV to investigate the possible dependency of extremes on baseline climate state.

Fig. 2: Cumulative distribution functions for standardized δ18O block minima and generalized extreme value distribution with maximum-likelihood-estimated parameters.
figure 2

Lower inset: corresponding probability density functions. Upper inset: corresponding percentiles of each distribution.

Through this lens of the GEV distribution, we investigate whether the magnitude of thermal extremes changes with the baseline climate state. We fit the GEV distribution to ‘metablocks’ of standardized δ18O minima grouped according to their associated mean δ18O values. Figure 3A shows that the shape parameter ξ decreases monotonically as baseline δ18O increases, from ξ = + 0.01 ± 0.03 when δ18O = 0 ± 0.5‰, to ξ = − 0.32 ± 0.08 when δ18O = 4 ± 0.5‰. The implication of this ξ-change is shown in Fig. 3B, which plots the GEV distribution with best-fit parameters for δ18O = 0 ± 0.5 andδ18O = 4 ± 0.5. The relative likelihood of an δ18O minimum > z standard deviations below the mean for a given z-score is captured by the ratio of these distributions’ complementary cumulative distribution functions (CCDFs). When δ18O ~ 4 as over much of the past ~ 3.5 Ma, δ18O minima with z-scores > 3 are virtually impossible/nonexistent, whereas when δ18O ~ 0 as at the boundary between the Paleocene and the Eocene, such large excursions still had some probability of occurring. We found no other significant or systematic changes in any other parameters (μ, σ, ξ) of extremes’ (maxima/minima of δ18O or δ13C) distributions as a function of baseline climate or carbon cycle state (mean δ18O or δ13C), indicating this phenomenon is restricted to the potential for large thermal maxima depending on the baseline climate state.

Fig. 3: The shape of the extremes’ distribution changes with the baseline climate state.
figure 3

A Generalized extreme value (GEV) distribution’s shape parameter ξ as a function of baseline climate state. Compare with analogous figures, including δ13C and maxima in δ18O in Fig. S1. Error bars are bootstrap median absolute deviation. B GEV probability density function with the parameters from mean δ18O = 0 ± 0.5 (solid black line) and 4 ± 0.5 (dashed black line) from (A). The orange line is the ratio of these two distributions' complementary cumulative density functions, indicating e.g., that δ18O extremes \( > 2\frac{2}{3}\) standard deviations below the mean are > 3× more likely when δ18O ~ 0 than when δ18O ~ 4. C Relative likelihood of δ18O extremes > 3z (three standard deviations below the mean) for different mean δ18O values (upper x-axis) compared to present (mean δ18O = 3.5 ± 0.25). For instance, such extremes are ~5× more likely when δ18O = 2 ± 0.25.

As the state-dependency seen in Fig. 3A is restricted to thermal maxima and does not materialize in the δ13C record, we interpret it to be driven by the state-dependency of the physical climate system rather than the carbon cycle. Thermal extremes have been interpreted as being caused by the release of isotopically depleted organic carbon into the surface environment, such as methane hydrates, permafrost, or dissolved organic carbon. Many of these thermal extremes have been shown to be accompanied by extremes in δ13C9. The lack of ξ-changes in δ13C is consistent with temperature-dependent climate feedback; when the climate is warmer, the same input of carbon produces a larger temperature change11. Temperature-dependent climate feedbacks occur in most Earth System Models, most notably due to the water vapor feedback15 though also possibly due to e.g., ice-albedo or cloud feedbacks11. While we cannot exclude the possibility that this dependency is due to carbon cycle perturbations that are balanced in their effect on organic and inorganic carbon and, therefore, not observed in the δ13C record, or an external aspect of the Earth system that co-varies with the baseline climate state such as silicate weathering, these are less parsimonious explanations given the lack of any relationship involving δ13C extremes and baseline δ18O or vice versa. Additionally, while by any analysis, the PETM is an outlier in the δ18O and δ13C record, our results help contextualize it statistically; such a large outlier was far more probable during such a warm climate state due to the far heavier tail of the thermal extreme distribution.

We can utilize the trend in Fig. 3A to estimate Earth’s increased susceptibility to large (>3z) multimillenial thermal extremes resulting from potential human emissions (Methods). Figure 3C shows the probability of large multimillenial thermal extremes increases with background warming, doubling at approximately 2 °C warming, quadrupling at approximately 5 °C warming, and sextupling at approximately 7.5 °C warming. (These relationships are approximate; this extrapolation should be taken illustratively/qualitatively.)

Altogether our results suggest that thermal extremes over the Cenozoic are more likely to be large when the baseline climate state is warmer. As similar behavior is not seen in carbon cycle extremes, this dependency is most plausibly due to the temperature sensitivity of physical climate feedback. The probability of large multimillenial thermal extremes (superimposed on anthropogenic warming) may considerably increase if a substantial portion of remaining fossil fuel reserves is combusted.

Methods

δ18O records were taken from8 (Fig. 1), along with associated δ13C records; these variables and their relationship to temperature and other aspects of the Earth system are described extensively elsewhere. As discussed in8, the average temporal resolution of these data is 2 kyr for the 0–34 Ma portion of this dataset and 4.4 kyr for the 34–67 Ma portion due to lower sedimentation rates and a lower sample resolution of the available records; this difference is not enough to affect our conclusions or justify additional manipulation of the data to generate temporally equal blocks. This implies that maxima of blocks of size 20 correspond to multimillenial-timescale maxima within <100 kyr intervals. This timescale is small relative to the dominant 405kyr periodicity described in8, and therefore such periodicity can be neglected for present purposes.

For blocks of consecutive values, the mean, standard deviation, and minima were calculated to determine the standard deviations below the mean (z-score) of the minimum δ18O value for that block. The distribution of minima’s z-scores is then fit by a generalized extreme value (GEV) distribution via maximum likelihood estimation (Matlab’s mle function). Note that the GEV is fit to sets of block minima, not to the full blocks of samples themselves. This is because the extreme value theorem states that the GEV distribution is the only possible limit distribution of properly normalized maxima of a sequence of independent and identically distributed (i.i.d.) random variables. Natural phenomena are rarely, if ever truly i.i.d., but the GEV distribution holds and is applied broadly nonetheless13, analogous to the central limit theorem holding quite accurately for only a handful of summed or multiplied random variables14. The GEV distribution has the form:

$$f(x;\mu ,\sigma ,\xi )=\frac{1}{\sigma }t{(x)}^{\xi +1}{e}^{-t(x)}$$

where f(  ) is the probability density function and

$$t(x)=\left\{\begin{array}{ll}{(1+\xi (\frac{x-\mu }{\sigma }))}^{-1/\xi }\quad &\,{{{{\rm{if}}}}}\,\xi \ne 0\\ {e}^{-(x-\mu )/\sigma }\hfill &\,{{{{\rm{if}}}}}\,\xi =0\end{array}\right.$$

so μ, and σ are the location and scale parameters, and ξ is the parameter that controls the shape of the distribution. Whether the empirical distribution of maxima deviates significantly is then determined by calculating the Kuiper statistic V, which is the maximum of the hypothesized minus empirical cumulative distribution functions plus the maximum of the empirical minus hypothesized cumulative distribution function, chosen because it gives equal weight to all portions of the distribution16. We compare V to a critical value at the 5% significance level, V5%17; the difference is not significant if V < V5%. Figure 2 uses the minimum block size of 20 from18, for which V = 0.0357 < V5% = 0.0499, and the median z-score is 1.91. Here we focus on the minimum block size because maximizing the number of blocks is useful to assess changes in the distribution’s shape as a function of baseline climate state, as this requires grouping sets of blocks into ‘metablocks.’ In general, the larger the block size, the larger the minima’s z-scores will be, and also, the larger V5% will be due to a smaller sample size of maxima. For instance, using a block size of 67 (the largest prime factor of the length of the δ18O record, 24321) yielded a V = 0.0516 < 0.0909 = V5% and a median z-score of 2.46, while a block size of 33 (another factor of 24321) yielded V = 0.0373 < 0.0640 = V5% and a median z-score of 2.17. V-values for δ18O maxima and δ13C maxima and minima reported in the main text are for the same block size of 20 and are also significant for larger block sizes.

Figure 3 A was generated by repeating this process on metablocks of block minima, where blocks were grouped by their mean δ18O values into the bins (0,1,2,3,4) ± 0.5‰. Uncertainties (shown using the robust metric of median absolute deviation) were estimated by bootstrap resampling the distribution of maxima and re-fitting the GEV distribution. We use 10,000 bootstrap iterations, which we find to be more than sufficient as ten 1000-member subsets were negligibly different. The other GEV distribution parameters (location μ and scale σ) vary negligibly, neither systematically nor significantly (p ≥ 0.33 for the block and bin sizes in Fig. 3A; this also holds for the bin sizes in Fig. 3C), with baseline climate state (i.e., across metablocks). The decreasing trend of ξ with mean δ18O holds for larger block sizes (e.g., 33 from above) or narrower bin widths (e.g., ± 0.25 from Fig. 3C. In Fig. 3B, the complementary cumulative distribution function of a probability distribution is one minus its cumulative distribution function. For Fig. 3C, we repeat the procedure to estimate the δ18O-dependence of ξ (with uncertainties) using the bins (2,2.5,3,3.5) ± 0.25. We then perform a weighted regression of ξ vs. mean δ18O to estimate ξ(δ18O) over this range, yielding an estimate of the GEV distribution for any given δ18O value between 1.75–3.5‰. (Note again that other GEV parameters do not change systematically or significantly over this range or over the entire δ18O range.) This is then used to calculate the probability density > 3 z-scores, which is shown in Fig. 3C relative to the probability density > 3 z-scores estimated for present-day δ18O = 3.5‰. We underscore that this subfigure, which includes assumed proportionalities between δ18O, global temperature change, and cumulative emissions, should be interpreted as illustrative and qualitative.

We repeated these calculations for block maxima of δ18O and for block maxima and minima of δ13C. All of these were well-characterized by GEV distributions (V < V5% in each case), but we found no evidence for any state dependence other than that reported in the main text. In other words, only the shape parameter ξ for δ18O minima was dependent on mean δ18O, and no other GEV distribution parameter of any other maxima or minima was dependent on mean δ18O or δ18C.

The glacial-interglacial cycles of the Quaternary period (2.6 Ma–present) are recognized not to follow the same sort of fluctuation characteristics as the rest of the Cenozoic, which must be accounted for in any analysis of extremes. Figures 13 exclude the last 2 Ma; neither increasing this to exclude the entire Quaternary period (2.6 Ma) nor decreasing this to exclude only after the mid-Pleistocene transition (1.25 Ma) affects the results or conclusions. Additionally, these are robust to including the Quaternary period and filtering out the glacial-interglacial cycles via robust locally estimated scatterplot smoothing (R-LOESS) with a window size of 10. We note that our interpretation of δ18O minima as thermal maxima is robust to the effects of ice volume on δ18O because ice sheets primarily act to change the slope and intercept of the linear temperature-δ18O relationship T ≈ α − βδ18O, with α, β > 0 approximately constant over the timescales of the extremes considered here. We note that excluding the PETM did not affect our results (as would be expected, as this is only one thermal maximum, and we are analyzing distributions of many thermal maxima), and therefore our inferences about PETM likelihood are not confounded by including it in our analyses.

For Fig. 3C, as a 0.22‰ change is associated with a ~1 °C temperature change19, cumulative carbon emissions so far plus remaining fossil fuel carbon resources are on the order of 5 EgC (=5 TtC = 5000 PgC = 5000 GtC)3, and 1 EgC cumulative emissions are associated with ~1.35 °C long-term warming4, we focus on the δ18O range 3.5( ± 0.25)−2( ± 0.25)‰, and estimate the ξ change over the equivalent ranges 3.5–1.75‰ δ18O, 0–8 °C temperature anomaly, and 0–6 EgC emissions.