## Introduction

The Pliocene Epoch (2.588 to 5.3 Ma) was a time when global temperatures were ~3 °C warmer than the pre-industrial1 and sea level was ~20 m higher than present2,3, largely due to the presence of smaller Greenland and Antarctic ice sheets2. Given that many other tectonic boundary conditions were similar (but not identical4) to today, the Pliocene Epoch, and the mid-Piacenzian warm period (mPWP; 3 to 3.3 Ma) in particular, are useful targets for climate model validation studies (e.g. refs. 4,5,6,7). Phase 1 of the Pliocene Model Intercomparison Project (PlioMIP) found an overall agreement between climate model simulations of mPWP surface temperatures and the available data when run with a CO2 of 405 ppm1. The North Atlantic and Pacific Oceans, however, are areas of consistently poor data-model agreement1,6. Haywood et al.4 noted that the experiment design of PlioMIP simulations precluded a determination of whether this result is attributable to poor model performance or poor data quality. This deficiency in experimental design resulted, in part, from the data collection time interval being from 3 to 3.3 Ma, which spans a number of orbital climate cycles, whereas the models were run for less than 1,000 years with an invariant modern orbit5.

To address this weakness, PlioMIP2 (phase 2)5, part of the model evaluations8 feeding into the 6th Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR6), will focus on the KM5c interglacial interval at ~3.205 Ma, which has a close-to-modern orbital configuration4. Other data compilation efforts with future data-model comparisons in mind have targeted the interval from 3.3 Ma to 3.205 Ma1,9 because this also includes M2, a prominent, anomalously cold marine isotope stage (MIS) that provides a cold-to-warm transition within the overall warm background climate state of the late Pliocene (Fig. 1). A major limitation of these latest efforts, however, is the lack of data on atmospheric CO2 during these rather narrow time intervals9,10,11. For instance, although ~40 δ11B-based determinations of CO2 are available for the 3 to 3.3 Ma window12,13,14, the M2 to KM5 transition remains poorly sampled and disagreement between datasets from different analytical techniques persists (Fig. 1). Furthermore, the available δ11B-CO2 data exhibit relatively large short-term variability, which could signal orbital cyclicity (41 kyr) aliased by low sampling resolution. Data from other CO2 proxy systems, such as stomata and palaeosol δ13C, are lacking in this interval, and it has recently been shown that the marine-based alkenone-δ13C-CO2 proxy underestimates CO2 levels in the Pliocene15,16, limiting its usefulness to providing a minimum CO2 during the mPWP of >270 ppm16.

To address this data deficiency, we developed δ11B-based CO2 estimates from Ocean Drilling Program (ODP) Site 999 in the Caribbean (Supplementary Fig. 1) at a resolution of 1 sample per 3–6 kyr for the time interval 3.15 to 3.35 Ma, encompassing the M2 glaciation and the KM5 interglacial (including KM5c). Although focusing predominantly on the mixed layer dwelling planktic species Globigerinoides ruber (45 new data points, 63 in total), we also present new measurements of Trilobatus sacculifer (2 new data points, 5 in total) on the same samples to provide a check on the consistency of the δ11B-pH calibration for G. ruber, which has recently been called into question17.

## Results

Our new high-resolution CO2 record is shown in Fig. 2 (and Supplementary Fig. 2 and Supplementary Table 2) and is consistent with earlier studies12,13,14 in showing that CO2 was higher than the pre-industrial during the mPWP (mean = 367 ppm). Our more detailed record reveals that CO2 variations ranged from $${330}_{-21}^{+14}$$ to $${394}_{-9}^{+34}$$ ppm (based on the mean and distribution of CO2 in the <25% and >75% interquartile range during the whole studied period; ref. 18), with a peak-to-trough range of 56–75 ppm determined by a Welch t-test of the data within the upper and lower quartiles (at 95% confidence; p < 0.01). From a comparison with benthic δ18O data from ODP 999 (Supplementary Table 2) and the δ18O stack19 (LR04) we observe that cold marine isotope stages (e.g. KM2; Fig. 2) are typically closely associated with low CO2 and warm stages with high CO2 levels. However, during the prominent M2 cold stage and through the warming out of M2, CO2 appears to lag benthic δ18O by ~10 kyr (Supplementary Fig. 5). This lag is not attributable to age model uncertainty because it is present when comparing δ11B-derived CO2 and benthic δ18O from the same samples (Fig. 2). CO2 during the interglacial KM5c, determined using the mean of the δ11B of the five samples in this interglacial, is estimated at $${391}_{-28}^{+30}$$ ppm (at 95% confidence). Using a broader time window (10 to 15 kyr) for the KM5 interglacial moderately alters the estimate (see Supplementary Table 1).
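The quartile-based comparison described above can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_quartiles` and `welch_t` are hypothetical, the quartile convention is simply the default of Python's `statistics.quantiles` (the convention used in the study is not stated), and the p-value step is omitted.

```python
import math
import statistics


def split_quartiles(values):
    """Return (lower, upper): values at or below Q1 and at or above Q3.

    Assumption: quartile cut points follow the default ('exclusive')
    method of statistics.quantiles.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    ordered = sorted(values)
    lower = [v for v in ordered if v <= q1]
    upper = [v for v in ordered if v >= q3]
    return lower, upper


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean (unequal variances allowed).
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

In this sketch, a CO2 series would first be passed to `split_quartiles`, and the two resulting groups to `welch_t`; the difference between the group means then corresponds to a peak-to-trough range.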

Our estimates of borate ion, pH and CO2 derived from the δ11B of T. sacculifer (without sac-like final chamber) from ODP 999 (our new data, and ref. 14) and ODP 92620 overlap well with those based on G. ruber from ODP 999 in the mPWP (Fig. 2 and Supplementary Fig. 3). Calculated CO2 does not exhibit substantive inter-species offset with a mean difference of 10 ± 29 ppm with no consistent bias towards higher or lower pH/CO2.

### Accuracy of δ11B-CO2 from G. ruber in the Plio-Pleistocene

Ref. 17 suggested that the δ11B-pH calibration of G. ruber may have evolved through time and that G. ruber may suffer from morphotype differences in δ11B-derived pH estimates for the surface Pliocene ocean, resulting in underestimates of the true pH and corresponding overestimates of true CO2. The principal evidence presented for this assertion was the disagreement between Pliocene CO2 calculated using the T. sacculifer data of Bartoli et al.12 and the G. ruber data from Martinez-Boti et al.13 (Fig. 1). As shown here, when T. sacculifer and G. ruber are compared from the same samples and measured with the same analytical technique (in this case MC-ICPMS) there is no significant offset between the methods in terms of reconstructed borate ion δ11B, pH or CO2 (Fig. 2 and Supplementary Fig. 3). This finding suggests that the G. ruber δ11B-pH calibration has not evolved through time and therefore that the pH (and hence CO2) we reconstruct here is an accurate record of the surface water carbonate system parameter. This finding indicates that disagreements between published Pliocene δ11B-based datasets12,13,20,21 are likely attributable to either: (i) sampling driven aliasing due to the relatively large short-term CO2 variability in the mPWP (e.g. Fig. 2), (ii) differences in core site location between studies (with possible different local CO2 disequilibrium), (iii) the lack of a comparison between species on exactly the same sample, or (iv) the well-documented analytical offset between MC-ICPMS and negative-ion thermal ionization mass spectrometry (NTIMS; see refs. 21,22). We note that, if the offset is attributable to analytical issues, the agreement between data generated by both methodologies for the younger Pleistocene time slices examined here (Fig. 2) confirms the suggestion of Sosdian et al.20 that the T. sacculifer dataset of Bartoli et al.12 (measured with NTIMS) requires an additional analytical correction (see ref. 16 for details), beyond that currently used for the NTIMS δ11B datasets of refs. 17,23.

### mPWP CO2 cycles – variability and causes

Our new high-resolution data set clearly demonstrates that the mPWP is an interval of relatively high CO2, with a mean of 367 ppm, whereas lower values are observed during the mid- and late-Pleistocene (262 ppm and 241 ppm18, respectively; Fig. 2). Our dense sampling permits, for the first time, an assessment of CO2 variability during the mPWP on orbital timescales, although the length of our record still precludes reliable time series analysis. The CO2 cycles (and derived climate CO2 forcing) we observe in the mPWP (including the M2 glaciation) are similar to, yet smaller in amplitude than, those evident in δ11B-CO2 data from the Late Pleistocene (LP = 0–250 kyr; ref. 18) and early mid-Pleistocene Transition (eMPT = 1050–1250 kyr; Figs. 2 and 3). The variation in climate forcing between these different intervals correlates with the magnitude of δ18O variability, although there is an apparent increased response in δ18O during the late Pleistocene (δ18O range ~ 2‰) relative to the mPWP (δ18O range ~ 0.5–1‰; Fig. 3; ref. 13), likely reflecting the increased influence of ice-volume change on δ18O following the northern hemisphere glaciation13.

We also compare the δ11B-derived CO2 data with ice-core based CO2 for the Pleistocene intervals18,23 (Supplementary Fig. 4). We expect the range of CO2 variability from the various Antarctic ice cores to be narrower than that of our marine-based records because of the greater precision associated with the ice core records (±6 ppm vs. ±20–30 ppm) and because the CO2 data from blue ice from the mid-Pleistocene may not capture a full climate cycle24,25. Despite these caveats, and as discussed elsewhere18,25, there is good agreement between the ice-core and δ11B-derived CO2, providing confidence in the accuracy of the distribution (and absolute values) we determine here for the mPWP.

Three phenomena are suggested to exert a major control over glacial-interglacial CO2 change in the Late Pleistocene record: (i) changes in stratification south of the Antarctic Polar Front influencing the ventilation of CO2-rich circumpolar deep water;26 (ii) variations in the magnitude of dust-borne Fe fertilization in the sub-Antarctic that influences the strength of the biological pump in this region;27,28 (iii) changes in the extent to which southern component deep water fills the North Atlantic increasing deep ocean carbon storage29. Ref. 30 proposed that polar waters in the North Pacific and the entire Southern Ocean (SO) were destratified prior to the onset of Northern Hemisphere Glaciation at 2.7 Ma. This mechanism may contribute to elevated CO2 during the mPWP30, but it would also rule out water mass ventilation in the high-latitude SO as an important driver of the CO2 cycles we observe in the mPWP. Both the accumulation and variability of dust-borne Fe at ODP Site 1090 (Fig. 2) were reduced during the mPWP to a fraction of that observed in the other time intervals examined (~ factor of four reduction during the mPWP, compared to the late Pleistocene), suggesting that dust supply did not play a major role in generating the observed CO2 cycles in the mPWP.

Chemical stratification of the North Atlantic Ocean, due to incursions of southern component water (SCW), during the mPWP is suggested by gradients in ɛNd and δ13C from ref. 31, albeit at reduced magnitude compared to the latest Pleistocene and eMPT. Ref. 31 showed that the SCW-signal was particularly marked after the M2 glacial maximum as shown by reduced % NCW (northern component water, Fig. 2), and also observed that ɛNd lagged δ18O by 10 kyr during the onset of the M2 glaciation, as is also evident in our new CO2 data (Fig. 2 and Supplementary Fig. 5). In general, those intervals of the mPWP where SCW contributed significantly to the waters of the deep North Atlantic were characterized by low CO2 (and vice versa; Fig. 2), yet with variable leads and lags (Supplementary Fig. 5). This association is consistent with the suggestion31,32 that glacial expansion of a CO2-charged SCW reservoir plays at least a first order role in driving orbital CO2 cycles, even prior to the onset of northern hemisphere glaciation at 2.7 Ma.

### M2 Glaciation – the role of CO2

Marine Isotope Stage M2 (at 3.3 Ma) is an anomalously cold stage clearly evident in the LR04 δ18O stack19 and many other temperature records, such as Arctic air temperature33 and sea surface temperature34,35 (including ODP Site 999; Supplementary Fig. 6). It is also often described as a failed attempt at Northern Hemisphere Glaciation34,36,37, which eventually initiated ~600 kyr later at ~2.7 Ma. Refs. 13,14 used δ11B-CO2 to suggest that CO2 dropped below 280 ppm for the first time during the intensification of Northern Hemisphere Glaciation (iNHG), consistent with the suggestion that 280 ppm CO2 is an important threshold below which extensive glaciation of continents in the northern hemisphere develops38. Our new data reveal that while the lower bound of the error envelope in our smoothed CO2 record approaches this value, atmospheric CO2 during M2 is unlikely to have crossed this threshold. Furthermore, in our records, CO2 variability associated with M2 lags δ18O by 10 kyr, which also means that minimum CO2 is not coincident with minimum northern hemisphere orbital forcing (Supplementary Fig. 5). Additional support for CO2 playing a secondary role in M2 is that other periods of low CO2 are evident in the mPWP (Fig. 2). For instance, the KM2 glaciation is clearly evident in the benthic δ18O data, Mg/Ca-SST at site 999 (Fig. 2 and Supplementary Fig. 6) and our CO2 record, but is not considered to be a major glacial interval19 (Fig. 2). This therefore suggests that other boundary conditions, such as orbital configuration39 (Supplementary Fig. 5), were perhaps dominant in triggering the M2 cold stage.

### M2 Glaciation – CO2 lags δ18O

Minor dephasing between ɛNd, δ13C and benthic δ18O has been observed in the late Pleistocene40, with carbon budget change lagging δ18O and preceding the change in ocean circulation, but with smaller lags than we observe during M2 (~2.5 kyr vs. ~10 kyr).

Similarly, while variations in atmospheric CO2 are not entirely in phase with ice volume changes over the late Pleistocene, the leads and lags are small41 (<3 kyr) and are not readily observed when comparing δ11B-derived CO2 and benthic foraminiferal δ18O time series (e.g. Fig. 2, left panels).

This therefore either requires the operation of subtly distinct carbon-cycle dynamics during the Pliocene, and M2 in particular, or implies some other driver operated to explain the observed ~10 kyr lag of CO2 during M2.

One possible non-carbon cycle driver for the observed lag at M2 could be a preservation bias in our data because partial dissolution of planktic foraminiferal tests drives δ11B towards more negative isotopic composition (lower pH, higher CO2) in some species14,42. However, while our chosen species for pH/CO2 reconstruction, G. ruber, is known to be relatively susceptible to partial dissolution on the seafloor43, its δ11B signal has been observed to be robust14,44. Furthermore, a recent study45 of test weight and fragmentation at ODP 999 showed that tests become better preserved during M2 (Supplementary Fig. 7), as observed during glacial periods of the late Pleistocene46. This is inconsistent with the high CO2 observed during the descent into M2 maximum (especially given fragmentation and δ18O are in phase, Supplementary Fig. 7), ruling out an effect of partial dissolution on our CO2 reconstruction and observed lag.

An alternative explanation for the delayed pH change observed at ODP 999 during M2 lies in local water mass changes during this glacial episode. Although the Central American Seaway (CAS) was closed to deep water by this time36, it is possible that limited exchange of surface water was still occurring around M234 and the early Pleistocene47. Temperature data from Mg/Ca in T. sacculifer34 and G. ruber at site 999 (this study, Supplementary Fig. 6; data from the East Equatorial Pacific (EEP) site 124148 are also shown for comparison) show a cooling prior to the M2 maximum. A connection between the EEP and the Caribbean, or local intensification in upwelling, could have brought cold, nutrient- and carbon-rich waters (low Mg/Ca, low pH) to Site 999, potentially explaining the apparent high CO2 at the inception of M2. However, because G. ruber is a species known to favour oligotrophic conditions49 and no decline in its abundance was noted, it seems unlikely that this cooling was accompanied by a significant influx of such water. Also, a mechanism for increased influx of EEP water during sea level regression is lacking. Importantly, during KM5c (3205 kyr), the CAS likely remained closed, impeding the influx of Pacific waters to site 999, as shown by elevated temperature (relative to the M2 interval) in the Caribbean (Supplementary Fig. 6); hence local changes in hydrography are unlikely to have unduly impacted our CO2 estimates for this central interval.

The cause of the apparent lag of CO2 compared to δ18O during the inception of M2 therefore remains uncertain, but because this lag is also present in North Atlantic ɛNd records31 we favour a carbon cycle-based interpretation. A Southern Ocean-driven explanation for the CO2 lag during M2 is perhaps indicated by the observation that the tail of the CO2 decline during M2 is out of phase with decline in northern hemisphere insolation, but apparently in-phase with insolation decline at 65°S (i.e. offset by one half precession cycle, Supplementary Fig. 5).

### Past to future – when will we exceed mPWP CO2 levels?

Atmospheric CO2 has been increasing since the industrial revolution from a background of ~280 ppm, reaching 411 ppm in 2019. The mPWP is often cited as being the last time CO2 levels were as high as today13, although we note methane and other greenhouse gases remain poorly constrained and may contribute to extra radiative forcing50.

Our refined view of CO2 during this interval, however, reveals that in terms of the mean (367 ppm), mPWP values were exceeded in the mid-1990s. Our upper quartile range ($${394}_{-9}^{+34}$$ ppm) suggests that CO2 during the mPWP is very likely to have been ≤ 427 ppm (using the distinctions of the IPCC). Atmospheric CO2 rose by 2.5 ppm from 2017 to 2018; if this rate is sustained, our new data indicate that CO2 will surpass even the highest mPWP values within the next 4 to 5 years (2024–2025).

## Conclusions

Our new δ11B data for the mPWP provide a tightly constrained and robust estimate of CO2 during KM5c of $${391}_{-28}^{+30}$$ ppm (at 95% confidence), and document CO2 variability during the mPWP from $${330}_{-21}^{+14}$$ to $${394}_{-9}^{+34}$$ ppm. This extended view suggests that changes in ocean carbon storage may play an important role in natural variability in CO2 on orbital timescales in warmer than present climate states. By better constraining the upper levels of CO2 during the mPWP we conclude that, given current rates of CO2 increase, we will very likely surpass mPWP values within 4 to 5 years, meaning that by 2024/2025 levels of atmospheric CO2 will be higher than at any point in the last 3.3 million years.

## Methods

ODP Site 999 is located in the Caribbean Sea and has been reliably used to reconstruct past atmospheric CO2 in a number of previous studies18,51,52. Air-sea disequilibrium for CO2 in the surface waters in the region of ODP Site 999 is close to 0 (+20 ppm) today and ref. 13 suggests this remained the case for at least the last 3.3 million years. An age model for the interval 3.15 to 3.35 Ma studied here was constructed by analyzing the epibenthic foraminifera Cibicidoides wuellerstorfi for δ18O in each sample, combining these data with similar data from ref. 53, and tuning the combined record to the LR04 benthic δ18O stack19 (Fig. 1). From each sample, ~120 individual tests of the mixed layer dwelling species Globigerinoides ruber white sensu stricto (300–355 μm) were hand separated, clay removed, oxidatively cleaned and analyzed for boron isotopic (δ11B) and trace element composition (e.g. Mg/Ca, Al/Ca) at the University of Southampton using well-established procedures21,52. These data were converted to pH using the G. ruber δ11B-pH calibration of ref. 51 and a modern δ11B of seawater54 (39.6‰). Sea surface temperatures (SST) were derived from each sample's Mg/Ca ratio using the calibrations of ref. 55, corrected for the changing Mg/Ca of seawater following ref. 20 based on ref. 56.
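As a rough illustration of the δ11B-to-pH step, the sketch below applies the widely used relationship between the δ11B of borate ion and pH. It is not the calibration of ref. 51: it starts from borate δ11B (so the species-specific foraminifera-to-borate calibration is skipped), and the fractionation factor (1.0272) and the fixed pK*B (8.597, roughly 25 °C and salinity 35) are assumed illustrative constants rather than values taken from this study. The function name `borate_d11b_to_ph` is hypothetical.

```python
import math

ALPHA_B = 1.0272     # assumed boron isotope fractionation factor
PKB_ASSUMED = 8.597  # assumed pK*B (about 25 degC, salinity 35); varies with T and S
D11B_SW = 39.6       # per mil, modern seawater value quoted in the text


def borate_d11b_to_ph(d11b_borate, d11b_sw=D11B_SW, pkb=PKB_ASSUMED, alpha=ALPHA_B):
    """Convert the d11B of borate ion (per mil) to pH."""
    epsilon = (alpha - 1.0) * 1000.0  # fractionation expressed in per mil
    ratio = -(d11b_sw - d11b_borate) / (d11b_sw - alpha * d11b_borate - epsilon)
    return pkb - math.log10(ratio)
```

With these assumed constants, a borate δ11B of 19.0‰ corresponds to a pH of about 8.13, and higher borate δ11B gives higher pH. In practice pK*B would be recomputed from each sample's SST and salinity.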

It was recently suggested17 that the δ11B-pH calibration for G. ruber may have varied over the last 3.5 million years. We therefore also analysed T. sacculifer (300–355 μm) from two samples from ODP 999 to combine with existing T. sacculifer data14,20,52 to provide a test of the G. ruber δ11B-pH calibration of ref. 51. Following previous studies13, uncertainties relating to the following factors, together with uncertainty in the δ11B-pH calibration, were propagated using a Monte Carlo approach (n = 10,000): temperature (±1.5 °C, 2σ), salinity (±3 psu, 2σ), δ11Bsw (±0.2‰, 2σ; refs. 20,57) and analytical uncertainty (±0.12–0.3‰, 2σ; see Supplementary Table 2).

In order to calculate atmospheric CO2 from the reconstructed surface water pH we use dissolved inorganic carbon (DIC) from the reconstructions of ref. 20. Uncertainty in this term is also propagated into CO2 uncertainty using a Monte Carlo approach (n = 10,000), but with a uniform distribution encompassing the range of the reconstructed DIC for our study interval (1765 to 1840 μmol/kg). CO2 was then determined by the maximum probability of all 10,000 realisations.
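The Monte Carlo step can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the equilibrium constants are fixed, assumed values (roughly 25 °C, salinity 35) rather than functions of reconstructed temperature and salinity, only pH and DIC are perturbed, and the sketch reports the median and a 95% interval instead of the maximum-probability value. The names `pco2_from_ph_dic` and `monte_carlo_co2` are hypothetical, and the result is in μatm, which is treated here as approximately equivalent to ppm.

```python
import random
import statistics

# Assumed, illustrative constants (about 25 degC, salinity 35).
K0 = 10 ** -1.547  # CO2 solubility, mol kg^-1 atm^-1
K1 = 10 ** -5.86   # first dissociation constant of carbonic acid
K2 = 10 ** -8.92   # second dissociation constant of carbonic acid


def pco2_from_ph_dic(ph, dic_umol_kg):
    """pCO2 (uatm) from pH and dissolved inorganic carbon (umol/kg)."""
    h = 10.0 ** -ph
    co2_aq = dic_umol_kg / (1.0 + K1 / h + K1 * K2 / h ** 2)  # umol/kg
    return co2_aq / K0  # umol/kg divided by mol/kg/atm gives uatm


def monte_carlo_co2(ph_mean, ph_sd, n=10_000, dic_range=(1765.0, 1840.0), seed=0):
    """Return (2.5th percentile, median, 97.5th percentile) of pCO2.

    pH is drawn from a normal distribution and DIC from a uniform
    distribution spanning the range given in the text.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        ph = rng.gauss(ph_mean, ph_sd)
        dic = rng.uniform(*dic_range)
        draws.append(pco2_from_ph_dic(ph, dic))
    draws.sort()
    return draws[int(0.025 * n)], statistics.median(draws), draws[int(0.975 * n) - 1]
```

Drawing DIC from a uniform rather than a normal distribution mirrors the choice described above: every value in the reconstructed range is treated as equally likely.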

To obtain CO2 during the KM5c interglacial, the mean and associated uncertainty of the reconstructed borate ion δ11B and SST of the data lying within ±7 kyr of the peak of KM5c (n = 5) were determined. In our age model we defined the peak of KM5c at 3205 kyr, although expanding the window to ±15 kyr, or allowing for uncertainty in the identification of the peak of KM5c to within 10 kyr, did not have a significant effect on our calculated mean (Supplementary Table 1).

This average was then used to calculate pH and CO2, but only considering the uncertainty in δ11Bsw and DIC detailed above. This approach was chosen because it allows the random uncertainties arising from SST and δ11B measurement error to be reduced through replication, whilst still realistically accounting for the systematic uncertainties in δ11Bsw and DIC.

In order to place our new data into a Plio-Pleistocene context, we recalculated the δ11B data of ref. 13 from 3.0 to 3.3 Ma using the same methodology outlined here and combined it with our new data. The difference in the method is the choice of the second carbonate parameter where constant modern alkalinity is used in ref. 13 whereas we use DIC20 in this study. These new and published data are then combined into a single record with an average temporal resolution of 1 sample every 4 kyr. This was then interpolated to 1 kyr resolution and a 6-point running mean was used to reduce the influence of analytical uncertainty on our reconstructed CO2 record. Uncertainty in the smoothed record was determined using the output of the original Monte Carlo simulation.
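The interpolation and smoothing steps can be sketched as below. This is a minimal illustration under stated assumptions: linear interpolation is assumed (the interpolation scheme is not specified above), the running mean is an unweighted average over consecutive points that returns only full windows, and the function names `interpolate_linear` and `running_mean` are hypothetical.

```python
def interpolate_linear(ages, values, step=1.0):
    """Linearly interpolate onto a regular grid; ages must be ascending."""
    n_points = int((ages[-1] - ages[0]) / step + 1e-9) + 1
    grid, out = [], []
    i = 0
    for k in range(n_points):
        age = ages[0] + k * step
        # Advance to the segment [ages[i], ages[i + 1]] that contains this age.
        while i < len(ages) - 2 and age > ages[i + 1]:
            i += 1
        frac = (age - ages[i]) / (ages[i + 1] - ages[i])
        grid.append(age)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return grid, out


def running_mean(values, window=6):
    """Unweighted running mean over `window` consecutive points (full windows only)."""
    return [sum(values[k:k + window]) / window
            for k in range(len(values) - window + 1)]
```

Applied in sequence, a record with one sample roughly every 4 kyr becomes a 1 kyr series, and each smoothed value then averages about 6 kyr of record.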

CO2 forcing was calculated relative to preindustrial CO2 (278 ppm) and is defined as:

$$R_{\mathrm{CO_2}}=\ln \frac{\mathrm{CO_2}}{\mathrm{CO_{2,preind}}}$$
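A minimal sketch of this calculation, following the expression above exactly as written (a dimensionless logarithmic ratio, with no scaling coefficient applied), is given below. The function name `co2_forcing` is hypothetical.

```python
import math

PREINDUSTRIAL_CO2_PPM = 278.0  # preindustrial reference given in the text


def co2_forcing(co2_ppm, preind_ppm=PREINDUSTRIAL_CO2_PPM):
    """Natural log of the ratio of CO2 to its preindustrial value."""
    return math.log(co2_ppm / preind_ppm)
```

For the mPWP mean of 367 ppm this gives roughly 0.28, and any value below 278 ppm gives a negative forcing.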