Introduction

Recent trends in global emissions suggest that the cumulative emissions budget compatible with limiting global warming to 1.5 °C, as agreed at Paris COP21 (2015) and reaffirmed at Glasgow COP26 (2021), is likely to be exceeded, with corresponding additional warming1,2,3. Hence, the latest IPCC reports4,5 include emission pathways in which reductions in greenhouse gas emissions are complemented by technologies that allow direct removal of excess atmospheric carbon dioxide (CO2)6,7,8,9. These carbon dioxide removal (CDR) technologies10 are anticipated to be available in the coming decades11,12. They are needed to counterbalance hard-to-abate emissions in the mid-term, and to achieve and sustain net negative emissions in the long term by exceeding annual residual gross emissions4,13,14. Therefore, it is essential to understand the response of the Earth system to reductions in emissions, whether achieved artificially or by enhancing the sink strength of the ocean or land biosphere15.

The Carbon Dioxide Removal Model Intercomparison Project16 (CDRMIP) provides an idealised 140-year-long 1% year−1 CO2 ramp-down simulation, followed by constant CO2, that starts from the end of a 140-year-long 1% year−1 CO2 ramp-up simulation. These simulations have allowed the hysteresis of relevant Earth system parameters to be quantified as the difference between their ramp-up and ramp-down paths (for example refs. 17,18,19,20,21,22). Hysteresis refers here to the dependence of the Earth system on its past state, not only on its current state. Ramp-up/ramp-down simulations are useful as they provide an idealised framework consistent with the definition of hysteresis, i.e., CO2 concentrations begin and finish at the same steady state. However, an experiment with more realistic levels of overshoot is needed to assess the implications of CDR in a plausible socio-economic scenario, for example, the Shared Socioeconomic Pathway scenario SSP5-3.4-OS23,24,25, which has also been used to detect hysteresis in the Earth system. SSP5-3.4-OS follows a high-emission pathway consistent with the no-mitigation scenario SSP5-8.5 until year 2040, at which point aggressive mitigation is assumed to quickly decrease CO2 concentrations, resulting in net zero emission levels by year 2080. Then, negative emissions are phased out by 2170. In the ocean, reversibility studies have mainly focused on temperature, dissolved oxygen (DO), and acidity, since ocean warming, deoxygenation, and acidification constitute notable risks to life in the world ocean26,27,28,29. However, the impacts of changes in these potential ocean ecosystem stressors after a temporary overshoot, referred to here as a consecutive increase and decrease of CO2 levels, on the resilience of marine ecosystems have so far been analysed in isolation (for example refs. 19,30), while their combined effects remain a gap in our knowledge.

Habitability of marine ecosystems depends on the rate of oxygen (O2) supply, which must exceed the resting metabolic demand. As the ocean warms in response to global warming31, organisms’ metabolic demands are anticipated to increase32,33. At the same time, ocean warming reduces the solubility of oxygen in seawater and increases vertical stratification34, together reducing DO concentrations26,35,36. To quantitatively account for the combined effects of warming and oxygen loss, the metabolic index37,38,39,40 (Φ) is defined as the ratio of oxygen supply (S) to demand (D) for a marine organism. Oxygen supply increases with ambient O2 pressure (pO2), which scales with temperature (T), and with respiratory efficacy, such that S = αS(T) B^σ pO2. Respiratory efficacy results from a per-mass rate of gas transfer between the water and the animal (αS), and it scales with body size (B^σ). Resting metabolic demand also scales with body mass (B^δ) and with absolute temperature, such that D = αD B^δ exp{−(Ed/kB)(1/T − 1/Tref)}, where αD is a taxon-specific baseline metabolic rate, kB is Boltzmann’s constant, and Tref is a reference temperature. The parameters Es and Ed (eV) represent the temperature dependence of the O2 supply and the metabolic rate, respectively. Therefore, writing ε = σ − δ for the net body-mass exponent, Φ can be expressed as

$$\varPhi =\frac{S}{D}=\frac{{\alpha }_{S}}{{\alpha }_{D}}{B}^{\varepsilon } \, p{O}_{2}\exp \left(\frac{{E}_{d}-{E}_{s}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$
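As a minimal sketch of how this expression can be evaluated, the Python snippet below computes Φ from pO2 and temperature. The function name, argument names, and all trait values are hypothetical placeholders chosen for illustration; they are not the species-specific parameters used in this study.

```python
import math

K_B = 8.617333262e-5  # Boltzmann's constant in eV per kelvin


def metabolic_index(po2, temp_k, alpha_ratio, body_mass, eps, e_d, e_s, t_ref=288.15):
    """Evaluate Phi = (alpha_S/alpha_D) * B**eps * pO2 * exp((E_d - E_s)/k_B * (1/T - 1/T_ref)).

    po2 is in atm, temperatures are in kelvin, energies are in eV.
    """
    arrhenius = math.exp((e_d - e_s) / K_B * (1.0 / temp_k - 1.0 / t_ref))
    return alpha_ratio * body_mass ** eps * po2 * arrhenius


# Placeholder traits for a hypothetical species (assumed values, not from the study).
traits = dict(alpha_ratio=20.0, body_mass=1.0, eps=0.0, e_d=0.7, e_s=0.3)

phi_ref = metabolic_index(po2=0.21, temp_k=288.15, **traits)   # evaluated at T = T_ref
phi_warm = metabolic_index(po2=0.20, temp_k=291.15, **traits)  # warmer water, less oxygen
print(phi_ref, phi_warm)
```

With Ed larger than Es, as assumed here, warming and oxygen loss both act to lower Φ, which is the combined effect the index is designed to capture.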

Using estimates of this key metabolic index computed for 72 different species across the water column (see Methods), we determine the level of hysteresis of marine ecosystems’ habitability after an overshoot. For this, we use a suite of seven Earth system model (ESM) simulations from the fifth41 (CMIP5) and sixth42 (CMIP6) phases of the Coupled Model Intercomparison Project (see Supplementary Note 1 and Supplementary Table 1) (GFDL-ESM4;43 GFDL-ESM2M;44 UKESM;45 MIROC-ES2L;46 CNRM-ESM2-1;47 ACCESS-ESM1;48 NorESM2-LM49,50). All seven ESMs follow the 1pctCO2 simulation, and four of them also follow the SSP5-3.4-OS simulation, which allows us to explore the response of Φ to a consecutive increase and decrease in atmospheric CO2 concentrations. The metabolic index has been successfully deployed in biogeography studies38, in studies of climate change impacts37,51, in assessments of past mass extinctions using fossil records39,40, and in projections of the risk of future mass extinctions of marine organisms52. Here, we aim to use it to estimate the loss of habitable ocean volume after levels of CO2 recover from overshoot.

Results

Global ocean response

Multi-model global-averaged sea surface temperature (SST) responds almost immediately to a 1% per year increase in atmospheric CO2 (blue line in Fig. 1 and Supplementary Fig. 1). In the 1pctCO2 experiment, CO2 peaks at 4×CO2 (~1140 ppm) at year 140 of the experiment. However, peak SST occurs after a seven-year lag, reaching a maximum value of ~3.1 °C above the pre-industrial level. When the pre-industrial CO2 levels are restored at model year 280, SST is still warmer (+0.7 °C) than the pre-industrial value. From model year 280 until the end of the simulation (model year 340), CO2 concentrations are kept constant at ~285 ppm. SST continues to decrease during this phase, but it is still 0.54 °C warmer than the pre-industrial initial state at the end of the experiment. In the SSP5-3.4-OS, peak CO2 (~571 ppm) occurs at year 2062, and net zero emissions are reached after the overshoot around year 320 from the beginning of the scenario. SST follows the same pattern, though with a slightly longer delay of 10 years from peak CO2. As in the 1pctCO2 simulation, SST continues to decrease after reaching net zero, being 0.09 °C warmer at the end of the scenario with respect to year 2170.
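The timing of the idealised forcing follows from compound growth: 1% per year over 140 years gives a factor of 1.01^140 ≈ 4.03, i.e., roughly a quadrupling of CO2. The sketch below builds such a concentration pathway. It assumes a starting value of 285 ppm and a ramp-down that exactly mirrors the ramp-up; the function name and defaults are illustrative and do not reproduce the exact forcing files used by the ESMs.

```python
def co2_ramp(c0_ppm=285.0, rate=0.01, ramp_years=140, hold_years=60):
    """Yearly CO2 (ppm): compound ramp-up, mirrored ramp-down, then constant CO2."""
    up = [c0_ppm * (1.0 + rate) ** year for year in range(ramp_years + 1)]
    down = up[-2::-1]                 # mirror of the ramp-up, ending back at c0_ppm
    hold = [c0_ppm] * hold_years      # stabilisation phase at the initial concentration
    return up + down + hold


series = co2_ramp()
peak_ratio = series[140] / series[0]  # growth factor reached at model year 140
print(round(peak_ratio, 2), series[280], len(series) - 1)
```

Under these assumptions the peak falls at model year 140, the initial concentration is restored at model year 280, and the series ends at model year 340, matching the experiment timeline described above.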

Fig. 1: Time series of atmospheric CO2, ocean temperature, oxygen partial pressure, and metabolic index for the global ocean over the ramp-up and ramp-down and the SSP5-3.4-OS experiments.

Thick lines illustrate multi-model mean. Shading indicates multi-model standard deviation. Vertical dashed lines indicate the time at which atmospheric carbon dioxide peaks (model year 140 and year 2062, for each experiment respectively), and the year it returns to initial values for the 1pctCO2 experiment (model year 280) and net zero is reached after the overshoot (year 2170) for the SSP5-3.4-OS experiment. Time series are shown for the sea surface, and averaged over epipelagic (0–200 m) and mesopelagic (200–1000 m) waters (top, middle and bottom panels respectively). Metabolic Index corresponds to the mean over 72 species (see Methods).

As SST increases, the capacity of surface waters to hold oxygen diminishes26,53,54. The sea surface pO2 time series (orange line in Fig. 1 and Supplementary Fig. 1) mimics that of SST, reaching a minimum after a similar lag from the CO2 peak. A similar deviation from initial conditions is observed after CO2 has returned to pre-industrial levels in the case of the 1pctCO2 simulation, and after net zero emissions are reached in the case of SSP5-3.4-OS. In the idealised scenario, this negative hysteresis is represented by a change in surface pO2 at the end of the experiment of about −1.1% with respect to initial values. In the SSP5-3.4-OS, the hysteresis is represented by a change in surface pO2 of about −0.24% at the end of the experiment with respect to year 2170.

For temperature, the delay with respect to the CO2 peak year increases in the epipelagic (0–200 m depth; Fig. 1 middle panels) and mesopelagic (200–1000 m depth; Fig. 1 bottom panels) waters for both experiments. In the 1pctCO2 simulation, the temperature peak in epipelagic waters roughly coincides with the peak year for SST but occurs much later (57 years) in mesopelagic waters, indicating a strong vertical gradient in hysteresis30. With increasing depth, some decoupling of pO2 from temperature occurs, as is evident from a slight delay in the pO2 minimum with respect to peak temperature in epipelagic waters (1 year after temperature peaks), and an earlier pO2 minimum in mesopelagic waters (~45 years before temperature peaks). Furthermore, pO2 presents slightly positive hysteresis (~1.0% higher than at the beginning of the experiment) in this depth layer, consistent with the increase of oxygen found in deep waters of the North Atlantic by Bertini and Tjiputra21. In the overshoot scenario, the temperature in mesopelagic waters continues to increase until the end of the experiment, at which point the global ocean is 0.1 °C warmer than in year 2170, when negative emissions are phased out. The minimum in pO2 also occurs at the end of the scenario, being slightly lower than at the moment net zero emissions are reached.

The Metabolic Index Φ (green lines in Fig. 1 and Supplementary Fig. 1) is positively correlated with pO2 at the sea surface (hence, negatively correlated with SST) throughout the full experiments. However, Φ hysteresis at the end of the 1pctCO2 simulation (−5.7% with respect to the initial value) almost doubles that for SST (3.0% with respect to the initial value) and triples that for pO2 (−1.1% with respect to the initial value) in absolute values. In epipelagic waters (see Supplementary Fig. 2), the level of hysteresis for Φ (−5.8% with respect to initial value) remains similar to that at the surface, with the trough occurring later (13 years) after the peak of temperature and the pO2 minimum. In mesopelagic waters, Φ’s level of hysteresis slightly increases (−7.3% with respect to the initial value). The evolution of Φ at this level differs from the changes in the other two variables; the Φ minimum occurs 14 years after the pO2 minimum but 34 years before the peak of temperature, thus reflecting a legacy of concomitant changes in temperature and pO2 on ocean habitability. In the overshoot scenario, Φ also mimics the pattern displayed by pO2. In agreement with the higher hysteresis observed in the 1pctCO2 simulation, Φ is 2.05% and 2.21% lower at the end of the experiment with respect to year 2170 at the sea surface and in the epipelagic waters, respectively. In the mesopelagic, Φ is 1.53% lower at the end of the experiment than at year 2170, as the response to the decrease in emissions is weaker than in upper waters.

Regional patterns after the overshoot

Differences between the behaviour of the global ocean during the last 30 years of the 1pctCO2 simulation (After peak) and that during 100 years of the piControl simulation before the 1% ramp-up started (Before peak) allow us to explore regional differences in the reversibility of a given parameter. Global maps of the multi-model mean of this difference (After peak − Before peak) for temperature, pO2 and Φ illustrate their deviation from pre-industrial levels (Fig. 2). In several regions, these parameters deviate from initial pre-industrial conditions at the ocean surface, in the epipelagic and, especially, in the mesopelagic depth layers, indicating that the After peak global ocean differs from the global ocean at the beginning of the experiment.
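For a single grid cell, the After peak minus Before peak difference and the agreement across models on its sign can be sketched as follows. The helper names and the toy numbers are hypothetical, and the level of agreement below which a cell would be stippled is not specified here, so the snippet only reports the agreement fraction.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def multi_model_change(before_by_model, after_by_model):
    """Return (multi-model mean difference, fraction of models sharing the majority sign).

    Each argument holds one list of yearly values per model for a single grid cell:
    `before_by_model` from the piControl reference period, `after_by_model` from the
    final years of the ramp-down experiment.
    """
    diffs = [mean(after) - mean(before)
             for before, after in zip(before_by_model, after_by_model)]
    n_pos = sum(1 for d in diffs if d > 0)
    n_neg = sum(1 for d in diffs if d < 0)
    agreement = max(n_pos, n_neg) / len(diffs)
    return mean(diffs), agreement


# Toy temperatures (deg C) for three hypothetical models at one grid cell.
before = [[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]]
after = [[1.5, 1.5], [2.5, 2.7], [-0.1, -0.1]]
mm_diff, sign_agreement = multi_model_change(before, after)
print(mm_diff, sign_agreement)
```

In this toy case two of the three models warm and one cools, so the cell would be a candidate for stippling under a strict agreement criterion.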

Fig. 2: Changes in the last 30 years of the 1pctCO2 simulation relative to a 100-year average of the piControl simulation.

Multi-model mean differences of temperature (a, °C), oxygen partial pressure (b, atm), and metabolic index (c) at the ocean surface (top panels), epipelagic (0–200 m; middle panels) and mesopelagic (200–1000 m; bottom panels) depth levels. Stippling in the maps marks disagreement across models on signs of change. Metabolic Index corresponds to the mean over 72 species (see Methods). The metabolic index scale is non-linear.

Consistently across the seven ESMs (stippling indicates disagreement in the sign of change), the world ocean will generally be warmer after the overshoot at the sea surface, except for small regions close to the Pacific sector of the Southern Ocean and the eastern coast of South Africa (Fig. 2a). Warming is also lower in the centre of the subpolar North Atlantic in the epipelagic and mesopelagic realms, where the so-called “warming hole” has been well documented in both observations and models and is typically related to a persistent slowdown of the Atlantic meridional overturning circulation55,56,57,58. The general surface pattern is maintained through the epipelagic layer, while the warming with respect to the piControl situation is higher in the mesopelagic. The hysteresis in temperature on centennial timescales is a consequence of an imbalance between the radiative forcing resulting from changes in atmospheric CO2 levels and changes in ocean heat fluxes (Supplementary Fig. 3). Around model year 230, the global ocean heat fluxes become negative, meaning that the ocean releases only part of the previously absorbed heat by the end of the simulation period59,60. Similarly, a lag in ocean heat uptake after emissions stop has been reported in recent studies (for example ref. 61).

Global pO2 maps display more complex variability (Fig. 2b). At the sea surface, temperate and tropical regions will generally have lower pO2 at the end of the experiment. In epipelagic waters, the pattern is similar to that displayed at the sea surface. In the mesopelagic, vast areas of the Atlantic and, especially, the Southern Ocean will have lower pO2. The North Pacific shows the highest increases, in agreement with Li et al.62, who found increases in DO in this region as a result of changes in ocean circulation and ventilation, which dominated over biological changes in DO concentrations.

Although Φ depends on temperature and pO2, it displays some differences in regional variability due to the non-linear relationship that ties these two parameters together (Fig. 2c). At the sea surface, bands of lower Φ are seen in the polar regions of the global ocean resulting from the combined effects of positive (warmer) and negative (less oxygen) hysteresis of SST and pO2, respectively. Some increases are seen in the Pacific sector of the Southern Ocean and part of the Arctic. This general pattern of habitability loss denoted by the decreasing values of Φ is largely reproduced in the epipelagic and mesopelagic. In the mesopelagic layer, a strong increase of Φ occurs in the subpolar North Pacific associated with the strong increase in pO2, although model confidence in the mesopelagic Pacific is low, as indicated by the stippling. Globally, however, there is a net loss of habitability that increases with depth, as already seen in Fig. 1.

Supplementary Fig. 4d–f show changes between the global ocean after and before CO2 peaks during the SSP5-3.4-OS. Although the two experiments differ in the length and magnitude of the overshoot, regional changes in temperature, pO2 and Φ show some similar patterns at the sea surface and in the epipelagic layer. The main differences occur in the North Atlantic, where upper waters are cooler in the “warming-hole” region after the overshoot in the SSP5-3.4-OS. The pattern of changes in temperature in the mesopelagic broadly agrees with that found for the 1pctCO2, with larger differences in pO2 and Φ. In these mesopelagic waters, most regions, including the North Pacific, present lower values of pO2. Similarly, ocean habitability, as indicated by Φ, is reduced after the overshoot except in some regions of the tropical Pacific, Indian, and North Atlantic.

Drivers of changes in Φ

To better understand the processes driving changes of Φ, we have decomposed it into changes driven by the saturation concentration of oxygen (O2,sat) and those driven by changes in biological and ocean dynamical processes (see Methods). Here we assume that the oxygen concentration in the ocean is the result of (1) its solubility, which depends on water mass temperature and salinity63, and (2) biological activity and ocean dynamics, which can be approximated by the apparent oxygen utilisation (AOU), a measure of the oxygen consumed in a water parcel since its last contact with the atmosphere. Therefore, Φ can be computed using the O2,sat partial pressure to extract the component that depends on it (Φsat). Then, by subtracting Φsat from Φ, we obtain the component that depends on AOU (ΦAOU).
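The decomposition can be illustrated with a short sketch. Because Φ is linear in pO2, evaluating it once with the actual pO2 and once with the saturation pO2 gives Φ and Φsat, and their difference gives ΦAOU. Here `a0` lumps together the ratio αS/αD and the body-mass term, and `e0` stands for Ed − Es; both, like the function names and input values, are hypothetical placeholders rather than values from this study.

```python
import math

K_B = 8.617333262e-5  # Boltzmann's constant in eV per kelvin


def phi_from_po2(po2, temp_k, a0=20.0, e0=0.4, t_ref=288.15):
    """Metabolic index for a given O2 partial pressure (atm) and temperature (K)."""
    return a0 * po2 * math.exp(e0 / K_B * (1.0 / temp_k - 1.0 / t_ref))


def decompose_phi(po2, po2_sat, temp_k, **traits):
    """Split Phi into a saturation component and an AOU component."""
    phi = phi_from_po2(po2, temp_k, **traits)
    phi_sat = phi_from_po2(po2_sat, temp_k, **traits)  # Phi if the water were O2-saturated
    phi_aou = phi - phi_sat                            # residual attributed to AOU
    return phi, phi_sat, phi_aou


# Undersaturated water parcel: actual pO2 is below its saturation value.
phi, phi_sat, phi_aou = decompose_phi(po2=0.15, po2_sat=0.21, temp_k=283.15)
print(phi, phi_sat, phi_aou)
```

In undersaturated water the AOU component is negative, so it reduces Φ relative to the value expected from solubility alone.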

Multi-model mean maps of After peak − Before peak differences during the idealised experiment for these two components of Φ (Fig. 3a–b) highlight that negative hysteresis of Φ is mainly driven by the O2,sat component in most parts of the upper (surface and epipelagic layers) world ocean, in close agreement with positive hysteresis of temperature (residual warming). In contrast, the AOU component is smaller but carries a spatial imprint of the change in AOU. AOU components of Φ display negative hysteresis in the subpolar regions of both hemispheres across depth, and positive hysteresis in temperate and tropical regions. Positive hysteresis of the AOU components of Φ is also shown in the coastal region of the Antarctic. The magnitude of the hysteresis of both components of Φ increases with depth.

Fig. 3: Changes in the O2,sat and AOU components of the metabolic index.

Multi-model mean differences of the metabolic index at the ocean surface (top panels), epipelagic (0–200 m; middle panels) and mesopelagic (200–1000 m; bottom panels) depth levels. Metabolic index is decomposed into O2,sat (a) and AOU components (b). Magenta/green indicates lower/higher metabolic indexes at the end of the experiment. A map of the absolute dominant effect of both components is included (c). Maps indicate differences in the last 30 years of the 1pctCO2 simulation relative to 100 years-average of piControl simulation. Blue colours indicate the dominance of O2,sat components. Orange colours indicate the dominance of AOU components. Lighter colours indicate the absolute contribution of both components is similar ([−0.25, 0.25]). Metabolic Index corresponds to the mean over 72 species (see Methods).

Differences in these two components of Φ after and before the overshoot simulated in the SSP5-3.4-OS simulation roughly coincide with those simulated in the 1pctCO2 experiment in the three vertical layers considered, as shown in Supplementary Fig. 5d–e. Changes in O2,sat dominate positive changes of Φ in the North Atlantic upper waters, in agreement with the cooling of the region after the overshoot (see Supplementary Fig. 4). In contrast, positive changes in AOU, as observed for the 1pctCO2 simulation in most temperate waters in the mesopelagic, reduce as a function of latitude in the northern and southern hemispheres.

We have computed the dominant component of Φ throughout the water column (Fig. 3c and Supplementary Fig. 5c, f) by computing the differences between their absolute values. If the differences are smaller than ±0.25, we consider neither component dominant (see Methods). In general, O2,sat components drive most of the changes in Φ during the 1pctCO2 simulation, except for regions where old waters accumulate and, consequently, the AOU component dominates, such as in the mesopelagic layer of the North Pacific. Overall, there is no clear dominance in tropical regions at all depth levels, except for the Atlantic, where O2,sat components dominate in mesopelagic waters. The patterns for the dominance of O2,sat vs AOU components for the SSP5-3.4-OS (Supplementary Fig. 5f) are almost identical to those observed for the 1pctCO2 simulation for the upper layers. In contrast, the regions with no clear dominance are larger than in the 1pctCO2 simulation in the mesopelagic waters. The effects of AOU components are not dominant in this scenario. Using lower thresholds to define regions with no clear dominance also makes the O2,sat component dominant in tropical regions (see Supplementary Figs. 6–8).
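The dominance criterion described above can be written as a small function applied to each grid cell. This is a sketch under the stated ±0.25 threshold; the function name and the returned labels are illustrative choices.

```python
def dominant_component(d_phi_sat, d_phi_aou, threshold=0.25):
    """Label the component with the larger absolute change in Phi.

    d_phi_sat, d_phi_aou: After peak minus Before peak changes of the O2,sat and
    AOU components. If the difference between their absolute values lies within
    +/- threshold, neither component is considered dominant.
    """
    diff = abs(d_phi_sat) - abs(d_phi_aou)
    if diff > threshold:
        return "O2sat"
    if diff < -threshold:
        return "AOU"
    return "neither"


print(dominant_component(-1.0, 0.2))   # saturation-driven loss
print(dominant_component(0.1, -0.9))   # AOU-driven loss
print(dominant_component(0.5, -0.4))   # comparable contributions
```

Lowering `threshold` shrinks the set of cells labelled as having no clear dominance, which is the sensitivity explored in the supplementary figures.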

Loss of marine ecosystem habitability

For organisms to carry out aerobic metabolic activities, oxygen supply must at a minimum meet the resting metabolic demand. While in theory the balance between oxygen demand and supply occurs when Φ reaches unity, that is, the resting rate (Φrest; Φ = 1), minimum values of Φ between 2 and 7 are observed in viable marine ecosystems37. These values are thus considered as a critical metabolic index (Φcrit) that reflects the Φrest plus additional energy to undertake key ecological activities such as feeding, growth and reproduction38,51. We track the fractional change of the global ocean volume of the epipelagic and mesopelagic waters at which Φ values are (1) lower than Φrest, (2) between Φrest and Φcrit, and (3) higher than Φcrit during the experiments (Fig. 4), and with respect to global warming (Supplementary Fig. 9). A compilation of 72 species (see Methods and ref. 38) is used to compute quantiles of fractional change in ocean habitable volume. This compilation cannot be considered an exhaustive representation of all marine species, but it includes species of fishes, crustaceans, and molluscs, most of which are of economic relevance, living across a wide range of depths. Habitability is thus discussed in reference to this broad subset of marine species.

Fig. 4: Change in ocean habitability volume.

Model-mean time series of the fractional change of global ocean volume where Φ < Φrest (non-viable zone; a, d), Φrest ≤ Φ < Φcrit (critical zone; b, e), and Φ ≥ Φcrit (viable zone; c, f). Time series for 1pctCO2/SSP5-3.4-OS experiments are shown in the upper/lower panels. For 1pctCO2, the fractional change is computed relative to a 100-year average of the piControl simulation. For SSP5-3.4-OS, the fractional change is computed relative to the years 2035 to 2055, just before the peak in CO2 concentrations. Time series are shown for epipelagic (0–200 m; top panels) and mesopelagic (200–1000 m; bottom panels) depth layers. Red solid (dashed) lines illustrate the mean (median) over 72 species and seven (1pctCO2)/four (SSP5-3.4-OS) ESMs. Red shading indicates the multimodel’s 72 species’ full distribution: 1, 2.5, 5, 10, 17, 25, 75, 83, 90, 95, 97.5 and 99th percentiles. ‘Vol’ indicates the fraction that each type of water (non-viable, critical and viable zones) represents of the total volume of the epipelagic and mesopelagic layers at the reference periods. ‘Δvol’ indicates the level of hysteresis, depicted as the difference between ocean habitability volume over the last 10 years of the simulations and that of the reference period for each simulation.

We follow the evolution of the global ocean volume at which Φ is below the resting rate (Φ < Φrest), that is, the volume of the world ocean that is unsuitable for supporting resting aerobic metabolism. We call this zone the ‘non-viable zone’. Further, we follow the evolution of the ‘critical zone’ characterised by Φ values within the resting and the critical rates (Φrest ≤ Φ < Φcrit). These waters can be considered as transitional since some marine organisms may develop acclimatisation or genetic adaptation strategies to cope with low O2 levels (see for example ref. 64). Finally, we follow the evolution of the ‘viable zone’, which corresponds to the volume of the global ocean above the critical rate (Φ ≥ Φcrit), where most organisms can thrive.
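The three zones defined above amount to a simple threshold classification followed by a volume-weighted sum. The sketch below assumes a flat list of grid-cell Φ values with matching cell volumes; the function names, the example Φcrit of 3, and the toy volumes are hypothetical.

```python
def classify_zone(phi, phi_crit, phi_rest=1.0):
    """Assign a grid cell to the non-viable, critical, or viable zone."""
    if phi < phi_rest:
        return "non-viable"   # Phi < Phi_rest
    if phi < phi_crit:
        return "critical"     # Phi_rest <= Phi < Phi_crit
    return "viable"           # Phi >= Phi_crit


def zone_volume_fractions(phi_values, cell_volumes, phi_crit, phi_rest=1.0):
    """Fraction of the total volume occupied by each zone."""
    totals = {"non-viable": 0.0, "critical": 0.0, "viable": 0.0}
    for phi, volume in zip(phi_values, cell_volumes):
        totals[classify_zone(phi, phi_crit, phi_rest)] += volume
    total_volume = sum(cell_volumes)
    return {zone: volume / total_volume for zone, volume in totals.items()}


# Five toy grid cells for one hypothetical species with Phi_crit = 3.
fractions = zone_volume_fractions(
    phi_values=[0.5, 1.0, 2.9, 3.0, 6.0],
    cell_volumes=[1.0, 1.0, 2.0, 3.0, 3.0],
    phi_crit=3.0,
)
print(fractions)
```

Repeating this for each species, model, and year yields the time series of zone volumes whose fractional changes are reported in Fig. 4.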

In the upper 200 m of the water column (Fig. 4, top row), the global ocean’s volume of non-viable zone waters represented ~8% of epipelagic volume at the beginning of the 1pctCO2 simulation, and it increases to be 28.8 ± 1.9% larger after the overshoot. As already observed for marine biodiversity heat stress65, the increase in the volume of non-viable waters scales rapidly with global warming (Supplementary Fig. 9), while the global cooling flattens during the ramp-down, slowing down the recovery. This is possibly caused by the long response timescale of the ocean to reductions in atmospheric CO2, especially when the overshoot is large and persists for a long period62. The critical zone also increases from the initial volume (~36%), showing a hysteresis of 9.1 ± 0.2% at the end of the experiment. The rate of loss of these waters is also slightly larger during the ramp-up phase of the experiment than during the ramp-down phase. The volume of habitable waters, that is, those at which Φ is higher than Φcrit, decreases steadily during the ramp-up, reaching a loss of >16% at its minimum around model year 150. These waters represented ~56% of the total volume of epipelagic waters and declined by ~5.2% at the end of the experiment with respect to the volume at the start of the experiment (Fig. 4c).

In the mesopelagic waters (Fig. 4, second row), the mean loss in habitability is smaller than in epipelagic waters. At the beginning of the experiment, the non-viable and the critical zones represented ~13% and ~29% of the mesopelagic volume, respectively. At this depth level, there is almost no overall expansion in volume of the non-viable zone (1.2 ± 0.3%), while the initial critical zone increases by 6.0 ± 0.2%. However, the distribution of both ocean volumes increases to up to 150%, almost doubling the initial volume. The viable zone represented ~58% of the volume of mesopelagic waters, with a 4.6 ± 0.1% loss at the end of the experiment. As with non-viable and critical zones, the viable zone distribution indicates much higher losses of ocean volume (of up to 70%) with respect to the initial volume.

We carry out the same exercise for the SSP5-3.4-OS. However, as CO2 concentrations do not return to initial conditions, we modify the approach slightly by selecting the period just before the overshoot (years 2035–2055) as reference. In the upper 200 m of the water column (Fig. 4, third row), the global ocean’s volume of non-viable zone waters represented ~9% of epipelagic volume at the baseline period, increasing to be 12.7 ± 2.1% greater at the end of the scenario. This is also the case for the critical zone, which increases by 1.1 ± 0.2% from the baseline volume (~38%). Habitable waters reduce their baseline volume (~53%) by about 1.2 ± 0.1%. Interestingly, this loss occurs after the volume of the viable zone returns to about pre-overshoot conditions as CO2 levels decrease.
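The fractional change relative to a reference period can be sketched as below, taking the mean over the last 10 simulated years as the end state. The function name, the synthetic volume series, and its end year are illustrative assumptions and do not reproduce model output.

```python
def fractional_change(volume_by_year, ref_years, last_n=10):
    """Percentage change of the mean volume over the last `last_n` years
    relative to the mean volume over the reference years."""
    ref = sum(volume_by_year[y] for y in ref_years) / len(ref_years)
    end_years = sorted(volume_by_year)[-last_n:]
    end = sum(volume_by_year[y] for y in end_years) / len(end_years)
    return 100.0 * (end - ref) / ref


# Synthetic non-viable volume (arbitrary units): 100 through 2055, 112 afterwards.
volume = {year: (100.0 if year <= 2055 else 112.0) for year in range(2035, 2301)}
delta_vol = fractional_change(volume, ref_years=list(range(2035, 2056)))
print(delta_vol)
```

A positive value for the non-viable zone, or a negative value for the viable zone, indicates a net loss of habitable volume relative to the pre-overshoot baseline.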

In the mesopelagic (200–1000 m, Fig. 4, fourth row), habitability volume loss is slightly higher than in upper waters, in contrast to what is observed for the 1pctCO2 simulation, since the timescales of the two overshoots are different. The volumes of the non-viable and critical zones represented ~14% and ~31% of the mesopelagic volume, and increase by 12.2 ± 0.3% and 5.9 ± 0.1%, respectively, at the end of the simulation. This loss in habitability is also reflected in a decrease in the viable zone, which represented ~55% and declined by 4.9 ± 0.1% at the end of the experiment. The mesopelagic time series until year 2170 shows behaviour similar to that observed during 1pctCO2, suggesting that the overshoot signal takes a long time to propagate into deep waters. In contrast, the volume of viable waters in the epipelagic realm follows the CO2 concentration more closely.

Changes in the volume of waters that can or cannot hold marine habitats for the 72 species listed here differ regionally. For the 1pctCO2 experiment, we have computed the global ocean multi-model After peak − Before peak (as defined for Figs. 2–3) differences in volume for the epipelagic and mesopelagic depth layers for each of the three types of waters (Fig. 5 upper two rows). In epipelagic waters, the volume of non-viable waters increases in both the tropical and temperate regions. The gain in critical waters occurs mainly in temperate regions, compensated by losses in the tropics. The map showing the differences in the viable zone displays an overall loss in habitability across the global ocean except for regions such as the so-called warming hole (see above) and the North Pacific. In the mesopelagic, the pattern of changes in viable volumes is amplified. For example, the tropical Atlantic will lose habitable volume through the increase of non-viable waters and the decrease of the viable zone, but the North Pacific will gain habitable volume as pO2 is higher (see Fig. 2) by the end of the experiment. The volume of non-viable waters strongly increases in the subtropical Pacific. However, a concomitant increase in the critical zone volume may indicate that those organisms capable of coping with low levels of oxygen may still thrive in this region after the simulation period.

Fig. 5: Regional changes in ocean habitability volume.

Multi-model mean differences of the global ocean volume where Φ < Φrest (non-viable zone; a, d), Φrest ≤ Φ < Φcrit (critical zone; b, e), and Φ ≥ Φcrit (viable zone; c, f) at epipelagic (0–200 m; top panels) and mesopelagic (200–1000 m; bottom panels) depth layers. For 1pctCO2 (upper two rows), maps indicate differences in the last 30 years of the experiment relative to a 100-year average of the piControl simulation. For SSP5-3.4-OS (lower two rows), maps indicate differences after CO2 peak (years 2080 to 2100) relative to before CO2 peak (years 2035 to 2055). Metabolic Index corresponds to the mean over 72 species (see Methods).

For the SSP5-3.4-OS, we have computed regional differences in the habitable volume between after and before overshoot, i.e., years 2080–2100 minus years 2035–2055 (Fig. 5 lower two rows). Regional patterns coincide with those observed for the 1pctCO2 simulation in epipelagic waters, though relative changes are weaker. In mesopelagic waters, main differences occur in the North Pacific as the increase in oxygen supply after recovering from the overshoot is not simulated under the SSP5-3.4-OS scenario (see Supplementary Fig. 4), leading to net loss in viable waters in this region.

Reversibility and long-term commitment

We can now discuss the level of (ir)reversibility of temperature, pO2, and Φ by tracking the deviation from initial values and the trends over the last decades of the 1pctCO2 experiment for each parameter (see Methods). We have combined deviation-slope pairs for each ESM and depth level in a single diagram (Fig. 6). According to recent studies (for example refs. 62,66), several processes linked with ocean warming and deoxygenation are largely reversible (within centuries) at the sea surface. At the same time, they will take centuries to millennia to recover at depth21,67,68,69,70,71. To determine whether this is the case in our study, we also estimate the same deviation-slope pairs arising from internal climate variability. A change is considered irreversible at centennial timescales for a given ESM, depth level (and species in the case of Φ) if the slope of the last 30 years of the model simulations falls within the range of the internal variability whereas its deviation does not (light red shading boxes). A change is considered reversible if the deviation falls within internal variability (light green shading boxes).
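The classification rule can be sketched as follows. The snippet assumes that the trend is an ordinary least-squares slope over the last 30 years, that the deviation is the mean of those years minus the initial value, and that the range of internal variability is the minimum-to-maximum span of slope and deviation values sampled from piControl. These choices, the function names, and the toy numbers are assumptions for illustration; the exact procedure is given in Methods.

```python
def ols_slope(values):
    """Least-squares slope of `values` against their index (units per year)."""
    n = len(values)
    x_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def classify_reversibility(series, initial_value, ctrl_slopes, ctrl_devs, n_last=30):
    """Label a time series as reversible, irreversible, or still adjusting."""
    last = series[-n_last:]
    slope = ols_slope(last)
    deviation = sum(last) / len(last) - initial_value
    slope_inside = min(ctrl_slopes) <= slope <= max(ctrl_slopes)
    dev_inside = min(ctrl_devs) <= deviation <= max(ctrl_devs)
    if dev_inside:
        return "reversible"        # deviation within internal variability
    if slope_inside:
        return "irreversible"      # no remaining trend, but still offset
    return "still adjusting"       # both trend and deviation outside variability


ctrl_slopes, ctrl_devs = [-0.01, 0.01], [-0.2, 0.2]   # toy piControl ranges
offset = classify_reversibility([1.0] * 30, 0.0, ctrl_slopes, ctrl_devs)
cooling = classify_reversibility([2.0 - 0.05 * i for i in range(30)], 0.0, ctrl_slopes, ctrl_devs)
recovered = classify_reversibility([0.1] * 30, 0.0, ctrl_slopes, ctrl_devs)
print(offset, cooling, recovered)
```

The third label corresponds to cases described in the text as still adjusting, where neither the trend nor the deviation has returned to the range of internal variability.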

Fig. 6: Trend-deviation diagram.

Last thirty-year trend (years 311–340, x axis) and deviation from the initial value (y axis) of the 1pctCO2 experiment for T (a, d, g), pO2 (b, e, h) and metabolic index (c, f, i) at the sea surface (top panels), epipelagic (0–200 m; middle panels) and mesopelagic (200–1000 m; bottom panels) depth levels for each ESM (bold dots). Trend-deviation pairs are also computed for 25 randomly sampled 30-year periods from the piControl simulations of each ESM. Light green rectangles indicate reversibility on centennial timescales as trends fall within piControl internal variability. Light red rectangles indicate irreversibility at longer timescales as the slope is within piControl internal variability, while deviation falls outside piControl internal variability. The median over 72 species is displayed (bold dots) for each ESM for metabolic index. Slope and deviation for pO2 are in 10^−3 atm.

In epipelagic waters, ocean temperature does not return to initial values after the temperature overshoot within the experiment timescale. Ocean temperatures are still adjusting in mesopelagic waters, as indicated by both trend and deviation being far from the reference values defined by piControl. This indicates that the time to recover in deeper water is proportional to the strength of the overshoot, consistent with the literature30,62. Similarly, trend-deviation pairs indicate irreversible behaviour for pO2 in both surface and epipelagic waters. Multi-model uncertainty on pO2 (ref. 72) results in positive and negative deviations in the mesopelagic realm. Nonetheless, ESMs in which global mesopelagic pO2 increases or decreases after the experiment show irreversibility at centennial timescales except for MIROC-ES2L, for which the trend and deviation fall within natural variability.

Figure 6c shows each ESM’s slope/deviation pairs for the 72-species median Φ. In general, all ESMs show irreversible behaviour at the sea surface and in the upper 200 m. In addition, in some models, particularly CNRM-ESM2-1 and UKESM-1-0-LL, Φ is still adjusting. In mesopelagic waters, the number of ESMs for which Φ is still adjusting increases, as seen in ACCESS-ESM1-5 and NorESM2-LM.

Unlike the 1pctCO2 simulation, in which CO2 levels return to initial pre-industrial conditions, the SSP5-3.4-OS scenario simulation reaches a different state after recovering from the overshoot. We make use of the extension of the SSP5-3.4-OS after the overshoot to give insights into the regional reversibility of the ocean’s habitability. Figure 7 shows the regional changes in ocean habitable volume in the last 20 years of the experiment with respect to the period just after the CO2 overshoot (years 2080–2100). In epipelagic waters, the volume of habitable waters generally increases after the overshoot, as opposed to the decrease in habitability observed when comparing after and before the overshoot (Fig. 5). However, even ~200 years after the overshoot, the recovery of ocean habitability remains relatively weak. Changes are stronger at depth. In North Pacific mesopelagic waters, the pattern shifts from losing to gaining habitability through an increase in viable waters and a decrease in non-viable waters. A similar finding was observed for the After peak minus Before peak differences of the idealised experiment (Fig. 5), in which an increase in pO2 in the region drove the increase in habitability. In the tropical regions, the volume of the critical zone decreases. In contrast, the volume of non-viable waters increases in the Southern Ocean, limiting habitability in that important region. Altogether, these maps suggest that the volume of habitable waters increases when CO2 levels recover from the overshoot. However, although the response to the atmospheric CO2 forcing is more rapid in upper waters, this recovery may take centuries to accomplish. Furthermore, many deeper ocean regions will continue to lose habitability, reflecting the legacy of the global warming overshoot propagating into deep waters.

Fig. 7: Regional changes in ocean habitable volume after the overshoot simulated in SSP5-3.4-OS.

Multi-model mean differences of the global ocean volume where Φ < Φrest (non-viable zone; a), Φrest ≤ Φ < Φcrit (critical zone; b), and Φ ≥ Φcrit (viable zone; c) in epipelagic (0–200 m; upper panels) and mesopelagic (200–1000 m; lower panels) waters. Maps indicate differences in the last 20 years of the experiment relative to years 2080 to 2100, immediately after the peak of CO2 concentrations. The metabolic index corresponds to the mean over 72 species (see Methods).

Summary and implications

The experiments presented here allow us to study the ocean system’s reversibility after hypothetical global warming overshoot levels in the Earth system. Through the evolution of a metabolic index (Φ), this study provides insights into the marine organisms’ ecophysiological response to changes in ocean temperature and pO2. Using a suite of ESMs, we have shown that the level of hysteresis in the upper layers of the ocean may be higher for Φ than for its drivers (Figs. 1, 6), potentially affecting the resilience of marine habitats if CO2 levels were to decline. As pO2 is directly related to temperature for a given O2 concentration, the limited solubility of O2 as the ocean remains warmer than during the pre-industrial period after the overshoot (Fig. 2) mainly drives this negative hysteresis (Fig. 3).

Moving into the ocean interior, the lagged response of temperature results in a warmer global ocean after the overshoot (Fig. 2). Though the global hysteresis of pO2 reverses in the ocean interior during the 1pctCO2 simulation (Supplementary Fig. 2), there are regional differences: in the Indian and Pacific Oceans, increases in pO2 are simulated in the mesopelagic realm, while in the Atlantic and Southern Oceans, pO2 decreases after the overshoot. In these deep waters (200–1000 m), values of Φ decrease in the global ocean, except in the North Pacific. Changes in sea ice cover and vertical stratification due to warming (Fig. 2) may amplify vertical mixing, thereby supplying oxygen to the deeper layers of the ocean. The hysteresis of pO2 is less obvious for the SSP5-3.4-OS scenario (Supplementary Fig. 4) as the overshoot is shorter and smaller in magnitude; hence, the warming is not intense enough to produce such large perturbations.

The decrease of Φ after the temperature overshoot suggests that the capacity of the ocean to hold viable marine ecosystems also declines. Our analysis indicates that the global ocean will lose on average 4–5% of habitable waters above 1000 m depth after the overshoot during the 1pctCO2 simulation, and 1–5% on average during the SSP5-3.4-OS. Furthermore, the volume of waters whose level of oxygen is insufficient to maintain life increases significantly (~13% and ~29% for the SSP5-3.4-OS and the 1pctCO2, respectively) in the upper 200 m, where most of the biota live73, limiting the resilience of shallow marine ecosystems. In these upper waters, habitable waters will be lost across the global ocean, and marine organisms’ resilience will be impacted. In mesopelagic waters, even though the magnitude of habitable volume loss is higher in the Atlantic, it is offset by strong gains in the North Pacific (Fig. 5). The magnitude of these gains is lower in the SSP5-3.4-OS, as the ocean state does not return to initial conditions but moves into a new quasi-stable state for at least two hundred years (Fig. 7).

Because Φ depends on temperature and pO2 extracted from ESM simulations, it is worth noting that its values are sensitive to model biases. Nonetheless, and irrespective of the ESM considered, the multi-model response of the metabolic index computed for 72 different species is a robust and consistent feature that provides evidence of a loss in ocean habitability after an overshoot.

These global-warming-induced alterations of the habitability of marine ecosystems are likely to require centuries to recover after the peak in CO2 concentrations, even if the anthropogenic climate forcing is removed, that is, by returning CO2 concentrations to pre-industrial levels (Fig. 6). These timescales may also be longer than the time needed for temperature and pO2 to return to pre-industrial levels in the upper layers of the ocean. As our analysis considers an idealised overshoot scenario consisting of a rapid increase and decrease in atmospheric CO2, a real-world overshoot and the implementation (if any) of negative emissions following CDR deployment would be expected to take much longer. Under a more comprehensive overshoot scenario, the loss of habitability starts to halt after the overshoot, yet the recovery takes a long time even though the magnitude and extent of the atmospheric CO2 overshoot are lower. Together, this means that the hysteresis and recovery times estimated in this study are likely underestimated.

Using an ecophysiological index that combines aerobic metabolism and cold temperature tolerances of global marine biodiversity, Penn and Deutsch52 quantified the risk of extinction for marine organisms as global warming continues during the present century. In an approach equivalent to the habitability volume evaluated here, they assessed local extirpations as occurring wherever Φ falls below Φcrit, and found larger impacts in the tropics. Our results agree, suggesting that the volume of non-viable waters after an overshoot will increase in the tropical oceans, where species are living closer to their limits (see also ref. 74). Furthermore, a mechanistic model developed by Deutsch et al. 51 suggests that a reduction in Φ may be compensated by a reduction in body size for organisms whose metabolic demand increases with temperature. This implies that these species may still be capable of occupying regions in which Φ falls below critical values, at the expense of their body size. In addition, the application of novel indices that are not based on experimentally derived temperature-dependent O2 thresholds, such as the Aerobic Growth Index75,76, or the use of ecophysiological traits defined to represent marine biodiversity (see ref. 39,52), has the potential to expand our framework to a wider range of marine species. Nonetheless, the set of available ecophysiological parameters used here provides a novel mapping for economically relevant species that can be found in the global ocean and across a wide range of depths.

Notwithstanding the above-mentioned limitations, we show that the combined effect of ocean warming and deoxygenation after a temperature overshoot will impact ocean habitability by increasing the volume of waters that are unsuitable for supporting marine ecosystems at the expense of those waters that can hold them. We found that this emerging picture is robust across a range of idealised and comprehensive overshoot simulations. Even if temperature returned to pre-industrial levels, as simulated in the idealised experiment, these habitability losses will likely be felt for centuries. The lagged effects of global warming on marine ecosystems after CO2 levels recover from overshoot found here underline that mitigation actions should be promptly implemented, as any delay in stabilising climate is highly likely to have an irreversible impact on marine ecosystems and the vital ecosystem services they provide.

Methods

Simulations

The CDRMIP cdr-reversibility experiment (1pctCO2-cdr) is fully described in Keller et al. 16, and only the main characteristics are briefly described here. It is a highly idealised experiment that investigates the reversibility of the climate system by building on the prescribed 1% yearly increase in atmospheric CO2 concentrations (ramp-up) from pre-industrial levels (~285 ppm) until quadrupling of CO2 concentrations, as designed for the 1pctCO2 standard DECK run for CMIP642,77. From this concentration, it prescribes a symmetric 1% yearly decline in atmospheric CO2 concentrations (ramp-down) until reaching initial conditions (pre-industrial level), at which point concentrations are held constant for an additional 60 years. The peak atmospheric CO2 concentration of ~1140 ppm is reached after 140 years of the experiment. Therefore, 280 years of the combined ramp-up and ramp-down experiments are considered in addition to 60 years of constant concentrations.
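As an illustration of the prescribed forcing, the following minimal Python sketch reconstructs the idealised CO2 pathway from the numbers given above. The function name and its arguments are hypothetical, and the ramp-down is assumed to mirror the ramp-up exactly; the authoritative protocol is the one described in Keller et al. 16.

```python
def prescribed_co2(year, c0=285.0, rate=0.01, ramp_years=140, hold_years=60):
    """Idealised CO2 concentration (ppm) at a given simulation year.

    Assumes compound 1% per year growth for `ramp_years`, a mirrored
    decline over the same number of years, then constant `c0`.
    """
    total_years = 2 * ramp_years + hold_years
    if year < 0 or year > total_years:
        raise ValueError("year lies outside the experiment")
    if year <= ramp_years:
        # Ramp-up: compound growth from the pre-industrial level
        return c0 * (1.0 + rate) ** year
    if year <= 2 * ramp_years:
        # Ramp-down: mirror image of the ramp-up (assumption)
        return c0 * (1.0 + rate) ** (2 * ramp_years - year)
    # Final period: concentration held at the pre-industrial level
    return c0


peak = prescribed_co2(140)
print(f"peak CO2: {peak:.0f} ppm, ratio to initial: {peak / 285.0:.2f}")
```

With compound growth, 140 years of 1% yearly increase give a factor slightly above four, which is why the peak is quoted as approximately four times the pre-industrial concentration.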

The shared socio-economic scenario SSP5-3.4-OS explores a peak and decline in CO2 concentrations during the 21st century. Carbon emissions in this scenario initially follow a high-emission pathway consistent with SSP5-8.5 until 2040, at which point strong mitigation policy including CDR is undertaken, reducing emissions to zero by around 2080. Then, net-negative emissions continue until 2140. Net zero is again reached by 217023,24. SSP5-3.4-OS is extended to year 2300. As CO2 concentrations do not return to initial values, reversibility is assessed by comparing a period after the peak in CO2 (2080–2100, when CO2 levels are ~500 ppm) with a 20-year period before the CO2 peak (which occurs around year 2062 in the mean across the four ESMs considered), during which CO2 concentrations are similar to those after the overshoot. These years correspond to 2035–2055, when CO2 levels vary between 475 and 564 ppm.

Outputs from a suite of six CMIP6-generation ESMs and one CMIP5-generation ESM participating in the CDRMIP project are used in this study. They are listed in Supplementary Table 1, and described in Supplementary Note 1. All simulations provide monthly 3D outputs. Here, only those vertical layers describing the upper 1000 m of the water column are used. Multi-model mean and uncertainty are displayed throughout the study. Following previous assessments28,78, inter-model uncertainty is estimated as the inter-model standard deviation (SD).

A one-member, 500-year-long pre-industrial control simulation (piControl; without anthropogenic forcing) is used for each ESM to estimate the level of global warming as the difference between atmospheric temperature during the experiment and the long-term piControl mean. Likewise, 100 years of these piControl experiments are used as the Before Peak period (see Figs. 2, 3, and 5).

To facilitate intercomparison, model outputs were re-gridded from their model grids to a regular 1° × 1° horizontal grid using distance weighted average remapping (climate data operators; remapdis). Likewise, vertical coordinates were all interpolated to the 5, 10, 20, 30, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 and 1000 m depth levels (climate data operators; intlevel).

Estimation of metabolic index

For a marine species to be metabolically viable, the supply of oxygen (S) should be higher than its resting metabolic oxygen demand (D). As a result of global warming, the solubility of oxygen in upper waters is reduced, as warmer waters hold less oxygen at saturation, while vertical stratification increases, reducing the mixing of oxygen-rich surface waters with oxygen-poor subsurface waters79. These processes reduce the concentration of DO in the upper ocean26. In addition, warming increases the metabolism of organisms80,81, thus increasing their oxygen demand. Therefore, there is a need for a metric that links global warming perturbations to their effects on the ocean distribution of life. In this regard, the aerobic energy balance can be represented by a Metabolic Index37,38 (Φ), which represents the ratio of oxygen supply to resting demand.

$$\varPhi =\frac{S}{D}$$

The rate of oxygen supply depends on the ambient partial pressure of oxygen (pO2) and on respiratory efficacy82, which is the product of a per-mass rate of gas transfer between water and the organism (αS) and its scaling with body mass (\(B^{\sigma}\)). The denominator of the metabolic index also depends on body mass (\(B^{\delta}\)) and on the absolute temperature (T). In addition, it depends on the species-specific metabolic rate coefficient (αD), its temperature dependence (Ed), and the Boltzmann constant83 (kB); together these capture the temperature dependence often described by a Q10 factor83.

$$S={\alpha }_{S}(T){B}^{\sigma }p{O}_{2}$$
$$D={\alpha }_{D}{B}^{\delta }\exp \left(\frac{-{E}_{d}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$

Because diffusive fluxes of gases such as O2 are governed by physico-chemical kinetics, the temperature dependence of the oxygen supply (αS(T)) can be approximated by an Arrhenius function that reflects standard gas exchange across a diffusive boundary layer model84.

$${\alpha }_{s}(T)={\alpha }_{s}\exp \left(\frac{-{E}_{s}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$

Therefore, the metabolic index includes both environmental and physiological variables as

$$\varPhi =\frac{S}{D}={A}_{o}{B}^{\varepsilon } \, p{O}_{2}\exp \left(\frac{{E}_{o}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$

Here, Ao is the ratio of the rate coefficients for oxygen supply and metabolic demand (αS/αD), and ε is the difference between the allometric scaling exponents of supply and demand (σ − δ). Eo (= Ed − Es) describes the effect of temperature on the critical pO2, which corresponds to the minimum pO2 required for maintaining a metabolic index equal to 1, that is, for meeting the resting metabolic demand. To take into account the uncertainties in the computation of the evolution of Φ caused by species-specific ecological parameters (Eo, Ao), Φ is computed for a total of 72 marine species compiled by Deutsch et al. 38. For most marine organisms, the allometric scaling exponent (ε) tends to 0, thus \(B^{\varepsilon}\) is considered to be 1.
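The final expression for Φ can be evaluated directly. The following minimal Python sketch does so for a single grid cell; the function name, the reference temperature of 15 °C and the example trait values are assumptions made for illustration only and are not taken from the species compilation used in this study.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def metabolic_index(po2_atm, temp_k, a_o, e_o, t_ref_k=288.15, body_mass_term=1.0):
    """Phi = A_o * B^eps * pO2 * exp(E_o / k_B * (1/T - 1/T_ref)).

    po2_atm: partial pressure of oxygen (atm)
    temp_k: absolute temperature (K)
    a_o: ratio of supply to demand rate coefficients (1/atm)
    e_o: temperature sensitivity (eV)
    t_ref_k: reference temperature (K); 15 degrees C is an assumption here
    body_mass_term: B^eps, set to 1 as in the text
    """
    arrhenius = math.exp((e_o / K_B) * (1.0 / temp_k - 1.0 / t_ref_k))
    return a_o * body_mass_term * po2_atm * arrhenius


# Hypothetical trait values, for illustration only
phi_cool = metabolic_index(po2_atm=0.15, temp_k=283.15, a_o=25.0, e_o=0.4)
phi_warm = metabolic_index(po2_atm=0.15, temp_k=293.15, a_o=25.0, e_o=0.4)
print(phi_cool > phi_warm)  # warming lowers Phi when E_o is positive
```

For a positive Eo, warming at constant pO2 lowers Φ, which is the behaviour the index is designed to capture.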

For environmental parameters, we use ocean temperature and DO from the simulations described above to estimate Φ. pO2 is calculated as the ratio between the oxygen concentration and the Henry’s law coefficient for oxygen saturation following Garcia and Gordon63. Maps of Φ computed using observations from the World Ocean Atlas 2018 and the seven ESMs piControl simulations are presented in Supplementary Fig. 10. World Ocean Atlas 2018 observations can be accessed via https://www.ncei.noaa.gov/access/world-ocean-atlas-2018/.

Decomposition of the metabolic index

In the ocean, O2 concentrations are the result of the solubility of O2, which depends on the physical properties of water masses (temperature and salinity), and of the effects of ocean dynamics (ventilation and stratification) and biological activity, which can be approximated by changes in apparent oxygen utilisation (AOU). The solubility of O2 can be described by the O2 saturation concentration (O2,sat), which indicates the maximum O2 concentration a parcel of water can contain. O2,sat is computed from ESM temperature and salinity outputs, and represents the effect of changes in oxygen solubility on dissolved O2 concentration. Therefore, O2 can be decomposed as O2 = O2,sat − AOU.

Following the decomposition of O2, the metabolic index can thus be decomposed in a term mainly affected by O2,sat, and a term driven by AOU.

$$\varPhi =f(p{O}_{2},T)={A}_{o}{B}^{\varepsilon } \, p{O}_{2}\exp \left(\frac{{E}_{o}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$

Given that

$$p{O}_{2}=\frac{{O}_{2}}{s{p}_{dep}}=\frac{{O}_{2,sat}-AOU}{s{p}_{dep}}$$

where spdep is the temperature-dependent pressure correction, and given that \(B^{\varepsilon}\) ≈ 1, it follows that:

$$\varPhi =\frac{{A}_{o}{O}_{2,sat}-{A}_{o}AOU}{s{p}_{dep}}\exp \left(\frac{{E}_{o}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$
$$\varPhi =\frac{{A}_{o}{O}_{2,sat}}{s{p}_{dep}}\exp \left(\frac{{E}_{o}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)-\frac{{A}_{o}AOU}{s{p}_{dep}}\exp \left(\frac{{E}_{o}}{{k}_{B}}\left(\frac{1}{T}-\frac{1}{{T}_{ref}}\right)\right)$$
$$\varPhi ={\varPhi }_{sat}-{\varPhi }_{AOU}$$

We thus compute Φsat by using the partial pressure corresponding to O2,sat. Finally, we obtain ΦAOU as the difference between Φsat and Φ.
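The decomposition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name is hypothetical, spdep is treated as a single factor that converts an O2 concentration into a partial pressure, the reference temperature of 15 °C is assumed, and all input values are invented for the example.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def decompose_phi(o2, o2_sat, sp_dep, temp_k, a_o, e_o, t_ref_k=288.15):
    """Return (phi, phi_sat, phi_aou) with phi = phi_sat - phi_aou.

    o2, o2_sat: dissolved and saturation O2 concentrations (same units)
    sp_dep: factor converting concentration to partial pressure
            (treated as a given input; an assumption of this sketch)
    """
    arrhenius = math.exp((e_o / K_B) * (1.0 / temp_k - 1.0 / t_ref_k))
    factor = a_o * arrhenius / sp_dep
    phi = factor * o2          # full metabolic index
    phi_sat = factor * o2_sat  # solubility-driven term
    phi_aou = phi_sat - phi    # term driven by AOU = O2,sat - O2
    return phi, phi_sat, phi_aou


# Hypothetical inputs, for illustration only
phi, phi_sat, phi_aou = decompose_phi(
    o2=200.0, o2_sat=250.0, sp_dep=1000.0, temp_k=283.15, a_o=25.0, e_o=0.4
)
print(phi, phi_sat, phi_aou)
```

Because both terms share the same prefactor, ΦAOU obtained as a residual is identical to applying that prefactor to AOU directly.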

We have also computed the relative dominance of each component of Φ across depth levels from the difference between their absolute values. If the absolute value of this difference is above a given threshold, we consider a component to be dominant. If it is below the threshold, we consider neither component dominant. We have used the median of the multi-model SD of Φ at the sea surface, which roughly corresponds to 0.25, as the threshold. To assess the effect of different thresholds, we also applied the dominance approach using 0.5, 0.1 and 0.05 as thresholds (Supplementary Figs. 6–8). Increasing the threshold mainly affects the dominant role of the O2,sat term in tropical regions, where areas with no clear dominance between the O2,sat and AOU terms expand as the threshold increases.
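One possible reading of this dominance rule is sketched below in Python. The function name and labels are hypothetical, and the interpretation that the compared quantity is the difference between the absolute values of the two terms is an assumption; the threshold default of 0.25 is the value quoted above.

```python
def dominant_component(sat_term, aou_term, threshold=0.25):
    """Label the dominant component of Phi at one grid cell.

    Compares the absolute values of the two terms (assumed reading of the
    rule); returns "O2sat", "AOU" or "none" when the gap is within the
    threshold.
    """
    gap = abs(sat_term) - abs(aou_term)
    if gap > threshold:
        return "O2sat"
    if gap < -threshold:
        return "AOU"
    return "none"


# Sensitivity to the threshold for one hypothetical pair of values
for thr in (0.05, 0.1, 0.25, 0.5):
    print(thr, dominant_component(-0.6, 0.3, threshold=thr))
```

In this example the O2,sat term is labelled dominant for thresholds up to 0.25 and no component is dominant at 0.5, mirroring the qualitative sensitivity described in the text.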

Determination of missing Φcrit

In this study, a compilation of ecological parameters (Φcrit, Vh and Eo) provided by Deutsch et al. 38 for 72 different species is used. However, Φcrit is only available for 58 species, so it needs to be determined for the remaining 14 species. To approximate the values of Φcrit for these species, we computed the linear regression between Φcrit and the species-specific hypoxia vulnerability (1/Vh; inverse of Ao) of the other species. Then, the slope and intercept of that linear regression were used to derive the missing Φcrit values.
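The gap-filling step can be illustrated with an ordinary least-squares fit. The Python sketch below uses only the standard library; the function names, the dictionary keys and the numerical values are hypothetical and do not correspond to the actual species compilation.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_missing_phi_crit(species):
    """Fill phi_crit where it is None, using the fit over known species."""
    known = [s for s in species if s["phi_crit"] is not None]
    slope, intercept = fit_line(
        [s["inv_vh"] for s in known], [s["phi_crit"] for s in known]
    )
    filled = []
    for s in species:
        if s["phi_crit"] is None:
            # Predict the missing value from the regression line
            s = dict(s, phi_crit=slope * s["inv_vh"] + intercept)
        filled.append(s)
    return filled


# Hypothetical records, for illustration only
records = [
    {"name": "sp1", "inv_vh": 1.0, "phi_crit": 3.0},
    {"name": "sp2", "inv_vh": 2.0, "phi_crit": 5.0},
    {"name": "sp3", "inv_vh": 3.0, "phi_crit": 7.0},
    {"name": "sp4", "inv_vh": 4.0, "phi_crit": None},
]
print(fill_missing_phi_crit(records)[-1]["phi_crit"])
```

The input records are left unmodified; species with a known Φcrit are returned as they are, and only the missing entries receive a regression estimate.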

Estimation of ocean volume for each type of waters

For each model (seven for 1pctCO2 and four for SSP5-3.4-OS simulations) and species (72), we compute, at epipelagic and mesopelagic depth levels, the ocean volume in which Φ is below Φrest, that is, the non-viable zone; the ocean volume in which Φ is between Φrest and Φcrit, that is, the critical zone; and the ocean volume in which Φ is above Φcrit, that is, the viable zone. Once the volume is estimated for each ocean zone, we compute, for the 1pctCO2 simulation, its fraction with respect to the total volume of the depth level at the beginning of the experiment, and the fractional change with respect to the piControl simulation. For the SSP5-3.4-OS, we compute its fraction with respect to the total volume of the depth level just before the overshoot (years 2035–2055), and the fractional change with respect to this reference period. With VΦx denoting the ocean volume at each depth level for the three (x) types of waters at a given time and VΦx_0 the reference-period mean value of VΦx, the fractional change is estimated as (VΦx − VΦx_0)/VΦx_0. We multiply by 100 to express it as a percentage.
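The zone classification and the fractional change can be written compactly. The following Python sketch is a minimal illustration; the function names, the flat list representation of grid cells and the example numbers are assumptions, with Φrest and Φcrit passed in as species-specific inputs.

```python
def zone_volumes(phi_values, cell_volumes, phi_rest, phi_crit):
    """Sum grid-cell volumes into non-viable, critical and viable zones."""
    volumes = {"non_viable": 0.0, "critical": 0.0, "viable": 0.0}
    for phi, vol in zip(phi_values, cell_volumes):
        if phi < phi_rest:
            volumes["non_viable"] += vol
        elif phi < phi_crit:
            volumes["critical"] += vol
        else:
            volumes["viable"] += vol
    return volumes


def fractional_change_percent(volume, reference_volume):
    """(V - V0) / V0 expressed as a percentage."""
    return 100.0 * (volume - reference_volume) / reference_volume


# Hypothetical cells: Phi values and volumes in arbitrary units
reference = zone_volumes([0.5, 2.0, 4.0, 5.0], [1.0, 1.0, 1.0, 1.0], 1.0, 3.0)
later = zone_volumes([0.4, 0.9, 2.5, 4.0], [1.0, 1.0, 1.0, 1.0], 1.0, 3.0)
for zone in reference:
    print(zone, fractional_change_percent(later[zone], reference[zone]))
```

Cells with Φ exactly equal to Φrest are assigned to the critical zone and cells with Φ equal to Φcrit to the viable zone, following the inequalities in the caption of Fig. 7.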

Reversibility timescale analysis

To estimate the time needed to recover initial values of temperature, pO2 and the metabolic index, we compute the slope of the linear regression over the last 30 years of the 1pctCO2 simulation, and plot it against the difference between the mean value over these last 30 years and the initial value of the simulation. To discern whether these parameters display irreversibility, we also estimate the same quantities arising from internal climate variability by using 25 randomly selected 30-year-long subsamples of the piControl simulation for each model. A change is considered irreversible at these centennial timescales for a given depth level and model (and species in the case of Φ) when the slope is within internal variability whereas its deviation is not. In contrast, a change is considered reversible at these centennial timescales when both slope and deviation fall within the internal variability as diagnosed from the piControl.
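The procedure can be sketched as follows in Python, using only the standard library. Several choices here are assumptions rather than the documented method: the function names are hypothetical, the range of internal variability is taken as the minimum to maximum of the 25 control samples, control deviations are measured relative to the long-term control mean, and any case that is neither reversible nor irreversible is labelled as still adjusting.

```python
import random


def linear_slope(values):
    """Least-squares slope of values against their index (per time step)."""
    n = len(values)
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    return sxy / sxx


def trend_and_deviation(series, initial_value, window=30):
    """Slope over the last `window` steps and their mean minus the initial value."""
    last = series[-window:]
    return linear_slope(last), sum(last) / len(last) - initial_value


def control_ranges(control, n_samples=25, window=30, seed=0):
    """Min-max ranges of slope and deviation from random control subsamples."""
    rng = random.Random(seed)
    control_mean = sum(control) / len(control)
    slopes, deviations = [], []
    for _ in range(n_samples):
        start = rng.randrange(len(control) - window + 1)
        chunk = control[start:start + window]
        slopes.append(linear_slope(chunk))
        deviations.append(sum(chunk) / window - control_mean)
    return (min(slopes), max(slopes)), (min(deviations), max(deviations))


def classify(slope, deviation, slope_range, deviation_range):
    """Apply the reversibility criteria to one slope-deviation pair."""
    slope_in = slope_range[0] <= slope <= slope_range[1]
    deviation_in = deviation_range[0] <= deviation <= deviation_range[1]
    if slope_in and deviation_in:
        return "reversible"
    if slope_in:
        return "irreversible"
    return "still adjusting"


print(classify(0.0, 1.5, (-0.01, 0.01), (-0.2, 0.2)))  # irreversible
```

In practice, `trend_and_deviation` would be applied to the annual series of each ESM and depth level, and `control_ranges` to the corresponding piControl series, before calling `classify`.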