Introduction

Cumulative emissions of anthropogenic carbon dioxide (CO2) have been driving long-term global warming1,2,3, which has negatively impacted the physical environment, ecosystem, and humanity4,5. To minimize the potential risks of climate change, the 2015 Paris Agreement aims to keep global warming well below 2 °C above pre-industrial levels and pursue efforts to limit it to 1.5 °C above pre-industrial levels6. Modeled pathways to limit global warming to 1.5 °C indicate emissions need to reach net zero, and net-negative emissions (i.e. a decline in atmospheric CO2 levels) are required to return global warming to 1.5 °C following a temperature overshoot7. To accomplish this, anthropogenic emissions must be reduced and carbon dioxide removal (CDR), which permanently removes CO2 from the atmosphere, is likely required7,8,9,10.

Despite the increasing attention on CDR in political and economic discussions, there remain uncertainties in the effectiveness of CDR due to a poor understanding of how the Earth system will respond to reduced atmospheric CO2 levels11,12,13,14. The terrestrial biosphere is a natural carbon (C) sink that removes a large fraction of anthropogenic CO2 from the atmosphere15. However, there are uncertainties in the future behavior of the terrestrial carbon cycle16,17,18, especially its response under net-negative emissions; thus, there is an urgent need for a better understanding of terrestrial C fluxes after reaching net-zero.

To date, studies on the reversibility of land C pool have been conducted based on idealized CO2 ramp-up and ramp-down forcing experiments using Earth system models (ESMs)19,20,21. Despite similar experimental designs, studies have shown considerable differences in the response of land C stocks to CO2 forcing: the temporal evolution, the extent of change, and its spatial characteristics19,20,21. For example, a previous study21 reported that land C stores are largely reversible within the timescale of changing CO2 due to the balance between an overshoot in the tropics and delayed response in the northern high latitudes. On the other hand, other studies19,20 reported that the terrestrial biosphere continues to remove CO2 immediately after the start of CO2 ramp-down due to inertia in vegetation dynamics and soil C pool, and as a result, stores more C at the end of the simulation than in its initial state.

Inconsistencies in the literature lead to uncertainties in estimates of land C fluxes after net-zero emissions are reached. This hinders establishing effective climate mitigation strategies and thus highlights the need for a multimodel approach and a better understanding of the terrestrial C cycle response to negative emissions and its underlying mechanisms. Here, we analyze eight ESMs from the Coupled Model Intercomparison Project Phase 6 (CMIP6)22 that performed the climate and carbon cycle reversibility experiment12 (see “Methods” and Supplementary Tables 1, 2), to assess the reversibility of terrestrial C flux and stock in a multimodel context. In this experiment, atmospheric CO2 concentrations are prescribed to increase at 1% per year until quadrupling (~139 years) and then decrease at the same rate until reaching pre-industrial CO2 levels, after which the simulation continues for at least 60 years.

Results

Lagged response of global terrestrial carbon fluxes and stocks

First, the multimodel mean (MME) temporal evolutions of key variables corresponding to the ramp-up and ramp-down CO2 forcing were examined (Fig. 1). The global mean land surface air temperature (SAT) anomaly increased by ~6.7 K and peaked at model Year 144, after which it decreased with a slower rate of change than observed during the ramp-up period. Land temperatures show a delayed response due to the thermal inertia of the ocean23,24 and remain ~1 K higher than the initial state until the end of the simulation. Land precipitation (PRCP) follows land temperature25,26, showing a peak with a 4-year delay after that of temperature and a greater response on the ramp-down CO2 pathway. However, the land PRCP anomaly exhibited large internal variability and a wide range of inter-model spread.

Fig. 1: Temporal evolution of global terrestrial carbon fluxes and stock.

a–d Time-series of annual mean land surface air temperature (SAT) and precipitation (PRCP) anomaly (a), annual net primary production (NPP) and heterotrophic respiration (Rh) (b), annual net biome productivity (NBP) (c), and annual mean total land C stock anomaly (d). All values are calculated over the global land area excluding Antarctica. The solid lines and shadings show the MME mean and the range of 95% confidence level based on the bootstrap method. All calculations were conducted after taking the 11-year running mean. The beginning and end of CO2 changes are indicated by the gray dashed vertical line.

The changes in climate system and atmospheric CO2 concentration affected terrestrial C fluxes by regulating terrestrial ecosystem processes. At a global scale, net primary production (NPP), synonymous with net carbon uptake by vegetation, linearly increased and subsequently decreased, showing an almost reversible response that was mainly attributable to the CO2 fertilization effect27,28,29. However, heterotrophic respiration (Rh) exhibited a lagged response to CO2 forcing and slowly decreased during the ramp-down period. This is mainly because Rh changes in proportion to the C pool altered by changes in NPP, but there is a time lag between changes in NPP, C sequestration, and its release through microbial respiration30,31,32,33. The delayed increase of litter-soil C content increased the decomposition during the CO2 ramp-down period despite the same CO2 level33. In addition, warmer and wetter conditions on CO2 ramp-down pathway likely enhanced microbial activity34,35,36, thereby partly contributing to the lagged response of Rh.

The net atmosphere-to-land C flux, net biome productivity (NBP), mainly determined by the imbalance between NPP and Rh, also exhibited a lagged response. NBP was positive during the CO2 ramp-up period, demonstrating the well-known role of the land as a C sink15. NBP rapidly increased during the initial period, but it soon became relatively constant due to the declining effect of CO2 fertilization37,38. The terrestrial biosphere continues to take up C (~43 Gt C) for decades after the CO2 concentration begins to decrease. Although the CO2 is prescribed in the present modeling experiments, this result suggests that the terrestrial ecosystem will further contribute to the reduction of CO2 concentration for decades after achieving net-zero emissions, thereby lessening the reliance on CDR, in line with the previous literature19,20,39,40.

Thereafter, the terrestrial biosphere becomes a C source as Rh exceeds NPP due to the lagged response of Rh. During the remainder of CO2 ramp-down period, NBP gradually decreased, showing a maximum negative value at the end of the ramp-down period. This result implies that climate mitigation policy should be designed taking into account terrestrial ecosystem C that will be released under negative emissions. During the restoring period, NBP showed a tendency to return to its initial state. Overall, these responses in NBP led to a lag in the total land C stock. The total land C stock anomaly continued to increase immediately after CO2 ramp-down began, with the land retaining more C than its pre-industrial level until the end of the simulation, indicating its positive role in mitigating anthropogenic climate change.
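The sink-to-source transition described above can be illustrated with a minimal Python sketch. It assumes that NBP is approximated by NPP minus Rh (the text describes NBP as mainly, not solely, determined by this imbalance) and uses synthetic series in which Rh lags NPP by a hypothetical 21 years. The function names and all numbers are illustrative and are not taken from the ESM output.

```python
def running_mean(series, window=11):
    """Centered running mean; drops window // 2 points at each end."""
    half = window // 2
    return [
        sum(series[i - half:i + half + 1]) / window
        for i in range(half, len(series) - half)
    ]


def first_source_year(npp, rh, window=11):
    """First model year at which smoothed Rh exceeds smoothed NPP.

    Index i of the smoothed series corresponds to model year i + window // 2.
    Returns None if the land never becomes a net source.
    """
    half = window // 2
    npp_s = running_mean(npp, window)
    rh_s = running_mean(rh, window)
    for i, (p, r) in enumerate(zip(npp_s, rh_s)):
        if r > p:
            return i + half
    return None


# Synthetic NPP (Gt C per year): linear rise to model year 140, linear fall
# to year 280, then constant. These values are hypothetical.
npp = [60 + 0.2 * (140 - abs(t - 140)) for t in range(281)] + [60.0] * 60
# Synthetic Rh: the same curve delayed by a hypothetical 21 years.
rh = [npp[max(t - 21, 0)] for t in range(341)]

transition_year = first_source_year(npp, rh)
```

In this toy setup the transition falls about a decade after the NPP peak, roughly half the imposed lag; a longer lag in Rh pushes the transition later, which is the qualitative behavior the text attributes to high latitudes.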

Latitudinal dependency of the lagged response of the terrestrial carbon cycle

Latitudinal differences in the response of the physical climate system and terrestrial C cycle to CO2 forcing were identified (Fig. 2). Land temperature showed a similar timescale of the delayed response to CO2 forcing at all latitudes, and remained warmer than its initial state for the entire simulation period. Precipitation in the northern mid-high latitudes showed a similar evolution to the global mean, but with a heterogeneous response between 60°S–20°N. In the northern mid-high latitudes where cold temperature limits vegetation growth41,42, the lagged response of the climate system resulted in warmer and wetter conditions during CO2 ramp-down period, which enhanced photosynthesis and lengthened the growing season. However, in the tropics where the temperatures are close to the optimal temperature for photosynthesis43 and precipitation decreased during the CO2 ramp-down period (Supplementary Fig. 1), the leaf area index (LAI), an indicator of vegetation growth, rapidly decreased after the CO2 peak. Therefore, LAI in the tropics showed a reversible response within the timescale of CO2 change, but with the LAI response to CO2 forcing becoming increasingly delayed at higher latitudes.

Fig. 2: Latitudinal differences in the response of the climate system and terrestrial carbon cycle to CO2 forcing.

a–f Time-latitude diagrams of annual mean anomalies of changes in land SAT (a), land PRCP (b), and leaf area index (LAI) (c). The zonal sum of annual NBP (d), vegetation carbon (cVeg) anomaly (e) and sum of cLitter and cSoil anomalies (f). All values are MME and smoothed by the 11-year moving average. The beginning and end of CO2 changes are indicated by the gray dashed vertical line.

The latitudinal dependence of the terrestrial biosphere response was more evident in the evolution of NBP. The terrestrial biosphere in the mid-high latitudes continued to absorb C for decades after atmospheric CO2 concentrations decreased. This was due to the formation of favorable climate conditions for vegetation growth, and thus the transition of C sinks to sources was more delayed than in the tropics. Accordingly, the annual mean vegetation carbon (cVeg) anomaly was almost reversible in the tropics, whereas the mid-high latitudes retain more C after the CO2 peak due to the longer timescale of reversibility.

The annual mean C anomaly stored in the litter–soil system also exhibited latitudinally dependent delayed response similar to cVeg, but with a greater time lag to CO2 forcing because of the C flow from plant biomass to soil-litter decay. The increase of litter–soil C and its delay were greatest in high latitude regions with longer C residence (or turnover) time due to slow decomposition in cold environments44,45,46. Consequently, because of this latitudinal dependency, the lagged response of global land C stock is mostly attributable to the mid-high latitudes, not the tropics, constituting most of the global land C stock anomaly after the end of CO2 forcing (Supplementary Table 3 and Supplementary Fig. 1).

Inter-model diversity of terrestrial carbon cycle response to CO2 forcing and its regional characteristics

There was considerable inter-model diversity in the lagged responses of the terrestrial C cycle to CO2 forcing (Supplementary Fig. 2). The extent of the difference in global land C stock between the ramp-down and ramp-up period was dependent on how much C was stored during CO2 ramp-up period, which was related to NPP sensitivities to increased CO2 (dNPP/dCO2) (Supplementary Fig. 3). This NPP sensitivity to CO2 can be expressed as the product of carbon use efficiency (CUE: dNPP/dGPP, the fraction of GPP turned into NPP after considering autotrophic respiration losses) and strength of CO2 fertilization (dGPP/dCO2). The higher CUE and stronger CO2 fertilization effect lead to the greater increase in land C storage. For example, ACCESS-ESM1-5 exhibited the lowest dNPP/dCO2 due to its weak CO2 fertilization effect and the low CUE47 and hence almost reversible response of the terrestrial C stock.
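Written out, the decomposition described above is the chain rule applied to NPP as a function of GPP. The underbrace labels are added here for readability and are not notation from the original analysis.

```latex
\frac{\mathrm{d}\,\mathrm{NPP}}{\mathrm{d}\,\mathrm{CO_2}}
  = \underbrace{\frac{\mathrm{d}\,\mathrm{NPP}}{\mathrm{d}\,\mathrm{GPP}}}_{\text{CUE}}
    \times
    \underbrace{\frac{\mathrm{d}\,\mathrm{GPP}}{\mathrm{d}\,\mathrm{CO_2}}}_{\text{CO}_2\ \text{fertilization}}
```

A model with either a low CUE or weak CO2 fertilization therefore has a low NPP sensitivity, which is consistent with the ACCESS-ESM1-5 example in the text, where both factors are low.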

The MME pattern of the land C pool anomaly differs from previous single model results19,21: the amplitude and spatial pattern of land C stock changes differ between ESMs (Supplementary Figs. 4 and 5). Moreover, the peak of land C stock and the timescale of lagged response to CO2 forcing are diverse. These results imply that a single model study cannot draw a concrete conclusion due to large uncertainties. Exploring inter-model diversity can advance our understanding of the future terrestrial carbon cycle in the ESMs and nature. Differences in representations of terrestrial processes and climate change between ESMs may be responsible for this large inter-model diversity. In the tropics and high latitudes, the inter-model spread is considerable, but it cannot be fully explained by the sensitivity of vegetation productivity to increased CO2 (Supplementary Fig. 3). Though previous studies pointed to the importance of nitrogen cycling and dynamic vegetation19,39,47,48, there is no significant impact of the inclusion of these processes on inter-model spread40.

Therefore, to further understand the regional responses and their inter-model diversity, we investigated the spatial pattern of the lagged response of total land C stock (Fig. 3a, b). The total land C stock during the CO2 ramp-down phase was distinctly higher than that during the CO2 ramp-up phase despite the same CO2 concentration, especially in boreal forests, the Maritime Continent, and East Asia. In particular, boreal forests can store C for a long time owing to the long turnover time of soil C44,45,46,49. However, the differences in the land C stock in the Amazon (details in Supplementary Note 1) and permafrost regions are statistically insignificant due to the diverse response among ESMs (Supplementary Figs. 6 and 7). Inter-model diversity, as estimated by the coefficient of variation, is highest in the continents above 60°N (Fig. 3c, d), indicating the greatest relative variability in high latitudes. This is because of two exceptional models (CESM2 and NorESM2-LM), which simulate lower land C stock in the ramp-down period than in the ramp-up period, especially in permafrost regions (Supplementary Figs. 5, 6 and 8).
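As a rough illustration of why two models with an opposite-signed response inflate the coefficient of variation, the following Python sketch computes CV as the standard deviation of the inter-model spread divided by the mean. The use of the sample standard deviation is an assumption, and the per-model values are hypothetical, not results from the ESMs.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation of the inter-model spread divided by its mean.

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption; the source does not state which estimator was used.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical regional land C stock differences (Gt C), one value per model.
consistent_models = [4.0, 5.0, 6.0, 5.5, 4.5, 5.0]   # all models agree in sign
mixed_models = [4.0, 5.0, 6.0, 5.5, -3.0, -2.5]      # two models lose carbon

cv_consistent = coefficient_of_variation(consistent_models)
cv_mixed = coefficient_of_variation(mixed_models)
```

Opposite-signed members both widen the spread and pull the mean toward zero, so the CV rises sharply even when most models agree.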

Fig. 3: Lagged response of terrestrial carbon cycle and its inter-model spread.

a, b Difference of MME anomalies of total land C stock between model Year 210 and 70 (2 × CO2) (a). MME anomalies of total land C stock at model Year 280 (1 × CO2) (b). The zonal sum is plotted on the right side of the map. Only significant values at the 95% confidence level, based on the bootstrap method, are shown. The simulated MME permafrost extent and boundaries of continuous and discontinuous permafrost from the CCI-PF data (see “Methods”) are superimposed in green and blue, respectively. c, d Coefficients of variation (CV: the standard deviation of the spread divided by the mean) of the difference in the total land C stock between model Year 210 and 70 (2 × CO2) (c) and total land C stock anomalies at model Year 280 (1 × CO2) (d) in the tropics (30°S–30°N), mid-latitudes (30°–60°N), and high-latitudes (above 60°N). All calculations were conducted after taking the 11-year running mean.

Irreversible carbon release to the atmosphere in permafrost region

We conducted a more detailed analysis to understand the contrasting terrestrial C stock response to CO2 forcing in permafrost regions (Fig. 4). Most models showed a positive cSoil anomaly due to the lagged response at the end of changing CO2, but CESM2 and NorESM2-LM exhibited a negative cSoil anomaly. This is attributed to a faster transition of land C sinks to sources (~80 years faster than the other models) without a lagged response due to the sharp increase of Rh. Notably, only these two ESMs, coupled with Community Land Model 5, include the representation of deep and frozen soil C and hence permafrost C pools (Supplementary Note 2)50,51. Vertically resolved soil biogeochemistry enables the model to generate large C stocks in the permafrost domain, as observed (1460-1600 Pg C)50,52,53. Consequently, these two models (~1466 Gt C) simulate ~7 times greater cSoil climatology than the other ESMs (~206 Gt C). Therefore, they include the C decomposition in the permafrost zone containing large soil organic C and thus simulate a negative cSoil anomaly. However, the soil C stock in the other models is remarkably low compared to the observed value due to the absence of permafrost and related processes, so they possibly underestimate the soil respiration (Supplementary Fig. 9).

Fig. 4: Irreversible terrestrial carbon loss in the permafrost region.

a, b Scatterplot of the cSoil anomalies at model Year 280 versus the model year when Rh exceeds NPP (a). Scatterplot of the cSoil anomalies at model Year 280 versus the mean state of cSoil in the control simulation (b). c, d Time-series of annual NBP (c) and total land C stock anomalies (d): the MME mean (black), the average value from ESMs, including permafrost representation (brown, group A), and the average value for the other models (green, group B). The shadings represent the 95% confidence intervals based on the bootstrap method. The beginning and end of CO2 changes are indicated by the gray dashed vertical line. All values are averaged over the permafrost region above 60°N and smoothed by the 11-year moving average.

As a result, CESM2 and NorESM2-LM (group A) simulated C release to the atmosphere in the permafrost region from the later parts of the ramp-up period due to enhanced microbial C decomposition under warmer conditions. This indicates that permafrost regions could be a net C source rather than a C sink over the course of the CO2 ramp-up and ramp-down forcing, implying its accelerating role in global warming. However, the other models (group B) showed a transition from C sink to source that is similar to the global mean response of NBP but much slower due to the greater lag at high latitudes.

Consequently, group B models simulated a positive total land C stock anomaly over the entire experimental period due to their lagged response to CO2 forcing. In group B models, the terrestrial biosphere serves as a C sink, storing more C (~38 Gt C) at the end of the simulation than in its initial state. However, the land C stock anomaly in group A models, including the permafrost C pool, showed quite different behavior with the total land C stock slightly increasing in the early phase of the ramp-up period but gradually declining thereafter. As a result, the permafrost lost ~33 Gt C by the end of the restoring period compared to the pre-industrial period, offsetting ~30% of enhanced land C retention due to the lagged response of the terrestrial C cycle (Supplementary Table 3). This suggests that an evident irreversible response to CO2 forcing could worsen global warming.

Discussion

In this study, we investigated the reversibility of land C fluxes and stocks to CO2 forcing in idealized CO2 ramp-up and ramp-down simulations and especially focused on their responses under negative emissions. Total land C stocks exhibit a lagged response to CO2 forcing; even after CO2 removal starts, land C maintains considerably higher levels compared to that in the ramp-up period at the same CO2 level. This lagged response of the terrestrial C cycle is latitudinally dependent, and the timescale of reversibility is much longer in high-latitude regions. At a regional scale, boreal forests, East Asia, and the Maritime Continent can store C for longer than other regions under net-negative emissions. These spatiotemporal characteristics can be considered for establishing an effective strategy for natural climate solutions, such as forest management.

To address the inconsistency among the results of previous studies, we examined the multimodel response using eight ESMs from CMIP6. The lag in global terrestrial C stock response is mostly attributable to the mid-high latitudes because the inertia of the soil C pool is greatest at high latitudes. In addition, the lag in the climate system response in the mid-high latitudes also contributes to this by enhancing vegetation productivity during the ramp-down period. Through the inter-model comparison, we found that the inter-model diversity in the lagged response of the terrestrial C stock is considerable and largely explained by the different NPP sensitivity to increased CO2 between ESMs. We also pointed out that the diverse precipitation response in the Amazon to CO2 forcing and the inclusion of permafrost C pools are important factors in increasing the inter-model spread of the response of the land C stock to CO2 forcing.

We have demonstrated that irreversible permafrost C loss would considerably hinder efforts to mitigate global warming even if we achieve net-zero emissions. This should be considered in climate policy discussions and decisions. In particular, we quantitatively examined the role of permafrost in the asymmetric terrestrial C cycle response to CO2 forcing, thereby helping to address a previously identified knowledge gap19,39,40. However, more careful quantification is needed because no land surface model considers abrupt thaw, the rapid degradation of ice-rich permafrost54,55. Our findings include uncertainties resulting from model biases associated with permafrost processes and their initialization procedures (Supplementary Note 2), which should be taken into account and further examined.

We note that the present experimental design (up to 4 × CO2) results in larger changes in land SAT and C stock than those under the SSP5-3.4-overshoot scenario40, which may lead to an excessive nonlinear response. A better understanding of how the terrestrial C cycle responds, nonlinearly and variably, to the rate of CO2 change and to different CO2 pathways is needed for effective climate policy.

Methods

CMIP6-CDRMIP data and experimental design

The climate and carbon cycle reversibility experiment (short name: CDR-reversibility) from the CMIP6 Carbon Dioxide Removal Model Intercomparison Project (CDRMIP)12 was analyzed to investigate the carbon cycle response to large-scale CO2 removal. This experiment was branched from the 1pctCO2 experiment of the CMIP6 Diagnostic, Evaluation, and Characterization of Klima (DECK)22, in which the CO2 level increases at a rate of 1% yr−1 for 140 years, from pre-industrial levels to quadrupling. The piControl experiment, which started after the model spin-up during which the climate begins to come into balance with forcing, serves as a baseline for the 1pctCO2 experiment. In the 1pctCO2-cdr simulation (minimum of 200 years), a 1% yr−1 removal of CO2 from the atmosphere is then prescribed for 140 years until the pre-industrial CO2 level is reached, after which CO2 is held at that level for as long as possible (minimum of 60 years). Thus, the total length of the CDR-reversibility experiment employed herein is 340 years. We calculated the anomalies of CDR-reversibility simulations using the pre-industrial control simulation (piControl) from DECK as a baseline.
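The prescribed CO2 pathway can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the pre-industrial concentration of 284.3 ppm is an assumed illustrative value, the ramp-down is taken to be the exact mirror image of the ramp-up, and model year 0 is taken as the pre-industrial starting point. The function name is hypothetical.

```python
import math


def co2_pathway(c0=284.3, rate=0.01, ramp_years=140, hold_years=60):
    """Prescribed CO2 (ppm) for each model year of an idealized
    ramp-up / ramp-down / hold experiment.

    c0 is an assumed pre-industrial concentration, not a value from the text.
    """
    # Ramp-up: compound growth at `rate` per year, model years 0..ramp_years.
    up = [c0 * (1 + rate) ** t for t in range(ramp_years + 1)]
    # Ramp-down: assumed mirror image of the ramp-up, excluding the peak year.
    down = up[-2::-1]
    # Restoring period: CO2 held at the pre-industrial level.
    hold = [c0] * hold_years
    return up + down + hold


path = co2_pathway()

# Exact time needed to quadruple CO2 at 1% per year.
years_to_quadruple = math.log(4) / math.log(1.01)
```

Under these assumptions, exact quadrupling takes slightly more than 139 years, so a 140-year ramp ends just above 4 × CO2. Model years 70 and 210 share the same concentration (about 2 × CO2), and year 280 is back at the pre-industrial level, which matches the comparison years used in Fig. 3.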

We used eight ESMs (ACCESS-ESM1-5, CanESM5, CESM2, CNRM-ESM2-1, GFDL-ESM4, MIROC-ES2L, NorESM2-LM, and UKESM1-0-LL), which were coupled with the full carbon cycle and performed the CDR-reversibility experiment (Supplementary Tables 1, 2). The MME mean was derived by regridding the outputs from the ESMs to a common 1° × 1° grid and then averaging them. GFDL-ESM4 does not provide cLitter and cSoil data, UKESM1-0-LL does not provide cLitter data, and NorESM2-LM does not provide precipitation data. Because of these limitations, GFDL-ESM4 and UKESM1-0-LL were excluded when calculating the MME mean of total land C stock. The bootstrap method was used to test the statistical significance of the difference between the experiments. For the MME, eight values were randomly selected from the eight ESMs with replacement, and their average was computed. By repeating this process 1000 times, the confidence intervals were determined, and only significant values are shown to indicate model agreement.
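The resampling procedure described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: the percentile method for the interval, the fixed random seed, and the rule that a value is significant when the interval excludes zero are all assumptions, and the per-model values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(model_values, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the multimodel mean.

    Each replicate draws len(model_values) values with replacement and
    averages them; the interval is read from the sorted replicate means.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility
    n = len(model_values)
    means = sorted(
        statistics.fmean(rng.choices(model_values, k=n))
        for _ in range(n_boot)
    )
    lo_idx = round((1 - level) / 2 * n_boot)
    hi_idx = round((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical land C stock differences (Gt C) at one grid cell, one per ESM.
values = [12.0, 8.5, 15.2, 9.1, 11.3, 7.8, 14.0, 10.4]
ci_low, ci_high = bootstrap_ci(values)

# Assumed significance rule: the interval does not include zero.
is_significant = ci_low > 0 or ci_high < 0
```

With only eight members, the replicate means are drawn from a small set of possible combinations, so the interval mainly reflects model agreement rather than a sampling distribution in the usual statistical sense.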

Diagnosing the permafrost extent in the model

The permafrost extent in the model was diagnosed using the temperature at the minimum soil depth (Dzza) where the monthly mean variation of soil temperature within a year is <0.1 °C56. If the temperature at Dzza is <0 °C for 2 years or more, that grid cell is assumed to be permafrost. However, in most CMIP6 models the soil profile is not deep enough to identify Dzza (Supplementary Table 2). For such models, permafrost is assumed to be present in grid cells where the 2-year mean soil temperature of the deepest soil layer is <0 °C57. The extent of permafrost in the model was diagnosed using the piControl simulation, except for GFDL-ESM4, which does not provide soil layer temperature data. In the multimodel context, if a grid cell is diagnosed as permafrost in four or more of the seven ESMs, we define that grid cell as a permafrost region. The MME permafrost extent closely matches the boundaries of permafrost extent (>50% coverage) from the ESA Climate Change Initiative permafrost (CCI-PF) reanalysis dataset58 (Fig. 3a, b).
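The fallback criterion and the multimodel vote can be sketched in Python as follows. This is a simplified illustration that covers only the deepest-layer rule, not the Dzza-based diagnosis; the function names, the nested-list grid layout, and the temperatures are hypothetical.

```python
def model_permafrost_mask(deep_soil_temp_c):
    """Per-model mask: True where the 2-year mean temperature (°C) of the
    deepest soil layer is below 0 °C. Input is a 2D grid (list of rows)."""
    return [[temp < 0.0 for temp in row] for row in deep_soil_temp_c]


def multimodel_permafrost_mask(model_masks, min_models=4):
    """Grid cells diagnosed as permafrost in at least `min_models` models.

    All masks are assumed to be on the same common grid.
    """
    n_rows = len(model_masks[0])
    n_cols = len(model_masks[0][0])
    return [
        [
            sum(mask[j][i] for mask in model_masks) >= min_models
            for i in range(n_cols)
        ]
        for j in range(n_rows)
    ]


# Hypothetical 2-year mean deepest-layer temperatures (°C) on a 1 x 3 grid
# for seven models.
temps_by_model = [
    [[-5.0, -1.0, -0.5]],
    [[-4.0, -0.5, -0.2]],
    [[-6.0, -0.2, -0.1]],
    [[-3.0, -0.1, 0.3]],
    [[-2.0, 0.4, 0.5]],
    [[-1.5, 0.8, 1.0]],
    [[-0.5, 1.2, 2.0]],
]

masks = [model_permafrost_mask(t) for t in temps_by_model]
mme_mask = multimodel_permafrost_mask(masks, min_models=4)
```

In this toy example the first cell is frozen in all seven models, the second in four, and the third in only three, so only the first two pass the four-of-seven threshold.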