Introduction

The Paris Agreement on climate, adopted in 2015, commits signing countries to reduce CO2 emissions, and its implementation should translate into a decrease in the atmospheric growth rate of CO2 starting now. To assess the collective progress towards achieving the purpose of this agreement, signing countries will take stock of its implementation every five years in a process known as the “global stocktake”.

However, at present, independent verification of the collective effort to reduce greenhouse gas emissions is hindered both by large uncertainties in the reporting of national emissions1,2 and by year-to-year fluctuations in the carbon fluxes that the atmosphere exchanges with the land and ocean, which arise from the natural variability of the Earth System1. In fact, even when using the best estimates of atmospheric carbon sources and sinks, which come from direct but sparse observations complemented by numerical model data, we can only partially account for the year-to-year variability of the observed atmospheric CO2 concentration. The difference between our reconstruction and the observed atmospheric CO2 values represents the overall uncertainty of the Global Carbon Budget3 (GCB), the so-called “budget imbalance”.

In practical terms, such a budget imbalance translates into a delay in our ability to detect a change in the global CO2 emissions by humans. Some authors have quantified that, because of this budget imbalance, it would take close to 10 years to distinguish a scenario of flat emissions (0% growth) from one with 1% per year growth1. Additionally, the GCB has the limitation of working only as a backward check of the global CO2 emissions. Its update from one year to the next relies on the availability of several observation-based products, whose release is unavoidably delayed for technical reasons.

To move beyond this approach, the GCB could use near-term predictions of the future atmospheric CO2 growth rate. Predicting the near-term (up to a decade) evolution of carbon sinks and atmospheric CO2 growth is an active field of research that has received increasing interest in the last few years4,5,6. Near-term carbon cycle forecasts attempt to predict the combined effect of human activity and natural climate variability on the global carbon cycle. These predictions are performed using Earth System Models (ESMs) that include detailed representations of the global carbon cycle, coupled to climate models, and build upon the progress made by the near-term climate prediction community in recent years6,7,8,9,10,11.

For the global stocktake, using information from the near-term predictions would offer the advantage of independently verifying global emissions pledges in advance, by quantifying the CO2 growth rate to be expected if pledges are kept. Yet, to provide useful information from those predictions, the natural variability of the climate system must be correctly simulated and predicted by the numerical forecast systems. In fact, fluctuations of the atmospheric carbon sources and sinks, linked to natural climate variability, are important drivers of the observed year-to-year variability of the atmospheric CO2 growth rate (refs. 12,13).

Other authors14 quantified the relative importance of the internal climate variability of CO2 fluxes compared to potential emission changes by contrasting two sets of large ensemble simulations performed with the same model but following two different emission scenarios: one with a 2% decrease per year of CO2 emissions and one with a 1% increase per year14. They showed that the delay in the detection of a change between those two emission policies is up to 10 years due to the confounding effects of natural variability14. That is to say, the confounding effect due to internal variability is larger than the one due to the budget imbalance, as estimated in ref. 1.

In summary, representing and predicting natural climate variability becomes paramount if we aspire to move beyond the backward verification approach of the GCB and attempt to predict the evolution of the future atmospheric CO2 growth rate. This calls for an extensive analysis of how forecast-system-type ESMs represent climate variability and its effects on atmospheric CO2. To achieve this, here we consider the historical and piControl simulations from the 6th phase of the Coupled Model Intercomparison Project15 (CMIP6) performed with ESMs to:

  1. evaluate their performance in reproducing the interannual variability of the atmospheric CO2,

  2. identify the main sources of interannual variability and

  3. investigate the origins of disagreement among models.

Results

Global atmospheric CO2 growth rate interannual variability

We start by comparing the interannual variability of the CO2 growth rate (\({\sigma }_{t,\frac{{dCO}2}{{dt}}}\)) of the CMIP6 historical simulations to the observed one (Fig. 1). No long-term trend has been removed at this point in order to evaluate the full performance of the CMIP6 historical simulations in reproducing the observed variability. The year-to-year rate of change of the atmospheric CO2 amount is directly linked to the sum, over a year, of the anthropogenic emissions and of the fluxes of carbon exchanged by the atmosphere with the land and the ocean:

$$\frac{{\rm{dCO}}2}{{\rm{dt}}}={{\rm{CO}}2}_{{\rm{Emissions}}}+{{\rm{CO}}2}_{{\rm{Ocean}}}+{{\rm{CO}}2}_{{\rm{Land}}}$$
(1)

where \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Emissions}}}\), \({{\rm{CO}}2}_{{\rm{Ocean}}}\) and \({{\rm{CO}}2}_{{\rm{Land}}}\) represent the temporal variations of the global amount of atmospheric CO2, the global anthropogenic CO2 emissions, the globally integrated atmosphere-ocean CO2 fluxes, and the globally integrated atmosphere-land CO2 fluxes, respectively. We stress that, in Eq. (1), the CO2 fluxes associated with land use changes due to human activities are included in the \({{\rm{CO}}2}_{{\rm{Land}}}\) term.

Fig. 1: CO2 growth rate decomposition in the historical period.

Interannual standard deviation of (1st section) the rate of change of the atmospheric CO2 amount (\({\sigma }_{t,\frac{{{\rm{d}}CO}2}{{{\rm{d}}t}}}\)), (2nd section) the globally integrated CO2 fluxes over Land (\({\sigma }_{t,{Land}}\)), (3rd section) the globally integrated CO2 fluxes over Ocean (\({\sigma }_{t,{Ocean}}\)) and (4th section) the anthropogenic emissions (\({\sigma }_{t,{Emissions}}\)), for observations and CMIP6 ESMs historical simulations (cf. list of models in Supplementary Table 1). This analysis is performed over the 1986–2013 period, which is common to observation estimates and historical simulations. Externally forced signal is not removed from the data.

To quantify the variations in the annual rate of change of the atmospheric CO2 amount (\(\frac{{\rm{dCO}}2}{{\rm{dt}}}\); cf. Eq. (1)), we compute its interannual standard deviation (\({\sigma }_{t,\frac{{dCO}2}{{dt}}}\)). In addition, to assess the origin of those variations, we compute the interannual standard deviations of the annual globally integrated \({{\rm{CO}}2}_{{\rm{Land}}}\) (called \({\sigma }_{t,{Land}}\)) and \({{\rm{CO}}2}_{{\rm{Ocean}}}\) (called \({\sigma }_{t,{Ocean}}\)) and of the \({{\rm{CO}}2}_{{\rm{Emissions}}}\) (called \({\sigma }_{t,{Emissions}}\); cf. “Interannual variability and inter-model spread”). Focusing on the period 1986–2013, Fig. 1 shows that the interannual variability of the observed atmospheric CO2 growth rate comes equally from \({{\rm{CO}}2}_{{\rm{Land}}}\) and from \({{\rm{CO}}2}_{{\rm{Emissions}}}\) and that \({{\rm{CO}}2}_{{\rm{Ocean}}}\) plays a negligible role. In addition, \({\sigma }_{t,{Land}}\) and \({\sigma }_{t,{Emissions}}\) show values similar to \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\), which implies that the time anomalies of \({{\rm{CO}}2}_{{\rm{Land}}}\) and \({{\rm{CO}}2}_{{\rm{Emissions}}}\) partly compensate each other. In fact, there is a temporal anti-correlation of ~−0.85 between those two terms on average among models (see also Supplementary Results 1).
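The compensation argument follows from the variance of a sum: when two terms are anti-correlated, the variance of their sum is smaller than the sum of their variances. The following is a minimal sketch with made-up anomaly series (not data from this study), which neglects the ocean term since it is described as negligible; the helper `covariance` and all numbers are illustrative assumptions.

```python
from statistics import mean, pvariance

def covariance(x, y):
    """Population covariance of two equally long series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Hypothetical annual anomalies (arbitrary units), chosen to be anti-correlated.
land = [2.0, -1.0, 3.0, -2.0, -2.0]
emissions = [-1.0, 1.0, -2.0, 1.0, 1.0]

# Eq. (1) without the ocean term (assumption made for this illustration only).
growth = [l + e for l, e in zip(land, emissions)]

# Variance of the sum versus the sum of variances plus twice the covariance.
var_growth = pvariance(growth)
var_from_terms = pvariance(land) + pvariance(emissions) + 2.0 * covariance(land, emissions)
```

Here the negative covariance removes most of the variance that the two terms would contribute if they were independent.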

The multi-model mean of \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\) from the historical simulations (cf. “Numerical simulations”) is close to the observed value (47763 kg s−1 vs 38239 kg s−1; see also Supplementary Results 2). Similarly, the historical simulations reproduce the observed \({\sigma }_{t,{Land}}\) and \({\sigma }_{t,{Ocean}}\), as well as their relative contribution to \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\). However, there is a large inter-model spread in the simulation of this interannual variability, with the highest (INM-CM4-8) and the lowest (CESM2) values for \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\) differing by a factor of 3. In agreement with ref. 16, we find that such a spread comes from the atmosphere-land CO2 fluxes.

The large inter-model difference in \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\) in the historical simulations reveals considerable uncertainties in our ability to predict the near-term evolution of atmospheric CO2 using state-of-the-art ESMs. This prevents us from providing in advance useful information for independently verifying global emissions pledged by the signing parties of the Paris Agreement.

Large inter-model spread in \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\)

To explore the origins of the inter-model spread, we remove the common multi-model mean response to time-varying external forcing, \({\beta }_{t}\), from each historical simulation. We call those residuals the historical_DT (from historical detrended) simulations (for more details, cf. “Decomposing the sources of inter-model spread”). The temporal fluctuations of \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) in the historical_DT simulation of a given model \(m\) can be described as the sum of the specific externally forced response of the model that deviates from \({\beta }_{t}\) (cf. \({\varepsilon }_{{mt}}\) in Eq. (3)) and the internally driven variability simulated by the model (cf. \({\gamma }_{{mt}}\) in Eq. (3)). In historical_DT simulations, the inter-model spread in \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\) (Fig. 2) can therefore come from different responses across models to the same external forcings (\({\varepsilon }_{{mt}}\)), from different simulations of the internal variability of the climate system (i.e., different \({\gamma }_{{mt}}\)), or from differences across models in the temporal interaction of \({\varepsilon }_{{mt}}\) and \({\gamma }_{{mt}}\) (cf. details in “Decomposing the sources of inter-model spread”).
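The removal of the common forced response can be sketched as follows. This is a minimal illustration, assuming one annual series per model on a shared time axis; the function name `remove_multi_model_mean`, the placeholder model names and the numbers are hypothetical and not taken from the study.

```python
def remove_multi_model_mean(sims):
    """Subtract the multi-model mean time series (beta_t) from every simulation.

    sims maps a model name to its list of annual values; all lists share one length.
    Returns the residual series per model and beta_t itself.
    """
    n_models = len(sims)
    n_years = len(next(iter(sims.values())))
    # beta_t: average over models, computed separately for each year.
    beta_t = [sum(series[t] for series in sims.values()) / n_models
              for t in range(n_years)]
    residuals = {model: [v - b for v, b in zip(series, beta_t)]
                 for model, series in sims.items()}
    return residuals, beta_t

# Hypothetical growth-rate series for two placeholder models (arbitrary units).
historical = {"model_A": [1.0, 2.0, 3.0], "model_B": [3.0, 4.0, 5.0]}
historical_dt, beta_t = remove_multi_model_mean(historical)
```

By construction, the residuals sum to zero across models in every year, so what remains in each series is the model-specific forced deviation plus its internal variability.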

Fig. 2: CO2 growth rate decomposition and role of the internal variability.

Standard deviation of the CO2 fluxes in total (columns 1–3), and split into the land (columns 4 to 6) and ocean contributions (columns 7 to 9) for historical_DT CMIP6 simulations (1st, 4th, and 7th), piControl CMIP6 simulations (2nd, 5th, and 8th) and for observations (3rd, 6th, and 9th). Numbers over the columns indicate the inter-model spread, \({\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(x\right)\right]\). The externally forced signal was removed from all the historical simulations and observations (cf. details in “Removal of the externally forced variability in observation” and “Decomposing the sources of inter-model spread”).

To disentangle the effects on the inter-model spread in \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\) coming from \(\varepsilon\) and \(\gamma\) in the historical_DT simulations, we compute the temporal variance of \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) (\({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\)) in piControl simulations, which isolate the internal variability by construction (cf. “Decomposing the sources of inter-model spread”). Hereinafter, we denote the interannual variance as \({\sigma }_{t}^{2}\), where the subscript “t” indicates that the variance is computed over the time dimension. This is computed as a measure of the variability of the year-to-year rate of change of the atmospheric CO2 and of its drivers, which we denote \({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\), \({\sigma }_{t,{Land}}^{2}\), \({\sigma }_{t,{Ocean}}^{2}\), and \({\sigma }_{t,{Emissions}}^{2}\) for \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Land}}}\), \({{\rm{CO}}2}_{{\rm{Ocean}}}\) and \({{\rm{CO}}2}_{{\rm{Emissions}}}\), respectively.

Comparing the inter-model variance of \({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\) between piControl and historical_DT simulations, we find that 65% of the inter-model variance in historical_DT simulations is directly coming from the different simulations among models of the internal variability, whereas 35% is explained by the different model responses to the same external forcing and its potential interaction with the internal variability.
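The inter-model spread metric \({\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(x\right)\right]\) can be sketched as a variance across models of each model's temporal variance. The snippet below is a hedged illustration: it assumes population variances and assumes that the internally driven share is read as the ratio of the piControl spread to the historical_DT spread, which may differ from the exact procedure of the study. All names and numbers are hypothetical.

```python
from statistics import pvariance

def inter_model_spread(sims):
    """Variance across models of each model's temporal (interannual) variance."""
    temporal_variances = [pvariance(series) for series in sims.values()]
    return pvariance(temporal_variances)

# Hypothetical annual series for two placeholder models (arbitrary units).
picontrol = {"model_A": [0.0, 2.0, 0.0, 2.0], "model_B": [0.0, 4.0, 0.0, 4.0]}
historical_dt = {"model_A": [0.0, 2.0, 0.0, 2.0], "model_B": [0.0, 6.0, 0.0, 6.0]}

# Assumed reading: share of the historical_DT spread already present in piControl.
internal_fraction = inter_model_spread(picontrol) / inter_model_spread(historical_dt)
```

The two simulation sets need not have the same length, since only their temporal variances enter the comparison.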

In addition, results from Figs. 1 and 2 highlight that the spread among models in the interannual variability of the atmospheric CO2 growth rate (\({\sigma }_{t,\frac{{dCO}2}{{dt}}}\)) is coming from how the models simulate the atmosphere-land CO2 fluxes. In fact, we quantify that 94% and 0.3% of the inter-model variance of \({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\) in historical_DT is coming from the inter-model variance of \({\sigma }_{t,{Land}}^{2}\) and \({\sigma }_{t,{Ocean}}^{2}\), respectively.

Since the inter-model spread is primarily coming from different simulations of the internal variability across models, we further analyze the piControl simulations to better understand the origins of this spread. In particular, we will answer the three following questions:

  1. Which land regions are contributing the most to the global changes in atmospheric CO2 concentration?

  2. What are the main drivers of the variability of the air-land CO2 fluxes?

  3. Where is the large spread among models coming from?

Land areas controlling \({\sigma }_{t,\frac{{dCO}2}{{dt}}}\)

In this section, we analyze which land areas contribute the most to the interannual variability of the atmospheric CO2 growth rate. Figure 3a shows that, on average over all piControl simulations, tropical areas are the regions contributing the most to \({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\), in particular: tropical South America, tropical-southern Africa, southeast Asia and part of Oceania (in agreement with refs. 17,18,19,20). These regions explain, on average among models, 22%, 10%, 5%, and 4% of \({\sigma }_{t,{Land}}^{2}\), respectively, whereas their covariances explain as much as 34% of \({\sigma }_{t,{Land}}^{2}\) (see also Supplementary Results 3). These are also the areas where models disagree the most on the amplitude of the interannual standard deviation of the land CO2 fluxes (Fig. 3b). This indicates that the inter-model spread in Fig. 2 is dominated by differences among models in the CO2 flux over tropical land areas. In fact, there is an inter-model correlation of 0.99 between \({\sigma }_{t,{Land}}\) and the temporal standard deviation of the land CO2 fluxes integrated over the tropics (35°S–35°N).

Fig. 3: Main driver of the global atmosphere-land CO2 fluxes.

Regression between globally integrated and local atmosphere-land CO2 fluxes from piControl simulations: a multi-model mean (mean of the model-specific regression coefficients) (Units: kgs−1 kg−1s), b inter-model spread (standard deviation across models of the model-specific regression coefficients) (Units: kgs−1 kg−1s). c–f are the same as a and b but for the local SST (Units: K kg−1s) and 2-meter air temperature (Units: K kg−1s) instead of the local CO2 fluxes, respectively. Regression maps are computed using the annual average from January to December. The CO2 fluxes are defined positively into the land. Black boxes in a indicate the regions contributing the most to the global atmosphere-land CO2 fluxes. See also Supplementary Figs. 2 and 3 for maps of each model.
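The construction of such maps can be sketched as a per-grid-cell least-squares slope, followed by a mean and a standard deviation across models. This is a minimal illustration that assumes the local field is regressed onto the globally integrated flux and that all models share one grid; the function names, the two-cell "grid" and all numbers are hypothetical.

```python
from statistics import mean, pstdev

def slope(y, x):
    """Least-squares slope of y regressed onto x."""
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def regression_map(local_fields, global_series):
    """One regression coefficient per grid cell (each cell is a time series)."""
    return [slope(cell, global_series) for cell in local_fields]

# Hypothetical annual data: per model, a global series and two grid cells.
models = {
    "model_A": {"global": [0.0, 1.0, 2.0], "cells": [[0.0, 2.0, 4.0], [0.0, -1.0, -2.0]]},
    "model_B": {"global": [0.0, 1.0, 2.0], "cells": [[0.0, 4.0, 8.0], [0.0, -1.0, -2.0]]},
}
maps = [regression_map(m["cells"], m["global"]) for m in models.values()]

# Panel-a-like map (mean across models) and panel-b-like map (spread across models).
multi_model_mean = [mean(cell_values) for cell_values in zip(*maps)]
inter_model_spread_map = [pstdev(cell_values) for cell_values in zip(*maps)]
```

In this toy case the models agree on the second cell and disagree on the first, so only the first cell shows a non-zero spread.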

To identify whether one of the key regions marked in Fig. 3a is leading the inter-model variance of \({\sigma }_{t,{Land}}^{2}\), we decompose the global \({\sigma }_{t,{Land}}^{2}\) into its regional contributions (details in Supplementary Note). As a result, Fig. 4 shows that the tropics as a whole explain 44% of the total inter-model spread of \({\sigma }_{t,{Land}}^{2}\), the covariance between \({\sigma }_{t,{Land}-{Tropics}}^{2}\) and \({\sigma }_{t,{Land}-{Extratropics}}^{2}\) explains 32%, the extratropics account for 8%, and the residual for 13%. These results imply that no single region drives the inter-model variance; all regions contribute to it. Moreover, the large covariance between \({\sigma }_{t,{Land}-{Tropics}}^{2}\) and \({\sigma }_{t,{Land}-{Extratropics}}^{2}\) suggests that models with high variability of \({{\rm{CO}}2}_{{\rm{Land}}}\) in the tropics also have high variability of \({{\rm{CO}}2}_{{\rm{Land}}}\) in the extratropics. Together, the tropics and the covariance between \({\sigma }_{t,{Land}-{Tropics}}^{2}\) and \({\sigma }_{t,{Land}-{Extratropics}}^{2}\) explain 76% of the total inter-model spread shown in Fig. 2.
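One plausible way to arrive at these four terms is sketched below. It is an assumption for illustration and not necessarily the decomposition of the Supplementary Note; the symbols \({{\rm{cov}}}_{t}\), \({{\rm{cov}}}_{m}\) and \(R\) are introduced here only for this sketch. For each model, splitting the global land flux into its tropical and extratropical parts gives

$${\sigma }_{t,{Land}}^{2}={\sigma }_{t,{Land}-{Tropics}}^{2}+{\sigma }_{t,{Land}-{Extratropics}}^{2}+2\,{{\rm{cov}}}_{t}\left({{\rm{CO}}2}_{{\rm{Land}}-{\rm{Tropics}}},{{\rm{CO}}2}_{{\rm{Land}}-{\rm{Extratropics}}}\right)$$

Taking the variance across models of both sides then yields

$${\sigma }_{m}^{2}\left[{\sigma }_{t,{Land}}^{2}\right]={\sigma }_{m}^{2}\left[{\sigma }_{t,{Land}-{Tropics}}^{2}\right]+{\sigma }_{m}^{2}\left[{\sigma }_{t,{Land}-{Extratropics}}^{2}\right]+2\,{{\rm{cov}}}_{m}\left({\sigma }_{t,{Land}-{Tropics}}^{2},{\sigma }_{t,{Land}-{Extratropics}}^{2}\right)+R$$

where \(R\) collects every term that involves the temporal covariance between tropical and extratropical fluxes. Under this reading, each percentage would be the corresponding term divided by the left-hand side.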

Fig. 4: Contributions to the inter-model variance by tropics and extratropics.

Percentage of the inter-model variance explained by the Tropics (44%), Extratropics (8%), covariance between \({{\rm{\sigma }}}_{{\rm{t}},{\rm{Land}}-{\rm{Tropics}}}^{2}\) and \({\sigma }_{t,{Land}-{Extratropics}}^{2}\) (31%) and the residual term (13%). The percentage explained by each of the tropical key regions from Fig. 3a is shown in light gray next to the Tropics bar. More details about the decomposition can be found in Supplementary Note.

Ocean drivers of the atmospheric CO2 growth rate variability

Previous studies have shown a strong relation between ENSO and \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) as well as \({{\rm{CO}}2}_{{\rm{Land}}}\)12,13,21,22,23. In particular, during El Niño events (positive phase of ENSO), the land biosphere becomes a net source of CO2 for the atmosphere12,24,25. The anomalously warm temperatures over the tropics reduce vegetation productivity by degrading the photosynthesis processes26, which decreases the atmospheric CO2 uptake by vegetation. Moreover, warmer conditions increase the heterotrophic respiration in the tropics through an enhanced microbial metabolism that decomposes soil carbon, leading to more CO2 outgassing27. In addition, the broadly drier conditions during El Niño events increase vegetation mortality rates, which releases CO2 into the atmosphere28,29. The opposite mechanisms operate during the negative phase of ENSO, La Niña events.

We find that there is a general agreement across models that ENSO is the main oceanic driver of \({\sigma }_{t,{Land}}\,\) (Fig. 3c, e, Supplementary Fig. 3), in agreement with previous studies. The multi-model mean of the regression map of grid-point SST on \({{CO}2}_{{Land}}\) from the piControl simulations shows a pattern very similar to the regression map of grid-point SST on the Niño4 index (cf. “The Niño4 index as a proxy for ENSO”; Fig. 3e), pointing to the tropical Pacific SST variations as the dominant driver of \({\sigma }_{t,{Land}}\) (Fig. 3c). The regression map of SST on \({{\rm{CO}}2}_{{\rm{Land}}}\) also shows negative regression coefficients over the tropical Indian and Atlantic oceans, although much weaker than in the equatorial Pacific. Such correlations can be explained by the impacts of ENSO on the other tropical basins30,31,32 and therefore might not reflect a causal link between those local SSTs and the \({{\rm{CO}}2}_{{\rm{Land}}}\).

Because ENSO is the main driver of the interannual \({{\rm{CO}}2}_{{\rm{Land}}}\) variability, inter-model differences in ENSO characteristics (spatial pattern, amplitude and teleconnections; Fig. 3f, Supplementary Figs. 3–6) could explain the inter-model variance of \({\sigma }_{t,{Land}}^{2}\). In fact, the inter-model spread in the regression map of local SST on \({{\rm{CO}}2}_{{\rm{Land}}}\) (Fig. 3d) shows a pattern similar to the inter-model spread in the regression map of local SST on Niño4 (Fig. 3f; pattern correlation of 0.89). This similarity supports the link between ENSO diversity among models and the inter-model variance in \({\sigma }_{t,{Land}}^{2}\). However, it is also possible that the inter-model variance in \({\sigma }_{t,{Land}}^{2}\) comes from different land vegetation model responses to similar atmospheric forcing33 (including the ENSO teleconnections). In the following section, we discuss those two hypotheses in more detail.

Origin of the multi-model spread in CMIP6 models

Figure 5a shows that ENSO has a strong impact on land CO2 fluxes over the tropical areas, as expected from Fig. 3. In particular, positive ENSO conditions (i.e. El Niño states) are associated with a decrease of the land CO2 fluxes over most of the tropical areas. Physically, the influence of ENSO on the land CO2 fluxes takes place through its impacts on the 2-meter air temperature and precipitation conditions. During El Niño events, the whole tropical band warms up (Fig. 5e). These warm anomalies are detrimental to tropical vegetation already stressed by heat26,34,35 and they are expected to decrease the land CO2 uptake. However, the highest absolute regression values between CO2 fluxes and ENSO appear mostly where there are also strong negative regression values between ENSO and precipitation, as in tropical South America, South-East Asia and Australia (Fig. 5c). These drier conditions hinder the growth and development of vegetation and therefore lead to a decrease of the land CO2 uptake as well (in agreement with ref. 12 and references therein).

Fig. 5: Impact of Niño4 in land CO2 fluxes, precipitation and 2-meter air temperature.

Multi-model mean of regression maps of the Niño4 index onto a local land CO2 fluxes (defined positively when pointing into land) (Units: kgs−1 K−1), c precipitation (Units: kgm−2 s−1 K−1) and e 2-meter air temperature (Units: K/K). Inter-model spread of the regression maps of Niño4 onto b \({{CO}2}_{{Land}}\), d precipitation (Units: kgm−2 s−1 K−1) and f 2-meter air temperature (Units: K/K), computed as the standard deviation of the regression maps across models. See also Supplementary Figs. 4, 5 and 6 for maps of each model. The local land CO2 fluxes, precipitation and air surface temperatures are annually averaged from January to December, whereas the Niño4 index is computed as the average from October to March in such a way that ENSO is contemporary or slightly leading the anomalies seen on the maps.

The tropical regions that show the largest inter-model spread in the land CO2 flux response to ENSO coincide with the regions showing the largest inter-model spread in the land CO2 flux regression with the globally integrated \({{\rm{CO}}2}_{{\rm{Land}}}\) (compare Figs. 3b and 5b). In addition, those tropical regions are also the ones with the largest inter-model spread in the precipitation and temperature responses to ENSO (Fig. 5d, f). This supports the idea that different ENSO teleconnections among models contribute to the inter-model variance in \({\sigma }_{t,{Land}}^{2}\), in agreement with ref. 36.

However, if different ENSO teleconnections among models were really responsible for the inter-model variance in \({\sigma }_{t,{Land}}^{2}\), we should expect a positive inter-model correlation between the amplitude of the temperature and precipitation responses to ENSO and the interannual variability of land CO2 fluxes at the regional scale. Specifically, we expect that models simulating stronger ENSO impacts on temperature and precipitation would also simulate stronger \({{\rm{CO}}2}_{{\rm{Land}}}\) interannual variability. Yet, here we do not find such correlations in general among the four key regions controlling the global \({\sigma }_{t,{Land}}\) (Table 1). Only the Southeast Asia region shows a significant positive inter-model correlation (r = 0.47) between the local CO2 flux temporal variability and the sensitivity of the precipitation response to ENSO (see Supplementary Figs. 7 and 8). Overall, this result implies that differences in ENSO teleconnections among models are not the main reason for the inter-model variance in \({\sigma }_{t,{Land}}^{2}\). We note that results based on the Niño3.4 index rather than the Niño4 index are very similar to those shown here.
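The inter-model correlation test described above amounts to a Pearson correlation computed across models, with one pair of values per model. The following is a minimal sketch; the variable names and numbers are hypothetical and are not the values behind Table 1.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# One value per model for a given region (hypothetical numbers):
# amplitude of the precipitation response to ENSO, and the temporal
# standard deviation of the regional land CO2 flux.
precip_sensitivity = [0.5, 1.0, 1.5, 2.0]
flux_sigma = [1.0, 3.0, 2.0, 4.0]

r = pearson(precip_sensitivity, flux_sigma)
```

A value of r near zero across models would indicate that stronger simulated teleconnections do not go hand in hand with stronger flux variability in that region.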

Table 1 Role of ENSO teleconnections in the inter-model spread of \({\sigma }_{t,{Land}}^{2}\).

Land vegetation is more tightly linked to soil moisture than precipitation37,38. Ultimately, land vegetation conditions depend on the soil moisture availability. Therefore, their response to ENSO depends on ENSO atmospheric teleconnections and their relation to soil properties. The different representation of water storage and soil processes among the CMIP6 models could therefore explain part of the inter-model spread in \({\sigma }_{t,{Land}}^{2}\).

Another potential explanation for the large inter-model variance in \({\sigma }_{t,{Land}}^{2}\) is that the different land vegetation components of the CMIP6 ESMs have different sensitivities to similar atmospheric forcing. To verify this hypothesis, we compute the inter-model variance in \({\sigma }_{t,{Land}}^{2}\) from the land-hist simulations (cf. “Numerical simulations”) and compare it to that of the historical simulations. In land-hist simulations, the land surface model components used in historical simulations are forced with the same atmospheric reanalysis forcing. Differences among land CO2 fluxes in land-hist can therefore only arise from different land surface model sensitivities to identical meteorological conditions. As for the historical simulations, to focus on the inter-model spread, we remove the time-varying multi-model mean signal from each of the land-hist simulations, and we call this residual land-hist_DT. We stress that the number of land-hist simulations available is limited (only 7), so the results based on those datasets should be considered with caution.

Figure 6 shows that, for the same land surface model, \({\sigma }_{t,{Land}}\) can differ between historical_DT and land-hist_DT (e.g., UKESM1-0-LL, MPI-ESM1-2-LR, and IPSL-CM6A-LR). This can be explained by the different ENSO teleconnections seen by the land surface model depending on whether the forcing comes from observations (land-hist) or from the free ESM integration (historical). However, overall the inter-model variances in \({\sigma }_{t,{Land}}^{2}\) from historical_DT and land-hist_DT simulations are similar. This result indicates that the main source of spread among models in \({\sigma }_{t,{Land}}^{2}\) is the different sensitivity of the land vegetation models to identical atmospheric forcing.

Fig. 6: Variability of the CO2 fluxes over land in historical CMIP6 simulations and land-hist LUMIP simulations.

The atmosphere-land CO2 fluxes are globally integrated, annually averaged and with the externally forced signal removed, as in Fig. 2. Period 1986–2013. Numbers over the columns indicate the inter-model spread, \({\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(x\right)\right]\).

Discussion

Using observation-based products, we analyzed the interannual variability of the atmospheric CO2 growth rate over 1986–2013. We show that the main sources of interannual variability over this period are anthropogenic emissions and atmosphere-land CO2 fluxes. We find that, on average, the historical simulations of the CMIP6 database reproduce the observed variability and its partitioning. However, we reveal a large spread among models, which implies a strong uncertainty in our ability to predict the near-term evolution of atmospheric CO2 using state-of-the-art ESMs. In particular, it prevents us from providing, in advance, a trustworthy forecast of the atmospheric CO2 concentration that would result if the global emissions pledged by the signing countries of the Paris Agreement were respected.

Comparing the historical simulations to the piControl simulations, we find that the inter-model spread is mostly coming from the different simulations across models of the internal variability of the land CO2 fluxes, explaining 64% of the total inter-model variance. Although in all models, ENSO is the main driver of the interannual land CO2 flux variability at a global scale, we find that the diversity of ENSO and of its associated teleconnections among models are not the main causes for this inter-model spread. Indeed, using land-hist simulations, we show that most of this spread can be due to the different sensitivities of the ESMs’ land vegetation component to identical atmospheric forcing.

Overall, our results show that it is pressing to constrain the sensitivity of the land vegetation models to atmospheric forcing better in order to improve our ability to predict the future evolution of the atmospheric CO2 concentration. This calls for large-scale measurement campaigns of the land surface CO2 fluxes in order to calibrate models better. Better understanding and constraining the sensitivity of land vegetation models to climate forcing will have several important benefits:

  1. Reduction of the carbon budget imbalance. For the current estimations of \({{\rm{CO}}2}_{{\rm{Land}}}\), the GCB uses outputs from several land-vegetation models forced with the same atmospheric reanalysis. Improving the sensitivity of the land-vegetation models would improve the \({{\rm{CO}}2}_{{\rm{Land}}}\) estimations and thus reduce the carbon budget imbalance.

  2. As mentioned before, the GCB has the limitation of working only as a backward check of the global CO2 emissions. To move beyond this approach, the GCB could use near-term climate predictions. An improvement in the sensitivity of the land-vegetation models could improve the near-term CO2 predictions, which the GCB could use as a forward verification of the implementation of the Paris Agreement.

  3. Finally, it will also have important benefits for future climate scenarios. The fact that different land-vegetation models present different sensitivities to the same climate conditions introduces a large spread in the response of \({{\rm{CO}}2}_{{\rm{Land}}}\) to climate conditions32 and in the future evolution of the atmospheric CO2 concentration, and therefore in the climate changes for a given emission scenario.

Methods

The main objective of our study is to evaluate the performance of the CMIP6 class ESMs in simulating the temporal changes of the observed amount of atmospheric CO2 and its associated drivers. In particular, we focus on the variability of the year-to-year rate of change of the atmospheric CO2 amount, which is directly linked to the sum, over a year, of the \({{\rm{CO}}2}_{{\rm{Emissions}}}\) and of the \({{\rm{CO}}2}_{{\rm{Ocean}}}\) and \({{\rm{CO}}2}_{{\rm{Land}}}\) (see Eq. (1)).

Observation estimates (\(\frac{{\bf{dCO2}}}{{\bf{dt}}}\), CO2Land, CO2Ocean, CO2Emissions)

We use as reference dataset the observed estimate of the global-mean monthly mole fraction of CO2 in surface air from the historical Greenhouse Gases dataset39, which was used to force CMIP6 historical simulations (cf. input4MIPs: https://esgf-node.llnl.gov/projects/input4mips/). To estimate the observed rate of change of the atmospheric CO2 amount, \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), we converted those surface values from parts per million of CO2 to kg of carbon by multiplying them by \(2.124\times {10}^{12}\) kgC ppm−1, following ref. 16. By computing the differences between consecutive December–January means, we obtain an estimate of the annual rate of change in the global atmospheric CO2 quantity (in kgC/s). We compute this observed \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) over the period 1850–2013 because the original observed estimate dataset of surface atmospheric CO2 concentration only covers the period January 1849 to December 2014.
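The conversion and differencing steps can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `annual_growth_rate` is hypothetical, and the mean year length used to convert to kgC/s is an assumption, since the text does not state which value was used.

```python
PPM_TO_KGC = 2.124e12                  # kgC per ppm, conversion factor quoted in the text
SECONDS_PER_YEAR = 365.25 * 86400.0    # assumption: mean year length in seconds

def annual_growth_rate(monthly_ppm, first_year):
    """Annual rate of change of atmospheric CO2 in kgC/s.

    monthly_ppm holds monthly means starting in January of first_year.
    The rate for year Y is the difference between the December(Y)-January(Y+1)
    mean and the December(Y-1)-January(Y) mean, so the first and last years
    of the record get no estimate.
    """
    n_years = len(monthly_ppm) // 12

    def boundary(k):
        # Mean of December of year index k and January of year index k + 1.
        return 0.5 * (monthly_ppm[12 * k + 11] + monthly_ppm[12 * (k + 1)])

    rates = {}
    for k in range(1, n_years - 1):
        delta_ppm = boundary(k) - boundary(k - 1)
        rates[first_year + k] = delta_ppm * PPM_TO_KGC / SECONDS_PER_YEAR
    return rates
```

With a record running from January 1849 to December 2014, this scheme returns estimates for 1850 to 2013, which matches the period stated above.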

Analogously, \({{\rm{CO}}2}_{{\rm{Emissions}}}\) are taken from the annual sums of the anthropogenic emissions from the Community Emissions Data System (CEDS) dataset from Hoesly et al (2018) used in CMIP6.

\({{\rm{CO}}2}_{{\rm{Ocean}}}\) is based on the monthly atmosphere-ocean CO2 flux data of Watson40 and the 7 Global Carbon Budget 2021 observation-based Data Products (cf. Table 4 in ref. 41): CSIR-ML642, NIES-NN43, JMA-MLR44, OS-ETHZ-GRaCER45, CMEMS-LSCE-FFNNv246, Landschützer47,48 (MPI-SOMFFN) and Rödenbeck49 (Jena-MLS). The monthly values from these products are globally integrated and annually averaged. The period common to all the ocean flux data products spans 1986–2020. However, given that the observed estimate of \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) is only available until 2013, our period of study is limited to 1986–2013.

\({{\rm{CO}}2}_{{\rm{Land}}}\) is estimated as the residual of the three other terms of Eq. (1), as there are no observation-based global measurements of the atmosphere-land CO2 fluxes. Since 8 observation-based products of the atmosphere-ocean CO2 fluxes exist (see above), we compute and use 8 different globally integrated atmosphere-land CO2 fluxes in this article.
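
As a hedged illustration of this residual estimate, the sketch below computes one land flux per ocean product. It assumes that Eq. (1) reads \(\frac{{\rm{dCO}}2}{{\rm{dt}}}={{\rm{CO}}2}_{{\rm{Emissions}}}+{{\rm{CO}}2}_{{\rm{Ocean}}}+{{\rm{CO}}2}_{{\rm{Land}}}\) with all terms in kgC/s and following the sign convention of that equation. The product names and numbers are made up for the example.

```python
def land_residual(dco2_dt, emissions, ocean):
    """CO2_Land per year as the residual of Eq. (1), assumed here to read
    dCO2/dt = CO2_Emissions + CO2_Ocean + CO2_Land (all terms in kgC/s)."""
    return [d - e - o for d, e, o in zip(dco2_dt, emissions, ocean)]


# Hypothetical annual values (kgC/s) for three years; not real data.
dco2_dt = [100.0, 120.0, 90.0]
emissions = [200.0, 205.0, 210.0]
ocean_products = {
    "product_A": [-60.0, -55.0, -65.0],
    "product_B": [-58.0, -57.0, -62.0],
}

# One residual land flux per ocean product (8 of them in the study).
land_by_product = {
    name: land_residual(dco2_dt, emissions, ocean)
    for name, ocean in ocean_products.items()
}
```

Because the land term is whatever closes the budget, any error in the other three series ends up in it.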

We note two caveats in our estimate of \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\). First, due to the temporal resolution of the original data, the annual rate of change is not estimated here from differences of instantaneous values taken on the 1st of January at 00:00 but from December–January mean values. Using an emission-driven esm-piControl simulation performed with the EC-Earth3-CC model (cf. Supplementary Table 1), for which daily data are available, we quantified that this approximation leads to a ~4% error on average in our annual rate estimate, which translates into an underestimation of 0.6% of the annual rate standard deviation. Second, the conversion of atmospheric CO2 concentration from input4MIPs into global atmospheric carbon mass is based on a well-mixed atmosphere hypothesis (Ballantyne et al 2012). This assumes that variations of CO2 concentration in the marine boundary layer (where the observations are made) are representative of the horizontally and vertically integrated atmospheric CO2 changes. Using an emission-driven esm-hist simulation of EC-Earth3-CC, we compared the globally integrated atmospheric CO2 mass with the atmospheric CO2 concentration at the ocean surface converted into CO2 mass following the same conversion as for the observed values. We find a correlation of 0.99 between the annual rates of atmospheric CO2 change computed from the two variables. Moreover, the standard deviation of the annual rate computed from the surface CO2 concentration overestimates the standard deviation computed from the atmospheric CO2 mass by only 0.8%.

Additionally, we stress that, because \({{\rm{CO}}2}_{{\rm{Land}}}\) is estimated as a residual, this term by construction compensates for any imbalance in the observed carbon budget arising from errors in the observed \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Emissions}}}\) and \({{\rm{CO}}2}_{{\rm{Ocean}}}\).

Numerical simulations

A list summarizing the ESMs and simulations used in this study is provided in Supplementary Table 1. We use the so-called historical simulations of the CMIP6 ESMs to evaluate the performance of those models in simulating the interannual variability of observed surface CO2 fluxes. In the historical simulations, the climate components of the ESMs (atmosphere, ocean, land surface, sea-ice) interact freely with each other, and the external forcings (also known as boundary conditions; cf. “Removal of the externally forced variability in observation”) follow the observed ones over the period 1850–2014. We use only one member per model of those historical simulations (see Supplementary Table 1).

To gain more insight into the role of the interannual variability arising from intrinsic climate fluctuations, we also analyze the piControl simulations of the CMIP6 ESMs. Those simulations are similar to the historical ones, but the boundary conditions are kept fixed at their pre-industrial values. Comparing historical and piControl simulations allows us to quantify how much of the total variability of the Earth system is driven by changes in the external forcings and how much by internal climate interactions (cf. “Removal of the externally forced variability in observation”).

We also use the land-hist simulations of the Land Use Model Intercomparison Project (LUMIP; Lawrence et al 2016). In those simulations, the land vegetation component of different ESMs participating in CMIP6 is forced with the same observation-based atmospheric conditions derived from the Land Surface, Snow and Soil moisture Model Intercomparison Project50 (LS3MIP) over a period covering at least 1850–2014. The land use is evolving as in the historical ESM simulations51. For the same land vegetation model, any differences in \({{\rm{CO}}2}_{{\rm{Land}}}\) between land-hist and historical simulations can therefore be attributed to the difference of atmospheric forcing received by the land.

As stated above, the historical and land-hist simulations cover the period from January 1850 to December 2014. However, we limit our analysis to the period 1986-2013, which is the longest overlapping period between those simulations and the observation records. For the piControl simulations, we consider the full length of the simulations.

Estimation of the annual atmospheric \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) in simulations

As already stated in the previous section, we use in this study the historical and piControl simulations of the CMIP6 ESMs. These simulations are driven by prescribed atmospheric CO2 concentration (also called concentration-driven). This implies that the balance between the terms of Eq. (1) is broken in those simulations, as the surface fluxes (\({{\rm{CO}}2}_{{\rm{Ocean}}}\) + \({{\rm{CO}}2}_{{\rm{Land}}}\)) cannot drive the rate of change of the amount of atmospheric CO2 (\(\frac{{\rm{dCO}}2}{{\rm{dt}}}\)). Using the esm-hist and esm-piControl simulations of the CMIP6 ESMs, which are CO2 emission-driven simulations, would have allowed us to work in a framework that respects the balance of Eq. (1). However, on the Earth System Grid Federation (ESGF) nodes, there are fewer models providing land and ocean surface CO2 fluxes for both esm-hist and esm-piControl simulations (a total of 11) than models providing those fluxes for both historical and piControl simulations (a total of 20). To quantify as precisely as possible the inter-model spread in the representation of the surface CO2 fluxes, we opted to focus on the concentration-driven simulations.

Because of this choice, we need to diagnose the compatible rate of change of atmospheric CO2. This is achieved here by computing \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\) for the historical and piControl simulations as the annual sum of anthropogenic emissions and the globally integrated ocean and land fluxes, i.e. \({{\rm{CO}}2}_{{\rm{Emissions}}}\) + \({{\rm{CO}}2}_{{\rm{Ocean}}}\) + \({{\rm{CO}}2}_{{\rm{Land}}}\) from Eq. (1). The \({{\rm{CO}}2}_{{\rm{Emissions}}}\) used here is the same as for the observations (cf. “Observation estimates (\(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Land}}}\), \({{\rm{CO}}2}_{{\rm{Ocean}}}\), \({{\rm{CO}}2}_{{\rm{Emissions}}}\))”). \({{\rm{CO}}2}_{{\rm{Ocean}}}\) and \({{\rm{CO}}2}_{{\rm{Land}}}\) are computed from the simulation outputs called “fgco2” and “nbp”, respectively. Those fluxes are defined positively when pointing into land/ocean.
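
The global integration of a gridded flux such as “fgco2” or “nbp” can be sketched as below. This is an assumption-laden illustration rather than the processing chain of the study: it assumes fluxes in kgC m−2 s−1 and a matching field of grid-cell areas in m2, uses plain nested lists, and ignores differences in month length when averaging. The sign with which each integrated flux enters Eq. (1) is deliberately left out of the sketch.

```python
def global_integral(flux, cell_area):
    """Area-weighted sum of a 2-D flux field.

    flux      : rows of values in kgC m-2 s-1 (None marks a masked cell)
    cell_area : rows of grid-cell areas in m2, same shape as flux
    Returns the globally integrated flux in kgC s-1.
    """
    total = 0.0
    for flux_row, area_row in zip(flux, cell_area):
        for value, area in zip(flux_row, area_row):
            if value is not None:
                total += value * area
    return total


def annual_mean(monthly_values):
    """Unweighted mean of 12 monthly values (month lengths are ignored)."""
    if len(monthly_values) != 12:
        raise ValueError("expected exactly 12 monthly values")
    return sum(monthly_values) / 12.0


# Tiny made-up 2 x 2 grid with one masked cell.
flux = [[1.0e-9, None], [2.0e-9, 3.0e-9]]
area = [[1.0e10, 1.0e10], [2.0e10, 1.0e10]]
total = global_integral(flux, area)
```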

By adopting this procedure, we are omitting the possible feedback that changes in the atmospheric CO2, driven by surface fluxes, can have on the surface fluxes themselves (i.e. the carbon feedback). Comparing 10 historical and 10 esm-hist simulations of EC-Earth3-CC over the period 1986-2013, we find that not accounting for this carbon feedback leads to an underestimation of ~15% (p value < 0.01) of the interannual standard deviation of the \({{\rm{CO}}2}_{{\rm{Ocean}}}\). However, this has no significant impact (p value > 0.1) on the interannual standard deviations of \({{\rm{CO}}2}_{{\rm{Land}}}\) and \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\). In fact, in this study, we find that the inter-model spread is almost exclusively coming from the land CO2 flux variability, for which the inter-model variance dominates by a factor of 50 that of the ocean CO2 flux variability (cf. “Global atmospheric CO2 growth rate interannual variability”). The underestimation of the variability of the ocean surface CO2 flux has, therefore, negligible consequences on the results and conclusions of this study.

Interannual variability and inter-model spread

As mentioned before, for the observations and model simulations, we use the interannual variance (noted \({\sigma }_{t}^{2}\); where the subscript “t” indicates the variance is computed over the time dimension) as a measure of the variability of the year-to-year rate of change of the atmospheric CO2 and of its drivers, which we note \({\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\), \({\sigma }_{t,{Land}}^{2}\), \({\sigma }_{t,{Ocean}}^{2}\), and \({\sigma }_{t,{Emissions}}^{2}\) for \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Land}}}\), \({{\rm{CO}}2}_{{\rm{Ocean}}}\) and \({{\rm{CO}}2}_{{\rm{Emissions}}}\), respectively.

In addition, we quantify the inter-model spread in simulating those interannual variances by computing their variance across models (noted \({\sigma }_{m}^{2}\); “m” for model), that is to say: \({\sigma }_{m}^{2}\left[{\sigma }_{t,\frac{{dCO}2}{{dt}}}^{2}\right]\), \({\sigma }_{m}^{2}\left[{\sigma }_{t,{Land}}^{2}\right]\), \({\sigma }_{m}^{2}\left[{\sigma }_{t,{Ocean}}^{2}\right]\). We stress that there is no inter-model spread in the interannual variance of \({{\rm{CO}}2}_{{\rm{Emissions}}}\) (i.e., \({\sigma }_{m}^{2}\left[{\sigma }_{t,{Emissions}}^{2}\right]=0\)), as all the model simulations were virtually forced by the same human CO2 emissions.
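
The two nested variances can be illustrated with a short sketch. The model names and values are invented, and the population variance (division by N) is an assumption, since the text does not specify the normalisation used.

```python
from statistics import pvariance


def interannual_variance(annual_series):
    """sigma_t^2: variance over the time dimension of one annual series."""
    return pvariance(annual_series)


def inter_model_variance(series_by_model):
    """sigma_m^2[sigma_t^2]: variance across models of their sigma_t^2."""
    return pvariance([interannual_variance(s) for s in series_by_model.values()])


# Hypothetical annual land fluxes for three made-up models (arbitrary units).
land = {
    "model_A": [1.0, 3.0, 1.0, 3.0],   # sigma_t^2 = 1
    "model_B": [0.0, 4.0, 0.0, 4.0],   # sigma_t^2 = 4
    "model_C": [2.0, 2.0, 2.0, 2.0],   # sigma_t^2 = 0
}

# Every model sees the same emissions, so the spread across models vanishes.
emissions = {name: [5.0, 6.0, 7.0, 8.0] for name in land}

spread_land = inter_model_variance(land)
spread_emissions = inter_model_variance(emissions)
```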

Removal of the externally forced variability in observation

To focus on the internal variability in the observations, we detrend the observed estimates by removing the multi-model mean of the historical simulations (\({\beta }_{t}\), cf. “Decomposing the sources of inter-model spread”), which we consider as our best estimate of the externally forced variability.

Decomposing the sources of inter-model spread

In the observations and the historical simulations, the variability in the surface CO2 fluxes and in the rate of change of atmospheric CO2 can be driven by changes in the boundary conditions of the Earth system. In our framework, those boundary conditions, also known as external forcings, include the incoming solar radiation, volcanic eruptions, land-use changes and the atmospheric CO2 concentration. However, even without changes in those external forcings, the Earth system experiences variability due to its chaotic behavior: the internal variability. Therefore, the variability of \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Land}}}\) and \({{\rm{CO}}2}_{{\rm{Ocean}}}\) can be decomposed into two components: externally and internally driven variability. More generally, to account for the inter-model spread in the historical simulations, the temporal variability of a variable \(x\) from a model \(m\) can be decomposed into the sum of:

$${x}_{{mt},{\rm{historical}}}={\beta }_{t}+{\varepsilon }_{{mt}}+{\gamma }_{{mt}}$$
(2)

where the subscripts \(m\) and \(t\) refer to a given model and time step, respectively; \({\beta }_{t}\) is the temporal variations of the multi-model average, which we consider as our best estimate of the externally forced variability (cf. “Removal of the externally forced variability in observation”); \({\varepsilon }_{{mt}}\) is the externally forced response of model \(m\) that deviates from the multi-model mean externally forced response (\({\beta }_{t}\)); and \({\gamma }_{{mt}}\) represents the pure internal variability (cf. more details in Supplementary Methods).

To focus on the sources of the inter-model differences, we detrend all historical simulations by subtracting the common externally forced signal \({\beta }_{t}\), and we call those residuals the historical_DT simulations. In the historical_DT simulations, the temporal variability of a variable \(x\) is therefore:

$${x}_{{mt},{\rm{historical\_DT}}}={\varepsilon }_{{mt}}+{\gamma }_{{mt}}$$
(3)
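
The removal of the common forced signal, applied both to the historical simulations and to the observed estimates, can be sketched as follows. The series are made up and the function names are hypothetical. The sketch assumes that all series are annual and cover the same years.

```python
def multi_model_mean(series_by_model):
    """beta_t: average across models at each time step."""
    models = list(series_by_model.values())
    return [sum(values) / len(values) for values in zip(*models)]


def remove_forced_signal(series, beta):
    """Subtract the common forced estimate beta_t from one series."""
    return [x - b for x, b in zip(series, beta)]


# Hypothetical annual series for two made-up models (arbitrary units).
historical = {
    "model_A": [1.0, 2.0, 4.0],
    "model_B": [3.0, 2.0, 6.0],
}

beta = multi_model_mean(historical)

# historical_DT simulations: each model minus the multi-model mean, as in Eq. (3).
historical_dt = {
    name: remove_forced_signal(series, beta)
    for name, series in historical.items()
}

# The observed estimate is detrended with the same beta_t.
observed_dt = remove_forced_signal([2.5, 1.0, 5.5], beta)
```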

Moreover, we note that the temporal variability of a variable \(x\) in the piControl simulations is:

$${x}_{{mt},{\rm{piControl}}}={\gamma }_{{mt}}$$
(4)

Following Eqs. (3) and (4), the temporal variances of \(x\) for a model \(m\) (\({\sigma }_{t}^{2}\left({x}_{m}\right)\)) in the historical_DT and piControl simulations are equal to:

$${\sigma }_{t}^{2}\left({x}_{m,{\rm{historical\_DT}}}\right)={\sigma }_{t}^{2}\left({\varepsilon }_{m}\right)+{\sigma }_{t}^{2}\left({\gamma }_{m}\right)+{2\mathrm{cov}}_{t}\left({\varepsilon }_{m},{\gamma }_{m}\right)$$
(5)
$${\sigma }_{t}^{2}\left({x}_{m,{\rm{piControl}}}\right)={\sigma }_{t}^{2}\left({\gamma }_{m}\right)$$
(6)

And the inter-model variances of the temporal variance of x (\({\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(x\right)\right]\)) are equal to:

$$\begin{array}{l}{\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left({x}_{{\rm{historical}}\_{\rm{DT}}}\right)\right]={\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(\varepsilon \right)\right]+{\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(\gamma \right)\right]+{\sigma }_{m}^{2}\left[2{{\rm{\cdot }}\mathrm{cov}}_{t}\left(\varepsilon ,\gamma \right)\right]+2{{\rm{\cdot }}\mathrm{cov}}_{m}\left[{\sigma }_{t}^{2}\left(\varepsilon \right),{\sigma }_{t}^{2}\left(\gamma \right)\right]\\\qquad\qquad\qquad\qquad\qquad\,+2{{\rm{\cdot }}\mathrm{cov}}_{m}\left[{\sigma }_{t}^{2}\left(\varepsilon \right),2{\mathrm{cov}}_{t}\left(\varepsilon ,\gamma \right)\right]+2{\mathrm{cov}}_{m}\left[{\sigma }_{t}^{2}\left(\gamma \right),2{\mathrm{cov}}_{t}\left(\varepsilon ,\gamma \right)\right]\end{array}$$
(7)
$${\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left({x}_{{\rm{piControl}}}\right)\right]={\sigma }_{m}^{2}\left[{\sigma }_{t}^{2}\left(\gamma \right)\right]$$
(8)

From Eqs. (7) and (8), we see that comparing the inter-model variance of \({\sigma }_{t}^{2}\left(x\right)\) between the piControl and historical_DT simulations allows us to quantify the relative importance of the different representations of the internal variability among models in the inter-model variance of the historical_DT simulations (see also more details in Supplementary Methods). Finally, in this work, \(x\) represents each of the variables \(\frac{{\rm{dCO}}2}{{\rm{dt}}}\), \({{\rm{CO}}2}_{{\rm{Land}}}\) and \({{\rm{CO}}2}_{{\rm{Ocean}}}\).
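
Eqs. (5) and (7) are algebraic identities, which can be checked numerically on synthetic data. The sketch below draws random \(\varepsilon\) and \(\gamma\) series for a handful of made-up models and verifies both sides. The Gaussian draws, the number of models and the population normalisation of variances and covariances are all assumptions made for the illustration.

```python
import random
from statistics import fmean, pvariance


def pcov(a, b):
    """Population covariance, normalised like pvariance."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
n_models, n_years = 6, 28          # 28 years, as in 1986-2013

var_eps, var_gam, two_cov, var_total = [], [], [], []
for _ in range(n_models):
    eps = [rng.gauss(0.0, 1.0) for _ in range(n_years)]   # forced deviation
    gam = [rng.gauss(0.0, 2.0) for _ in range(n_years)]   # internal variability
    x_dt = [e + g for e, g in zip(eps, gam)]              # Eq. (3)
    var_eps.append(pvariance(eps))
    var_gam.append(pvariance(gam))
    two_cov.append(2.0 * pcov(eps, gam))
    var_total.append(pvariance(x_dt))

# Eq. (5): largest mismatch between the two sides over all models.
eq5_gap = max(
    abs(t - (a + b + c))
    for t, a, b, c in zip(var_total, var_eps, var_gam, two_cov)
)

# Eq. (7): inter-model variance of the total versus the sum of its six terms.
lhs = pvariance(var_total)
rhs = (
    pvariance(var_eps) + pvariance(var_gam) + pvariance(two_cov)
    + 2.0 * pcov(var_eps, var_gam)
    + 2.0 * pcov(var_eps, two_cov)
    + 2.0 * pcov(var_gam, two_cov)
)
```

Both sides agree to rounding error, which only confirms the algebra. The physical interpretation still rests on \({\beta }_{t}\) being a good estimate of the forced signal.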

The Niño4 index as a proxy for ENSO

In this study, we use as a proxy for ENSO the Niño4 index, which is the SST averaged over the equatorial Pacific region [5°N-5°S; 160°E-150°W]. We obtain one Niño4 value per year by averaging over October to March, that is, the season during which ENSO anomalies are peaking in observations and in most of the models. Compared to a more common annual average between January and December, this definition of Niño4 has the advantage of not mixing signals from potentially different ENSO states between the beginning and the end of a calendar year.
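
A minimal sketch of the October to March averaging is given below. It assumes that a monthly series of SST already averaged over the Niño4 box is available as a flat list starting in January. Labelling each season by the year in which it starts is a choice made for this illustration, not something specified in the text.

```python
def oct_mar_means(monthly, first_year):
    """Average October(Y) to March(Y+1) for each complete season.

    monthly is a flat list of monthly values starting in January of first_year.
    Returns {Y: mean}, where Y is the year in which the season starts.
    """
    out = {}
    n_years = len(monthly) // 12
    for k in range(n_years - 1):
        start = 12 * k + 9                    # October of year first_year + k
        season = monthly[start:start + 6]     # Oct, Nov, Dec, Jan, Feb, Mar
        out[first_year + k] = sum(season) / 6.0
    return out


# Synthetic two-year series whose value equals the month counter.
nino4 = oct_mar_means([float(i) for i in range(24)], first_year=2000)
```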

Sub-tropical regions and local drivers of \({\sigma }_{t,{Land}}\)

In our analysis, we find that four tropical regions contribute the most to \({{\rm{CO}}2}_{{\rm{Land}}}\): Amazonian [280°E-320°E, 10°N-30°S], South tropical Africa [10°E-40°E, 35°S-5°N], southeast Asia [115°E-155°E, 10°S-35°S] and Oceania [80°E-125°E, 6°N-31°N] (see boxes on Fig. 3a). To explore their contribution to the inter-model spread of \({\sigma }_{t,{Land}}\), we further decompose the global land CO2 flux into the sum of their contributions and a residual (Res): \({{\rm{CO}}2}_{{\rm{Land}}}={{\rm{CO}}2}_{{\rm{Land}}-{\rm{Amazonian}}}+{{\rm{CO}}2}_{{\rm{Land}}-{\rm{SouthtropicalAfrica}}}+{{\rm{CO}}2}_{{\rm{Land}}-{\rm{SoutheastAsia}}}+{{\rm{CO}}2}_{{\rm{Land}}-{\rm{Oceania}}}+{{\rm{CO}}2}_{{\rm{Land}}-{\rm{Res}}}\) (see also Supplementary Results 3).
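
The regional decomposition can be sketched as a simple box assignment. In this illustration each land cell is represented by its centre coordinates and its area-integrated flux. The cell values are made up, only the Amazonian box from the text is used, and the boxes are assumed not to overlap.

```python
def in_box(lon, lat, lon_min, lon_max, lat_min, lat_max):
    """True if a cell centre lies in the box; longitudes in degrees East, 0-360."""
    return lon_min <= lon % 360.0 <= lon_max and lat_min <= lat <= lat_max


def split_by_regions(cells, boxes):
    """Sum area-integrated fluxes per region, plus a residual.

    cells : list of (lon, lat, flux_times_area) tuples for land cells
    boxes : {name: (lon_min, lon_max, lat_min, lat_max)}, assumed non-overlapping
    Returns the regional sums and a 'Res' entry for all remaining cells.
    """
    sums = {name: 0.0 for name in boxes}
    sums["Res"] = 0.0
    for lon, lat, value in cells:
        for name, box in boxes.items():
            if in_box(lon, lat, *box):
                sums[name] += value
                break
        else:
            sums["Res"] += value       # cell falls in none of the boxes
    return sums


boxes = {"Amazonian": (280.0, 320.0, -30.0, 10.0)}   # box quoted in the text
cells = [(300.0, -5.0, 2.0), (-60.0, 0.0, 1.0), (20.0, 50.0, 4.0)]
parts = split_by_regions(cells, boxes)
```

By construction the regional terms and the residual add up to the global land flux.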

We also estimate the sensitivity of the precipitation (\(\alpha\)) and surface air temperature (\(\rho\)) fields to Niño4 by computing regression maps of these two fields onto the normalized Niño4 index. Regression coefficients are averaged over each of the key regions, representing the sensitivity of each of those regions to ENSO. Note that, as the Amazonian region shows both positive and negative regression coefficients in the precipitation field (Fig. 4c), they can cancel each other out when averaged. Thus, for this case, we repeat our analysis splitting the Amazonian region into a northern ([280°E-320°E, 10°N-10°S]) and a southern ([280°E-320°E, 10°S-30°S]) region.
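
The regression onto the normalized index and the cancellation issue can be illustrated with the sketch below. It assumes ordinary least squares, a population standard deviation for the normalisation and an unweighted average over grid cells. The two-cell region is a made-up stand-in for the northern and southern Amazonian responses.

```python
from statistics import fmean, pstdev


def normalise(index):
    """Zero mean and unit (population) standard deviation."""
    mean, std = fmean(index), pstdev(index)
    return [(value - mean) / std for value in index]


def regression_slope(field_series, index_norm):
    """Least-squares slope of a local time series on the normalised index,
    i.e. the change in the field per one standard deviation of the index."""
    field_mean = fmean(field_series)
    n = len(index_norm)
    return sum((f - field_mean) * z for f, z in zip(field_series, index_norm)) / n


def regional_sensitivity(cell_series, index):
    """Unweighted mean of the slopes of the cells in one region
    (area weighting is left out of this sketch)."""
    z = normalise(index)
    return fmean([regression_slope(series, z) for series in cell_series])


# Made-up index and two cells with opposite responses.
index = [1.0, 2.0, 3.0, 4.0]
north = [2.0 * v for v in index]      # field increases with the index
south = [-2.0 * v for v in index]     # field decreases with the index

whole_region = regional_sensitivity([north, south], index)   # slopes cancel
north_only = regional_sensitivity([north], index)
```

Averaging over the whole region hides two opposite responses of equal size, which is the motivation for treating the northern and southern parts separately.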