Introduction

The ocean’s biological carbon pump (BCP) is a key mechanism for the ocean’s uptake and sequestration of CO2 from the atmosphere. It comprises the fixation of CO2 through photosynthesis in the surface ocean and the subsequent transport of organic carbon into the deep ocean, where it is remineralized and sequestered. Therefore, accurate estimation of the BCP is important for understanding the ocean’s role in the global carbon cycle and for predicting how it may be changing as climate warms.

Earth System Models (ESMs) are considered the best tools for predicting the ocean’s response to climate change on decadal to centennial timescales. Under global warming, ESMs project a decline in carbon export, i.e. the carbon flux leaving the euphotic zone, over the 21st century1,2,3. However, there are large variations in the underlying drivers and magnitude of the decline among the different ESMs which implies that uncertainty in these estimates is large. The model-to-model difference is larger than the projected reductions in carbon export during the 21st century1. Therefore, it is necessary to partition and diagnose the different factors that contribute to model uncertainty in simulating the BCP and to reduce it.

The BCP can be assessed by quantifying carbon export out of the euphotic zone and the depth-dependent transfer efficiency4,5,6,7,8, which measures the fraction of the exported carbon that survives respiration and reaches the deep ocean. Given the complex mechanisms controlling the transfer efficiency, simple parameterizations with different mechanistic underpinnings have been used to represent and compare the transfer efficiency9,10. The Martin curve11, which normalizes the vertical carbon flux to a reference depth, e.g., 100 m, through the power-law equation \(F={F}_{0}\times {(z/{z}_{0})}^{-b}\), is the most widely used parameterization12,13,14,15,16. The exponent b is an indicator of the carbon flux attenuation with depth: the larger the exponent b, the faster the attenuation and, hence, the lower the transfer efficiency. A recent global mean estimate of the exponent b is 0.84 ± 0.14 (ref. 13), which is equivalent to a range in transfer efficiency from 5% (b = 0.98) to 12% (b = 0.70) at 2000 m and corresponds to an uncertainty in equilibrium atmospheric CO2 levels of 28 μatm (refs. 13,14) or 46 μatm (ref. 9), depending on the climate model used. However, this neglects the uncertainty resulting from the choice of the Martin curve over a different parameterization, which has been estimated to contribute an additional uncertainty in atmospheric CO2 levels of 12 μatm (ref. 9). Rather than using the simple parameterizations that are compared in ref. 9, current ESMs simulate the transfer efficiency in a more mechanistic manner by resolving some key controlling processes, such as the ballast protection of organic matter17,18,19, temperature-18,19,20 and oxygen-dependent17,18,19,20,21 remineralization, and aggregation/disaggregation20. The extent to which the uncertainty in transfer efficiency in state-of-the-art ESMs can be represented by changing the exponent b of the Martin curve warrants further evaluation.
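As a quick check of the numbers quoted above, the following minimal sketch evaluates the Martin curve in its usual form, F(z)/F0 = (z/z0)^(−b), at 2000 m for a 100 m reference depth. The function name and defaults are illustrative choices, not part of any model code.

```python
def martin_transfer_efficiency(z, b, z0=100.0):
    """Fraction of the flux at reference depth z0 (m) that reaches depth z (m)."""
    # Martin curve: F(z) / F0 = (z / z0) ** (-b)
    return (z / z0) ** (-b)

# Exponents quoted in the text: global mean 0.84, bounded by 0.70 and 0.98
te_2000 = {b: martin_transfer_efficiency(2000.0, b) for b in (0.70, 0.84, 0.98)}
for b, te in te_2000.items():
    print(f"b = {b:.2f}: transfer efficiency at 2000 m = {100 * te:.1f}%")
```

With these inputs the transfer efficiency falls from roughly 12% at b = 0.70 to roughly 5% at b = 0.98, matching the range stated in the text.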

In this study, the model uncertainty in simulating the BCP is quantified by the variance of averaged vertical carbon flux over different spatial scales across ESMs. We distinguish between the model uncertainty within the euphotic zone and below by partitioning it into the model uncertainty of carbon export and the model uncertainty of transfer efficiency. Motivated by the wide application of the Martin curve, we further decomposed the latter into two components: the uncertainty which can be represented by changing the exponent b of the Martin curve (referred to here as type I) and the uncertainty which cannot (referred to as type II). To achieve this goal, we chose the 14 ESMs that are used by the Coupled Model Intercomparison Project Phase 6 (CMIP6) and report depth-resolved vertical carbon flux (Table 1). We averaged their estimates over the global ocean and different biological biomes prior to the uncertainty analysis (Fig. 1a). A caveat is that the uncertainty analyzed is sampled over the realized diversity of the selected CMIP6 models, which do not necessarily span the real uncertainty range. However, our results indicate which aspects of ESMs should be prioritized for improvement to reduce model uncertainty and should guide future observational strategies.

Table 1 List of the 14 CMIP6 models compared in this study and the subset of 6 models, shown in bold, that are used in the uncertainty analysis
Fig. 1: Map of biological biome and biome-averaged vertical carbon flux.

a Map of biogeochemical biomes. SO Southern Ocean, ETI Equatorial Tropical Indian, ETU Equatorial Tropical Upwelling regions, STG Subtropical Gyres, NP North Pacific, NA North Atlantic, and Arctic Arctic Ocean. b The vertical carbon flux averaged over each biome at 100 m and 2000 m. Error bars represent the mean and standard deviation of 14 CMIP6 models, and black dots represent each single model. The global distributions of all models are shown in Supplementary Fig. 1.

Results and discussion

Vertical carbon flux from CMIP6 models

As shown in Fig. 1b and Supplementary Fig. 1, the globally integrated carbon export flux at 100 m varies by a factor of 1.8 from 5.4 Pg C yr−1 (15 g C m−2 yr−1) in CMCC-ESM2 to 9.2 Pg C yr−1 (26 g C m−2 yr−1) in GFDL-CM4. At 2000 m, the model-to-model variations increase to a factor of 6.2. On biome and local scales, the carbon export flux at 100 m is consistently low in the Subtropical Gyres (STG) and high in the Equatorial Tropical Upwelling across different CMIP6 models. Nevertheless, the model-to-model variability in these two biomes reaches factors of 2.3 and 1.6 at 100 m and increases to factors of 14 and 16 at 2000 m, respectively. In addition, there are large disagreements in the spatial patterns across biomes between the different CMIP6 models. At 100 m, the North Atlantic and the North Pacific tend to be the two biomes with the highest carbon export flux in the 4 CESM2 and 2 GFDL models especially in GFDL-CM4. However, in the 3 MPI models, the average carbon export flux in these two biomes is even lower than in the STG. As a result, the model-to-model variability in carbon export at 100 m reaches factors of 8.0 and 5.7 in these two biomes. Unlike the STG and the Equatorial Tropical Upwelling region, the North Atlantic and the North Pacific experience a relatively mild increase of model-to-model variability at 2000 m. In the Southern Ocean, the averaged carbon export flux at 100 m is relatively high in the 2 GFDL models, the 3 IPSL models, EC-Earth3-CC, and CMCC-ESM2. However, the high carbon export flux is confined to the Subantarctic Ocean in the 3 MPI models, and the average carbon export flux is even lower than in the STG in the 4 CESM2 models. As a result, the model-to-model variability in the Southern Ocean is about a factor of 2.4 at 100 m and grows to a factor of 10 at 2000 m.

In summary, there are substantial differences in the simulated vertical carbon flux across different CMIP6 models on biome and local scales. In addition, the model-to-model variability in vertical carbon flux at 2000 m is much larger than that of the carbon export at 100 m, which suggests large discrepancies in transfer efficiency across different CMIP6 models.

Transfer efficiency from CMIP6 models

The global median of the exponent b varies from 0.63 in IPSL-CM5A2-INCA to 1.01 in CMCC-ESM2. This is higher than the estimates of 0.64 and 0.54 from observations in refs. 5,15. However, the global average of the exponent b, which ranges from 0.70 to 1.03, is consistent with a more recent data-constrained model estimate of 0.70 to 0.98 (ref. 13). The simulated transfer efficiency at 2000 m can be classified into three qualitatively different latitudinal patterns (Fig. 2 and Supplementary Fig. 2): the first is a homogenous transfer efficiency, and the other two are opposite latitudinal patterns, either high in the oligotrophic STG and low in productive regions or vice versa.

Fig. 2: Three different latitudinal patterns of simulated transfer efficiency at 2000 m displayed by the CMIP6 models.

a Three latitudinal patterns of simulated transfer efficiency at 2000 m with one latitudinal homogenous pattern and two opposite patterns, either high in the oligotrophic Subtropical Gyres and low in productive regions or vice versa. Error bars represent the interquartile range within a 5-degree latitudinal band. b The global distribution of transfer efficiency at 2000 m from four representative models. The global distributions of all models are shown in Supplementary Fig. 2.

The simulated transfer efficiency at 2000 m is homogenous across latitudes in the four CESM2 models, the three MPI models, and GFDL-ESM4, except for the elevated transfer efficiency in oxygen minimum zones (OMZs). This elevation arises because the remineralization rate is suppressed under low oxygen17,19,22 in these biogeochemical models (Tables 1 and 2). In GFDL-ESM4, the remineralization rate is dependent on temperature. The high temperature in low-latitude regions facilitates remineralization and should reduce the transfer efficiency. However, this effect is compensated for by the ballast effect of elevated CaCO3 and lithogenic fluxes in low-latitude regions, which enhances the transfer efficiency by protecting organic matter from remineralization or by accelerating the sinking velocity19. In the four CESM2 models, the ballast effects of CaCO3 and opal are both simulated and may cancel each other out, flattening the latitudinal gradient of transfer efficiency17, because the former is predominant in the STG and the latter in productive regions. No temperature dependence is simulated in the four CESM2 and three MPI models17,21,22.

Table 2 Mechanisms included in different biogeochemical models to simulate the attenuation of vertical carbon flux

In the three IPSL models and EC-Earth3-CC, the transfer efficiency at 2000 m is high in the productive high-latitude regions and the Equatorial Tropical Upwelling biome but low in the STG. These broad qualitative features can also be found in global patterns derived from free-drifting neutrally buoyant sediment traps (NBSTs)12, an inverse model23, and a data-assimilative ecosystem model24, which have also reported the lowest transfer efficiency in the STG. However, the transfer efficiency is not elevated in the Equatorial Tropical Upwelling region in the NBST-derived global estimate12. The global patterns in the three IPSL models and EC-Earth3-CC are due to the temperature-dependent remineralization rate and the differentiation of sinking velocities between two size classes of organic particles in their biogeochemical model, i.e., PISCESv2 (ref. 20). In productive regions where diatoms are dominant, the exported organic particles tend to be large and fast-sinking because diatoms aggregate efficiently. This is consistent with findings from an inverse model analysis suggesting that the transfer efficiency is tightly correlated with changes in temperature and phytoplankton community structure23. We, therefore, use the empirical relationship from ref. 23 to re-calculate their transfer efficiency at 1000 m for comparison with the CMIP6 models. The ballast effects are neglected in this analysis because the model output of the SiO2:POC ratio in ref. 23 is not available. This underestimates their transfer efficiency in the Antarctic Ocean but has minor impacts elsewhere. Comparisons show that the transfer efficiency at 1000 m simulated by the four PISCESv2 ESMs is, in general, higher than that in the inverse model23 (Supplementary Figs. 35). The transfer efficiency at 2000 m simulated by the four PISCESv2 ESMs, which ranges from 10.53 ± 0.83% to 17.99 ± 1.26%, is also higher than the 5% to >8% range from the data-assimilative model24.

In CMCC-ESM2, the latitudinal patterns of transfer efficiency are similar to the four PISCESv2 ESMs. However, the transfer efficiency is low in the North Pacific and Equatorial Tropical Upwelling biomes. Unlike other models analyzed here, the utilization of organic particles by bacteria is simulated explicitly in CMCC-ESM2. It is dependent on the nutritional content of organic particles when bacteria are abundant and otherwise on temperature and bacterial biomass25. Although the bacterial utilization of organic particles can be modulated by oxygen, it appears that this oxygen dependence is deactivated in the CMCC-ESM2 because transfer efficiency in OMZs is not elevated. In general, the simulated transfer efficiency at 2000 m in the CMCC-ESM2 is comparable to the inverse model23 and the data-assimilative ecosystem model24 except that the transfer efficiency from CMCC-ESM2 is much higher in the North Atlantic and Arctic Ocean.

In contrast, in GFDL-CM4, the transfer efficiency at 2000 m is high in the STG and low in the productive regions. This is because the ballast effect of CaCO3 outweighs the temperature dependence of the remineralization rate, and the ballast effect of opal is missing in this model18. General patterns of the fitted exponent b between the productive and oligotrophic regions are similar to two observational estimates5,15 (Supplementary Figs. 2, 3) despite regional discrepancies in transfer efficiency (Supplementary Figs. 6, 7). For instance, the elevated exponent b near the equator, which is equivalent to reduced transfer efficiency, is missing in ref. 5 because it cannot be resolved by sparse concurrent measurements of carbon export and deep vertical carbon flux. In addition, the variation of b in GFDL-CM4 (0.50–1.03) is smaller than the ranges of 0.36–1.09 and 0.42–1.75 from refs. 5,15 (Supplementary Fig. 3). Aside from ballast effects, this global pattern has also been suggested to be driven by particle composition because of the elevated relative contribution from fast-sinking fecal pellets in oligotrophic regions5,26,27. Although the production of aggregates by phytoplankton and fecal pellets by zooplankton is formulated in the 14 CMIP6 models, there is no differentiation between their remineralization rates and sinking velocities. Zooplankton activity may be another reason, as high mesopelagic zooplankton abundance can be sustained by carbon export in productive regions and, in turn, can attenuate the vertical carbon flux efficiently16.

The contrasting global patterns of transfer efficiency have different implications for the future response of the BCP to climate change. Warming-enhanced stratification will likely reduce nutrient supply to the surface ocean, leading to an expansion of oligotrophic ecosystems. This may amplify or dampen the currently predicted reduction of transfer efficiency due to increased temperature. Currently, observation-based estimates of transfer efficiency suggest contrasting patterns, with one supporting high transfer efficiency in oligotrophic ecosystems5,15 and the other favouring the opposite pattern12. This indicates that existing in-situ observations provide insufficient constraints and suggests that an expansion of concurrent measurements of carbon export and deep vertical carbon flux is necessary to reconcile the contrasting patterns or falsify one of them.

Model uncertainties of carbon export and transfer efficiency

As shown in Table 1, there are 6 different biogeochemical models that are used by the 14 ESMs analyzed here. Despite other model discrepancies, e.g., resolution, the ESMs that use the same biogeochemical model tend to produce similar magnitudes and global patterns of vertical carbon flux (Supplementary Fig. 1) and transfer efficiency (Fig. 2 and Supplementary Fig. 2). Therefore, a subset of 6 representative ESMs, using different biogeochemical models, namely CESM2, GFDL-CM4, GFDL-ESM4, IPSL-CM5A2-INCA, MPI-ESM1-2-HR, and CMCC-ESM2 (Table 1), are used to calculate the different uncertainties that are described in the Methods.

As shown in Fig. 3a, the model uncertainty of the vertical carbon flux (\({U}_{{{{\rm{F}}}}}\)) at 100 m is exclusively contributed by the model uncertainty of carbon export (\({U}_{{{{\rm{EP}}}}}\)), because at that depth the transfer efficiency is defined as 100% in all CMIP6 models. Thereafter, due to discrepancies in the transfer efficiency between different CMIP6 models, the model uncertainty of the vertical carbon flux grows quickly with depth following the model uncertainty of transfer efficiency (\({U}_{{{{\rm{TE}}}}}\)), which eventually dominates over the model uncertainty of carbon export and the co-dependence between carbon export and transfer efficiency (\({U}_{{{{\rm{EP\& TE}}}}}\)). The crossover depth is defined as the depth where the model uncertainty of transfer efficiency starts to dominate the model uncertainty of the vertical carbon flux, i.e., it is the largest source of uncertainty and its relative contribution is >50%. On the global scale, the crossover depth is located at 900 m, leading to the dominance of model uncertainty of transfer efficiency at 2000 m which alone can contribute to about 83% of model uncertainty of the vertical carbon flux. On the biome scale, after excluding the Arctic Ocean and the North Atlantic, the crossover depth varies from 500 m in the Equatorial Tropical Upwelling to 1500 m in the STG, indicating differences in the relative importance of model uncertainties due to carbon export versus transfer efficiency in different biomes. Nevertheless, the model uncertainty of transfer efficiency has become the dominant source of uncertainty at 2000 m, with the relative contributions ranging from 57% in the North Pacific to above 85% in the Southern Ocean and the Equatorial Tropical Upwelling. In the Arctic Ocean and the North Atlantic, the model uncertainty of carbon export is high and remains dominant until 1900 m and 1700 m, respectively. 
Unlike in the other biomes, the simulated carbon export and transfer efficiency in these two biomes are weakly anti-correlated (corr > −0.5) across the CMIP6 models, and hence their co-dependence counteracts about 23% and 71% of the model uncertainty of the vertical carbon flux there.
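The crossover-depth criterion described above can be sketched as a simple scan down the depth profile. This is only an illustration under stated assumptions: the function name is hypothetical, the total is taken as the sum of the three terms, the relative contribution is taken as U_TE divided by that total, and the profile values are made up rather than taken from any CMIP6 model.

```python
def crossover_depth(depths, u_ep, u_te, u_cov):
    """Shallowest depth at which U_TE is the largest term and exceeds 50% of U_F."""
    for z, ep, te, cov in zip(depths, u_ep, u_te, u_cov):
        u_f = ep + te + cov  # assumed total: export + transfer efficiency + co-dependence
        if u_f > 0 and te > max(ep, abs(cov)) and te / u_f > 0.5:
            return z
    return None  # U_TE never dominates within the profile

# Made-up profile (log-variance units), not model output
depths = [100, 300, 500, 700, 900]
u_ep = [0.010, 0.010, 0.010, 0.010, 0.010]   # constant below the export depth
u_te = [0.000, 0.003, 0.008, 0.013, 0.020]   # grows with depth
u_cov = [0.000, 0.001, 0.001, 0.001, 0.001]  # co-dependence term

print(crossover_depth(depths, u_ep, u_te, u_cov))
```

For this synthetic profile the criterion is first met at 700 m, where U_TE exceeds U_EP and accounts for a little over half of the total.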

Fig. 3: Partitioning of model uncertainty in the vertical carbon flux (\({U}_{{\rm{F}}}\)) into the model uncertainty of carbon export (\({U}_{{\rm{EP}}}\)), the model uncertainty of transfer efficiency (\({U}_{{\rm{TE}}}\)), and the co-dependence between carbon export and transfer efficiency (\({U}_{{\rm{EP\& TE}}}\)), as described in Methods.

a Vertical profiles of the different sources of uncertainty in the global ocean and different biomes. The horizontal dashed black lines represent the crossover depth where the uncertainty of transfer efficiency (\({U}_{{TE}}\)) starts to dominate the model uncertainty in the vertical carbon flux (\({U}_{F}\)). b Global distributions of the different sources of uncertainty at 2000 m.

On the local scale, the model uncertainty of the vertical carbon flux at 2000 m is, in general, dominated by the model uncertainty of transfer efficiency (Fig. 3b). The simulated vertical carbon flux at 2000 m is highly uncertain in the Arctic Ocean, the Gulf Stream regions, the western part of equatorial Atlantic Ocean, the STG in the South Atlantic Ocean, the OMZs in the equatorial Pacific Ocean and the Indian Ocean, the Pacific Ocean around 40°N and 40°S, and the Indian Ocean around 40°S, primarily because the model uncertainty of transfer efficiency is high in these regions. In addition, the co-dependence can amplify or dampen the model uncertainty of the vertical carbon flux because the simulated carbon export and transfer efficiency can be highly correlated (corr > 0.5) or anti-correlated (corr < −0.5) around the latitude of 40° and in the STG, respectively (Figs. 3b and 4).
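The amplifying or dampening role of the co-dependence follows from the variance of a sum: if log F = log EP + log TE, then var(log F) = var(log EP) + var(log TE) + 2 cov(log EP, log TE). The sketch below illustrates this identity with made-up values for six hypothetical models. Treating the co-dependence term as exactly twice the covariance is an assumption made here for illustration.

```python
import math

def cov(x, y):
    """Sample covariance (n - 1 in the denominator); cov(x, x) is the variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def partition(ep, te):
    """Split var(log10 F) into export, transfer-efficiency and co-dependence terms."""
    log_ep = [math.log10(v) for v in ep]
    log_te = [math.log10(v) for v in te]
    log_f = [a + b for a, b in zip(log_ep, log_te)]  # log10(EP * TE)
    u_ep, u_te = cov(log_ep, log_ep), cov(log_te, log_te)
    u_cov = 2.0 * cov(log_ep, log_te)  # assumed form of the co-dependence term
    return cov(log_f, log_f), u_ep, u_te, u_cov

# Made-up values for six hypothetical models (not CMIP6 output)
ep = [15.0, 18.0, 20.0, 22.0, 24.0, 26.0]       # carbon export
te_same = [0.05, 0.07, 0.09, 0.11, 0.14, 0.18]  # transfer efficiency rising with export
te_opposite = list(reversed(te_same))           # transfer efficiency falling with export

for label, te in (("correlated", te_same), ("anti-correlated", te_opposite)):
    u_f, u_ep, u_te, u_cov = partition(ep, te)
    print(f"{label}: U_F={u_f:.4f} U_EP={u_ep:.4f} U_TE={u_te:.4f} U_EP&TE={u_cov:.4f}")
```

In both cases U_EP and U_TE are identical. Only the sign of the co-dependence term changes, so the total is larger in the correlated case and smaller in the anti-correlated one.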

Fig. 4: Correlations between the simulated carbon export at 100 m and transfer efficiency at 2000 m.

Black contour lines represent the correlation coefficient being ±0.5.

The relative importance of the model uncertainty of carbon export versus the model uncertainty of transfer efficiency for the BCP varies with the depth scale of interest. Since the BCP provides energy and material for metabolism in the mesopelagic zone, the model uncertainty of carbon export is more important when studying mesopelagic ecosystems, especially in the Arctic Ocean and the North Atlantic, where the model uncertainty of carbon export is dominant above 1900 m and 1700 m, respectively (Fig. 3). With respect to long-term carbon sequestration and global climate, the relative importance of the two model uncertainties depends on how the crossover depth compares with the sequestration horizon, i.e., the depth where carbon is sequestered for climate-relevant timescales (e.g., >100 years). Typically, the deeper the organic carbon is respired, the longer it is sequestered28,29. The 1000 and 2000 m depths have been widely used as static horizons5,6,7,15,23,24. However, the sequestration horizon is spatially variable, and global maps of the carbon sequestration time27 and of the mean residence time at each depth30 have been published. A recent study suggests that carbon sequestration occurs throughout the water column and that about half of the sequestration is above 1000 m (ref. 31). On the global scale, the crossover depth of 900 m underscores the importance of both the model uncertainty of carbon export and that of transfer efficiency for understanding century-scale sequestration. The importance of the model uncertainty of transfer efficiency is expected to increase when considering longer-term sequestration.

Type I and type II uncertainties

As shown in Fig. 5a, all sources of uncertainty grow with depth. The type I uncertainty (representable by variations in the exponent b of the Martin curve) is initially the largest source of uncertainty before the type II uncertainty (not representable by variations in b) takes over. Both make important contributions to the model uncertainty of transfer efficiency. On the global scale, the type I and type II uncertainties at 2000 m each contribute about 30% of the model uncertainty of transfer efficiency, with the remaining 40% attributed to their co-dependence. On the biome scale, the contribution of the type I uncertainty ranges from 24% in the STG to 47% in the Equatorial Indian biome. The relative contribution from the type II uncertainty varies over a wider range across biomes. It can reach up to 73% in the North Atlantic and can be as low as 16% in the Equatorial Indian biome. On the local scale, all these sources of uncertainty have comparable contributions to the model uncertainty of transfer efficiency, except in the western part of the Equatorial Atlantic, the boundary along the STG, and the west coast of the Indian Ocean, which are characterized by high type I uncertainty (Fig. 5b).

Fig. 5: Partitioning of the model uncertainty of transfer efficiency (\({U}_{{\rm{TE}}}\)) into the type I (\({U}_{{\rm{I}}}\)) and type II (\({U}_{{\rm{II}}}\)) uncertainties, which respectively can and cannot be represented by the Martin curve, and their co-dependence (\({U}_{{\rm{I\& II}}}\)), as described in Methods.

a Vertical profiles of the different sources of uncertainty in the global ocean and different biomes. b Global distributions of the different sources of uncertainty at 2000 m.

The Martin curve has been widely used to fit the sparse observations of deep vertical carbon flux for extrapolating and comparing the transfer efficiency. The exponent b has also been perturbed to assess the sensitivity of atmospheric CO2 levels to transfer efficiency14,32. Our results show that the model uncertainty of transfer efficiency is underestimated when it is represented only by the Martin curve and its exponent b. This is consistent with the findings of ref. 9, where six simple alternative parameterizations were compared with the Martin curve. Therefore, we recommend that observationalists sample and report the vertical profiles of carbon flux rather than only the exponent b, and that modelers compare the profiles during model calibration and intercomparisons.
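To illustrate why a single exponent can hide information, the sketch below fits b to a flux profile by least squares in log space and returns the residuals that the fitted curve leaves unexplained. It is a generic illustration, not the fitting procedure or the type II definition used in this study. The function name, the choice to fix F0 to the flux at 100 m, and both synthetic profiles are assumptions.

```python
import math

def fit_martin_b(depths, flux, z0=100.0):
    """Fit b in F = F0 * (z / z0) ** (-b), with F0 fixed to the flux at z0."""
    f0 = flux[depths.index(z0)]
    x = [math.log(z / z0) for z in depths]
    y = [math.log(f / f0) for f in flux]
    # Regression through the origin: log(F / F0) = -b * log(z / z0)
    b = -sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    residuals = [yi + b * xi for xi, yi in zip(x, y)]  # what one exponent cannot capture
    return b, residuals

depths = [100.0, 200.0, 500.0, 1000.0, 2000.0]

# Synthetic profile that follows the Martin curve exactly (b = 0.84)
power_law = [20.0 * (z / 100.0) ** (-0.84) for z in depths]
# Synthetic profile with exponential attenuation (hypothetical 800 m length scale)
exponential = [20.0 * math.exp(-(z - 100.0) / 800.0) for z in depths]

for name, profile in (("power law", power_law), ("exponential", exponential)):
    b, res = fit_martin_b(depths, profile)
    print(f"{name}: b = {b:.2f}, largest residual = {max(abs(r) for r in res):.3f}")
```

The power-law profile is recovered with negligible residuals, whereas the exponential profile yields a plausible-looking b but leaves substantial depth-dependent residuals, which is the kind of structure that is lost when only b is reported.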

Future implications

In this study, we find large variability in simulating the BCP across different CMIP6 models because of discrepancies in model structures and parameters. This casts doubt on the reliability of future projections. Which model aspects need prioritized improvement depends on the depth and timescales of interest. When the global, century-scale carbon sequestration is of interest, the model uncertainties of carbon export and transfer efficiency are both important.

A reduction of model uncertainty requires a mechanistic understanding of the carbon export and flux attenuation within the euphotic zone and below, and the integration of this knowledge into ESMs. Model skill and model representation in the euphotic zone have seen many improvements between the fifth and sixth phases of the Coupled Model Intercomparison Project (CMIP5 and CMIP6), e.g., an increase in the number of phytoplankton/zooplankton functional groups and allowance of variable nutrient stoichiometries33,34. Unlike the euphotic zone, model representations of flux attenuation have remained relatively simple35 and there is no agreement on which processes should be included. For instance, temperature-dependent remineralization7,12,35 and particle composition5,7,35 are key drivers in setting the global patterns of transfer efficiency, but are not included in all CMIP6 models. A recent study prioritized these two processes along with fragmentation and zooplankton migration as key mechanisms to be included in the next generation of ESMs1. However, adding a mechanism does not necessarily bring about improvements in models’ predictive skills, as it may introduce additional uncertainty that cannot be constrained by observations. Therefore, comprehensive evaluations of the value added and uncertainty of different mechanisms should be conducted before routine use in ESMs.

Apart from model structures, the divergent degrees of parameter tuning across different CMIP6 models can contribute to model uncertainty. Parameter optimization should be applied to address this issue by searching the optimal values and providing a range of acceptable solutions of parameters that are related to the BCP24,27,35,36,37.

It is worth noting that both the refinement of model structures and the optimization of model parameters critically depend on appropriate observations38. Because the currently available in-situ observations of vertical carbon flux are biased towards productive regions, more observations in the STG would be desirable and should reduce the large model uncertainty there. In addition, models producing opposite global patterns of transfer efficiency, e.g., GFDL-CM4, GFDL-ESM4, and IPSL-CM6A-LR-INCA, may differ only slightly in their goodness-of-fit when compared with in-situ observations (Supplementary Fig. 8). This suggests that an increase in observations of not only carbon export but also depth-resolved deep vertical carbon flux is urgently needed.

Over the past decades, the vertical carbon flux has been measured by sediment traps39,40 and radioactive tracers such as 234Th-238U and 210Po-210Pb, but these are limited due to logistical constraints41,42,43. These constraints are now being alleviated by autonomous profiling platforms. Biogeochemical-Argo (BGC-Argo) floats that are equipped with bio-optical sensors such as backscatter and optical sediment traps can inform us on vertical carbon flux in a cost-effective manner44,45,46. They also bring about a large expansion of biogeochemical measurements including nutrient and oxygen, which reflect the accumulative effects of production, respiration, and transport of organic matter23,24,47,48,49,50,51. However, assessing model uncertainty in simulating the BCP through comparisons with these biogeochemical measurements is made difficult by the fact that the CMIP6 models are initialized and spun up following different procedures33. A consistent protocol for model initialization and spin-up would provide better opportunities for model assessment. Since different mechanisms can cancel out each other, leading to the same vertical carbon flux, observations of the vertical carbon flux are insufficient on their own to tease apart different mechanisms52,53. This is a well-known issue of underdetermination and is common in biogeochemical models38. Although independent observations are becoming more readily accessible through BGC-Argo floats, their utilization for model assessment requires the corresponding tracers to be included explicitly in ESMs. For instance, the availability of particle size distribution has increased rapidly via the separation between small and large particles based on high-frequency measurements of backscatter54,55 and is expected to expand further due to the emergence of miniaturized imaging sensors56. 
This information can not only provide supplementary constraints on the vertical carbon flux but also yield process-level insights into the distinction between remineralization and fragmentation, two complementary processes that jointly set the vertical carbon flux but have contrasting effects on the particle size distribution52,57. However, model comparison with these independent observations requires at least two particle size classes to be simulated.

Conclusions

In this study, model uncertainty in simulating the BCP within state-of-the-art ocean biogeochemical models was partitioned into different contributors and analyzed. Comparisons across 14 CMIP6 models show large variations in carbon export at 100 m and increased variations at depth due to differences in transfer efficiency. Owing to the discrepancies in model structures and parameters, the simulated transfer efficiency from the 14 CMIP6 models can be classified into three broad categories, with the first featuring a homogenous global pattern of transfer efficiency and the other two featuring contrasting patterns. In addition, uncertainty analysis based on 6 representative models out of the 14 suggests an initial dominance of the model uncertainty of carbon export, which is overtaken by the model uncertainty of transfer efficiency at 900 m on the global scale. Since a recent study has suggested that century-scale carbon sequestration is about equally important above and below 1000 m (ref. 31), model uncertainties of carbon export and transfer efficiency are both important, implying the necessity to improve model representations both within the euphotic zone and below. In addition, an expansion of observations of not only carbon export but also flux attenuation in the deep ocean is important. The model uncertainty of transfer efficiency was further decomposed into uncertainties that can and cannot be represented by the Martin curve, both of which are of comparable importance, suggesting that using the Martin curve and exponent b to compare transfer efficiency underestimates the model uncertainty of transfer efficiency. Therefore, the vertical profile of carbon flux, rather than the exponent b alone, is recommended for model validation.

Methods

CMIP6 models

To assess different factors that contribute to model uncertainty in simulating the BCP within state-of-the-art biogeochemical ocean models, we chose the 14 ESMs that are part of CMIP6 and report depth-resolved vertical carbon flux (Table 1). Model outputs from the historical simulations were averaged over 15 years (2000–2014) to produce the climatology of vertical carbon flux. Only one ensemble member (r1i1p1f1) is used for each model. To facilitate comparisons between models, the model outputs were first remapped onto a common 1° × 1° regular grid with a vertical resolution of 100 m and then compared at different levels of spatial averaging, i.e., from single grid cells to the global average. On biome scales, the simulated vertical carbon flux is averaged over different biomes, which encapsulate the large-scale variability in biogeochemical properties (Fig. 1a). This map of biomes is adapted from ref. 58 by combining the subtropical regions of the Pacific Ocean and the Atlantic Ocean in the northern and southern hemispheres into the STG. In addition, the equatorial upwelling regions of the Pacific Ocean and the Atlantic Ocean are combined into the Equatorial Tropical Upwelling biome.
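A biome or global average on a regular latitude-longitude grid is commonly computed with weights proportional to the cosine of latitude, because grid cells shrink towards the poles. The text does not state which weighting was applied, so the sketch below is an assumption-laden illustration: the function name, the use of `None` for land or missing cells, and the tiny example grid are all hypothetical.

```python
import math

def area_weighted_mean(field, lats, mask=None):
    """Mean of field[i][j] weighted by cos(latitude); None marks land or missing cells."""
    total, weight_sum = 0.0, 0.0
    for i, lat in enumerate(lats):
        w = math.cos(math.radians(lat))  # cell area on a regular grid scales with cos(lat)
        for j, value in enumerate(field[i]):
            if value is None or (mask is not None and not mask[i][j]):
                continue  # skip cells outside the biome or without data
            total += w * value
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else None

# Tiny made-up example: two latitude rows, two longitude columns
lats = [0.5, 60.5]
flux = [[10.0, 10.0], [20.0, None]]
biome = [[True, True], [True, True]]
mean_flux = area_weighted_mean(flux, lats, biome)
print(round(mean_flux, 2))
```

The high-latitude cell counts for about half as much as each equatorial cell, so the weighted mean lies below the unweighted mean of the three valid cells.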

In-situ observations of the vertical carbon flux

In-situ observations of the vertical carbon flux were compiled by ref. 59 and are available in the supporting information of ref. 13. Most of the available in-situ observations (~90%) were collected by sediment traps deployed for <30 days and hence are compared to the monthly climatology in the nearest model grid cell of each ESM.
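The comparison described above can be illustrated with a minimal sketch. It assumes monthly model output that starts in January and a regular 1° × 1° grid with cell centers at half degrees. The source does not state whether observations were matched on the native or the remapped grid, so the function names, the grid layout, and the station coordinates below are all hypothetical.

```python
import numpy as np

def monthly_climatology(monthly_flux):
    """Average each calendar month over all years.

    monthly_flux: array of shape (n_years * 12, ...), starting in January.
    Returns an array of shape (12, ...).
    """
    n_months = monthly_flux.shape[0]
    if n_months % 12 != 0:
        raise ValueError("time axis must contain whole years")
    by_year = monthly_flux.reshape(n_months // 12, 12, *monthly_flux.shape[1:])
    return by_year.mean(axis=0)

def nearest_cell(lat, lon, grid_lat, grid_lon):
    """Return (i, j) indices of the grid cell center closest to (lat, lon)."""
    i = int(np.argmin(np.abs(grid_lat - lat)))
    # Wrap longitude differences into [-180, 180) so 359.5 and 0.5 are neighbors
    dlon = np.abs((grid_lon - lon + 180.0) % 360.0 - 180.0)
    j = int(np.argmin(dlon))
    return i, j

# Assumed 1-degree grid with cell centers at half degrees
grid_lat = np.arange(-89.5, 90.0, 1.0)
grid_lon = np.arange(0.5, 360.0, 1.0)

# Hypothetical station location (degrees north, degrees east)
i, j = nearest_cell(31.6, -64.2, grid_lat, grid_lon)

# Two years of placeholder monthly values for a single grid cell
demo = np.arange(24, dtype=float).reshape(24, 1, 1)
clim = monthly_climatology(demo)
```

An observation collected in a given calendar month would then be compared with `clim[month - 1]` at indices `(i, j)` and the matching depth level.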

Different sources of model uncertainty in simulating the BCP

To diagnose the model uncertainty in simulating the BCP, we describe the vertical carbon flux at depth \(z\) (\(\overline{{{{\rm{F}}}}\left({{{\rm{z}}}}\right)}\)) as a product of the carbon export (\(\overline{{{{\rm{EP}}}}}\)) and transfer efficiency (\(\overline{{{{\rm{TE}}}}\left({{{\rm{z}}}}\right)}\)):

$$\overline{{{{\rm{F}}}}\left({{{\rm{z}}}}\right)}=\overline{{{{\rm{EP}}}}}\times \overline{{{{\rm{TE}}}}\left({{{\rm{z}}}}\right)},$$
(1)

where the carbon export is defined as the vertical carbon flux at 100 m, and the overbar represents the average over different spatial scales. Although the 100 m depth horizon does not necessarily separate the productive upper layer from the lower layer, where no organic particles are produced, it facilitates comparisons between different models.
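As a small illustration of Eq. (1), the sketch below splits a flux profile into carbon export and transfer efficiency. The function name and the profile values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def transfer_efficiency(flux, depths, z_export=100.0):
    """Return (EP, TE), with EP = F(z_export) and TE(z) = F(z) / EP."""
    flux = np.asarray(flux, dtype=float)
    depths = np.asarray(depths, dtype=float)
    idx = np.flatnonzero(np.isclose(depths, z_export))
    if idx.size != 1:
        raise ValueError("depths must contain the export depth exactly once")
    ep = flux[idx[0]]
    return ep, flux / ep

# Hypothetical spatially averaged profile (mg C m^-2 day^-1)
depths = np.array([100.0, 200.0, 500.0, 1000.0])
flux = np.array([50.0, 25.0, 10.0, 5.0])

ep, te = transfer_efficiency(flux, depths)
# By construction, ep * te reproduces the flux profile, as in Eq. (1)
```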

The model uncertainty in simulating the BCP is calculated as the variance of the vertical carbon flux across different ESMs and hence is referred to as the model uncertainty of the vertical carbon flux (\({U}_{{{{\rm{F}}}}}\), in units of \({(\log ({{{\rm{mg \, C}}}}{{{{\rm{m}}}}}^{-2}{{{\rm{da}}}}{{{{\rm{y}}}}}^{-1}))}^{2}\)). Because the vertical carbon flux varies by orders of magnitude, a log-transformation is applied before the variance is calculated:

$${\log }_{10}(\overline{{{{\rm{F}}}}({{{\rm{z}}}})})={\log }_{10}(\overline{{{{\rm{EP}}}}})+{\log }_{10}\left(\overline{{{{\rm{TE}}}}\left({{{\rm{z}}}}\right)}\right),$$
(2)
$${U}_{{{{\rm{F}}}}}={{\mathrm{var}}}\left({\log }_{10}\left(\overline{{{{\rm{F}}}}({{{\rm{z}}}})}\right)\right).$$
(3)

The model uncertainty of the vertical carbon flux can then be decomposed into the model uncertainty of carbon export (\({U}_{{{{\rm{EP}}}}}\), in units of \({(\log ({{{\rm{mg \, C}}}}{{{{\rm{m}}}}}^{-2}{{{\rm{da}}}}{{{{\rm{y}}}}}^{-1}))}^{2}\)), the model uncertainty of transfer efficiency (\({U}_{{{{\rm{TE}}}}}\), dimensionless), and the co-dependence between carbon export and transfer efficiency (\({U}_{{{{\rm{EP\& TE}}}}}\), in units of \({(\log ({{{\rm{mg \, C}}}}{{{{\rm{m}}}}}^{-2}{{{\rm{da}}}}{{{{\rm{y}}}}}^{-1}))}^{2}\)):

$${U}_{{{{\rm{F}}}}}={U}_{{{{\rm{EP}}}}}+{U}_{{{{\rm{TE}}}}}+{U}_{{{{\rm{EP\& TE}}}}},$$
(4)
$${U}_{{{{\rm{EP}}}}}={{\mathrm{var}}}\left({\log }_{10}\left(\overline{{{{\rm{EP}}}}}\right)\right),$$
(5)
$${U}_{{{{\rm{TE}}}}}={{\mathrm{var}}}\left({\log }_{10}\left(\overline{{{{\rm{TE}}}}}\right)\right),$$
(6)
$${U}_{{{{\rm{EP\& TE}}}}}=2\times {{\mathrm{cov}}}\left({\log }_{10}\left(\overline{{{{\rm{EP}}}}}\right),{\log }_{10}\left(\overline{{{{\rm{TE}}}}}\right)\right).$$
(7)
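The decomposition in Eqs. (4)–(7) follows from the identity for the variance of a sum, with the co-dependence term equal to twice the covariance of the two log-transformed quantities. The sketch below checks this numerically at a single depth. The per-model values are hypothetical, and the population normalization (`ddof=0`) is an assumption, since the normalization used in the study is not stated here; the identity holds for any normalization applied consistently.

```python
import numpy as np

def decompose_flux_uncertainty(ep, te):
    """Split var(log10 F) across models into export, TE, and cross terms.

    ep, te: 1-D arrays with one entry per model, at a fixed depth.
    Returns (U_F, U_EP, U_TE, U_EP&TE).
    """
    log_ep = np.log10(np.asarray(ep, dtype=float))
    log_te = np.log10(np.asarray(te, dtype=float))
    u_ep = np.var(log_ep)                                  # Eq. (5)
    u_te = np.var(log_te)                                  # Eq. (6)
    u_cross = 2.0 * np.cov(log_ep, log_te, ddof=0)[0, 1]   # Eq. (7)
    u_f = np.var(log_ep + log_te)                          # Eqs. (2)-(3)
    return u_f, u_ep, u_te, u_cross

# Hypothetical values for six models at one depth
ep = np.array([40.0, 55.0, 70.0, 30.0, 90.0, 60.0])   # mg C m^-2 day^-1
te = np.array([0.10, 0.08, 0.15, 0.20, 0.05, 0.12])   # dimensionless

u_f, u_ep, u_te, u_cross = decompose_flux_uncertainty(ep, te)
```

A negative cross term indicates that models with high export tend to have low transfer efficiency, which partly cancels the two individual uncertainties.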

The model uncertainty of the transfer efficiency (\({U}_{{{{\rm{TE}}}}}\)) is further decomposed into type I (\({U}_{{{{\rm{I}}}}}\), dimensionless) and type II (\({U}_{{{{\rm{II}}}}}\), dimensionless) uncertainty, which distinguish the part of the model uncertainty of transfer efficiency that can be represented by changing the exponent b of the Martin curve from the part that cannot. To achieve this, the simulated transfer efficiency from each ESM was fitted to the Martin curve, so that the variance of the predicted transfer efficiency (\(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}({{{\rm{z}}}})}\)) across models is exclusively due to the exponent b:

$$\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}\left({{{\rm{z}}}}\right)}={\left(\frac{z}{100}\right)}^{-b}.$$
(8)
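One possible way to obtain the exponent b in Eq. (8) is sketched below. The fitting procedure is not specified here, so the choice of ordinary least squares in log space, constrained to pass through a transfer efficiency of 1 at 100 m, is an assumption, and the function name is hypothetical.

```python
import numpy as np

def fit_martin_b(depths, te, z_ref=100.0):
    """Estimate b in TE(z) = (z / z_ref)**(-b).

    Taking log10 of both sides gives log10(TE) = -b * log10(z / z_ref),
    a straight line through the origin, so the least-squares slope is
    sum(x * y) / sum(x * x) and b is its negative.
    """
    x = np.log10(np.asarray(depths, dtype=float) / z_ref)
    y = np.log10(np.asarray(te, dtype=float))
    return -np.sum(x * y) / np.sum(x * x)

# Synthetic profile on a 100 m vertical grid, generated with b = 0.84
depths = np.arange(100.0, 2100.0, 100.0)
te_synthetic = (depths / 100.0) ** -0.84

b_hat = fit_martin_b(depths, te_synthetic)
```

Fitting in log space weights relative rather than absolute misfits, which matters for a flux that spans orders of magnitude; a fit in linear space would emphasize the shallow part of the profile instead.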

The simulated transfer efficiency is represented as:

$$\overline{{{{\rm{TE}}}}\left({{{\rm{z}}}}\right)}=\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}\left({{{\rm{z}}}}\right)}\times \overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{deviation}}}}}\left({{{\rm{z}}}}\right)},$$
(9)

where \({{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{deviation}}}}}({{{\rm{z}}}})\) represents the deviation of the simulated transfer efficiency from the corresponding Martin curve. By applying a log-transformation, Eq. (9) becomes:

$${\log }_{10}\left(\overline{{{{\rm{TE}}}}\left({{{\rm{z}}}}\right)}\right)={\log }_{10}\left(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}\left({{{\rm{z}}}}\right)}\right)+{\log }_{10}(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{deviation}}}}}\left({{{\rm{z}}}}\right)}).$$
(10)

Therefore, the model uncertainty of the transfer efficiency (\({U}_{{{{\rm{TE}}}}}\)) can be expressed as the sum of the type I uncertainty (\({U}_{{{{\rm{I}}}}}\)), the type II uncertainty (\({U}_{{{{\rm{II}}}}}\)), and their co-dependence (\({U}_{{{{\rm{I\& II}}}}}\), dimensionless):

$${U}_{{{{\rm{TE}}}}}={U}_{{{{\rm{I}}}}}+{U}_{{{{\rm{II}}}}}+{U}_{{{{\rm{I\& II}}}}}$$
(11)
$${U}_{{{{\rm{I}}}}}={{\mathrm{var}}}\left({\log }_{10}\left(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}}\right)\right)$$
(12)
$${U}_{{{{\rm{II}}}}}={{\mathrm{var}}}\left({\log }_{10}\left(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{deviation}}}}}}\right)\right)$$
(13)
$${U}_{{{{\rm{I\& II}}}}}=2\times {{\mathrm{cov}}}\left({\log }_{10}\left(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{martin}}}}}}\right),{\log }_{10}\left(\overline{{{{\rm{T}}}}{{{{\rm{E}}}}}_{{{{\rm{deviation}}}}}}\right)\right)$$
(14)
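The type I and type II decomposition in Eqs. (8)–(14) can be sketched end to end as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the log-space least-squares fit, the population variance normalization, the function names, and the three synthetic "models" are all hypothetical.

```python
import numpy as np

def fit_b(depths, te, z_ref=100.0):
    """Least-squares Martin exponent in log space (assumed fitting method)."""
    x = np.log10(depths / z_ref)
    y = np.log10(te)
    return -np.sum(x * y) / np.sum(x * x)

def decompose_te_uncertainty(depths, te_models, z_ref=100.0):
    """Return (U_TE, U_I, U_II, U_I&II), each an array over depth.

    te_models: array of shape (n_models, n_depths).
    """
    depths = np.asarray(depths, dtype=float)
    te_models = np.asarray(te_models, dtype=float)
    b = np.array([fit_b(depths, te, z_ref) for te in te_models])
    # Eq. (8) in log space: log10(TE_martin) = -b * log10(z / z_ref)
    log_martin = -b[:, None] * np.log10(depths / z_ref)[None, :]
    log_te = np.log10(te_models)
    # Eq. (10): the deviation is whatever the Martin curve does not capture
    log_dev = log_te - log_martin
    u_i = log_martin.var(axis=0)                      # Eq. (12)
    u_ii = log_dev.var(axis=0)                        # Eq. (13)
    anom_m = log_martin - log_martin.mean(axis=0)
    anom_d = log_dev - log_dev.mean(axis=0)
    u_cross = 2.0 * np.mean(anom_m * anom_d, axis=0)  # Eq. (14)
    u_te = log_te.var(axis=0)                         # Eq. (6)
    return u_te, u_i, u_ii, u_cross

# Three synthetic models: two Martin curves and one exponential profile
depths = np.arange(100.0, 2100.0, 100.0)
te_models = np.vstack([
    (depths / 100.0) ** -0.7,
    (depths / 100.0) ** -1.0,
    np.exp(-(depths - 100.0) / 800.0),
])

u_te, u_i, u_ii, u_cross = decompose_te_uncertainty(depths, te_models)
```

In this synthetic case the type II term is nonzero only because the exponential profile cannot be reproduced by any single exponent b; with Martin-shaped profiles alone it vanishes.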