Introduction

Global environmental change is altering vegetation phenology, thereby disturbing the terrestrial carbon cycle balance1. Northern mid-high latitudes show a substantial warming trend in spring due to anthropogenic warming and Arctic amplification2. Simultaneously, the seasonal landscape in boreal vegetated regions, where temperature is a prominent limiting factor, has experienced a noteworthy shift. As a result, the advancement in leaf unfolding date and the increase in spring vegetation productivity (referred to as earlier spring greening, ESG) are widespread phenomena in the Northern Hemisphere (30° to 90°N)3,4. These changes in phenological stages5 and seasonal thermal conditions6 influence the pattern of interseasonal vegetation-carbon coupling. This interaction has been corroborated by in-situ observations of carbon fluxes, and suggests potentially immense impacts on large-scale ecological functions and the global carbon cycle7,8,9. However, it remains uncertain how springtime changes in vegetation phenology have affected ecosystem productivity later in the annual cycle.

Previous studies10,11,12,13,14,15 have centered on seemingly contradictory hypotheses that the contrasting lagged effects of spring warmth may either beneficially or adversely influence terrestrial vegetation productivity in subsequent seasons. Under the theoretical framework of ecological memory16, those lagged effects can be encapsulated as exogenous (environmental) and endogenous (biological) components of memory. This framework emphasizes the impact of antecedent conditions on current ecological dynamics, involving multiple aspects of hydrological processes and plant physiology. For instance, escalated extreme climate risk caused by enhanced summer water stress from the ESG (e.g., higher evapotranspiration12 and soil moisture deficit17) may result in excess carbon losses, and increased ecosystem respiration in the autumn may offset carbon uptake18,19. Conversely, the endogenous vegetation growth carryover (VGC) effect, which plays a crucial role in the seasonal vegetation dynamics, can continuously induce additional vegetation activity after a warmed spring, in contrast to the aforementioned exogenous climatic legacy effect15. In particular, the strong VGC effect can neutralize the adverse abiotic effects induced by the ESG to preserve the lush gross primary productivity (GPP) of the resilient ecosystem15,20. The core of these foundational hypotheses lies in which of the memory effects dominates the interseasonal vegetation-climate-carbon coupling. Although progress has been made in the evaluation of summer GPP, a comprehensive understanding that quantifies the response of summer net ecosystem productivity (NEPsummer) to the ESG is still lacking.

Here, we hypothesized that the lagged effects of ESG play a vital role in modulating the summer ecosystem carbon sink. To test this hypothesis, we integrated a long-term satellite-based vegetation index, three explicit estimates of carbon flux, and essential hydrometeorological variables to characterize interseasonal (spring-to-summer) ecosystem carbon feedback. These carbon flux datasets include (1) dynamic global vegetation models (DGVMs) obtained from TRENDY21 version 9, (2) two atmospheric CO2 inversions (ACIs), i.e., CAMS22 and Jena CarboScope23, and (3) in-situ eddy covariance measurements from the global FLUXNET24 network. We first investigated the spatial pattern of partial correlation between the spring leaf area index (LAIspring, which served as a proxy of vegetation greenness and phenological metric25) and two gridded independent NEPsummer estimations from 1982 to 2015, and unveiled their non-local linkages in space and time by lagged maximum covariance analysis (MCA). In addition, we conducted an analysis based on flux-tower measurements to further test the overarching hypothesis and introduced a tree-based machine learning model in combination with an explainable AI approach (i.e., SHAP26,27) to quantify this process (more details in Methods). Last, we explored the relationship between the sensitivity of the summer carbon sink to the ESG and reforestation potential. Our study applied the boreal climatological definition for spring and summer, i.e., spring is the period of March-April-May (MAM) and summer is the period of June-July-August (JJA).

We demonstrated that the lagged effects of ESG increased simulated biomass production in summer across the northern vegetated areas from 1982 to 2015. Among terrestrial biomes, the responses differ: forest is stimulated more strongly by the ESG than grassland. Additionally, these findings are consistent with the results from the atmospheric observation-based estimates and eddy covariance measurements of carbon flux. This study highlights the impacts of spring phenology and vegetation changes on the summer carbon sink.

Results

Evidence of enhanced summer carbon sink induced by earlier spring greening (ESG)

By removing the covarying effects of summer temperature and precipitation, partial correlation analysis shows a prevailing positive correlation pattern between LAIspring and simulated NEPsummer from TRENDY across the northern mid-high latitudes (70.2% of study area, Fig. 1b), and the correlation is significant (p < 0.05) in western North America, Siberia, Central Asia, and southern Europe (Supplementary Fig. 1). This is direct statistical evidence of the possible promoting effects of ESG on summer net carbon sink. In addition, a conspicuous negative correlation pattern is found over the cropland of Central North America, which may be purely attributed to the non-climate-driven agricultural management effects, i.e., intensified agriculture practice, including the expansion of cultivation and planting of new crop variants28. Considering the potentially ambiguous consequences of anthropogenic regulation on crop phenology and productivity, our subsequent focus will be solely on non-agricultural vegetation areas.

Fig. 1: Partial correlation between observed LAIspring and simulated NEPsummer from 1982 to 2015.

a widespread earlier leaf unfolding date and spring greening in the mid-high latitudes of the Northern Hemisphere (30° to 90°N) from 1982 to 2015. The horizontal and vertical axes of the color legend are the linear trends of LAIspring and the start of the growing season (SOS), respectively. Their spatial patterns are shown in Supplementary Fig. 2. b the spatial distribution of partial correlation coefficients between GIMMS LAIspring and TRENDY NEPsummer. c the changes of mean partial correlation coefficients with the increasing fraction of forest or grassland in pixels. d the changes of mean partial correlation coefficients with the increasing tree coverage in pixels. The error bars indicate a 95% confidence interval. The fractions of forest and grassland and the tree coverage are collected from the MCD12C1 and MOD44B products, respectively.

With the increasing forest and grassland coverage, their partial correlation coefficients vary conversely (Fig. 1c). Due to the higher resistance to enhanced water stress and stronger endogenous (biological) memory effects relative to exogenous (environmental) memory effects, forest tends to show an increase in correlation coefficients15,29. By contrast, exogenous memory effects dominating the response of grassland to the ESG, particularly in drylands, and the notable vulnerability of herbaceous plants to climate extremes (e.g., drought) lead to decreased correlation coefficients for grassland15,17,29,30. As a result, after excluding agricultural samples, the average partial correlation coefficients show a clear increase with increasing tree coverage (Fig. 1d), which is in agreement with results shown in Fig. 1c. This trend is less apparent within relatively low tree coverage, since the related grid samples are a mixture of the other two biome types (namely shrubland and savanna), which present non-robust linear variation.

To further reveal the non-local connection between the ESG and summer ecosystem carbon sink, we employed a lagged MCA to explore the spatiotemporal coupling relationship between LAIspring and simulated NEPsummer. Overall, the annual time expansion coefficients (refer to Methods) associated with the two data fields correlate significantly (r = 0.93, p < 0.001, Supplementary Fig. 3), indicating that strong coupling exists in their leading modes. We found that the spatial patterns of the paired leading modes show relatively broad consistency in anomalies across northern areas (30° to 90°N, Fig. 2a, b). Generally, in line with partial correlation analysis, the results based on the lagged MCA are robust evidence for the positive response of NEPsummer to the ESG. It is noted that the corresponding anomaly patterns of LAIspring and NEPsummer are contrary in parts of North America, where the negative partial correlation is observed in Fig. 1b. This is explained by the large-scale opposite phenomenon (i.e., spring phenology delay and browning occurrence; Fig. 1a), and drier summer induced by covarying climate oscillations (e.g., ENSO)11.

Fig. 2: Maximum covariance analysis (MCA) between observed LAIspring and simulated NEPsummer from 1982 to 2015.

a, b the spatial patterns of MCA leading modes of GIMMS LAIspring and TRENDY NEPsummer, respectively. The squared covariance fraction (SCF) between the two fields is 45%. A two-tailed Student’s t-test of heterogeneity regression confidence level is conducted to show the significance distribution (Supplementary Fig. 4). c the overall distribution of the leading modes’ singular vectors of LAIspring and NEPsummer. d the divergence of the leading modes’ singular vectors of four vegetation types. The extent of the box indicates the 25th and 75th percentiles.

The dominantly positive anomalies in the overall distribution of the standardized singular vectors (refer to Methods) for LAIspring and simulated NEPsummer further support the comparatively unified conjunction of spring vegetation greening and summer carbon sink increase (Fig. 2c). Subsequently, we examined the distinction in the responses of four vegetation types by comparing their singular vector distributions (Fig. 2d). The synergetic LAI-NEP coupling pattern of the forest is found, i.e., the positive LAIspring anomalies in singular vectors are followed by more positive NEPsummer anomalies, which differs from grassland. As seen by partial correlation analysis, this also embodies the converse responses between forest and grassland. The patterns of LAIspring anomalies are near zero for shrubland and savanna, whose NEPsummer anomalies are predominantly positive, implying another kind of completely different response pattern.

To obtain more reliable insights into the interseasonal vegetation-carbon coupling, we further conducted similar analyses using atmospheric observation-based estimates of summer net ecosystem productivity (i.e., ACI NEPsummer) to compare with the above results from DGVMs. It is noteworthy that the spatial resolution of NEPsummer estimated by ACIs is much coarser than that of satellite-based LAIspring, which prevents calculating the precise partial correlation coefficients for each grid (Supplementary Fig. 5a). Nonetheless, the trends of mean correlation coefficients with the increasing fraction of forest, grassland and tree coverage exhibit overall conformity with DGVMs (Supplementary Fig. 5b, c). In addition, at a larger spatial scale, there is better agreement in the spatial patterns of standardized singular vectors as identified by MCA (Supplementary Fig. 6). These results suggest that the lagged effects of ESG on NEPsummer are also identifiable with atmospheric observations. Thus, multiple lines of evidence collectively support the notion that vegetation conditions related to greenness and phenology during springtime likely contribute to the enhancement of summer carbon sink.

Site-based observations confirm our hypothesis

We further assessed the observation-based interseasonal connections between the ESG features (namely LAIspring and the start of the growing season, SOS) and NEPsummer across 45 available FLUXNET sites. Using the Theil-Sen slope estimator and the Mann-Kendall test, we calculated the trend and its statistical significance for each observational time series. The trends of LAIspring and SOS correlate significantly with that of site-based NEPsummer (r = 0.41, p = 0.009 for LAIspring; r = −0.34, p = 0.041 for SOS; Fig. 3a, b). These relatively weak cross-site correlations may be masked by biome-dependent sensitivity to long-term climate effects31. On the other hand, sites with significant increasing trends in spring greenness and significant advancing trends in SOS show higher NEPsummer trends than sites with non-significant trends in LAI and SOS (Fig. 3c). We also found a moderate negative correlation between pr(LAIspring - NEPsummer) and pr(SOS - NEPsummer) (r = −0.46, p = 0.008; Fig. 3d), reflecting a certain consistency within the ESG features. Despite the sparse spatial distribution of in-situ observation sites and the limited length of instrumental records, the overall agreement between these observations and the results from process-based models and atmospheric inversions enhances our confidence in the parallel findings.
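
As an illustration of the trend analysis described above, the following minimal sketch computes a Theil-Sen slope and a Mann-Kendall test for a single annual series. It assumes evenly spaced years, no tied values and no autocorrelation correction; the function names and the example series are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_slope(values):
    """Median of all pairwise slopes for an evenly spaced (annual) series."""
    pairs = combinations(range(len(values)), 2)
    return median((values[j] - values[i]) / (j - i) for i, j in pairs)


def mann_kendall(values):
    """Return (S, z, two-sided p) from the normal approximation.

    Assumption: no tie correction and no serial-correlation correction.
    """
    n = len(values)
    s = 0
    for i, j in combinations(range(n), 2):
        diff = values[j] - values[i]
        s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p


# Hypothetical 10-year NEPsummer series: steady increase plus a small alternating wiggle.
nep_series = [2.0 * t + (0.5 if t % 2 else -0.5) for t in range(10)]
slope = theil_sen_slope(nep_series)
s_stat, z_score, p_value = mann_kendall(nep_series)
```

Because the slope is a median of pairwise slopes, a few outlying years have little influence on it, which is useful for short flux-tower records.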

Fig. 3: Relationship of site-based summer carbon sink with two ESG features.

a the partial correlation between the trends of satellite-based LAIspring and site-based NEPsummer, denoted as pr(LAIspring - NEPsummer). b the same as a but for pr(SOS - NEPsummer). c trends of NEPsummer in sites with different significant levels of LAIspring and SOS trends. Sig. + (−) and Non-sig. + (−) indicate significant and non-significant increasing (decreasing) trends. d scatterplot of pr(LAIspring-NEPsummer) versus pr(SOS-NEPsummer). The observations of NEPsummer are collected from 45 flux-tower FLUXNET sites. The extent of the box indicates the 25th and 75th percentiles. The envelope indicates a 95% confidence interval.

Explainable machine learning confirms our hypothesis

To test the hypothesis on the distinct responses of four vegetation types and quantify the effects of ESG, we implemented an explainable machine learning model that combines a tree-based model (XGBoost) with the SHAP algorithm26,27. First, we used two ESG features (LAIspring and SOS) as analyzed in the previous sections, as well as summer LAI and seven hydrometeorological factors as potential drivers, and simulated NEPsummer as the target variable (more details in Methods). Second, we built an overall model based on all samples from the non-agricultural vegetated areas, and separate models for four vegetation types (forest, shrubland, savanna and grassland). We utilized SHAP values to distinguish between the positive and negative effects of ESG. In the testing phase, the coefficient of determination (r2) is 0.86 for the overall model and ranges from 0.59 to 0.71 for the separate models, indicating the high reliability of our established models in capturing the spring-to-summer vegetation-carbon coupling (Supplementary Fig. 7).

The analyses show that the influence of spring greenness on the summer carbon sink is mixed, whereas the influence of spring phenology is consistent across four vegetation types (Fig. 4). For forest, the samples having beneficial effects cluster in the quadrant with positive LAIspring anomalies (or negative SOS anomalies) and positive NEPsummer anomalies, and vice versa (Fig. 4a, e). The response patterns of the other vegetation types are somewhat similar to those of the forest. However, for shrubland and savanna, we found that vegetation over-browning in springtime can also play a relatively notable role in boosting summer vegetation productivity (Fig. 4b, c). One possible explanation is that this weak spring plant growth constrained by temperature can act as strong negative feedback to enhance interseasonal vegetation-soil moisture interaction towards elevating peak season activity15,32,33. In addition, the response patterns of grassland are less concentrated than those of the forest (Fig. 4d, h). These findings provide valuable insights into the impact of ESG features on summer carbon sinks across different biomes, thus contributing to a comprehensive understanding and helping to reconcile previous assessments.

Fig. 4: Response patterns of NEPsummer on the ESG features for different vegetation types from 1982 to 2015.

Four types of vegetation were analyzed and compared, including forest in a, e, shrubland in b, f, savanna in c, g and grassland in d, h. The results were from tree-based models combining the SHAP algorithm. The beneficial (adverse) effects on model output were classified by positive (negative) SHAP values.

We next used the overall model to calculate the marginal effects (refer to Methods) of the ESG features to quantify the magnitude of their isolated and joint effects, by adjusting the inputs of LAIspring and SOS with an additive perturbation of one standard deviation (std). Across the northern mid-high latitudes, the summer ecosystem carbon sink tends to be widely reinforced by the ESG (73.5% of the study area; Fig. 5a). This is an important signal for future changes in natural terrestrial carbon stocks under a scenario in which anthropogenic warming continues to alter spring vegetation dynamics. We also examined the NEPsummer sensitivities to each ESG feature, which exhibit widespread positive patterns in space, particularly for spring phenology (Supplementary Fig. 8). In general, the overall beneficial effects of ESG on NEPsummer are found in all non-agricultural vegetation types (Fig. 5b). Specifically, spring phenology advancement exerts a stronger influence on the summer carbon sink than vegetation greening. This is because vegetation greenness predominantly manifests mixed effects as seen in Fig. 4. Moreover, we found that shrubland and savanna biomes are more sensitive to the ESG than forest and grassland. This may be because the intrinsic water-use efficiency of shrubland and savanna can rise rapidly when exposed to enhanced water stress, given the summer soil moisture drying induced by the ESG17,30,34.

Fig. 5: Marginal effects of ESG features on NEPsummer based on the machine learning model.

a the spatial pattern of the marginal effects on NEPsummer with concurrently perturbing 1 standard deviation (std) in LAIspring and SOS. b the heterogeneity of the marginal effects of four vegetation types. The error bars indicate ±1std.

Implications of our study for reforestation efforts

The northern forest biome stands at the forefront of efforts to mitigate climate change35. As indicated above, the different sensitivities of biomes to the ESG motivate us to analyze whether reforestation could further amplify the beneficial effects of ESG. We found that the hotspots for reforestation36 (Supplementary Fig. 9) spatially coincide with the regions showing high marginal effects, such as east and central Siberia, and northwest America. The total area with coincident tree restoration potential and positive marginal effects is roughly 1.5 × 10^6 km2, accounting for 35.8% of the non-agricultural vegetated areas. This suggests that tree restoration could reinforce the extra gain from the ESG. Another important implication is that the areas with greater tree restoration potential will gain more benefits from the stronger ESG, which are clearly higher than those of zero-planting areas (Fig. 6). However, these benefits will diminish when the tree restoration potential exceeds 40%, which may be attributed to the fact that those regions still have low tree coverage constrained by environmental factors, e.g., water supply and solar radiation. Generally, the presented results show that regions with a moderate level of total tree restoration have the largest potential for summer carbon sink enhancement after a warmed spring (Supplementary Fig. 10).

Fig. 6: Implications of the tree restoration potential and enhanced NEPsummer.

The tree restoration potential36 indicates the maximum carrying capacity per pixel for tree restoration. Zero-planting means that the tree restoration potential is close to zero (Supplementary Fig. 7). The error bars indicate a 95% confidence interval.

Previous work37,38 has reported that vegetation greening and phenological dynamics collectively impact both plant productivity and respiration, leading to the uncertain trade-off of ecosystem carbon sequestration. Our findings offset this concern by demonstrating that summer net carbon sink benefits from the spring-to-summer VGC effect. Furthermore, it should be noted that our study only focuses on the impact of ESG on summer above-ground biomass changes. Across the northern areas, climate change threatens an unknown quantity of carbon stocks which are massively stored in permafrost and peatlands39,40. Accelerated underground carbon releases can also be triggered by anthropogenic warming and vegetation dynamics, versus the increased carbon uptake in vegetated areas. Future research is encouraged to make critical advances in the evaluation of the imbalance between carbon gain and loss concerning the seasonal vegetation-carbon coupling. This will provide a deeper understanding of the complex dynamics and interactions involved, ultimately contributing to more accurate assessments of carbon sequestration potential and the impacts of environmental changes on ecosystem carbon balance.

In summary, our work provided robust evidence that vegetation dynamics in springtime act to increase summer vegetation productivity, based on the parallel results from the process-based and atmospheric observation-based estimates of carbon flux. We further exploited site-derived eddy covariance measurements and a machine learning approach to elucidate the effects of ESG toward a better understanding of the biophysical process associated with greenness and phenology. These time-lagged effects vary with vegetation types due to the mutually independent ecosystem functions. Specifically, the forest biome maintains a consistent and positive relationship with the ESG, revealing its relatively stable functioning in carbon sequestration. By contrast, grassland in arid and semi-arid regions is weakly stimulated, indicating increasing risks of a declining carbon sink in summer. Finally, we found that terrestrial biomes have a predominantly positive sensitivity to the ESG over the mid-high latitudes of the Northern Hemisphere, particularly for shrubland and savanna. These findings underscore the importance of investigating the spring-to-summer vegetation-climate-carbon interaction under global warming.

Methods

Experimental framework design

In this study, we first used satellite-based LAIspring (as a proxy of vegetation greenness and phenology) and three explicit NEPsummer estimates to provide evidence for an enhanced summer ecosystem carbon sink from ESG, by partial correlation analysis and MCA. These methods enable us to reveal the complex spatiotemporal coupling relationships that underlie them. Subsequently, we built explainable machine learning models to gain a process understanding of interseasonal vegetation-carbon coupling. For this model-based analysis, we added the start of the growing season (SOS) to better characterize spring phenological dynamics; SOS is moderately correlated with LAI (r = −0.43, p < 0.001).
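
The partial correlation step can be sketched as follows for a single grid cell. This is a minimal illustration rather than the code used in the study: it assumes the residual-based definition of partial correlation, with summer temperature and precipitation as the control variables (as stated in the Results). The function name and the synthetic series are hypothetical.

```python
import numpy as np


def partial_correlation(x, y, controls):
    """Correlation between x and y after regressing out the control variables."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept plus one column per control variable.
    columns = [np.ones(len(x))] + [np.asarray(c, dtype=float) for c in controls]
    z = np.column_stack(columns)
    # Residuals of x and y after an ordinary least-squares fit on the controls.
    res_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    res_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic 34-year series for one grid cell (illustrative values only).
rng = np.random.default_rng(0)
n_years = 34
temp_summer = rng.normal(size=n_years)
prec_summer = rng.normal(size=n_years)
lai_spring = rng.normal(size=n_years)
nep_summer = (0.8 * lai_spring + 0.5 * temp_summer
              - 0.3 * prec_summer + 0.1 * rng.normal(size=n_years))

pr = partial_correlation(lai_spring, nep_summer, [temp_summer, prec_summer])
```

Applied per pixel, the same function would yield a map of coefficients; significance testing is omitted here.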

Vegetation index and phenological metric

Vegetation indices are key parameters of vegetation structure and function to study global change. Here, LAI was used as an observational proxy of vegetation greenness and phenological metric. The LAI dataset was derived from the Global Inventory Monitoring and Modeling Studies (GIMMS) LAI3g product, which uses the GIMMS NDVI3g and Moderate Resolution Imaging Spectroradiometer (MODIS) LAI dataset as input and is assimilated based on an Artificial Neural Network model41. This global satellite product has a spatial resolution of 8 × 8 km2 at biweekly intervals and has been widely applied in environmental science. We calculated the spring LAI average for the period 1982-2015.

The growing-season vegetation dynamics and regional carbon budget have a close relationship with the surface freeze/thaw state, in particular for mid-high latitudes, which can serve as a natural representation of both the commencement and cessation of biological activities11,42. Therefore, we extracted the date of the spring phenological metric (i.e., SOS) by using a daily satellite microwave freeze/thaw record archive from the Making Earth System Data Records for Use in Research Environments (MEaSUREs) program at a 25 × 25 km2 spatial resolution43. Although this coarse resolution has inherent limitations, namely the potential loss of spatial detail and of local-scale variations in phenological state, the dataset still serves our research purpose well, as other datasets used for investigating SOS impacts share a similar spatial resolution. Specifically, we prescribed the SOS based on the criterion of at least 12 days in thaw status out of 15 consecutive days, with this condition also satisfied in the next 60 days11. Note that the leaf-out time of forest species is driven by various factors including light (e.g., beech trees) and temperature, whereas it is not possible to distinguish between temperature-driven and light-intensity-driven forest types with the technique used in this paper.
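
The thaw criterion can be sketched as below. The text leaves some room for interpretation, so this is one possible reading and not necessarily the rule applied in the study: it assumes that SOS is the first day whose 15-day window contains at least 12 thawed days, and that every window starting on the following days, for 60 consecutive window starts, also meets the criterion. The function name, the parameter names and the binary input encoding (1 = thawed, 0 = frozen) are hypothetical.

```python
def start_of_season(thawed, window=15, min_thaw=12, persistence=60):
    """Return the first 0-based day index meeting the thaw criterion, or None.

    thawed: sequence of daily flags for one pixel-year (1 = thawed, 0 = frozen).
    """
    n_windows = len(thawed) - window + 1
    # ok[d] is True when the window starting on day d has enough thawed days.
    ok = [sum(thawed[d:d + window]) >= min_thaw for d in range(n_windows)]
    # Assumed persistence rule: the criterion holds for `persistence` window starts in a row.
    for d in range(len(ok) - persistence + 1):
        if all(ok[d:d + persistence]):
            return d
    return None


# Hypothetical year: thaw begins on day index 100 and persists.
simple_year = [0] * 100 + [1] * 265
# Hypothetical year with a 20-day false spring before the lasting thaw.
false_spring_year = [0] * 50 + [1] * 20 + [0] * 30 + [1] * 265

sos_simple = start_of_season(simple_year)
sos_false_spring = start_of_season(false_spring_year)
```

The persistence check is what rejects the short warm spell in the second example, so both years return the same date.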

Ecosystem carbon fluxes

To estimate the summer ecosystem carbon sequestration, we used three independent carbon flux datasets for the period of 1982 to 2015. The first is an ensemble of eight dynamic global vegetation models from TRENDY21 v9, including CABLE, IBIS, SDGVM, DLEM, ISAM, LPJ, ORCHIDEE, and VISIT. Note that we only included the models with a spatial resolution of 0.5°, while omitting the models with coarser resolution44. These models are forced under simulation 2 (varying CO2 and climate). The net ecosystem productivity is represented as the difference between GPP and terrestrial ecosystem respiration (the sum of autotrophic respiration and heterotrophic respiration). The second carbon flux dataset is an ensemble of two long-term atmospheric inversions, CAMS version 17r122 and Jena CarboScope version s76oc_v2022 (ref. 23), both of which assimilated surface-to-atmosphere CO2 measurements; their spatial resolutions are 1.9° × 3.75° and 4° × 5°, respectively. We remapped them to regular gridded outputs with a spatial resolution of 1° × 1°. In addition to TRENDY and ACIs, we also collected the eddy covariance (EC) measurements derived from the FLUXNET 2015 dataset24. We selected the EC carbon flux tower sites that provide at least 7 years of observational records and are overall free from low-quality measurements (e.g., missing values). We also excluded those sites dominated by cropland. Finally, 45 sites (Supplementary Table 1) were included for the investigation of the seasonal vegetation-carbon relationship.

Hydrometeorological data

The root-zone soil moisture (SM) data were collected from GLEAM v3.2a45. The GLEAM algorithm assimilates microwave-based surface soil moisture into the soil profile to correct for random errors in forcing datasets. The generated surface hydrological elements from GLEAM have been proven effective in land-atmospheric interaction studies46. The monthly downwelling shortwave radiation (Srad) with 0.5° × 0.625° spatial resolution was obtained from Modern-Era Retrospective analysis for Research and Applications (MERRA) v2 global reanalysis product47. The monthly temperature (Temp), precipitation (Prec) and vapor pressure deficit (VPD) are collected from the Climatic Research Unit Time Series (CRU TS) v4.05 dataset48 with a spatial resolution of 0.5°. The VPD is represented as the difference between saturated vapor pressure (SVP) and actual vapor pressure, and the SVP is calculated based on the empirical equation using air temperature.
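
The VPD calculation can be illustrated with a short sketch. The text does not name the empirical equation used for SVP, so the Tetens-type approximation below is an assumption, as are the hPa units, the flooring of VPD at zero and the function names.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Tetens-type approximation over water (assumed form).

    temp_c: air temperature in degrees Celsius; result in hPa.
    """
    return 6.108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit_hpa(temp_c, actual_vp_hpa):
    """VPD = SVP - actual vapor pressure, floored at zero (assumption)."""
    return max(saturation_vapor_pressure_hpa(temp_c) - actual_vp_hpa, 0.0)


# Hypothetical monthly values: 20 degrees Celsius and 14 hPa actual vapor pressure.
vpd_example = vapor_pressure_deficit_hpa(20.0, 14.0)
```

Note that applying a nonlinear SVP equation to monthly mean temperature is itself an approximation, since it ignores sub-monthly temperature variability.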

Land cover map, vegetation continuous fields and tree restoration potential

Vegetation is highly heterogeneous in terms of ecosystem functioning, which results in different responses to ESG. Hence, we considered vegetation types and composition structures to discern these responses. For vegetation types, we primarily investigated forest, shrubland, savanna, and grassland from the MODIS land cover map (MCD12C1) in 2011 according to the International Geosphere-Biosphere Programme (IGBP) classification scheme. Note that cropland is a land-use type extensively impacted by human activities (e.g., tillage, irrigation), and is therefore excluded from this study. Moreover, we supplemented the analysis using vegetation continuous fields data, which are derived from the MODIS global surface vegetation cover product (MOD44B) in 2011. This dataset includes the percentages of three components in each grid cell, i.e., tree cover, non-tree, and non-vegetated cover. We focused on how the carbon sink response varies with tree cover percentage. In this study, both datasets mentioned above refer to a specific year (2011) because the coarse spatial resolution can offset the impacts induced by their potential temporal changes during our study period. In addition, the tree restoration potential dataset36 was collected to represent the realizable scenario of large-scale tree recovery. All datasets were resampled to match the two gridded carbon flux datasets with 0.5° and 1° spatial resolutions, and unified to cover the time span from 1982 to 2015 (information summarized in Supplementary Table 2).

Maximum covariance analysis (MCA)

MCA separates multiple independent coupling modes from two data fields (left field \({Q}_{\left(n\right)}\) and right field \({P}_{\left(n\right)}\)), and reveals their spatial relationship in the temporal domain by applying Singular Value Decomposition (SVD) to the covariance matrix \(C\left(Q,P\right)\)49. The correlation of the two data fields’ time expansion coefficients denotes the level of their coupling strength. In this study, MCA is used to explore the coupling pattern of LAIspring and NEPsummer. Mathematically, we defined two data matrices \(Q\left[i\times n\right]\) and \(P\left[j\times n\right]\), where \(Q\) and \(P\) indicate the normalized LAI and NEP weighted by the square root of the cosine of the corresponding latitude17, respectively, \(i\) and \(j\) indicate the number of pixels, and \(n\) indicates the number of samples (34 years). The covariance matrix \(C\left(Q,P\right)\) is represented as a sum of orthogonal SVD modes:

$$C=\mathop{\sum }\limits_{m=1}^{N}{\sigma }_{m}{q}_{m}{p}_{m}^{T}$$
(1)

where \({q}_{m}\) and \({p}_{m}\) indicate the m-th modes (singular vectors) of the left and right fields corresponding to the singular value \({\sigma }_{m}\), respectively, and \(N\) indicates the number of decomposed dimensions. The time expansion coefficients (\({a}_{m}\) and \({b}_{m}\)) are calculated as:

$$\begin{array}{c}{Q}_{\left(n\right)}=\mathop{\sum }\limits_{m=1}^{N}{a}_{m}\left(n\right){q}_{m},{P}_{\left(n\right)}=\mathop{\sum }\limits_{m=1}^{N}{b}_{m}\left(n\right){p}_{m}\\ {a}_{m}\left(n\right)={q}_{m}^{T}{Q}_{\left(n\right)},{b}_{m}\left(n\right)={p}_{m}^{T}{P}_{\left(n\right)}\end{array}$$
(2)
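
Equations (1) and (2) can be sketched numerically as follows. This is a minimal illustration on synthetic data, not the study's implementation: it assumes per-pixel standardization over time, a cross-covariance normalized by n − 1, and no missing or constant pixels. All names and values are hypothetical. The seasonal lag is implicit in pairing each year's spring field with the same year's summer field.

```python
import numpy as np


def standardize_and_weight(field, lat_deg):
    """Normalize each pixel's series and weight it by sqrt(cos(latitude))."""
    field = np.asarray(field, dtype=float)
    anomalies = field - field.mean(axis=1, keepdims=True)
    normalized = anomalies / anomalies.std(axis=1, keepdims=True)
    weights = np.sqrt(np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float))))
    return normalized * weights[:, None]


def mca(q, p):
    """SVD of the cross-covariance of q (i x n) and p (j x n), following Eqs. (1)-(2)."""
    n = q.shape[1]
    cov = q @ p.T / (n - 1)                      # C(Q, P), shape (i, j)
    u, sigma, vt = np.linalg.svd(cov, full_matrices=False)
    a = u.T @ q                                  # a_m(n) = q_m^T Q_(n)
    b = vt @ p                                   # b_m(n) = p_m^T P_(n)
    scf = sigma**2 / np.sum(sigma**2)            # squared covariance fraction per mode
    return u, vt.T, sigma, a, b, scf


# Synthetic example: 6 LAI pixels and 5 NEP pixels sharing one 34-year signal.
rng = np.random.default_rng(42)
years = 34
signal = rng.normal(size=years)
lai = np.outer(np.linspace(0.5, 1.5, 6), signal) + 0.1 * rng.normal(size=(6, years))
nep = np.outer(np.linspace(1.0, 2.0, 5), signal) + 0.1 * rng.normal(size=(5, years))
q = standardize_and_weight(lai, [35.0, 45.0, 55.0, 65.0, 75.0, 50.0])
p = standardize_and_weight(nep, [40.0, 50.0, 60.0, 70.0, 55.0])

modes_q, modes_p, sigma, a, b, scf = mca(q, p)
coupling = float(np.corrcoef(a[0], b[0])[0, 1])  # coupling strength of the leading mode
```

Here `scf[0]` corresponds to the squared covariance fraction of the leading mode, and `coupling` to the correlation between the paired time expansion coefficients.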

Machine learning method and interpretability

Extreme Gradient Boosting (XGBoost) is an advanced supervised tree-based model that has shown excellent performance in carbon flux prediction50,51. Owing to its flexible framework50 and inherent interpretability52, XGBoost is a suitable method for the task of this study. We used two factors describing spring phenology and greenness (namely SOS and LAI), summer LAI and seven hydrometeorological controlling factors (namely Temp, Prec, SM, VPD and Srad in summer, as well as Temp and Prec in spring) as model inputs, and NEPsummer as the target variable. To obtain the optimal model configuration, we randomly split all samples into four parts: training set (75%), validation set (10%), optimization set (10%) and test set (5%). We first trained an initial model and used the validation set for early stopping. Then, we applied a specific incremental search algorithm to tune the hyperparameters with the optimization set, and evaluated the model performance in the testing phase. Here, we embedded an explainable approach (i.e., SHAP algorithm27) into the XGBoost model to deepen our understanding of interseasonal ecosystem carbon feedback. In particular, we applied TreeSHAP26 and integrated it into XGBoost modeling. TreeSHAP builds on previous model-agnostic work based on classic game theory (Shapley values), aiming to improve the interpretability of tree-based models26,53. From the perspective of application in machine learning, interpretability means that the positive or negative effect of each input feature on the model output can be determined. Generally, we used the SHAP values to decipher the impact of each model feature on the target prediction. For instance, the negative SOS anomaly inputs (Fig. 4e) produce positive SHAP values accompanied by positive NEPsummer anomalies, which indicates beneficial signals captured by the model between the input and output.
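
To make the sign interpretation of SHAP values concrete, the sketch below computes exact Shapley values by brute-force enumeration for a toy three-feature model. It is not TreeSHAP and does not use XGBoost: it assumes a single reference (baseline) input to represent absent features, and the toy model, its coefficients and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take their actual value, the rest the baseline.
                with_i = [x[k] if (k in subset or k == i) else baseline[k] for k in range(n)]
                without_i = [x[k] if k in subset else baseline[k] for k in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_nep(features):
    """Hypothetical NEP model: more spring LAI and an earlier SOS both raise NEP."""
    lai_spring, sos, temp_summer = features
    return 30.0 * lai_spring - 0.4 * sos + 2.0 * temp_summer + 0.05 * lai_spring * temp_summer


baseline = [1.0, 120.0, 15.0]   # long-term mean conditions (illustrative)
sample = [1.4, 110.0, 16.0]     # greener spring, SOS 10 days earlier, warmer summer
phi = shapley_values(toy_nep, sample, baseline)
```

In this toy case the negative SOS anomaly receives a positive attribution, mirroring the interpretation given above, and the attributions sum to the difference between the prediction for the sample and for the baseline.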

In the sensitivity analysis, we calculated the mean values of each variable from 1982 to 2015, and perturbed the ESG features by adding one std to LAIspring and subtracting one std from SOS. Then, the overall tree model was run to predict NEP (NEPpre) in the different situations. The sensitivity of NEP to LAIspring is calculated using Eq. (3), and the same applies to SOS. We also calculated the marginal effects (ME) to assess the changes in the summer carbon sink induced by ESG, according to Eq. (4).

$${{NEP}}_{{sensitivity}}=\frac{{{NEP}}_{{pre},+1{std}\, \left({LAI}\right)}-{{NEP}}_{{pre}}}{{{std}}_{({LAI})}}$$
(3)
$${ME}=\frac{{{NEP}}_{{pre},+1\, {std}({LAI})\, {{\& }}\, -1{std}({SOS})}-{{NEP}}_{{pre}}}{{{std}}_{({NEP})}}$$
(4)
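
Equations (3) and (4) translate directly into code. The sketch below uses a hypothetical linear stand-in for the trained tree model so that the arithmetic can be checked by hand; the `predict` callable, the feature names and all numerical values are assumptions for illustration only.

```python
def nep_sensitivity(predict, mean_inputs, std_lai):
    """Eq. (3): change in predicted NEP per unit LAIspring for a +1 std perturbation."""
    base = predict(mean_inputs)
    perturbed = dict(mean_inputs, lai_spring=mean_inputs["lai_spring"] + std_lai)
    return (predict(perturbed) - base) / std_lai


def marginal_effect(predict, mean_inputs, std_lai, std_sos, std_nep):
    """Eq. (4): NEP change for +1 std LAIspring and -1 std SOS, in units of std(NEP)."""
    base = predict(mean_inputs)
    perturbed = dict(
        mean_inputs,
        lai_spring=mean_inputs["lai_spring"] + std_lai,
        sos=mean_inputs["sos"] - std_sos,
    )
    return (predict(perturbed) - base) / std_nep


def toy_predict(inputs):
    """Hypothetical stand-in for the overall tree model."""
    return 40.0 * inputs["lai_spring"] - 0.5 * inputs["sos"] + 10.0


mean_inputs = {"lai_spring": 1.0, "sos": 120.0}   # 1982-2015 means (illustrative)
sensitivity = nep_sensitivity(toy_predict, mean_inputs, std_lai=0.2)
me = marginal_effect(toy_predict, mean_inputs, std_lai=0.2, std_sos=6.0, std_nep=5.0)
```

For this linear toy model the sensitivity recovers the LAI coefficient, while ME expresses the joint response as a fraction of the NEP standard deviation, which makes pixels with different flux magnitudes comparable.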