Estimating changes of forest carbon storage in China for 70 years (1949–2018)

In forest resource inventory and monitoring, stand-level biomass and carbon models are crucial. In China, they form the basis for estimating national and international forest carbon storage. This study used data from 52,700 permanent plots in the 9th National Forest Inventory (NFI) of China to develop such models. After computing biomass and carbon storage per hectare using tree-level models for 34 species groups, we developed volume-derived biomass and carbon storage models for 20 forest types. Applying these models to historical data reveals a notable decline in China's forest carbon storage to 4.90 Pg by the late 1970s, caused by aggressive forest exploitation. Subsequent conservation and afforestation campaigns have effected a recovery, culminating in a storage of 8.69 Pg by the 9th NFI. Over the past 40 years, China's forest carbon storage has increased by 3.79 Pg, split between natural forests (2.25 Pg) and planted forests (1.54 Pg). In benchmarking against three pre-existing sets of models, we found clear biases, underscoring the need for larger modeling sample sizes. Overall, our models are an important step towards accurately gauging changes in forest carbon storage in China, both regionally and nationally.

Across the nation, a total of 52,700 effective plots were documented, all with a forest volume greater than 0. For each plot, the forest volume, biomass (including both above- and below-ground biomass but excluding the biomass of understory shrub and herbaceous layers), and carbon storage per hectare were calculated. This calculation used the one-variable tree volume and biomass models and the carbon factors of the primary tree species [9-21].
For modeling and validation, the plots were systematically divided: two-thirds of the plots were designated for modeling, and the remaining one-third was allocated for validation. Table 1 gives the basic data for the modeling samples and validation samples across the 20 forest types.
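The two-thirds/one-third division can be sketched as follows. This is a minimal illustration only: the text does not state how plots were assigned to each subset, so the random assignment, the seed, and the function name `split_plots` are assumptions.

```python
import numpy as np

def split_plots(n_plots, model_fraction=2 / 3, seed=0):
    """Assign plot indices to modeling and validation subsets.

    Random assignment is an assumption; the source only states the 2:1 ratio.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_plots)           # shuffled plot indices
    n_model = round(n_plots * model_fraction)  # size of the modeling subset
    return order[:n_model], order[n_model:]

model_idx, valid_idx = split_plots(52_700)
print(len(model_idx), len(valid_idx))
```

The plot counts reported later in the text (35,120 for modeling and 17,580 for validation) differ slightly from an exact global 2:1 split, which this sketch does not attempt to reproduce.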

Method
Model development
Forest stand biomass is intrinsically tied to its volume, and this relationship has been extensively explored through volume-derived biomass models in prior studies [27,28,32,35,38,45-48]. Fang et al. [30], in their research on 21 Chinese forest types, established a linear relationship between forest stand biomass and volume stock. This linearity is further corroborated by the scatterplot of forest biomass against volume per hectare across the 52,700 sample plots (Fig. 1).
While total biomass provides valuable insights, it is often essential to distinguish between above- and below-ground biomass. The ratio of the latter to the former is termed the root-to-shoot ratio (RSR), which varies across forest types. Once total forest biomass has been estimated, the next step is to calculate forest carbon storage by multiplying the biomass by an average carbon factor, typically either 0.5 or 0.47 [2,29,30]. However, distinct tree species and forest types may have different carbon factors.
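As a small worked illustration of these two steps, the sketch below splits total biomass using an RSR and converts it to carbon with a carbon factor. The function names and the input values (120 t/ha, RSR = 0.25) are hypothetical and chosen only for the example.

```python
def partition_biomass(total_biomass, rsr):
    """Split total biomass into above- and below-ground parts.

    RSR is below-ground / above-ground biomass, so total = above * (1 + RSR).
    """
    above = total_biomass / (1.0 + rsr)
    below = total_biomass - above
    return above, below

def carbon_storage(total_biomass, carbon_factor=0.5):
    """Carbon storage as biomass times an average carbon factor (0.5 or 0.47)."""
    return total_biomass * carbon_factor

# Hypothetical stand: 120 t/ha total biomass, RSR of 0.25
above, below = partition_biomass(120.0, rsr=0.25)
carbon = carbon_storage(120.0, carbon_factor=0.5)
print(above, below, carbon)
```

Swapping the default factor for 0.47, or for a species-specific value, changes only the last step.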
Given the recursive relationship between total biomass and either above-ground biomass or carbon storage, this study employed simultaneous equations with error-in-variables, a method previously used for tree-level modeling [46-48]. The models are expressed as Eqs. (1) to (3), in which B_T is total biomass per hectare (t/ha), B_A is above-ground biomass (t/ha), C is carbon storage (t/ha), V is the stand volume (m³/ha), a_0, b_0, c_0 and d_0 are the model parameters, and ε_1, ε_2 and ε_3 are error terms, which are assumed to be normally distributed with zero mean.
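One system that is consistent with the symbol definitions above can be written as follows. The allometric (power) form for B_T and the proportional forms for B_A and C are assumptions made here for illustration, since the printed equations are not reproduced in this text; only the roles of the symbols are taken from the source.

```latex
\begin{align}
B_T &= a_0 V^{b_0} + \varepsilon_1 \tag{1}\\
B_A &= c_0 B_T + \varepsilon_2 \tag{2}\\
C   &= d_0 B_T + \varepsilon_3 \tag{3}
\end{align}
```

Under this assumed form, c_0 would be the above-ground share of total biomass and d_0 the carbon factor, and the estimated B_T enters the second and third equations as an explanatory variable, which is what makes the system recursive.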
By dividing Eq. (1) by V, a stand biomass conversion factor (BCF) model is derived as Eq. (4). Here, BCF combines three parameters: basic wood density (WD), biomass expansion factor (BEF) and the root-to-shoot ratio (RSR). This is consistent with the IPCC Guidelines for national greenhouse gas inventories [2], in which BCF = WD × BEF × (1 + RSR). The d_0 parameter in Eq. (3) corresponds to the carbon factor (CF). Additionally, the RSR can be derived from the c_0 parameter in Eq. (2), as given in Eq. (5).
Given the heteroscedasticity of the forest biomass, carbon storage and forest volume data, the study recommends the weighted regression method [45]. The weight function employed in this analysis was w = 1/V^0.5. Using the ordinary least squares (OLS) method without accounting for this heteroscedasticity could introduce biases, because homoscedasticity is one of the foundational assumptions of OLS. In addition, given the interdependence between total biomass and above-ground biomass or carbon storage, simultaneous equations with error-in-variables are needed to fit the models in Eqs. (1) through (3) accurately [45-48].
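The derived quantities and the weighting can be sketched in a few lines. Several things here are assumptions rather than the study's procedure: the power form B_T = a_0·V^b_0, the reading of c_0 as the above-ground share of total biomass, and the use of plain single-equation weighted least squares in place of the simultaneous error-in-variables estimator. All data values are hypothetical.

```python
import numpy as np

def bcf(volume, a0, b0):
    """Biomass conversion factor B_T / V under the assumed form B_T = a0 * V**b0."""
    return a0 * np.asarray(volume, dtype=float) ** (b0 - 1.0)

def rsr_from_c0(c0):
    """Root-to-shoot ratio if c0 is the above-ground share of total biomass (assumption)."""
    return (1.0 - c0) / c0

def weighted_slope(x, y, volume):
    """Weighted least-squares slope of y = d * x (no intercept).

    Each residual is multiplied by w = 1 / sqrt(volume) before squaring.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.sqrt(np.asarray(volume, dtype=float))
    return np.sum((w * x) * (w * y)) / np.sum((w * x) ** 2)

# Hypothetical stands: volume (m3/ha), total biomass (t/ha), carbon (t/ha)
v = np.array([50.0, 100.0, 200.0])
b = np.array([60.0, 110.0, 205.0])
c = 0.5 * b
d0_hat = weighted_slope(b, c, v)  # estimate of the carbon factor
print(d0_hat, rsr_from_c0(0.8), bcf(100.0, a0=1.2, b0=1.0))
```

Down-weighting by 1/sqrt(V) keeps high-volume plots, whose residuals tend to be larger, from dominating the fit.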

Model evaluation
Six indices were used to evaluate the models: coefficient of determination (R²), standard error of the estimate (SEE), total relative error (TRE), average systematic error (ASE), mean prediction error (MPE), and mean percentage standard error (MPSE) [47-49]. TRE, ASE, MPE and MPSE are computed from the observed values y_i, the estimated values ŷ_i, the mean of the observed values ȳ, the number of plots n, and t_α, the t-value at confidence level α. The six indices were calculated for each developed model and used for model evaluation.
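A sketch of the six indices is given below. The formulas are the definitions commonly used for these indices in the forest biomass modeling literature and are an assumption here, because the printed equations are not reproduced in this text. The default `t_value=1.96` approximates the 95% t-value for large n and is also an assumption, as are the function name and the sample numbers.

```python
import numpy as np

def evaluation_indices(y, y_hat, n_params, t_value=1.96):
    """Return R2, SEE, TRE, ASE, MPE and MPSE (the last four in percent)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    resid = y - y_hat
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    see = np.sqrt(np.sum(resid ** 2) / (n - n_params))
    tre = np.sum(resid) / np.sum(y_hat) * 100.0             # total relative error
    ase = np.mean(resid / y_hat) * 100.0                    # average systematic error
    mpe = t_value * (see / y.mean()) / np.sqrt(n) * 100.0   # mean prediction error
    mpse = np.mean(np.abs(resid / y_hat)) * 100.0           # mean percentage standard error
    return {"R2": r2, "SEE": see, "TRE": tre, "ASE": ase, "MPE": mpe, "MPSE": mpse}

# Hypothetical observed and estimated biomass values (t/ha)
idx = evaluation_indices([11.0, 22.0, 33.0], [10.0, 20.0, 30.0], n_params=1)
print(idx)
```

In this form MPE shrinks with the square root of n, so larger samples give smaller prediction error for the same scatter.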

Results
Data from 35,120 plots spanning 20 forest types were used to fit models (1) through (3) using simultaneous equations with error-in-variables. The fitting results and evaluation statistics for model (1) are presented in Table 2. The standard error of the estimate (SEE) for models (2) and (3) differed substantially from that of model (1). For the other five evaluation indices, however, only marginal differences were found, and these indices have therefore been omitted from Table 2 for brevity.
The evaluation metrics in Table 2 lead to several observations about model performance. The coefficient of determination R² was consistently above 0.87, showing that the models have high explanatory power. The total relative error (TRE) was close to zero, indicating minimal overall discrepancy between observed and predicted values. The average systematic error (ASE) was within ± 7% for all forest types, and within ± 5% for 18 of the 20. The mean prediction error (MPE) was below 3% for all types, and below 1% for eight forest types. The mean percentage standard error (MPSE) fell between 10% and 20% for most forest types; only one forest type had an MPSE exceeding 20%, while three types had values under 10%.
In a subsequent phase of analysis, data from the 17,580 plots designated as validation samples in Table 1 were used for independent validation. The resulting TRE and ASE values are given in Table 3. For model (3), the TRE exceeded ± 3% only for the 'other coniferous' category. In all other instances, the ASE for each forest type remained within ± 5%.
Taken together, these metrics show that the stand biomass and carbon storage models for the 20 forest types perform well, and that they provide an accurate and robust framework for estimating forest biomass and carbon storage at the stand level.

Data collection
The dataset for this study is primarily sourced from the records of the nine National Forest Inventories (NFI) as well as forest area and volume data documented prior to the first NFI [50-55]. This compilation encompasses provincial data for dominant tree species or forest types, along with specific datasets for bamboos and sparse forest.

Table 2. The parameter estimates and evaluation indices of biomass and carbon models for 20 forest types. All parameter estimates are significant at the level α = 0.01. Parameter d_0 is equivalent to the average carbon factor (CF) of each forest type, and RSR is the average root-to-shoot ratio resulting from Eq. (5).

Estimation by classification
The definition of a forest changed twice between the 1st and the 9th NFI. First, before the 5th NFI began, the canopy closure threshold was revised from above 0.3 (excluding 0.3, akin to more than 35%) to above 0.20. Second, just before the 6th NFI concluded, specially defined shrubs were included in the forest area and forest coverage measurements [56].
To align the data with global norms, the study adopted the FAO's forest definition [57,58]. Accordingly, only arboreal forests, bamboos, and rubber-woods were included, while the specially defined shrubs were left out. The canopy closure standard adopted for forests was more than 10%. Sparse forests from prior NFIs were incorporated, defined by canopy closures of 10-35% (1st to 4th NFI) and 10-19% (5th to 9th NFI).
The estimation proceeded by category:
1) Arboreal forest carbon storage. Using the models developed above for the 20 forest types, biomass and carbon storage were estimated from the area and volume of each forest type.
2) Bamboos' carbon storage. For the 9th NFI, biomass and carbon storage were computed using the individual bamboo plant biomass model and a carbon factor of 0.5. For previous NFIs, they were computed by the proportional method, based on bamboo area.
3) Sparse forest carbon storage. Biomass and carbon storage were determined by the proportional method, based on historical data on the volume proportion of sparse to arboreal forests.
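The two calculation patterns in this list, applying a stand model to area and volume totals and scaling a reference value by a proportion, can be sketched as follows. The power form of the biomass model, the use of mean volume per hectare as the model input, the function names and all numbers are assumptions for illustration only.

```python
def arboreal_carbon(area_ha, volume_m3, a0, b0, d0):
    """Biomass and carbon of one forest type from its total area and volume.

    Assumes the stand model B_T = a0 * V**b0 (t/ha) applied to the mean
    volume per hectare, and carbon = d0 * biomass.
    """
    v_per_ha = volume_m3 / area_ha
    biomass = a0 * v_per_ha ** b0 * area_ha
    return biomass, d0 * biomass

def proportional_estimate(reference_value, reference_basis, target_basis):
    """Scale a reference biomass or carbon value by the ratio of two bases.

    The basis is area for bamboos and volume for sparse forest.
    """
    return reference_value * target_basis / reference_basis

# Hypothetical forest type: 100 ha holding 10,000 m3, with a0 = 1.0, b0 = 1.0, d0 = 0.5
biomass, carbon = arboreal_carbon(100.0, 10_000.0, a0=1.0, b0=1.0, d0=0.5)
# Hypothetical bamboo carbon of 10 units on 200 ha, scaled to an earlier area of 100 ha
bamboo_prev = proportional_estimate(10.0, 200.0, 100.0)
print(biomass, carbon, bamboo_prev)
```

The proportional method implicitly assumes that biomass per unit of the basis is constant between the reference inventory and the earlier one.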

Treatment of incomparable data
Portraying the changes in forest carbon storage across periods accurately required some recalibration. These adjustments were built on the data from earlier NFIs and guided by a thorough evaluation of data comparability and consistency.
1) Tibet's forest data. Until the 6th NFI, Tibet had been surveyed for its forests only twice (in 1977 and 1991). The forest area and volume from those surveys were markedly lower than the 2001 records. Because forest resources are inferred to have declined only mildly before 2001, data from the 1st to the 5th NFI, along with the 1949 and 1962 datasets, were adjusted accordingly.
2) Taiwan's forest data. Four forest inventories had been carried out in Taiwan, in 1957, 1976, 1992, and 2012. In the national statistics, the 1976 dataset represented the 2nd, 3rd, and 4th NFIs, and the 1992 data represented the 5th through the 8th NFIs. The 1957 and 2012 records corresponded to the 1st and 9th NFIs, respectively. To reflect the changes in forest carbon storage objectively, interpolation and extrapolation based on the four Taiwan inventories were applied to the NFI datasets.
3) Other data amendments. The 2nd NFI marked the adoption of the Continuous Forest Inventory (CFI) methodology, so the 1st NFI data, as well as the 1949 and 1962 records, are not fully comparable with later data. Therefore, using the dynamic datasets of the 2nd and 3rd NFIs together with the trend analysis for 1949 and 1962 [50], certain figures from the 1st NFI and those two years were revised and supplemented.
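The interpolation step for Taiwan can be illustrated with a short sketch. Linear interpolation is an assumption, since the text does not state which interpolation or extrapolation technique was used, and the inventory values below are hypothetical placeholders rather than actual data.

```python
import numpy as np

# The four Taiwan inventory years named in the text
inventory_years = np.array([1957.0, 1976.0, 1992.0, 2012.0])
# Hypothetical values of some forest metric in those years (placeholders)
inventory_values = np.array([1.0, 2.0, 3.0, 4.0])

def interpolate_inventory(year):
    """Linearly interpolate the metric for a year between two inventories.

    np.interp holds the end values constant outside 1957-2012, so true
    extrapolation would need a separate rule.
    """
    return float(np.interp(year, inventory_years, inventory_values))

# 1984 is midway between the 1976 and 1992 inventories
mid_value = interpolate_inventory(1984)
print(mid_value)
```

Assigning each NFI period its own interpolated value, instead of reusing the 1976 or 1992 figure for several NFIs, avoids artificial plateaus in the national series.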

Results
Using the estimation methods for biomass and carbon storage described above, together with the treatment of incomparable data, we traced the changes in forest metrics in China over the 70 years from 1949 to 2018. We also traced the changes in metrics specific to planted forests from the 2nd to the 9th NFI, covering approximately 40 years from 1977 to 2018, as detailed in Table 4.
Table 4 shows that China's forest resources declined from the early years after the establishment of the People's Republic of China until the late 1970s. Carbon storage mirrors this trend, beginning at 5.89 Pg and decreasing to its lowest recorded level of 4.90 Pg because of extensive forest cutting for rapid economic and social development. As a developing nation at the time, China needed large volumes of wood for its economic progress [50]. After the reform era began, plantations expanded again. With the large-scale afforestation initiatives across the nation, the carbon storage of these plantations rose from only 0.15 Pg in the late 1970s to 1.69 Pg by the 9th NFI. A further acceleration came around the end of the twentieth century. As China rolled out pivotal forestry programs, such as natural forest protection and the conversion of farmland back to forest, the growth of forest resources sped up markedly. Overall, carbon storage rose from its low of 4.90 Pg in the late 1970s to the highest value of 8.69 Pg, recorded in the 9th NFI.
Over the entire 70-year span, both forest volume and carbon storage follow a distinctive U-shaped trajectory, as shown in Fig. 2. Over the last four decades alone, forest carbon storage increased by 3.79 Pg. Of this, natural forests contributed 2.25 Pg and planted forests 1.54 Pg, or 59% and 41%, respectively. Both afforestation and natural forest protection have contributed greatly to the growth of forest carbon storage in China.
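The increments and shares quoted above follow directly from the four storage values in the text, as this short check shows (variable names are illustrative).

```python
# Carbon storage values quoted in the text (Pg)
carbon_low, carbon_9th = 4.90, 8.69    # all forests: late 1970s, 9th NFI
planted_low, planted_9th = 0.15, 1.69  # planted forests: late 1970s, 9th NFI

total_gain = carbon_9th - carbon_low
planted_gain = planted_9th - planted_low
natural_gain = total_gain - planted_gain

natural_share = natural_gain / total_gain
planted_share = planted_gain / total_gain

print(round(total_gain, 2), round(natural_gain, 2), round(planted_gain, 2))
# prints: 3.79 2.25 1.54
print(round(natural_share * 100), round(planted_share * 100))
# prints: 59 41
```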

Discussions
Biomass models
In the introduction, we noted that only a few of the biomass models established by Fang et al. [30], specifically those for larch, Chinese fir, and Chinese pine, were based on more than 30 modeling samples. For comparison, we assessed the biomass models of Fang et al. [30], Wang et al. [32] and Zhang et al. [41] using the data from all sample plots of these three forest types, as shown in Table 5.
The TRE and ASE values in Table 5 reveal a noteworthy trend: as sample size increases, the precision of the three model sets generally improves. Broadly speaking, Zhang et al.'s models [41] perform better than those of Wang et al. [32], and Fang et al.'s models [30] have the largest errors. Compared with the parameters of the models developed in our study, Fang et al.'s models have larger intercept parameters but smaller slope parameters, leading to considerably abnormal ASE values, as illustrated in Fig. 3.
Several factors may explain these disparities. First, a biased estimation method could be a contributing factor: relying on ordinary regression instead of weighted regression, especially under heteroscedasticity, may produce such skewed outcomes. Second, both the size and the structural quality of the sample matter. Conventionally, for statistical hypotheses to be valid, the sample size should exceed 50. As shown by the key index in Eq. (9) [49], MPE decreases as sample size increases. The quality of sample structure is another crucial determinant, as discussed in other studies [42,47]. Other elements such as tree species, age, and regional variability can also affect model performance. For instance, when we divide larch forest plots into three regional subsets, outcomes from the regional models differ from those of the national model.
Notably, uncertainty in forest biomass and carbon stock estimation is not unique to China but is encountered globally. Sound modeling depends on collecting ample samples, using an appropriate parameter estimation method, and applying a diverse range of evaluation indices.
Figure 3 shows the residual errors of biomass model (1) alongside the other three biomass models for larch. Because the intercept parameter a_0 = 33.806 in Fang et al.'s model [30], a larch forest stand with a volume of 0 m³/ha has an estimated biomass of 33.806 t/ha. This inevitably leads to biases in biomass estimates for larch forest stands. It is evident that total biomass is consistently underestimated and that the estimates remain biased. This pattern mirrors the residual errors found in biomass models for other forest types such as Chinese fir and Chinese pine, which are not presented here to save space.
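The effect of a large intercept can be seen with a few lines of arithmetic. Only the intercept 33.806 comes from the text; the linear form B = intercept + slope·V follows the description of Fang et al.'s models as linear, and the slope value 0.6 and the function names are hypothetical placeholders.

```python
FANG_LARCH_INTERCEPT = 33.806  # t/ha, quoted in the text

def linear_biomass(volume, slope, intercept=FANG_LARCH_INTERCEPT):
    """Stand biomass (t/ha) from a linear volume-derived model."""
    return intercept + slope * volume

def implied_bcf(volume, slope, intercept=FANG_LARCH_INTERCEPT):
    """Biomass per unit volume implied by the linear model."""
    return slope + intercept / volume

# Hypothetical slope of 0.6 t per m3
biomass_at_zero = linear_biomass(0.0, slope=0.6)
for v in (10.0, 100.0, 300.0):
    print(v, linear_biomass(v, slope=0.6), implied_bcf(v, slope=0.6))
```

With a non-zero intercept the implied conversion factor grows without bound as volume approaches zero, so the bias is concentrated in low-volume stands.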

Carbon changes
To further elucidate the disparities between previous studies and our own, we examined the changes in China's forest carbon storage across various periods as estimated by Fang et al. [30], Zhou et al. [38] and Zhang et al. [40], and compared them with the data in Table 4 of our study (see Table 6).
Given that all four investigations derive from the NFI data, the differences among the results are relatively small, including for the changes across different periods estimated by Zhang et al. [40].
The trend in forest carbon storage from 1949 to 1998 determined by Fang et al. [30] aligns with that of our study, yet their values are systematically lower than ours. The authors neither supplied intermediate data nor described what was included or excluded. Beyond the use of different models, the causes of this underestimation may include the following: an exclusion of sparse forest in each period's data; a possible omission of bamboos; and a likely exclusion of both Taiwan and parts of Tibet.
In summary, the three studies above appear to have overlooked the effects of changes in forest definitions and of differences in statistical scope across periods when using NFI data. This oversight led to a systematic underestimation of forest carbon storage across various periods, with particularly pronounced underestimation in the earlier periods.

Conclusions
Using data from 52,700 permanent plots of the 9th NFI in China, we established biomass and carbon storage models for 20 forest types through simultaneous equations with error-in-variables. Furthermore, using forest area and volume data from different periods, we traced the changes in forest carbon storage over a 70-year span and in planted forest carbon storage over the past 40 years. From these results, several conclusions emerge:
1) The volume-derived biomass models for 20 forest types developed in our study showed robust predictive capability. With an R² exceeding 0.87 and an MPE under 3%, these models offer a basis for accurate estimation of the status and changes of forest carbon storage, both nationally and regionally.
2) We assessed the validity of three existing sets of biomass models using our dataset. Although all of them exhibited notable biases, their prediction accuracy improved as the modeling sample size grew.
3) Historical data reveal that China's forest carbon storage stood at 5.89 Pg in 1949, dropped to a low of 4.90 Pg by the late 1970s, and subsequently rose to 8.69 Pg by the 9th NFI. Over the past seven decades, the trajectory of China's forest carbon storage is best described as U-shaped.
4) Over the last 40 years, China's forest carbon storage has increased by 3.79 Pg, comprising 2.25 Pg in natural forests and 1.54 Pg in planted forests. This trend underscores the pivotal roles of both afforestation and natural forest protection in increasing China's forest carbon storage.

Figure 2. The change trend of forest volume and carbon storage in China.

Table 1. The basic data pertaining to the modeling samples and validation samples across the 20 forest types.

Table 3. The independent validation results of biomass and carbon models for 20 forest types.

Table 4. The estimation results of forest biomass and carbon storage in different periods in China. FA, FV, FB, FC, and FD are forest area, forest volume, forest biomass, forest carbon, and forest carbon density, respectively; and PA, PV, PB, PC, and PD are planted forest area, planted forest volume, planted forest biomass, planted forest carbon, and planted forest carbon density, respectively. Forest is the land with a tree canopy closure of more than 10%, including bamboos and rubber-woods, excluding specially defined shrubs.

Table 5. The comparison of estimation results of different biomass models for three forest types.

Zhang et al.'s [40] estimates largely mirror our findings, although their values are systematically lower. Besides the models' inherent negative bias, several distinctions can be noted. First, Zhang et al.'s forest data for 1949 and 1950-1962 omitted Tibet. Second, their data for 1977-1981, 1984-1988, 1989-1993, and 1994-1998 only partially included Tibet. Third, their data across all periods excluded both bamboos and sparse forest. Likewise, Zhou et al.'s [38] forest carbon storage figures across eight NFIs in China are consistently lower than ours, and their lowest value occurs not in 1977-1981 but in 1973-1976, a discrepancy in the observed trend. Beyond the different methodologies, this difference can also be traced to certain contrasts with our study: Zhou et al. restricted their scope to the mainland, excluding Taiwan; only partially included Tibet in the data from the 1st to the 5th NFI; and omitted bamboos and sparse forest from each period's data.

Table 6. The comparison of estimation results of forest carbon storage changes from different sources.