Thresholds in aridity and soil carbon-to-nitrogen ratio govern the accumulation of soil microbial residues

Microbial necromass remaining after microbial biomass turnover (microbial residues) contributes to the formation and stabilization of soil carbon pools; however, the factors influencing the accumulation of these residues on a global scale remain unclear. Here, we synthesized data for 268 amino sugar concentrations (biomarkers of microbial residues) in grassland and forest ecosystems for meta-analysis. We found that soil organic carbon, soil carbon-to-nitrogen ratio, and aridity index were key factors that predicted microbial residual carbon accumulation. Thresholds in aridity index and soil carbon-to-nitrogen ratio were identified (~0.768 and ~9.583, respectively), above which microbial residues decreased sharply. The aridity index threshold was associated with the humid climate range. We suggest that the soil carbon-to-nitrogen ratio threshold may coincide with a sharp decrease in fungal abundance. Although dominant factors vary between ecosystem and climate zone, with soil organic carbon and aridity index being important throughout, our findings suggest that climate and soil environment may govern microbial residue accumulation. Aridity, soil nutrient stoichiometry and bulk carbon content are reliable predictors of microbial residue accumulation in soil carbon pools, according to a meta-analysis of amino sugar concentrations in grasslands, woodlands and forests.

Soil is the largest carbon (C) reservoir in terrestrial ecosystems. Small changes in the C budget can cause large fluctuations in atmospheric carbon dioxide concentrations, which, in turn, can have profound impacts on the structure and function of terrestrial ecosystems [1][2][3] . As native inhabitants of soil, microorganisms can regulate soil C dynamics through catabolism and anabolism [4][5][6][7] , and the role of microbial anabolism in transforming soil organic matter into more stable forms is increasingly highlighted 6 . Microorganisms take up plant residues and use part of them to build microbial biomass 6,8 . After the death of microorganisms, microbial-derived C (in vivo turnover products, including dead microbial residues and some metabolites) 6 is stabilized in the soil by association with minerals or occlusion in aggregates 6,9 , which can resist disturbance from external factors over longer residence times. This process results in the preservation of large amounts of underground microbial-derived C. Some studies have shown that microbial residual C accounts for a large proportion (more than 50%, up to 80%) of the soil organic carbon (SOC) pool [10][11][12][13][14][15] . Therefore, although microbial biomass C only accounts for 2-4% of SOC 11,16 , the contribution of microbial residues to the SOC cannot be ignored. The relatively stable storage of microbial residues contributes to the persistence of organic C in soil 6,11 . Consequently, it is essential to study the accumulation patterns of microbial residues to enhance our understanding of C sequestration in terrestrial ecosystems 6,11,17,18 .
Microbial communities are highly sensitive to environmental change 19,20 . External factors, including soil properties and climate change, can affect the transfer of microbial metabolites to the soil and their stabilization by influencing microbial physiological characteristics, such as microbial growth rate and growth efficiency, as well as biochemical characteristics [21][22][23] . For example, under limited soil moisture conditions, both microorganism growth and activity are weakened 24 , and the transformation of plant residues into microbial biomass and then into microbial residues is decreased 25,26 . Under increased soil moisture conditions, the C utilization efficiency (CUE) of microorganisms and the microbial residue accumulation efficiency should improve. However, excessive soil moisture would lead to oxygen limitation, which would decrease microbial substrate utilization efficiency, and consequently reduce microbial necromass accumulation 27 .
Multiple studies have shown that low CUE limits the synthesis of microbial biomass 28,29 . The higher the microbial growth rate and growth efficiency, the higher the rate of accumulation of microbial residues 30 . However, with increased microbial activity, the ability to decompose and utilize SOC (plant-derived or microbial-derived C 31 , etc.) also increases 29 . Consequently, there may not be a simple linear relationship between microbial physiological characteristics and the C sequestration they mediate 23 . Considering the influence of environmental change on the internal characteristics of microbial communities 19,20 , there is potentially an optimal range of environmental conditions for maximizing the accumulation of microbial residues. Determining the optimal environmental conditions for microbial growth with the greatest microbial residue accumulation and the least SOC decomposition could facilitate the management of C sequestration in soil. However, the optimal environmental conditions for microbial C accumulation remain poorly understood.
The remains of dead microorganisms (microbial residues) can be traced using amino sugars 32,33 , given that only a small proportion of amino sugars in the soil is associated with microbial biomass, and plants do not contain amino sugars 34,35 . In addition, different amino sugars have been linked to specific microbial populations 35,36 . Therefore, using amino sugars as a biomarker to characterize microbial residues facilitates the characterization of SOC sources and the estimation of their stabilization potentials 18,32,37,38 . In the majority of studies, only four types of amino sugars (glucosamine, galactosamine, mannosamine, and muramic acid) have been quantified 32 , among which glucosamine, galactosamine, and muramic acid are the most abundant 38 . Fungi produce most of the glucosamine, whereas bacteria produce muramic acid and only a small proportion of the glucosamine 32,38 .
Amino sugar biomarkers are increasingly being used to study the mechanisms underlying storage of microbial residues 26,39,40 . This enables extensive quantification of the global heterogeneity of soil amino sugar content and its predictors. Forest and grassland ecosystems account for approximately 30% 41 and 26% 42 of the earth's land surface area, respectively, and approximately 47% and 20% of the earth's land SOC content, respectively 42,43 . In the present study, we collected amino sugar data from forest and grassland ecosystems for meta-analysis. We found that most of the studies on soil microbial residues are carried out in the 0-10 or 0-20 cm mineral soil layers in these ecosystems. We synthesized 268 data points of microbial residues in the 0-20 cm soil layer of forest and grassland ecosystems (Fig. 1).
The aim of the present study was to investigate the relative importance of climate, geographical location, and soil physicochemical properties in predicting microbial residue accumulation, as well as to explore the variation in microbial residues with changes in environmental variables. We hypothesized that climate, geographical location, and soil physicochemical properties have different effects on microbial residue accumulation. We further hypothesized that their relative importance is in the order geographical location > climate > soil physicochemical properties, since differences in geographical location would lead to distinct climates, and different geographical location and climate conditions would have varying effects on soil physicochemical properties. Moreover, we speculated that the response patterns of microbial residues to different environmental variables are not always gradual. Hence, we hypothesized the occurrence of an optimal range or thresholds of environmental conditions for the accumulation of microbial residues.

Results
Geographical patterns of microbial residues. The concentrations of topsoil amino sugars ranged from 0.04 to 11.21 mg g −1 soil, with a mean of 2.25 ± 0.13 and a median of 1.68 mg g −1 soil (Fig. 1c). The Scheirer-Ray-Hare test, using ecosystem type and climate zone as fixed factors, revealed no significant difference in amino sugar concentrations between forests and grasslands (2.21 ± 0.22 and 2.29 ± 0.17 mg g −1 soil, respectively) (p = 0.111, Table 1). However, significant differences were detected between climatic zones (p = 0.019, Table 1). Among them, significantly higher amino sugar concentrations were detected in the temperate zone than in the subtropical zone (p = 0.031), while no significant difference in amino sugar concentrations was observed among other climatic zones. No significant interaction on amino sugar concentrations was found between vegetation types and climatic zones (p = 0.094, Table 1). Kruskal-Wallis tests across the five categories (subtropical grasslands, temperate grasslands, tropical forests, subtropical forests, and temperate forests) revealed significant differences in amino sugar concentrations between subtropical forests and temperate forests (p = 0.009) and between subtropical forests and temperate grasslands (p = 0.047, Fig. 1b). Globally, amino sugar concentrations were significantly correlated with SOC concentrations (Spearman's correlation R 2 = 0.490, p < 0.001, Fig. 2b).
Predictors of microbial residues on a global scale. The Random Forest model showed that all the environmental variables considered in this study were important for predicting the concentrations of amino sugars (Overall model: R 2 = 0.768, p < 0.001; environment variable p < 0.05, Fig. 2d). Our structural equation modeling (SEM) explained 65.4% of the variance in amino sugars (Fig. 2a). After the causal relationships between variables were taken into account, SOC, soil carbon-to-nitrogen (C:N) ratio and aridity index still had the largest standard total effects (STEs) (0.929, −0.289, −0.226, respectively, Fig. 2c).
Nonlinear responses of microbial residues to drought, soil carbon-to-nitrogen ratio, and soil organic carbon. Three important environmental variables (SOC, soil C:N ratio, and aridity index) suggested by SEM were selected for further analysis (Fig. 2c). Linear models, quadratic models, and generalized additive models (GAM) were fitted to determine the relationships between amino sugar concentrations and environmental variables. Comparisons of Akaike information criterion (AIC) values among the models showed that the nonlinear relationships fitted by the GAMs best described the relationships between amino sugars and aridity index, soil C:N ratio, and SOC concentration (Table 2 and Supplementary Fig. 1). We identified thresholds in aridity index (0.768; 0.57 after ln(x + 1) transformation) and soil C:N ratio (9.583; 2.26 after ln(x) transformation) for the accumulation of amino sugars (Fig. 3a, b). Above these thresholds, the concentrations of amino sugars decreased significantly (Fig. 3c, d). According to the generalized climate classification scheme of aridity index values formulated by the United Nations Environment Program in 1997, the aridity index threshold would lie within the humid climate class (aridity index >0.65; Supplementary Table 1). The threshold of soil C:N ratio was between the lower quartile and the lower decile, and was less than the mean C:N ratio (14.16 ± 0.35) observed in our study (Supplementary Fig. 2). For SOC, the curvature at the threshold did not affect the original trend, and there was a linear increase in accumulation of amino sugars with SOC (Fig. 4).
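As a quick arithmetic check on the reported thresholds, the sketch below recomputes the transformed values and places the aridity index threshold in a climate class. The function and constant names are illustrative. Only the humid cut-off (>0.65) comes from the text; the boundaries of the drier classes are the commonly cited UNEP values and are assumed here.

```python
import math

AI_THRESHOLD = 0.768   # aridity index threshold reported in the text
CN_THRESHOLD = 9.583   # soil C:N ratio threshold reported in the text

# Upper bounds of the drier UNEP (1997) classes. Only the humid cut-off
# (> 0.65) is quoted in the text; the other bounds are assumed from the
# commonly cited UNEP scheme. Boundary values are assigned to the drier
# class so that "humid" means strictly greater than 0.65.
UNEP_UPPER_BOUNDS = [
    (0.05, "hyper-arid"),
    (0.20, "arid"),
    (0.50, "semi-arid"),
    (0.65, "dry sub-humid"),
]


def aridity_class(aridity_index):
    """Return the UNEP-style climate class for an aridity index value."""
    for upper_bound, name in UNEP_UPPER_BOUNDS:
        if aridity_index <= upper_bound:
            return name
    return "humid"


# Transformations quoted alongside the thresholds.
ai_ln1p = math.log(AI_THRESHOLD + 1.0)   # ln(x + 1) for the aridity index
cn_ln = math.log(CN_THRESHOLD)           # ln(x) for the soil C:N ratio

if __name__ == "__main__":
    print(f"ln(AI + 1) = {ai_ln1p:.2f}, ln(C:N) = {cn_ln:.2f}")
    print("aridity class at threshold:", aridity_class(AI_THRESHOLD))
```

Note that a higher aridity index means wetter conditions, so a threshold in the humid class implies that residues decline on the wet side of the gradient.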
Predictors of microbial residues in different ecosystem types and climatic zones. Since the majority of forest and grassland study sites (87.5% and 99.3%, respectively; Fig. 1) are located in subtropical and temperate zones, we conducted independent analyses for subtropical forests, temperate forests, subtropical grasslands, and temperate grasslands. Our Random Forest models suggested that SOC and aridity index were the most important predictors of amino sugars in subtropical grasslands, temperate grasslands, subtropical forests, and temperate forests (environment variable: p < 0.05, Fig. 5).
Soil pH and C:N ratio also significantly predicted amino sugars in grasslands (p < 0.05, Fig. 5a, b), while the absolute latitude predicted amino sugars in forests (p < 0.05, Fig. 5c, d). In contrast to other categories, soil clay content was the most influential factor determining amino sugars in temperate forests (p < 0.05), while the effect of soil pH was not significant (Fig. 5). Our SEMs explained 88.2, 74.4, 74.6, and 62.4% of the variance in amino sugars in subtropical grasslands, temperate grasslands, subtropical forests, and temperate forests, respectively (Fig. 6). From the STEs of SEMs of all categories, we identified SOC as the main positive regulatory factor for amino sugars (Fig. 7). In grasslands, aridity index and soil pH were the next most important predictors (Fig. 7a, b), while the absolute latitude strongly predicted amino sugars in temperate grasslands (STE: 0.280, Fig. 7b). In subtropical forests, the effect of soil pH on amino sugars was second only to that of SOC, followed by soil C:N ratio and soil clay content (Fig. 7c). The soil C:N ratio in temperate forests showed a strong negative effect on amino sugars (STE: −0.272, Fig. 7d).
[Figure caption, partial] Correlations between environmental variables. c Standard total effect of SEM. d Random Forest model, which shows average predictive importance (mean square error (MSE) increase percentage) for each environmental factor for absolute content of soil amino sugars. SEM shows the causality and correlation between the absolute content of amino sugars in soil and the environmental variables. Red lines indicate positive effects; blue lines indicate negative effects. The thickness and color of the lines are directly proportional to the standardized path coefficient on the single arrow. SOC soil organic carbon. R 2 represents the variance of biomarkers explained by the model. Asterisks indicate the significance of each predictor, with one and two asterisks indicating p < 0.05 and p < 0.01, respectively.

Discussion
In terrestrial ecosystems, latitude significantly affects light intensity and thus photosynthesis in surface vegetation, resulting in the inhibition of soil C input and changes in soil C fluxes [44][45][46] .
Globally, although the amino sugars closely related to SOC had a significant correlation with latitude (p < 0.05, Fig. 2b), in the present study, the correlation coefficient between them was markedly low (R 2 = 0.044, Fig. 2b), and the SEM revealed low STE for latitude (0.030, Fig. 2c). This suggests a weak effect of latitude on the accumulation of amino sugars. However, we found that SOC, soil C:N ratio, and aridity index were the most robust predictors of amino sugars (Fig. 2c). Several studies have reported strong positive correlations between amino sugars and SOC 47,48 , with higher soil organic matter content associated with higher concentrations of soil microbial residues. Our results are consistent with these findings; although microbial necromass C accounts for a large proportion of SOC 10 , it does not completely determine the amount of SOC. However, SOC can be used as a predictor of microbial residue concentration. Aridity is a key driver of biological and geochemical processes 49,50 . Increasingly arid climate conditions have altered the elemental (C, N, etc.) balance in ecosystems 51 , resulting in decreased available soil C and N, poor soil nutrient supply, and increased N limitation on the growth of plants and soil microorganisms 51,52 . This could cause a decline in microbial function, thereby affecting ecosystem sustainability 53 . Additionally, drought conditions weaken the evaporative cooling effect of plant leaves, reinforcing the effects of high temperatures 54 . Such relationships between soil water and temperature may lead to the production of heat waves 55 , further threatening microorganism survival. Although SOC increased significantly with decreased aridity (increase in aridity index; Supplementary Fig. 3a), the amino sugar concentrations did not increase continuously (STE: −0.226, Fig. 2c). Therefore, no significant linear relationship was observed between aridity index and microbial residues (Supplementary Fig. 3b). 
It is known that an increase in aridity results in limitation of water availability 26,56 , reduction in soil hydraulic conductivity 57 , formation of disconnected resource islands 58 , and reduction in metabolic activity and substrate utilization efficiency of soil microbial communities 24,59 . Moreover, aridity can reduce the capacity of microorganisms as decomposers to utilize SOC and plant litter for growth and reproduction, thus reducing the efficiency of accumulation of microbial residues 25,60 . Drought conditions can also reduce the production of plant litter and root biomass, reducing plant C input 61 . The consequent lack of available substrates can in turn delay the production and accumulation of microbial residues.
If the global climate situation improves and aridity stress decreases, soil water content could increase, and ecosystem processes could gradually recover. In such a context, the content of plant-derived C transformed into microbial-derived C by microbial utilization gradually increases, and the efficiency of transfer and stabilization of microbial residues to soil increases 6,26 . However, extremely humid climates and active microorganisms are not conducive to the accumulation of SOC (plant-derived or microbial-derived C, etc.). The movement of oxygen in the soil is limited by extremely high soil moisture content 58 . Under such conditions, the growth efficiency of microorganisms is reduced, resulting in lower microbial CUE, which in turn affects the production, accumulation, and stabilization of microbial residues in the soil.
Such environments are more disadvantageous to fungal communities. Because fungi are aerobic, whereas many bacteria can survive under anaerobic conditions 62 , microbial community structure shifts toward bacterial dominance under humid climates 61 . However, bacterial residues are more easily decomposed than fungal residues, explaining the gradual decrease in microbial residues 27 . Additionally, under wet conditions, the fast leaching channels are reused first 61 , and the leaching loss of available N entails additional N requirement by surface soil microorganisms to meet their growth needs 61 . The reduction in microbial growth efficiency and CUE reduces the efficiency of microbial-derived C stabilization. Moreover, soil microbial residues may move to deeper layers with leaching, sharply reducing microbial residues in surface soil.
Aridity indirectly affects the accumulation of microbial residues through changing the soil C:N ratio 61 . Changes in microbial residues with changes in this ratio occur after an irreversible critical point, which may also be considered a threshold. C and N are closely related to the growth and development of microorganisms in the biogeochemical cycle 63 . In an ecosystem with limited nutrients and resources, microorganisms put more energy into catabolism through the synthesis of extracellular enzymes, decomposing complex organic matter and releasing compounds suitable for plants and microorganisms 64 , rather than on anabolism (microbial biomass synthesis) 65 . The efficiency of microbial anabolism is weakened, thereby reducing microbial CUE 66 , the production efficiency of microbial-derived C, and the efficiency of long-term slow-cycling microbial-derived C accumulation in soil 67 , as well as ultimately reducing the relative contributions of microbial residues to SOC 68 . The contributions of microbial-derived C to the SOC pool could be enhanced by application of fertilizers with high N content to alleviate environmental pressure 69,70 . Excluding the case of intensive traditional farming activities, environments with sufficient nutrient supply can be considered to accelerate the accumulation of soil microbial residues 68,71 . Hence, the proportion of microbial residues in SOC is higher in soils with low C:N ratios 37,72 due to relatively high soil nutrient content enabling greater anabolism in microorganisms. Moreover, unstable substrates (organic matter with low C:N ratio) can improve the CUE of microorganisms, which is conducive to the formation and accumulation of microbial necromass in mineral soils 52,73 .
We found that soil nutrient supply did not continue to promote the accumulation of microbial residues above a certain level (Fig. 3b). This may be explained by the effect of the soil C:N ratio on the bacteria:fungi ratio 74 . The distribution of fungi is strongly limited by the availability of resources, with low nutrition and acidic environments more conducive to bacterial than fungal growth 74 . However, fungal necromass constitutes the majority of microbial residues (Supplementary Fig. 4 and Supplementary Data 1). Hence, microbial residues decrease when conditions are not optimal for the accumulation of fungal residues. Alternatively, we speculate that this could be the result of interactions among drought, plants, and microorganisms. Drought reduces the plant N absorption, increasing ammonium N and nitrate N in the soil 75 . Furthermore, the weakening of microbial activity reduces the mineralization rate of N 76 , leading to a lower C:N ratio in the soil; this also explains the lower content of microbial residues in high-nutrient soil. To clearly identify the changes in microbial residues caused by drought, and the coupling between plants and microorganisms, we fit the values of amino sugars using a GAM, which revealed a clear threshold for the reduction in microbial residues ( Supplementary  Fig. 5). Under increased drought conditions in terrestrial ecosystems, soil nutrients seem to become sufficient (p < 0.001, Supplementary Fig. 3c). In such a context, the microbial community structure, enzyme secretion characteristics, and microbial biomass formation will reach optima. The microbial carbon pump process, which is driven by microorganism turnover following growth and death in microbial communities 6 , has the greatest effect, with microbial residues gradually reaching a peak. 
When microbial communities achieved states in which stable C produced by anabolism is transferred to the soil with maximum efficiency, the accumulation of microbial residues in soil began to slow down, and then decreased sharply.
The dominant factors predicting microbial residue accumulation varied, but SOC and aridity index were key factors in all cases (Figs. 5  and 7). Although Random Forest analysis showed that soil C:N ratio was not important in the forest (Fig. 5c, d), it still had large STEs (Fig. 7c, d). Among the other environmental variables, soil pH had a considerable effect on microbial residues in all datasets, excluding temperate forests. Previous studies have shown that soil pH can control soil microbial community structure, affect the adsorption of soil minerals on soil organic matter, and even affect the recycling efficiency of microbial residues by microbial communities 74,77 . Therefore, dynamic changes in soil pH can affect microbially mediated C sequestration 7 . Moreover, we noticed that the effect of pH in subtropical forests was similar to that of SOC, which was significantly stronger than the effect of these factors in other regions. This may be because subtropical forest pH is relatively low, and the accumulation of microbial residues is more sensitive to changes in pH. Similarly, clay mineral content also plays an important role in stabilizing microbial residues 26,78 . The stable storage of SOC is related to both low decomposition as well as the protective mechanism of soil minerals 79 . Microbial residues are stabilized in fine-grained soil minerals by adsorption and binding, and physical protection prevents microbial residues from being reused by microbial communities 69,79,80 . Therefore, soil with higher clay mineral content generally has higher organic C content 81 . Moreover, clay can maintain the moisture and nutrient elements of the surface soil, resulting in greater surface microbial activity 82,83 . Hence, the effect of clay on the accumulation of microbial residues is not the same in different data sets due to these two factors.
Our research has some limitations. First, our criteria for data compilation were not perfect. For example, we calculated the mean for soil amino sugars for the 0-10 and 10-20 cm layers in some studies assuming equal weights, because we reasoned that in most studies, the same quality of soil was used for the determination of amino sugar content; however, this may not be the actual situation. In addition, although we expected the topsoil data to be 0-20 cm, a small number of studies did not reach this soil depth, but we also included them in our synthesis. As the amino sugar concentrations decreased sharply with soil depth, these data may not be consistent with the amino sugar concentrations of soil at the depth of 0-20 cm.
Second, most of our data are from the northern hemisphere, especially from temperate and subtropical regions, with little to no data from boreal and tropical regions (Fig. 1). Although our selection criteria for reducing publication bias excluded some data, we were able to obtain heterogeneous data. This lack of data presents significant limitations on global estimates of microbial residue accumulation. Improvements in data collection and accurate estimation of such residues will be the focus of our future study.
Third, we used data from the global high-resolution (250 m) gridded soil properties database (http://data.isric.org) because some data were not provided in original studies. Such grid-predicted data may not match data from the actual research studies. Finally, there remains uncertainty about soil amino sugars as a biomarker for microbial necromass. The soil requires hydrolysis or digestion before soil amino sugars can be detected 33 , which means that the detected sugars also include, to some extent, those present in intact cell biomass 10 ; the content of soil amino sugars is therefore affected by microbial biomass to a certain extent. Future studies should explore the relationship between amino sugars and biomass to determine the accurate amino sugar content for the characterization of microbial residues.
In conclusion, our meta-analysis identified SOC and drought as the main factors influencing global and regional microbial residues. In the wake of global climate change, microbial communities could experience an optimal anabolic phase. In this phase, the soil C capture process mediated by the soil microbial carbon pump will exhibit its best response state. At the global scale, the optimum appears when the aridity index is ~0.768 or the soil C:N ratio is ~9.583, at which point microbial residue accumulation peaks. Above these levels, microbial residues decrease significantly. We suggest that this may be due to changes in dominant microbial species in communities. Collectively, our results provide insights into the optimal conditions for microbial residue accumulation. Our findings provide a useful reference for monitoring and management of terrestrial C storage.

Methods
Data sources. We extensively searched the Web of Science (http://apps.webofknowledge.com) and the China Knowledge Resource Integrated Database (http://www.cnki.net) for scientific articles on microbial residues. Data were collected from field trials published before December 2020. Search terms were "amino sugars" or "microbial necromass" or "microbial residues," combined with "grassland" or "forest" or "woodland". To reduce the influence of publication bias, only studies meeting the following seven criteria were selected:
1. As previous studies have shown that amino sugars, the biomarkers of microbial residues, exhibit maximum release after hydrolysis with hydrochloric acid for 6-8 h at 105°C 33 , only data from articles using hydrolysis with 6 mol/L hydrochloric acid at 105°C for 6-8 h and determination by gas or liquid chromatography were used. In the datasets we collected, only eight did not use the method of Zhang and Amelung 33 (Supplementary Data 2); therefore, we ignored small differences between the methods for determining the concentration of amino sugars in different studies, such as derivatization, so that we could effectively compare and contrast data from different studies.
2. Sampling depth was clearly defined. Our aim was to investigate the mineral topsoil, excluding the litter layer and O horizon soil. Since most of the studies were carried out at mineral soil depths less than 20 cm, we defined topsoil for this study as 0-20 cm. We selected all the studies that measured amino sugars at 0-20 cm depths. In some studies, one sample included both 0-10 cm and 10-20 cm layers. Since the quality of soil samples used to determine the concentrations of amino sugars in most studies was similar, we calculated the arithmetic mean of the two concentrations at the same point to represent the concentration of amino sugars at 0-20 cm depth. Similarly, with concentrations determined at 0-5 cm, 5-10 cm, and 10-15 cm depths in some studies, we used the arithmetic mean of the three concentrations.
3. Where total amino sugar concentrations were not reported directly, studies providing the contents of glucosamine, galactosamine, muramic acid, and mannosamine were used, as these four types of amino sugars are most commonly quantified 32 . Since the proportion of mannosamine is usually less than 4% 32 , its concentrations were not provided in some studies. Therefore, for these studies, we used measurements of the other three types of amino sugars, which are readily quantified, to calculate the total amino sugar concentrations 32 . Some studies only reported the concentrations of fungal necromass C and bacterial necromass C. We used two conversion factors to convert these to glucosamine and muramic acid concentrations 10 . Subsequently, we added them for the calculation of total amino sugar concentrations. Other studies only reported concentrations of glucosamine C, galactosamine C, muramic acid C, and mannosamine C. We converted these concentrations to corresponding amino sugar concentrations, using their relative molecular masses. See below for the detailed calculation method.
4. All amino sugar concentration data were obtained from in situ measurements; laboratory incubation and model simulation data were discarded.
5. Data from soil fractions with different aggregate sizes were not used; only those from bulk soil were used. Data from living microbial cells were not used.
6. In manipulation experiments (e.g., warming, elevated CO 2 , and nitrogen addition), the concentrations of amino sugars from the control treatment were used.
7. Data obtained from the averages of large ranges of sample points, from which accurate location information was not available, were not used.
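The depth averaging in criterion 2 and the summation in criterion 3 can be sketched as follows. This is a minimal illustration assuming equal weighting of layers, as described above; the function names and the numbers in the usage lines are hypothetical and are not taken from any study in the synthesis.

```python
from statistics import fmean


def topsoil_mean(layer_concentrations):
    """Equal-weight arithmetic mean of amino sugar concentrations
    (mg per g soil) across the sampled layers of one site."""
    if not layer_concentrations:
        raise ValueError("at least one layer concentration is required")
    return fmean(layer_concentrations)


def total_amino_sugars(glucosamine, galactosamine, muramic_acid,
                       mannosamine=None):
    """Sum the individual amino sugars (mg per g soil).

    Mannosamine is optional because some studies do not report it;
    in that case the total is the sum of the other three."""
    parts = [glucosamine, galactosamine, muramic_acid]
    if mannosamine is not None:
        parts.append(mannosamine)
    return sum(parts)


if __name__ == "__main__":
    # Hypothetical site sampled at 0-10 cm and 10-20 cm.
    print(topsoil_mean([2.0, 1.0]))
    # Hypothetical site without a mannosamine measurement.
    print(total_amino_sugars(1.0, 0.5, 0.1))
```

Equal weighting is the simplest choice; if layers differ in thickness or bulk density, a weighted mean would be the natural alternative.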
To study the factors affecting amino sugars in different communities, grassland and forest ecosystems were divided into tropical, subtropical, and temperate zones. If data from the same study were published in different journals, only the data from one of them were used to avoid pseudoreplicates. The amino sugar content of different altitudes, grassland, or forest types was regarded as an independent duplication in the global analysis for a single study.
We obtained 268 data points (Supplementary Fig. 6 and Supplementary Data 2). At each study site, we also recorded other information from the original publications, including geographic (latitude, longitude, and altitude) and climate (mean annual temperature, mean annual precipitation, and aridity index) variables. Each dataset is indicated in the graph (Fig. 1a).
Other climate and soil attribute data sources. Since most published articles do not include such data, a global climate database (Worldclim, version 2.0) was used to obtain climate information (mean annual temperature, mean annual precipitation). Similarly, the global high-resolution (250 m) gridded soil properties database (http://data.isric.org) was used to obtain some soil physical and chemical properties (i.e., clay content, pH, SOC content, and N content). Aridity index data were obtained from the Global Aridity and PET Database (http://www.cgiarcsi.org/data/global-aridity-and-pet-database). The Global Land Cover Characteristics Database v2.0 was used to obtain altitude data (https://lta.cr.usgs.gov/GLCC). To evaluate the consistency of the predictors of amino sugars, we ensured that the depth of data used in the database was consistent with the depth of sample points. These data were obtained using the ESRI ArcMap (Environmental Systems Research Institute, Redlands, CA, USA).
Calculation of amino sugar concentrations. We calculated the absolute concentration of amino sugars (mg g−1 soil) in the surface layer (0-20 cm) of soil. The total amino sugar content is equal to the sum of the glucosamine, galactosamine, muramic acid, and mannosamine contents.
Some studies reported only the contents of fungal necromass C and bacterial necromass C in SOC, which are derived from the concentrations of glucosamine and muramic acid combined with two conversion factors. For these studies, the absolute concentrations of muramic acid (mg g−1 soil) and glucosamine (mg g−1 soil) were back-calculated by inverting the two conversion equations 39. In these equations, 45 is the conversion factor from muramic acid to bacterial necromass C, and 9 is the conversion factor from glucosamine to fungal necromass C. It is assumed that the molar ratio of muramic acid to glucosamine in bacterial cells is 1:2 39,85.
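The back-calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the necromass C inputs are assumed to be expressed in mg g−1 soil, and the relative molecular masses (179.17 for glucosamine, 251.23 for muramic acid) and the atomic mass of carbon (12.011) are standard values rather than numbers quoted in this section. The last function sketches the carbon-based conversion for studies that reported only amino sugar C, assuming the usual mass-fraction relationship.

```python
# Relative molecular masses (g mol^-1). Standard values, assumed here;
# they are not quoted in the text.
M_GLCN = 179.17    # glucosamine (galactosamine and mannosamine are isomers)
M_MURA = 251.23    # muramic acid
M_CARBON = 12.011  # atomic mass of carbon


def mura_from_bacterial_c(bacterial_necromass_c):
    """Muramic acid (mg g^-1 soil) from bacterial necromass C (mg g^-1 soil).

    Inverts: bacterial necromass C = muramic acid * 45.
    """
    return bacterial_necromass_c / 45.0


def glcn_from_fungal_c(fungal_necromass_c, mura):
    """Total glucosamine (mg g^-1 soil) from fungal necromass C and muramic acid.

    Fungal glucosamine is necromass C / 9. Bacterial glucosamine is added
    back assuming a 1:2 molar ratio of muramic acid to glucosamine.
    """
    fungal_glcn = fungal_necromass_c / 9.0
    bacterial_glcn = 2.0 * (mura / M_MURA) * M_GLCN
    return fungal_glcn + bacterial_glcn


def amino_sugar_from_c(amino_sugar_c, n_carbon, molar_mass):
    """Amino sugar mass from its C mass, using the C mass fraction.

    n_carbon is 9 for muramic acid and 6 for the other amino sugars.
    """
    return amino_sugar_c * molar_mass / (n_carbon * M_CARBON)
```

For example, `mura_from_bacterial_c(4.5)` returns a muramic acid concentration of about 0.1 mg g−1 soil, which can then be passed to `glcn_from_fungal_c` together with the reported fungal necromass C.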
In some studies, only glucosamine C, galactosamine C, muramic acid C, and mannosamine C were reported. Since muramic acid has nine C atoms, while the other amino sugars have six C atoms, the absolute concentration of each amino sugar (mg g−1 soil) was calculated from its C concentration, its number of C atoms, and its relative molecular mass 10.
Statistical analysis
Microbial residues. Unless otherwise specified, all statistical analyses were conducted using R 4.0.3. Before conducting statistical analyses, we tested the normality of all data. The Scheirer-Ray-Hare test was used to assess differences in amino sugars between vegetation types and climate zones, and to test their interaction. Additionally, since there was only one data point for tropical grassland, differences in amino sugars among the five remaining categories (subtropical grasslands, temperate grasslands, tropical forests, subtropical forests, and temperate forests) were analysed using the Kruskal-Wallis test. Two-tailed Spearman's rank correlation was used to explore global correlations between amino sugars and environmental variables. Statistical significance was assessed at p < 0.05.
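As a rough illustration of the two rank-based statistics named above, the following Python sketch computes the Kruskal-Wallis H statistic and Spearman's rank correlation coefficient from first principles. It is not the authors' R code: the function names are hypothetical, H is computed without a tie correction, and p values (which require a chi-square or t reference distribution) are deliberately left out. The Scheirer-Ray-Hare test is not covered.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In practice, each group passed to `kruskal_wallis_h` would hold the amino sugar concentrations of one vegetation-by-climate category, and `spearman_rho` would be applied to amino sugars paired with one environmental variable at a time.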
Evaluating the importance of environmental variables. We used all the amino sugar data for model analysis. Random Forest is a machine learning algorithm for regression and classification. We used Random Forest models to identify the most important environmental variables (absolute latitude, aridity index 26, soil clay content, SOC, soil C:N ratio, and soil pH) 86,87. The importance of each variable was evaluated across multiple decision trees 86. Since our purpose was only to determine the importance of the predictors and not to predict the data, we used the whole dataset for analysis, without dividing it into training and test sets. These analyses were performed using the randomForest 88 package in R 4.0.3 (http://cran.R-project.org/). The significance of the model and the cross-validation R² were evaluated using the A3 package. Similarly, the rfPermute package was used to assess the significance of each predictor's importance to amino sugars.
We used SEM to test whether the relationships between amino sugars and environmental factors remained unchanged when causal relationships among multiple environmental factors were considered simultaneously. Because the correlation between SOC and N was significant (Spearman's R = 0.91), only SOC was selected as the organic component of the SEM. Before performing the SEM, we log-transformed non-normal variables and standardized each variable using the Z-score transformation to improve the comparability of the data 89. We built an a priori model (Supplementary Fig. 7) based on existing knowledge, and determined the final SEM through maximum likelihood estimation and overall goodness-of-fit measures, including the chi-square (χ²) statistic, whole-model p value, goodness-of-fit index, and root-mean-square error of approximation 90. Since some variables were non-normal, the Bollen-Stine bootstrap test was used to recalculate the overall fit of the model 90. When the bootstrap p value was greater than 0.1, the model was considered to have a good fit 90. To summarize the effects estimated by the SEM, we calculated the STE of each environmental variable. Since most studies on forests and grasslands (87.5% and 99.3%, respectively, Fig. 1) were located in subtropical and temperate zones, we conducted independent Random Forest and SEM analyses only on temperate and subtropical forests and grasslands. SEM analysis was performed using Amos 26.0 (Amos IBM, USA).
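The preprocessing applied before the SEM (log transformation of non-normal variables followed by Z-score standardization) can be sketched as follows. This is a minimal Python illustration rather than the authors' R or Amos workflow; the function names, the use of the natural logarithm, and the use of the sample standard deviation are assumptions, and the inputs to `log_transform` are assumed to be strictly positive.

```python
import math
from statistics import mean, stdev


def log_transform(values):
    """Natural-log transform; every value must be strictly positive."""
    return [math.log(v) for v in values]


def z_score(values):
    """Standardize to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def prepare_for_sem(values, is_normal):
    """Log-transform a variable only if it is flagged as non-normal, then standardize."""
    transformed = list(values) if is_normal else log_transform(values)
    return z_score(transformed)
```

After this step every variable is on the same unitless scale, so path coefficients estimated from the standardized variables can be compared directly.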
Linear and nonlinear responses of environmental variables. For the global data, we fitted linear and nonlinear (e.g., GAM 91) regressions to the relationships between amino sugars and the variables that showed large effects in the SEM. The linear model assumes that the response of amino sugars to environmental variables is gradual 92. The GAM assumes that the response along the gradient of an environmental variable is nonlinear but continuous 92. We selected the GAM to describe the complexity of nonlinear trends (through smoothing parameters 91). We then used the AIC to determine the best-fit model for each environmental variable 92. In general, a difference in AIC values >2 indicates that the models differ substantially, with the model having the lowest AIC value being the most likely 92.
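The AIC comparison rule can be made concrete with a short Python sketch. How the AIC was computed in the authors' R workflow is not stated here, so the least-squares form below (valid up to an additive constant for models fitted to the same data with Gaussian errors) is an assumption, as are the function names; for a GAM, `k` would be the effective number of parameters.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC of a least-squares fit, up to a constant shared by all models.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k


def select_by_aic(aics, min_delta=2.0):
    """Pick the lowest-AIC model from a dict of {name: AIC}.

    Returns (name, clear), where clear is True when the best model beats
    the runner-up by more than min_delta AIC units.
    """
    ranked = sorted(aics.items(), key=lambda item: item[1])
    best_name, best_aic = ranked[0]
    clear = len(ranked) == 1 or (ranked[1][1] - best_aic) > min_delta
    return best_name, clear
```

For instance, `select_by_aic({"linear": 100.0, "gam": 95.0})` selects the GAM and flags the difference as clear, whereas a gap of only one AIC unit would not be flagged.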
Threshold detection. Thresholds can be explored, and nonlinear trends determined, only when a nonlinear model is suitable 92. As described by Goffman et al. 93 and Miguel et al. 92, we fitted segmented regressions to search for continuous thresholds, at which the slope changes abruptly 94, and step + segmented (stegmented) regressions to search for discontinuous thresholds or breakpoints, at which both the intercept and the slope change 94. In addition, when a segmented or stegmented regression is fitted to a relationship that is best described by a GAM, it can reveal the point of maximum curvature of the fit 92. This point can be considered a threshold because it marks the extreme value of the amino sugar response to the environmental variable, even if the segmented or stegmented regression fits worse than the GAM 92.
Therefore, for environmental variables for which GAMs fitted better than linear models, we fitted segmented and stegmented regressions. Each of these models provides a threshold point at which the functional relationship changes (the slope for segmented regressions, or the slope and intercept for stegmented regressions 94). We considered this point to be the threshold of the GAM regression. We used the AIC to select the most suitable threshold model for the data. Segmented/stegmented and GAM regressions were fitted with the chngpt 94 and gam packages in R 4.0.3, respectively.
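To illustrate what a segmented or stegmented fit estimates, the following Python sketch searches a grid of candidate thresholds and keeps the one with the smallest residual sum of squares. It is a simplified stand-in, not the estimation procedure of the chngpt package: the function names are hypothetical, the model is y = b0 + b1·x + b2·max(x − c, 0), with an extra jump term b3·1(x > c) for the stegmented case, and candidate thresholds are restricted to observed x values away from the extremes.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares fit via the normal equations; returns (coefficients, rss)."""
    p = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(p)] for i in range(p)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[i] * columns[i][k] for i in range(p))
              for k in range(len(y))]
    rss = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    return beta, rss


def fit_threshold_model(x, y, step=False):
    """Grid search for the threshold c that minimizes the RSS.

    step=False: segmented model (slope change only).
    step=True: stegmented model (slope and intercept change).
    Returns (threshold, coefficients, rss).
    """
    # Skip the two smallest and two largest unique x values so that both
    # sides of every candidate threshold contain at least two distinct points.
    candidates = sorted(set(x))[2:-2]
    best = None
    for c in candidates:
        cols = [[1.0] * len(x), list(x), [max(v - c, 0.0) for v in x]]
        if step:
            cols.append([1.0 if v > c else 0.0 for v in x])
        beta, rss = ols(cols, y)
        if best is None or rss < best[2]:
            best = (c, beta, rss)
    return best
```

The residual sums of squares returned for the segmented and stegmented fits can then be converted to AIC values to choose between the two threshold models.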
Verifying the importance of the determined threshold. To test whether the determined threshold significantly affects the intercepts of stegmented regressions, we fitted linear regressions on both sides of the threshold 92. We then extracted the intercepts, used the boots package in R to draw 1000 bootstrap samples on each side of the threshold of each environmental variable, and tested the difference using the Mann-Whitney U-test. The global map, fitted curves, and histogram of sample distribution in this study were all plotted using R 4.0.3.
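The verification step can be sketched in Python as follows, under several assumptions: observations are resampled with replacement separately on each side of the threshold, a simple linear regression is refitted to each resample, and the two sets of bootstrapped intercepts are compared with a Mann-Whitney U statistic whose two-sided p value uses a normal approximation without tie or continuity correction. The function names and the fixed random seed are hypothetical, and this is not the authors' R code.

```python
import random
from statistics import NormalDist, mean


def ols_intercept(x, y):
    """Intercept of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return my - (sxy / sxx) * mx


def bootstrap_intercepts(x, y, n_boot=1000, seed=0):
    """Intercepts from n_boot resamples (with replacement) of the (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    intercepts = []
    while len(intercepts) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # regression is undefined when all x values coincide
        ys = [y[i] for i in idx]
        intercepts.append(ols_intercept(xs, ys))
    return intercepts


def mann_whitney_u(a, b):
    """U statistic for sample a and a two-sided p value (normal approximation)."""
    u = sum(1.0 if va > vb else 0.5 if va == vb else 0.0
            for va in a for vb in b)
    n1, n2 = len(a), len(b)
    mu = n1 * n2 / 2.0
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12.0) ** 0.5
    z = (u - mu) / sigma
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, p
```

A typical use would split the data at the estimated threshold, call `bootstrap_intercepts` on each side, and pass the two lists of intercepts to `mann_whitney_u`.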

Data availability
All datasets used here are publicly available. The global climate database (Worldclim, version 2.0) was used to obtain climate information (mean annual temperature, mean annual precipitation). The global high-resolution (250 m) gridded soil properties database (http://data.isric.org) was used to obtain soil physical and chemical properties. Aridity index data were obtained from the Global Aridity and PET Database (http://www.cgiar-csi.org/data/global-aridity-and-pet-database). The Global Land Cover Characteristics Database v2.0 was used to obtain altitude data (https://lta.cr.usgs.gov/GLCC). The datasets generated during the current study are available at: https://doi.org/10.6084/m9.figshare.16749967.v1. All data that support the findings of this study are also available in Supplementary Data.