Background & Summary

Grasslands cover ~50 million square kilometers or ~40 percent of the terrestrial area on Earth (excluding Greenland and Antarctica), and comprise various types including prairies, savannahs, rangelands, agricultural grasslands, and coastal grasslands1. As one of the largest ecosystems, grasslands are essential to living organisms including plants, birds and other animals; they function as a habitat for wildlife2 and livestock3, and support the livelihoods of ~800 million people globally4.

China has the largest grasslands in the world with ~330 million hectares used for feeding animals for human foods5. However, a significant proportion has been degraded since the 1960s, mainly due to: (i) large-scale land reclamation from grasslands to croplands in the 1980s6,7 that aimed at producing more grain-based food for the growing human population but led to a rapid decline in available grasslands for grazing8 and accelerated the degradation of the remaining grasslands9; (ii) an expansion of the livestock industry in the 1990s that reduced the amount of grasslands per head of livestock10, and led to overgrazing of the remaining grasslands with high stocking rates11 and decreased grassland productivity12,13; (iii) exploitation of industrial by-products or mineral resources in some conventional grassland areas14 that decreased feeding-type grasslands; (iv) freshwater shortages for agriculture that presented a major problem for restoring grazing-induced degradation, especially in the arid and semiarid northwest where annual precipitation is typically <160 mm15 and annual evaporation >1,800 mm16; and (v) a shift from traditional nomadic grazing to sedentary feeding systems in the 1990s and 2000s that led to the loss of self-recovery capacity of grazed grasslands17 and reduced the intrinsic biological functions for soil structure18 and ecosystem equilibrium19.

To rejuvenate degraded grasslands, China took drastic measures by establishing a number of rejuvenation programs, including ‘grazing exclusion’, which aimed to eliminate grazing on grasslands17; the ‘grain-for-green’ program, which returned steep cultivated land to grassland20 to alleviate the shortage of forage for the livestock industry21; and ‘returning croplands to grasses’, which involved re-seeding grasses on marginal croplands22. These were combined with ‘alternating seasonal-grazing with fallowing’ in moderately degraded grasslands23. Some of these programs have been in place for decades. Numerous studies have been conducted to determine the effectiveness of these programs for rejuvenating degraded grasslands, but results vary widely between studies and between geolocations, largely due to variation in climatic conditions, grass types, duration of grazing practices, and the magnitude of human activities. Drawing general conclusions from individual or site-specific experiments is therefore often misleading. A systematic analysis is required to define the outcome of the decade-long rejuvenation efforts, and a large dataset across various studies provides a unique opportunity to elucidate those effects.

We generated a comprehensive dataset24 derived from multiple studies on the relationship among grazing intensities, duration of grazing exclusion, and plant community traits and soil physiochemical properties. Data were extracted from 65 data-rich studies (Online-only Table 1) conducted across the major grassland areas in northern China (Fig. 1) with representative grassland types and sub-types included. The key variables include plant traits (percent vegetation coverage, plant biomass, root biomass, and plant diversity) and soil physiochemical properties (soil bulk density, moisture content, organic C, total N and available N, total and available P, C:N ratio, and soil pH).

Fig. 1

Geographic locations of the selected studies for the Data Descriptor. The selected locations/sites (indicated by blue dots) cover the major grassland areas in China from the semi-desert and arid northwest to semiarid and humid northeast regions.

We organized the dataset with a ‘Master’ tab and 15 associated tabs. This dataset can be used to (i) determine whether the decade-long grazing management policies have had positive, negative or neutral impacts on plant community composition and diversity, vegetation characteristics, and soil physiochemical properties; (ii) assess plant–soil–climate interactions using a systemic approach as the dataset contains key information on plants, soil, grassland ecosystems, and geographic information; and (iii) conduct comprehensive assessments of the effectiveness of grazing policies on ecological and socioeconomic implications. The synthesis of this dataset can help draw sound recommendations to manage grasslands sustainably and effectively.

Methods

We employed a three-step approach for data collection and synthesis.

First – literature search

We conducted a comprehensive search of peer-reviewed literature, mainly through Agricola, Google Scholar, and Scopus. Search terms were defined and used to query the Institute for Scientific Information Web of Science for articles published since 1979, when the first grazing policy took effect in China. The initial search of article titles, abstracts and keywords returned 2,520 articles (n0; Fig. 2) of potential interest. From these, we identified articles meeting the first five criteria: (i) studies conducted under field conditions, excluding controlled-environment and simulation studies; (ii) studies conducted in northern China; (iii) studies including ‘non-grazing or grazing exclusion’ and ‘continuous grazing’ treatments in their experimental structure, with other treatments considered optional or additional; (iv) at least two years of field evaluation with replications each year; and (v) at least two variables measured. These criteria narrowed the selection to 317 articles (n2; Fig. 2). We narrowed the 317 articles further to 65 using an additional criterion: (vi) articles published in journals with full text in English and searchable in mainstream scholarly databases, such as Scopus or Web of Science25. Non-English articles, and articles with only an English abstract that lacked the measurements required by the above criteria, were excluded.
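The two-stage screening described above can be sketched as a pair of filters. This is only an illustration under stated assumptions: the `Article` record and all of its field names are hypothetical stand-ins for criteria (i) to (vi), and the actual screening was done by reading the articles, not by code.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical fields; each mirrors one selection criterion in the text.
    field_study: bool                   # (i) field conditions, not simulation
    northern_china: bool                # (ii) study region
    has_exclusion_and_continuous: bool  # (iii) both required treatments present
    years_of_field_data: int            # (iv) at least two years ...
    replicated_each_year: bool          # (iv) ... with replications each year
    n_variables: int                    # (v) at least two variables measured
    full_text_english: bool             # (vi) full text in English
    indexed: bool                       # (vi) searchable in a mainstream database


def passes_first_five(a: Article) -> bool:
    """Criteria (i)-(v): experimental design and measurements."""
    return (
        a.field_study
        and a.northern_china
        and a.has_exclusion_and_continuous
        and a.years_of_field_data >= 2
        and a.replicated_each_year
        and a.n_variables >= 2
    )


def passes_sixth(a: Article) -> bool:
    """Criterion (vi): language and indexing."""
    return a.full_text_english and a.indexed


def screen(articles):
    """Return (articles passing i-v, articles passing i-vi)."""
    stage1 = [a for a in articles if passes_first_five(a)]
    stage2 = [a for a in stage1 if passes_sixth(a)]
    return stage1, stage2
```

Applied to the counts reported in the text, the first stage would correspond to the reduction from 2,520 to 317 articles and the second to the reduction from 317 to 65.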

Fig. 2

Workflow chart for generating dataset output. Brown boxes represent the number of articles (ni, where i = 0, 1… or 6), included or excluded, step-by-step, based on the selection criteria; dark green boxes denote the articles selected for the present Data Descriptor. The six selection criteria are briefly described.

Second – data extraction

We extracted relevant data on a treatment-by-treatment basis from each of the identified articles (Online-only Table 1), entered the data in Excel spreadsheets, examined them visually for usefulness, and combined them into a ‘Master’ data file. Some of the results presented as graphs in the original articles were converted into values using a graph-to-value conversion program (https://automeris.io/WebPlotDigitizer/). In some studies, where the results were aggregated, treatment means with a standard error of the mean were determined. Additionally, we reviewed 12 other articles that presented meta-analysis results, including articles describing the effects of grazing exclusion on soil C sequestration that included 78 studies26, plant biomass from 48 studies27, soil microbial communities from 71 studies28, soil C and N cycling from 115 studies29, grassland management and greenhouse gas emissions from 67 studies30, seasonal grazing and soil respiration from a 6-year study31, and grassland management and soil chemical properties in different climates from 105 studies32. Relevant data points from these articles were selectively entered into our ‘Master’ file if they met the criteria defined above.

Third – database structure

The Excel ‘Master’ file contained all the data points24 extracted from the original articles, detailed in 997 rows and in various columns (Table 1), with column (A) as code, (B) author name (first author only) and publication year, (C) latitude/longitude coordinates of the study, (D) and (E) ecological region and locations (city or province), (F) year(s) in which the field experiment was conducted, (G) types of grasslands evaluated in the published study (some of the studies did not specify the type), (H) duration (number of years) of grazing treatments imposed, (I) type of animals involved in the study, (J) name of the variable and units reported, and (K) depths of soil sampled for soil physiochemical property measurements.

Table 1 Detailed names and descriptions for each column shown in the ‘Master’ Tab of the Excel file.

Values in columns (L) to (W) relate to the four levels of grazing intensity: (L) ‘non-grazing or grazing exclusion’, (N) ‘light or mild grazing’, (Q) ‘moderate grazing’, and (T) ‘heavy grazing or overgrazing’. The values in these four columns (L, N, Q, and T) are either actual values extracted directly from the original articles or values converted into the same unit across studies. The column immediately following each of the four grazing-intensity columns (i.e., columns M, O, R and U) gives the number of replications used in the specific study. To facilitate further analysis (by the authors of the present paper or others who might be interested in using this dataset), we calculated the differences between (P) ‘grazing exclusion’ and ‘light grazing’, (S) ‘grazing exclusion’ and ‘moderate grazing’, and (V) ‘grazing exclusion’ and ‘heavy or overgrazing’. Standard errors in column (W) refer to the variation among the four grazing-intensity treatments. In column (X), we specified the data origin if the values were converted from a figure in the original article.
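As a minimal sketch of how the difference columns relate to the intensity columns, the function below recomputes (P), (S) and (V) from (L), (N), (Q) and (T) for a single row held as a plain dictionary keyed by column letter. The sign convention (exclusion minus grazed) is an assumption, chosen to match the figure captions in which a positive mean difference favors ‘non-grazing’; the helper name and the row representation are hypothetical.

```python
# Column letters follow the 'Master' tab layout described in the text.
INTENSITY_COLS = {"light": "N", "moderate": "Q", "heavy": "T"}
DIFF_COLS = {"light": "P", "moderate": "S", "heavy": "V"}
EXCLUSION_COL = "L"


def add_differences(row: dict) -> dict:
    """Return a copy of `row` with difference columns P, S and V filled in.

    Assumed convention: difference = grazing exclusion - grazed treatment.
    A missing treatment (None or absent) yields None for its difference.
    """
    out = dict(row)
    base = row.get(EXCLUSION_COL)
    for level, diff_col in DIFF_COLS.items():
        value = row.get(INTENSITY_COLS[level])
        if base is None or value is None:
            out[diff_col] = None
        else:
            out[diff_col] = base - value
    return out
```

For example, a row with vegetation coverage of 80 under exclusion and 45 under heavy grazing would receive a value of 35 in column (V) under this convention; a study that did not include moderate grazing would leave column (S) empty.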

Data Records

Having entered all the data extracted from the original articles into the ‘Master’ tab in the Excel file described above, we created 12 separate tabs to provide detailed information on the 12 key variables derived from the ‘Master’ file. The ‘Article’ tab lists the 65 selected data-rich articles; each entry starts with the name of the first author, followed by the name of the journal and other attributes of the article.

The 12 variable tabs provide potential data-users with much-needed convenience and may enhance the usefulness of the dataset24. The 12 tabs to the right present data on four plant-related and eight soil-related variables: (1) ‘Veget%’ is percent vegetation coverage (%), (2) ‘AbvBiom’ is aboveground plant biomass (g m−2), (3) ‘RootBiom’ is root biomass (g m−2), (4) ‘PltDiv’ is plant diversity, (5) ‘BulkD’ is soil bulk density, (6) ‘Soil-H2O’ is soil moisture content, (7) ‘SOC’ is soil organic carbon at 0–15 cm depth, (8) ‘Soil-N’ is total soil-N at 0–60 cm depth, (9) ‘Avail-N’ is available soil-N including NH4+-N and NO3−-N at 0–60 cm depth, (10) ‘Soil-P’ is total and available soil-P at 0–40 cm depth, (11) ‘C-N Ratio’ is C:N ratio as reported in original articles (not calculated from this dataset), and (12) ‘Soil-pH’ is soil pH. We used a similar layout for each of the 12 variables for ease of use. Two final tabs (‘Units’ and ‘Note’) provide background information on how each variable’s units were calculated and on the categories and observations.

Technical Validation

Six grassland types (Ti, where i = 1, 2, 3, 4, 5, and 6) were included, with the number of observations varying among the four ecological zones (Nj, where j = 1, 2, 3, and 4) (Fig. 3). The dataset contained 34,747 observations by experimental site × growing season × treatments × grassland types.

Fig. 3

Six types of grasslands in the four ecological zones in China. Ti (where i = 1, 2, …, 6) in each of the six types represents the number of treatments × studies × years; Ni (where i = 1, 2, …, 4) in each of the four ecozones represents the number of treatments × studies × years.

Each of the 12 key variables had enough data points to perform some basic analyses. To validate the usefulness of the dataset, we present four plant-related variables to demonstrate how the dataset could be analyzed by potential users, as follows:

For the variable ‘percent vegetation coverage’, the mean difference (n = 226) between ‘grazing’ and ‘non-grazing’ practices was 20.61% (±1.43), ranging from 17.8 to 23.4% (Table 2). The distribution pattern shows the mean difference between the two grazing systems for each study (Fig. 4); the majority of the studies showed greater vegetation coverage under non-grazing than under continuous grazing. Similarly, for the variable ‘aboveground plant biomass’, the mean difference between the two grazing practices was 6.67 (±0.84) kg ha−1, ranging from 5.02 to 8.31 kg ha−1 (Table 3). Apart from two studies that did not show a difference, all studies showed aboveground plant biomass distribution patterns that favored ‘non-grazing’ (Fig. 5). Of the 32 studies that measured root biomass, 28 favored ‘non-grazing’ over ‘grazing’, with a mean difference in root biomass of 97.6 (±7.3) kg ha−1, ranging from 83.2 to 111.9 kg ha−1 (Table 4), and the distribution patterns favored the non-grazing practice with a few exceptions (Fig. 6). For the variable ‘plant diversity’, the effect of grazing practices was marginal, despite p-values > 0.001 (Table 5), and the distribution patterns were scattered widely among studies (Fig. 7): the mean differences in plant diversity varied considerably, with some studies skewed to the left, a few others skewed to the right, and the remainder with a mean difference near zero (i.e., the central vertical line). These distribution patterns are presented only to validate the dataset and to illustrate how it could be used by potential users.
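The ranges quoted in this paragraph appear to be consistent with a normal-approximation 95% confidence interval, i.e. mean ± z × SE with z ≈ 1.96. That reading is an assumption based on the arithmetic and on the figure captions, not a statement of how the tables were produced. The sketch below recomputes the intervals from the reported means and standard errors; the function name is hypothetical.

```python
from statistics import NormalDist


def normal_ci(mean: float, se: float, level: float = 0.95):
    """Two-sided normal-approximation confidence interval: mean ± z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return mean - z * se, mean + z * se


# (mean, standard error, reported lower bound, reported upper bound),
# all taken from the paragraph above.
reported = {
    "percent vegetation coverage": (20.61, 1.43, 17.8, 23.4),
    "aboveground plant biomass": (6.67, 0.84, 5.02, 8.31),
    "root biomass": (97.6, 7.3, 83.2, 111.9),
}

for name, (mean, se, lo, hi) in reported.items():
    ci_lo, ci_hi = normal_ci(mean, se)
    print(f"{name}: computed {ci_lo:.2f} to {ci_hi:.2f}; reported {lo} to {hi}")
```

The computed bounds agree with the reported ranges to within about 0.1, which is the size of discrepancy expected if the published means and standard errors were rounded before printing.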

Table 2 Percent vegetation coverage between grazing and non-grazing practices in northern China grasslands.
Fig. 4

Distribution patterns in percent vegetation coverage between ‘grazing’ and ‘non-grazing’. Mean difference >0 means the results favor ‘non-grazing’, <0 means the results favor ‘grazing’, and a value of 0 means no difference between the two grazing practices. The distribution pattern shows the 95% confidence interval.

Table 3 Aboveground plant biomass between grazing and non-grazing practices in northern China grasslands.
Fig. 5

Distribution patterns in aboveground plant biomass between ‘grazing’ and ‘non-grazing’. Mean difference >0 means aboveground plant biomass yield favors ‘non-grazing’, <0 means the result favors ‘grazing’, and a value of 0 means no difference between the two practices. The distribution pattern shows the 95% confidence interval.

Table 4 Root biomass between grazing and non-grazing practices in northern China grasslands.
Fig. 6

Distribution patterns in root biomass between ‘grazing’ and ‘non-grazing’. Mean difference >0 means root biomass yield favors ‘non-grazing’, <0 means the result favors ‘grazing’, and a value of 0 means no difference in root biomass between the two practices. The distribution pattern shows the 95% confidence interval.

Table 5 Plant diversity between grazing and non-grazing practices in northern China grasslands.
Fig. 7

Distribution patterns in plant diversity between ‘grazing’ and ‘non-grazing’. Mean difference >0 means plant diversity favors ‘non-grazing’, <0 means the result favors ‘grazing’, and a value of 0 means no difference in plant diversity between the two practices. The distribution pattern shows the 95% confidence interval.

Usage Notes

The compilation of experimental field data from 65 data-rich studies resulted in a dataset full of meaningful information24, with a set of variables that cover the key attributes of grassland properties. We suggest that potential dataset-users can perform quantitative assessments on whether grassland rejuvenation processes have had a significant impact on grassland ecosystem properties.

In particular, the dataset could be used for the following:

(1) To quantify whether rejuvenation programs have had a significant impact on aboveground plant communities and belowground soil properties by comparing grazing exclusion with other grazing practices (light/mild grazing, and year-round continuous grazing). The plant community variables, including percent vegetation coverage, aboveground and belowground (primarily root) biomass, and the derived ratio of aboveground to belowground biomass, can be used by modelers to estimate carbon input into soils by different plant parts. The dataset also provides valuable information on the magnitude of plant diversity in relation to the diverse grassland rejuvenation practices33. Further, the dataset can be analyzed to assess whether the rejuvenation programs have had an impact on soil physiochemical properties, including soil bulk density, moisture content, organic carbon, total and available N, total and available P, soil C:N ratio, and soil pH. Such analyses can help to determine the degree of grassland degradation and rejuvenation and their impact on grassland ecosystem sustainability33,34,35,36. Additionally, the dataset includes a variable, the duration (number of years) of grazing practices, that distinguishes grazing exclusion from continuous grazing. A comparison of the two contrasting practices against the duration of grazing can help identify trends in the effect on grassland properties. This feature is unique to this dataset and is rarely found in existing scientific literature.

(2) The dataset can be used as a solid base for performing more comprehensive analyses such as a meta-analysis37. The dataset presents the results from more than 65 well-designed field studies with two or more replicates and site-years. Analyzing the dataset using meta-analysis may help to elucidate large-picture effects. More data with treatment structure and measurements meeting the criteria defined above could be entered into the Master file to build an even stronger dataset for comprehensive analysis.

(3) The dataset can be analyzed to learn the magnitude of grass–climate–soil–human interactions on the outcome of grassland degradation and rejuvenation. The collected data come from a wide range of grasslands spread across diverse landscapes, from semi-desert and arid to semiarid and humid climatic zones, with varying weather conditions. Climatic variability across grassland zones may have had an impact on some aspects of grassland properties such as species composition38, herb abundance39, and shrub encroachment40, as well as belowground properties41. Such analysis may help with understanding the complex interactions among meteorological, topographic, and soil environments and plant community structures and management practices. This type of analysis may have significant societal value for policymakers, researchers, and the general public.
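As an illustration of the meta-analysis use mentioned in item (2), the sketch below pools per-study mean differences with inverse-variance (fixed-effect) weighting. This is one common textbook estimator, shown under the assumption that each study supplies a mean difference and its standard error; it is not necessarily the approach of ref. 37, and a random-effects model may be more appropriate given the between-study variation noted earlier. The function name and inputs are hypothetical.

```python
import math


def pooled_fixed_effect(effects, std_errors):
    """Inverse-variance weighted mean of per-study effects.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns (pooled effect, standard error of the pooled effect).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("effects and std_errors must be non-empty and equal in length")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se
```

With two equally precise studies reporting differences of 10 and 20 (SE = 1 each), the pooled estimate is their simple average, 15, with a smaller standard error than either study alone.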