U.S. cereal rye winter cover crop growth database

Winter cover crop performance metrics (i.e., vegetative biomass quantity and quality) affect the provision of ecosystem services, but they vary widely due to differences in agronomic practices, soil properties, and climate. Cereal rye (Secale cereale) is the most common winter cover crop in the United States due to its winter hardiness, low seed cost, and high biomass production. We compiled data on cereal rye winter cover crop performance metrics, agronomic practices, and soil properties across the eastern half of the United States. The dataset includes a total of 5,695 cereal rye biomass observations across 208 site-years from 2001 to 2022 and encompasses a wide range of agronomic, soil, and climate conditions. Cereal rye biomass values had a mean of 3,428 kg ha−1, a median of 2,458 kg ha−1, and a standard deviation of 3,163 kg ha−1. The data can be used for empirical analyses; to calibrate, validate, and evaluate process-based models; and to develop decision support tools for management and policy decisions.


Background & Summary
Winter cover crops provide many ecosystem services such as weed suppression, improved soil structural and hydraulic properties, increased soil organic carbon (C) stocks, reduced erosion, reduced winter nitrogen (N) leaching, and N provision to cash crops [1–4]. Cover crop biomass production is often positively correlated with ecosystem service provision [1]. For example, previous research has shown that weed biomass often decreases with greater cereal rye (Secale cereale) residue [5,6], and soil organic C often increases with cover crop biomass [7,8]. Similarly, a global meta-analysis found greater reductions in nitrate leaching with increases in non-leguminous cover crop shoot biomass [3]. Elucidating the agronomic, soil, and climate controls on winter cover crop performance can help farmers determine the optimal time to terminate cover crops for maximum agronomic benefits, and support broader adoption of winter cover crops for increased climate resilience, reduced soil erosion, and lower nutrient pollution [2].
We focused on performance data for cereal rye, the most commonly used cover crop in the United States [9,10]. To acquire data for this study, we reached out to potential data contributors through regional cover crop groups and the Precision Sustainable Agriculture and Getting Rid of Weeds networks, two national research consortia with a major focus on cover crop research. Recruiting data contributors through these networks enabled us to compile many plot-level observations from both on-station and on-farm studies (Fig. 1). Our goal was to create a dataset on cereal rye cover crop biomass quantity and quality across heterogeneous agronomic, soil, and climate conditions with broad coverage across the eastern half of the United States (Fig. 1).

Methods
We collected data on cereal rye cover crop performance metrics (biomass, N content, and C:N ratio), additional agronomic and soil data, and metadata such as any associated publications (Table 1). The minimum data required from a location for inclusion in our dataset was aboveground (shoot) cereal rye biomass, the respective harvest or sampling date, the experimental site name, year, latitude and longitude, cereal rye planting date, cereal rye planting method (drilled vs. broadcast), and whether N fertilizer was applied during cover crop growth (Table 1).
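The minimum-data rule above can be sketched as a simple completeness check. This is only an illustration: the field names below are hypothetical placeholders (the published data dictionary defines the actual column names), and treating empty strings and "NA" as missing is an assumption.

```python
# Hypothetical field names standing in for the minimum required variables;
# consult the dataset's data dictionary for the real column names.
REQUIRED_FIELDS = (
    "biomass_kg_ha",
    "sampling_date",
    "site_name",
    "year",
    "latitude",
    "longitude",
    "planting_date",
    "planting_method",
    "n_fertilizer_applied",
)


def missing_required_fields(record):
    """Return the required fields that are absent, empty, or "NA" in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() in ("", "NA"):
            missing.append(field)
    return missing


# Made-up example records, for illustration only.
complete = {
    "biomass_kg_ha": 2500,
    "sampling_date": "2019-04-20",
    "site_name": "Example Farm",
    "year": 2019,
    "latitude": 39.0,
    "longitude": -76.9,
    "planting_date": "2018-10-05",
    "planting_method": "drilled",
    "n_fertilizer_applied": "FALSE",
}
incomplete = dict(complete, planting_method="NA")
```

A record would qualify for inclusion under this sketch only when `missing_required_fields` returns an empty list.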
Additional data gathered included cereal rye cultivar; seeding rate; row spacing; plant population; tiller density; growth stage at sampling; cumulative growing degree days; shoot N (concentration or content); shoot C:N ratio; N fertilizer rate, type (form), and application date; presence of fall tillage; and previous and subsequent cash crop. Root C and N data were requested but not available for any study. Additional soil data gathered included texture class and/or clay, silt, and sand percentages; bulk density; soil pH; soil ammonium, nitrate, and/or total inorganic N; soil organic matter or C; as well as soil sampling depth and timing.
The overwhelming majority of observations were recorded as plot-level data. In the few cases (17 observations) when plot-level data were not available, we collected treatment means with standard deviations (Table 1). Plot sizes varied from as small as 6.1 × 6.1 m to as large as 30.5 × 42.7 m depending on the study.
At least 14 of the 28 studies correspond to existing publications, which are detailed in the study metadata CSV file [11–25]. For these published datasets, the methods for cereal rye biomass collection and some soil analyses are described in each publication. The DOI of any publication associated with the data provided is listed in the "publication_DOI" column of the study metadata file, and "NA" is listed for studies with unpublished data. For unpublished data, the methods for cereal rye biomass collection and some soil analyses are detailed in the "methods_unpublished_data" column of the study metadata file.
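Because shoot N was reported as either a concentration or a content, it can help to convert between the two. The sketch below uses the standard relationships and assumes biomass in kg ha−1 and concentrations as percent of dry mass; the function and parameter names are illustrative, not taken from the dataset.

```python
def shoot_n_content(biomass_kg_ha, n_concentration_pct):
    """Shoot N content (kg N per ha) from dry biomass and N concentration (%)."""
    return biomass_kg_ha * n_concentration_pct / 100.0


def c_to_n_ratio(c_concentration_pct, n_concentration_pct):
    """Shoot C:N ratio from C and N concentrations given in the same units."""
    if n_concentration_pct <= 0:
        raise ValueError("N concentration must be positive")
    return c_concentration_pct / n_concentration_pct


# Made-up values for illustration: 4,000 kg ha-1 of shoot biomass at 1.5% N
# and 42% C.
example_n_content = shoot_n_content(4000, 1.5)
example_cn = c_to_n_ratio(42.0, 1.5)
```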

To focus on the vegetative biomass of cereal rye that is planted in the fall and followed by cash crops in the spring, we only included biomass sampling that occurred in February, March, April, or May. We excluded observations from sites that terminated winter cover crops after May, such as forage experiments where cover crop vegetative biomass and grain were sampled in June or later. We also omitted biomass data outliers that fell above the 99.9th percentile (n = 6) because the upper end of the distribution appeared to have unrealistic values. All figures and summary statistics were created using the statistical software R (v4.1.3) [26] and the following packages: ggplot2 v3.4.1 [27], ggmap v3.0.1 [28], and ggsn v0.5.0 [29].
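The two screening steps described above (keeping February to May samples, then dropping values above the 99.9th percentile) might look roughly like the following. This is a sketch, not the authors' R code: the field names are hypothetical, the order of the two steps is assumed, and the percentile uses linear interpolation between order statistics, which may or may not match the method actually used.

```python
import math
from datetime import date

SPRING_MONTHS = {2, 3, 4, 5}  # February through May


def percentile(values, q):
    """Percentile (q in 0-100) of a non-empty list via linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def filter_observations(observations, q=99.9):
    """Keep February-May samples, then drop biomass above the q-th percentile.

    Each observation is a dict with hypothetical keys "sampling_date"
    (a datetime.date) and "biomass_kg_ha" (a number).
    """
    in_window = [
        obs for obs in observations
        if obs["sampling_date"].month in SPRING_MONTHS
    ]
    if not in_window:
        return []
    cutoff = percentile([obs["biomass_kg_ha"] for obs in in_window], q)
    return [obs for obs in in_window if obs["biomass_kg_ha"] <= cutoff]


# Made-up observations for illustration; the June sample falls outside
# the sampling window and is dropped.
demo = [
    {"sampling_date": date(2020, 4, 1), "biomass_kg_ha": 1000},
    {"sampling_date": date(2020, 6, 10), "biomass_kg_ha": 5000},
    {"sampling_date": date(2020, 3, 15), "biomass_kg_ha": 2000},
]
kept = filter_observations(demo, q=100)
```

With only a handful of values, a 99.9th-percentile cutoff will nearly always remove the maximum, so the demo passes `q=100` to show the month filter alone; the default is meant for datasets with thousands of observations.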

Data Records
The metadata and primary data collected can be accessed through the following repository: https://doi.org/10.5061/dryad.tx95x6b3h [30]. The data are organized in tabular CSV files including a data dictionary, study metadata, rye data, agronomic data, and soil data (Table 1). In 26 out of 245 site-years, location data from private farms were obscured to ensure privacy and have lower location accuracy than the rest of the dataset; those locations are indicated as "TRUE" in the "location_obscured" variable. We reported location data with as much precision as possible for each data point. As a result, the reported latitude and longitude coordinates vary in precision.
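Users who need accurate coordinates (for example, to extract gridded weather or soil data) may want to separate obscured locations and inspect coordinate precision before analysis. The sketch below assumes a study metadata table with the "location_obscured" and "publication_DOI" columns named in the text; the other column names and all rows are made up for illustration.

```python
import csv
import io

# Made-up rows; only "location_obscured" and "publication_DOI" are column
# names taken from the text, the rest are hypothetical.
SAMPLE_METADATA = """site_ID,year,latitude,longitude,location_obscured,publication_DOI
A,2015,39.01,-76.9,FALSE,NA
B,2016,40.8,-77.9,TRUE,NA
"""


def decimal_places(text):
    """Number of digits after the decimal point in a coordinate string."""
    return len(text.strip().partition(".")[2])


def split_by_obscured(rows):
    """Separate site-years with exact locations from obscured ones."""
    exact, obscured = [], []
    for row in rows:
        is_obscured = row["location_obscured"].strip().upper() == "TRUE"
        (obscured if is_obscured else exact).append(row)
    return exact, obscured


rows = list(csv.DictReader(io.StringIO(SAMPLE_METADATA)))
exact, obscured = split_by_obscured(rows)
# Precision of each reported latitude, as a count of decimal places.
latitude_precision = {row["site_ID"]: decimal_places(row["latitude"]) for row in rows}
```

Reading coordinates as strings rather than floats is deliberate here, since converting to a number discards trailing zeros and with them the reported precision.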

Technical Validation
To check the validity of the data we collected, we examined the spread of the data and found some unreasonable values. Cereal rye biomass data outliers that fell above the 99.9th percentile were excluded. The final dataset had a mean of 3,428 kg ha−1, a median of 2,458 kg ha−1, and a standard deviation of 3,163 kg ha−1 (Fig. 2). We also checked overall data validity by assessing whether cereal rye biomass generally increased with time elapsed from planting to termination date. This increase was observed; however, there was a large amount of variability (Fig. 3). This variability in cereal rye biomass production may be explained by differences in agronomic, soil, and climate factors. A subset of commonly reported ancillary data is summarized in Table 2 and Fig. 4.
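The two validation checks above can be approximated with a few lines of code. This is a sketch under stated assumptions: the text does not say how the biomass-versus-time trend was assessed, so an ordinary least-squares slope is used here as a stand-in, and the input values are made up.

```python
import statistics


def summarize(biomass):
    """Mean, median, and sample standard deviation of biomass values."""
    return {
        "mean": statistics.mean(biomass),
        "median": statistics.median(biomass),
        "sd": statistics.stdev(biomass),
    }


def ols_slope(x, y):
    """Least-squares slope of y on x (change in y per unit of x)."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    return sxy / sxx


# Made-up values: days from planting to termination and biomass (kg ha-1).
days_elapsed = [150, 180, 210]
biomass = [1000, 2000, 6000]

summary = summarize(biomass)
slope = ols_slope(days_elapsed, biomass)  # positive if biomass rises with time
```

A mean well above the median, as in the reported statistics (3,428 vs. 2,458 kg ha−1), indicates a right-skewed distribution, so a rank-based measure of association may be a reasonable alternative to a linear slope.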
the acquisition of data and manuscript editing. Ashley L. Waggoner contributed to the acquisition of data and reviewed the manuscript. John M. Wallace contributed to the acquisition of data and reviewed the manuscript. Samantha Wells contributed to the acquisition of data and reviewed the manuscript. Charles White contributed to the acquisition of data and reviewed the manuscript. Bethany Wolters contributed to the acquisition of data and reviewed the manuscript. Alex Woodley contributed to the acquisition of data and reviewed the manuscript. Rongzhong Ye contributed to the acquisition of data and reviewed the manuscript. Eric Youngerman contributed to the acquisition of data and reviewed the manuscript. Brian A. Needelman contributed as a senior author to design, drafts, and revisions. Steven B. Mirsky contributed as a senior author to conception, design, acquisition of data, drafts, and revisions.

Fig. 1 Map of research locations with the point color scaled by the number of available observations.

Fig. 2 Histogram of cereal rye shoot dry biomass data.

Fig. 3 Cereal rye shoot biomass versus time elapsed between cereal rye planting and termination date.

Fig. 4 Summary of planting method data.

Table 1. Structure of the data records with file name, purpose, and variables included for each tabular dataset. Bolded variables were required for inclusion in the (minimum) dataset; other variables were not available for all sites.
Data dictionary: file name, attribute name, attribute definition, string format, unit, number type
Study metadata: site ID, year, state, latitude, longitude, whether location was obscured for privacy, experimental design, number of replications, publication DOI, methods for unpublished data
Soil data: study ID, site ID, year, whether soil samples were taken, sampling event timing and depths, soil ammonium, soil nitrate, total inorganic N, soil texture class, sand %, silt %, clay %, bulk density, percent soil organic matter or carbon, pH

Table 2. Summary of commonly reported ancillary data from the rye growth dataset.