Introduction

A politically popular strategy to mitigate agriculture’s contributions to climate change is drawing down carbon dioxide (CO2) from the atmosphere and storing it in agricultural soils via soil organic carbon (SOC) sequestration. SOC sequestration is central to programs such as the “4 per mille Soils for Food Security and Climate” initiative launched at the COP21 and the agricultural aspects of many countries’ pledges to the Paris Agreement1. Despite its popularity, however, the real potential to sequester substantial SOC through agriculture remains uncertain, in part because of the ubiquitous use of flawed or incomplete methods to measure changes in SOC.

To start, long-term studies are necessary to understand if a practice truly sequesters SOC. Without careful and spatially precise sampling, it can be difficult to distinguish short-term SOC changes from natural variation2. Further, if SOC gains are temporary, they do not mitigate climate change in the long term. To truly mitigate climate change, SOC gains must be maintained for decades or more3. Among the few long-term studies that track SOC, ubiquitous but incomplete methods have muddied our understanding of the true impact of agricultural management on SOC. Despite decades of research highlighting their flaws, studies still may over- or under-estimate SOC change by relying on three common C-accounting shortcuts: (1) comparing treatments without longitudinal baseline data, (2) failing to correct for bulk density (ρb) changes, and (3) sampling only surface soils2,4,5,6.

Many studies lack longitudinal data and instead compare concurrent SOC stocks, substituting a difference over time with a difference over space (“space-for-time”) and assuming implicitly or explicitly that any difference indicates SOC increases in the improved management scenario. However, unless the SOC stocks of the control treatment in such “space-for-time” studies remain constant over time (an unlikely situation in a warming climate7), these studies cannot demonstrate C sequestration8. Relative differences in SOC stocks can be found in scenarios where the control and improved management treatments are both gaining or both losing SOC, telling us little about the actual rates of SOC accrual in either treatment6,9.
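The arithmetic behind this point can be sketched with a toy example. The stock values below are invented for illustration and are not measurements from any study; the function names are likewise hypothetical. Both treatments lose SOC, yet a concurrent comparison reports an apparent gain.

```python
# Hypothetical SOC stocks (Mg C/ha); illustrative values only, not measured data.
baseline = {"control": 100.0, "improved": 100.0}  # stocks at the start
final = {"control": 80.0, "improved": 90.0}       # stocks 30 years later
years = 30


def space_for_time_rate(final_stocks, n_years):
    """Apparent rate from a concurrent comparison (improved minus control)."""
    return (final_stocks["improved"] - final_stocks["control"]) / n_years


def longitudinal_rate(baseline_stocks, final_stocks, treatment, n_years):
    """Rate from the change over time within a single treatment."""
    return (final_stocks[treatment] - baseline_stocks[treatment]) / n_years


apparent = space_for_time_rate(final, years)
actual = longitudinal_rate(baseline, final, "improved", years)
print(f"space-for-time: {apparent:+.2f} Mg C/ha/yr")
print(f"longitudinal:   {actual:+.2f} Mg C/ha/yr")
```

The space-for-time rate is positive only because the control lost more SOC than the improved treatment; the longitudinal rate for the improved treatment is negative.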

Further exacerbating these accounting issues, SOC data are often limited to the surface soil (15 or 30 cm deep). Though there are statistical (strongest signal) and logistical (time, equipment) reasons to sample only surface soils, relying exclusively on surface soils may obscure overall SOC effects. For example, if a treatment increases surface SOC while losing more SOC at unsampled depths, it neither draws down atmospheric CO2 nor increases SOC overall10.

Changes in ρb over time also may obscure SOC accrual. Many studies treat depth from the soil surface as a constant through time and between treatments, despite significant soil expansion or compaction as a result of changing management11,12. These ρb changes can directly affect how much of the soil profile is sampled, making comparisons between treatments difficult5. For example, Guillaume et al. (2022) found a 16% underestimation of treatment effect on SOC when changes in ρb were ignored13. To account for these changes, we must convert depth-based measurements to an unchanging reference system, such as equivalent soil mass (ESM). Institutions such as the IPCC and FAO have emphasized the importance of ESM corrections14,15, but standardization remains lacking in SOC crediting methods4.

When studies use more comprehensive methodologies (change over time, deep sampling, and ESM corrections) while estimating SOC stock changes, they often demonstrate different magnitudes or even directions of SOC change compared to studies using incomplete methods. One such 8-year study by Liu et al. (2022), investigating the effects of no-till and straw retention on SOC in wheat-maize systems, found that tillage redistributed C through the soil profile but did not lead to significant changes in SOC stocks, while straw retention increased SOC stocks regardless of tillage16. These results are inconsistent with those of Al-Kaisi and Kwaw-Mensah (2020) using depth-based methods or Powlson et al. (2014) evaluating only surface soils17,18. Both reported potential for SOC gains in no-till systems, a contradiction potentially explained by using incomplete methods.

While we theoretically understand how these methods may affect results, we have a limited understanding of the general effects and relative impact of using them on real data. The few studies that use more comprehensive methods and report the sensitivity of results to incomplete methods tend to address only one cropping system or method, which limits the generalizability of their results. To truly test the impact of incomplete methods on results, we must compare the results of using them versus comprehensive methods on data from multiple cropping systems in an otherwise controlled study.

While accurate accounting of SOC is important for all major soil types, of particular interest are Mollisols, the high-C, fine-textured grassland soils that support much of global intensive agriculture19. In the Upper Midwest U.S. and elsewhere, Mollisols are used primarily for the production of commodity cash-grains (maize, soybean, wheat) or forages for livestock (maize for silage, alfalfa, pasture), while a common conservation practice is restoration to grassland. To assess the potential of C sequestration in agricultural Mollisols, we assessed SOC over 30 years at the long-term Wisconsin Integrated Cropping Systems Trial (WICST). Our goals were to (1) explore changes in SOC among three cash-grain systems, three dairy-forage systems, and a restored prairie using comprehensive assessment methods and (2) assess how partial methodologies (using space-for-time methods, omitting ESM corrections, and limiting analysis of SOC stocks to surface soils) affect SOC estimates.

Results

SOC change from baseline in years 20 and 30

A general linear mixed effects model of total 0–90 cm SOC stocks from 1989 to 2019 (ΔSOC1989-2019) indicated a significant cropping system effect (p = 0.03). For all cropping systems except Management-Intensive Rotational Grazing (MIRG) and Prairie, we estimated a net loss of SOC (p < 0.05) (Fig. 1). Of the five cropping systems that lost SOC, the cash-grain rotations, Maize, Maize-Soy (MS), and organic Maize-Soy-Wheat (org. MSW), lost the most SOC (−0.90 Mg ha−1 yr−1, −0.79 Mg ha−1 yr−1, and −0.77 Mg ha−1 yr−1, respectively) (Table 1). We found evidence for SOC loss in all depths in Maize, although evidence for surface 0–15 cm losses was weaker (p = 0.09) than for losses deeper in the soil (p < 0.05). In MS we found evidence (p < 0.05) for SOC losses at every depth. Within org. MSW, we found no evidence for SOC gain or loss in the 0–15 cm depth but found strong evidence (p < 0.005) for losses below. The alfalfa-based dairy-forage systems lost SOC at similar annual rates of −0.68 Mg ha−1 yr−1 (p < 0.001) and −0.60 Mg ha−1 yr−1 (p = 0.003) for Maize-alfalfa-Alfalfa-Alfalfa (MaAA) and organic Maize-oats/alfalfa-Alfalfa (org. Mo/aA), respectively (Table 1). In both, lack of evidence for change in surface 0–15 cm SOC stocks belied the significant (p < 0.05) losses in lower depths. In the perennial systems, MIRG and Prairie, there was no evidence for change (0.16 Mg ha−1 yr−1, p = 0.7 and −0.02 Mg ha−1 yr−1, p = 0.91, respectively) in SOC in the 0–90 cm soil profile (Fig. 1). Gains in MIRG’s 0–15 cm depth (0.23 Mg ha−1 yr−1, p = 0.05) were offset by losses in the 15–30, 30–60, and 60–90 cm depths, despite those losses being nonsignificant (p > 0.1) (Table 1). Likewise in Prairie, surface 0–15 cm gains (0.26 Mg ha−1 yr−1, p = 0.04) were offset by nonsignificant losses in lower depths. Our ΔSOC1989-2009 estimates were similar to those reported by Sanford et al.20 in their 20-year analysis (Table 1).

Fig. 1: ΔSOC in the 0–15 cm and 0–90 cm soil profile from 1989 to 2019.

Center bar represents change in soil organic carbon (ΔSOC) between 1989 and 2019 estimated by the linear mixed effects model. Boxes represent +/- the standard error. Whiskers represent upper and lower 90% confidence limits. Letters represent results of pairwise comparisons within each depth at alpha = 0.1. Treatment abbreviations are as follows: maize, cropping system of continuous maize; MS, minimum tillage cropping rotation of maize to soybean; org. MSW, organic cropping rotation of maize to soybean to winter wheat with cover crop; MaAA, cropping rotation of maize followed by 3 years of conventional alfalfa; org. Mo/aA, organic cropping rotation of maize followed by oats/alfalfa followed by alfalfa; MIRG, management intensive rotationally grazed pasture seeded to red clover, timothy grass, smooth bromegrass, and orchardgrass; prairie, cool-season grassy waterways established in 1990 planted to soy in 1998 and to warm-season grass mixes in 1999.

Table 1 ΔSOC (1989–2009 and 1989–2019) estimated by the linear mixed effects model

SOC change between years 20 and 30

Total 0–90 cm SOC did not significantly change in any of the systems between 2009 and 2019 (Supplementary Information, Table S1). However, there were changes within depths. In MS, losses continued in the 15–30 cm depth (−0.29 Mg ha−1 yr−1, p = 0.008). Org. MSW lost 0.47 Mg ha−1 yr−1 (p = 0.06) in the 30–60 cm depth. In org. Mo/aA, there was weak evidence for surface 0–15 cm gains (0.22 Mg ha−1 yr−1, p = 0.096). In MIRG, there was strong evidence for gains in the 60–90 cm depth (0.32 Mg ha−1 yr−1, p < 0.001). Finally, in Prairie, the 0–15 cm depth gained SOC at a rate of 0.77 Mg ha−1 yr−1 (p = 0.01).

Effects of alternative methods

The simulated use of alternative, less comprehensive methods resulted in marked changes to the estimated total ΔSOC1989-2019 of each system (Fig. 2). These changes were not consistent in magnitude across systems. Relative to the comprehensive results, the space-for-time, shallow sampling, and depth-based methods shifted ΔSOC1989-2019 by −0.13 to 0.49, 0.02 to 0.47, and −0.07 to 0.13 Mg ha−1 yr−1, respectively.

Fig. 2: ΔSOC in the 0–15 cm and 0–90 cm soil profile from 1989 to 2019 for different data collection and analysis methods.

Center bar represents change in soil organic carbon (ΔSOC) between 1989 and 2019 estimated by a linear mixed effects model. Boxes represent +/- the standard error. Whiskers represent upper and lower 90% confidence limits. Treatment abbreviations are as follows: maize, cropping system of continuous maize; MS, minimum tillage cropping rotation of maize to soybean; org. MSW, organic cropping rotation of maize to soybean to winter wheat with cover crop; MaAA, cropping rotation of maize followed by 3 years of conventional alfalfa; org. Mo/aA, organic cropping rotation of maize followed by oats/alfalfa followed by alfalfa; MIRG, management intensive rotationally grazed pasture seeded to red clover, timothy grass, smooth bromegrass, and orchardgrass; prairie, cool-season grassy waterways established in 1990 planted to soy in 1998 and to native warm-season grass mixes in 1999.

Incomplete methods altered the trends emerging from the data. Under the space-for-time method, with 2019 MaAA SOC stocks as the baseline rather than 1989 SOC stocks by location, the losses estimated by the comprehensive assessment in Maize and MaAA were no longer statistically different from zero (a change of 0.49 and 0.54 Mg ha−1 yr−1, respectively), and gains in MIRG became statistically significant (a change of 0.43 Mg ha−1 yr−1) (Supplementary Information, Table S2). Shallow sampling (0–30 cm) rather than sampling to 90 cm had a positive impact on apparent ΔSOC1989-2019 across systems, with an average increase of 0.35 Mg ha−1 yr−1. It also reduced variation in ΔSOC1989-2019, especially for the perennial systems MIRG and Prairie. The effect of depth-based sampling without ESM corrections was marginal for the cumulative soil profile, but more pronounced in the 0–15 cm depth (Fig. 2), which had experienced greater changes in ρb. Within the 0–15 cm depth, losses in Maize and MS and gains in Prairie were no longer evident in the depth-based scenario (a change of 0.12, 0.11, and −0.15 Mg ha−1 yr−1, respectively), while gains in org. MSW became statistically significant (a change of 0.08 Mg ha−1 yr−1).

Discussion

All cropping systems lost SOC over a 30-year period except MIRG and Prairie. The maintenance of SOC in MIRG and Prairie may stem from high C inputs (above- and below-ground, see Sanford et al.20) and absence of tillage, providing an ideal environment for soil aggregation and subsequently SOC protection from loss to the atmosphere21,22. Relatedly, Rui et al.23 reported greater carbon use efficiency (CUE) and lower oxidative enzyme activity in MIRG than the other non-Prairie systems at WICST (Prairie was not included in their analysis), suggesting a greater proportion of the C metabolized by the microbial communities in MIRG was assimilated into microbial biomass rather than respired23. Microbial necromass is responsible for much of the C found in mineral-associated organic matter, one of the slowest cycling SOC pools24. Rui et al.23 found that MIRG had more mineral-associated organic carbon than the other non-Prairie systems.

The lack of evidence for SOC change below 15 cm in MIRG and Prairie may reflect limited sample size (n = 12 and n = 6, respectively), leading to greater uncertainty in our estimates of SOC stocks. The experimental design at WICST results in more plots dedicated to the multiple-phase systems such as MaAA and org. Mo/aA than the single-phase systems such as Maize, MIRG, and Prairie. The low number of samples taken for Maize (n = 12 yr−1), MIRG (n = 12 yr−1), and Prairie (n = 18 in 1989 and 2009, n = 6 in 2019) complicates comparisons with the other systems. That said, the lack of evidence for change in deep SOC in MIRG and Prairie may come from true SOC maintenance, as C inputs may balance outputs in those systems. Deep C inputs may come from exudation and turnover of the deeper, more extensive root systems of perennial grasslands25,26. Dissolved organic carbon, thought to be a major precursor of SOC27, also may be percolating through the soil from the C inputs above, given the improved water infiltration associated with perennial grasslands28.

In contrast to the perennial grass systems, we found significant losses in all depth increments below 15 cm of all field crop systems (i.e., Maize, MS, org. MSW, MaAA, and org. Mo/aA). Soil warming driven by a changing climate may be responsible for the release of this deeper SOC, as both laboratory and in situ studies have shown that soil warming can induce SOC loss7,9,29. Globally, loss of subsoil SOC has been observed in both agricultural and natural systems in the past several decades, suggesting a widespread cause such as warming6,30,31.

Additionally, our results imply that reduced tillage alone is insufficient to build or maintain SOC stocks in systems with low C inputs. For example, the minimum-tillage cropping system MS lost a significant amount of SOC despite minimal soil disturbance, likely from the combination of relatively shallow roots and limited biomass returned to the system during the soybean phase. Others have shown that no-till systems require both leguminous cover crops and double cropping to generate C inputs sufficient to offset losses at depth18,32.

The lack of significant changes between 2009 and 2019 in most systems and depths may have several causes. It is possible that, despite ongoing SOC fluxes, a decade was insufficient to accrue detectable changes on these C-rich Mollisols33. Another possibility is that SOC accrual is slowing as these systems approach equilibrium. For MIRG and Prairie, this explanation would align with a meta-analysis of perennial crop age and SOC that found a slowing of SOC gains in perennial systems around 20 years34. Using this 20-year benchmark, MIRG should have approached SOC equilibrium around 2009, while the Prairie treatment (established in 1999) should have approached equilibrium around 2019, explaining both the maintenance of SOC in MIRG’s upper depths and the ongoing accumulation in Prairie’s surface 15 cm.

The use of space-for-time, shallow sampling, or depth-based methods generally resulted in overestimation of ΔSOC. In the case of space-for-time, ΔSOC was overestimated for most systems, although it was underestimated for org. MSW and Prairie. Space-for-time studies rely on two assumptions: first, that the baseline system (here, MaAA) is at equilibrium, and second, that SOC stocks across systems were reasonably similar at the beginning of the experiment. Our longitudinal data undercut the first assumption, since MaAA has lost SOC at a rate of about 0.68 Mg ha−1 yr−1 since the start of the experiment. As for the second assumption, despite our randomized complete block design, baseline 1989 SOC stocks for each system were quite variable (Supplementary Information, Table S3), which may explain why systems were impacted differently by the switch to the space-for-time method. Using the space-for-time method, one may conclude that MIRG was sequestering SOC, Maize and MaAA were SOC-stable, and only MS, org. MSW, and org. Mo/aA were losing SOC, trends that have been noted in many space-for-time studies6,35,36. However, using the comprehensive longitudinal data, we arrived at less sanguine conclusions.

If we had sampled only to 30 cm depth instead of 90 cm, we would have overestimated ΔSOC for all the systems, with an average increase in ΔSOC of 0.35 Mg ha−1 yr−1. Under this sampling scheme, estimated losses, especially in Maize, MaAA, and org. Mo/aA, are reduced, making these annual and semi-perennial systems appear more climate-smart than the comprehensive analysis indicates. Losses of SOC below 30 cm may represent a blind spot in climate models and C market ventures, many of which assume SOC stocks are stable or increasing on agricultural land with improved management (see Mathers et al.37), such as minimum-tillage (MS), organic management (org. Mo/aA), and/or cover crops (org. MSW)38.

If we had neglected to account for changes in bulk density (ρb) using ESM conversions, and instead relied on depth-based measurements, the changes in the total 0–90 cm ΔSOC would have been marginal, ranging from −0.07 Mg ha−1 yr−1 in MIRG to 0.13 Mg ha−1 yr−1 in MaAA. Within the 0–15 cm depth, which was most likely to experience change in ρb due to management, the use of depth-based measurements had divergent effects on Prairie versus the rest of the systems (Supplementary Information, Table S2). This divergence likely occurred because Prairie’s ρb decreased whereas all other systems’ ρb increased. Increased ρb correlates with increases in ΔSOC when switching to a depth-based method, since increased ρb leads to more soil (and thus more SOC) being sampled5.

Sensitivity of our comprehensive results to commonly-used incomplete methods, especially space-for-time, demonstrates the importance of long-term, longitudinal, compaction-adjusted studies that measure the entire soil profile. Relying solely on less comprehensive methods, as many studies do, may lead to overestimations of C sequestration. This suggests that C sequestration in agricultural soils may not be as effective in mitigating climate change as previously thought, and other tactics (e.g., reducing combustion of fossil fuels) should be pursued.

The common field crop systems evaluated in this study were sources, not sinks, of atmospheric CO2. Common SOC accounting methods (space-for-time, depth-based, and shallow sampling) obscured the magnitude of SOC losses in these soils, and in the case of space-for-time, suggested SOC gains in rotationally-grazed pastures where more comprehensive methods showed none. While C sequestration in productive, agricultural soils may not be possible, our work aligns with others suggesting that permanently restored and well-managed perennial grasslands are the best options to mitigate agricultural soils’ CO2 emissions.

Methods

Site description and history

We conducted this work at the University of Wisconsin-Madison’s Arlington Agricultural Research Station, in Arlington, WI (43°18’N, 89°20’W) on soils classified as Plano silt loam (fine-silty, mixed, superactive, mesic Typic Argiudolls). These are relatively deep (about 1 m), well-drained soils with little relief that developed under tallgrass prairie vegetation (see Curtis, 195939) in loess deposits over calcareous glacial till. Mean annual temperature and precipitation between 1991 and 2020 were 7.8 °C and 902 mm, respectively40.

Conversion of tallgrass prairie vegetation to row-crop agriculture (primarily wheat) began in the 1840s with the attempted removal of Indigenous peoples such as the Ho-Chunk by northern Europeans. Between the 1860s and the middle of the 20th century, after wheat yields crashed due to soil degradation and pest pressure, crops for dairy cattle feed (e.g., alfalfa, clovers, maize, oats) predominated. While agricultural practices shifted substantially following the Green Revolution, dairy forage systems remained dominant at the site, and from the 1960s until the establishment of WICST in 1989, maize (Zea mays L.), alfalfa (Medicago sativa L.), and soybeans (Glycine max (L.) Merr.) were the main crops grown, with dairy manure applied for fertility (see Posner et al.41 for additional historical details).

In 1989, maize was planted across the 24-ha site to improve soil uniformity and allow for baseline measurements (crop yield and soil parameters), which were used to determine the boundaries of each block in the core experiment’s four-block randomized complete block design41. In 1990, six cropping systems were initiated in a staggered start beginning with the legume phase of each rotation (if present). To reduce the potential confounding influence of yearly weather variability, every phase of every cropping rotation is present in one plot per block every year. For example, continuous maize (Maize) has one plot per block, while maize followed by 3 years of alfalfa (MaAA) has four plots per block. These rotations represent three cash-grain and three dairy-forage systems common to the Upper Midwest U.S.

The three cash-grain systems include high-input, or chemically and mechanically intensive, continuous maize (Maize), moderate-input minimum-tillage maize-soybean (MS), and organically managed maize-soybean-winter-wheat with a cover crop (org. MSW) (Table 2). The Maize system represents a conventional, continuous maize rotation, using synthetic fertilizer, tillage and herbicides for weed control, and top-yielding maize hybrids with advanced genetic traits such as herbicide resistance and insecticide (Bt) production. The MS system represents a maize-soybean rotation using minimum tillage, synthetic fertilizers, herbicides for weed control, and top-yielding maize hybrids and soybean varieties with advanced genetic traits. Soybeans are no-till drilled into maize stover, and maize is strip-till planted in soybean stubble. The org. MSW is managed according to the USDA’s National Organic Program Standards. Fertility comes from biologically fixed N, composted poultry manure, and potassium sulfate. Weeds are controlled via tillage and cover crops. High-yielding organically-certified maize hybrids and organically-certified soybean and soft red winter wheat varieties are planted. Following wheat harvest mid-summer, a berseem clover and oats cover crop is planted and grows until terminated the following spring before maize is planted.

Table 2 Wisconsin Integrated Cropping Systems Trial (WICST) experimental cropping system descriptions

The three dairy-forage systems include high-input maize-alfalfa (MaAA), organic maize-oats/alfalfa-alfalfa (org. Mo/aA), and management-intensive rotationally grazed pasture (MIRG) (Table 2). The MaAA system represents a high-input, high-yielding dairy forage rotation, with fertility derived primarily from the application of dairy manure, weed control via herbicides, and top-yielding maize hybrids and alfalfa varieties with advanced genetic traits. Typically, two forage harvests occur during the alfalfa seeding year and between three and four harvests occur during other years of the alfalfa phases of the rotation. The org. Mo/aA system is managed according to the USDA’s National Organic Program Standards. Fertility comes from the application of dairy manure, weeds are controlled with tillage, and organically-certified top-yielding maize hybrids and organically-certified leafhopper-resistant alfalfa varieties are planted. Oats are planted with alfalfa to suppress weeds and assist alfalfa establishment in the alfalfa seeding year. Typically, two forage harvests occur during this establishment year: a first harvest of primarily oat biomass and a second harvest of primarily alfalfa biomass later in the growing season. Between three and four alfalfa-only harvests occur during the second year of the alfalfa phase of the rotation. The MIRG system is designed to represent a heifer-raising operation. Heifers rotationally graze the pasture plots from May to October, with a new subsection of the pasture available each day. Fertility comes from biologically fixed N and manure as well as synthetic fertilizer applied according to University-recommended best management practices. Red clover is seeded as necessary to maintain 35% legume levels in the cool-season grass pasture.

Plots are relatively large (~0.3 ha, 18 × 156 m) and all fieldwork is carried out with production-scale farm equipment, with fertilizer, pesticides, and other inputs applied according to University-recommended best management practices. Additional details on the design of WICST were presented in Posner et al.41. In 1990, grassy waterways were established where precipitation preferentially flows in the field. In 1998, these grassy waterways were chisel-plowed, sprayed with glyphosate, and planted to 1 year of soybeans. In 1999, a mix of native prairie legumes, forbs, and C4 grasses was established in a three-block randomized complete block design with two treatments: a high-diversity mix (25 species) and a low-diversity mix (6 species), for a total of six plots of restored prairie (Prairie) (Supplementary Information, Fig. 1)42.

Soil sampling

In 1989, prior to the establishment of WICST, baseline soil samples were collected across the entire 24-ha field by points on a 27 × 27-m grid. At each sampling point, four cores were taken, divided and homogenized by depth (0–6, 6–12, 12–24, and 24–36 inches, or 0–15, 15–30, 30–60, and 60–90 cm, hereafter in cm). These increments were first chosen to align with standard agronomic soil tests based on surface soil (agronomic fertility, 0–15 cm) and deep soil sampling (soil NO3–N, 0 to 90 cm). In addition to initial agronomic considerations, the glacial till (found at 80 to 120 cm) precluded consistent deeper sampling across the entire experiment. After initial analysis in 1989, the samples were dried, ground, and archived. In 2009, these dried homogenized samples were cleaned of visible plant material and analyzed for C content, which is expected to be stable in dried archival samples43. Full details on soil sampling and processing of the samples from 1989 can be found in Sanford et al.20.

In June 1989, cores for ρb were collected at two sampling depths (0–15 and 15–30 cm) on the same 27 × 27-m grid. Given that detectable change in ρb below 30 cm between 1989 and 2009 was unlikely, ρb values from 2009 were used for 1989 below 30 cm (see Sanford et al.20). Comparison of the 2009 and 2019 ρb supported this assumption, as no significant differences were found between the two most recent timepoints below 30 cm. Therefore, like Sanford et al.20, we used ρb values from 2009 for the 30–60 and 60–90 cm depths in 1989.

In April through July 2009 and October through November 2019, samples for ρb and SOC analysis were collected at the center of each third of each plot (North, Center, South). Full details on soil sampling and processing of the samples from 2009 can be found in Sanford et al.20. In 2019, we collected cores for ρb in the autumn using a 5.4-cm-diameter hydraulic corer. Missing or damaged cores for ρb (3.5% of total cores) were recollected in October 2020. Cores were divided into 0–15, 15–30, 30–60, and 60–90 cm sections, as in 2009. Prior to weighing, we dried the sections in a 50 °C oven until weights stabilized. After weighing, we sieved the sections to 2 mm to remove rock fragments from glacial till. We subtracted the weight and the estimated volume of these fragments from the total section weight and volume prior to ρb calculation, so that ρb estimates represent the same soil as the SOC measurements (see von Haden et al.5). After this adjustment, we calculated ρb by dividing the adjusted dry weight of the section by its adjusted volume.
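The rock-fragment adjustment described above can be sketched as follows. This is a minimal illustration, not the procedure used for the WICST data: the text does not state how fragment volume was estimated, so the sketch assumes it is derived from fragment mass and an assumed particle density of 2.65 g cm−3. The function names and the example masses are hypothetical.

```python
import math

# Assumed particle density of rock fragments (g/cm^3); not given in the text.
ROCK_DENSITY_G_CM3 = 2.65


def core_section_volume(diameter_cm, top_cm, bottom_cm):
    """Volume (cm^3) of a cylindrical core section between two depths."""
    radius = diameter_cm / 2.0
    return math.pi * radius ** 2 * (bottom_cm - top_cm)


def fine_earth_bulk_density(dry_mass_g, rock_mass_g, diameter_cm,
                            top_cm, bottom_cm,
                            rock_density=ROCK_DENSITY_G_CM3):
    """Bulk density (g/cm^3) after removing rock mass and rock volume."""
    rock_volume = rock_mass_g / rock_density
    soil_mass = dry_mass_g - rock_mass_g
    soil_volume = core_section_volume(diameter_cm, top_cm, bottom_cm) - rock_volume
    return soil_mass / soil_volume


# Hypothetical 0-15 cm section from a 5.4 cm diameter core.
uncorrected = 460.0 / core_section_volume(5.4, 0.0, 15.0)
corrected = fine_earth_bulk_density(460.0, 10.0, 5.4, 0.0, 15.0)
print(f"uncorrected: {uncorrected:.3f} g/cm3, corrected: {corrected:.3f} g/cm3")
```

Because rock fragments are denser than bulk soil, removing them lowers the estimate, which keeps ρb on the same fine-earth basis as the SOC concentration.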

To align the original 1989 sampling grid with the 2009 and 2019 data, which were collected at locations within plots not yet established in 1989, we georeferenced a map of the original sampling grid and determined the location of the original sampling points relative to plot-level sampling points using ArcGIS. We then used ordinary kriging to spatially interpolate the 1989 data and digitally resample at the plot-level sampling points using R package gstat (v4.3.1). This differs from the approach of Sanford et al. (2012), who used an unweighted average of the sampling points that fell within each plot20.
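For readers unfamiliar with ordinary kriging, the idea can be sketched in a few lines of plain Python. This is not the gstat workflow used in the study: the exponential variogram and its parameters (sill, range, nugget) are placeholder assumptions rather than fitted values, the four grid points and their SOC values are invented, and the helper names are hypothetical.

```python
import math


def exp_variogram(h, sill=1.0, range_m=100.0, nugget=0.0):
    """Exponential semivariogram; the parameter values are placeholders."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-h / range_m))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ordinary_kriging(points, values, target):
    """Return (estimate, weights) at target; weights are constrained to sum to 1."""
    n = len(points)
    # Variogram matrix bordered by the unbiasedness constraint.
    a = [[exp_variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(distance(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # last element is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights


# Hypothetical SOC concentrations (%) at four corners of a 27 x 27 m grid cell.
grid = [(0.0, 0.0), (27.0, 0.0), (0.0, 27.0), (27.0, 27.0)]
soc = [2.0, 2.4, 2.2, 2.6]
estimate, weights = ordinary_kriging(grid, soc, (13.5, 13.5))
print(f"estimate at cell center: {estimate:.2f}%")
```

With a zero nugget the predictor reproduces the observed value at a sampled point, and at the cell center the four weights are equal by symmetry. In practice the variogram would be fitted to the data rather than assumed.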

Organic C determination

Because inorganic C in these soils is negligible (~0.005%)44, we used total soil C interchangeably with SOC. To confirm this assumption and verify that routine applications of aglime (CaCO3 or CaMg(CO3)2) had not significantly increased inorganic C in the surface 15 cm, we measured organic C of samples from all recently limed plots using standard methods provided by Thermo Finnigan for use with our Flash EA 1112 CN Automatic Elemental Analyzer45. We found no statistical evidence that SOC was different from total soil C, so we continue to assume total soil C is interchangeable with SOC.

Finely powdered subsamples of the dried soils from each depth (0–15, 15–30, 30–60, and 60–90 cm) at each point in the 1989 grid or in each location (N, Ct., S) in each plot were weighed into 5 × 9-mm tin capsules. While 8 to 10 mg of soil was used for all depths of the 1989 and 2009 soils (see Sanford et al.20), a soil mass of 15, 18, 30, and 50 mg was used for the 0–15, 15–30, 30–60, and 60–90 cm depths, respectively, in 2019 to accommodate declining C content with depth and enhance the signal-to-noise ratio. Total C concentration for each sample was determined by dry combustion using a Flash EA 1112 CN Automatic Elemental Analyzer (Thermo Finnigan, Milan, Italy)45.

Estimating SOC stocks

To determine C stocks, rather than using depth from surface, we used cumulative mineral soil mass from surface as it is a reference system that remains stable over time (i.e., not impacted by changes in ρb or changes in organic matter)5. The ESM method accounts for compaction, expansion, and addition or loss of organic matter, ensuring the same section of the soil profile is considered each time. For the 20-year analysis, Sanford et al. (2012) used Excel software to convert depth-based measurements to ESM to calculate the SOC stock per unit area20. Here, for the 30-year analysis, these calculations were performed using R code provided by von Haden et al.5.
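The logic of the ESM conversion can be sketched as below. This is a simplified illustration, not the R code of von Haden et al.: it uses piecewise-linear interpolation along the cumulative mass profile for brevity, and it assumes an organic matter to organic C mass ratio of 1.9 to obtain mineral mass. The profile values and function names are hypothetical.

```python
from itertools import accumulate

# Assumed mass ratio of soil organic matter to SOC; not stated in the text.
SOM_PER_SOC = 1.9


def cumulative_profiles(thickness_cm, bulk_density, soc_pct):
    """Cumulative mineral soil mass and SOC stock (Mg/ha) at layer boundaries."""
    # 1 g/cm^3 over 1 cm of depth equals 100 Mg/ha.
    soil = [bd * t * 100.0 for bd, t in zip(bulk_density, thickness_cm)]
    soc = [mass * pct / 100.0 for mass, pct in zip(soil, soc_pct)]
    mineral = [mass - SOM_PER_SOC * c for mass, c in zip(soil, soc)]
    return [0.0] + list(accumulate(mineral)), [0.0] + list(accumulate(soc))


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; x must lie within the range of xs."""
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("reference mass is outside the sampled profile")


def esm_soc_stock(reference_mass, thickness_cm, bulk_density, soc_pct):
    """SOC stock (Mg/ha) held in a fixed reference mass of mineral soil."""
    cum_mineral, cum_soc = cumulative_profiles(thickness_cm, bulk_density, soc_pct)
    return interpolate(reference_mass, cum_mineral, cum_soc)


# Hypothetical profile: same SOC concentrations, surface layer compacted later.
thickness = [15, 15, 30, 30]
soc_pct = [2.5, 1.8, 0.8, 0.4]
bd_before = [1.20, 1.35, 1.40, 1.45]
bd_after = [1.30, 1.35, 1.40, 1.45]

reference = cumulative_profiles(thickness, bd_before, soc_pct)[0][1]
depth_before = cumulative_profiles(thickness, bd_before, soc_pct)[1][1]
depth_after = cumulative_profiles(thickness, bd_after, soc_pct)[1][1]
esm_after = esm_soc_stock(reference, thickness, bd_after, soc_pct)
print(f"0-15 cm fixed depth: {depth_before:.2f} -> {depth_after:.2f} Mg C/ha")
print(f"equivalent soil mass: {depth_before:.2f} -> {esm_after:.2f} Mg C/ha")
```

In this example compaction alone makes the fixed-depth stock appear to rise, while the stock held in the reference soil mass is unchanged.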

Analysis of SOC change over time

General linear mixed effects models (PROC GLIMMIX, SAS v9.4) were used to: (1) estimate change in SOC within each system between 1989, 2009, and 2019 (ΔSOC1989-2009, ΔSOC1989-2019, ΔSOC2009-2019), and (2) compare the estimated change in SOC between systems. The dataset was subset by depth (0–15, 15–30, 30–60, 60–90, and 0–90 cm) and year (2009 and 2019). For each subset, to address the potential influence of outliers, any ΔSOC value more than 2 SD from the mean for its system was removed. Then, for each subset, a general linear mixed effects model was used to estimate ΔSOC as a function of cropping system. In this analysis, system (Maize, MS, org. MSW, MaAA, org. Mo/aA, MIRG, Prairie) was treated as a fixed effect, and block (1–4 for the core trial, 5–7 for the prairie) was treated as a random effect. Because of expected spatial and temporal correlations in the data, and to allow variance heterogeneity between cropping systems, we chose a variance-covariance matrix with a first-order heterogeneous autoregressive structure (subject = plot, group = system). Within SAS, LSMEANS was used to determine means, standard errors, and p-values for each cropping system. Because of the inherent spatial variability of SOC, we used an alpha level of 0.1 for all statistical comparisons.
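The outlier screen that precedes model fitting can be sketched as below. This covers only the screening step, not the mixed model itself, and it assumes the sample standard deviation is used within each system; the ΔSOC values and the function name are invented for illustration.

```python
from statistics import mean, stdev


def screen_outliers(deltas_by_system, n_sd=2.0):
    """Drop values more than n_sd sample SDs from their own system's mean."""
    screened = {}
    for system, deltas in deltas_by_system.items():
        center = mean(deltas)
        spread = stdev(deltas)  # requires at least two values per system
        screened[system] = [d for d in deltas if abs(d - center) <= n_sd * spread]
    return screened


# Hypothetical plot-level changes (Mg C/ha/yr) for one depth and year subset.
deltas = {
    "Maize": [-0.90, -1.00, -0.80, -0.85, -0.95, -0.90, -0.88, -0.92, 3.00],
}
kept = screen_outliers(deltas)
print(len(deltas["Maize"]), "values before,", len(kept["Maize"]), "after")
```

Note that a single extreme value inflates both the mean and the SD it is judged against, so a 2 SD rule is conservative in small samples.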

Sensitivity analysis of ΔSOC methods

To examine the influence of the SOC stock estimation methodology (space-for-time, depth-based, and shallow sampling) on estimates of ΔSOC1989-2019, we analyzed three datasets, each produced with one of these methods. These were then compared to a dataset generated using a comprehensive SOC stock estimation method based on longitudinal measurements, ESM correction, and the full 90-cm soil profile (Supplementary Information, Table S4). In the simulated space-for-time dataset, only data from 2019 were used. MaAA was chosen as the baseline scenario for calculating ESM conversions and change-from-baseline since it is most like the crop rotation present at the site prior to the establishment of WICST and would thus be assumed to be at SOC equilibrium in a traditional space-for-time analysis. Equivalent soil mass and change since 1989 were determined using an average of MaAA plots in each block in 2019 as the baseline. For the shallow-sampling dataset, data were converted to ESM to account for ρb changes, then SOC data below 30 cm were removed, simulating a scenario where we did not measure below 30 cm. For the depth-based dataset, the full 90-cm SOC stock was included, but not translated to ESM coordinates. Each simulated dataset was then analyzed for change over time using the same linear mixed effects model as the comprehensive dataset to determine estimates of ΔSOC1989-2019. To quantify the impact of less comprehensive methods on SOC stock estimates, ΔSOC1989-2019 obtained via the comprehensive method was subtracted from ΔSOC1989-2019 obtained via each alternative method (space-for-time, depth-based, shallow sampling) to calculate Δ(ΔSOC1989-2019), so that positive values indicate that the alternative method yielded a higher ΔSOC.
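The final comparison step amounts to a per-system difference. The sketch below assumes the sign convention implied by the reported results, alternative minus comprehensive, so that a positive value means the alternative method gave a higher ΔSOC. The comprehensive rates are those quoted in the Results; the space-for-time rates are back-calculated here from those rates and the quoted changes, not taken from Table S2, and the function name is hypothetical.

```python
# Comprehensive 0-90 cm rates (Mg C/ha/yr) quoted in the Results.
comprehensive = {"Maize": -0.90, "MaAA": -0.68, "MIRG": 0.16}

# Back-calculated for illustration (comprehensive rate plus quoted change);
# these are assumptions, not values read from the supplementary tables.
space_for_time = {"Maize": -0.41, "MaAA": -0.14, "MIRG": 0.59}


def method_effect(alternative, reference):
    """Per-system shift in estimated rate: alternative minus reference."""
    return {s: round(alternative[s] - reference[s], 2) for s in reference}


effect = method_effect(space_for_time, comprehensive)
for system, shift in effect.items():
    print(f"{system}: {shift:+.2f} Mg C/ha/yr")
```

Under this convention, all three systems shift upward when the space-for-time baseline replaces the 1989 measurements.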

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.