How ecosystem productivity and species richness are interrelated is one of the most debated subjects in the history of ecology1. Decades of intensive study have yet to discern the actual mechanisms behind observed global patterns2,3. Here, by integrating the predictions from multiple theories into a single model and using data from 1,126 grassland plots spanning five continents, we detect the clear signals of numerous underlying mechanisms linking productivity and richness. We find that an integrative model has substantially higher explanatory power than traditional bivariate analyses. In addition, the specific results unveil several surprising findings that conflict with classical models4,5,6,7. These include the isolation of a strong and consistent enhancement of productivity by richness, an effect in striking contrast with superficial data patterns. Also revealed is a consistent importance of competition across the full range of productivity values, in direct conflict with some (but not all) proposed models. The promotion of local richness by macroecological gradients in climatic favourability, generally seen as a competing hypothesis8, is also found to be important in our analysis. The results demonstrate that an integrative modelling approach leads to a major advance in our ability to discern the underlying processes operating in ecological systems.
Ecosystem productivity and species diversity are essential to the ability of natural systems to provide goods and services. Yet, for decades there has been debate over their interrelationship. In the 1970s and 1980s, conflicting models predicted that elevated productivity would lead to reductions in species richness4,5,6,7. Beginning in the mid-1990s, scientists started to seriously debate another possibility: that richness could promote productivity9,10,11,12. While experimental studies generally support a biodiversity enhancement of productivity13,14, the precise strength of the effect in natural systems and the relationship of this process to other factors that can influence productivity remain major questions. Adding to the debate, macroecological theories propose that regional diversity is controlled by gradients in climatic favourability and evolutionary history15 and that these larger-scale effects are important determinants of smaller-scale diversity patterns8.
The search for a canonical bivariate productivity–richness relationship lies at the heart of the debate among ecologists. This pursuit is fuelled, in part, by the history of the discussion, which has focused on bivariate predictions16. At the same time, it is also seen by some as a means of assessing the overall importance of various mechanisms operating in natural systems. While many different mechanisms have been discussed, the primary competing theories make four main conflicting predictions: (1) richness and productivity should increase together with increasing resources and environmental favourability until limits to coexistence are reached at high productivity and richness declines, producing a hump-shaped relationship4,5,6,7,17; (2) richness promotes productivity, leading to a positive relationship6,9; (3) richness and productivity increase together because climatic gradients in productivity lead to increased regional species pools, creating a positive relationship but from a separate mechanism8; and (4) the richness–productivity relationship will be of inconsistent form because the mechanisms controlling them vary in their scale-dependence and relative importance18,19.
Empirical tests of the generality of hypothesized bivariate productivity–richness patterns have reported a wide variety of results and have produced substantial discussion20,21,22,23. Recent global studies2,3 have disagreed with regard to whether a coherent pattern exists for natural grasslands. What has been agreed upon, however, is that the low explanatory power coming from conventional analyses suggests the need to pursue an integrative understanding of the causal mechanisms controlling productivity–richness relationships.
One potential explanation for why debate over mechanisms is proving difficult to resolve is that productivity and richness are jointly controlled by a complex network of processes1,21,24,25,26,27,28. Overcoming the challenge of evaluating more complex hypotheses requires both advanced statistical modelling approaches and large-scale systematic data collection efforts. Here, we used structural equation modeling29 to integrate key predictions from competing theories into a multi-process hypothesis for evaluation. We then evaluated the hypothesis using data collected for that purpose by a global consortium, the Nutrient Network (http://nutnet.org). The data collected comprise samples from 1,126 plots collected at 39 grass-dominated sites around the world. Variables measured include plant species richness, productivity (measured as the annual biomass increment), total biomass (the accumulated non-woody biomass, live and dead, including litter), along with many of the drivers hypothesized to be important for regulating their variations. Additional information is provided in the Methods.
To integrate theoretical expectations from competing theories, we mined the productivity–diversity literature to determine the main theoretical constructs discussed and the hypothesized interconnections between constructs (see Methods). We used this information to develop a structural equation meta-model that assimilates the essential theoretical constructs and hypothesized connections into a network of multivariate expectations (Extended Data Fig. 1, Extended Data Table 1 and Supplementary Information). This meta-model, along with the available data, guided our development of a structural equation model for empirical evaluation. We evaluated model-data consistency to determine whether there were missing linkages in the initial model as well as to determine the support for proposed links. We further addressed the question, ‘what dimension of model (that is, number of parameters and linkages) is required to detect the signals in the underlying data?’. For this, we evaluated lower-dimensional versions of the model by removing linkages and re-evaluating against the data. More methodological detail is provided in the Methods and Supplementary Information.
A simple bivariate plot of richness against productivity (Fig. 1A) reveals little about the underlying mechanisms. Previous analyses of such bivariate relations have found it difficult to even detect significant associations2,30. However, our analysis based on an integrative model reveals strong, clear signals consistent with numerous proposed mechanisms, including several that are not at all suggested from the bivariate data.
First, we found clear evidence that the accumulation of total biomass (hereafter simply ‘biomass’) leads to a negative effect on species richness. At the site level, the partial effect (r∂) of biomass on richness in the model was strong (Figs 1B, a and 2; r∂ = −0.77). The reduction of richness was not found to be mediated by our one-time measurement of average shading at the ground surface, which was subsequently dropped from the model. At the plot level, however, we found evidence that biomass increases shading (r∂ = 0.56), which in turn, decreases richness (Figs 1B, d, e and 2; r∂ = −0.34). The negative effects of biomass on richness appear consistent with long-standing hypotheses that predict a hump-shaped productivity–richness relationship due to competitive dominance at high productivity5,17. However, while those hypotheses assume increasing competitive intensity with increasing productivity, our results reveal a linear effect across the full range of biomass observed in this study (Fig. 1B, a).
Second, we found a positive, linear enhancement of productivity by richness in the model. This effect was among the strongest found at the site-scale (Figs 1B, b and 2; r∂ = 0.67), and was detectable, although weak, at the plot-scale (Figs 1B, f and 2; r∂ = 0.02). A surprising feature of the site-level result is the apparent absence of a levelling off of the biodiversity enhancement of productivity at higher levels of richness. Such a continuous effect has been theorized for larger-scale studies and contrasts with the asymptotic levelling off usually found in experimental (smaller-scale) studies13. Previous attempts to isolate an effect of richness on productivity with observational data using simpler models have failed to do so (see Supplementary Information).
Third, we found strong and independent influences of macroclimate and soils on richness and productivity. The standardized effect sizes provide insights into the relative importance of these processes (Table 1). At the site level, productivity was most strongly related to soil fertility, while richness was most strongly related to climate and soil suitability, with heterogeneity and disturbance also important (Fig. 2). Rather than being made up of similar environmental factors, the soil environmental drivers of richness and productivity were negatively correlated at the site level (Table 1, r∂ = −0.56), supporting previous claims of their semi-independence21. Thus, theories that presume a simultaneous increase in productivity and richness with increasing environmental favourability (see Supplementary Information) fail to correspond with the independent responses to environmental drivers observed in natural systems.
Our results show that failure to account for the variation in richness and productivity explained by the environmental drivers would make it difficult to detect the reciprocal influences of productivity and richness on each other. In fact, our capacity to isolate underlying processes was highly sensitive to model dimensionality, where dimensionality refers to the number of measured determinants of productivity and richness included in the model (Extended Data Table 2). At both site and plot levels, models omitting either productivity or biomass (but not both) still permitted us to detect the feedback from richness to biomass production. Any other simplifications at the site-level, however, resulted in a failure to detect previously detected pathways and resulted in a dramatic loss of signal (as indicated by reduced values of R2 in the model).
Regarding scale dependence, plot-level values of productivity, biomass, and richness were strongly related to site-level estimates (Fig. 2), as is common with hierarchical data. This should be interpreted as meaning much of the overall plot-to-plot variation in productivity, biomass, and richness can be ascribed to site-to-site variations in those properties. In this case, within-site variations in productivity were explained solely by site-level productivity, as there were no predictors for remaining among-plot variations. Within-site variations in richness, however, were additionally explained by within-site variations in soil suitability and shading. Also sensitive to scale was the strength of the feedback from richness to productivity, which was much stronger at the site scale. While multiple factors probably play a role in this scale-dependence, the simplest explanation here may be the smaller span of conditions sampled within sites compared with across sites.
Finally, in contrast to a bivariate model, which our analyses suggest can explain no more than 10% of the observed variation in richness, our structural equation model explains 61% of the variation in richness among sites, and 65% of the variation in richness among plots. An ability to explain a substantial portion of the variation in richness is tremendously important for potential conservation applications. Model complexity is also important because of its more detailed mapping onto nature: our model can make statements about how specific management actions (such as reduction of biomass through mowing or increase in soil fertility through fertilization), as well as shifts in climate conditions, may alter both productivity and species richness.
Our findings give reason for optimism about the future of ecology as a more precise and less ambiguous science. We show that many of the proposed processes connecting productivity and richness offered during previous decades operate simultaneously as parts of a whole system of effects. Details of the findings, however, refine many of our assumptions about how those processes operate. Our field’s previous failure to resolve debate about productivity–richness relationships stems from a lack of integration of ideas and absence of simultaneous tests of their combined implications. By integrating and testing those ideas, our approach provides a systems-level understanding and improves our chances to foresee the possible consequences of human alteration of environmental factors, productivity, and richness now occurring worldwide.
Development of meta-model hypothesis
A review and accounting of the history of claims and disputed points in the published literature was developed before construction of the meta-model that guided this analysis (Extended Data Fig. 1 and Supplementary Information). During this review, attention was paid to the theoretical constructs invoked by various authors, since our goal was to provide a framework that had the potential to clarify and resolve disputed points. Attention was also paid to types of variable measured by different authors, as the relationship between constructs and measurements constitutes one of the several sources of ambiguity and confusion31,32. An in-depth description of the literature synthesized to generate the meta-model is presented in the Supplementary Information.
Data collected by the Nutrient Network Cooperative33 were used to design and evaluate a structural equation model based on the meta-model presented. The Nutrient Network is a distributed, coordinated research cooperative. Sites in the Network are dominated primarily by herbaceous vegetation and intended to represent natural/semi-natural grasslands and related ecosystems worldwide. Individual sites were selected to accommodate at least a 1,000 m2 study design footprint. Most sites sampled vegetation in 2007, although 12 sites sampled in 2008 or 2009. No statistical methods were used to predetermine sample size. Samples were collected using a completely randomized block design. The standard design has three blocks and ten plots per block at each site, although some sites deviate slightly from this design. A few sites were grazed or burned before sampling, consistent with their traditional management. Further details on site selection and design can be found at http://www.nutnet.org/exp_protocol.
In this study, we analysed data from 39 of the 45 sites considered in ref. 2 possessing a complete set of covariates (Extended Data Table 3). While ref. 2 only examined bivariate relations between productivity and richness, our analyses brought in many additional variables (Extended Data Table 1) so that we could address the many hypotheses embodied in the meta-model. Individual plots with greater than 10% woody plant cover were omitted from consideration to maintain comparability in total biomass across plots. This step resulted in the removal of 73 plots, leaving 1,126 plots in the data set analysed. Four plots were omitted owing to incomplete plant data and one for incomplete light data. For two of the sites, live mass was estimated from total mass using available information on the proportion of live to total. One apparent measurement error was detected for light data and the associated plot removed from the analysed sample. Random imputation methods34 were used for cases where there were missing soil measurements at a site. The decision to use this approach was based on weighing the demerits of deleting nearly complete multivariate data records versus introducing a modest amount of random error through the imputation process.
Study plots in this investigation measured 5 m × 5 m and were separated by 1 m walkways. A single 1 m × 1 m subplot within each plot was permanently marked and sampled for species richness during the season of peak biomass. Sites with strong seasonal variation in composition were sampled twice during the season to assemble a complete list of species. To obtain an estimate of site-level richness, we used a jack-knife procedure35. (Because there have been some recent advances in the reduction of certain sources of bias in richness estimation36, we checked our original results by computing site-level richness using the new iNEXT R package. The correlation between the two estimates of richness was found to be 0.972.)
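As an illustration of how a site-level richness estimate can be derived from plot-level species lists, the following minimal sketch applies a first-order incidence-based jack-knife. The text does not state which jack-knife variant was used, so the first-order form is an assumption; the function name and the species codes are hypothetical.

```python
from collections import Counter


def jackknife1_richness(plots):
    """First-order jack-knife estimate of site richness (assumed variant).

    plots: list of sets, each holding the species recorded in one subplot.
    """
    n = len(plots)
    if n == 0:
        raise ValueError("need at least one plot")
    # Count the number of plots in which each species occurs.
    incidence = Counter(species for plot in plots for species in set(plot))
    observed = len(incidence)
    # Species found in exactly one plot drive the upward correction.
    uniques = sum(1 for count in incidence.values() if count == 1)
    return observed + uniques * (n - 1) / n


# Hypothetical species lists for three subplots at one site.
example_plots = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}]
estimate = jackknife1_richness(example_plots)
```

The estimate always equals or exceeds the observed species count, because species seen in only one plot are taken as evidence that further species went undetected.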
Productivity and total above-ground biomass were sampled immediately adjacent to the permanent vegetation subplot. Vegetation was sampled destructively by clipping at ground level all above-ground biomass of individual plants rooted within two 0.1-m2 (10 cm × 100 cm) strips. Harvested plant material was sorted into the current year’s live and recently senescent material, and into previous year’s growth (including litter). For shrubs and sub-shrubs, the current year’s leaves and stems were collected. Plant material was dried at 60 °C to a constant mass and weighed to the nearest 0.01 g. We used the current year’s biomass increment as our estimate of annual above-ground productivity, which commonly serves as a measurable surrogate for total productivity37,38. All sites used this protocol to estimate productivity (except for the Sevilleta, New Mexico, site which relied on species-specific allometric relationships39). Total above-ground biomass was computed as the sum of the current year’s biomass and that from previous years and included remaining dead material (litter). Photosynthetically active radiation was measured at the time of peak biomass, both above the vegetation and at the ground surface, the ratio representing the proportion of available light reaching the ground. Degree of shading was computed as 1.0 minus the proportion of light reaching the ground.
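The derived plot-level quantities described above reduce to simple arithmetic, sketched below. The function names and input values are hypothetical, and the conversion to grams per square metre assumes that the two 0.1 m2 strips were pooled into a single 0.2 m2 sample, which the text does not state explicitly.

```python
def shading(par_above, par_ground):
    """Degree of shading: 1.0 minus the proportion of light reaching the ground."""
    if par_above <= 0:
        raise ValueError("PAR above the vegetation must be positive")
    return 1.0 - par_ground / par_above


def total_biomass(current_year_g, previous_years_g):
    """Total above-ground biomass: current-year growth plus older material and litter."""
    return current_year_g + previous_years_g


def per_square_metre(mass_g, clipped_area_m2=0.2):
    """Scale a clipped mass to g per m2 (assumes the two 0.1 m2 strips are pooled)."""
    return mass_g / clipped_area_m2


# Hypothetical readings for one plot.
plot_shading = shading(par_above=1500.0, par_ground=300.0)
plot_total = total_biomass(current_year_g=42.0, previous_years_g=18.0)
plot_total_m2 = per_square_metre(plot_total)
```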
Within each plot, 250 g of soil were collected and air dried for processing and soil archiving. Total soil %C and %N were measured using dry combustion gas chromatography analysis (COSTECH ECS 4010 Element Analyzer) at the University of Nebraska. All other soil analyses were performed at A&L Analytical Laboratory, Memphis, Tennessee, USA. Extractable soil phosphorus and potassium were quantified using the Mehlich-3 extraction method, with concentrations (parts per million) estimated using inductively coupled plasma-emission spectrometry. Soil pH was quantified with a pH probe (Fisher Scientific) in a slurry made from 10 g dry soil and 25 ml of deionized water. Soil texture, expressed as the percentage sand, percentage silt, and percentage clay, was measured on 100 g dry soil using the Bouyoucos method. Further details on sampling methodology are at http://www.nutnet.org/exp_protocol.
Climatic characteristics were obtained for each site from version 1.4 of BioClim, which is part of the WorldClim40 set of global climate layers at 1 km2 spatial resolution. To represent measures of temperature and precipitation with meaningful relationships to plant growth in global grasslands, we selected mean temperature of the wettest quarter of the year (BIO8) and total precipitation of the warmest quarter of the year (BIO18). Climate values were extracted using universal transverse Mercator (UTM) coordinates collected near the centre of each site.
Several derived variables were developed to include in the modelling effort. To represent within-site heterogeneity, coefficients of variation were computed for the site-level model based on plot-to-plot variation in plot-level measures. This allowed us to examine the explanatory value of heterogeneity in soil nitrogen, phosphorus, potassium, and pH, as well as heterogeneity in biomass and light interception. Indices of total resource supply and resource imbalance were also calculated using the method of ref. 27 and evaluated for inclusion in our models.
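The within-site heterogeneity measure can be sketched as a coefficient of variation computed per site from plot-level values. The record layout, the variable name soil_n, and the use of the sample (n − 1) standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    centre = mean(values)
    if centre == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(values) / centre


def site_heterogeneity(plot_rows, variable):
    """Coefficient of variation of one plot-level variable, grouped by site."""
    by_site = {}
    for row in plot_rows:
        by_site.setdefault(row["site"], []).append(row[variable])
    return {site: coefficient_of_variation(vals) for site, vals in by_site.items()}


# Hypothetical plot records: site label plus a soil nitrogen measurement.
plots = [
    {"site": "A", "soil_n": 2.0},
    {"site": "A", "soil_n": 4.0},
    {"site": "A", "soil_n": 6.0},
    {"site": "B", "soil_n": 1.0},
    {"site": "B", "soil_n": 1.0},
]
heterogeneity = site_heterogeneity(plots, "soil_n")
```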
Disturbance history information for the sites was converted into four binary (0,1) variables for analyses; information available included pretreatment history of (1) substantial anthropogenic alteration (for example, conversion to pasture), (2) grazing history, by wild or domestic animals, (3) active management (typically haying or mowing), and (4) fire. Current levels of herbivory were estimated by comparing biomass inside and outside exclosure plots located at each site.
Certain variables were constructed within the structural equation modeling process using the composite index development methods of ref. 41. Consideration of the ideas conveyed by the meta-model (Extended Data Fig. 1) and the specific situation being modelled suggested the need to develop index variables for soil fertility and soil suitability. Soil fertility indices were developed using all measured soil properties and were operationally defined as the drivers of productivity, controlling for all other effects on productivity in the model. Two indices were developed, one for site-to-site variations and another for plot-to-plot variations. Similarly, soil suitability indices were developed for the site- and plot-level data using all measured soil properties as potential contributors and operationally defined as the drivers of richness, controlling for all other effects on richness in the model.
Modelling with composites in structural equation models involved a two-step process. First, we constructed a fully specified structural equation model (as represented in Fig. 2), but providing a specific set of soil properties to serve as formative indicators for soil fertility and soil suitability. Variables that did not contribute to the total model (on the basis of model fit indices) were eliminated individually for the two composites being formed. The resulting prediction equations were used to compute index scores. Then, the model was reconstructed, substituting the indices in place of the collection of individual soil properties. Documentation of this process is provided in the Supplementary Information computer code (R script).
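The two-step logic can be illustrated with a deliberately simplified stand-in: an ordinary least-squares regression replaces the full structural equation model, and the composite is the weighted sum of soil variables implied by the fitted coefficients. This is a sketch of the idea only, not the lavaan procedure documented in the Supplementary Information; all function names and data values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(predictors, response):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def composite_scores(indicators, weights):
    """Index score per observation: weighted sum of the formative indicators."""
    return [sum(w * v for w, v in zip(weights, row)) for row in indicators]


# Hypothetical data: two soil properties and productivity at five sites.
soil = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
productivity = [3.0, 7.0, 6.0, 11.0, 9.0]

# Step 1: estimate the contribution of each soil property to the response.
coefficients = ols(soil, productivity)
fertility_index = composite_scores(soil, coefficients[1:])

# Step 2: refit with the single index in place of the individual properties.
refit = ols([(score,) for score in fertility_index], productivity)
```

In the refit the index carries a slope of one, because it already encodes the combined soil effect estimated in the first step; in a full model the same substitution lets a single path represent the construct.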
A structural equation model was developed based on the ideas embodied in the meta-model, available data, and the principles and procedures laid out in ref. 42. Indicators for constructs were chosen from the set of variables available and quantities that could be computed from them (Extended Data Table 1). The modelling approach used was semi-exploratory in that while we worked to address the general hypothesis embodied in the meta-model, the precise variables (for example, mean annual precipitation versus mean annual precipitation in the warmest quarter of the year) to use for certain constructs (specifically, resource supplies and regulators) were determined empirically. Compositing techniques were used to estimate construct-level effects41. For comparative purposes, we analysed the bivariate pattern in Fig. 1A using a variety of regression models, including Ricker-type nonlinear models as well as second- and third-order polynomials. A three-parameter Ricker-type model provided the best fit for the data.
Data were screened for distributional properties and nonlinear relations. Several variables were log-transformed as a result of evaluations (Extended Data Table 1). We used the R software platform43 and the lavaan package44 along with the lavaan.survey45 package for our structural equation model analyses. For the plot-scale model, robust χ2 tests, as implemented in the lavaan.survey package, were used to judge variable inclusion and model adequacy because of the nested nature of the plot-level data. Each link in the final model was evaluated for significant contribution to the model. Final model fit to data was very good for both submodels. Model fit indices were supplemented by using additional diagnostic evaluations that involve visualizing residual relationships to evaluate conditional independence29. These residual visualizations allowed us, among other things, to evaluate linearity assumptions and to implement curve-fitting procedures where needed (which was the case only for the composite relationships). Our structural equation model is non-recursive and includes a causal loop. Models of this form are commonplace in structural equation model applications, although they come with some additional assumptions and requirements. Specifically, there is a requirement for unique predictors for the elements involved in loops, a requirement that was met in this case. Additional analysis details are documented in the R script used for the analysis (Supplementary Information).
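The requirement for unique predictors can be made concrete with a small structural check. The sketch below tests only whether each variable in a loop has at least one predictor that lies outside the loop and has no direct path to any other loop member; it is a simple screening heuristic, not a full identification analysis. The path structure shown is hypothetical and is not the fitted model of Fig. 2.

```python
def loop_members_have_unique_predictors(parents, loop):
    """Return True if every variable in the loop has a predictor of its own.

    parents maps each variable to the set of variables with a direct path to it.
    A predictor counts as unique to a loop member when it lies outside the loop
    and has no direct path to any other loop member.
    """
    loop_set = set(loop)
    for node in loop:
        outside = parents.get(node, set()) - loop_set
        shared = set()
        for other in loop:
            if other != node:
                shared |= parents.get(other, set())
        if not (outside - shared):
            return False
    return True


# Hypothetical path structure used only to exercise the check.
paths = {
    "productivity": {"richness", "soil_fertility", "climate"},
    "biomass": {"productivity", "disturbance"},
    "richness": {"biomass", "soil_suitability", "climate"},
}
loop = ["richness", "productivity", "biomass"]
has_unique = loop_members_have_unique_predictors(paths, loop)
```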
Multi-level relations were incorporated into the architecture of our model. Several ways to incorporate both site- and plot-level variations in the model were considered and multiple approaches evaluated to ensure results are general. In the model form presented, we chose to follow modern hierarchical modelling principles and allow plot-level observations to depend on site-level parameters, since plots were nested within sites. The result of choosing this approach means site-level explanatory effects can filter down to the plot level while plot-level explanatory variables (for example, pathways from edaphic conditions to plot richness) explain additional plot-to-plot variations in responses that are not predicted from site-level (mean) conditions. Consistent with the capabilities of the structural equation model software used in our analyses (described above), we estimated site- and plot-level submodels using a two-stage approach, first estimating parameters for the site-level component and then using site productivity, biomass, and richness as exogenous predictors in the plot-level component. Comparisons with results from separate site- and plot-level models led to very similar conclusions, although the hierarchical approach used allowed a better integration of processes and greater variance explanation.
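The data preparation implied by the two-stage approach can be sketched as follows: site-level summaries are computed first and then attached to every plot record so that they can serve as exogenous predictors at the plot level. Using plain site means as the summary is an assumption made for illustration (site richness in the study came from a jack-knife estimate), and the record layout and names are hypothetical.

```python
from statistics import mean


def attach_site_means(plot_rows, variables):
    """Add a site-level mean of each variable to every plot record."""
    by_site = {}
    for row in plot_rows:
        by_site.setdefault(row["site"], []).append(row)
    site_means = {
        site: {v: mean(r[v] for r in rows) for v in variables}
        for site, rows in by_site.items()
    }
    augmented = []
    for row in plot_rows:
        extra = {f"site_{v}": site_means[row["site"]][v] for v in variables}
        augmented.append({**row, **extra})
    return augmented


# Hypothetical plot records from two sites.
plots = [
    {"site": "A", "richness": 10, "productivity": 200.0},
    {"site": "A", "richness": 14, "productivity": 300.0},
    {"site": "B", "richness": 6, "productivity": 100.0},
]
augmented = attach_site_means(plots, ["richness", "productivity"])
```

Each plot record then carries both its own value and the site-level value, which is the arrangement that lets site-level effects filter down while plot-level predictors explain the remaining within-site variation.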
One of our objectives in this study was to assess the model dimensionality needed to detect the hypothesized signals in the data. To do this, we started with the most complete model (Fig. 2) and eliminated variables from the model (always retaining richness and some measure of biomass production, either productivity or total biomass). We then made any modifications needed to ensure adequate model-data fit for these reduced-form models. The consequences of model simplification were judged on the basis of signal retention, in particular a loss of capacity to detect signals associated with the remaining parts of the model.
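The constraint on which variables may be dropped can be expressed as a short enumeration. The sketch below lists every reduced variable set that keeps richness and at least one measure of biomass production; it covers only the bookkeeping of candidate models, not their fitting or evaluation. The variable names and the exhaustive enumeration are assumptions, since the text does not say that every possible subset was examined.

```python
from itertools import combinations


def reduced_variable_sets(variables, required=("richness",),
                          one_of=("productivity", "biomass")):
    """List proper subsets that keep the required and at least one 'one_of' variable."""
    required_set = set(required)
    one_of_set = set(one_of)
    candidates = []
    # Sizes below len(variables) exclude the complete model itself.
    for size in range(1, len(variables)):
        for subset in combinations(variables, size):
            chosen = set(subset)
            if required_set <= chosen and chosen & one_of_set:
                candidates.append(subset)
    return candidates


# Hypothetical variable list for a small model.
variables = ["richness", "productivity", "biomass", "climate"]
candidates = reduced_variable_sets(variables)
```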
The computer script associated with the analyses in this paper is available as part of the Supplementary Information.
J.B.G. was supported by the US Geological Survey Ecosystems and Climate and Land use Change Programs. This work uses data from the Nutrient Network (http://nutnet.org) experiment, funded at the site scale by individual researchers. Coordination and data management were supported by funding to E.T.B. and E.W.S. from the National Science Foundation (NSF) Research Coordination Network (NSF-DEB-1042132) and Long Term Ecological Research (NSF-DEB-1234162 to Cedar Creek LTER) programs and the UMN Institute on the Environment (DG-0001-13). The Minnesota Supercomputer Institute hosts project data. The use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. Support for site-level activities is acknowledged in the Supplementary Information. We thank D. Laughlin for comments on the manuscript.