Abstract
Soil microorganisms shape ecosystem function, yet it remains an open question whether we can predict the composition of the soil microbiome in places before observing it. Furthermore, it is unclear whether the predictability of microbial life exhibits taxonomic- and spatial-scale dependence, as it does for macrobiological communities. Here, we leverage multiple large-scale soil microbiome surveys to develop predictive models of bacterial and fungal community composition in soil, then test these models against independent soil microbial community surveys from across the continental United States. We find remarkable scale dependence in community predictability. The predictability of bacterial and fungal communities increases with the spatial scale of observation, and fungal predictability increases with taxonomic scale. These patterns suggest that there is an increasing importance of deterministic versus stochastic processes with scale, consistent with findings in plant and animal communities, suggesting a general scaling relationship across biology. Biogeochemical functional groups and high-level taxonomic groups of microorganisms were equally predictable, indicating that traits and taxonomy are both powerful lenses for understanding soil communities. By focusing on out-of-sample prediction, these findings suggest an emerging generality in our understanding of the soil microbiome, and that this understanding is fundamentally scale dependent.
Data availability
All data used to train statistical models are either publicly available in the associated studies or were provided by the original study authors on request. All data used to validate models are publicly available through the National Ecological Observatory Network data portal (https://data.neonscience.org/). We will provide raw and processed data on request for purposes of replicating the findings of this study.
Code availability
All code needed to process raw data and to replicate these analyses is available at GitHub (https://www.github.com/colinaverill/NEFI_microbe).
Acknowledgements
The National Ecological Observatory Network is a program sponsored by the National Science Foundation and operated under cooperative agreement by Battelle Memorial Institute. C.A., Z.R.W., M.C.D. and J.M.B. were supported by NSF Macrosystems Biology (no. 1638577). C.A. was supported by an Ambizione Grant (no. PZ00P3_179900) from the Swiss National Science Foundation. K.F.A. was supported by the Boston University BRITE Bioinformatics REU program. D. Maynard gave feedback on an earlier version of this manuscript. L. Stanish helped to access and interpret microbial data from the NEON Network. J. Luecke designed and illustrated Figs. 1 and 2.
Author information
Contributions
C.A., J.M.B. and M.C.D. conceived the study. C.A., Z.R.W. and K.F.A. performed all analysis and computation. All of the authors wrote the manuscript collaboratively.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Ecology & Evolution thanks Xiaofeng Xu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Cross-validation within the NEON dataset.
Mean cross-validated R² relative to the 1:1 prediction across functional and taxonomic groups for (a) bacteria and (b) fungi. All models were trained on 70% of NEON core- or plot-level data and then validated using the remaining 30% of the data.
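The caption above does not spell out how R² relative to the 1:1 line is computed. As a minimal sketch, assuming the common definition in which residuals are taken directly as observed minus predicted (rather than against a refitted regression line), the 70/30 hold-out and the metric could look like the following. The function names `split_70_30` and `r2_one_to_one` are hypothetical and are not taken from the authors' code.

```python
import random
from statistics import mean


def split_70_30(n_samples, seed=0):
    """Shuffle sample indices and split them 70% training / 30% validation.

    Hypothetical helper: the exact splitting procedure is not given in the caption.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.7 * n_samples))
    return indices[:cut], indices[cut:]


def r2_one_to_one(observed, predicted):
    """R^2 relative to the 1:1 line (assumed definition).

    Residuals are observed - predicted, so systematically biased predictions
    are penalized and the value can fall below zero.
    """
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    train_idx, test_idx = split_70_30(10)
    print(len(train_idx), len(test_idx))
    # Predictions offset by a constant correlate perfectly with observations
    # yet still score poorly against the 1:1 line.
    print(r2_one_to_one([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

Under this definition, a model whose predictions track observations but are offset by a constant is scored lower than an ordinary regression R² would suggest, which is why the metric suits out-of-sample evaluation.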
Extended Data Fig. 2 Coefficient of variation across taxonomic and functional groupings.
Coefficient of variation of model predictions vs. observations across functional and taxonomic groups, both in and out of sample for (a) bacteria and (b) fungi.
Extended Data Fig. 3 Principal component analysis of microbial environmental sensitivities.
Principal component analysis of phylogenetic and functional group parameter values in the global calibration dataset for (a) fungi and (d) bacteria. Factor importance in principal component space is indicated by the direction and length of factor vectors. We visualize the strongest correlation between an individual factor effect size and predictability in the calibration dataset (b,e), as well as the correlations for all factors (c,f). Factors include net primary productivity (NPP), whether or not conifers are present (conifer), whether or not a site is a forest (forest), mean annual temperature (MAT), mean annual precipitation (MAP), soil pH (pH), soil percent carbon (%C), soil carbon to nitrogen ratio (C:N), and the relative abundance of ectomycorrhizal trees (relEM).
Extended Data Fig. 4 Qualitatively similar but quantitatively different relationships between Acidobacteria and soil pH.
Relative abundance of the bacterial phylum Acidobacteria plotted as a function of soil pH, highlighting differences in trends between independent sources. a, Values from the combined calibration and validation datasets, with points and loess curves colored by dataset. The relationship between Acidobacteria and pH within the validation data, sourced from the National Ecological Observatory Network, appears to have a strong systematic bias; however, due to the compositional nature of amplicon sequencing data, it is difficult to determine the source of biases for any given taxon. b, Values from a subset of 5 independent datasets used in calibration, with points and loess curves colored by dataset.
Extended Data Fig. 5 Variance decomposition.
Density plot of variance decomposition for all (a) bacterial and (b) fungal groups modeled at the site level.
Extended Data Fig. 6 Distribution of samples used in this analysis.
Distribution of sampling sites used in this analysis. Sites used for fungal model calibration are in pink, sites used for bacterial model calibration are in blue, and NEON sites used for validation are in yellow.
Supplementary information
Supplementary Information
Supplementary Figs. 1–6 and the caption for Supplementary Data 1.
Supplementary Data 1
Out-of-sample R² and R² relative to the 1:1 line (R²(1:1)) values for all bacterial and fungal groups modeled. Values are reported at core, plot and site scales.
Cite this article
Averill, C., Werbin, Z.R., Atherton, K.F. et al. Soil microbiome predictability increases with spatial and taxonomic scale. Nat Ecol Evol 5, 747–756 (2021). https://doi.org/10.1038/s41559-021-01445-9
DOI: https://doi.org/10.1038/s41559-021-01445-9