Soil microorganisms shape ecosystem function, yet it remains an open question whether we can predict the composition of the soil microbiome in places before observing it. Furthermore, it is unclear whether the predictability of microbial life exhibits taxonomic- and spatial-scale dependence, as it does for macrobiological communities. Here, we leverage multiple large-scale soil microbiome surveys to develop predictive models of bacterial and fungal community composition in soil, then test these models against independent soil microbial community surveys from across the continental United States. We find remarkable scale dependence in community predictability. The predictability of bacterial and fungal communities increases with the spatial scale of observation, and fungal predictability increases with taxonomic scale. These patterns suggest that deterministic processes become increasingly important relative to stochastic processes with scale, consistent with findings in plant and animal communities and pointing to a general scaling relationship across biology. Biogeochemical functional groups and high-level taxonomic groups of microorganisms were equally predictable, indicating that traits and taxonomy are both powerful lenses for understanding soil communities. By focusing on out-of-sample prediction, these findings suggest an emerging generality in our understanding of the soil microbiome, and that this understanding is fundamentally scale dependent.
All data used to train statistical models are either publicly available in associated studies or were provided by the original study authors on request. All data used to validate models are publicly available through the National Ecological Observatory Network data portal (https://data.neonscience.org/). We will provide raw and processed data on request for purposes of replicating the findings of this study.
All code needed to process raw data and to replicate these analyses is available at GitHub (https://www.github.com/colinaverill/NEFI_microbe).
The National Ecological Observatory Network is a program sponsored by the National Science Foundation and operated under cooperative agreement by Battelle Memorial Institute. C.A., Z.R.W., M.C.D. and J.M.B. were supported by NSF Macrosystems Biology (no. 1638577). C.A. was supported by an Ambizione Grant (no. PZ00P3_179900) from the Swiss National Science Foundation. K.F.A. was supported by the Boston University BRITE Bioinformatics REU program. D. Maynard gave feedback on an earlier version of this manuscript. L. Stanish helped to access and interpret microbial data from the NEON Network. J. Luecke designed and illustrated Figs. 1 and 2.
The authors declare no competing interests.
Peer review information Nature Ecology & Evolution thanks Xiaofeng Xu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Mean cross-validated R² relative to the 1:1 prediction across functional and taxonomic groups for (a) bacteria and (b) fungi. All models were trained on 70% of the NEON core- or plot-level data and then validated using the remaining 30% of the data.
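The validation metric in this caption can be illustrated with a minimal sketch. It assumes one common definition of R² relative to the 1:1 line, namely 1 − SS_res/SS_tot with residuals taken directly between observations and predictions; the source does not confirm that this is the exact formula used. The function names `r2_one_to_one` and `split_70_30` and the example values are hypothetical and are not taken from the authors' code.

```python
import random
from statistics import mean


def r2_one_to_one(observed, predicted):
    """R2 relative to the 1:1 line (assumed definition): 1 - SS_res / SS_tot.

    Residuals are measured from the predictions themselves, not from a
    regression line refit to the observed-vs-predicted scatter, so a
    systematically biased model is penalized and can score below zero.
    """
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def split_70_30(n_rows, seed=0):
    """Shuffle row indices and return (training, validation) index lists."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * n_rows))
    return idx[:cut], idx[cut:]


# Hypothetical relative abundances, for illustration only.
train_idx, valid_idx = split_70_30(10)
perfect = r2_one_to_one([0.1, 0.2, 0.4], [0.1, 0.2, 0.4])
biased = r2_one_to_one([0.1, 0.2, 0.4], [0.3, 0.4, 0.6])
```

In this sketch the second set of predictions is perfectly correlated with the observations but offset by a constant, so its R² relative to the 1:1 line is negative even though a correlation-based R² would equal 1.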
Coefficient of variation of model predictions vs. observations across functional and taxonomic groups, both in and out of sample for (a) bacteria and (b) fungi.
Principal component analysis of phylogenetic and functional group parameter values in the global calibration dataset for (a) fungi and (d) bacteria. Factor importance in principal component space is indicated by the direction and length of factor vectors. We visualize the strongest correlation between an individual factor effect size and predictability in the calibration dataset (b,e), as well as the correlations for all factors (c,f). Factors include net primary productivity (NPP), whether or not conifers are present (conifer), whether or not a site is a forest (forest), mean annual temperature (MAT), mean annual precipitation (MAP), soil pH (pH), soil percent carbon (%C), soil carbon to nitrogen ratio (C:N), and the relative abundance of ectomycorrhizal trees (relEM).
Extended Data Fig. 4 Qualitatively similar but quantitatively different relationships between Acidobacteria and soil pH.
Relative abundance of the bacterial phylum Acidobacteria plotted as a function of soil pH, highlighting differences in trends between independent sources. a, Values from the combined calibration dataset and validation dataset, with points and loess curves colored by dataset. The relationship between Acidobacteria and pH within the validation data, sourced from the National Ecological Observatory Network, appears to have a strong systematic bias; however, due to the compositional nature of amplicon sequencing data, it is difficult to determine the source of biases for any given taxon. b, Values from a subset of 5 independent datasets used in calibration, with points and loess curves colored by dataset.
Density plot of variance decomposition for all (a) bacterial and (b) fungal groups modeled at the site level.
Distribution of sampling sites used in this analysis. Sites used for fungal model calibration are in pink, sites used for bacterial model calibration are in blue, and NEON sites used for validation are in yellow.
Cite this article
Averill, C., Werbin, Z.R., Atherton, K.F. et al. Soil microbiome predictability increases with spatial and taxonomic scale. Nat Ecol Evol 5, 747–756 (2021). https://doi.org/10.1038/s41559-021-01445-9