Detecting macroecological patterns in bacterial communities across independent studies of global soils

Abstract

The emergence of high-throughput DNA sequencing methods provides unprecedented opportunities to further unravel bacterial biodiversity and its worldwide role in everything from human health to ecosystem functioning. However, despite the abundance of sequencing studies, combining data from multiple individual studies to address macroecological questions of bacterial diversity remains methodologically challenging and plagued by biases. Here, using a machine-learning approach that accounts for differences among studies and complex interactions among taxa, we merge 30 independent bacterial data sets comprising 1,998 soil samples from 21 countries. Whereas previous meta-analysis efforts have focused on bacterial diversity measures or abundances of major taxa, we show that disparate amplicon sequence data can be combined at the taxonomy-based level to assess bacterial community structure. We find that rarer taxa are more important for structuring soil communities than abundant taxa, and that these rarer taxa are better predictors of community structure than environmental factors, which are often confounded across studies. We conclude that combining data from independent studies can be used to explore bacterial community dynamics, identify potential ‘indicator’ taxa with an important role in structuring communities, and propose hypotheses about previously overlooked factors that shape bacterial biogeography.

Fig. 1: Merging of data from 30 independent studies.
Fig. 2: Regardless of technical differences between studies, many bacterial taxa are still informative about bacterial community structure.
Fig. 3: Rarer taxa are more important for structuring communities than abundant taxa.
Fig. 4: The importance of bacterial taxa classified at different taxonomic ranks.
Fig. 5: Importance of bacterial taxa in community structure related to their occurrence in different studies.

Acknowledgements

We thank all the people who contributed data and input to this study. This study was conducted at a workshop (May 2015, Manchester, UK) funded by the British Ecological Society’s special interest group Plants-Soils-Ecosystems and organized by F.T.d.V. and K.S.R. This study and participants were funded in part by ERC Advanced Grant 26055290 (K.S.R. and W.H.v.d.P.); BBSRC David Phillips Fellowship (BB/L02456X/1) (F.T.d.V.); ERC Grant Agreements 242658 (BIOCOM) and 647038 (BIODESERT) (F.T.M.); the European Regional Development Fund (Centre of Excellence EcolChange) (J.D.); Yorkshire Agricultural Society, Nafferton Ecological Farming Group, and the Northumbria University Research Development Fund (C.H.O.); BBSRC Training Grant (BB/K501943/1) (C.H.); Wallenberg Academy Fellowship (KAW 2012.0152), Formas (214-2011-788) and Vetenskapsrådet (612-2011-5444) (E.D.); the Glastir Monitoring & Evaluation Programme (contract reference: C147/2010/11) and the full support of the GMEP team on the Glastir project (D.L.J., S.C., and D.A.R.). Computing was facilitated by the University of Manchester Condor pool and the CLIMB infrastructure (http://www.climb.ac.uk).

Author information

Contributions

The idea for this study was conceived by F.T.d.V. and K.S.R. The data sets were compiled by C.G.K., R.G., J.D., A.H., B.C., G.F., A.L.S., and J.R. Metadata were compiled by J.D. and J.R. Raw sequence analysis was conducted by M.d.H. Primer bias analysis was conducted by A.C. Random Forest analyses were conducted and figures were produced by C.G.K. The manuscript was written by K.S.R., C.G.K., and F.T.d.V., with contributions from all co-authors.

Corresponding author

Correspondence to Kelly S. Ramirez.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Supplementary Tables 2 and 3 and Supplementary Figures 1–10.

Life Sciences Reporting Summary

Figure Generation Data

Supplementary Table 4: Data used to generate figures.

Figure Generation Code

R code used to generate figures.

Supplementary Table 1

Summary of all datasets used.

Supplementary Table 5

Name-matched data.

Supplementary Table 6

Sequence-matched data.

About this article

Cite this article

Ramirez, K.S., Knight, C.G., de Hollander, M. et al. Detecting macroecological patterns in bacterial communities across independent studies of global soils. Nat Microbiol 3, 189–196 (2018). https://doi.org/10.1038/s41564-017-0062-x
