Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes

Abstract

Reference genomes are required to understand the diverse roles of microorganisms in ecology, evolution, human and animal health, but most species remain uncultured. Here we present a sequence composition–independent approach to recover high-quality microbial genomes from deeply sequenced metagenomes. Multiple metagenomes of the same community, which differ in relative population abundances, were used to assemble 31 bacterial genomes, including rare (<1% relative abundance) species, from an activated sludge bioreactor. Twelve genomes were assembled into complete or near-complete chromosomes. Four belong to the candidate bacterial phylum TM7 and represent the most complete genomes for this phylum to date (relative abundances, 0.06–1.58%). Reanalysis of published metagenomes reveals that differential coverage binning facilitates recovery of more complete and higher fidelity genome bins than other currently used methods, which are primarily based on sequence composition. This approach will be an important addition to the standard metagenome toolbox and greatly improve access to genomes of uncultured microorganisms.
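The core idea of differential coverage binning can be illustrated with a minimal sketch. It is not the authors' pipeline (their scripts are provided as Data Set 1). It assumes that scaffolds from one population genome have similar read coverage within each sample, so a second sample can separate populations that look identical in the first. The `Scaffold` type, the `extract_bin` helper, the coverage windows and all numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scaffold:
    name: str
    length: int    # scaffold length in bp
    cov_a: float   # mean read coverage in sample A (hypothetical values)
    cov_b: float   # mean read coverage in sample B (hypothetical values)


def extract_bin(scaffolds, a_range, b_range):
    """Return scaffolds whose coverage lies inside both windows (inclusive)."""
    return [
        s for s in scaffolds
        if a_range[0] <= s.cov_a <= a_range[1]
        and b_range[0] <= s.cov_b <= b_range[1]
    ]


# Toy assembly: two populations share ~10x coverage in sample A
# but differ roughly tenfold in sample B; a third is abundant in both.
scaffolds = [
    Scaffold("s1", 50_000, 10.0, 100.0),
    Scaffold("s2", 30_000, 11.0, 95.0),
    Scaffold("s3", 40_000, 10.5, 12.0),
    Scaffold("s4", 20_000, 9.5, 11.0),
    Scaffold("s5", 10_000, 200.0, 210.0),
]

# Coverage in sample A alone cannot tell s1-s4 apart.
single_sample = [s.name for s in scaffolds if 9 <= s.cov_a <= 12]

# Adding sample B splits them into two candidate genome bins.
bin_1 = extract_bin(scaffolds, a_range=(9, 12), b_range=(80, 120))
bin_2 = extract_bin(scaffolds, a_range=(9, 12), b_range=(9, 15))

print(single_sample)
print([s.name for s in bin_1], sum(s.length for s in bin_1))
print([s.name for s in bin_2], sum(s.length for s in bin_2))
```

Rectangular windows are used here only for brevity. In practice a bin boundary would be drawn around a visible cluster in the two-sample coverage plot and then checked with independent evidence such as conserved marker genes.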

Figure 1: Sequence composition–independent binning of metagenome scaffolds from the lab-scale bioreactor using differential coverage (HP+, HP).
Figure 2: Overview of the pipeline to obtain high-quality population genomes from multiple deep metagenomes using differential coverage as the primary binning method, illustrated using the population genome TM7-AAU-ii.
Figure 3: Reanalysis of published metagenomes (ref. 17) using the differential coverage approach. (a) Sequence composition–independent binning using metagenome coverage of two samples, A and C.
Figure 4: Overview of the metabolism, cell wall characteristics and morphology of TM7.

Accession codes

Primary accessions

NCBI Reference Sequence

Sequence Read Archive

References

  1. Wu, D. et al. A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature 462, 1056–1060 (2009).

  2. Hugenholtz, P. Exploring prokaryotic diversity in the genomic era. Genome Biol. 3, S0003 (2002).

  3. Achtman, M. & Wagner, M. Microbial diversity and the genetic nature of microbial species. Nat. Rev. Microbiol. 6, 431–440 (2008).

  4. Knittel, K. & Boetius, A. Anaerobic oxidation of methane: progress with an unknown process. Annu. Rev. Microbiol. 63, 311–334 (2009).

  5. Elinav, E. et al. NLRP6 inflammasome regulates colonic microbial ecology and risk for colitis. Cell 145, 745–757 (2011).

  6. Brinig, M.M., Lepp, P.W., Ouverney, C.C., Armitage, G.C. & Relman, D.A. Prevalence of bacteria of division TM7 in human subgingival plaque and their association with disease. Appl. Environ. Microbiol. 69, 1687–1694 (2003).

  7. Marcy, Y. et al. Dissecting biological “dark matter” with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proc. Natl. Acad. Sci. USA 104, 11889–11894 (2007).

  8. Kuehbacher, T. et al. Intestinal TM7 bacterial phylogenies in active inflammatory bowel disease. J. Med. Microbiol. 57, 1569–1576 (2008).

  9. Kunin, V., Copeland, A., Lapidus, A., Mavromatis, K. & Hugenholtz, P. A bioinformatician's guide to metagenomics. Microbiol. Mol. Biol. Rev. 72, 557–578 (2008).

  10. Tyson, G.W. et al. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428, 37–43 (2004).

  11. García Martín, H. et al. Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. Nat. Biotechnol. 24, 1263–1269 (2006).

  12. Tringe, S.G. et al. Comparative metagenomics of microbial communities. Science 308, 554–557 (2005).

  13. Rodrigue, S. et al. Whole genome amplification and de novo assembly of single bacterial cells. PLoS ONE 4, e6864 (2009).

  14. Lasken, R.S. Genomic sequencing of uncultured microorganisms from single cells. Nat. Rev. Microbiol. 10, 631–640 (2012).

  15. Luo, C., Tsementzi, D., Kyrpides, N.C. & Konstantinidis, K.T. Individual genome assembly from complex community short-read metagenomic datasets. ISME J. 6, 898–901 (2012).

  16. Mande, S.S., Mohammed, M.H. & Ghosh, T.S. Classification of metagenomic sequences: methods and challenges. Brief. Bioinform. 13, 669–681 (2012).

  17. Wrighton, K.C. et al. Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science 337, 1661–1665 (2012).

  18. Zerbino, D.R. & Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).

  19. Smoot, M.E., Ono, K., Ruscheinski, J., Wang, P.-L. & Ideker, T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27, 431–432 (2011).

  20. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).

  21. Hess, M. et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331, 463–467 (2011).

  22. Dupont, C.L. et al. Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. ISME J. 6, 1186–1199 (2012).

  23. Vezzi, F., Narzisi, G. & Mishra, B. Reevaluating assembly evaluations with feature response curves: GAGE and assemblathons. PLoS ONE 7, e52210 (2012).

  24. Podar, M. et al. Targeted access to the genomes of low-abundance organisms in complex microbial communities. Appl. Environ. Microbiol. 73, 3205–3214 (2007).

  25. Nielsen, P.H., Saunders, A.M., Hansen, A.A., Larsen, P. & Nielsen, J.L. Microbial communities involved in enhanced biological phosphorus removal from wastewater-a model system in environmental biotechnology. Curr. Opin. Biotechnol. 23, 452–459 (2012).

  26. Luo, C., Xie, S., Sun, W., Li, X. & Cupples, A.M. Identification of a novel toluene-degrading bacterium from the candidate phylum TM7, as determined by DNA stable isotope probing. Appl. Environ. Microbiol. 75, 4644–4647 (2009).

  27. Hugenholtz, P., Tyson, G.W., Webb, R.I., Wagner, A.M. & Blackall, L.L. Investigation of candidate division TM7, a recently recognized major lineage of the domain Bacteria with no known pure-culture representatives. Appl. Environ. Microbiol. 67, 411–419 (2001).

  28. Sutcliffe, I.C. A phylum level perspective on bacterial cell envelope architecture. Trends Microbiol. 18, 464–470 (2010).

  29. McCutcheon, J.P. & Moran, N.A. Extreme genome reduction in symbiotic bacteria. Nat. Rev. Microbiol. 10, 13–26 (2012).

  30. Baker, B.J. et al. Enigmatic, ultrasmall, uncultivated Archaea. Proc. Natl. Acad. Sci. USA 107, 8806–8811 (2010).

  31. Thomsen, T.R., Kjellerup, B.V., Nielsen, J.L., Hugenholtz, P. & Nielsen, P.H. In situ studies of the phylogeny and physiology of filamentous bacteria with attached growth. Environ. Microbiol. 4, 383–391 (2002).

  32. Mandlik, A., Swierczynski, A., Das, A. & Ton-That, H. Pili in Gram-positive bacteria: assembly, involvement in colonization and biofilm development. Trends Microbiol. 16, 33–40 (2008).

  33. Sutcliffe, I.C. Cell envelope architecture in the Chloroflexi: a shifting frontline in a phylogenetic turf war. Environ. Microbiol. 13, 279–282 (2011).

  34. Schneewind, O. & Missiakas, D.M. Protein secretion and surface display in Gram-positive bacteria. Phil. Trans. R. Soc. Lond. B 367, 1123–1139 (2012).

  35. Weidenmaier, C. & Peschel, A. Teichoic acids and related cell-wall glycopolymers in Gram-positive physiology and host interactions. Nat. Rev. Microbiol. 6, 276–287 (2008).

  36. Hoiczyk, E. & Hansel, A. Cyanobacterial cell walls: news from an unusual prokaryotic envelope. J. Bacteriol. 182, 1191–1199 (2000).

  37. Battistuzzi, F.U. & Hedges, S.B. A major clade of prokaryotes with ancient adaptations to life on land. Mol. Biol. Evol. 26, 335–343 (2009).

  38. Gupta, R.S. Origin of diderm (Gram-negative) bacteria: antibiotic selection pressure rather than endosymbiosis likely led to the evolution of bacterial cells with two membranes. Antonie van Leeuwenhoek 100, 171–182 (2011).

  39. Ludwig, W. et al. ARB: a software environment for sequence data. Nucleic Acids Res. 32, 1363–1371 (2004).

  40. Pruesse, E. et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res. 35, 7188–7196 (2007).

  41. Nielsen, P.H. et al. A conceptual ecosystem model of microbial communities in enhanced biological phosphorus removal plants. Water Res. 44, 5070–5088 (2010).

  42. Caporaso, J. et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. USA 108, 4516–4522 (2011).

  43. Berry, D., Ben Mahfoudh, K., Wagner, M. & Loy, A. Barcoded primers used in multiplex amplicon pyrosequencing bias amplification. Appl. Environ. Microbiol. 77, 7846–7849 (2011).

  44. Masella, A.P., Bartram, A.K., Truszkowski, J.M., Brown, D.G. & Neufeld, J.D. PANDAseq: paired-end assembler for Illumina sequences. BMC Bioinformatics 13, 31 (2012).

  45. Caporaso, J.G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).

  46. Oksanen, J. et al. Vegan: Community Ecology Package. R package version 2.0–5 (2011).

  47. Hyatt, D., LoCascio, P.F., Hauser, L.J. & Uberbacher, E.C. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 28, 2223–2230 (2012).

  48. Huson, D.H., Mitra, S., Ruscheweyh, H.-J., Weber, N. & Schuster, S.C. Integrative analysis of environmental sequences using MEGAN4. Genome Res. 21, 1552–1560 (2011).

  49. McDonald, D. et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6, 610–618 (2012).

  50. Edgar, R.C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).

  51. Pruesse, E., Peplies, J. & Glöckner, F.O. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 28, 1823–1829 (2012).

  52. Markowitz, V.M. et al. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 25, 2271–2278 (2009).

  53. Aziz, R.K. et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9, 75 (2008).

  54. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).

  55. Price, M.N., Dehal, P.S. & Arkin, A.P. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  56. Felsenstein, J. PHYLIP - Phylogeny inference package (version 3.2). Cladistics 5, 164–166 (1989).

  57. Punta, M. et al. The Pfam protein families database. Nucleic Acids Res. 40, D290–D301 (2012).

  58. Tatusov, R.L. et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4, 41 (2003).

  59. Szklarczyk, D. et al. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 39, D561–D568 (2011).

  60. Letunic, I. & Bork, P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 39, W475–W478 (2011).

Acknowledgements

This study was funded by Aalborg University and the Danish Research Council for Strategic Research via the Centre “EcoDesign-MBR” and the Obelske Family Foundation. P.H. was supported by a Discovery Outstanding Researcher Award (DORA) from the Australian Research Council, grant DP120103498, and G.W.T. was supported by a QEII fellowship from the Australian Research Council, grant DP1093175. We thank S. McIlroy and P. Larsen for assistance with FISH analyses, A.M. Saunders for 16S rRNA data generation and J.P. Euzeby for suggesting the new genus name.

Author information

Contributions

M.A., experimental design, data analysis and manuscript; P.H., data analysis and manuscript; A.S., data analysis; K.L.N., sequencing; G.W.T., data analysis and manuscript; P.H.N., experimental design and manuscript.

Corresponding author

Correspondence to Per H Nielsen.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Notes, Supplementary Figures 1–13 and Supplementary Tables 1–10 (PDF 4591 kb)

Data Set 1

All scripts used in the manuscript, including a detailed step by step guide and example datasets. (ZIP 7822 kb)

About this article

Cite this article

Albertsen, M., Hugenholtz, P., Skarshewski, A. et al. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol 31, 533–538 (2013). https://doi.org/10.1038/nbt.2579

  • DOI: https://doi.org/10.1038/nbt.2579
