Article

Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes

Nature Biotechnology volume 31, pages 533–538 (2013)

Abstract

Reference genomes are required to understand the diverse roles of microorganisms in ecology, evolution, human and animal health, but most species remain uncultured. Here we present a sequence composition–independent approach to recover high-quality microbial genomes from deeply sequenced metagenomes. Multiple metagenomes of the same community, which differ in relative population abundances, were used to assemble 31 bacterial genomes, including rare (<1% relative abundance) species, from an activated sludge bioreactor. Twelve genomes were assembled into complete or near-complete chromosomes. Four belong to the candidate bacterial phylum TM7 and represent the most complete genomes for this phylum to date (relative abundances, 0.06–1.58%). Reanalysis of published metagenomes reveals that differential coverage binning facilitates recovery of more complete and higher fidelity genome bins than other currently used methods, which are primarily based on sequence composition. This approach will be an important addition to the standard metagenome toolbox and greatly improve access to genomes of uncultured microorganisms.
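
The idea behind differential coverage binning can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' scripts: it assumes each scaffold already has a mean read coverage in two metagenomes of the same community, and it groups scaffolds whose coverage in both samples lies close to that of a chosen seed scaffold. The function name, the `tolerance` parameter, the scaffold names and the coverage values are all hypothetical.

```python
import math


def bin_by_differential_coverage(scaffolds, seed, tolerance=0.2):
    """Return sorted names of scaffolds whose coverage in both samples is near the seed's.

    scaffolds: dict mapping name -> (coverage_in_sample_a, coverage_in_sample_b)
    seed: name of a scaffold assumed to belong to the target genome
          (its coverage must be positive in both samples)
    tolerance: maximum Euclidean distance in log10 coverage space
               (hypothetical default, chosen only for this example)
    """
    seed_a, seed_b = (math.log10(c) for c in scaffolds[seed])
    selected = []
    for name, (cov_a, cov_b) in scaffolds.items():
        if cov_a <= 0 or cov_b <= 0:
            continue  # skip scaffolds with no coverage in one of the samples
        dist = math.hypot(math.log10(cov_a) - seed_a, math.log10(cov_b) - seed_b)
        if dist <= tolerance:
            selected.append(name)
    return sorted(selected)


# Made-up coverage values for illustration only.
example = {
    "scaffold_1": (120.0, 15.0),
    "scaffold_2": (110.0, 14.0),
    "scaffold_3": (118.0, 95.0),
    "scaffold_4": (12.0, 90.0),
}
print(bin_by_differential_coverage(example, "scaffold_1"))
```

In this toy input, scaffold_3 has nearly the same coverage as the seed in the first sample and is separated only by its coverage in the second sample, which is why using more than one metagenome helps resolve genomes that a single coverage value would not.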

Accessions

Primary accessions

NCBI Reference Sequence

Sequence Read Archive

Acknowledgements

This study was funded by Aalborg University and the Danish Research Council for Strategic Research via the Centre “EcoDesign-MBR” and the Obelske Family foundation. P.H. was supported by a Discovery Outstanding Researcher Award (DORA) from the Australian Research Council, grant DP120103498 and G.W.T. was supported by a QEII fellowship from the Australian Research Council, grant DP1093175. We thank S. McIlroy and P. Larsen for assistance with FISH analyses, A.M. Saunders for 16S rRNA data generation and J.P. Euzeby for suggesting the new genus name.

Author information

Affiliations

  1. Department of Biotechnology, Chemistry and Environmental Engineering, Aalborg University, Aalborg, Denmark.

    • Mads Albertsen
    • Kåre L Nielsen
    • Per H Nielsen
  2. Australian Centre for Ecogenomics, School of Chemistry & Molecular Biosciences, The University of Queensland, St. Lucia, Queensland, Australia.

    • Philip Hugenholtz
    • Adam Skarshewski
    • Gene W Tyson
  3. Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Queensland, Australia.

    • Philip Hugenholtz
  4. Advanced Water Management Centre, The University of Queensland, St. Lucia, Queensland, Australia.

    • Gene W Tyson

Authors

  1. Mads Albertsen

  2. Philip Hugenholtz

  3. Adam Skarshewski

  4. Kåre L Nielsen

  5. Gene W Tyson

  6. Per H Nielsen

Contributions

M.A., experimental design, data analysis and manuscript; P.H., data analysis and manuscript; A.S., data analysis; K.L.N., sequencing; G.W.T., data analysis and manuscript; P.H.N., experimental design and manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Per H Nielsen.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Notes, Supplementary Figures 1–13 and Supplementary Tables 1–10

Zip files

  1. Data Set 1

    All scripts used in the manuscript, including a detailed step-by-step guide and example datasets.

About this article

DOI

https://doi.org/10.1038/nbt.2579