Short Communication

dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication

The ISME Journal volume 11, pages 2864–2868 (2017)

Abstract

The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that reduces the computational time for pairwise genome comparisons by sequentially applying a fast, inaccurate estimation of genome distance, and a slow, accurate measure of average nucleotide identity. dRep achieves a 28 × increase in speed with perfect recall and precision when benchmarked against previously developed algorithms. We demonstrate the use of dRep for genome recovery from time-series datasets. Each metagenome was assembled separately, and dRep was used to identify groups of essentially identical genomes and select the best genome from each replicate set. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using co-assembly.
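The two-stage strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not dRep's implementation: the function names (`two_stage_clusters`, `connected_groups`, `kmer_jaccard_distance`, `positional_identity`), the cutoffs, and the toy sequences are all hypothetical. A k-mer Jaccard distance stands in for the fast estimate, and a per-position identity of equal-length strings stands in for average nucleotide identity.

```python
from itertools import combinations


def kmer_jaccard_distance(a, b, k=2):
    """Toy stand-in for a fast, approximate distance: 1 - Jaccard of k-mer sets."""
    sa = {a[i:i + k] for i in range(len(a) - k + 1)}
    sb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0


def positional_identity(a, b):
    """Toy stand-in for a slow, accurate similarity: fraction of matching positions."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def connected_groups(names, edges):
    """Group names into connected components with a small union-find."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def two_stage_clusters(genomes, coarse_cutoff, fine_cutoff):
    """Cluster coarsely with the fast measure, then refine each coarse
    cluster with the slow measure. Returns (clusters, slow_comparisons)."""
    names = sorted(genomes)
    # Stage 1: the cheap measure is applied to every pair.
    coarse_edges = [(a, b) for a, b in combinations(names, 2)
                    if kmer_jaccard_distance(genomes[a], genomes[b]) <= coarse_cutoff]
    clusters, slow_comparisons = [], 0
    # Stage 2: the expensive measure runs only within each coarse cluster.
    for group in connected_groups(names, coarse_edges):
        fine_edges = []
        for a, b in combinations(group, 2):
            slow_comparisons += 1
            if positional_identity(genomes[a], genomes[b]) >= fine_cutoff:
                fine_edges.append((a, b))
        clusters.extend(connected_groups(group, fine_edges))
    return sorted(clusters), slow_comparisons


if __name__ == "__main__":
    toy = {  # hypothetical sequences, chosen only to make the stages visible
        "g1": "AAAAAAAAAA",
        "g2": "AAAAAAAAAT",
        "g3": "AAAAATTTTT",
        "g4": "CCCCCCCCCC",
    }
    print(two_stage_clusters(toy, coarse_cutoff=0.7, fine_cutoff=0.85))
```

On this toy input the coarse pass groups g1, g2 and g3 together and leaves g4 alone, so only 3 of the 6 possible pairs reach the slow comparison. The fine pass then keeps only g1 and g2 in the same cluster. The saving comes from never running the expensive measure on pairs that the cheap measure already places far apart.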

Acknowledgements

Funding was provided by the Sloan Foundation (http://www.sloan.org/, grant number: G 2012-10-05, PI: JFB) and the National Institutes of Health (NIH; award reference number 5R01-AI-092531). This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE 1106400.

Author information

Affiliations

  1. Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA

    • Matthew R Olm
    • Christopher T Brown
    • Brandon Brooks
  2. Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA

    • Jillian F Banfield
  3. Department of Earth and Planetary Science, University of California, Berkeley, CA, USA

    • Jillian F Banfield

Competing interests

The authors declare no conflict of interest.

Corresponding author

Correspondence to Jillian F Banfield.

Supplementary information

Supplementary Information accompanies this paper on The ISME Journal website (http://www.nature.com/ismej)

About this article

DOI

https://doi.org/10.1038/ismej.2017.126