  • Review Article
  • Published: 12 September 2017

Shotgun metagenomics, from sampling to analysis

A Corrigendum to this article was published on 08 December 2017

This article has been updated

Abstract

Diverse microbial communities of bacteria, archaea, viruses and single-celled eukaryotes have crucial roles in the environment and in human health. However, microbes are frequently difficult to culture in the laboratory, which can confound the cataloging of community members and the understanding of how communities function. High-throughput sequencing technologies and a suite of computational pipelines have been combined into shotgun metagenomics methods that have transformed microbiology. Still, computational approaches are needed to overcome the challenges that affect both assembly-based and mapping-based metagenomic profiling, particularly for high-complexity samples or environments containing organisms with limited similarity to sequenced genomes. Understanding the functions and characterizing specific strains of these communities offers biotechnological promise in therapeutic discovery and in innovative ways to synthesize products using microbial factories, and can pinpoint the contributions of microorganisms to planetary, animal and human health.

Figure 1: Summary of a metagenomics workflow.
Figure 2: Assembly-based and assembly-free metagenome profiling.
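
As a toy illustration of the assembly-free (mapping-based) profiling named in the abstract and in Figure 2, the sketch below assigns reads to reference sequences by exact k-mer matching and reports relative abundances. It is a minimal sketch under stated assumptions, not the authors' pipeline and not the algorithm of any particular tool: the function names, the value of k, the taxon labels and the short sequences are all hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: two made-up "reference genomes" and four reads.
TOY_REFERENCES = {
    "taxon_A": "ACGTACGGTCAATGCCGTA",
    "taxon_B": "TTGACCGATTGGCATACGA",
}
TOY_READS = ["ACGTACGGTCAA", "CGATTGGCATAC", "GGTCAATGCCGT", "NNNNNNNNNNNN"]


def kmers(seq, k):
    """Yield every overlapping k-mer of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def classify_read(read, index, k):
    """Return the reference with the most reference-specific k-mer hits.

    Returns None when the read has no specific hits or the top two
    references are tied.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        hits = index.get(kmer, ())
        if len(hits) == 1:  # count only k-mers found in a single reference
            votes[next(iter(hits))] += 1
    ranked = votes.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def profile(reads, references, k=8):
    """Return the fraction of classified reads assigned to each reference."""
    index = build_index(references, k)
    counts = Counter()
    for read in reads:
        label = classify_read(read, index, k)
        if label is not None:
            counts[label] += 1
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


toy_profile = profile(TOY_READS, TOY_REFERENCES)
```

With the toy data above, two reads match only taxon_A, one matches only taxon_B and the all-N read stays unclassified, so the profile is computed over three classified reads. Real data would additionally require handling of reverse complements, sequencing errors and genome-length normalization, all of which this sketch omits.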

Change history

  • 12 September 2017

    In the version of this article initially published, the Competing Financial Interests should have indicated the authors had competing interests, but instead indicated there were none. The detailed statement was missing from the HTML: J.T.S. receives research funding from Oxford Nanopore Technologies and has received travel and accommodations to speak at meetings hosted by Oxford Nanopore Technologies. N.J.L. has received honoraria to speak at Oxford Nanopore and Illumina meetings, and travel and accommodation to attend company-sponsored meetings. N.J.L. has ongoing research collaborations with Oxford Nanopore who have provided free-of-charge sequencing reagents as part of the MinION Access Programme and directly in support of research projects. In addition, the publication date was given as 11 September, rather than 12 September 2017. The errors have been corrected for the PDF and HTML versions of this article.

Acknowledgements

A.W.W. and the Rowett Institute receive core funding support from the Scottish Government's Rural and Environmental Science and Analysis Service (RESAS). N.S. is supported by the European Research Council (ERC-STG project MetaPG), a European Union Framework Program 7 Marie-Curie grant (PCIG13-618833), a MIUR grant (FIR RBFR13EWWI), a Fondazione Caritro grant (Rif.Int.2013.0239) and a Terme di Comano grant. C.Q. and N.J.L. are funded through a MRC bioinformatics fellowship (MR/M50161X/1) as part of the MRC Cloud Infrastructure for Microbial Bioinformatics (CLIMB) consortium (MR/L015080/1). J.T.S. is supported by the Ontario Institute for Cancer Research through funding provided by the Government of Ontario.

Author information

Contributions

C.Q., A.W.W., J.T.S., N.J.L. and N.S. drafted the paper, revised the text and designed figures, tables and boxes. C.Q. and N.S. performed the metagenomic analyses described in the manuscript.

Corresponding author

Correspondence to Nicola Segata.

Ethics declarations

Competing interests

J.T.S. receives research funding from Oxford Nanopore Technologies and has received travel and accommodations to speak at meetings hosted by Oxford Nanopore Technologies. N.J.L. has received honoraria to speak at Oxford Nanopore and Illumina meetings, and travel and accommodation to attend company-sponsored meetings. N.J.L. has ongoing research collaborations with Oxford Nanopore who have provided free-of-charge sequencing reagents as part of the MinION Access Programme and directly in support of research projects.

Integrated supplementary information

Supplementary Figure 1 Example workflow for planning a metagenomics study

The advice presented here is aimed at entry-level researchers in this area, with a particular focus on hypothesis-driven experiments, which may be designed very differently from exploratory or hypothesis-generating studies. Key considerations for study design (blue box), sample collection (green box) and experimental procedures (yellow box) are highlighted. Understanding potential confounding factors and optimizing the design can substantially improve both the quality of metagenomic sequence data and its interpretation. Supplementary Table 1 contains further specific recommendations.

Supplementary information

Supplementary Text and Figures

Supplementary Figure 1 (PDF 208 kb)

Life Sciences Reporting Summary (PDF 128 kb)

Supplementary Code 1

Supporting scripts and pipeline description. (ZIP 91 kb)

Supplementary Box 1

Problems and solutions for study and design. (PDF 287 kb)

About this article

Cite this article

Quince, C., Walker, A., Simpson, J. et al. Shotgun metagenomics, from sampling to analysis. Nat Biotechnol 35, 833–844 (2017). https://doi.org/10.1038/nbt.3935

  • DOI: https://doi.org/10.1038/nbt.3935
