Metagenomic microbial community profiling using unique clade-specific marker genes

Abstract

Metagenomic shotgun sequencing data can identify microbes populating a microbial community and their proportions, but existing taxonomic profiling methods are inefficient for increasingly large data sets. We present an approach that uses clade-specific marker genes to unambiguously assign reads to microbial clades more accurately and >50× faster than current approaches. We validated our metagenomic phylogenetic analysis tool, MetaPhlAn, on terabases of short reads and provide the largest metagenomic profiling to date of the human gut. MetaPhlAn can be accessed at http://huttenhower.sph.harvard.edu/metaphlan/.
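
The idea of assigning reads to clades through clade-specific markers can be illustrated with a minimal sketch. This is not MetaPhlAn's implementation: the function name, the input structures and the normalization (reads per clade divided by that clade's total marker length, then rescaled to sum to 1) are assumptions made for illustration only, and the read-to-marker mapping is assumed to have been computed already by an aligner.

```python
from collections import Counter


def clade_relative_abundance(read_hits, marker_to_clade, marker_length):
    """Toy estimate of clade relative abundances from marker-gene hits.

    read_hits: iterable of marker IDs, one entry per mapped read (hypothetical input).
    marker_to_clade: dict mapping each marker ID to the single clade it is specific to.
    marker_length: dict mapping each marker ID to its length in nucleotides.
    """
    # Because each marker is specific to one clade, a hit assigns the read unambiguously.
    reads_per_clade = Counter(marker_to_clade[marker] for marker in read_hits)

    # Sum marker lengths per clade so that clades with more (or longer)
    # markers are not over-counted.
    length_per_clade = Counter()
    for marker, clade in marker_to_clade.items():
        length_per_clade[clade] += marker_length[marker]

    # Length-normalized coverage per clade (assumed normalization, see lead-in).
    coverage = {
        clade: reads_per_clade.get(clade, 0) / total_length
        for clade, total_length in length_per_clade.items()
    }

    total = sum(coverage.values())
    if total == 0:
        return {clade: 0.0 for clade in coverage}
    return {clade: value / total for clade, value in coverage.items()}


if __name__ == "__main__":
    # Hypothetical markers: two for clade A, one shorter marker for clade B.
    marker_to_clade = {"m1": "A", "m2": "A", "m3": "B"}
    marker_length = {"m1": 1000, "m2": 1000, "m3": 500}
    read_hits = ["m1"] * 4 + ["m2"] * 4 + ["m3"] * 2
    print(clade_relative_abundance(read_hits, marker_to_clade, marker_length))
```

In this toy input, clade A receives four times as many reads as clade B but also has four times the total marker length, so the length normalization assigns the two clades equal relative abundance.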

Figure 1: Comparison of MetaPhlAn to existing methods.
Figure 2: Composition of healthy vaginal microbiota.
Figure 3: The gut microbiota in asymptomatic Western populations as inferred by MetaPhlAn on 224 samples combining the HMP and MetaHIT cohorts.

Acknowledgements

We would like to thank F. Stewart and E. DeLong for their helpful input during this study; D. Gevers, S. Sykes and K. Huang for their feedback on the methodology and J. Reyes and G. Weingart for their assistance with the implementation. This work was supported by US National Institutes of Health grant 1R01HG005969 and National Science Foundation grant DBI-1053486 to C.H.

Author information

N.S., A.B., O.J. and C.H. conceived the method; N.S. implemented the software; N.S. and C.H. performed the experiments; N.S., L.W., V.N. and C.H. analyzed the data; and N.S. and C.H. wrote the manuscript.

Correspondence to Curtis Huttenhower.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7, Supplementary Tables 1 and 2 and Supplementary Notes 1–3 (PDF 2101 kb)

About this article

Cite this article

Segata, N., Waldron, L., Ballarini, A. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods 9, 811–814 (2012) doi:10.1038/nmeth.2066
