Brief Communication

Strain-level microbial epidemiology and population genomics from shotgun metagenomics

Nature Methods volume 13, pages 435–438 (2016)

Abstract

Identifying microbial strains and characterizing their functional potential is essential for pathogen discovery, epidemiology and population genomics. We present pangenome-based phylogenomic analysis (PanPhlAn; http://segatalab.cibio.unitn.it/tools/panphlan), a tool that uses metagenomic data to achieve strain-level microbial profiling resolution. PanPhlAn recognized outbreak strains, produced the largest strain-level population genomic study of human-associated bacteria and, in combination with metatranscriptomics, profiled the transcriptional activity of strains in complex communities.
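To make the idea of pangenome-based strain profiling concrete, the following is a minimal sketch, not PanPhlAn's actual implementation. It assumes that metagenomic reads have already been mapped to a species pangenome and summarized as one coverage value per gene family. The function names, the `min_fraction` threshold, and the use of the median as a reference level are illustrative assumptions. A strain is then represented by the set of gene families called present, and two samples can be compared by the Jaccard distance between those sets.

```python
from statistics import median


def presence_absence_profile(coverage, min_fraction=0.5):
    """Call each gene family present or absent in one sample.

    coverage: dict mapping gene-family ID -> mean read coverage (hypothetical input).
    A family is called present when its coverage is at least `min_fraction`
    of the median nonzero coverage (an illustrative rule, not the published one).
    """
    nonzero = [c for c in coverage.values() if c > 0]
    if not nonzero:
        # No reads mapped at all: nothing can be called present.
        return {family: False for family in coverage}
    cutoff = min_fraction * median(nonzero)
    return {family: c >= cutoff for family, c in coverage.items()}


def jaccard_distance(profile_a, profile_b):
    """Distance between two presence/absence profiles (0 = identical gene sets)."""
    present_a = {f for f, is_present in profile_a.items() if is_present}
    present_b = {f for f, is_present in profile_b.items() if is_present}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Toy example with made-up coverage values for four gene families.
sample_coverage = {"g1": 10.0, "g2": 9.0, "g3": 0.0, "g4": 1.0}
sample_profile = presence_absence_profile(sample_coverage)
reference_profile = {"g1": True, "g2": False, "g3": False, "g4": True}
distance = jaccard_distance(sample_profile, reference_profile)
```

In this toy case the low-coverage family `g4` falls below the cutoff and is called absent, so the sample shares only `g1` with the reference profile. A real analysis would also need to handle uneven coverage, multi-copy genes, and samples in which the species is too rare to profile.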




Acknowledgements

We gratefully thank the members of the Segata lab for insightful discussions on the method, K. Schibler for his contribution to the preterm infant cohort study, and V. De Sanctis and R. Bertorelli for help in sequencing the skin samples. This work was supported by the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract number HHSN272200900018C (D.V.W., A.L.M.). The work was also supported by the People Programme (Marie Curie Actions) of the European Union's Seventh Framework Programme (FP7/2007-2013) under REA grant agreement number PCIG13-GA-2013-618833 (N.S.), by startup funds from the Centre for Integrative Biology, University of Trento (N.S.), by MIUR “Futuro in Ricerca” RBFR13EWWI_001 (N.S.), by the Fondazione Caritro–2013 (N.S.) and by 'Terme di Comano' (N.S.).

Author information

Author notes

    • Matthias Scholz
    • Doyle V Ward
    • Edoardo Pasolli

    These authors contributed equally to this work.

Affiliations

  1. Centre for Integrative Biology, University of Trento, Trento, Italy.

    • Matthias Scholz
    • Edoardo Pasolli
    • Thomas Tolio
    • Moreno Zolfo
    • Francesco Asnicar
    • Duy Tin Truong
    • Adrian Tett
    • Nicola Segata

  2. Center for Microbiome Research, University of Massachusetts Medical School, Worcester, Massachusetts, USA.

    • Doyle V Ward

  3. Department of Pediatrics, Perinatal Institute, Cincinnati, Ohio, USA.

    • Ardythe L Morrow

Authors

  1. Matthias Scholz
  2. Doyle V Ward
  3. Edoardo Pasolli
  4. Thomas Tolio
  5. Moreno Zolfo
  6. Francesco Asnicar
  7. Duy Tin Truong
  8. Adrian Tett
  9. Ardythe L Morrow
  10. Nicola Segata

Contributions

N.S. supervised the work and originally conceived the project. M.S. and D.V.W. contributed to the conception and design of the work. M.S. and T.T. implemented, validated, tested, and documented the software. M.S. and E.P. performed the experiments. A.T., D.V.W. and A.L.M. performed and provided the metagenomics sequencing. M.Z., F.A. and D.T.T. provided computational tools and performed comparative analyses. N.S. and M.S. wrote the manuscript. All authors provided feedback, edited, and approved the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Nicola Segata.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–15, Supplementary Tables 2, 4, and 5, and Supplementary Notes 1–8

Excel files

  1. Supplementary Table 1

    Synthetic and semi-synthetic metagenomes used for PanPhlAn validation.

  2. Supplementary Table 3

    German 2011 E. coli outbreak specific gene set (Fisher exact test).

  3. Supplementary Table 6

    Top 100 transcribed genes of E. coli in gut samples of healthy infants.

  4. Supplementary Table 7

    Bottom 100 transcribed genes of E. coli in gut samples of healthy infants.

  5. Supplementary Table 8

    Active pathway modules of E. coli in five gut samples of healthy infants.

Zip files

  1. Supplementary Software

    Software tool PanPhlAn for strain detection and characterization (version 1.2).

About this article

DOI

https://doi.org/10.1038/nmeth.3802
