  • Article

Phage-inclusive profiling of human gut microbiomes with Phanta

Abstract

Due to technical limitations, most gut microbiome studies have focused on prokaryotes, overlooking viruses. Phanta, a virome-inclusive gut microbiome profiling tool, overcomes the limitations of assembly-based viral profiling methods by using customized k-mer-based classification tools and incorporating recently published catalogs of gut viral genomes. Phanta’s optimizations consider the small genome size of viruses, sequence homology with prokaryotes and interactions with other gut microbes. Extensive testing of Phanta on simulated data demonstrates that it quickly and accurately quantifies prokaryotes and viruses. When applied to 245 fecal metagenomes from healthy adults, Phanta identifies ~200 viral species per sample, ~5× more than standard assembly-based methods. We observe a ~2:1 ratio between DNA viruses and bacteria, with higher interindividual variability of the gut virome compared to the gut bacteriome. In another cohort, we observe that Phanta performs equally well on bulk versus virus-enriched metagenomes, making it possible to study prokaryotes and viruses in a single experiment, with a single analysis.
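As a toy illustration of the general idea behind k-mer-based read classification mentioned above, the following minimal sketch assigns a read to whichever reference contributes the most unambiguous k-mers. It is not Phanta's implementation: the reference sequences, the labels, the value of k and the `min_fraction` threshold are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify(read, index, k, min_fraction=0.5):
    """Assign a read to the label supported by the most unambiguous k-mers.

    Returns None if the read has no k-mers, no k-mer hits, or if fewer than
    min_fraction of its distinct k-mers support the best label.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return None
    votes = Counter()
    for kmer in read_kmers:
        labels = index.get(kmer, set())
        if len(labels) == 1:  # skip k-mers shared between references
            votes[next(iter(labels))] += 1
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count / len(read_kmers) >= min_fraction else None


# Hypothetical toy references (not real genomes).
REFERENCES = {
    "phage_A": "ATGCGTACGTTAGCCGATAGCTAGGCTA",
    "bacterium_B": "TTGACCGGTAACCGTTAGGCATCGATCC",
}
K = 5
INDEX = build_index(REFERENCES, K)

if __name__ == "__main__":
    for read in ("CGTACGTTAGCCG", "GACCGGTAACCG", "AAAAAAAAAA"):
        print(read, classify(read, INDEX, K))
```

The two toy references deliberately share a few 5-mers, so the sketch also shows why k-mers common to a virus and a prokaryote are uninformative on their own: they are ignored when votes are counted.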

Fig. 1: Overview of Phanta’s comprehensive, virus-inclusive metagenomic annotation workflow.
Fig. 2: Evaluation of Phanta’s performance using simulated metagenomes.
Fig. 3: Evaluation of Phanta’s performance using shotgun gut metagenomes from 245 healthy human adults.
Fig. 4: Core properties of the healthy adult virome.
Fig. 5: Application of Phanta to n = 143 pairs of virus-enriched and bulk metagenomes from infant gut samples.

Data availability

Accession numbers of all publicly available metagenomes used for analysis are provided in Supplementary Table 11. Source data for individual figures are provided with this paper. Phanta’s databases are available from the links specified at https://github.com/bhattlab/phanta (ref. 45). There are no restrictions on data availability.

Code availability

The workflows used for the preprocessing and assembly steps described in Methods are available at https://github.com/bhattlab/bhattlab_workflows. Phanta and its postprocessing scripts are publicly available at https://github.com/bhattlab/phanta (ref. 45), along with a detailed tutorial describing installation and usage.

References

  1. Shreiner, A. B., Kao, J. Y. & Young, V. B. The gut microbiome in health and in disease. Curr. Opin. Gastroenterol. 31, 69–75 (2015).

  2. Pflughoeft, K. J. & Versalovic, J. Human microbiome in health and disease. Annu. Rev. Pathol. 7, 99–122 (2012).

  3. Cryan, J. F. et al. The microbiota-gut-brain axis. Physiol. Rev. 99, 1877–2013 (2019).

  4. Kau, A. L., Ahern, P. P., Griffin, N. W., Goodman, A. L. & Gordon, J. I. Human nutrition, the gut microbiome and the immune system. Nature 474, 327–336 (2011).

  5. Tringe, S. G. & Hugenholtz, P. A renaissance for the pioneering 16S rRNA gene. Curr. Opin. Microbiol. 11, 442–446 (2008).

  6. Drewes, J. L. et al. High-resolution bacterial 16S rRNA gene profile meta-analysis and biofilm status reveal common colorectal cancer consortia. NPJ Biofilms Microbiomes 3, 34 (2017).

  7. Romano, S. et al. Meta-analysis of the Parkinson’s disease gut microbiome suggests alterations linked to intestinal inflammation. NPJ Parkinsons Dis. 7, 27 (2021).

  8. Yatsunenko, T. et al. Human gut microbiome viewed across age and geography. Nature 486, 222–227 (2012).

  9. Davis-Richardson, A. G. et al. Bacteroides dorei dominates gut microbiome prior to autoimmunity in Finnish children at high risk for type 1 diabetes. Front. Microbiol. 5, 678 (2014).

  10. Xu, W. et al. Characterization of shallow whole-metagenome shotgun sequencing as a high-accuracy and low-cost method by complicated mock microbiomes. Front. Microbiol. 12, 678319 (2021).

  11. Sharpton, T. J. An introduction to the analysis of shotgun metagenomic data. Front. Plant Sci. 5, 209 (2014).

  12. Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J. & Segata, N. Shotgun metagenomics, from sampling to analysis. Nat. Biotechnol. 35, 833–844 (2017).

  13. Kunin, V., Copeland, A., Lapidus, A., Mavromatis, K. & Hugenholtz, P. A bioinformatician’s guide to metagenomics. Microbiol. Mol. Biol. Rev. 72, 557–578 (2008).

  14. Gregory, A. C. et al. MetaPop: a pipeline for macro- and microdiversity analyses and visualization of microbial and viral metagenome-derived populations. Microbiome 10, 49 (2022).

  15. Pandolfo, M., Telatin, A., Lazzari, G., Adriaenssens, E. M. & Vitulo, N. MetaPhage: an automated pipeline for analyzing, annotating, and classifying bacteriophages in metagenomics sequencing data. mSystems 7, e0074122 (2022).

  16. Shen, W. et al. KMCP: accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping. Bioinformatics 39, btac845 (2023).

  17. Lopera-Maya, E. A. et al. Effect of host genetics on the gut microbiome in 7,738 participants of the Dutch Microbiome Project. Nat. Genet. 54, 143–151 (2022).

  18. Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. 39, 105–114 (2021).

  19. Hiseni, P., Rudi, K., Wilson, R. C., Hegge, F. T. & Snipen, L. HumGut: a comprehensive human gut prokaryotic genomes collection filtered by metagenome data. Microbiome 9, 165 (2021).

  20. Zhernakova, A. et al. Population-based metagenomics analysis reveals markers for gut microbiome composition and diversity. Science 352, 565–569 (2016).

  21. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

  22. Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257 (2019).

  23. Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods 9, 811–814 (2012).

  24. Wood, D. E. & Salzberg, S. L. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15, R46 (2014).

  25. Kieft, K., Zhou, Z. & Anantharaman, K. VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences. Microbiome 8, 90 (2020).

  26. Roux, S., Enault, F., Hurwitz, B. L. & Sullivan, M. B. VirSorter: mining viral signal from microbial genomic data. PeerJ 3, e985 (2015).

  27. Guo, J. et al. VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses. Microbiome 9, 37 (2021).

  28. Ren, J. et al. Identifying viruses from metagenomic data using deep learning. Quant. Biol. 8, 64–77 (2020).

  29. Ren, J., Ahlgren, N. A., Lu, Y. Y., Fuhrman, J. A. & Sun, F. VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome 5, 69 (2017).

  30. Amgarten, D., Braga, L. P. P., da Silva, A. M. & Setubal, J. C. MARVEL, a tool for prediction of Bacteriophage sequences in metagenomic bins. Front. Genet. 9, 304 (2018).

  31. Fang, Z. et al. PPR-Meta: a tool for identifying phages and plasmids from metagenomic fragments using deep learning. GigaScience 8, giz066 (2019).

  32. Sutton, T. D. S., Clooney, A. G., Ryan, F. J., Ross, R. P. & Hill, C. Choice of assembly software has a critical impact on virome characterisation. Microbiome 7, 12 (2019).

  33. Nayfach, S. et al. Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome. Nat. Microbiol. 6, 960–970 (2021).

  34. Gregory, A. C. et al. The gut virome database reveals age-dependent patterns of virome diversity in the human gut. Cell Host Microbe 28, 724–740 (2020).

  35. Soto-Perez, P. et al. CRISPR-Cas system of a prevalent human gut bacterium reveals hyper-targeting against phages in a human virome catalog. Cell Host Microbe 26, 325–335 (2019).

  36. Paez-Espino, D. et al. IMG/VR v.2.0: an integrated data management and analysis system for cultivated and environmental viral genomes. Nucleic Acids Res. 47, D678–D686 (2019).

  37. Tisza, M. J. & Buck, C. B. A catalog of tens of thousands of viruses from human metagenomes reveals hidden associations with chronic diseases. Proc. Natl Acad. Sci. USA 118, e2023202118 (2021).

  38. Camarillo-Guerrero, L. F., Almeida, A., Rangel-Pineros, G., Finn, R. D. & Lawley, T. D. Massive expansion of human gut bacteriophage diversity. Cell 184, 1098–1109 (2021).

  39. Benler, S. et al. Thousands of previously unknown phages discovered in whole-community human gut metagenomes. Microbiome 9, 78 (2021).

  40. Van Espen, L. et al. A previously undescribed highly prevalent Phage identified in a Danish enteric virome catalog. mSystems 6, e0038221 (2021).

  41. Bharti, R. & Grimm, D. G. Current challenges and best-practice protocols for microbiome analysis. Brief. Bioinform. 22, 178–193 (2021).

  42. Ghurye, J. S., Cepeda-Espinoza, V. & Pop, M. Metagenomic assembly: overview, challenges and applications. Yale J. Biol. Med. 89, 353–362 (2016).

  43. Rose, R., Constantinides, B., Tapinos, A., Robertson, D. L. & Prosperi, M. Challenges in the analysis of viral metagenomes. Virus Evol. 2, vew022 (2016).

  44. Lu, J., Breitwieser, F. P., Thielen, P. & Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 3, e104 (2017).

  45. Pinto, Y., Chakraborty, M., Jain, N. & Bhatt, A. S. bhattlab/phanta. GitHub https://github.com/bhattlab/phanta (2023).

  46. Wright, R. J., Comeau, A. M. & Langille, M. G. I. From defaults to databases: parameter and database choice dramatically impact the performance of metagenomic taxonomic classification tools. Microb. Genom. 9, mgen000949 (2023).

  47. Breitwieser, F. P., Baker, D. N. & Salzberg, S. L. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Genome Biol. 19, 198 (2018).

  48. Sun, Z. et al. Challenges in benchmarking metagenomic profilers. Nat. Methods 18, 618–626 (2021).

  49. Yachida, S. et al. Metagenomic and metabolomic analyses reveal distinct stage-specific phenotypes of the gut microbiota in colorectal cancer. Nat. Med. 25, 968–976 (2019).

  50. BenLangmead. Index zone. GitHub https://benlangmead.github.io/aws-indexes/k2 (2023).

  51. Arumugam, M. et al. Enterotypes of the human gut microbiome. Nature 473, 174–180 (2011).

  52. Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).

  53. Nayfach, S. et al. CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nat. Biotechnol. 39, 578–585 (2021).

  54. Danovaro, R. & Serresi, M. Viral density and virus-to-bacterium ratio in deep-sea sediments of the Eastern Mediterranean. Appl. Environ. Microbiol. 66, 1857–1861 (2000).

  55. Weitz, J. S., Beckett, S. J., Brum, J. R., Cael, B. B. & Dushoff, J. Lysis, lysogeny and virus–microbe ratios. Nature 549, E1–E3 (2017).

  56. Marbouty, M., Thierry, A., Millot, G. A. & Koszul, R. MetaHiC phage-bacteria infection network reveals active cycling phages of the healthy human gut. eLife 10, e60608 (2021).

  57. Liang, G. et al. The stepwise assembly of the neonatal virome is modulated by breastfeeding. Nature 581, 470–474 (2020).

  58. Shkoporov, A. N. & Hill, C. Bacteriophages of the human gut: the ‘known unknown’ of the microbiome. Cell Host Microbe 25, 195–209 (2019).

  59. Moreno-Gallego, J. L. et al. Virome diversity correlates with intestinal microbiome diversity in adult monozygotic twins. Cell Host Microbe 25, 261–272 (2019).

  60. Chen, W. et al. Vast human gut virus diversity uncovered by combined short- and long-read sequencing. Preprint at bioRxiv https://doi.org/10.1101/2022.07.03.498593 (2022).

  61. Shkoporov, A. N. et al. The human gut virome is highly diverse, stable, and individual specific. Cell Host Microbe 26, 527–541 (2019).

  62. Stachler, E. & Bibby, K. Metagenomic evaluation of the highly abundant human gut Bacteriophage CrAssphage for source tracking of human fecal pollution. Environ. Sci. Technol. Lett. 1, 405–409 (2014).

  63. Dutilh, B. E. et al. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5, 4498 (2014).

  64. Benler, S. et al. A diversity-generating retroelement encoded by a globally ubiquitous Bacteroides phage. Microbiome 6, 191 (2018).

  65. Guerin, E. & Hill, C. Shining light on human gut Bacteriophages. Front. Cell. Infect. Microbiol. 10, 481 (2020).

  66. Kleiner, M., Hooper, L. V. & Duerkop, B. A. Evaluation of methods to purify virus-like particles for metagenomic sequencing of intestinal viromes. BMC Genomics 16, 7 (2015).

  67. Khan Mirzaei, M. et al. Challenges of studying the human virome - relevant emerging technologies. Trends Microbiol. 29, 171–181 (2021).

  68. Krishnamurthy, S. R. & Wang, D. Origins and challenges of viral dark matter. Virus Res. 239, 136–142 (2017).

  69. Roux, S., Hallam, S. J., Woyke, T. & Sullivan, M. B. Viral dark matter and virus–host interactions resolved from publicly available microbial genomes. eLife 4, e08490 (2015).

  70. Roux, S. et al. iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria. PLoS Biol. 21, e3002083 (2023).

  71. Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. & Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).

  72. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  73. Hockenberry, A. J. & Wilke, C. O. BACPHLIP: predicting bacteriophage lifestyle from conserved protein domains. PeerJ 9, e11396 (2021).

  74. Watts, S. C., Ritchie, S. C., Inouye, M. & Holt, K. E. FastSpar: rapid and scalable correlation estimation for compositional data. Bioinformatics 35, 1064–1066 (2019).

  75. Friedman, J. & Alm, E. J. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 8, e1002687 (2012).

  76. Fritz, A. et al. CAMISIM: simulating metagenomes and microbial communities. Microbiome 7, 17 (2019).

  77. Li, H. et al. lh3/seqtk. GitHub https://github.com/lh3/seqtk (2018).

  78. Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868 (2017).

Acknowledgements

We thank D. Maghini and B. Doyle for thoughtful comments on the manuscript; S. Nayfach, P. Hiseni, S. Salzberg and J. Lu for helpful conversations; B. Doyle and J. Wirbel for testing Phanta; and B. Siranosian, C. Nicolau, K. Bettinger, A. Behr and the Stanford Research Computing Center for computational support. Computing costs were supported, in part, by an NIH S10 Shared Instrumentation Grant (1S10OD02014101). Figure 1 was created using BioRender.com. This study was supported in part by NIH R01AI148623 and R01AI143757, a Stand Up 2 Cancer Grant, the Chan Zuckerberg Initiative, a Sloan Foundation Fellowship and the Allen Distinguished Investigator Award (to A.S.B.). Y.P. is supported by the School of Medicine Dean’s Postdoctoral Fellowship. M.C. was supported by an NIH-funded predoctoral fellowship (5T32HG000044-25) and is supported by the National Defense Science and Engineering Graduate Fellowship (starting September 2022).

Author information

Contributions

Y.P., M.C. and A.S.B. conceived the study and wrote the manuscript. Y.P. and M.C. developed Phanta. Y.P., M.C., N.J. and A.S.B. analyzed the data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ami S. Bhatt.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks Guanxiang Liang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–12.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–11.

Source data

Source Data Fig. 2

Statistical source data.

Source Data Figs. 3 and 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pinto, Y., Chakraborty, M., Jain, N. et al. Phage-inclusive profiling of human gut microbiomes with Phanta. Nat Biotechnol 42, 651–662 (2024). https://doi.org/10.1038/s41587-023-01799-4

  • DOI: https://doi.org/10.1038/s41587-023-01799-4
