Perspective | Open

Minimum Information about an Uncultivated Virus Genome (MIUViG)

Nature Biotechnology volume 37, pages 29–37 (2019)

Abstract

We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards, which complement the Minimum Information about a Single Amplified Genome (MISAG) and Metagenome-Assembled Genome (MIMAG) standards for uncultivated bacteria and archaea, will improve the reporting of uncultivated virus genomes in public databases. In turn, this should enable more robust comparative studies and a systematic exploration of the global virosphere.

Main

Current estimates are that virus particles massively outnumber live cells in most habitats1,2, but only a tiny fraction of viruses have been cultivated in the laboratory. An unprecedented diversity of viruses is being discovered through culture-independent sequencing3. Progress has been made in reconstructing genomes of uncultivated viruses de novo, from biotic and abiotic environments, without laboratory isolation of the virus–host system. For example, in the past 2 years, more than 750,000 uncultivated virus genomes (UViGs) have been identified in metagenome and metatranscriptome datasets4,5,6,7,8,9, five times the total number of genomes sequenced from virus isolates (Fig. 1), and UViGs already represent ≥95% of the taxonomic diversity in publicly available virus sequences10,11. Although double-stranded DNA (dsDNA) genomes are over-represented in UViGs because most metagenomic protocols exclusively target dsDNA, UViGs nonetheless enable an assessment of global virus diversity and an evaluation of structure and drivers of viral communities. UViGs also contribute to improving our understanding of the evolutionary history of viruses and virus–host interactions.

Figure 1: Size of virus genome databases over time4,7,22,45,83,84,85,86,87,88,89.

Genome sequences from isolates (blue and green) or from UViGs (yellow) are shown. For genomes from isolates, the total number of genomes (blue) and the number of 'reference' genomes (green) are shown. Data were downloaded using the queries “Viruses[Organism] AND srcdb_refseq[PROP] NOT wgs[PROP] NOT cellular organisms[ORGN] NOT AC_000001:AC_999999[PACC]” for reference genomes and “Viruses[Organism] NOT cellular organisms[ORGN] NOT wgs[PROP] NOT AC_000001:AC_999999[pacc] NOT gbdiv syn[prop] AND nuccore genome samespecies[Filter]” for total number of virus genomes, on the NCBI nucleotide database portal (https://www.ncbi.nlm.nih.gov/nuccore) in January 2018. Genomes from the influenza virus database (https://www.ncbi.nlm.nih.gov/genomes/FLU/Database/nph-select.cgi?go=genomeset) were also added to the total number of virus genomes. UViGs can be assembled from metagenomes, from proviruses identified in microbial genomes, or from single-virus genomes, and estimated total UViG numbers were obtained by compiling data from the literature and from the total number of sequences in the IMG/VR database in January 2017, January 2018 and July 2018 (https://img.jgi.doe.gov/vr/)11. UpViG, uncultivated provirus.
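The reference-genome query quoted in the legend can also be issued programmatically. The following is a minimal sketch, assuming the public NCBI E-utilities `esearch` endpoint instead of the web portal used for the figure; the function name `build_count_url` is hypothetical, and counts returned today will differ from the January 2018 snapshot. The code only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the E-utilities esearch endpoint (the legend describes the web portal).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string for 'reference' virus genomes, copied from the figure legend.
REFERENCE_QUERY = (
    "Viruses[Organism] AND srcdb_refseq[PROP] NOT wgs[PROP] "
    "NOT cellular organisms[ORGN] NOT AC_000001:AC_999999[PACC]"
)


def build_count_url(term: str, db: str = "nuccore") -> str:
    """Return an esearch URL that asks only for the number of matching records."""
    params = {
        "db": db,             # NCBI nucleotide database
        "term": term,         # Entrez query, URL-encoded by urlencode
        "rettype": "count",   # request the hit count, not the ID list
        "retmode": "json",    # JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_count_url(REFERENCE_QUERY)
print(url)
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) and reading the count from the JSON response is left out here so that the sketch runs without network access.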

Analysis and interpretation of standalone genomes present substantial challenges, whether the genomes are eukaryotic, bacterial, archaeal or viral. To address these challenges, MISAG and MIMAG standards were drafted to improve the quality of reporting of microbial genomes derived from single cell or metagenome sequences, which are often incomplete12. Although some aspects of MISAG and MIMAG can be applied to UViGs, the extraordinary diversity of viral genome composition and content, replication strategies, and hosts means that the completeness, quality, taxonomy and ecology of UViGs need to be evaluated via virus-specific metrics.

The Genomic Standards Consortium (http://gensc.org) maintains metadata checklists for MIxS, encompassing genome and metagenome sequences13, marker gene sequences14 and single amplified and metagenome-assembled bacterial and archaeal genomes12. Here we present a set of standards that extend the MIxS checklists to include identification, quality assessment, analysis and reporting of UViGs (Table 1 and Supplementary Tables 1 and 2), together with recommendations on how to perform these analyses. We provide a metadata checklist for database submission and publication of UViGs designed to be flexible enough to accommodate technological and methodological changes over time (Table 1 and Supplementary Table 1). The information gathered through the MIUViG checklist can be directly submitted with new UViG sequences to International Nucleotide Sequence Database Collaboration (INSDC) member databases—the DNA Database of Japan (DDBJ), the European Molecular Biology Laboratory–European Bioinformatics Institute (EMBL-EBI) and US National Center for Biotechnology Information (NCBI)—which will host and display checklist metadata alongside the UViG sequence. These MIUViG standards should also be used along with existing guidelines for virus genome analysis, including those issued by the International Committee on Taxonomy of Viruses (ICTV), which recently endorsed the incorporation of UViGs into the official virus classification scheme15 (https://talk.ictvonline.org). Although MIUViG standards and best practices were designed for genomes of viruses infecting microorganisms, they can also be applied to viruses infecting animals, fungi and plants, and are compatible with standards that are already in place for epidemiological analysis of these viruses16 (Supplementary Table 3).

Table 1: List of mandatory metadata for UViGs

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Phage puppet masters of the marine microbial realm. Nat. Microbiol. 3, 754–766 (2018).
  2. In Viruses: Essential Agents of Life (ed. Witzany, G.) 61–81 (Springer Netherlands, 2012).
  3. Rising to the challenge: accelerated pace of discovery transforms marine virology. Nat. Rev. Microbiol. 13, 147–159 (2015).
  4. Uncovering Earth's virome. Nature 536, 425–430 (2016).
  5. Redefining the invertebrate RNA virosphere. Nature 540, 539–543 (2016).
  6. Diverse circular replication-associated protein encoding viruses circulating in invertebrates within a lake ecosystem. Infect. Genet. Evol. 39, 304–316 (2016).
  7. Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses. Nature 537, 689–693 (2016).
  8. Temporal dynamics of uncultured viruses: a new dimension in viral diversity. ISME J. 12, 199–211 (2018).
  9. Genomic exploration of individual giant ocean viruses. ISME J. 11, 1736–1745 (2017).
  10. NCBI viral genomes resource. Nucleic Acids Res. 43, D571–D577 (2015).
  11. IMG/VR: a database of cultured and uncultured DNA viruses and retroviruses. Nucleic Acids Res. 45, D457–D465 (2017).
  12. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat. Biotechnol. 35, 725–731 (2017).
  13. The minimum information about a genome sequence (MIGS) specification. Nat. Biotechnol. 26, 541–547 (2008).
  14. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat. Biotechnol. 29, 415–420 (2011).
  15. Consensus statement: virus taxonomy in the age of metagenomics. Nat. Rev. Microbiol. 15, 161–168 (2017).
  16. Standards for sequencing viral genomes in the era of high-throughput sequencing. MBio 5, e01360–e14 (2014).
  17. Laboratory procedures to generate viral metagenomes. Nat. Protoc. 4, 470–483 (2009).
  18. Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2, 63–77 (2012).
  19. Towards quantitative metagenomics of wild viruses and other ultra-low concentration DNA samples: a rigorous assessment and optimization of the linker amplification method. Environ. Microbiol. 14, 2526–2537 (2012).
  20. Enhanced virome sequencing using targeted sequence capture. Genome Res. 25, 1910–1920 (2015).
  21. Single virus genomics: a new tool for virus discovery. PLoS One 6, e17722 (2011).
  22. Single-virus genomics reveals hidden cosmopolitan and abundant viruses. Nat. Commun. 8, 15892 (2017).
  23. Improved genome recovery and integrated cell-size analyses of individual uncultured microbial cells and viral particles. Nat. Commun. 8, 84 (2017).
  24. Clinical and biological insights from viral genome sequencing. Nat. Rev. Microbiol. 15, 183–192 (2017).
  25. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes. ISME J. 7, 1678–1695 (2013).
  26. Viral dark matter and virus-host interactions resolved from publicly available microbial genomes. Elife 4, e08490 (2015).
  27. Prophage genomics reveals patterns in phage genome organization and replication. Preprint at bioRxiv (2017).
  28. Prophages and bacterial genomics: what have we learned so far? Mol. Microbiol. 49, 277–300 (2003).
  29. Genome diversity of marine phages recovered from Mediterranean metagenomes: size matters. PLoS Genet. 13, e1007018 (2017).
  30. Assessment of viral community functional potential from viral metagenomes may be hampered by contamination with cellular sequences. Open Biol. 3, 130160 (2013).
  31. Diverse uncultivated ultra-small bacterial cells in groundwater. Nat. Commun. 6, 6372 (2015).
  32. Mobile genetic elements: the agents of open source evolution. Nat. Rev. Microbiol. 3, 722–732 (2005).
  33. Importance of widespread gene transfer agent genes in alpha-proteobacteria. Trends Microbiol. 15, 54–62 (2007).
  34. Membrane vesicles in sea water: heterogeneous DNA content and implications for viral abundance estimates. ISME J. 11, 394–404 (2017).
  35. PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res. 44, W16–W21 (2016).
  36. VirSorter: mining viral signal from microbial genomic data. PeerJ 3, e985 (2015).
  37. MARVEL, a tool for prediction of bacteriophage sequences in metagenomic bins. Front. Genet. 9, 304 (2018).
  38. VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome 5, 69 (2017).
  39. VirusSeeker, a computational pipeline for virus discovery and virome composition analysis. Virology 503, 21–30 (2017).
  40. Nontargeted virus sequence discovery pipeline and virus clustering for metagenomic data. Nat. Protoc. 12, 1673–1682 (2017).
  41. Diversity and dynamics of algal Megaviridae members during a harmful brown tide caused by the pelagophyte, Aureococcus anophagefferens. FEMS Microbiol. Ecol. 92, fiw058 (2016).
  42. Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses. Proc. Natl. Acad. Sci. USA 111, 15786–15791 (2014).
  43. Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology. ISME J. 11, 2479–2491 (2017).
  44. Shotgun metagenomics indicates novel family A DNA polymerases predominate within marine virioplankton. ISME J. 8, 103–114 (2014).
  45. Metagenomic analysis of coastal RNA virus communities. Science 312, 1795–1798 (2006).
  46. Ecological dynamics and co-occurrence among marine phytoplankton, bacteria and myoviruses shows microdiversity matters. ISME J. 11, 1614–1629 (2017).
  47. Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity. PeerJ 5, e3817 (2017).
  48. The viral metagenome annotation pipeline (VMGAP): an automated tool for the functional annotation of viral metagenomic shotgun sequencing data. Stand. Genomic Sci. 4, 418–429 (2011).
  49. Phage genome annotation using the RAST pipeline. Methods Mol. Biol. 1681, 231–238 (2018).
  50. Towards viral genome annotation standards, report from the 2010 NCBI Annotation Workshop. Viruses 2, 2258–2268 (2010).
  51. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
  52. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
  53. Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951–960 (2005).
  54. Use of profile hidden Markov models in viral discovery: current insights. Adv. Genomics Genet. 7, 29–45 (2017).
  55. Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer. BMC Genomics 17, 930 (2016).
  56. Comparative omics and trait analyses of marine Pseudoalteromonas phages advance the phage OTU concept. Front. Microbiol. 8, 1241 (2017).
  57. Bacteriophage evolution differs by host, lifestyle and genome. Nat. Microbiol. 2, 17112 (2017).
  58. The genomic underpinnings of eukaryotic virus taxonomy: creating a sequence-based framework for family-level virus classification. Microbiome 6, 38 (2018).
  59. vConTACT: an iVirus tool to classify double-stranded DNA viruses that infect Archaea and Bacteria. PeerJ 5, e3243 (2017).
  60. Expanding the marine virosphere using metagenomics. PLoS Genet. 9, e1003987 (2013).
  61. Implementation of objective PASC-derived taxon demarcation criteria for official classification of filoviruses. Viruses 9, E106 (2017).
  62. Sequence-based taxonomic framework for the classification of uncultured single-stranded DNA viruses of the family Genomoviridae. Virus Evol. 3, vew037 (2017).
  63. The phage proteomic tree: a genome-based taxonomy for phage. J. Bacteriol. 184, 4529–4535 (2002).
  64. Classification of Myoviridae bacteriophages using protein sequence similarity. BMC Microbiol. 9, 224 (2009).
  65. ViPTree: the viral proteomic tree server. Bioinformatics 33, 2379–2380 (2017).
  66. VICTOR: genome-based phylogeny and classification of prokaryotic viruses. Bioinformatics 33, 3396–3404 (2017).
  67. Expression of animal virus genomes. Bacteriol. Rev. 35, 235–241 (1971).
  68. Computational approaches to predict bacteriophage-host relationships. FEMS Microbiol. Rev. 40, 258–272 (2016).
  69. Linking virus genomes with host taxonomy. Viruses 8, 66 (2016).
  70. HostPhinder: a phage host prediction tool. Viruses 8, 116 (2016).
  71. Reconstructing viral genomes from the environment using fosmid clones: the case of haloviruses. PLoS One 7, e33802 (2012).
  72. Ecology and evolution of viruses infecting uncultivated SUP05 bacteria as revealed by single-cell- and meta-genomics. Elife 3, e03125 (2014).
  73. Single-cell genomics-based analysis of virus-host interactions in marine surface bacterioplankton. ISME J. 9, 2386–2399 (2015).
  74. WIsH: who is the host? Predicting prokaryotic hosts from metagenomic phage contigs. Bioinformatics 33, 3113–3114 (2017).
  75. Alignment-free d2* oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences. Nucleic Acids Res. 45, 39–53 (2017).
  76. Gnotobiotic mouse model of phage-bacterial host dynamics in the human gut. Proc. Natl. Acad. Sci. USA 110, 20236–20241 (2013).
  77. Determinants of community structure in the global plankton interactome. Science 348, 1262073 (2015).
  78. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5, 4498 (2014).
  79. Limitations of correlation-based inference in complex virus-microbe communities. mSystems 3, e00084–18 (2018).
  80. MVP: a microbe-phage interaction database. Nucleic Acids Res. 46, D700–D707 (2018).
  81. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes. Front. Microbiol. 6, 381 (2015).
  82. 50 years of the International Committee on Taxonomy of Viruses: progress and prospects. Arch. Virol. 162, 1441–1446 (2017).
  83. Gut DNA viromes of Malawian twins discordant for severe acute malnutrition. Proc. Natl. Acad. Sci. USA 112, 11941–11946 (2015).
  84. Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728–1732 (2005).
  85. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005).
  86. The marine viromes of four oceanic regions. PLoS Biol. 4, e368 (2006).
  87. Prophinder: a computational tool for prophage prediction in prokaryotic genomes. Bioinformatics 24, 863–865 (2008).
  88. Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature 466, 334–338 (2010).
  89. Single-cell genomics reveals organismal interactions in uncultivated marine protists. Science 332, 714–717 (2011).
  90. The classification of viruses. J. Gen. Microbiol. 12, 358–361 (1955).
  91. A system of viruses. Cold Spring Harb. Symp. Quant. Biol. 27, 51–55 (1962).
  92. The new provisional committee on nomenclature of viruses. Int. Bull. Bacteriol. Nomencl. Taxon. 14, 53–56 (1964).
  93. Changes to taxonomy and the International Code of Virus Classification and Nomenclature ratified by the International Committee on Taxonomy of Viruses. Arch. Virol. 163, 2601–2631 (2018).

Acknowledgements

This work was supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under US Department of Energy Contract No. DE-AC02-05CH11231 for S.R.; the Netherlands Organization for Scientific Research (NWO) Vidi grant 864.14.004 for B.E.D.; the Intramural Research Program of the National Library of Medicine, National Institutes of Health for E.V.K., I.K.M., J.R.B. and N.Y.; the Virus-X project (EU Horizon 2020, No. 685778) for F.E. and M.K.; Battelle Memorial Institute's prime contract with the US National Institute of Allergy and Infectious Diseases (NIAID) under Contract No. HHSN272200700016I for J.H.K.; the GOA grant “Bacteriophage Biosystems” from KU Leuven for R.L.; the European Molecular Biology Laboratory for C.A. and G.R.C.; Cairo University Grant 2016-57 for R.K.A.; National Science Foundation award 1456778, National Institutes of Health awards R01 AI132581 and R21 HD086833, and The Vanderbilt Microbiome Initiative award for S.R.B.; National Science Foundation awards DEB-1239976 for M.B. and K.R. and DEB-1555854 for M.B.; the NSF Early Career award DEB-1555854 and NSF Dimensions of Biodiversity #1342701 for K.C.W. and R.A.D.; the Agence Nationale de la Recherche JCJC grant ANR-13-JSV6-0004 and Investissements d'Avenir Méditerranée Infection 10-IAHU-03 for C.D.; the Gordon and Betty Moore Foundation Marine Microbiology Initiative No. 3779 and the Simons Foundation for J.A.F.; the French government “Investissements d'Avenir” program OCEANOMICS ANR-11-BTBR-0008 and European FEDER Fund 1166-39417 for P. Hingamp; Australian Research Council Laureate Fellowship FL150100038 to P. Hugenholtz; the National Science Foundation award 1801367 and C-DEBI Research Grant for J.M.L.; the Gordon and Betty Moore Foundation grant 5334 and Ministry of Economy and Competitivity refs. CGL2013-40564-R and SAF2013-49267-EXP for M.M.-G.; the Grant-in-Aid for Scientific Research on Innovative Areas from the Ministry of Education, Culture, Science, Sports, and Technology (MEXT) of Japan No. 16H06429, 16K21723, and 16H06437 for H.O. and T.Y.; National Science Foundation award DBI-1661357 to C.P.; the Ministry of Economy and Competitivity ref CGL2016-76273-P (cofunded with FEDER funds) for F.R.-V.; the Gordon and Betty Moore Foundation awards 3305 and 3790 and NSF Biological Oceanography OCE 1536989 for M.B.S.; the ETH Zurich and Helmut Horten Foundation and the Novartis Foundation for Medical-Biological Research (17B077) for S.S.; a BIOS-SCOPE award from Simons Foundation International and NERC award NE/P008534/1 to B.T.; NSF Biological Oceanography Grant 1635913 for R.V.T.; the Australian Research Council Future Fellowship FT120100480 for N.S.W.; a Gilead Sciences Cystic Fibrosis Research Scholarship for K.L.W.; Gordon and Betty Moore Foundation Grant 4971 for S.W.W.; the NSF EPSCoR grant 1736030 for K.E.W.; the National Science Foundation award DEB-4W4596 and National Institutes of Health award R01 GM117361 for M.J.Y.; the Gordon and Betty Moore Foundation No. 7000 and the National Oceanic and Atmospheric Administration (NOAA) under award NA15OAR4320071 for L.Z.A. DDBJ is supported by ROIS and MEXT. The work conducted by the US Department of Energy Joint Genome Institute is supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the US Department of Health and Human Services or of the institutions and companies affiliated with the authors. B.E.D., A.K., M.K., J.H.K., R.L. and A.V. are members of the ICTV Executive Committee, but the views and opinions expressed are those of the authors and not those of the ICTV.

Author information

Affiliations

  1. US Department of Energy Joint Genome Institute, Walnut Creek, California, USA.

    • Simon Roux
    • Natalia N Ivanova
    • Rex R Malmstrom
    • David Páez-Espino
    • Frederik Schulz
    • Susannah G Tringe
    • Tanja Woyke
    • Nikos C Kyrpides
    • Emiley A Eloe-Fadrosh
  2. Institute of Integrative Biology, University of Liverpool, Liverpool, UK.

    • Evelien M Adriaenssens
  3. Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, the Netherlands.

    • Bas E Dutilh
  4. Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre, Nijmegen, the Netherlands.

    • Bas E Dutilh
  5. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.

    • Eugene V Koonin
    • J Rodney Brister
    • Ilene Karsch Mizrachi
    • Natalya Yutin
  6. Department of Pathobiology, Ontario Veterinary College, University of Guelph, Guelph, Ontario, Canada.

    • Andrew M Kropinski
  7. Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France.

    • Mart Krupovic
  8. Integrated Research Facility at Fort Detrick, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Fort Detrick, Frederick, Maryland, USA.

    • Jens H Kuhn
  9. KU Leuven, Laboratory of Gene Technology, Heverlee, Belgium.

    • Rob Lavigne
  10. Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, Arizona, USA.

    • Arvind Varsani
  11. Structural Biology Research Unit, Department of Integrative Biomedical Sciences, University of Cape Town, Observatory, Cape Town, South Africa.

    • Arvind Varsani
  12. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK.

    • Clara Amid
    • Guy R Cochrane
  13. Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt.

    • Ramy K Aziz
  14. Departments of Biological Sciences and Pathology, Microbiology, and Immunology, Vanderbilt Institute for Infection, Immunology and Inflammation, Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee, USA.

    • Seth R Bordenstein
  15. European Molecular Biology Laboratory, Heidelberg, Germany.

    • Peer Bork
  16. College of Marine Science, University of South Florida, Saint Petersburg, Florida, USA.

    • Mya Breitbart
    • Karyna Rosario
  17. Soil and Crop Sciences Department, Colorado State University, Fort Collins, Colorado, USA.

    • Rebecca A Daly
    • Kelly C Wrighton
  18. Aix-Marseille Université, CNRS, MEPHI, IHU Méditerranée Infection, Marseille, France.

    • Christelle Desnues
  19. Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, USA.

    • Melissa B Duhaime
  20. University of California, Davis, Department of Plant Pathology, Davis, California, USA.

    • Joanne B Emerson
  21. LMGE, UMR 6023 CNRS, Université Clermont Auvergne, Aubière, France.

    • François Enault
  22. University of Southern California, Los Angeles, Los Angeles, California, USA.

    • Jed A Fuhrman
  23. Aix Marseille Université, Université de Toulon, CNRS, IRD, MIO UM 110, Marseille, France.

    • Pascal Hingamp
  24. Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, St. Lucia, Queensland, Australia.

    • Philip Hugenholtz
    • Nicole S Webster
  25. Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, Arizona, USA.

    • Bonnie L Hurwitz
  26. BIO5 Research Institute, University of Arizona, Tucson, Arizona, USA.

    • Bonnie L Hurwitz
  27. Department of Marine Biology, Texas A&M University at Galveston, Galveston, Texas, USA.

    • Jessica M Labonté
  28. DDBJ Center, National Institute of Genetics, Mishima, Shizuoka, Japan.

    • Kyung-Bum Lee
  29. Department of Physiology, Genetics and Microbiology, University of Alicante, Alicante, Spain.

    • Manuel Martinez-Garcia
  30. Institute for Chemical Research, Kyoto University, Uji, Japan.

    • Hiroyuki Ogata
  31. Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France.

    • Marie-Agnès Petit
  32. Department of Biology, Loyola University Chicago, Chicago, Illinois, USA.

    • Catherine Putonti
  33. Bioinformatics Program, Loyola University Chicago, Chicago, Illinois, USA.

    • Catherine Putonti
  34. Department of Computer Science, Loyola University Chicago, Chicago, Illinois, USA.

    • Catherine Putonti
  35. Division of Computational Systems Biology, Department of Microbiology and Ecosystem Science, Research Network “Chemistry Meets Microbiology,” University of Vienna, Vienna, Austria.

    • Thomas Rattei
  36. Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia.

    • Alejandro Reyes
  37. Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Alicante, Spain.

    • Francisco Rodriguez-Valera
  38. University of Maryland School of Medicine, Baltimore, Maryland, USA.

    • Lynn Schriml
  39. Center for Microbial Oceanography: Research and Education, Department of Oceanography, University of Hawai'i at Mānoa, Honolulu, Hawai'i, USA.

    • Grieg F Steward
  40. Department of Microbiology, The Ohio State University, Columbus, Ohio, USA.

    • Matthew B Sullivan
  41. Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, Columbus, Ohio, USA.

    • Matthew B Sullivan
  42. ETH Zurich, Department of Biology, Zurich, Switzerland.

    • Shinichi Sunagawa
  43. Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, Vancouver, British Columbia, Canada.

    • Curtis A Suttle
  44. Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada.

    • Curtis A Suttle
  45. Department of Microbiology and Immunology, University of British Columbia, Vancouver, British Columbia, Canada.

    • Curtis A Suttle
  46. Institute of Oceans and Fisheries, University of British Columbia, Vancouver, British Columbia, Canada.

    • Curtis A Suttle
  47. School of Biosciences, University of Exeter, Exeter, UK.

    • Ben Temperton
  48. Department of Microbiology, Oregon State University, Oregon, USA.

    • Rebecca Vega Thurber
  49. Australian Institute of Marine Science, Townsville, Queensland, Australia.

    • Nicole S Webster
  50. Department of Molecular Biology and Biochemistry, University of California, Irvine, California, USA.

    • Katrine L Whiteson
  51. Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA.

    • Steven W Wilhelm
  52. University of Delaware, Delaware Biotechnology Institute, Newark, Delaware, USA.

    • K Eric Wommack
  53. Microbial Physiology Group, Max Planck Institute for Marine Microbiology, Bremen, Germany.

    • Pelin Yilmaz
  54. Graduate School of Agriculture, Kyoto University, Kitashirakawa-Oiwake, Kyoto, Japan.

    • Takashi Yoshida
  55. Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, Montana, USA.

    • Mark J Young
  56. J Craig Venter Institute, La Jolla, California, USA.

    • Lisa Zeigler Allen
  57. Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California, USA.

    • Lisa Zeigler Allen

Authors

  1. Simon Roux
  2. Evelien M Adriaenssens
  3. Bas E Dutilh
  4. Eugene V Koonin
  5. Andrew M Kropinski
  6. Mart Krupovic
  7. Jens H Kuhn
  8. Rob Lavigne
  9. J Rodney Brister
  10. Arvind Varsani
  11. Clara Amid
  12. Ramy K Aziz
  13. Seth R Bordenstein
  14. Peer Bork
  15. Mya Breitbart
  16. Guy R Cochrane
  17. Rebecca A Daly
  18. Christelle Desnues
  19. Melissa B Duhaime
  20. Joanne B Emerson
  21. François Enault
  22. Jed A Fuhrman
  23. Pascal Hingamp
  24. Philip Hugenholtz
  25. Bonnie L Hurwitz
  26. Natalia N Ivanova
  27. Jessica M Labonté
  28. Kyung-Bum Lee
  29. Rex R Malmstrom
  30. Manuel Martinez-Garcia
  31. Ilene Karsch Mizrachi
  32. Hiroyuki Ogata
  33. David Páez-Espino
  34. Marie-Agnès Petit
  35. Catherine Putonti
  36. Thomas Rattei
  37. Alejandro Reyes
  38. Francisco Rodriguez-Valera
  39. Karyna Rosario
  40. Lynn Schriml
  41. Frederik Schulz
  42. Grieg F Steward
  43. Matthew B Sullivan
  44. Shinichi Sunagawa
  45. Curtis A Suttle
  46. Ben Temperton
  47. Susannah G Tringe
  48. Rebecca Vega Thurber
  49. Nicole S Webster
  50. Katrine L Whiteson
  51. Steven W Wilhelm
  52. K Eric Wommack
  53. Tanja Woyke
  54. Kelly C Wrighton
  55. Pelin Yilmaz
  56. Takashi Yoshida
  57. Mark J Young
  58. Natalya Yutin
  59. Lisa Zeigler Allen
  60. Nikos C Kyrpides
  61. Emiley A Eloe-Fadrosh

Contributions

All authors participated in writing the manuscript and provided critical feedback. S.R. performed the analyses for the supplementary notes and figures.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Simon Roux or Emiley A Eloe-Fadrosh.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–5

  2. Supplementary Notes

    Supplementary Notes 1–4

Excel files

  1. Supplementary Table 1

    List of mandatory and optional metadata for UViGs

  2. Supplementary Table 2

    List of metadata from previous standards relevant for UViGs21

  3. Supplementary Table 3

    Comparison between UViGs categories and the quality categories proposed for small DNA/RNA virus whole-genome sequencing for epidemiology and surveillance by Ladner et al.22

  4. Supplementary Table 4

    List and characteristics of tools used to identify virus sequences in mixed datasets published or updated since 2012 (refs. 23–31)

  5. Supplementary Table 5

    Variation in genome length for virus families and genera with two or more genomes, from NCBI RefSeq v83.

  6. Supplementary Table 6

    List of potential marker genes for virus orders, families or genera, based on the VOGdb v83 (http://vogdb.org/)

  7. Supplementary Table 7

    List of UViGs from the GOV dataset4 considered as high-quality drafts or finished genomes

  8. Supplementary Table 8

    List of databases providing collections of HMM profiles for virus protein families32–35

  9. Supplementary Table 9

    Current species demarcation criteria from ICTV ninth and tenth reports.

  10. Supplementary Table 10

    Approaches available for in silico host prediction18,37–42

About this article

DOI

https://doi.org/10.1038/nbt.4306