Letter

Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses

  • Nature volume 537, pages 689–693 (29 September 2016)
  • doi:10.1038/nature19366

Abstract

Ocean microbes drive biogeochemical cycling on a global scale1. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories2,3. Owing to challenges with the sampling and cultivation of viruses, genome-level viral diversity remains poorly described and grossly understudied, with less than 1% of observed surface-ocean viruses known4. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions5,6, and analyse the resulting ‘global ocean virome’ dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts. A total of 15,222 epipelagic and mesopelagic viral populations were identified, comprising 867 viral clusters (defined as approximately genus-level groups7,8). This roughly triples the number of known ocean viral populations4 and doubles the number of candidate bacterial and archaeal virus genera8, providing a near-complete sampling of epipelagic communities at both the population and viral-cluster level. We found that 38 of the 867 viral clusters were locally or globally abundant, together accounting for nearly half of the viral populations in any global ocean virome sample. While two-thirds of these clusters represent newly described viruses lacking any cultivated representative, most could be computationally linked to dominant, ecologically relevant microbial hosts. Moreover, we identified 243 viral-encoded auxiliary metabolic genes, of which only 95 were previously known. Deeper analyses of four of these auxiliary metabolic genes (dsrC, soxYZ, P-II (also known as glnB) and amoC) revealed that abundant viruses may directly manipulate sulfur and nitrogen cycling throughout the epipelagic ocean. This viral catalog and functional analyses provide a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.

References

  1. The microbial engines that drive Earth’s biogeochemical cycles. Science 320, 1034–1039 (2008)
  2. Viruses manipulate the marine environment. Nature 459, 207–212 (2009)
  3. Rising to the challenge: accelerated pace of discovery transforms marine virology. Nat. Rev. Microbiol. 13, 147–159 (2015)
  4. Patterns and ecological drivers of ocean viral communities. Science 348, 1261498 (2015)
  5. A holistic approach to marine eco-systems biology. PLoS Biol. 9, e1001177 (2011)
  6. Seafaring in the 21st century: the Malaspina 2010 circumnavigation expedition. Limnol. Oceanogr. 24, 11–14 (2015)
  7. Reticulate representation of evolutionary and functional relationships between phage genomes. Mol. Biol. Evol. 25, 762–777 (2008)
  8. Viral dark matter and virus-host interactions resolved from publicly available microbial genomes. eLife 4, 1–20 (2015)
  9. Expanding the marine virosphere using metagenomics. PLoS Genet. 9, e1003987 (2013)
  10. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions. Front. Microbiol. 6, 265 (2015)
  11. Ecology and evolution of viruses infecting uncultivated SUP05 bacteria as revealed by single-cell- and meta-genomics. eLife 3, e03125 (2014)
  12. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5, 4498 (2014)
  13. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat. Biotechnol. 31, 533–538 (2013)
  14. Genomic analysis of oceanic cyanobacterial myoviruses compared with T4-like myoviruses from diverse hosts and environments. Environ. Microbiol. 12, 3035–3056 (2010)
  15. Abundant SAR11 viruses in the ocean. Nature 494, 357–360 (2013)
  16. Genomes of marine cyanopodoviruses reveal multiple origins of diversity. Environ. Microbiol. 15, 1356–1376 (2013)
  17. Virus population dynamics and acquired virus resistance in natural microbial communities. Science 320, 1047–1050 (2008)
  18. Ocean plankton. Structure and function of the global ocean microbiome. Science 348, 1261359 (2015)
  19. Multi-scale structure and geographic drivers of cross-infection within marine bacteria and phages. ISME J. 7, 520–532 (2013)
  20. Depth-stratified functional and taxonomic niche specialization in the ‘core’ and ‘flexible’ Pacific Ocean Virome. ISME J. 9, 472–484 (2015)
  21. Sulfur oxidation genes in diverse deep-sea viruses. Science 344, 757–760 (2014)
  22. Prokaryotic sulfur oxidation. Curr. Opin. Microbiol. 8, 253–259 (2005)
  23. A protein trisulfide couples dissimilatory sulfate reduction to energy conservation. Science 350, 1541–1545 (2015)
  24. The “bacterial heterodisulfide” DsrC is a key protein in dissimilatory sulfur metabolism. Biochim. Biophys. Acta 1837, 1148–1164 (2014)
  25. Sulfite oxidation in the purple sulfur bacterium Allochromatium vinosum: identification of SoeABC as a major player and relevance of SoxYZ in the process. Microbiology 159, 2626–2638 (2013)
  26. P(II) signal transduction proteins: nitrogen regulation and beyond. FEMS Microbiol. Rev. 37, 251–283 (2013)
  27. Physiology and diversity of ammonia-oxidizing archaea. Annu. Rev. Microbiol. 66, 83–101 (2012)
  28. Reverse dissimilatory sulfite reductase as phylogenetic marker for a subgroup of sulfur-oxidizing prokaryotes. Environ. Microbiol. 11, 289–299 (2009)
  29. The Thaumarchaeota: an emerging view of their phylogeny and ecophysiology. Curr. Opin. Microbiol. 14, 300–306 (2011)
  30. A multitrophic model to quantify the effects of marine viruses on microbial food webs and ecosystem processes. ISME J. 9, 1352–1364 (2015)
  31. P(II) signal transduction proteins, pivotal players in microbial nitrogen control. Microbiol. Mol. Biol. Rev. 65, 80–105 (2001)
  32. Open science resources for the discovery and analysis of Tara Oceans data. Sci. Data 2, 150023 (2015)
  33. A simple and efficient method for concentration of ocean viruses by chemical flocculation. Environ. Microbiol. Rep. 3, 195–202 (2011)
  34. Evaluation of methods to concentrate and purify ocean virus communities through comparative, replicated metagenomics. Environ. Microbiol. 15, 1428–1440 (2013)
  35. In Practical Guidelines for the Analysis of Seawater 143–176 (CRC Press, 2009)
  36. Tara Oceans Consortium & Tara Oceans Expedition. Registry of all samples from the Tara Oceans Expedition (2009–2013). (2015)
  37. Tara Oceans Consortium & Tara Oceans Expedition. Environmental context of all samples from the Tara Oceans Expedition (2009–2013). (2015)
  38. Tara Oceans Consortium & Tara Oceans Expedition. Biodiversity context of all samples from the Tara Oceans Expedition (2009–2013). (2015)
  39. Global diversity and biogeography of deep-sea pelagic prokaryotes. ISME J. 10, 596–608 (2016). doi:10.1038/ismej.2015.137
  40. MOCAT: a metagenomics assembly and gene prediction toolkit. PLoS One 7, e47656 (2012)
  41. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012)
  42. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 1, 18 (2012)
  43. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006)
  44. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3, e1165 (2015)
  45. Use of simulated data sets to evaluate the fidelity of metagenomic processing methods. Nat. Methods 4, 495–500 (2007)
  46. Assessment of viral community functional potential from viral metagenomes may be hampered by contamination with cellular sequences. Open Biol. 3, 130160 (2013)
  47. VirSorter: mining viral signal from microbial genomic data. PeerJ 3, e985 (2015)
  48. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity. eLife 4, e06416 (2015)
  49. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002)
  50. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014)
  51. Accelerated Profile HMM Searches. PLOS Comput. Biol. 7, e1002195 (2011)
  52. Illuminating structural proteins in viral “dark matter” with metaproteomics. Proc. Natl Acad. Sci. USA 113, 2436–2441 (2016)
  53. Twelve previously unknown phage genera are ubiquitous in global oceans. Proc. Natl Acad. Sci. USA 110, 12798–12803 (2013)
  54. Complete genome sequences of two Persicivirga bacteriophages, P12024S and P12024L. J. Virol. 86, 8907–8908 (2012)
  55. Genome of a SAR116 bacteriophage shows the prevalence of this phage type in the oceans. Proc. Natl Acad. Sci. USA 110, 12343–12348 (2013)
  56. Isolation, growth and genome of the Rhodothermus RM378 thermophilic bacteriophage. Extremophiles 18, 261–270 (2014)
  57. Characterization of a thermophilic bacteriophage of Geobacillus kaustophilus. Arch. Virol. 159, 2771–2775 (2014)
  58. Genomic and phenotypic characterization of Rhizobium gallicum phage vB_RglS_P106B. Microbiology 161, 611–620 (2015)
  59. The Phage Proteomic Tree: a genome-based taxonomy for phage. J. Bacteriol. 184, 4529–4535 (2002)
  60. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007)
  61. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 39, W475–8 (2011)
  62. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012)
  63. Computational approaches to predict bacteriophage-host relationships. FEMS Microbiol. Rev. 40, 258–272 (2016)
  64. CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8, 209 (2007)
  65. Diverse CRISPRs evolving in human microbiomes. PLoS Genet. 8, e1002441 (2012)
  66. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000)
  67. Genome signature-based dissection of human gut metagenomes to extract subliminal viral sequences. Nat. Commun. 4, 2420 (2013)
  68. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011)
  69. The vegan package version 2.4-0 (2016)
  70. Comparative metagenomics of microbial traits within oceanic viral communities. ISME J. 5, 1178–1190 (2011)
  71. Phage auxiliary metabolic genes and the redirection of cyanobacterial host carbon metabolism. Proc. Natl Acad. Sci. USA 108, E757–E764 (2011)
  72. Efficient phage-mediated pigment biosynthesis in oceanic cyanobacteria. Curr. Biol. 18, 442–448 (2008)
  73. Photosynthesis genes in marine viruses yield proteins during host infection. Nature 438, 86–89 (2005)
  74. Genome-wide expression dynamics of a marine virus and host reveal features of co-evolution. Nature 449, 83–86 (2007)
  75. Prevalence and evolution of core photosystem II genes in marine cyanobacterial viruses and their hosts. PLoS Biol. 4, e234 (2006)
  76. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004)
  77. Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009)
  78. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010)
  79. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755 (2001)
  80. phangorn: phylogenetic analysis in R. Bioinformatics 27, 592–593 (2011)
  81. Easyfig: a genome comparison visualizer. Bioinformatics 27, 1009–1010 (2011)
  82. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protocols 5, 725–738 (2010)
  83. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 35, W407–10 (2007)
  84. Genomic variation landscape of the human gut microbiome. Nature 493, 45–50 (2013)
  85. Comparison of library preparation methods reveals their impact on interpretation of metatranscriptomic data. BMC Genomics 15, 912 (2014)

Acknowledgements

We thank J. Weitz for advice on statistics, C. Pelikan for help with the DsrAB phylogenetic tree, C. Dahl for discussion regarding DsrC function, and members of the Sullivan and the V. Rich laboratories for suggestions and comments on this manuscript. We acknowledge support from UA high-performance computing and the Ohio Supercomputer Center. Sponsors and support for Tara Oceans and Malaspina expeditions are listed in the Supplementary Information. This viral research was funded by a National Science Foundation grant (1536989) and Gordon and Betty Moore Foundation grants (3790, 2631) to M.B.S., and the French Ministry of Research and Government through the ‘Investissements d’Avenir’ program OCEANOMICS (ANR-11-BTBR-0008) and France Genomique (ANR-10-INBS-09-08). Virus researchers were partially supported by the Water, Environmental and Energy Solutions Initiative and the Ecosystem Genomics Institute (S.R.), the Netherlands Organization for Scientific Research Vidi grant 864.14.004 and CAPES/BRASIL (B.E.D.), and the Austrian Science Fund (project P25111-B22, A.L.). Sequencing was provided by Genoscope (Tara Oceans) and DOE JGI (Malaspina). All authors approved the final manuscript. This article is contribution number 43 of the Tara Oceans expedition.

Author information

Author notes

    • Shinichi Sunagawa

    Present address: Department of Biology, Institute of Microbiology, ETH Zurich, 8093 Zurich, Switzerland.

Affiliations

  1. Department of Microbiology, The Ohio State University, Columbus, Ohio 43210, USA
    • Simon Roux, Jennifer R. Brum, Natalie Solonenko & Matthew B. Sullivan
  2. Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, The Netherlands
    • Bas E. Dutilh
  3. Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre, 6525 GA Nijmegen, The Netherlands
    • Bas E. Dutilh
  4. Department of Marine Biology, Federal University of Rio de Janeiro, Rio de Janeiro, CEP 21941-902, Brazil
    • Bas E. Dutilh
  5. Structural and Computational Biology, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
    • Shinichi Sunagawa, Stefanie Kandels-Lewis & Peer Bork
  6. Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
    • Melissa B. Duhaime
  7. Division of Microbial Ecology, Department of Microbiology and Ecosystem Science, Research Network Chemistry Meets Microbiology, University of Vienna, A-1090 Vienna, Austria
    • Alexander Loy
  8. Austrian Polar Research Institute, A-1090 Vienna, Austria
    • Alexander Loy
  9. Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA
    • Bonnie T. Poulos
  10. Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM), CSIC Barcelona E0800, Spain
    • Elena Lara, Josep M. Gasol, Dolors Vaqué & Silvia G. Acinas
  11. Institute of Marine Sciences (CNR-ISMAR), National Research Council, 30122 Venezia, Italy
    • Elena Lara
  12. CEA - Institut de Génomique, GENOSCOPE, 91057 Evry, France
    • Julie Poulain, Corinne Cruaud, Adriana Alberti & Patrick Wincker
  13. PANGAEA, Data Publisher for Earth and Environmental Science, University of Bremen, 28359 Bremen, Germany
    • Stéphane Pesant
  14. MARUM, Bremen University, 28359 Bremen, Germany
    • Stéphane Pesant
  15. Directors’ Research, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
    • Stefanie Kandels-Lewis
  16. CNRS, UMR 7144, EPEP, Station Biologique de Roscoff, 29680 Roscoff, France
    • Céline Dimier
  17. Sorbonne Universités, UPMC Université Paris 06, UMR 7144, Station Biologique de Roscoff, 29680 Roscoff, France
    • Céline Dimier
  18. Institut de Biologie de l’École Normale Supérieure, École Normale Supérieure, Paris Sciences et Lettres Research University, CNRS UMR 8197, INSERM U1024, F-75005 Paris, France
    • Céline Dimier
  19. CNRS, UMR 7093, Laboratoire d'océanographie de Villefranche, Observatoire Océanologique, 06230 Villefranche-sur-mer, France
    • Marc Picheral & Sarah Searson
  20. Sorbonne Universités, UPMC Université Paris 06, UMR 7093, Observatoire Océanologique, 06230 Villefranche-sur-mer, France
    • Marc Picheral & Sarah Searson
  21. Mediterranean Institute of Advanced Studies, CSIC-UiB, 21-07190 Esporles, Mallorca, Spain
    • Carlos M. Duarte
  22. King Abdullah University of Science and Technology, Red Sea Research Center, Thuwal 23955-6900, Saudi Arabia
    • Carlos M. Duarte
  23. Max-Delbrück-Centre for Molecular Medicine, 13092 Berlin, Germany
    • Peer Bork
  24. CNRS, UMR 8030, 91057 Evry, France
    • Patrick Wincker
  25. Université d’Evry, UMR 8030, 91057 Evry, France
    • Patrick Wincker
  26. Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, Columbus, Ohio 43210, USA
    • Matthew B. Sullivan

Consortia

  1. Tara Oceans Coordinators

    A list of participants and their affiliations appears in the Supplementary Information.

Authors

  1. Simon Roux
  2. Jennifer R. Brum
  3. Bas E. Dutilh
  4. Shinichi Sunagawa
  5. Melissa B. Duhaime
  6. Alexander Loy
  7. Bonnie T. Poulos
  8. Natalie Solonenko
  9. Elena Lara
  10. Julie Poulain
  11. Stéphane Pesant
  12. Stefanie Kandels-Lewis
  13. Céline Dimier
  14. Marc Picheral
  15. Sarah Searson
  16. Corinne Cruaud
  17. Adriana Alberti
  18. Carlos M. Duarte
  19. Josep M. Gasol
  20. Dolors Vaqué
  21. Peer Bork
  22. Silvia G. Acinas
  23. Patrick Wincker
  24. Matthew B. Sullivan

Contributions

S.R. and M.B.S. designed the study. C.D., M.P. and S.Se. contributed extensively to sample collection. S.K.-L. managed the logistics of the Tara Oceans project. B.T.P., N.S. and E.L. performed the viral-specific processing of the samples. J.P., C.C., A.A. and P.W. led the sequencing of viral samples. S.R., S.Su. and B.E.D. led the assembly of raw data. S.R., S.Su., M.B.D. and M.B.S. analysed the genomic diversity data. S.R., A.L., J.R.B. and M.B.S. analysed the AMG data. S.R., J.R.B., B.E.D., S.Su., M.B.D., A.L., S.P., P.B., S.G.A., C.D., J.M.G., D.V. and M.B.S. provided constructive comments, and revised and edited the manuscript. Tara Oceans Coordinators provided constructive criticism throughout the study. All authors discussed the results and commented on the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Matthew B. Sullivan.

All data are fully and freely available from the date of publication, with no restrictions, at EBI, PANGAEA, and iVirus. All of the samples, analyses, publications, and ownership of data are free from legal entanglement or restriction of any sort by the nations in whose waters the Tara Oceans expedition sampled.

Supplementary information

PDF files

  1. Supplementary Information

    This file includes Supplementary Text and Data, Supplementary Figures 1–8, legends for Supplementary Tables 1–6 (see separate Excel files) and additional references. The text includes additional information and literature context that help document details about the generation of the GOV dataset (assembly, identification of viral contigs, read mapping to viral contigs), viral cluster definition and affiliation (including comparison to other genome classification methods), host prediction (methods evaluation and results), discussions about AMG affiliation and host prediction for associated contigs, and a list of the sponsors and support for the Tara Oceans and Malaspina expeditions (including the list and affiliations of the Tara Oceans coordinators).

Excel files

  1. Supplementary Table 1

    This file contains the list of viromes in the GOV dataset. Station number, depth, Longhurst province, biome, and sequencing effort are indicated for each virome sample.

  2. Supplementary Table 2

    This file contains the GOV viral population summary. The number of contigs and the length of each population are presented, alongside their normalized coverage across the 104 GOV viromes.

  3. Supplementary Table 3

    This file contains a summary of GOV Viral Clusters (VCs). For each VC, the composition (number and origin of VC members), affiliation, and coverage across GOV viromes are indicated.

  4. Supplementary Table 4

    This file contains the benchmarks of in silico host prediction methods. It reports the results of the evaluations of host prediction methods performed using the NCBI RefSeq Virus database and the VirSorter Curated Dataset.

  5. Supplementary Table 5

    This file contains the host predictions for GOV viral contigs that are associated with a population. Predictions are reported for each population with the type of signal (blastn, CRISPR, tetranucleotide composition), the host sequence used, and the strength of the prediction.

  6. Supplementary Table 6

    This file contains the PFAM domains detected in GOV viral contigs (≥1.5 kb). For each PFAM domain, the number of genes detected in the GOV dataset is indicated, alongside the functional category of the domain.
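
Supplementary Table 5 lists tetranucleotide composition as one of the signals used for host prediction. As a rough illustration of that kind of signal only, the Python sketch below counts 4-mers in a viral sequence and in candidate host sequences, then picks the host with the most similar profile. It is not the authors' pipeline: the function names, the mean-absolute-difference distance and the toy sequences are all assumptions made for this example, and real analyses typically also account for reverse complements and apply a distance cutoff.

```python
from itertools import product

# All 256 possible DNA tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_profile(seq):
    """Return the relative frequency of each 4-mer in seq (hypothetical helper)."""
    counts = dict.fromkeys(KMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def profile_distance(a, b):
    """Mean absolute difference between two profiles (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def closest_host(viral_seq, hosts):
    """Return the name of the host whose profile is closest to the viral one.

    hosts maps a host name to its genome sequence (toy data in this sketch).
    """
    viral = tetranucleotide_profile(viral_seq)
    return min(
        hosts,
        key=lambda name: profile_distance(viral, tetranucleotide_profile(hosts[name])),
    )
```

In practice a prediction would be reported only when the best distance falls below a threshold calibrated on viruses with known hosts, which is the role of the benchmarks described for Supplementary Table 4.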
