A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria


The most abundant viruses on Earth are thought to be double-stranded DNA (dsDNA) viruses that infect bacteria1. However, tailed bacterial dsDNA viruses (Caudovirales), which dominate sequence and culture collections, are not representative of the environmental diversity of viruses2,3. In fact, non-tailed viruses often dominate ocean samples numerically4, raising the fundamental question of the nature of these viruses. Here we characterize a group of marine dsDNA non-tailed viruses with short 10-kb genomes isolated during a study that quantified the diversity of viruses infecting Vibrionaceae bacteria. These viruses, which we propose to name the Autolykiviridae, represent a novel family within the ancient lineage of double jelly roll (DJR) capsid viruses. Ecologically, members of the Autolykiviridae have a broad host range, killing on average 34 hosts in four Vibrio species, in contrast to tailed viruses which kill on average only two hosts in one species. Biochemical and physical characterization of autolykiviruses reveals multiple virion features that cause systematic loss of DJR viruses in sequencing and culture-based studies, and we describe simple procedural adjustments to recover them. We identify DJR viruses in the genomes of diverse major bacterial and archaeal phyla, and in marine water column and sediment metagenomes, and find that their diversity greatly exceeds the diversity that is currently captured by the three recognized families of such viruses. Overall, these data suggest that viruses of the non-tailed dsDNA DJR lineage are important but often overlooked predators of bacteria and archaea that impose fundamentally different predation and gene transfer regimes on microbial systems than do tailed viruses, which form the basis of all environmental models of bacteria–virus interactions.

Figure 1: Autolykiviridae is a new family of non-tailed dsDNA viruses in the DJR capsid lineage.
Figure 2: Autolykiviruses dominate the lytic viral infection network of marine Vibrio.
Figure 3: Recovery of autolykiviruses is subject to multiple methodological biases.
Figure 4: DJR capsid viruses are far more diverse than the three currently recognized families, and include hosts in diverse bacterial and archaeal phyla.

Accession codes

Primary accessions


NCBI Reference Sequence

Sequence Read Archive


  1. Wommack, K. E. & Colwell, R. R. Virioplankton: viruses in aquatic ecosystems. Microbiol. Mol. Biol. Rev. 64, 69–114 (2000)

  2. Krishnamurthy, S. R. & Wang, D. Origins and challenges of viral dark matter. Virus Res. 239, 136–142 (2017)

  3. Krupovic, M., Prangishvili, D., Hendrix, R. W. & Bamford, D. H. Genomics of bacterial and archaeal viruses: dynamics within the prokaryotic virosphere. Microbiol. Mol. Biol. Rev. 75, 610–635 (2011)

  4. Brum, J. R., Schenck, R. O. & Sullivan, M. B. Global morphological analysis of marine viruses shows minimal regional variation and dominance of non-tailed viruses. ISME J. 7, 1738–1751 (2013)

  5. Benson, S. D., Bamford, J. K. H., Bamford, D. H. & Burnett, R. M. Does common architecture reveal a viral lineage spanning all three domains of life? Mol. Cell 16, 673–685 (2004)

  6. Krupovič, M. & Bamford, D. H. Archaeal proviruses TKV4 and MVV extend the PRD1-adenovirus lineage to the phylum Euryarchaeota. Virology 375, 292–300 (2008)

  7. Pietilä, M. K. et al. Structure of the archaeal head-tailed virus HSTV-1 completes the HK97 fold story. Proc. Natl Acad. Sci. USA 110, 10604–10609 (2013)

  8. Koonin, E. V., Dolja, V. V. & Krupovič, M. Origins and evolution of viruses of eukaryotes: the ultimate modularity. Virology 479–480, 2–25 (2015)

  9. Wikoff, W. R. et al. Topologically linked protein rings in the bacteriophage HK97 capsid. Science 289, 2129–2133 (2000)

  10. Krupovič, M. & Bamford, D. H. Virus evolution: how far does the double β-barrel viral lineage extend? Nat. Rev. Microbiol. 6, 941–948 (2008)

  11. Krupovic, M. & Koonin, E. V. Multiple origins of viral capsid proteins from cellular ancestors. Proc. Natl Acad. Sci. USA 114, E2401–E2410 (2017)

  12. Iranzo, J., Krupovic, M. & Koonin, E. V. The double-stranded DNA virosphere as a modular hierarchical network of gene sharing. MBio 7, e00978–16 (2016)

  13. International Committee on Taxonomy of Viruses. ICTV Master Species List v.1.3 (2016)

  14. Brister, J. R., Ako-Adjei, D., Bao, Y. & Blinkova, O. NCBI viral genomes resource. Nucleic Acids Res. 43, D571–D577 (2015)

  15. Espejo, R. T. & Canelo, E. S. Properties of bacteriophage PM2: a lipid-containing bacterial virus. Virology 34, 738–747 (1968)

  16. Wommack, K. E., Hill, R. T., Kessel, M., Russek-Cohen, E. & Colwell, R. R. Distribution of viruses in the Chesapeake Bay. Appl. Environ. Microbiol. 58, 2965–2970 (1992)

  17. Andrews-Pfannkoch, C., Fadrosh, D. W., Thorpe, J. & Williamson, S. J. Hydroxyapatite-mediated separation of double-stranded DNA, single-stranded DNA, and RNA genomes from natural viral assemblages. Appl. Environ. Microbiol. 76, 5039–5045 (2010)

  18. Steward, G. F. et al. Are we missing half of the viruses in the ocean? ISME J. 7, 672–679 (2013)

  19. Labonté, J. M. & Suttle, C. A. Previously unknown and highly divergent ssDNA viruses populate the oceans. ISME J. 7, 2169–2177 (2013)

  20. Roux, S. et al. Towards quantitative viromics for both double-stranded and single-stranded DNA viruses. PeerJ 4, e2777 (2016)

  21. Peralta, B. et al. Mechanism of membranous tunnelling nanotube formation in viral genome delivery. PLoS Biol. 11, e1001667 (2013)

  22. Sun, L. et al. Icosahedral bacteriophage ΦX174 forms a tail for DNA transport during infection. Nature 505, 432–435 (2014)

  23. Saren, A.-M. et al. A snapshot of viral evolution from genome analysis of the Tectiviridae family. J. Mol. Biol. 350, 427–440 (2005)

  24. Thurber, R. V., Haynes, M., Breitbart, M., Wegley, L. & Rohwer, F. Laboratory procedures to generate viral metagenomes. Nat. Protoc. 4, 470–483 (2009)

  25. Castro-Mejía, J. L. et al. Optimizing protocols for extraction of bacteriophages prior to metagenomic analyses of phage communities in the human gut. Microbiome 3, 64 (2015)

  26. D’Herelle, F. Studies upon Asiatic cholera. Yale J. Biol. Med. 1, 195–219 (1929)

  27. Krupovič, M. & Bamford, D. H. Putative prophages related to lytic tailless marine dsDNA phage PM2 are widespread in the genomes of aquatic bacteria. BMC Genomics 8, 236 (2007)

  28. Xue, H. et al. Eco-evolutionary dynamics of episomes among ecologically cohesive bacterial populations. MBio 6, e00552–e15 (2015)

  29. Brum, J. R. & Sullivan, M. B. Rising to the challenge: accelerated pace of discovery transforms marine virology. Nat. Rev. Microbiol. 13, 147–159 (2015)

  30. Smillie, C. S. et al. Ecology drives a global network of gene exchange connecting the human microbiome. Nature 480, 241–244 (2011)

  31. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014)

  32. Hunt, D. E. et al. Resource partitioning and sympatric differentiation among closely related bacterioplankton. Science 320, 1081–1085 (2008)

  33. Baym, M. et al. Inexpensive multiplexed library preparation for megabase-sized genomes. PLoS ONE 10, e0128036 (2015)

  34. Eddy, S. R. Accelerated profile HMM searches. PLOS Comput. Biol. 7, e1002195 (2011)

  35. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013)

  36. Hehemann, J.-H. et al. Adaptive radiation by waves of gene transfer leads to fine-scale resource partitioning in marine microbes. Nat. Commun. 7, 12860 (2016)

  37. Finn, R. D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285 (2016)

  38. John, S. G. et al. A simple and efficient method for concentration of ocean viruses by chemical flocculation. Environ. Microbiol. Rep. 3, 195–202 (2011)

  39. Rasband, W. S. ImageJ (U.S. National Institutes of Health, 1997)

  40. Silbert, J. A., Salditt, M. & Franklin, R. M. Structure and synthesis of a lipid-containing bacteriophage. 3. Purification of bacteriophage PM2 and some structural studies on the virion. Virology 39, 666–681 (1969)

  41. Lawrence, J. E. & Steward, G. F. in Manual of Aquatic Viral Ecology (eds Wilhelm, S. W., Weinbauer, M. G. & Suttle, C. A.) 166–181 (ASLO, 2010)

  42. Kivelä, H. M., Männistö, R. H., Kalkkinen, N. & Bamford, D. H. Purification and protein composition of PM2, the first lipid-containing bacterial virus to be isolated. Virology 262, 364–374 (1999)

  43. Biller, S. J. et al. Bacterial vesicles in marine ecosystems. Science 343, 183–186 (2014)

  44. Hurwitz, B. L., Deng, L., Poulos, B. T. & Sullivan, M. B. Evaluation of methods to concentrate and purify ocean virus communities through comparative, replicated metagenomics. Environ. Microbiol. 15, 1428–1440 (2013)

  45. Henn, M. R. et al. Analysis of high-throughput sequencing and annotation strategies for phage genomes. PLoS ONE 5, e9083 (2010)

  46. Bates, D. M. lme4: Mixed-effects modeling with R. (2010)

  47. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting Linear Mixed-Effects Models Using lme4. J. Stat. Softw. 67, 1–48 (2015)

  48. R Core Team. R: A Language and Environment for Statistical Computing. (R Foundation for Statistical Computing, Vienna, Austria, 2016)

  49. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010)

  50. Sunagawa, S. et al. Structure and function of the global ocean microbiome. Science 348, 1261359 (2015)

  51. Guidi, L. et al. Plankton networks driving carbon export in the oligotrophic ocean. Nature 532, 465–470 (2016)

  52. Enright, A. J., Van Dongen, S. & Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002)

  53. Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. E. The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10, 845–858 (2015)

  54. Hildebrand, A., Remmert, M., Biegert, A. & Söding, J. Fast and accurate automatic structure prediction with HHpred. Proteins 77, 128–132 (2009)

  55. Alva, V., Nam, S.-Z., Söding, J. & Lupas, A. N. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res. 44, W410–W415 (2016)

  56. Marchler-Bauer, A. et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 45, D200–D203 (2017)

  57. Huerta-Cepas, J. et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol. Biol. Evol. 34, 2115–2122 (2017)

  58. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014)

  59. Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8, 785–786 (2011)

  60. Guy, L., Kultima, J. R. & Andersson, S. G. E. genoPlotR: comparative gene and genome visualization in R. Bioinformatics 26, 2334–2335 (2010)

  61. Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011)

  62. Li, W. et al. The EMBL-EBI bioinformatics web and programmatic tools framework. Nucleic Acids Res. 43, W580–W584 (2015)

  63. McWilliam, H. et al. Analysis tool web services from the EMBL-EBI. Nucleic Acids Res. 41, W597–W600 (2013)

  64. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010)

  65. Lefort, V., Longueville, J.-E. & Gascuel, O. SMS: smart model selection in PhyML. Mol. Biol. Evol. 34, 2422–2424 (2017)

  66. Grazziotin, A. L., Koonin, E. V. & Kristensen, D. M. Prokaryotic virus orthologous groups (pVOGs): a resource for comparative genomics and protein family annotation. Nucleic Acids Res. 45, D491–D498 (2017)

  67. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010)

  68. Huerta-Cepas, J., Serra, F. & Bork, P. ETE 3: Reconstruction, analysis, and visualization of phylogenomic data. Mol. Biol. Evol. 33, 1635–1638 (2016)

  69. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004)

  70. Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005)

  71. Wallace, I. M., O’Sullivan, O., Higgins, D. G. & Notredame, C. M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 34, 1692–1699 (2006)

  72. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009)

  73. Huerta-Cepas, J. et al. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. 44 (D1), D286–D293 (2016)

  74. Letunic, I. & Bork, P. Interactive tree of life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007)

  75. Csardi, G. & Nepusz, T. The igraph software package for complex network research. InterJ. Complex Systems, 1695 (2006)

  76. Vega Yon, G., Fabrega Lacoa, J. & Kunst, J. B. rgexf: Build, Import, and Export GEXF Graph Files. (2015)

  77. Finn, R. D. et al. HMMER web server: 2015 update. Nucleic Acids Res. 43, W30–W38 (2015)

  78. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169 (2017)

  79. Benson, D. A. et al. GenBank. Nucleic Acids Res. 41, D36–D42 (2013)

  80. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990)

  81. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000)

  82. Marchler-Bauer, A. et al. CDD: a conserved domain database for the functional annotation of proteins. Nucleic Acids Res. 39, D225–D229 (2011)

  83. Adriaenssens, E. & Brister, J. R. How to name and classify your phage: an informal guide. Viruses 9, 70 (2017)

  84. Clerissi, C. et al. Unveiling of the diversity of prasinoviruses (Phycodnaviridae) in marine samples by using high-throughput sequencing analyses of PCR-amplified DNA polymerase and major capsid protein genes. Appl. Environ. Microbiol. 80, 3150–3160 (2014)

  85. Brum, J. R. et al. Patterns and ecological drivers of ocean viral communities. Science 348, 1261498 (2015)

  86. Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017)

  87. Anantharaman, K. et al. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat. Commun. 7, 13219 (2016)

  88. Anantharaman, K. et al. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum. PeerJ 4, e1607 (2016)

  89. Mizuno, C. M., Rodriguez-Valera, F., Kimes, N. E. & Ghai, R. Expanding the marine virosphere using metagenomics. PLoS Genet. 9, e1003987 (2013)

  90. Ghai, R. et al. Metagenome of the Mediterranean deep chlorophyll maximum studied by direct and fosmid library 454 pyrosequencing. ISME J. 4, 1154–1166 (2010)

  91. Clark, K., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Sayers, E. W. GenBank. Nucleic Acids Res. 44, D67–D72 (2016)

We thank J. King, P. Weigele, J. Daily and J. Chodera for comments and suggestions; T. Soni and members of the Polz laboratory for assistance with sampling; S. Labrie for guidance in viral genome extractions and sequencing library preparation, and C. Haase-Pettingell for assistance with density gradients; N. Watson for electron microscopy; and R. Ratzlaff for discussions and the suggestion of electron microscopy of virus plaques in agar overlay. This work was supported by grants from the National Science Foundation OCE 1435993 to M.F.P. and L.K., the NSF GRFP to F.A.H. and the WHOI Ocean Ventures Fund to K.M.K.

Author information

Authors and Affiliations



K.M.K., F.A.H., L.K. and M.F.P. designed the study and planned experiments and analyses. K.M.K., L.K. and M.F.P. wrote the paper with contributions from all authors. K.M.K. conducted field sampling, isolations and experimental characterizations of lytic viruses. J.Y. conducted the statistical analyses of the viral decay experiment and wrote the scripts to visualize the infection matrix as a phylogeny-anchored network, which was based on the host ribosomal protein tree generated by P.A. W.K.C. and L.K. performed the quantification of significance of host sharing. F.A.H. performed isolation and characterization of active Vibrio DJR prophages. Bacterial genome sequencing libraries were prepared by M.B.C., assembled by P.A., and curated and annotated by P.A. and J.E. The viral genome sequencing libraries were prepared by K.M.K. and R.S.S., assembled by J.M.B. and K.M.K., and annotated and curated by J.M.B., K.M.K., J.E., W.K.C. and L.K. The viral metagenome sequencing libraries were prepared by K.M.K., and assembled and curated by P.A. and L.K. The bioinformatic analyses of microbial genomes and metagenomes for DJR capsid elements were performed by L.K. and K.M.K., and the visualization of the DJR network was performed by D.V. M.B.C. provided field and laboratory technical support throughout. Although specific contributions are highlighted for each author, all authors contributed in additional ways through contributions to figures, analyses, discussion of results and comments on the manuscript.

Corresponding authors

Correspondence to Libusha Kelly or Martin F. Polz.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks J. Fuhrman, E. V. Koonin and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Figure 1 Members of the Autolykiviridae are non-tailed viruses that may form tail tubes on contact with cells.

Thin-section electron microscopy of an agar overlay containing plaques of representative Autolykiviridae virus 1.008.O (see Methods for experimental details). a, Virus particles in contact with cell membranes are observed to occasionally possess tail-tube-like structures, whereas those not in contact with cells do not. b, Lower magnification of same field of view as a shows that tail-tube-free virions are more common than those with tail tubes. c, Lower magnification view of virion in Fig. 1b also shows that the presence of the tail tube is associated with cell contact and is not observed in nearby virions.

Extended Data Figure 2 Whole-genome alignments show that the family Autolykiviridae consists of five major sequence diversity clusters.

a, Maximum-likelihood phylogeny of whole-genome nucleotide alignments of 21 autolykiviruses. Alignments were made with Clustal Omega and the phylogenetic tree was generated with PhyML-SMS with aLRT branch supports. Scale bar, substitutions per base. b, Percentage of whole-genome nucleotide identities among 21 autolykivirus genomes on the basis of the Clustal Omega alignment. Assumptions of 50% and 95% identity for genus and species classifications83, respectively, suggest that these viruses represent two genera (groups A, B, C, D and group E) and five species. Two viruses with identical genomes were isolated at time points 39 days apart (1.048.O and 1.102.O); viruses with the same number and different letter suffixes represent lineages derived from a single plaque that gave rise to variable morphotypes during serial purification.
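The identity thresholds in panel b can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences of equal length, counts identity only over columns without gaps, and links genomes by single linkage; these choices, the function names and the toy sequences are illustrative assumptions and are not taken from the study.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical positions over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns (an assumption of this sketch)
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def group_by_identity(aligned, threshold):
    """Single-linkage groups: two genomes are linked if identity >= threshold (%)."""
    parent = {name: name for name in aligned}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for p, q in combinations(aligned, 2):
        if percent_identity(aligned[p], aligned[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy aligned sequences (hypothetical, not autolykivirus data).
toy = {
    "v1": "ACGTACGTACGTACGTACGT",
    "v2": "ACGTACGTACGTACGTACGT",  # identical to v1
    "v3": "ACGTACGTACTTACGAACGT",  # two mismatches relative to v1
    "v4": "TTTTGGGGCCCCAAAATTTT",  # distant from the others
}
genera = group_by_identity(toy, 50.0)   # genus-level threshold from the legend
species = group_by_identity(toy, 95.0)  # species-level threshold from the legend
print(len(genera), "genus-level groups;", len(species), "species-level groups")
```

Single linkage is only one way to turn a pairwise identity matrix into groups; other linkage rules could give different group counts near the thresholds.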

Extended Data Figure 3 Genomes of members of the Autolykiviridae are syntenic despite extensive diversity at the nucleotide level.

Virus genomes are grouped by nucleotide similarity (as identified in Extended Data Fig. 2). Homologous proteins were identified by performing an all-by-all BLASTp, requiring a minimum bitscore of 50, and clustering all pairs unweighted, using MCL with an inflation parameter set to 1.4 (Methods); cluster membership is identified by the label over the block arrows in the genome diagram. Protein clustering reveals that in addition to the six proteins identifiable by sequence similarity as core to all characterized autolykiviruses, additional protein clusters are shared among various subsets of the identified viral genome groups. For example, in the region of the genome to the right of the major capsid protein, 17 out of 18 viruses (genome groups A, B, C and D) share a set of seven protein clusters of unknown function (c11, c12, c13, c14, c15, c16 and c17); among these viruses, two additional proteins are shared only within subsets of the genomes (c26 in genome groups A and B; c19 in genome groups C and D).
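The filtering step described above (minimum bitscore of 50, unweighted pairs) can be sketched in Python as follows. The sketch assumes BLAST tabular output in the default 12-column `-outfmt 6` layout; the function names and the toy hits are hypothetical, and the MCL command in the comment is one plausible way to run the clustering, not a record of the authors' exact invocation.

```python
import csv
import io


def blast_pairs_for_mcl(blast_tsv, min_bitscore=50.0):
    """Return sorted, unique, unweighted protein pairs with bitscore >= min_bitscore.

    Expects BLAST tabular output (-outfmt 6): column 1 is the query,
    column 2 the subject and column 12 the bitscore. Self-hits are dropped
    and reciprocal hits collapse to a single pair.
    """
    pairs = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query != subject and bitscore >= min_bitscore:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


def write_abc(pairs, handle):
    """Write pairs in MCL 'abc' format with a constant weight of 1 (unweighted)."""
    for a, b in pairs:
        handle.write(f"{a}\t{b}\t1\n")


# Toy BLASTp hits (hypothetical protein names and scores).
example = "\n".join([
    "p1\tp2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t120.0",
    "p2\tp1\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t118.0",
    "p1\tp3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t22.0",
    "p1\tp1\t100.0\t100\t0\t0\t1\t100\t1\t100\t1e-60\t200.0",
])
pairs = blast_pairs_for_mcl(io.StringIO(example))
out = io.StringIO()
write_abc(pairs, out)
print(out.getvalue(), end="")

# The resulting file could then be clustered with, for example:
#   mcl pairs.abc --abc -I 1.4 -o clusters.txt
```

Writing a constant weight of 1 is how this sketch represents "unweighted" clustering; passing bitscores as edge weights instead would be a different analysis.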

Extended Data Figure 4 Packaging and replication protein-sequence phylogenies of autolykiviruses are incongruent with respect to other known families of non-tailed dsDNA viruses.

Autolykiviruses are most similar to the corticovirus PM2 in their major capsid protein, poorly resolved in their packaging ATPase, and most similar to the tectiviruses in their protein-primed DNA polymerase. Pairwise identities and phylogenies of the protein sequences of the DJR major capsid protein (a and b), packaging ATPase (c and d) and protein-primed DNA polymerase (e and f). Members of the Tectiviridae infecting Gram-positive and Gram-negative hosts are shown separately as G+ and G−, respectively. All alignments were performed using the ETE3 Toolkit with workflow eggNOG41. All trees are maximum-likelihood trees with aLRT branch supports.

Extended Data Figure 5 Sequence-diverse autolykiviruses share extensively overlapping host ranges that include diverse hosts.

a, Pairwise coinfection significance by host count. Autolykiviruses exhibit highly significant host sharing. b, Pairwise coinfection significance compared to mean pairwise genomic similarity of the host. Autolykiviruses exhibit more significant host sharing than tailed phages of comparable host diversity. a, b, Coinfection significance as defined in Methods. c, Pairwise coinfection significance compared to viral genomic similarity measured as a fraction of shared open reading frames (ORFs). Autolykiviruses exhibit more significant host sharing than tailed viruses of comparable genomic similarity. A total of 998 reciprocal pairs of tailed viruses and 236 reciprocal pairs of autolykiviruses are shown, representing all pairs of viruses within each group (141 unique tailed, 16 unique autolykiviruses) that share at least one host.

Extended Data Figure 6 Autolykiviruses show delayed host lysis compared with other viruses.

Inverted phylogenetic tree showing the relationships among all 318 assayed bacterial strains on the basis of the concatenated alignments of the hsp60 and ribosomal protein genes, and using a partitioned model in RAxML to allow placement of 40 strains for which only the hsp60 gene sequence was available (Methods). Isolates are generally non-clonal. Leaves represent Vibrionaceae isolates and are coloured by population (Methods). Nodes represent viruses and are coloured by morphotype, as defined by major capsid protein or genome composition (Methods; non-tailed in orange, tailed in blue, unsequenced viruses in grey); edges represent infections with intensity increasing with increased time required for observation of plaques. Whereas 94% of tailed virus infections were detected within three days in host range assays, only 57% of autolykivirus infections were detected in that time, with 15% requiring more than seven days to be detected.

Extended Data Figure 7 DJR elements in Vibrionaceae include naturally excising integrated prophages and broad host-range plasmids.

Prophages of representative group 5 DJR elements (Fig. 4) naturally excise from their Vibrio hosts during growth in culture. Sequencing of nuclease-treated cell-free culture supernatants reveals sharply delineated regions of high coverage read mapping with respect to host genome background, indicating the presence of extracellular nuclease-protected prophage DNA. a, V. kanaloae 5S-149 DJR prophage. b, Vibrio 10N.286.55.C7 DJR prophage. c, Genome diagrams of the excising 5S-149 DJR prophage and the nine Vibrionaceae plasmids28 that are identified here as DJR elements show that they are syntenic and all share the DJR capsid protein, packaging ATPase and the corticovirus PM2 P17-like protein. MCL clustering of proteins on the basis of the BLASTp sequence similarity reveals that additional proteins, including integrases, repressors, peptidoglycan hydrolases and replication initiation genes, are common but not universal within these elements. d, Pairwise percentage of whole-genome nucleotide identities between 5S-149 DJR prophage and the DJR Vibrionaceae plasmids show that these elements are highly diverse at the nucleotide level and that 100% nucleotide-identical 13.6-kb plasmids are found in hosts in multiple species.
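The idea of "sharply delineated regions of high coverage" relative to host background can be sketched in Python. The sketch assumes a per-position (or per-window) list of read depths is already available; the fold-over-median cutoff, the minimum run length and the toy depths are illustrative assumptions, not parameters reported in the study.

```python
from statistics import median


def high_coverage_regions(coverage, fold=10.0, min_length=3):
    """Return half-open (start, end) intervals where depth >= fold * median depth.

    The median depth stands in for the host genome background; it is floored
    at 1.0 so that a zero background does not make every position pass.
    """
    background = median(coverage)
    cutoff = fold * max(background, 1.0)
    regions, start = [], None
    for pos, depth in enumerate(coverage):
        if depth >= cutoff:
            if start is None:
                start = pos  # open a new candidate region
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a region that runs to the end of the sequence.
    if start is not None and len(coverage) - start >= min_length:
        regions.append((start, len(coverage)))
    return regions


# Toy depth profile (hypothetical): low background with one enriched block.
toy_depth = [2] * 20 + [150] * 10 + [3] * 20
regions = high_coverage_regions(toy_depth)
print(regions)
```

On real data the boundaries found this way would only be approximate; confirming excision would still rely on the read-level evidence described in the legend.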

Extended Data Figure 8 Network of DJR virus capsids identified in bacterial and archaeal genomes and marine metagenomes.

Iterative HMM-based searches of marine metagenomes, on the basis of a reference panel of autolykiviruses and previously identified DJR capsid bacterial and archaeal viruses, yield approximately 15,000 proteins following stringent quality control filtering of the initial approximately 45,000 sequences that were recovered. Network visualization reflects MCL clustering of BLASTp-based similarities among sequences. a, Placement of reference panel sequences within the network. b, Characterization of proteins as DJRs on the basis of sequence- and structural-similarity-based annotation. c, Best BLASTp matches to RefSeq viruses, bitscore requirement of 50. d, Association of Tara Oceans-derived sequences to size fraction of isolation. e, Subset of sequences selected for phylogenetic analyses (Fig. 4) on the basis of membership in protein clusters strongly supported as bacterial and archaeal virus DJR capsids and requiring a length of ≥200 amino acids (Methods). We note that this selection is conservative, given the greater number and diversity of sequences recovered by our HMM-based search that passed all quality controls and show no structural- or sequence-based similarity to any other proteins, and thus were excluded from further analyses. The observed dominance of eukaryotic virus DJR capsids in this search is predicted to reflect four major aspects of our approach. First, inclusion of cellular metagenomes allows capture of large viruses such as the Mimiviridae (>400 nm), Iridoviridae (120–350 nm) and Phycodnaviridae (100–220 nm). Second, some Phycodnaviridae have been shown to encode up to eight sequence-diverse copies of their DJR major capsid gene84. Third, <0.22 μm viral metagenomes are biased against recovery of bacterial and archaeal DJR viruses, as described here. And fourth, the sequence content of HMMs using iterative searches is defined by the search space, such that if eukaryotic virus DJR capsid sequences are well represented, as they are in the larger size-fraction sequence databases used here, they will drive searches towards increased detection of similar sequences.
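The length requirement applied in panel e (≥200 amino acids) can be sketched as a simple FASTA filter in Python. The parser, the function names and the toy records are hypothetical; the sketch covers only the length criterion and does not reproduce the cluster-membership or quality-control steps described in Methods.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records, min_length=200):
    """Keep proteins of at least min_length residues (a trailing '*' stop symbol is ignored)."""
    return [(n, s) for n, s in records if len(s.rstrip("*")) >= min_length]


# Toy protein records (hypothetical sequences, not DJR capsid proteins).
toy_fasta = [
    ">a",
    "M" * 250,
    ">b",
    "M" * 120,
    ">c",
    "M" * 100,
    "K" * 100,  # multi-line record, 200 residues in total
]
kept = filter_by_length(read_fasta(toy_fasta))
print([name for name, _ in kept])
```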

Extended Data Table 1 Metagenomes used in this study
Extended Data Table 2 Contigs of DJR elements

Supplementary information

Supplementary Information

This file contains Supplementary Figure 1, a comparison of tailed virus and autolykivirus genome recovery with and without protease treatment. The uncropped source gel for Fig. 3c. (PDF 288 kb)

Life Sciences Reporting Summary (PDF 74 kb)

Supplementary Data 1

This file contains accession numbers, taxonomy, and annotation of DJR capsid proteins included in the trees and network. (XLSX 781 kb)

Supplementary Data 2

This file contains accession numbers, taxonomy, and annotation of DJR capsid proteins included in the trees and network. (XLSX 6229 kb)

Supplementary Data 3

This file contains GenBank accession numbers for newly obtained sequences and previously published genomes included in Fig. 2 and Extended Data Fig. 6. (XLSX 45 kb)

Cite this article

Kauffman, K., Hussain, F., Yang, J. et al. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature 554, 118–122 (2018).
