Abstract
Viruses impact microbial diversity, gene flow and function through virus–host interactions. Although metagenomics surveys are rapidly cataloguing viral diversity, methods are needed to capture specific virus–host interactions in situ. Here, we leveraged metagenomics and repurposed emulsion paired isolation-concatenation PCR (epicPCR) to investigate viral diversity and virus–host interactions in situ over time in an estuarine environment. The method fuses a phage marker, the ribonucleotide reductase gene, with the host 16S rRNA gene of infected bacterial cells within emulsion droplets providing single-cell resolution for dozens of samples. EpicPCR captured in situ virus–host interactions for viral clades with no closely related database representatives. Abundant freshwater Actinobacteria lineages, in particular Rhodoluna sp., were the most common hosts for these poorly characterized viruses, with interactions correlated with environmental factors. Multiple methods used to identify virus–host interactions, including epicPCR, identified different and largely non-overlapping interactions within the vast virus–host interaction space. Tracking virus–host interaction dynamics also revealed that multi-host viruses had significantly longer periods with observed virus–host interactions, whereas single-host viruses were observed interacting with hosts at lower minimum abundances, suggesting more efficient interactions. Capturing in situ interactions with epicPCR revealed environmental and ecological factors shaping virus–host interactions, highlighting epicPCR as a valuable technique in viral ecology.
Data availability
Sequences associated with 16S rRNA libraries from environmental samples and incubation experiments, bacterial and viral shotgun libraries, and fusion amplicons from epicPCR, have been deposited in the NCBI under BioProject accession no. PRJNA599167. Water physicochemical measurements and qPCR data have been deposited in the BCO-DMO database under datasets 757405 and 821955. Datasets used in this analysis include GOV2.0, NCBI non-redundant nucleotide database (nr), Tampa Bay metagenomic libraries (BioProject accession nos. PRJNA28619, PRJNA47459 and PRJNA52403), and Damariscotta River Estuary, ME, USA (BioProject accession no. PRJNA357591). Source data are provided with this paper.
Code availability
No customized code was used in analysis of the data.
Acknowledgements
We thank the Smithsonian Environmental Research Center and K. Lohan for providing access to their facilities during sample collection. This work was supported by the National Science Foundation Biological Oceanography (award nos. 1820652, 1829831 and 1756314) and a Gordon and Betty Moore Foundation Investigator award (no. 3790). Part of this project was conducted using computational resources at the Maryland Advanced Research Computing Center and the Ohio Supercomputer Center for high-performance computing.
Author information
Contributions
E.G.S. and S.P.P. conceived the work. E.G.S. conducted all field work, epicPCR analysis and incubation experiments. E.G.S., K.A.W. and S.P.P. conducted the experimental and computational analysis for the bacterial metagenomic and 16S rRNA gene libraries. E.G.S., F.T., A.A.Z. and O.Z. conducted experimental and computational analysis for the viral metagenomic libraries. E.G.S., K.A.W. and S.P.P. conducted the bioinformatic host prediction. E.G.S. wrote the manuscript. E.G.S., A.A.Z., O.Z., M.B.S. and S.P.P. edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Map of the Chesapeake Bay with sampling site.
Map of the Chesapeake Bay with sampling site at the Smithsonian Environmental Research Center (SERC) in Edgewater, MD marked (SERC pier). Samples were taken off the SERC pier near the mouth of the Rhode River.
Extended Data Fig. 2 Distribution of viral populations in Chesapeake Bay samples collected across seasons and years.
Viral populations were defined as contigs > 5kb with < 95% average nucleotide identity across 80% of the contig. Population distributions were determined by mapping reads from each time point to population representative contigs. Shared viral populations are those populations that were observed at multiple time points. a, Shared viral populations between Chesapeake Bay samples collected from May 2017 to December 2018. Viral population distributions were also compared to a sample collected from the same site in December 2012. The number of viral populations observed differed between time points, resulting in non-reciprocal comparisons. Self-self comparisons completely overlapped. Vertical and horizontal (below each bar cluster) bar colors represent the date of sampling for each group, according to the figure legend. b, Comparison of mean viral community similarity between samples from the same season (spring, winter; n = 6 comparisons) versus between seasons (n = 19 comparisons). A significantly higher proportion of viral populations was shared by samples from the same season than by samples from different seasons (two-tailed Mann-Whitney U, p = 0.0003). Box and whisker markers represent the minimum, first quartile, median, third quartile, and maximum values. c, Comparison of mean viral community similarity between samples from the same year (2018, n = 12 comparisons) versus samples from different years (2012, 2017, and 2018; n = 13 comparisons). There was no observed difference in the proportion of shared viral populations from the same year versus across years (two-tailed Mann-Whitney U, p > 0.05). Box and whisker markers represent the minimum, first quartile, median, third quartile, and maximum values.
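The non-reciprocal comparisons in panel a can be illustrated with a minimal Python sketch. It assumes that the shared proportion is the number of populations detected in both samples divided by the number detected in the query sample; the caption does not state the exact denominator, so this is an assumption. The function name and the population identifiers are hypothetical and are not data from the study.

```python
def shared_fraction(query, reference):
    """Fraction of populations in `query` that are also detected in `reference`.

    Assumed definition: |query AND reference| / |query|.
    """
    query, reference = set(query), set(reference)
    if not query:
        raise ValueError("query sample has no detected populations")
    return len(query & reference) / len(query)


# Hypothetical population identifiers for two time points.
sample_a = {"vp1", "vp2", "vp3", "vp4"}
sample_b = {"vp2", "vp3", "vp4", "vp5", "vp6", "vp7"}

a_vs_b = shared_fraction(sample_a, sample_b)     # 3 of 4 populations shared
b_vs_a = shared_fraction(sample_b, sample_a)     # 3 of 6 populations shared
self_self = shared_fraction(sample_a, sample_a)  # complete overlap
```

Because the two samples contain different numbers of populations, the same three shared populations give a different proportion in each direction, which is why the comparison matrix is not symmetric.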
Extended Data Fig. 3 Abundance and diversity of viral populations with RNR alpha subunit genes in the Chesapeake Bay.
Abundance and diversity of viral populations with RNR alpha subunit genes in the Chesapeake Bay from samples collected between May 2017 and December 2018. A total of 634 RNR gene homologs were found across 10,858 total viral populations. a, The proportion of viral populations > 5kb with identified RNR genes at each sample time point. The total number of viral populations with identified RNR alpha subunit genes at each time point is indicated in parentheses. b, The predicted relative abundance of viral populations > 5kb with identifiable RNR alpha subunit genes in the Chesapeake Bay. Relative abundance was predicted by read mapping to all viral populations > 5kb. c, Maximum likelihood tree displaying the diversity of RNR alpha subunit genes in the Chesapeake Bay. RNR alpha subunit peptides from UniRef were clustered with translated Chesapeake Bay cellular metagenome (> 0.2 µm), viral metagenome (< 0.2 µm), and amplicon RNR sequences at 50% aa identity. Sequences were aligned with MAFFT and trimmed to 407C to 596P in E. coli. Primers designed to amplify cyanosiphoviruses and cyanopodoviruses were limited to amplifying RNR genes within this monophyletic clade. Scale bar represents amino acid substitutions per site. d, Gene counts per mL of ‘Cyano SP-like’ RNR genes in the Chesapeake Bay from May to December 2018 (n = 17 biologically independent samples). Gene counts were quantified with qPCR. Error bars are SE.
Extended Data Fig. 4 Correlation of pairwise shared populations determined by comparing viral contigs and alpha subunit RNR genes.
Spearman’s rank correlation of pairwise shared populations determined by comparing viral contigs and alpha subunit RNR genes. Shared viral populations between libraries from December 2012, May 2017, May 2018, August 2018, and December 2018 were compared. Each point represents the proportion of viral populations that were shared between two sample time points based on comparison of contigs > 5kb (x axis) and comparison of RNR genes only (y axis). For example, 50% of May 2017 viral populations were shared with May 2018 by analysis of contigs, and 57% of May 2017 viral populations were shared with May 2018 by analysis of RNR genes only. RNR genes alone captured the seasonal diversity observed from viral contigs, making RNR a good marker gene for viral population diversity (Spearman’s Rho = 0.93, p = 0).
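As an illustration of the statistic used here, the following Python sketch computes Spearman's rho as the Pearson correlation of average ranks. It is not the authors' analysis code. Only the first pair of values (0.50, 0.57) comes from the caption; the remaining values are hypothetical placeholders, and the function names are assumptions made for this example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Proportion of shared populations per pairwise comparison.
# First pair is from the caption; the rest are hypothetical.
shared_by_contigs = [0.50, 0.20, 0.35, 0.10, 0.60]
shared_by_rnr = [0.57, 0.25, 0.30, 0.15, 0.70]

rho = spearman_rho(shared_by_contigs, shared_by_rnr)
```

Because the statistic depends only on rank order, it measures whether comparisons that share more populations by contigs also share more by RNR genes, without assuming a linear relationship between the two proportions.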
Extended Data Fig. 5 Host predictions for Chesapeake Bay viral populations from 2018 virome libraries.
Host predictions for Chesapeake Bay viral populations from spring (May), summer (August), and winter (December) 2018 virome libraries. a, Predicted host and phage family based on RNR homology. A total of 634 RNR gene homologs were found across 10,858 total viral populations. RNR nucleotide sequences were aligned with MAFFT using the peptide sequences as a guide in TranslatorX. Only the 437 RNR sequences that spanned the same region (460A to 693I in the E. coli class I alpha RNR peptide) were further analyzed to avoid possibly double-counting partial RNR sequences on separate contigs. RNR homology was queried by BLASTn against the NCBI nr database. Only top hits with e values < 1E-10 were classified. The total number of RNR sequences analyzed at each time point is indicated in parentheses. b, Mean nucleotide identity between Chesapeake Bay RNR sequences and reference sequence top hits. Only Chesapeake Bay RNR sequences sharing at least 65% nucleotide identity over 90% of the sequence with a reference sequence were assigned a reference hit. The number of RNR sequences is indicated in parentheses. Box and whisker markers represent the minimum, first quartile, median, third quartile, and maximum values.
Extended Data Fig. 6 Rank abundance curve of viral populations and RNR sequences in an example viral community.
All of the Cyano SP-like Actinophage RNR sequences (dark lines highlighted with arrows) were ranked in the rare tails of the two viral communities and are shown in the blown-out, log-transformed insets. Only the two paired long-read and short-read metagenomes are shown here with similar results being observed for the rest of the short-read-only viromes.
Extended Data Fig. 7 k-mer-based host predictions for Chesapeake Bay viral populations assembled from shotgun metagenomics sequence data.
Half of all assembled viral populations > 5kb had a significant top hit to a putative host in the host database (see Materials and Methods; upper panel left). Actinobacteria were overrepresented as putative hosts for Chesapeake Bay viral contigs relative to all Chesapeake Bay metagenome-assembled genomes (MAGs; upper panel right). If the composition of predicted top hosts and the abundance of those hosts in the database were very similar, it would suggest that the probability of being a predicted host scales with abundance in the database, potentially creating false-positive associations. However, the enrichment of MAGs in the top host predictions (upper panel middle) compared to the number of MAGs in the database (lower panel left) and the enrichment of Actinobacteria within MAGs predicted as hosts (upper panel right) compared to the composition of the MAG dataset (lower panel right) are consistent with substantial viral pressure on Actinobacteria populations in this environment.
Extended Data Fig. 8 Viral community composition of populations based on top predicted host by in silico host prediction.
a, Viral community composition of populations with a Chesapeake Bay MAG as a top predicted host. Contigs with a Chesapeake Bay MAG as a top predicted host represented 20% of all viral contigs with a significant (p < 0.05) host prediction (n = 937). The total number of viral populations with a Chesapeake Bay MAG as a top host prediction is indicated in parentheses for each sample time point. b, Viral community composition of all contigs with a significant (p < 0.05) host prediction (n = 4,644). The total number of viral populations with a significant host prediction is indicated in parentheses for each sample time point.
Extended Data Fig. 9 Predictions of virus-host associations by various bioinformatics approaches largely identify unique interactions within virus-host interaction space.
a, Bioinformatics approaches identifying viral populations predicted to infect the observed metagenome-assembled genomes (MAGs). Three different approaches were applied to infer viral populations that infect MAGs: a Markov model-based method (WIsH, blue), CRISPR spacer homology matches (CRISPR, red) and tRNA homology matches (tRNA, yellow). Numbers within each non-overlapping shaded region show how many MAGs were uniquely predicted as hosts with each method. MAGs predicted as hosts by multiple methods are found within the overlapping shaded regions (for example, 2 of the same MAGs were predicted as hosts by WIsH and CRISPR in the red and blue overlapping region). Numbers in parentheses indicate how many of the shared predictions match the same viral population. In all cases, none of the viral populations predicted to infect MAGs were identical between methods. b, Bioinformatics approaches identifying host taxonomy for observed viral populations. Three different approaches were applied to infer host taxonomy: a Markov model-based method (WIsH, blue), RNR homology matches (RNR, red) and tRNA homology matches (tRNA, yellow). Numbers within each non-overlapping shaded region show how many viral population predictions were unique to each method. Viral populations with hosts predicted by multiple methods are found within the overlapping shaded regions. Numbers in parentheses indicate how many overlapping predictions match at the genus (first) and phylum (second) level. For example, 36 of the same viral populations had host taxonomy predicted by WIsH and RNR (red and blue overlapping region). However, while there were 36 shared predictions, only two of these host predictions were concordant at the genus or phylum level (5.6%).
Extended Data Fig. 10 The impact of viruses on Chesapeake Bay summer bacterial populations.
Bacterial communities sampled in July 2019 were incubated for 75 hours with or without viruses. a, Bacterial community composition prior to incubation. b, Growth of bacterial communities in incubations with viruses (n = 6 incubations of the same initial sample) and without viruses (n = 6 incubations of the same initial sample). Bacterial abundances were quantified with qPCR and are reported as the mean fold change of the bacterial community relative to the starting community abundance. Error bars are SD. Asterisks indicate a significant difference in fold change between incubations without and with viruses (two-tailed Mann-Whitney U test, p = 0.02 at 52 hours; p = 0.005 at 75 hours). c, Phylogeny and putative resistance or susceptibility of OTUs to viruses in the incubation experiment. Only OTUs with significantly (FDR p < 0.05) greater growth in one of the treatments (without viruses or with viruses) are depicted. OTUs were first filtered for detectable growth during the incubation (as determined by relative abundance of each OTU and community absolute abundances from qPCR). OTUs that displayed growth in at least four treatment replicates (without viruses or with viruses) were assessed for significantly (FDR p < 0.05) greater growth in one of the treatments using a Kruskal-Wallis test. In total, 89/2,664 OTUs passed filtering criteria and were identified as displaying significantly higher growth in one of the treatments. Putative susceptibility was calculated as (OTU Abundance Fold Change Without Viruses)/(OTU Abundance Fold Change With Viruses). Higher values indicate greater presumed susceptibility to viral-mediated mortality.
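The putative susceptibility ratio in panel c can be sketched in Python as follows. The sketch assumes that each OTU's absolute abundance is estimated as its relative abundance multiplied by the community absolute abundance from qPCR, which is one reading of the filtering description above and not a confirmed detail of the authors' workflow. All function names and input values are hypothetical.

```python
def otu_fold_change(rel_start, total_start, rel_end, total_end):
    """Fold change in estimated absolute abundance of one OTU.

    Assumption: absolute abundance = relative abundance x community
    absolute abundance (for example, 16S rRNA gene copies from qPCR).
    """
    start = rel_start * total_start
    end = rel_end * total_end
    if start <= 0:
        raise ValueError("starting abundance must be positive")
    return end / start


def putative_susceptibility(fold_change_without, fold_change_with):
    """Ratio defined in the caption: fold change without / with viruses."""
    if fold_change_with <= 0:
        raise ValueError("fold change with viruses must be positive")
    return fold_change_without / fold_change_with


# Hypothetical OTU: 1% of a community of 1e6 gene copies at the start.
fc_without = otu_fold_change(0.01, 1e6, 0.04, 2e6)  # grows about 8-fold
fc_with = otu_fold_change(0.01, 1e6, 0.02, 1e6)     # grows about 2-fold

susceptibility = putative_susceptibility(fc_without, fc_with)
```

In this hypothetical case the OTU grows about four times more when viruses are removed, so the ratio is about 4; a ratio below 1 would instead correspond to greater growth in the presence of viruses.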
Supplementary information
Supplementary Information
Supplementary Tables 1 and 4–6, and references.
Supplementary Tables
Supplementary Tables 2 and 3.
Source data
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 10
Statistical source data.
About this article
Cite this article
Sakowski, E.G., Arora-Williams, K., Tian, F. et al. Interaction dynamics and virus–host range for estuarine actinophages captured by epicPCR. Nat Microbiol 6, 630–642 (2021). https://doi.org/10.1038/s41564-021-00873-4
DOI: https://doi.org/10.1038/s41564-021-00873-4