A single-cell polony method reveals low levels of infected Prochlorococcus in oligotrophic waters despite high cyanophage abundances

Long-term stability of picocyanobacteria in the open oceans is maintained by a balance between synchronous division and death on daily timescales. Viruses are considered a major source of microbial mortality; however, current methods to measure infection have significant limitations. Here we describe a method that pairs flow-cytometric sorting with a PCR-based polony technique to simultaneously screen thousands of taxonomically resolved individual cells for intracellular virus DNA, enabling sensitive, high-throughput, and direct quantification of infection by different virus lineages. Under controlled conditions with picocyanobacteria-cyanophage models, the method detected infection throughout the lytic cycle and discriminated between varying infection levels. In North Pacific subtropical surface waters, the method revealed that only a small percentage of Prochlorococcus (0.35–1.6%) were infected, predominantly by T4-like cyanophages, and that infection oscillated 2-fold in phase with the diel cycle. This corresponds to 0.35–4.8% of Prochlorococcus mortality daily. Cyanophages were 2–4-fold more abundant than Prochlorococcus, indicating that most encounters did not result in infection and suggesting infection is mitigated via host resistance, reduced phage infectivity and inefficient adsorption. This method will enable quantification of infection for key microbial taxa across oceanic regimes and will help determine the extent to which viruses shape microbial communities and ecosystem-level processes.

efficiency observed in the S-TIM4 and Syn5 growth experiments (Fig. 2). Cyanophage growth experiments with T7-like (n=3) and T4-like (n=9) cyanophages (Supplementary Table 4) indicated that, on average, single virus genome copies exist prior to DNA replication for 23% and 30% of the latent period, genome replication occurs for 27% and 35% of the latent period, and the remaining 50% and 35% of the latent period follows genome replication in T4-like and T7-like cyanophages, respectively. Therefore, the efficiencies of the three bins were weighted according to the average duration of each respective stage of infection for each phage family.
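The weighting described above amounts to a duration-weighted mean of the per-bin detection efficiencies. The following is a minimal Python sketch, assuming three bins ordered as pre-replication, replication, and post-replication. The stage fractions are those quoted above; the function name and the example bin efficiencies are hypothetical placeholders, not measured values.

```python
# Fractions of the latent period spent in each infection stage (from the text):
# (before genome replication, during replication, after replication)
STAGE_FRACTIONS = {
    "T4-like": (0.23, 0.27, 0.50),
    "T7-like": (0.30, 0.35, 0.35),
}

def weighted_efficiency(bin_efficiencies, family):
    """Duration-weighted mean detection efficiency for one phage family."""
    weights = STAGE_FRACTIONS[family]
    if len(bin_efficiencies) != len(weights):
        raise ValueError("expected one efficiency per infection stage")
    return sum(e * w for e, w in zip(bin_efficiencies, weights))

# Hypothetical bin efficiencies, for illustration only.
example = weighted_efficiency((0.2, 0.5, 0.8), "T4-like")
```

Because the stage fractions for each family sum to one, the result stays on the same scale as the individual bin efficiencies.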

Co-sorted virus contamination
Potential co-sorting of free viruses can be a significant issue when sorting is based on autofluorescence properties of particles, as cell sorters will not detect and abort events with nonfluorescent co-sorted viruses. Initial tests were performed by adding viruses to solutions of 1 µm yellow-green beads and sorting the bead-virus solutions. Either the T7-like cyanophage, P-SSP7, or the T4-like cyanophage, Syn9, was serially diluted 10-fold to achieve concentrations ranging from 10³ to 10⁸ viruses ml⁻¹ and added to the beads. Polonies from free viruses were detected in sorted solutions when the added virus concentrations were 10⁵ viruses ml⁻¹ and above.
We empirically tested the frequency of co-sorted viruses in environmental samples that spanned a range of infection values, free virus abundances, depths, seasons, and ocean basins (n=19, see Supplementary Table 5). Prochlorococcus were sorted from these samples and an aliquot of the sorted cells was filtered through a 0.2 µm syringe top filter (Acrodisc HT Tuffryn membrane, Pall Corp.). Both cells and filtrate were analyzed with the same iPolony reaction conditions for the abundance of T4-like and T7-like polonies. The abundance of total polonies observed was compared to the abundance of polonies in the filtrate (Supplementary Fig. 4). Co-sorted viruses were compared to the abundances of cyanophages in the water column determined from these samples to estimate thresholds of co-sorted virus contamination (Supplementary Fig. 4c-d). When T7-like or T4-like free cyanophage abundances in seawater samples were greater than 5 × 10⁵ or 3 × 10⁵ cyanophages ml⁻¹, respectively, co-sorted viruses contributed a low but constant proportion of the infection signal (Supplementary Fig. 4c-d). Regressions between uncorrected percent infection and percent infection corrected for co-sorted free viruses indicated that co-sorted viruses contributed 8.2% and 19.5% of the measured infection signal for T7-like and T4-like cyanophages, respectively (Supplementary Fig. 4e-f). Thus, a correction factor (0.918 for T7-like cyanophage or 0.805 for T4-like cyanophage) is applied to adjust for free virus co-sorting when co-sorted virus concentrations were not empirically determined and in situ free virus abundances reach these thresholds.
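The correction rule above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the authors' code: the thresholds and factors are those quoted above, while the function and dictionary names are hypothetical, and the sketch assumes the factor is applied only when free cyanophage abundance exceeds the family-specific threshold.

```python
# Thresholds (free cyanophages per ml) and correction factors from the text.
CO_SORT = {
    "T7-like": {"threshold": 5e5, "factor": 0.918},
    "T4-like": {"threshold": 3e5, "factor": 0.805},
}

def correct_infection(percent_infected, free_phage_per_ml, family):
    """Scale percent infection to remove the co-sorted free-virus signal.

    Applies only when co-sorted viruses were not measured directly and the
    in situ free cyanophage abundance exceeds the threshold for that family.
    """
    params = CO_SORT[family]
    if free_phage_per_ml > params["threshold"]:
        return percent_infected * params["factor"]
    return percent_infected

# Hypothetical sample: 1.0% uncorrected T4-like infection at 1e6 phages per ml.
corrected = correct_infection(1.0, 1e6, "T4-like")
```

Below the threshold the value is returned unchanged, consistent with the statement that co-sorting only contributes above those abundances.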

Sample storage conditions
The effects of storage time and conditions on measured infected cell concentrations were tested for both initial sample collection and after sorting. Locations and dates of sample collection used in this section can be found in Supplementary Table 5. First, we tested whether freezing samples affected infected cell concentrations. Water was collected from the Mediterranean Sea off Haifa, Israel on September 11, 2018 and fixed with glutaraldehyde (0.125% final concentration). Prochlorococcus and Synechococcus were sorted from one aliquot that was not frozen. The other aliquots were frozen in liquid nitrogen and stored at -80 ºC for 5 hours, thawed, and then sorted. Sorted cyanobacteria from fresh and frozen samples were analyzed for the extent of viral infection by T7-like and T4-like cyanophages (Supplementary Fig. 1a). Freezing had no significant effect on percent infection compared to samples that were not frozen (n=5, paired two-tailed t-test, p = 0.69). Thus, samples can be flash frozen without affecting the number of infected cyanobacteria measured.
To test the effect of long-term storage of samples at -80 ºC on infected cell abundances, samples were collected and stored in replicates. One set of samples collected from the Mediterranean Sea was analyzed for percent infection of Prochlorococcus and Synechococcus initially on the day of collection (unfrozen) and again 1 week, 6 weeks, and 5 months after collection (Supplementary Fig. 1b). To test storage over >5 months, a second set of samples (n=8) from the North Pacific Ocean was initially analyzed for Prochlorococcus percent infection 2 months after collection. Frozen replicates of these samples were then analyzed 1 year after the initial analysis (Supplementary Fig. 1b). There was no significant effect of storage time on the percent infection relative to the initial analysis (one-way ANOVA, F(5,31)=0.64, p=0.67). These results were consistent across samples collected from different oceans that varied in their absolute infection percentage. Thus, virally infected cells are stable long-term when flash frozen in liquid nitrogen and stored at -80 ºC.
We tested the effects of different storage conditions on sorted cells to see whether sorting and polony analysis could be performed on separate days, given how time-intensive it is to perform both on the same day. We tested this on Prochlorococcus cells as they are considered more sensitive than Synechococcus. First, Prochlorococcus cells were sorted from 3 environmental samples and were analyzed on the same day as they were sorted. An aliquot of sorted cells was also flash frozen in liquid nitrogen, stored at -80 ºC, and reanalyzed for percent infection after 7 days. The number of polonies post-freezing decreased significantly (n=3, paired two-tailed t-test, p = 0.028) for Prochlorococcus cells infected by T4-like and T7-like cyanophage, by an average of 47.5% and 58.7%, respectively (Supplementary Fig. 2a). Thus, Prochlorococcus cells cannot be frozen and stored after sorting without significantly affecting the number of infected cells measured. Second, we tested whether cells could be stored at 4 ºC without affecting infection levels. Prochlorococcus was sorted from two samples, and T4-like and T7-like cyanophage infection was analyzed on the same day and 1, 2, 4, and 7 days post sorting after storage at 4 ºC in the dark (Supplementary Fig. 2b). Infection percentages decreased significantly over time (repeated measures ANOVA, F(4,12) = 6.119, p<0.05). In 3 of the 4 analyses, infection decreased 22-57% compared to samples analyzed on the same day (Supplementary Fig. 2b). Thus, these data indicate that sorted Prochlorococcus cells cannot be stored and polony analysis should be done on the same day as sorting.

Balancing parameters that mitigate infectious encounters
The graphical solution space in Supplementary Fig. 8 shows the range of parameters for loss of virus infectivity, inefficient adsorption to host cells, and reduced host susceptibility to infection that are needed to mitigate the high number of host-virus encounters predicted by encounter theory, which yields maximal encounter rates, and so produce the empirically observed levels of infection. No a priori parameters were used to generate this solution space other than the average concentrations of Prochlorococcus, cyanophages, and percent infection. Via parsimony, we examine the range of values (the orange space in Supplementary Fig. 8) near the region in which all three parameters together explain the gap between observed lysis rates and those expected from encounter theory. In this parameter regime, adsorption frequency, viral infectivity, and host susceptibility all range between 10-35%. This range of parameters falls within ranges of previously reported values. First, 5-25% is the median range of adsorption frequencies for ~70 host-virus pairs (66), consistent with our solution space for this parameter. Second, infectivity values can be derived from knowing the balance between cyanophage production and decay (17). Cyanophage decay ranges widely from 0.048 to 2.0 d⁻¹ (19, 51). There are few data on cyanophage production, and this is typically derived from assuming that production balances decay (19). Between 20-90% of total viruses are thought to lose infectivity each day based on decay values (17, 65). We estimated that 14,440 T4-like cyanophages would be produced from the lysis of 1200 infected cells with a burst size of 12 viruses cell⁻¹ (see Results and Discussion), a conservative estimate based on 1 infection cycle occurring each day. The timescale of production balancing decay in this scenario is a 23-day turnover time. For 10-35% of cyanophages to be infective at steady state, between ~28-99% of cyanophages would need to lose infectivity each day. This overlaps with the wide bounds of loss of infectivity each day noted above (17, 65).
Third, the fraction of susceptible hosts that exist in a given population is currently unknown. Hundreds of genomically distinct Prochlorococcus subpopulations coexist in a single water mass (68, 76). This could represent a lower bound if every subpopulation was infected by a distinct virus. However, it is well known that multiple cyanophages can infect a particular cyanobacterial isolate (18, 30, 77-79) and that a single phage can infect multiple cyanobacteria (18, 30, 39). Thus, in the field, the network of susceptible hosts and infective viruses is highly likely to be more complex, with overlapping interaction networks (69, 80, 81). Consequently, we cannot reliably evaluate the range of cyanobacterial susceptibility. It should be noted, however, that some estimates of susceptibility have been made for Synechococcus based on isolating cells and phages from a single water source and testing their cross-infectivity (18, 30) or based on contact-rate calculations (19). These calculations suggest that inshore Synechococcus are largely resistant (18, 19) while offshore Synechococcus are largely susceptible to co-occurring cyanophages (19).

Cyanobacterial growth and infection conditions
Cultures of Synechococcus sp. strain WH8109 and Prochlorococcus sp. strain MIT9515 were grown in artificial seawater medium (ASW) or Pro99 medium, respectively, at 22 ºC under a 14:10 h light-dark cycle at a light intensity of 7-10 µmol photons m⁻² s⁻¹. Cyanobacterial cultures were also grown on semi-solid media to produce a lawn of cells using the pour plating method (82). Briefly, Synechococcus or Prochlorococcus cells were mixed with either ASW or Pro99 medium containing Invitrogen Ultra-Pure low melting point agarose at a final concentration of 0.28% and poured into petri dishes. The helper bacterium Alteromonas sp. strain EZ55 was added to the cells to reduce oxidative stress and increase the efficiency of plating (83). The plated cultures were grown under the same conditions described above. Culture growth in liquid media was monitored via chlorophyll a fluorescence using a Turner 10AU fluorometer. Cell abundances were measured on a BD LSR II cytometer for Synechococcus or a BD Influx cytometer for Prochlorococcus.

Phage lysate preparation
Phage lysates of Syn5, S-TIP37, and S-TIM4 were prepared by infecting 25-30 ml volumes of cyanobacterial cultures with phages at an MOI of 0.1 to 0.6. After culture lysis, cell debris was pelleted by centrifugation at 8694 × g for 15 min at 21 ºC, and the supernatant containing free phages was collected and filtered through a 0.2 μm pore size syringe filter (Acrodisc HT Tuffryn Membrane, Pall Corp.).

rbcL primer and probe design
The polony method for detection of cyanobacterial genomic DNA was used to assess the efficiency of cell permeabilization methods. Specific primers and probes were designed for the rbcL gene from Synechococcus sp. strain WH8109 (Supplementary Table 2). The primers were first tested by standard PCR and resulted in the amplification of the expected product with a size of 412 bp. The conditions for the polony method were optimized for these primers using extracted genomic DNA and were 1.5 µM acrydite-modified primer, 0.5 µM non-modified primer, 0.335 U/µl enzyme, 12.5% acrylamide gel, 50 ºC annealing temperature, 35 PCR cycles, and 55 ºC probe hybridization temperature. These polony conditions yielded 31% ± 7% (n=4) efficiency of polony formation on extracted DNA.

Single genome copy detection in E. coli
The T4-like cyanophage Syn9 g20 gene was cloned into the pMK-proCAT plasmid, a pUC57 derivative containing EcoRI and SpeI cleavage sites. Primers DR213F (GTCTAAGAATTCCACTAGATTCATCTAGAATCTTCTACATTGACGTTGGTAA) and DR213R (TCAGAAGTGCATCAAGGTCGAATACTTACCAACAAAAGGATCC) were used to amplify the g20 fragment as well as to add EcoRI and XbaI sites at the 5' end and a SpeI site at the 3' end of the fragment. This 586 bp PCR fragment was then cloned into pMK-proCAT by EcoRI and SpeI cleavage and ligation. The resulting plasmid was named pG20-C1. Plasmid abundance was determined from the concentration of DNA based on absorbance at 260 nm and the molecular weight of the plasmid based on its sequence length.
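The conversion from DNA concentration to plasmid copy number described above can be sketched as follows. This is a hedged illustration: it assumes the common approximation of about 650 g mol⁻¹ per base pair of double-stranded DNA, which the text does not state, and the function name, plasmid length, and concentration in the example are hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
G_PER_MOL_PER_BP = 650.0        # assumed average mass of one dsDNA base pair

def plasmid_copies_per_ul(conc_ng_per_ul, plasmid_length_bp):
    """Plasmid copies per microliter from an A260-derived DNA concentration."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    grams_per_mole = plasmid_length_bp * G_PER_MOL_PER_BP
    return grams_per_ul / grams_per_mole * AVOGADRO

# Hypothetical example: 10 ng/µl of a 3000 bp plasmid.
copies = plasmid_copies_per_ul(10.0, 3000)
```

A sequence-specific molecular weight would replace the per-base-pair average when the exact plasmid sequence is used, as the text indicates.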
A PCR fragment carrying a single copy of the g20 fragment, yellow fluorescent protein (YFP), and a kanamycin resistance gene was constructed using primers DR286F (TTAAGCACCGGAATTCCAC) and DR286R (CGAAAACTCACGTTAAGGG), and transformed into E. coli strain DH10b. To inhibit plasmid replication (75) and reach a single plasmid copy cell⁻¹, triplicate E. coli cultures grown in LB medium overnight at 37 ºC with shaking at 250 rpm were starved by transferring them to minimal M9 medium (Sigma) in the presence of kanamycin to retain plasmid selection. The LB medium was removed by two rounds of centrifugation and resuspension in M9 medium (Sigma), followed by a 50-fold dilution into M9 medium with kanamycin. The cultures were then incubated for an additional 3 hours at the original growth conditions. Cells (100 µL) were harvested, fixed with 0.1% glutaraldehyde, sorted based on YFP fluorescence and size, and subjected to the iPolony protocol with g20 T4-like degenerate primers and probes.
To confirm a single copy of the plasmid per cell, cells were collected on a 25 mm diameter, 0.2 µm pore-size polycarbonate membrane filter (Millipore, GVS). Intracellular DNA was extracted by a heat lysis method as described previously (84). The extracted DNA was used as a template for quantitative real-time PCR using primers Syn9_g20F/R (Supplementary Table 2), and the pG20-C1 plasmid (described above) was used to generate a standard curve for absolute copy number calculation.
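Absolute quantification from a plasmid standard curve, as described above, is typically done by regressing threshold cycle (Ct) on log10 copy number and inverting the fit for unknowns. The sketch below shows that general approach in Python; it is not the authors' analysis, and all function names and the synthetic standard values are hypothetical.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get template copies in a reaction."""
    return 10 ** ((ct - intercept) / slope)

def copies_per_cell(ct, slope, intercept, cells_in_reaction):
    """Plasmid copies per cell given the number of cells in the template."""
    return copies_from_ct(ct, slope, intercept) / cells_in_reaction

# Synthetic, idealized standards (illustration only): Ct = 38 - 3.32 * log10(copies).
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
cts = [38 - 3.32 * math.log10(c) for c in standards]
slope, intercept = fit_standard_curve(standards, cts)
```

A value of `copies_per_cell` close to 1 would correspond to the single-copy condition the experiment aims to confirm.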
Comparison of metagenomes to the polony method and polony primers
Virome sequence reads from the 2015 cruise (46) were downloaded from NCBI's Sequence Read Archive (Bioproject PRJNA358725). Sequences were quality trimmed to a Phred score of 20 and quality controlled for a minimum length of 50 bp using BBDuk in BBTools v38.22 (sourceforge.net/projects/bbmap/). Quality controlled reads were then classified using Kaiju v1.7.0 (85) using the 'viruses 2019-02-05' database with default settings. All reads identified as cyanophage were summed by sample and taxonomic affiliation (T4-like cyanomyovirus, T7-like cyanopodovirus, TIM5-like cyanomyoviruses, and cyanosiphoviruses). T4-like myoviruses have larger genome sizes (~180 kb) (86) than T7-like podoviruses (~45 kb) (87) and cyanosiphoviruses (~30-110 kb) (32) and thus may recruit more reads. Therefore, read abundances were normalized for genome size by dividing by a factor of 3.5, 1.5, and 1 for T4-like cyanophages, cyanosiphoviruses, and T7-like cyanophages, respectively. The normalized sum of each group was divided by the total number of cyanophage reads in each sample to calculate the relative abundance of each phage group.
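The genome-size normalization above can be illustrated with a short Python sketch. The divisors are those given in the text; no divisor is stated for TIM5-like cyanomyoviruses, so that group is left out here. The sketch assumes the denominator is the sum of the normalized counts so that the fractions sum to one, which is one possible reading of "total number of cyanophage reads". The function name and read counts are hypothetical.

```python
# Genome-size divisors from the text (T7-like is the reference at 1).
GENOME_SIZE_FACTOR = {
    "T4-like": 3.5,
    "cyanosiphovirus": 1.5,
    "T7-like": 1.0,
}

def relative_abundance(read_counts):
    """Genome-size-normalized relative abundance of each phage group.

    Assumes the denominator is the sum of normalized counts (an assumption).
    Raises KeyError for groups without a stated divisor.
    """
    normalized = {
        group: count / GENOME_SIZE_FACTOR[group]
        for group, count in read_counts.items()
    }
    total = sum(normalized.values())
    return {group: value / total for group, value in normalized.items()}

# Hypothetical read counts for one sample.
fractions = relative_abundance(
    {"T4-like": 3500, "cyanosiphovirus": 150, "T7-like": 100}
)
```

In this made-up sample, T4-like reads outnumber T7-like reads 35 to 1 before normalization but only 10 to 1 afterwards, which is the bias the divisors are meant to reduce.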
We checked whether the primers and probes used in the polony and iPolony methods to assess T4-like cyanophage abundance and infection captured the sequence diversity of the T4-like cyanophages in the environment. We downloaded the virus contigs assembled from the 2015 cruise (Bioproject PRJNA358725) (46) and assumed that assembled viruses were the major genotypes in the water at this time because they were abundant enough to generate long scaffolds and were reported to be high in relative abundance in the libraries (46). Genemark.hmm v3.25 (88) was used to find open reading frames (ORFs) on scaffolds. ORFs were blasted against nr (downloaded February 21, 2019) to identify T4-like cyanophage g20 sequences and checked against scaffold taxonomic assignments in Aylward et al. (46). The environmental T4-like cyanophage g20 sequences and the Syn9 g20 sequence as reference were aligned with MUSCLE v3.8.31 (89) and visualized with AliView (90). We further assessed whether the sequence variation in the sequence reads was captured by the primers and probes. Sequence reads from the viromes used to generate the above assemblies were downloaded. A phylogenetic tree of g20 nucleotide sequences was constructed based on the tree in Goldin et al. (34) using RAxML v8.1.20 (91). To obtain short reads, a database of all viromes was searched using blastn v2.9.0 (92) at an e-value of 1 × 10⁻⁵ using representative cyanophage g20 sequences as queries to broadly capture the diversity of g20 sequences in the viromes. Short read sequences were aligned to full-length reference sequences using PaPaRa (Parsimony-based Phylogeny-Aware Read Alignment program) 2.0 (93). Reads were placed on phylogenetic trees using the EPA (Evolutionary Placement Algorithm) portion of RAxML v8.1.20. Reads that did not place within the T4-like cyanophage group were removed from the PaPaRa alignment. Alignments were trimmed to the primer and probe sequences used in the iPolony assays. Sequence logos were generated from these curated alignments using WebLogo v2.8.2 (94).

Ecological models

Estimates of daily mortality
An ecological model containing susceptible Prochlorococcus cells, P, infected cells, I, and viruses, V, was created to infer the fraction of daily mortality that could be attributed to viruses. Susceptible cells grow logistically with a per capita growth rate μ and carrying capacity K. Adsorption of viruses to cells occurs at a rate f, and all other losses of susceptible cells are characterized by the rate m1. Infected cells lyse following the latent period L = 1/λ, generating a burst size b of new virions. Other losses of infected cells are characterized by the rate m2, which may differ from m1. Finally, the virus decay rate is described by w. This ecological system is described by a set of coupled differential equations for P, I, and V. We assumed steady state, where Prochlorococcus mortality and growth are balanced, and the population growth is approximately μP*, assuming total cell density is much less than K. We note that the infected fraction fi at steady state is I*/(P*+I*), which is approximately I*/P* given low rates of infection (I* << P*). The fraction of mortality attributed to viruses (fvm) can then be written as the ratio of lysis to growth, i.e., fvm = λI*/(μP*) ≈ fi·Tp/L, where Tp = 1/μ is the population turnover time and L is the infection latent period. Effectively, the fraction of virus-induced mortality is the number of infection cycles completed by the infected cell fraction within the population turnover time.
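The differential equations themselves are not reproduced in this text. One system consistent with the verbal description above is sketched below; it is a reconstruction under stated assumptions, not necessarily the authors' exact formulation. In particular, it assumes that the logistic term depends on total cell density P + I and that adsorption removes free viruses at rate fPV.

```latex
\begin{align}
\frac{dP}{dt} &= \mu P\left(1 - \frac{P + I}{K}\right) - fPV - m_1 P \\
\frac{dI}{dt} &= fPV - \lambda I - m_2 I \\
\frac{dV}{dt} &= b\lambda I - fPV - wV
\end{align}
% At steady state, lysis removes infected cells at rate \lambda I^* and
% growth is approximately \mu P^* (total cell density much less than K), so
\begin{equation}
f_{vm} = \frac{\lambda I^*}{\mu P^*} \approx f_i \, \frac{T_p}{L},
\qquad T_p = \frac{1}{\mu}, \quad L = \frac{1}{\lambda}.
\end{equation}
```

As a check on the final relation, an infected fraction of 0.35–1.6% combined with one to three latent periods per turnover time gives 0.35 × 1 = 0.35% and 1.6 × 3 = 4.8%, which matches the range of daily mortality reported in the abstract.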
Estimating encounter rates
Encounter rates between Prochlorococcus and T4-like cyanophages were calculated using the Einstein equation for diffusion (95) and the Smoluchowski equation for particle encounters (64). Each particle is assumed to be a sphere and its movements governed by diffusion. The diffusion constant (D) for each sphere is D = kB·T/(6π·η·r), where kB is Boltzmann's constant (1.38 × 10⁻⁸ µm² g s⁻² K⁻¹), T is the temperature of the seawater (296.15 K), η is the dynamic viscosity of seawater (9.96 × 10⁻⁷ g µm⁻¹ s⁻¹), and r is the radius of the sphere. Although T4-like viruses are not spherical and are composed of an 80-90 nm capsid and a 150 nm tail (96), we approximate them, within the constraints of the model, as a 0.15 µm diameter sphere. Prochlorococcus was assumed to have a 0.5 µm diameter. The encounter rate kernel (E) of one Prochlorococcus cell with cyanophages, which is governed by diffusion, is E = 4π(rP + rV)(DP + DV), where r is the radius of a sphere and D is the diffusion constant of that sphere, with subscripts P and V denoting Prochlorococcus and virus. The encounter rate kernel (E) was multiplied by the abundances of Prochlorococcus and the abundances of free cyanophages to calculate the total number of contacts between cyanophages and Prochlorococcus at the population level. The percent of encounters that resulted in infection was determined by dividing the total number of Prochlorococcus cells that were estimated to be killed per day, based on 1-3 latent periods derived from iPolony measurements (see above), by the estimated total number of encounters per day at the population level. Similarly, to estimate the percent of Prochlorococcus killed per day based on contact rates assuming 100% of encounters resulted in infection, the total number of contacts d⁻¹ ml⁻¹ was divided by the abundance of Prochlorococcus ml⁻¹.
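The encounter-rate calculation can be worked through numerically. The Python sketch below uses the constants and particle sizes given above and assumes the standard Stokes-Einstein form D = kB·T/(6πηr) and the standard Smoluchowski kernel for two diffusing spheres, E = 4π(rP + rV)(DP + DV). The function names are hypothetical, and the example abundances are the averages quoted elsewhere in this section (1.8 × 10⁵ cells ml⁻¹ and 3.6 × 10⁵ viruses ml⁻¹).

```python
import math

K_B = 1.38e-8          # Boltzmann constant, µm^2 g s^-2 K^-1 (from the text)
TEMP_K = 296.15        # seawater temperature, K
VISCOSITY = 9.96e-7    # dynamic viscosity of seawater, g µm^-1 s^-1
UM3_TO_ML = 1e-12      # 1 µm^3 = 1e-12 cm^3 = 1e-12 ml
SECONDS_PER_DAY = 86400

def diffusion_constant(radius_um):
    """Stokes-Einstein diffusion constant of a sphere, in µm^2 s^-1."""
    return K_B * TEMP_K / (6 * math.pi * VISCOSITY * radius_um)

def encounter_kernel_ml_per_day(r_host_um, r_virus_um):
    """Smoluchowski kernel for two diffusing spheres, in ml d^-1."""
    d_sum = diffusion_constant(r_host_um) + diffusion_constant(r_virus_um)
    kernel_um3_per_s = 4 * math.pi * (r_host_um + r_virus_um) * d_sum
    return kernel_um3_per_s * UM3_TO_ML * SECONDS_PER_DAY

def contacts_per_ml_per_day(kernel, hosts_per_ml, viruses_per_ml):
    """Population-level host-virus contacts per ml per day."""
    return kernel * hosts_per_ml * viruses_per_ml

# Radii are half the assumed diameters: 0.5 µm host, 0.15 µm virus.
E = encounter_kernel_ml_per_day(0.25, 0.075)
contacts = contacts_per_ml_per_day(E, 1.8e5, 3.6e5)
contacts_per_cell = contacts / 1.8e5   # encounters per cell per day
```

Under these assumptions the kernel is on the order of 10⁻⁶ ml d⁻¹, so at the quoted virus abundance each cell would encounter a virus a little less than once every two days, far more often than the observed infection levels would suggest.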
We estimated the number of resistant cells, assuming all viruses were infective and all contacts would lead to infection, by taking the difference between the number of cells that encountered a virus based on contact rates and the number of cells killed based on polonies (in other words, the number of cells that were contacted but not infected) and then dividing this difference by the total number of contacts d⁻¹ ml⁻¹. It was assumed that no cell contacted the same virus twice. Thus, these represent maximum theoretical contact rates.
To visualize the solution space for virus infectivity, host susceptibility, and adsorption efficiency, the encounter rate equation was adapted to relate the observed infection to these three parameters, where E is the encounter kernel from the above equation, x is the infective fraction of the total virus population, V (3.6 × 10⁵ viruses ml⁻¹), y is the susceptible fraction of Prochlorococcus, P (1.8 × 10⁵ cells ml⁻¹), and z is the adsorption efficiency of a host-virus encounter. The function was parameterized as described above, with the number of encounters that result in infection and lysis each day (C) being 0.79%, as observed on average throughout the time series with the iPolony data. We assume that the system is in steady state, where new encounters that lead to infection must balance the loss of infected cells each day.

The slope of the linear regression is used to correct infection values from environmental samples for which direct measurements of co-sorted free phage were not made. Note that no correction was needed for this study as free viruses were below concentrations at which they were co-sorted with cells (c,d).

(a) The alignment shows the regions of the g20 gene that are targeted by the degenerate primers and probes used in the T4-like cyanophage polony method (34). The top row is the primer and probe sequences (sequences shown for the probe and reverse primer are the reverse complements). Degenerate bases in the primer/probe sequence are shown by stacked nucleotides. Cyanophage scaffolds assembled from viromes collected on the same 2015 cruise and a reference cultured T4-like cyanophage (Syn9) are shown below. Pink boxes highlight bases that are nucleotide mismatches to the primer and probe sequences. Note that single nucleotide mismatches were observed for 5 sequences and these were in the middle of the reverse primer and probe, which would not interfere with amplification or detection. Empirical testing of the probe indicated that it could tolerate 3 mismatches (Goldin, personal communication). (b) Sequence logos show the relative frequency of bases at each position in the primers and probe based on the alignment of individual T4-like cyanophage reads from the 2015 viromes. The percent of reads aligned to the forward and reverse primers with two or fewer mismatches was 94% and 97%, respectively. Ninety-three percent of reads aligned to the probe sequence with 3 or fewer mismatches.
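The solution-space calculation described in the encounter-rate section can be sketched numerically. The adapted equation is not reproduced in this text, so the Python example below assumes the per-cell form C = E·(xV)·y·z, in which the fraction of cells infected and lysed per day equals the per-cell encounter rate scaled by the infective fraction x, the susceptible fraction y, and the adsorption efficiency z. It also assumes the standard Stokes-Einstein and Smoluchowski forms for E. All function names are hypothetical.

```python
import math

K_B = 1.38e-8          # Boltzmann constant, µm^2 g s^-2 K^-1 (from the text)
TEMP_K = 296.15        # K
VISCOSITY = 9.96e-7    # g µm^-1 s^-1
V_TOTAL = 3.6e5        # viruses per ml (from the text)
C_OBSERVED = 0.0079    # fraction of cells infected and lysed per day (0.79%)

def diffusion_constant(radius_um):
    """Stokes-Einstein diffusion constant, µm^2 s^-1."""
    return K_B * TEMP_K / (6 * math.pi * VISCOSITY * radius_um)

def encounter_kernel_ml_per_day(r_host_um=0.25, r_virus_um=0.075):
    """Assumed Smoluchowski kernel, converted from µm^3 s^-1 to ml d^-1."""
    d_sum = diffusion_constant(r_host_um) + diffusion_constant(r_virus_um)
    return 4 * math.pi * (r_host_um + r_virus_um) * d_sum * 1e-12 * 86400

E = encounter_kernel_ml_per_day()

# Product x*y*z needed to reduce maximal encounters to the observed infection.
required_product = C_OBSERVED / (E * V_TOTAL)

# Value each parameter takes if all three contribute equally.
equal_share = required_product ** (1.0 / 3.0)

def required_adsorption(x, y):
    """Adsorption efficiency z needed for given infective (x) and susceptible (y) fractions."""
    return C_OBSERVED / (E * V_TOTAL * x * y)
```

Under these assumptions the equal-contribution value comes out near 25%, which falls inside the 10-35% range reported above for the parsimonious region of the solution space.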