Characterization of prophages of Lactococcus garvieae

This report describes the morphological characterization and genome analysis of an induced prophage (PLg-TB25) from a dairy strain of Lactococcus garvieae. The phage belongs to the Siphoviridae family and its morphology is typical of other lactococcal phages. A general analysis of its genome did not reveal similarities with other lactococcal phage genomes, confirming its novelty. However, similarities were found between genes of its morphogenesis cluster and genes of Gram-positive bacteria, suggesting that this phage genome resulted from recombination events that took place in a heterogeneous microbial environment. An in silico search for other prophages in 16 L. garvieae genomes available in public databases uncovered eight seemingly complete prophages in strains isolated from dairy and fish niches. Genome analyses of these prophages revealed three novel L. garvieae phages. The remaining prophages had homology to phages of Lactococcus lactis (P335 group), suggesting a close relationship between these lactococcal species. The similarity in GC content of L. garvieae prophages to the genomes of L. lactis phages further supports the hypothesis that these phages likely originated from the same ancestor.


Results
General features of the temperate phage PLg-TB25. L. garvieae TB25 was previously isolated from an Italian cheese. A mitomycin C induction assay led to the isolation of an inducible prophage we named PLg-TB25. As shown in Fig. 1, phage PLg-TB25 is characterized by a 60 ± 6 nm icosahedral capsid and a non-contractile tail 222 ± 6 nm long and 13 ± 3 nm wide, indicating that it belongs to the Siphoviridae family. It has a double-stranded DNA genome of 38,122 bp and its GC content was calculated to be 34.5%, slightly lower than the GC content (38.1%) of its L. garvieae TB25 host 17 . The PLg-TB25 sequence shared no homology with the limited number of known L. garvieae phages. However, homology was identified with only very short DNA fragments of L. lactis phage genomes. Thus, the L. garvieae temperate phage PLg-TB25 is a new member of the Siphoviridae family.
Genome analysis. The search for orfs using ORF Finder and RAST Server was limited to those encoding proteins of more than 30 amino acids and flanked by an upstream potential Shine-Dalgarno sequence. The functions of the ORFs were presumed by comparing (BLASTp) deduced protein sequences with the GenBank database as well as by identifying conserved domains. Gene numbering started with the gene coding for the integrase as the first orf (orf1), as done previously for other lactococcal phages 18 . Therefore, the PLg-TB25 genome starts with the divergently oriented lysogenic module (orf1 to orf5), followed by the replication/transcription module (orf6 to orf26), the morphogenesis genes (orf27 to orf48) and finally the lysis module (orf49 and orf50). Similar gene organization has been reported for other lactococcal phages 18 . Of note, while the draft genome of the host L. garvieae TB25 strain is available 17 , fragments of the inducible phage genome were found on different contigs of TB25. Yet, the phage gene order was the same on the various bacterial contigs (data not shown) as the gene order obtained in the single assembled contig from the induced phage.
In total, we identified 58 orfs covering 91% of the phage genome (Table 1, Fig. 2). The most common starting codon was AUG (87%), followed by UUG (10%) and GUG (3%). A typical RBS (AGGAGA) preceded only eight orfs (orf5, orf10, orf11, orf21, orf30, orf34, orf38, and orf50). We did not identify any tRNA or recognizable virulence factors in the genome of phage PLg-TB25. Predicted functions were attributed to 31 ORFs (53%). The proteins encoded by the 27 remaining ORFs had no homology with other phage proteins, confirming that phages are sources for novel genes and that the inducible phage PLg-TB25 is new.
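The coverage figure quoted above (orfs covering 91% of the genome) can be illustrated with a minimal Python sketch. It assumes that coverage is the fraction of genome positions lying inside at least one orf, with overlapping orfs counted once; the paper does not state how the value was computed, and the function name and the coordinates in the usage note are hypothetical.

```python
def coding_coverage(orfs, genome_length):
    """Return the fraction of genome positions covered by at least one ORF.

    orfs: iterable of (start, end) pairs, 1-based and inclusive.
          Reverse-strand ORFs may be given as (end, start); strand is ignored.
    genome_length: total genome length in bp.
    Assumption: overlapping ORFs are merged so shared positions count once.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    # Normalise orientation and sort by start coordinate.
    intervals = sorted((min(s, e), max(s, e)) for s, e in orfs)

    covered = 0
    current = None  # merged interval currently being extended
    for start, end in intervals:
        if current is None:
            current = [start, end]
        elif start <= current[1]:
            # Overlaps the current merged interval: extend it.
            current[1] = max(current[1], end)
        else:
            # Gap found: close the current interval and start a new one.
            covered += current[1] - current[0] + 1
            current = [start, end]
    if current is not None:
        covered += current[1] - current[0] + 1

    return covered / genome_length
```

With the illustrative coordinates `[(1, 10), (5, 20), (40, 31)]` on a 40-bp sequence, the first two intervals merge into 20 covered positions and the third adds 10, giving a coverage of 0.75.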

Within the lysogeny and replication modules, the majority of the ORFs best matched proteins found in strains of L. garvieae and L. lactis. Conversely, the deduced ORFs involved in the phage's morphological structure are similar to proteins found in other Gram-positive bacteria, such as Staphylococcus spp., Enterococcus spp., and Fructobacillus spp., although with low amino acid identity (31-54%). A 6-kb region containing 8 genes with low GC content (31.5%) was located downstream of the lysis module. One of the genes seemed to code for a cold-shock protein 19 . While the function of cold-shock proteins is not fully understood, they often bind nucleic acids and may provide a mechanism for coping with stress and adapting to changing environmental conditions. This additional region at the end of the genome was likely acquired through recombination events or imprecise excision of the prophage.
A comparison between the genomes of phage PLg-TB25 and the PLgT-1 temperate phage from a L. garvieae marine fish isolate revealed similar length (38 kb for PLg-TB25 and 40 kb for PLgT-1) and GC content (35.4% for the marine isolate and 34.5% for the dairy isolate). The 66 orfs found in PLgT-1 are organized in modules similar to those of PLg-TB25, but the gene/protein content is completely different.
Search for temperate phages in L. garvieae genomes. The search for prophages was extended to 16 L. garvieae genomes available in GenBank (Supplementary Table S1). As reported in Table 2, eight seemingly complete prophages were found in the genomes of seven L. garvieae strains isolated from dairy and fish environments. The genome sizes ranged from 30 to 40 kb and, when possible, the integration site (att core) was also determined. Six prophages had lower GC content (34.1-35.9%) compared to the rest of the bacterial genome (37-38%).
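The GC comparison between a prophage and the rest of its host genome can be sketched in Python as follows. This is an illustration only, not the procedure used in the study: the function names and the toy sequence are hypothetical, and ambiguous bases (such as N) are assumed to be excluded from the calculation.

```python
def gc_content(sequence):
    """Return the GC content of a DNA sequence as a percentage.

    Assumption: only A, C, G and T are counted, so ambiguous bases
    (e.g. N) do not affect the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


def prophage_vs_host_gc(host_sequence, prophage_start, prophage_end):
    """Compare the GC content of a prophage with the rest of the host genome.

    prophage_start, prophage_end: 1-based inclusive coordinates of the
    prophage within host_sequence (hypothetical inputs for this sketch).
    Returns (prophage_gc, rest_of_genome_gc) in percent.
    """
    if not 1 <= prophage_start <= prophage_end <= len(host_sequence):
        raise ValueError("prophage coordinates fall outside the host sequence")
    prophage = host_sequence[prophage_start - 1:prophage_end]
    rest = host_sequence[:prophage_start - 1] + host_sequence[prophage_end:]
    return gc_content(prophage), gc_content(rest)
```

For the toy sequence `"GGCC" + "ATGA" + "GGCC"` with the prophage at positions 5 to 8, the function returns 25.0 for the prophage and 100.0 for the remaining sequence.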
To verify whether L. garvieae strains colonizing a similar ecological niche carried similar prophages, we compared the genome of inducible prophage PLg-TB25 with prophages found in the genomes of the two L. garvieae strains of dairy origin, IPLA 31405 and NBRC 100934. Very low sequence identity was found between these prophages. Moreover, the prophage from NBRC 100934 (PLg-100934) shared low nucleotide identity with other phage genomes available in GenBank. In fact, the closest (with 11% identity) phage genome was the L. lactis temperate phage BK5-T (P335 group, Fig. 3) 20 .
The genome of PLg-100934 was 36,265 bp in length with a GC content of 37.5%, a value close to that of its host (38.5%) (Supplementary Table S2). A total of 54 orfs were detected, covering 90% of the genome. The majority of the ORFs use AUG as the starting codon (85%), followed by UUG (11%) and GUG (4%). A RBS (AGGAGA) was found upstream of 11 orfs (orf4, orf11, orf18, orf19, orf21, orf28, orf32, orf37, orf45, orf48, and orf52). Genome analysis identified one tRNA (Lys) and no recognizable virulence factors. The genome of PLg-100934 was also divided into four modules: lysogeny (orf1 to orf6), replication/transcription (orf7 to orf31), morphogenesis (orf32 to orf50), and lysis (orf51 and orf52). Predicted functions were attributed to 23 of the 54 orfs (42%), including orf31, which was predicted to be related to a L. lactis homing endonuclease thought to be involved in horizontal gene transfer 21,22 . As reported for phage PLg-TB25, the PLg-100934 genome carries two extra genes with low GC content (31.8%) downstream of the lysis module. The function of the deduced proteins is unknown.
Two prophages, both with homology to L. lactis phages, were found in the genome of dairy strain L. garvieae IPLA 31405 23 . The genome of PLg-IPLA31405a was 34,986 bp in length with a GC content of 36.4%. A total of 53 orfs were detected, covering 90% of the genome. The genome of the second prophage, PLg-IPLA31405b, was 30,579 bp in length with a GC content of 35.1% and 46 orfs covering 85% of the genome. One of the prophages, PLg-IPLA31405a, had >90% nucleotide homology with the virulent L. lactis phage ul36 18 and its mutant ul36.k1 (Fig. 3), the latter being resistant to the AbiK abortive infection mechanism 24 . Half of the deduced ORFs (26/53) had between 32 and 97% amino acid identity to proteins from these L. lactis phages (Fig. 4). The morphogenesis module was particularly conserved, suggesting the same morphological features. Both L. lactis phages (ul36 and ul36.k1) are virulent members of the P335 group, which contains both temperate and lytic phages 4 . The gene coding for a dUTPase proposed to be used to detect P335 phages was not found in the PLg-IPLA31405a genome 18 .
Interestingly, the deduced protein of a gene found after an orf coding for a putative XRE regulator in the PLg-IPLA31405a genome had 72% amino acid identity with the Sak protein of L. lactis phages ul36.k1 24 and ul36.1 25 . Sak is involved in sensitivity/insensitivity to the lactococcal AbiK abortive infection system (Fig. 4). Surprisingly, a gene (orf53) coding for a protein sharing a conserved domain with the lactococcal abortive infection system, AbiF (COG4823) 26 , was detected downstream of the lysis module 27 . A phylogenetic analysis was performed using the amino acid sequences of ORF53 (AbiF conserved domain) of phage PLg-IPLA31405a and 20 Abi systems from L. lactis 27-29 . The proteomic phylogenetic tree constructed using MEGA5 software and the neighbour-joining method revealed that the L. garvieae Abi-like protein grouped with the other lactococcal Abi systems tested, but diverged in a separate branch (Supplementary Fig. S1).
The other L. garvieae IPLA 31405 prophage, PLg-IPLA31405b, was related to the temperate phage r1t from L. lactis 30 (Figs 3 and 5). Phage r1t also belongs to the P335 group (subgroup III) 31 . The highest amino acid identity was found with proteins involved in the morphogenesis module (75%). While a gene coding for a dUTPase was not found, an additional gene, located 700 bp downstream from the lysis module, appeared to code for a protein with a conserved cold-shock DNA-binding domain (pfam00313).
Similar comparative genome analyses were performed with prophages harboured by L. garvieae strains isolated from fishes, such as ATCC 49156, Lg2 and UNIUD 074 (Table 2). The prophages from L. garvieae ATCC 49156 and Lg2 are closely related (99% nucleotide identity) and have significant nucleotide identity (95% over 41% of the genome) with the prophage found in L. garvieae strain UNIUD 074 (Fig. 3). Interestingly, they are all related to the temperate phage ɸTP712 found in the widely used plasmid-free laboratory strain L. lactis MG1363 and derived from the dairy L. lactis strain NCDO 712 32,33 . Phage ɸTP712 is also related to the sequenced temperate genome, PLgT-1, isolated from a marine environment (Fig. 3). These genomes have a similar size and genome organization. The morphogenesis module is the most conserved region and, while we cannot confirm at this time that they are inducible and functional, it is tempting to speculate that at some point they had the same morphological features.
Finally, L. garvieae strains 8831 and PAQ102015-99, both isolated from rainbow trout, may have an identical prophage. Due to the genome status (contigs) of strain 8831, we were unable to find the complete phage genome sequence delimited by the att sites of PLg-PAQ102015-99 (Table 2). Still, neither prophage has any significant identity with other known phage genomes, but their organization was similar to those discussed above (Supplementary Table S3). Most ORFs seemingly involved in replication and transcription have various levels of similarity with the host proteins of L. garvieae. However, the morphogenesis cluster presents the highest nucleotide variability. Seven deduced orfs (orf23, orf24, orf26 to orf29, orf38) matched (with an amino acid identity ranging from 53 to 82%) proteins found in three species of the genus Weissella (hellenica, oryzae and koreensis) 34 . Moreover, seven orfs (orf25, orf30 to orf34, orf37) displayed similarities with deduced proteins from strains of Enterococcus gilvus and E. faecalis 35,36 . As noted above, a putative homing endonuclease (orf11) and a 4.2 kb DNA fragment with lower GC content (31%) were located downstream of the lysis module.
Comparison of L. garvieae phages with members of the currently recognized 10 L. lactis phage groups 4 revealed that while the lytic L. garvieae phage GE1 is more closely related to phages Q54 (Q54 species) and c2 (c2 species), L. garvieae prophages are more related to L. lactis phages of the P335 group.
Overall, our comparison of prophages from L. garvieae strains isolated from dairy and fish samples indicated low nucleotide identity, highlighting the diversity of lactococcal phages, particularly L. garvieae prophages.

Discussion
The recent isolation of a lytic phage infecting a strain of L. garvieae with significant similarities to dairy L. lactis phages belonging to the c2 and Q54 groups 14 raised the question of whether the same was true for temperate L. garvieae phages and prophages. Moreover, since little data is available on the mobile genetic elements (MGEs) that contribute to the evolution and adaptability of the L. garvieae species, we characterized an inducible temperate phage and analysed several prophages found within the genomes of L. garvieae strains available in public databases. Phage genome sequencing has revealed the presence of several novel genes with unknown functions. While these genes provide limited information on the biology of these phages, their analysis can shed light on their origin and provide underlying information on phage-bacteria interactions.
L. garvieae strain TB25 was previously isolated from an Italian cheese sample and was found to possess an inducible prophage belonging to the Siphoviridae family. Comparative analyses of the genome of phage PLg-TB25 with the genome of the recently described temperate phage PLgT-1 from a fish L. garvieae isolate indicated low nucleotide identity. However, the genome of PLg-TB25 had similar features (genome size, gene organization and GC content) to those observed in other L. lactis temperate phages 18 . Yet, the overall low nucleotide identity of phage PLg-TB25 with other phage genomes available in public databases confirmed that it represents a new functional lactococcal phage. Of note, the inducible phage PLg-TB25 did not infect a panel of 56 strains of L. lactis (data not shown).
The analysis of 16 sequenced L. garvieae genomes revealed at least three other novel prophage groups. Within the different genomic modules, several genes encode for putative proteins with similarities to deduced proteins from phylogenetically distant genera, such as Lactobacillus, Weissella, and Enterococcus. In all likelihood, these novel phages are the result of genetic recombination events that have taken place in an environment containing multiple bacterial genera and species, and that have led to subsequent adaptation to a L. garvieae host.
The other prophages found in the genomes of L. garvieae strains show similarity with temperate phages of L. lactis belonging to the P335 group. L. lactis phages are currently classified into 10 groups based on genome analysis and phage morphology 4 but only one group (P335) appears to contain virulent and temperate phages. Some authors have proposed to divide the diverse P335 phage group into subgroups 37 . These observations suggest an evolutionary history in an environment where these two lactococcal bacterial species can thrive, perhaps the dairy ecosystem. Since the GC content of these L. garvieae (pro)phages is lower than the GC content of their hosts and, in fact, much closer to the GC content of L. lactis strains and phages, it is tempting to speculate that they originated from L. lactis and are undergoing adaptation to a L. garvieae host. The analysis of four phage genomes harboured by dairy L. garvieae strains also revealed the presence of seemingly additional genes after the lysis module. These genes encode putative proteins involved in responses to environmental stresses or host strains (cold-shock proteins and defense mechanisms).

Figure 5.
Genomic comparison between L. garvieae phage IPLA31405b and L. lactis phage r1t. Color shading was used to discriminate between ≥70% amino acid identity (dark color) and ≤69% amino acid identity (light color). The absence of shading means no significant similarity. The percent of amino acid identity inside the shading is representative of the aligned region only. Black arrows identify the lysogeny module.
Since the L. garvieae temperate phage PLgT-1 was previously described to be capable of transduction, thereby possibly playing a role in the genetic evolution and diversification of L. garvieae marine strains 16 , it is conceivable that the prophages characterized in this study are involved in spreading genes which might contribute to the adaptation of L. garvieae to the dairy environment. Mobile genetic elements found in strain IPLA 31405 have already been proposed to play an important role in adaptation to milk, through dissemination of the gene for lactose utilization 38 .
Perhaps of interest, no known virulence factors were found in the prophages characterized in this study, even if some of the strains were isolated from infected fishes. While it remains unclear if these L. garvieae strains were directly responsible for the reported illnesses, it suggests that the virulence factors are either elsewhere in the bacterial genomes or that new molecules contributing to the pathogenicity of this organism have yet to be discovered.
In conclusion, this study highlights the diversity of L. garvieae phages and, in particular, its prophages. While most of our current knowledge about lactococcal phages is derived from the characterization of phages infecting L. lactis strains in the cheese and fermented milk industries 39-41 , it appears that the Lactococcus phage population is more diverse than previously estimated. In fact, it is plausible that some L. garvieae phages might have originated from L. lactis while others are the result of recombination events with phages infecting other bacterial genera.

Methods
Induction assay and morphology studies. L. garvieae strain TB25, previously isolated from an Italian cheese 17 , was grown statically at 30 °C in M17 broth (Pronadisa) containing 1% glucose (GM17) to an optical density at 600 nm (OD600) of 0.3. Mitomycin C (Sigma) was added to a concentration of 5 µg/ml and the OD600 was measured (in quadruplicate) every 30 min for over 5 hours using a BioTek PowerWave XS2 spectrophotometer (BioTek). Typical induction curves observed with the mitomycin C-containing cultures were characterized by an initial increase in OD600 followed by a sharp reduction, compared to the control without mitomycin C.
The presence of induced phages was confirmed by transmission electron microscopy (TEM). Briefly, the phage lysate was filtered through a 0.45 µm syringe filter and 1 ml was centrifuged at 24,000 × g for 1 h at 4 °C (Beckman). The supernatant (approximately 800 µl) was gently discarded and the remaining lysate (approximately 200 µl) was washed twice with 800 µl of ammonium acetate (0.1 M, pH 7.5); after each wash, the sample was centrifuged (1 h at 24,000 × g at 4 °C) and the supernatant was discarded. Next, 10 µl of the remaining phage solution (200 µl) was mixed with 10 µl of 2% uranyl acetate and deposited on a nickel, Formvar-carbon-coated grid (Pelco International). The liquid was removed after 1 min by touching the edge of the grid with blotting paper. Phage morphology was observed at 80 kV using a JEOL1230 transmission electron microscope (Plateforme d'Imagerie Moléculaire et Microscopie of the Université Laval). Capsid size, tail length and tail width were determined by measuring at least 10 phage specimens 31 . The phage was named PLg-TB25.
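Dimensions reported in the form "mean ± deviation" from repeated measurements can be summarised as in the following Python sketch. It assumes that the reported deviation is the sample standard deviation, which the text does not specify; the function names and the example values are hypothetical and are not measurements from this study.

```python
import statistics


def mean_sd(measurements):
    """Return (mean, sample standard deviation) of a list of measurements.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(measurements) < 2:
        raise ValueError("at least two measurements are required")
    return statistics.mean(measurements), statistics.stdev(measurements)


def format_mean_sd(measurements, unit="nm"):
    """Format measurements as 'mean ± sd unit', rounded to whole units."""
    mean, sd = mean_sd(measurements)
    return f"{mean:.0f} ± {sd:.0f} {unit}"
```

For instance, the illustrative values `[58, 60, 62]` give a mean of 60 and a sample standard deviation of 2, formatted as "60 ± 2 nm".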
Phage DNA extraction. DNA of phage PLg-TB25 was isolated as described previously 42 , with the modifications described here. After DNase treatment to remove free DNA in the phage lysate, the DNase was inactivated at 65 °C for 30 min. To facilitate the release of phage DNA from the capsid, 200 µl of SDS (20% stock solution) was added, along with 20 µl of proteinase K (stock solution: 20 mg/ml), and samples were incubated at 37 °C for 15 min and then at 60 °C for 30 min.
To sequence the genome of phage PLg-TB25, 90 ml of induced lysate was filtered, and polyethylene glycol (8000, 10% final concentration) and NaCl (final concentration of 0.6 M) were added to the lysate. This mixture was centrifuged at 24,000 × g (Beckman) for 1 h at 4 °C. The phage pellet was resuspended in 1 ml of phage buffer (10 mM Tris-HCl pH 7.4, 100 mM NaCl, 10 mM MgSO4) and treated with SDS/proteinase K as described above. The DNA was purified using an UltraClean™ Microbial DNA Isolation Kit (MO BIO Laboratories, Inc.).
Phage DNA sequencing and analysis. A PLg-TB25 sequencing library was first prepared with the Nextera XT DNA Sample Prep Kit (Illumina) according to the manufacturer's instructions. The library was sequenced using a MiSeq system (2 × 250 nt paired-end). De novo assembly was performed with the ABySS v1.5.2 assembler and CLC v7. Open reading frame (ORF) prediction was carried out using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) and RAST Server 43 . An ORF was considered valid only if it started with AUG, UUG or GUG and coded for at least 30 amino acids (aa). The presence of a ribosomal binding site (RBS) similar to the standard Shine-Dalgarno sequence (AGGAGA) was also determined. Functions and domains were attributed by comparison of the translated products with the database using BLASTp 44 . PSI-BLAST and InterProScan at EMBL-EBI (http://www.ebi.ac.uk/) were used to search for more distant homologous proteins and conserved domains, respectively. The ProtParam tool (http://web.expasy.org/protparam/) was used to determine theoretical molecular masses (MM) and isoelectric points (pI) of the deduced phage proteins. Transfer RNAs (tRNAs) were predicted using the tRNAscan-SE server 45 and confirmed using the ARAGORN program 46 . Virulence Factor Databases 47 , together with DBETH 48 , were used to search for virulence factors. Online bioinformatics tools were used with the default settings. Prophage and phage genome maps were generated with BioEdit (http://www.mbio.ncsu.edu/bioedit/bioedit.html) and manually edited in Adobe Illustrator.
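The ORF acceptance criteria stated above (start codon AUG, UUG or GUG; at least 30 amino acids; an upstream sequence resembling AGGAGA) can be expressed as a short Python sketch. This is not the implementation used by ORF Finder or RAST. It works on DNA, so the start codons appear as ATG, TTG and GTG, and the function names and the `max_mismatches` parameter are hypothetical additions, since the text does not define how similar to AGGAGA a candidate RBS must be.

```python
START_CODONS = {"ATG", "TTG", "GTG"}  # DNA equivalents of AUG, UUG, GUG
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_AMINO_ACIDS = 30


def is_valid_orf(dna, min_aa=MIN_AMINO_ACIDS):
    """Check a candidate ORF (start codon through stop codon, coding strand).

    Valid if it begins with an accepted start codon, ends with a stop codon,
    has no in-frame internal stop, and codes for at least min_aa amino acids.
    """
    seq = dna.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] not in START_CODONS or codons[-1] not in STOP_CODONS:
        return False
    coding = codons[:-1]  # the stop codon does not encode an amino acid
    if any(codon in STOP_CODONS for codon in coding):
        return False
    return len(coding) >= min_aa


def has_rbs(upstream, motif="AGGAGA", max_mismatches=0):
    """Report whether the upstream region contains the motif.

    max_mismatches is a hypothetical tolerance for 'similar' sites;
    0 requires an exact match.
    """
    seq = upstream.upper()
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            return True
    return False
```

As an example, a start codon followed by 29 sense codons and a stop codon passes `is_valid_orf`, whereas the same construct with only 28 sense codons after the start does not; `has_rbs("TTAGGTGATTT")` is false with an exact match and true when one mismatch is allowed.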