Introduction

Viruses that infect bacteria - bacteriophages (phages) - play significant roles in the evolution of bacteria, at both the individual and community levels1. As agents of horizontal gene transfer (HGT), phages can enhance the fitness of their host cells in the form of lysogenic conversion and/moron genes, for instance by providing so-called auxiliary metabolic genes (AMGs)2 as well as virulence or pathogenicity traits3. Moreover, phages function in the biological ‘warfare’ among neighboring bacterial cells and can modulate the formation of bacterial biofilms at the population level4.

Prophages - temperate phages that occur in an integrated form in the bacterial genome - are often present in considerable amounts in bacterial genomes. For example, a recent study5 of 69 Escherichia and Salmonella genomes revealed prophages to occupy up to 13.5% of the genome of Escherichia coli O157:H7 [strain EC4115] and up to 4.9% of that of Salmonella Newport strain SL254. Such prophages, when intact, may be induced from the host genome, yielding phage progeny in lysates. This may occur as a response to stress, for instance resulting from exposure to UV6, 7, hydrogen peroxide8 or mitomycin C (MMC)9. Moreover, prophages can be ‘spontaneously’ induced, which implies that cues of unknown nature may have been at the basis of induction10. However, many potential prophages are, to different extents, defective or ‘cryptic’, as they have been subjected to genetic erosion (degradation and deletion) processes11,12,13. Such defective prophages may endow their hosts with gene repertoires that allow survival in harsh environments14.

The extant abundance of phages, as compared to their bacterial hosts, is often astounding12, 15. However, we have so far only just scratched the ‘tip of the phage iceberg’. Moreover, whereas most studies on phages have been made in aquatic ecosystems2, 16,17,18, those in soil have been lower in number or have only just emerged19, 20.

Members of the genus Burkholderia exhibit a tremendous phenotypic diversity and they inhabit diverse ecological settings21, ranging from soil22, 23 to plants and humans24. A recent study divides Burkholderia into two clades, in which clade I contain all pathogenic Burkholderia species and clade II mainly so-called “environmental” bacteria. Clade II was renamed Paraburkholderia 21, 25. This genus encompasses members with the largest genomes among all known bacteria. Such genomes may have resulted from frequent HGT events and potential selection26. Zhang et al.27 recently provided arguments for the tenet that the mycosphere, in the light of the bacterium-‘feeding’ fungus and the multitude of active bacteria occurring there, constitutes a true arena that fosters HGT. Hence, there is great interest in digging deeper into the genetic legacies of such events in mycosphere dwellers. Nazir et al.28 described a suite of truly fungal-interactive Paraburkholderia strains, including P. terrae strains BS001, BS007, BS110 and BS437, and P. phytofirmans BS455. Analysis of the 11.5 Mb large genome of the then selected P. terrae BS001 - in comparison with other similar genomes - revealed 96% of it to belong to the non-core [variable] part26. Some evidence was presented for the presence of phage-typical integrases, along with other phage-related genes, raising the question whether phages could facilitate HGT in this organism.

In this study, we hypothesized that prophage sequences present in some of the aforementioned fungal-interactive Paraburkholderia strains can give rise to phage populations that foster adaptive processes in Paraburkholderia in the mycosphere. We thus first screened the mycosphere (and corresponding bulk soil) for free phages and then – in a search for prophages – examined the genomes of mycosphere-derived Paraburkholderia strains. Indeed, evidence was found for the presence of putative prophage and phage-like elements in several genomes. We then focused on a predicted full-phage sequence found in P. terrae strain BS437, the data of which are presented here. To the best of our knowledge, this is the first study that isolates induced prophage from Paraburkholderia isolated from the mycosphere.

Results

Screening of mycosphere and bulk soil samples for free Paraburkholderia phages

Given the fact that previous studies28, 29 revealed a prevalence of Paraburkholderia types (in particular P. terrae) in the mycospheres of different soil fungi, we first screened two freshly-sampled mycospheres (Scleroderma citrinum and Galerina spp.) for the presence of phage particles able to produce plaques on selected strains of Paraburkholderia spp. including P. terrae, P. phytofirmans, P. caribensis, P. hospita and P. terricola (for details of the strains, see Table S1). Both direct extracts and fivefold phage-enriched ones (See Materials and Methods) were tested. This first attempt to detect phages that, in a lytic or temperate manner, productively interact with any of the selected Paraburkholderia species was done using the classical double-agar-layer [DAL] method30 and spot tests. Unfortunately, neither the crude phage preparations from the mycosphere as well as bulk soil samples nor the phage enrichments showed any single plaques or lysis zones across all assays that were performed. This indicated an insufficiently low titer of virions in the extracts that were able to produce detectable clear or turbid plaques on the lawns of indicator bacteria used (Table S1).

Analysis of putative prophage regions across Paraburkholderia genomes

In the light of the presumed low prevalence of free phage particles in the mycosphere as well as bulk soil samples, we then examined the putative presence of integrated phage. For that, we analyzed the genomes of the mycosphere-derived P. terrae strains BS001, BS007, BS110 and BS437, as well as of P. phytofirmans strains BS455, BIFAS53, J1U5 and PsJN, for the presence of putative prophage-like (PP) elements (Table S2). For this, we used the phage identification programs PHAST31, Prophinder/ACLAME32 and PhiSpy33. By applying the criteria (see Material and Methods), we identified a total of 209 PP regions across the eight Paraburkholderia genomes. Following curation, 127 of the regions remained for further analyses (Tables 1 and S2). Most of these predicted prophage regions (Table S2) were interpreted as putative legacies of previous phage insertions, as they appeared to have lost essential phage core genes13.

Table 1 Genome ɸ437 assignment.

Across the P. terrae strains, P. terrae BS007 had the largest (11.8%), and P. terrae BS001 the lowest (8.17%) total amount of PP region. P. terrae strain BS007 also harbored the largest PP (encoded ɸ007-5), of about 205.2 Kb. For the P. phytofirmans strains examined, P. phytofirmans J1U5 had the largest (13.4%) and P. phytofirmans BIFAS53 the lowest (2.7%) total amount of PP region. P. phytofirmans strain J1U5 harbored the highest PP number, i.e. 27. In contrast, P. phytofirmans PsJN only carried two identifiable PP regions, i.e. (encoded by us) ɸPsJN-2 (63.1 Kb) and ɸPsJN-3 (15.7 Kb). P. phytofirmans strain BIFAS53 contained the smallest identifiable PP (ɸBIFAS53-4), of about 10.5 Kb (Table S2).

For the next phase of this study, (1) only complete phage regions that could be predicted to form phage progeny, and (2) were consistently detected by all three programs, were further analyzed. It should be noted here that both PHAST and PhiSpy indicated the presence of one complete prophage in each of P. phytofirmans BS455 and PsJN. These regions however were excluded, as we placed a focus on the fungal-interactive Paraburkholderia terrae. Very convincingly, all three programs indicated that one full PP region was present in P. terrae BS437, with size of about 43.6 Kb (positions 6888478 to 6932098); this prophage, tentatively denoted as ɸ437, thus formed the focus of the next parts of this study.

Bacteriophage induction in P. terrae BS437

Given the finding of the ɸ437 encoding sequence in the P. terrae BS437 genome, cultures of this organism were screened for the presence of virions, using induction with different levels of MMC, in comparison to a control (to address spontaneous release; Fig. 1). We took a significant decrease of the OD600 in the BS437 cultures, following addition of MMC, as an indication that prophage had been induced to excise from the host genome, resulting in production of enhanced levels of phage progeny. Indeed, MMC had a population-reducing effect, as measured by the OD600 of the cultures, with higher levels of MMC resulting in stronger decreases of the OD600. Specifically, mid-log-phase cultures - upon treatment with 10 μg/mL MMC - showed significant decreases (ANOVA n = 3, P < 0.05) of the OD600 as compared to the untreated control up to 14 h. In the control, at 10 h, exponential growth was found, with the stationary phase at 18 h being followed by a slow decrease of optical density at 24 h (Fig. 1a).

Figure 1
figure 1

(a) Prophage induction and (b) quantitative PCR of the progeny. The MMC was added with a different concentration, MMC-4 (4 µg/mL) (■), MMC-10 (10 µg/mL) (♦) and without MMC/control (▲) to exponential-growing cell (10 hour; indicated with red arrow) of incubation in LB medium at 28 °C. Sample from 10 hour, 16 hour, 20 hour and 24 hour were used for quantitative PCR, error bars indicated s.d. values (n = 3). Significant of the treatments are marked with letter (a,b) for P < 0.05.

TEM was then used to observe phage progeny in the MMC-induced lysates as well as controls, and to determine the morphology of the phage particles (Fig. 2a). First, phage particles were not observed in the controls, even after extensive screens. However, in the MMC-induced suspensions, homogeneous populations of virions were found. These particles were composed of isometric heads of ~50 nm in diameter, and contractile tails with base plates of about ~75 nm length. Two to three long tail fibers were also distinguishable. According to morphological classification criteria [ICTV - International committee on taxonomy of viruses], the phage can be classified as belonging to the Myoviridae, with typical contractile tail features.

Figure 2
figure 2

(a) The TEM image and approximate induced prophage measurement. Crude induced lysate was filtered with 0.22-µm-pore-size filter and centrifuged to pellet the cell derbies, then store in −20 °C for one night prior imaging. Image shows a typical Myoviridae family, the image also shows induced ɸ437 (red arrow) and ɸ437 that has lost its capsid structure (black arrow). The bar represents 100 nm. (b) genome sequence of ɸ437. Red arrows indicate phage lysis and lysogenic genes; blue arrows indicate phage structural genes (tail, capsid and fiber); green arrows indicate replication, recombination, repressor and phage related genes; gray arrows indicate hypothetical proteins. The black knobs indicate ρ-independent terminator and the bent arrows indicate putative promoters. The star indicates phage tRNA. (c) the attachment sites attP and attB of ɸ437. The att sites were analyzed using motif-finding tools MEME49. The attachment sites on the tRNA P. terrae BS437 (attB) and ɸ437 (attP) are shown.

Thus, high levels of MMC induced lysis of BS437 cells, albeit partially, which occurred concomitantly with the release of TEM-detectable phage particles (Fig. 2). We then tested the potential infectivity of the released phage particles using the DAL method and spot test with diverse indicator hosts (Table S1), including P. terrae BS437. In several attempts (adding different concentrations of helper salts MgCl2, MnCl2 and CaCl2), the phage lysates did not give rise to any plaque on the different hosts tested. We also examined whether any integration event had taken place on selected hosts, using suites of 20 host clones taken from the areas where lysates were spotted (Fig. S1). The clones were PCR-screened using phage ɸ437 major capsid specific primers (see Material and Methods). The results showed that any integration event that might have occurred was below the detection limit of the applied method.

Linking the phage particle population to prophage ɸ437 specific genes using qPCR. We estimated that the phage lysates, estimated to have raised number of phage particles per ml (about 108 in the MMC-10 induction), contained dominant phage ɸ437 particles. To examine this tenet, we thus developed and performed phage ɸ437 based real-time quantitative PCR34, 35, on extracts prepared from the control and the MMC (4 µg/mL and 10 µg/mL) induced phage lysates.

The results confirmed that the phage ɸ437 progeny levels increased over time in correspondence with the MMC concentration, with the highest gene copy number being 7.60 × 108 per ml at 20 h with MMC-10 induction (ANOVA significant n = 3, P < 0.05). On the other hand, in the control (no MMC induction), the copy numbers were consistently low, i.e. about 6.88 × 106 per ml at 20 h (Fig. 1b). This result indicated that (1) phage ɸ437 - upon MMC induction - is indeed induced from the BS437 genome by MMC to form progeny, and (2) it most likely concurs with the phage particles visualized by TEM, as described in the foregoing. Furthermore, we found a consistent presence of about 106 to 107 copies of the gene for the phage ɸ437 major coat protein in the control, indicating spontaneous release of phage particles; as yet, we still do not understand what type of ‘cue’, e.g. partial/incidental stress, may have caused such release.

Detailed analysis of the genome of phage ɸ437

The genome of phage ɸ437, as evidenced from virion population sequencing, was approximately 54 Kb in size, with GC-content of about 60.31%. This is slightly below the GC content of the host bacterium P. terrae BS437, of about 61.78%. Based on RAST annotation, the phage ɸ437 genome was found to contain 90 predicted open reading frames (ORFs), with 63 ORFs having more than 100 bp, 83 ORFs having start codon ATG (92%), four GTG (4%) and three TTG (3%). The identified PP region in BS437 (using our criteria, see materials and methods) was smaller than the sequenced genome of ɸ437. However, we did find that the PHAST-identified PP region had about 54 Kb in recent analysis. The comparison of the initially-identified smaller region with the sequenced ɸ437 genome is shown in Fig. S2.

The biggest predicted gene in the genome of phage ɸ437 was orf88, of 1,688 bp (563 amino acids - aa). The predicted gene product was identified as a portal protein, which enables DNA passage during ejection and virion assembly. The predicted protein had 35% homology [90% coverage] to a similar one from Xylella phage Sano (AHB12085). The smallest gene (orf31) had only 126 bp (41 aa), and the predicted protein had 61% homology [43% coverage] to a tail fiber protein of Escherichia phage 64795_ec1 (YP_009291518). Interestingly, more than half of the genes of the ɸ437 genome (53 genes, 59%) were predicted to encode hypothetical proteins (Table 1), with no designated phage sequences. This indicates the phage is indeed a repertoire of novel genes. To assign functions to the hypothetical gene products, PSI-BLASTP and Phyre2 were used (see Materials and Methods), as detailed in the following.

Predicted genes encoding proteins that determine phage lifestyle

Phage ɸ437 was predicted to have a predominantly temperate lifestyle in its natural setting, as first evidenced by the fact that it was detected as a complete prophage. This tenet was also supported by PHACTS-supported and genomic analyses that showed the presence of typical genes involved in lysogeny. First, the phage ɸ437 genome encodes a predicted integrase (orf47), with 33% homology [40% coverage] to Pseudomonas phage D3 integrase (NP_061531). This integrase belongs to the tyrosine recombinase family, and a typical family representative is the phage lambda integrase36. We also found a tRNA sequence in the intergenic region adjacent to the integrase-encoding gene (Fig. 2b,c). We predict this site to be the phage integration site12, 36. A second piece of evidence for the prophage lifestyle of ɸ437 was the presence of phage lambda-like repressor genes (orf27), next to an antirepressor (orf65), indicating the presence of a system designed to ‘hold’/’release’ the integrated form.

Tail component and DNA packaging genes

As shown in Table 1, 28 phage ɸ437 morphogenesis genes were found, i.e. orf3, 5, 23, 28, 31, 48, 51, 54, 56, 57, 62, 70–77, 79, 80, 82, 84–88 and 90. PSI-BLASTP analyses of these genes showed homologies with database entries at 24–61% similarity and at coverages of 16–100%. PSI-BLASTP and Phyre2 analyses revealed that some ORFs encoded hypothetical morphological proteins. Thus, predicted tail fiber protein (orf56) showed 45% homology [75% coverage] with a gene of Salmonella phage IME207 (YP_009322735). Phage ɸ437 also contained ORFs predicted to encode several baseplate proteins (orf51, 53, 72 and 74). Thus orf51 and orf72 may encode baseplate assembly proteins, as they showed 31% [26% coverage] and 32% homologies (89% coverage) with such ORFs from Escherichia phage Lw1 (YP_008060715) and Shigella phage SfIV (YP_009147467), respectively. Fifteen ORFs were predicted to encode a suite of tail proteins (orf3, 23, 31, 54, 56, 57, 62, 70, 71, 73, 75, 76, 77, 79 and 80), next to a tail sheath protein (orf80). The latter showed 42% homology [100% coverage] to a putative tail protein in Enterobacteria phage SfI (YP_009147459). The products of orf84 through orf90, next to orf48 and orf5, were predicted to be involved in the packaging of DNA and in capsid formation (Table 1); the major capsid protein (orf84) showed 38% homology [99% coverage] to a similar protein from Aurantimonas phage AmM-1 (YP_009146944). The phage ɸ437 genome also contained a putative ORF encoding a portal protein (orf40) as well as an ORF for a large terminase subunit protein (orf42). These proteins showed 35% [90% coverage] and 28% homology [30% coverage] to their database counterparts, respectively. These proteins are all essential in phage DNA packaging processes.

The phage ɸ437 genome - comparison to related sequences and phylogenetic tree

In this analysis, a holistic approach was used, in which phylogenetic and overall DNA and protein sequence identities were used as the criteria. First, BLASTN analyses of the ɸ437 genome showed no similarity of the whole sequence to sequences present in the viral (tailed-phage) database. Subsequent PSI-BLASTP analyses revealed that proteins encoded by 19 of the 90 genes of the ɸ437 genome showed best hits to proteins encoded by other Burkholderia phages (Table 1). We thus compared the ɸ437 genome sequence to those of known Burkholderia phages (see Materials and Methods) using progressiveMauve37, pairwise comparisons and nucleotide dot-plot analyses. The progressiveMauve analyses showed non-colinear synteny of the P. terrae phage ɸ437 sequence with those of other Burkholderia phages (Fig. 3). Then, pairwise comparisons of the phage ɸ437 sequence to those of Burkholderia virus E125 (AF447491) and B. pseudomallei 1026b (AY453853) [both with similar genome sizes, i.e. 53.4 Kb and 54.8 Kb] confirmed the similarity, at a very low level, of phage ɸ437 with other Burkholderia phages (Fig. 4). Finally, the dot-plot analyses also showed low similarities among the compared sequences (Fig. S3). Collectively, these results supporting the BLASTN and PSI-BLASTP analyses.

Figure 3
figure 3

The Multiple genome alignment of Burkholderia phages. Genome were compare using progressiveMauve software, the genome homologous indicates by the local coliner blocks (LCB) and connected with lines. The analysis included known Burkholderia phages from Myoviridae, Siphoviridae and Podoviridae. The ɸ437 is indicated by red arrow.

Figure 5
figure 4

Phylogenetic trees of phage ɸ437 for (a) lysozyme, (b) major capsid, (c) portal, (d) tail sheath, (e) tail length tape measure and (f) phage terminase large subunit. Phylogenetic tree were generated with neighbor-joining tree Mega version 7 with 1,000 boothstrap method and p-distance methods. Red arrows indicate ɸ437. PSI-BLASTP best hits, coupled with other known Burkholderia phages were used in the analysis.

Phylogenetic analyses were then performed on the basis of selected proteins encoded by ɸ437, using MEGA7 [see Materials and Methods]. We thus analyzed phage hallmark genes, i.e. those encoding (1) lysozyme, (2) the major head capsid protein, (3) the portal, (4) the tail sheath protein, (5) the tail length tape measure protein and (6) the phage terminase large subunit. The closest hits to these proteins were most often proteins predicted from other phages (Fig. 5). The trees thus consistently pointed to a relatedness of ɸ437 to other phages. However, the phage ɸ437 proteins were phylogenetically quite distantly related to similar proteins from other phages. Specifically, the phage ɸ437 encoded 150-aa lysozyme had 40% homology [94% coverage] to similar proteins encoded by Idiomarinaceae phage Phi1M2-2 (YP_009104271), classified to the family Siphoviridae. Moreover, the 354-aa major capsid protein showed 38% homology [99% coverage] to a similar protein encoded by Aurantimonas sp. phage AmM-1 (YP_009146944), which was classified to the family Caudoviridae. The 562-aa portal protein had 35% homology [90% coverage] to a similar protein encoded by Xylella phage Sano (AHB12085), classified to family Siphoviridae. The 496-aa tail sheath protein had 42% homology [100% coverage] to a tail sheath protein from Enterobacteria phage SfI (YP_009147459), classified to the family Myoviridae. The 520-aa tail length tape measure had 28% homology [28% coverage] to a similar protein from Burkholderia phage BcepB1A (YP_291174), classified to the family Myoviridae. Finally, the 416-aa phage terminase large subunit had 44% homology [81% coverage] to a similar protein from Acidithiobacillus phage AcaML1 (AFU62879), classified to the family Myoviridae. These results show an overall consistent yet low level of similarity to proteins from known phages, indicating (1) phage ɸ437 predicted proteins are related to similar ones from phages, and (2) overall, phage ɸ437 is only remotely related to any known phage.

Figure 4
figure 5

Comparison of (a) Burkholderia virus E125 (AF447491), (b) Paraburkholderia terrae phage ɸ437,and (c) B. pseudomallei phage 1026b (AY453853). Color boxes are indicated as previous figure with additional. Comparison percentage was generated using BLAST + 2.4.0 (tBLASTx with cutoff value 10−3) and map comparison figures were created with Easyfig as indicated in material and methods. Gene similarity percentage is indicated in gray scale bar.

Phage core genes versus predicted morons

Given the large genetic distance of most ɸ437 genes to genes of known phages (Fig. 3), it was difficult to identify morons in the ɸ437 genome sequence. However, some genes with features that were strongly suggestive of morons were found (Fig. 2b). In this study, we applied strict criteria for protein-encoding regions to be considered to constitute a moron: (1) they potentially give fitness advantages to the host and do not constitute phage core genes, (2) they are flanked by an upstream σ70 promoter and a downstream ρ-independent transcriptional terminator, allowing autonomous transcription38. Genes meeting criterion (2) were found in several putative intergenic regions (Fig. 2b). As a third criterion (criterion 3), we used the fact that morons often have GC-contents different from those of neighboring sequences38. Thus, orf64 was singled out as a potential moron; the region was identified as a so-called amrZ (alginate and motility regulator Z)/Arc domain. PSI-BLASTP analysis showed orf64 has 55% homology [79% coverage] with a similar protein present in Pseudomonas phage SM1 (ALT58107). This result was supported by Phyre2 analysis (Table S3). Furthermore, 55% homology [80% coverage] - with 100% confidence – was found with ‘alginate and motility regulator Z’ found in Pseudomonas aeruginosa. The orf64 encoding transcriptional factor AmrZ was homologous to the Pseudomonas phage SM1 (ALT58107) Arc domain which had been shown to regulate virulence during infection39. This factor is also essential for biofilm formation in Pseudomonas aeruginosa.

Discussion

In spite of the apparent selection and outgrowth in mycosphere soils of the Paraburkholderia types used as phage hosts, to our surprise we could not detect any phage that was productive (including highly lytic to temperate modes of action) on these. This indicated that such phage populations, if present, were very low in number, so that they were not detectable by the classical DAL or related spot tests. Alternatively, our indicator bacteria (Table S1) may have had effective defense systems against the extant phage populations, which may have included R-M, CRISPR or BREX systems1, 40. Finally, the conditions that allow such phages to proliferate on DAL plates may not have been established in our screens. We thus set out to analyze the genomes of several selected mycosphere-isolated Paraburkholderia strains for predicted prophage sequences using currently accepted bioinformatics tools.

The analysis of the genomes of our Paraburkholderia strains to identify prophages/phage-like elements (PP) showed evidence for the contention that all of the analyzed sequences contain substantial amounts of prophage regions. Most of the identified PP regions turned out to be remnants of a phage ‘history’, as previously discussed12, 13. These regions have probably been subjected to (stochastically acting) selective deletion pressures from the host cell, which may indicate their infrequent (re)selection. When phage structural machinery genes get eroded, prophages lose their abilities to produce progeny. Such prophages might still be coding and remain functional as they offer lysogenic conversion to host cell11 or they increasingly might represent ‘passive genetic cargo’ that is not transcribed12. With respect to the identified phages, such hypotheses surely need experimental evidence.

A certain prevalence of prophages in the Paraburkholderia genomes was expected considering the fact that these Paraburkholderia species can inhabit the mycosphere, an environment that has been depicted as a hot spot for HGT processes in soil27. So far, only few studies have successfully described phages from Burkholderia (and/or Paraburkholderia) spp.41,42,43,44,45. However, most phages described were from pathogenic strains isolated from clinical environments, i.e. B. cepacia complex isolates. To the best of our knowledge, no previous studies have as yet focused on Paraburkholderia phages in environmental isolates, especially from the mycosphere. We here singled out the P. terrae strain BS437 phage ɸ437, on the basis of the experimental and computational analysis, as outlined in the foregoing.

Phage ɸ437 was apparently ‘spontaneously’ released in strain BS437 populations growing in liquid medium, whereas its particle numbers were raised by successful induction with MMC (Fig. 1a). These observations were supported by the concomitant phage coat gene based qPCR analyses and TEM observations (Figs 1b and 2a). However, we did not detect any infective phage particles by the DAL or spot tests applied to phage lysates, which may be due to (1) the absence of infectivity in our phage lysate, or (2) an intrinsic resistance or insusceptibility of host cells to released phages, as previously observed in other study. Notably, 45 strains of Clostridium difficile also failed to show infective phage production using the DAL method9. The isolation, propagation and downstream analysis of phages from natural samples remain a challenge46. The absence of detectable phage activity in the spot tests clearly excluded a lysis-from-without scenario under these conditions.

The spontaneous prophage induction that was observed in the liquid controls used [non-MMC induction] (Fig. 1a), if occurring in natural settings, might have an impact on host fitness10. We hypothesized that ɸ437 might modulate the formation of P. terrae BS437 biofilms on its fungal host strain, which we presume to be akin to P. terrae strain BS001 forming biofilms on Lyophyllum sp. strain Karsten47. However, experimental work still needs to be done to prove this theory. Collectively, the significant decrease of the OD600 in strain BS437 cultures upon MMC induction, the phage progeny observed by TEM, and the increased gene copy number of the ɸ437 major capsid gene strongly indicate that phage ɸ437 was the major, if not only, phage that was released from the genome of P. terrae BS437.

The genomic architecture of ɸ437, compared to Burkholderia virus E125 (AF447491) and B. pseudomallei 1026b (AY453853) indicated a strong conservation of a cluster of functional genes (phage core genes) in the same relative spatial position. Tail (orf70-orf80) and head (orf84-orf90) morphogenesis genes were among the most conserved genes in the ɸ437 genome. This is consistent with data by Morgan et al.48 and Summer et al.41, indicating that such conserved genes as well as gene order represents a phage gene repertoire that is fine-tuned to effectively execute key phage functions (as shaped by evolution). Moreover, the key functional genes may be better interchanged in the continuous flux of gene acquisition and recombination in the bacterial host genome. The analyses applied to assign the taxonomic class of ɸ437 show no large sequence similarity to any known phage sequences in the public database. However, the phylogenetic analysis of the selected phage hallmark genes (phage lysozyme, major capsid, portal, tail sheath, tail length tape measure and phage large terminase subunit) revealed ɸ437 to be most related to phages from the Myoviridae family. Moreover, the morphology of ɸ437 placed it in the Myoviridae. We thus propose ɸ437 as a new member of this family, with unique sequence features that do not relate to any of the currently ICTV-recognized subfamilies or genera.

The integration of phage ɸ437 is not well understood and does not fit classical integration mechanisms. We found the site/region of integration in the host bacterium and phage genome showed interrupted blocks, regardless of sequence identity. It is noteworthy that comparative studies of lambdoid bacteriophage genomes11 also revealed mosaicisms as a consequence of HGTs involving homologous and non-homologous recombinations49, 50. Additionally, moron genes have been reported to be common in Burkholderia phages44. Our analyses found one moron (orf64) that potentially endows the host with a superinfection defense mechanism against other phage infection, enhance host fitness and enhance biofilm formation. Considering this line of evidence, we hypothesize that the gene product potentially plays a role in the P. terrae strain BS437 interaction with a host fungus in the mycosphere, including biofilm formation. Although the significance of this potential moron still remains enigmatic at this point, this analysis gives direction for future experiments.

Materials and Methods

Phage isolation from soil and mycosphere samples

Replicate soil and mycosphere samples (Scleroderma citrinum and Galerina spp.) were obtained from a forest in Noordlaren in autumn 2015, and processed as in Zhang et al.27. Attempts to isolate phage from these samples were made using two methods. First, 0.5 g of each mycosphere sample was added to 5 ml of sterile water, after which the mixtures were vortexed vigorously. After one minute still, centrifugation at 100 xg (30 s) was done to sediment course soil particles. The collected supernatant was then spun at maximal speed (7,000 xg) for 15 min, to remove fine soil particles. Following this, 100 µL was filtered over Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA); the suspension was then added to 20 mL of LB (Sigma-Aldrich, St. Louis, Mo, USA), with 200 µL of overnight grown ‘indicator’ bacteria (Table S1). The suspensions were incubated overnight at 28 °C.

Method 2 consisted of directly adding 0.5 g soil or mycosphere sample to 20 mL LB broth and incubating overnight at 28 °C, to foster bacterial growth and potential phage development. Following incubation, the cultures were centrifuged at maximal speed (7,000 xg) for 10 min at 4 °C to pellet bacterial cells, and supernatants filtered over Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA). One mL of each filtered supernatant was then added to 3 mL indicator bacteria (Table S1) in LB medium, and incubated overnight at 28 °C. The resulting cultures were then centrifuged at maximum speed for 30 min at 4 °C and the filtered supernatants used for later cultures. The procedure was repeated five times, ultimately yielding a suspension that presumably contains phage particles51.

Prophage identifications across genomes

The genomes of the selected Paraburkholderia strains were screened for the presence of prophages by using PHAST31- version October 2015, Prophinder/ACLAME32- version 04, October 2015 and PhiSpy [PhiSpyNov11_3.2]33. PHAST and Prophinder identify prophage regions by using a database of known phage genes, sequence identification, tRNA identification (as phages often use tRNAs as target sites for integration), attachment site recognition and gene clustering density measurements (prophage regions can be identified as clusters of phage-like genes within a bacterial genome)31, 32. PhiSpy uses several distinct characteristics of prophages, as outlined in the following. First, the median length of predicted proteins; as the median protein lengths in phage regions is much higher than that of proteins in the bacterial genome. Additionally, the directionality of the transcription strand and the GC skew. Both directionality of the transcription strands and GC skew are correlated with the direction of replication. Most consecutive genes in phage genome tend to be encoded on the same strand, in contrast to bacterial consecutive genes. Any observed changes in GC skew might result from the insertion of foreign DNA. Also, the abundance of unique phage words is used, next to the phage insertion site (attP) and the similarity to known phage proteins33. We here also applied other criteria to define putative prophage-like (PP) regions: (1) PP of sizes below 10 Kb were discarded5, 11 and (2) when a region consistently appeared in all three independent analyses, we used the PHAST results, as PhiSpy was reported to give less consistent results52.

Bacterial growth and MMC-mediated prophage induction

Paraburkholderia terrae strain BS437 became the focus of this study. It was isolated from the mycosphere of Lyophyllum sp strain Karsten28 and is a current reference strain in our laboratory. The strain was grown in LB broth at 28 °C with shaking (180 rpm). Induction with MMC (Sigma-Aldrich, St. Louis, Mo, USA) was conducted according to Fortier and Moineau9, with modifications. Briefly, bacterial cells were introduced into 5 ml of LB medium and incubated overnight at 28 °C (shaking at 180 rpm). The resulting cultures were then transferred (1:100) into replicate Erlenmeyer flasks containing 40 ml of fresh LB medium and growth was monitored until the exponential growth phase (about 10 h incubation). Thereafter, all cultures were split into two 20 ml cultures. MMC was added to the cultures, at final concentrations of either 4 or 10 µg/mL (MMC-4, MMC-10, respectively), with the ‘twin’ culture serving as the control. The cultures were incubated and the OD600 was monitored for 24 h. Decreases of the cell density were taken as indications of progressive cell lysis and prophage release. The experiments were done with three biological replicates. The resulting crude lysates were finally filtered over Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA) and stored at −20 °C until further analysis.

Assessment of host range and indicator bacterial strains

For all phage activity tests, the double agar layer (DAL) method, next to a spot test, was used according to Adams30, with some modifications. In one effort, we used the extracted mycosphere and bulk soil directly with selected indicator Paraburkholderia strains (Table S1). Suspensions resulting from the fivefold enrichment with the same indicator bacteria were also used.

Spot or “lysis from without” assays were also used on the induced lysates. Briefly, overnight cultures of the indicator bacteria (Table S1) were poured onto R2A (Becton Dickinson, NJ, USA) plate agar. Then 5 µL (10−2, 10−3, 10−4, 10−5) diluted induced lysates were spotted onto the plate and the plates incubated overnight at 28 °C.

Quantitative PCR (qPCR)

Specific primer sets for detecting phage genes were developed as the indicator gene to verify the presence of phage ɸ437 in the induced lysate. We selected one phage ɸ437-specific gene: a major capsid protein using the P. terrae BS437 draft sequence26. Major capsid genes have been used to assess viral diversity (see review by Adriaenssens and Cowan53). This method followed the path taken to quantify ten closely related lambdoid phages of Escherichia coli strain K-1234, 35.

Here, we treated the induced lysates and the control (not treated with MMC) with DNase to remove any host genomic DNA (confirmed by host-specific PCR). Using the ɸ437 specific primer set, a 198 bp band was produced from P. terrae BS437 DNA, whereas no bands were amplified from genomic DNA of P. terrae strains BS001, BS007, BS110, 17804T or P. hospita DSMZ 17164T and P. caribensis DSMZ 1323T (Fig. S1a). Then, these strains were used to detect and quantify phage progeny in the induced lysates as described34, 35.

Briefly, induced cultures were centrifuged and filtered over Whatman 0.22 µm puradisc syringe cellulose acetate filters (GE Healthcare Life Sciences, Pittsburgh, PA, USA) to remove bacterial cells and debris. A drop of chloroform was added to 10-fold diluted filtrates. These were then centrifuged at 2700 xg for 10 min at 4 °C. Then, 2 units of DNaseI endonuclease (Sigma-Aldrich, St. Louis, Mo, USA) with 1.3 µL 10x reaction buffer (Sigma-Aldrich, St. Louis, Mo, USA) was added to 10 µL lysate and the mixture was kept at 37 °C for 1 h. Later 1.5 µL of stop solution (Sigma-Aldrich, St. Louis, Mo, USA) was added and the mixture incubated at 95 °C for 30 min to inactivate DNaseI and also to open up phage capsids. The resulting suspensions were then diluted 10 fold and stored at −20 °C for later analysis. Primers specific for the ɸ437 gene for major capsid protein were used (PP1.437_ca1F: 5′-CACGATGACACGATCCACAC-3′; PP1.437_ca1R: 5′-GAGAACCATGCCCTGAACC-3′). The qPCR reaction mixtures consisted of 12.5 µL SYBR Green (Applied Biosystems, CA, USA), 0.75 µL each primer (Eurogentec, Liège, Belgium), 10 µL ultrapure water and 1 µL sample, for a total 25 µL reaction volume. Amplification and detection of ɸ437 product were performed using ABI 7300 (ThermoFisher Scientific, Waltham, Mass, USA) with qPCR reaction conditions: denaturation at 95 °C for 30 sec, annealing at 60 °C for 1 min and elongation at 72 °C for 60 sec. The qPCR efficiency was 106%.

The examination of the presence of prophage within indicator hosts

Experiments were performed to test the potential integration of ɸ437 (Fig. S1) using spot tests with ɸ437 containing suspensions (titer estimated at 108 per ml) on several Paraburkholderia strains (P. terrae BS001, BS007, BS110, 17804T, P. hospita DSMZ 17164T and P. caribensis DSMZ 1323T) as previously explained. The top and bottom parts of each spots were later streaked onto the new R2A medium and incubated overnight at 28 °C. Colony PCR-based test using specific ɸ437 gene for major capsid protein (198 bp) were used and 20 single-colonies from each strains were tested. The isolated DNA of ɸ437 and the phage suspension produced from strain BS437 were used as positive controls, whereas the unspotted strains and E. coli K-12 were used as negative controls. The test was applied to potential host strain BS007, with 50 more single-colonies.

Phage particle concentration by polyethylene glycol (PEG) 8000

The induced phage particles were purified according to the PEG method of Sambrook and Russell54 with the following modifications. Induced phage lysate was centrifuged at 11,000 xg for 15 min at 4 °C, and then supernatants were filtered over a Whatman 0.22 µm puradisc syringe filter- cellulose acetate (GE Healthcare Life Sciences, Pittsburgh, PA, USA). NaCl (29.2 g) was dissolved into 500 mL lysates to final concentration 1 M, which was then stored on ice for 1 h. Solid polyethylene glycol (PEG) 8000 was added to the supernatant to a final concentration of 10% (w/v) and the mixture stored overnight at 4 °C to allow phage particles to precipitate. The PEG-precipitated lysate was then centrifuged at 11,000 xg for 10 min at 4 °C (Sorvall SLA-1500 rotor). The supernatants were discarded to 20 mL and 10x SM buffer (10 mM NaCl, 50 mM Tris, 10 mM MgSO4, and 0.1% gelatin) was added for storage and later analysis.

Phage DNA extraction and sequencing

Phage DNA extraction was performed with a Phage DNA Isolation Kit (Norgen, Biotek Corp, ON, Canada) using manufacturer’s protocols, with slight modification, i.e. DNase I inactivation temperature was 80 °C for 10 min. In addition, 16S rRNA PCR amplification using 16SFP/16SRP universal 16S rRNA gene primer set55 was performed to confirm the absence of genomic DNA in the phage DNA extracts. Aliquots of amplification products were electrophoresed in 1% agarose gels stained with ethidium bromide and visualized under UV illumination.

Phage DNA was sequenced on the Illumina HiSeq. 2500 paired-end by BaseClear (Leiden, Netherlands). The libraries for the strains were prepared using Illumina genomic Nextera XT libraries. The quality analyses of FASTQ sequence reads were done using the Illumina Casava pipeline version 1.8.3. The Initial quality assessment was based on data passing the Illumina Chastity filtering. Subsequently, reads containing PhiX control signal were removed using an in-house filtering protocol. In addition, reads containing (partial) adapters were clipped (up to minimum read length of 50 bp). The second quality assessment was based on the remaining reads using the FASTQC quality control tool version 0.10.0. The final quality scores per sample yielded 707,8049 reads, or 166 MB, at 37.45 average quality. Reads were then aligned and successfully assembled using the CLC genomics workbench 9 (Aarhus, Denmark) with the default parameters: mismatch cost 2, insertion cost 3, deletion cost 3, length fraction 0.5 and similarity 0.9.

RAST (Rapid Annotation using Subsystem Technology) was subsequently used to annotate the sequenced genome56. Predicted hypothetical proteins were checked with PSI-BLASTP and Phyre2 program57. Predicted amino acid sequences of genes with assigned function [and of those without] were analyzed against the non-redundant (nr) NCBI database and the tailed phages database by PSI-BLASTP. Phyre2 was used to predict secondary and tertiary structures (Table S2). To predict the lifestyle, PHACTS (uses a novel similarity algorithm to create a training set from known phage lifestyles and a random forest that classify a multitude of decision trees58) was used. Phage-bound σ70 promoters were predicted using predicted promoter tool (http://www.fruitfly.org/seq_tools/promoter.html) and ρ-independent terminators were identified using the Arnold terminator-finding program59. The analysis of tRNA in the phage genome was done using tRNAscan-SE60. The attachment (att) sites were analyzed using motif-finding tools MEME61. The PROBIUS prediction tool62 was used to predict transmembrane and signal peptide of genome ɸ437.

Transmission electron microscopy (TEM)

Viral particles were detected, and viral morphology examined by TEM (PHILIPS CM10). The phage stocks were directly applied onto carbon-coated nitrocellulose grids, and let it set for about a minute. The excess of liquid was drained with filter paper before negative staining with 1% uranyl acetate followed by washing and drying, before immediate observation in the TEM.

Genome comparison and phylogenetic trees

Known Burkholderia phages such as, Burkholderia cepacia phage Bcep22 (AY349011), B. cenocepacia phage BcepM(AY539836), B. cenocepacia phage BcepB1A (NC_005886), B. pseudomallei phage 1026b (AY453853), Burkholderia virus E125 (AF447491), Burkholderia phage BcepIL02 (FJ937737), Burkholderia phage 52237 (NC_007145), Burkholderia phage E202 (NC_009234), Burkholderia phage E255 (NC_009237), Burkholderia phage 644-2 (NC_009235), Burkholderia phage E12-2 (NC_009236), Burkholderia phage Bcep1 (NC_005263), Burkholderia phage Bcep43 (NC_005342), Burkholderia phage Bcep781 (NC_004333) and Burkholderia phage BcepNY4 (0096001), including Enterobacteria phage T4 (NC_00086), Enterobacteria phage Mu (NC_000929), Enterobacteria phage sfV (NC_003444), Enterobacteria phage P2 (AF063097), and Enterobacteria phage lambda (NC_001416), coupled with the PSI-BLASTP best hits for hallmark genes (phage lysozyme, major capsid, portal, tail sheath, tail length tape measure and phage terminase large subunit gene) were used to generated phylogenetic trees and molecular evolutionary analysis. Trees were analyzed using MEGA763. The comparison were performed with three different approaches, such as ProgressiveMauve37, pairwise comparison64 and dot-plot analysis63. Pairwise analysis generated by BLAST + 2.4.0 (tBLASTx with cutoff value 10−3) and map comparison figures were created with Easyfig.64. Dot-plot analysis was done using Gepard with default parameters65.