A novel inducible prophage from the mycosphere inhabitant Paraburkholderia terrae BS437

Bacteriophages constitute key gene transfer agents in many bacteria. Specifically, they may confer gene mobility to Paraburkholderia spp. that dwell in soil and the mycosphere. In this study, we first screened mycosphere and bulk soils for phages able to produce plaques, but found these to be below the detection limit. Then, prophage identification methods were applied to the genome sequences of the mycosphere-derived Paraburkholderia terrae strains BS001, BS007, BS110 and BS437, next to P. phytofirmans strains BS455, BIFAS53, J1U5 and PsJN. These analyses revealed all bacterial genomes to contain considerable amounts [up to 13.3%] of prophage-like sequences. One sequence predicted to encode a complete phage was found in the genome of P. terrae BS437. Using the inducing agent mitomycin C, we produced high-titered phage suspensions. These indeed encompassed the progeny of the identified prophage (denoted ɸ437), as evidenced by molecular detection of the phage major capsid gene. We obtained the full sequence of phage ɸ437, which, remarkably, had undergone a reshuffling of two large gene blocks. One predicted moron gene was found, and it is currently being analyzed to understand the extent of its ecological significance for the host.

Scientific Reports | 7: 9156 | DOI: 10.1038/s41598-017-09317-8
there is great interest in digging deeper into the genetic legacies of such events in mycosphere dwellers. Nazir et al. 28 described a suite of truly fungal-interactive Paraburkholderia strains, including P. terrae strains BS001, BS007, BS110 and BS437, and P. phytofirmans BS455. Analysis of the 11.5-Mb genome of the subsequently selected P. terrae BS001, in comparison with other similar genomes, revealed 96% of it to belong to the non-core [variable] part 26. Some evidence was presented for the presence of phage-typical integrases, along with other phage-related genes, raising the question of whether phages could facilitate HGT in this organism.
In this study, we hypothesized that prophage sequences present in some of the aforementioned fungal-interactive Paraburkholderia strains can give rise to phage populations that foster adaptive processes in Paraburkholderia in the mycosphere. We thus first screened the mycosphere (and corresponding bulk soil) for free phages and then, in a search for prophages, examined the genomes of mycosphere-derived Paraburkholderia strains. Indeed, evidence was found for the presence of putative prophage and phage-like elements in several genomes. We then focused on a predicted full-phage sequence found in P. terrae strain BS437, the data of which are presented here. To the best of our knowledge, this is the first study to report an induced prophage from a Paraburkholderia strain isolated from the mycosphere.

Results
Screening of mycosphere and bulk soil samples for free Paraburkholderia phages. Given that previous studies 28,29 revealed a prevalence of Paraburkholderia types (in particular P. terrae) in the mycospheres of different soil fungi, we first screened two freshly sampled mycospheres (Scleroderma citrinum and Galerina spp.) for the presence of phage particles able to produce plaques on selected strains of Paraburkholderia spp., including P. terrae, P. phytofirmans, P. caribensis, P. hospita and P. terricola (for details of the strains, see Table S1). Both direct extracts and fivefold phage-enriched ones (see Materials and Methods) were tested. This first attempt to detect phages that, in a lytic or temperate manner, productively interact with any of the selected Paraburkholderia species was done using the classical double-agar-layer [DAL] method 30 and spot tests. Neither the crude phage preparations from the mycosphere and bulk soil samples nor the phage enrichments showed any plaques or lysis zones in any of the assays performed. This indicated that the titer of virions able to produce detectable clear or turbid plaques on the lawns of the indicator bacteria used (Table S1) was below the detection limit of these assays.

Analysis of putative prophage regions across Paraburkholderia genomes. In light of the presumed low prevalence of free phage particles in the mycosphere and bulk soil samples, we then examined the putative presence of integrated phages. To this end, we analyzed the genomes of the mycosphere-derived P. terrae strains BS001, BS007, BS110 and BS437, as well as of P. phytofirmans strains BS455, BIFAS53, J1U5 and PsJN, for the presence of putative prophage-like (PP) elements (Table S2), using the phage identification programs PHAST 31, Prophinder/ACLAME 32 and PhiSpy 33. By applying our criteria (see Materials and Methods), we identified a total of 209 PP regions across the eight Paraburkholderia genomes. Following curation, 127 of the regions remained for further analyses (Tables 1 and S2). Most of these predicted prophage regions (Table S2) were interpreted as putative legacies of previous phage insertions, as they appeared to have lost essential phage core genes 13.
For the next phase of this study, only complete phage regions that (1) could be predicted to form phage progeny and (2) were consistently detected by all three programs were further analyzed. It should be noted here that both PHAST and PhiSpy indicated the presence of one complete prophage in each of P. phytofirmans BS455 and PsJN. These regions, however, were excluded, as we focused on the fungal-interactive Paraburkholderia terrae. All three programs indicated that one full PP region was present in P. terrae BS437, with a size of about 43.6 Kb (positions 6888478 to 6932098); this prophage, tentatively denoted ɸ437, thus formed the focus of the next parts of this study.
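The selection logic described here (a minimum region size plus agreement among all three programs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Region` class, the overlap rule and every coordinate except the PHAST call for ɸ437 (positions 6888478 to 6932098) are hypothetical.

```python
from dataclasses import dataclass

MIN_SIZE_BP = 10_000                       # PP regions below 10 Kb are discarded
TOOLS = {"PHAST", "Prophinder", "PhiSpy"}  # the three programs named in the text


@dataclass(frozen=True)
class Region:
    """One predicted prophage-like region (hypothetical container)."""
    tool: str
    start: int
    end: int

    @property
    def size(self) -> int:
        return self.end - self.start


def overlaps(a: Region, b: Region) -> bool:
    """True if the two coordinate ranges share at least one position."""
    return a.start <= b.end and b.start <= a.end


def consensus_regions(regions):
    """Keep PHAST calls of sufficient size that every tool also reports."""
    kept = []
    for r in regions:
        if r.tool != "PHAST" or r.size < MIN_SIZE_BP:
            continue
        supporting = {o.tool for o in regions if overlaps(r, o)}
        if TOOLS <= supporting:
            kept.append(r)
    return kept


calls = [
    Region("PHAST", 6_888_478, 6_932_098),       # coordinates from the text
    Region("Prophinder", 6_890_000, 6_930_000),  # hypothetical
    Region("PhiSpy", 6_885_000, 6_935_000),      # hypothetical
    Region("PHAST", 1_000_000, 1_006_000),       # hypothetical, too short
    Region("PHAST", 2_000_000, 2_030_000),       # hypothetical, PHAST only
]
selected = consensus_regions(calls)
print([(r.start, r.end, r.size) for r in selected])
```

With these inputs only the ɸ437 region survives, and its size (43,620 bp) agrees with the reported value of about 43.6 Kb.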
Bacteriophage induction in P. terrae BS437. Given the finding of the ɸ437-encoding sequence in the P. terrae BS437 genome, cultures of this organism were screened for the presence of virions, using induction with different levels of mitomycin C (MMC), in comparison to a control (to address spontaneous release; Fig. 1). We took a significant decrease of the OD600 in the BS437 cultures, following addition of MMC, as an indication that prophage had been induced to excise from the host genome, resulting in production of enhanced levels of phage progeny. Indeed, MMC had a population-reducing effect, as measured by the OD600 of the cultures, with higher levels of MMC resulting in stronger decreases of the OD600. Specifically, mid-log-phase cultures treated with 10 μg/mL MMC showed significant decreases (ANOVA, n = 3, P < 0.05) of the OD600 as compared to the untreated control up to 14 h. The control was in exponential growth at 10 h and reached the stationary phase at 18 h, followed by a slow decrease of optical density towards 24 h (Fig. 1a).
TEM was then used to observe phage progeny in the MMC-induced lysates as well as controls, and to determine the morphology of the phage particles (Fig. 2a). Phage particles were not observed in the controls, even after extensive screens. However, in the MMC-induced suspensions, homogeneous populations of virions were found. These particles were composed of isometric heads of ~50 nm in diameter, and contractile tails with base plates of about 75 nm in length. Two to three long tail fibers were also distinguishable. According to morphological classification criteria [ICTV, International Committee on Taxonomy of Viruses], the phage can be classified as belonging to the Myoviridae, with typical contractile tail features. Thus, high levels of MMC induced lysis of BS437 cells, albeit partially, which occurred concomitantly with the release of TEM-detectable phage particles (Fig. 2). We then tested the potential infectivity of the released phage particles using the DAL method and spot test with diverse indicator hosts (Table S1), including P. terrae BS437. In several attempts (adding different concentrations of the helper salts MgCl2, MnCl2 and CaCl2), the phage lysates did not give rise to any plaques on the different hosts tested. We also examined whether any integration event had taken place in selected hosts, using suites of 20 host clones taken from the areas where lysates were spotted (Fig. S1). The clones were PCR-screened using phage ɸ437 major capsid-specific primers (see Materials and Methods). The results showed that any integration event that might have occurred was below the detection limit of the applied method.

Significant differences between treatments are marked with letters (a, b) for P < 0.05.

Figure 2. (a) TEM image and approximate measurements of the induced prophage. Crude induced lysate was filtered through a 0.22-µm-pore-size filter and centrifuged to pellet the cell debris, then stored at −20 °C for one night prior to imaging. The image shows particles with typical Myoviridae morphology, including induced ɸ437 (red arrow) and ɸ437 that has lost its capsid structure (black arrow). The bar represents 100 nm. (b) Genome sequence of ɸ437. Red arrows indicate phage lysis and lysogeny genes; blue arrows indicate phage structural genes (tail, capsid and fiber); green arrows indicate replication, recombination, repressor and phage-related genes; gray arrows indicate hypothetical proteins. The black knobs indicate ρ-independent terminators and the bent arrows indicate putative promoters. The star indicates the phage tRNA. (c) The attachment sites attP and attB of ɸ437. The att sites were analyzed using the motif-finding tool MEME 49. The attachment sites on the tRNA of P. terrae BS437 (attB) and ɸ437 (attP) are shown.
Linking the phage particle population to prophage ɸ437 specific genes using qPCR. We surmised that the phage lysates, estimated to contain raised numbers of phage particles per ml (about 10^8 in the MMC-10 induction), were dominated by phage ɸ437 particles. To examine this tenet, we developed and performed phage ɸ437 based real-time quantitative PCR 34,35 on extracts prepared from the control and the MMC-induced (4 µg/mL and 10 µg/mL) phage lysates.
The results confirmed that the phage ɸ437 progeny levels increased over time in correspondence with the MMC concentration, with the highest gene copy number being 7.60 × 10^8 per ml at 20 h with MMC-10 induction (ANOVA, n = 3, P < 0.05). On the other hand, in the control (no MMC induction), the copy numbers were consistently low, i.e. about 6.88 × 10^6 per ml at 20 h (Fig. 1b). This result indicated that (1) phage ɸ437 is indeed induced from the BS437 genome by MMC to form progeny, and (2) it most likely corresponds to the phage particles visualized by TEM, as described in the foregoing. Furthermore, we found a consistent presence of about 10^6 to 10^7 copies of the gene for the phage ɸ437 major coat protein in the control, indicating spontaneous release of phage particles; as yet, we do not understand what type of 'cue', e.g. partial/incidental stress, may have caused such release.
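To put the two copy numbers reported at 20 h into proportion, the induction can be expressed as a fold change. The short Python sketch below uses only the two values given in the text; the variable and function names are illustrative.

```python
import math

# Major capsid gene copies per ml at 20 h, as reported in the text
induced_copies_per_ml = 7.60e8   # MMC-10 induction
control_copies_per_ml = 6.88e6   # control without MMC


def fold_change(treated: float, control: float) -> float:
    """Ratio of treated to control copy numbers."""
    if control <= 0:
        raise ValueError("control copy number must be positive")
    return treated / control


fold = fold_change(induced_copies_per_ml, control_copies_per_ml)
log10_fold = math.log10(fold)
print(f"{fold:.0f}-fold increase ({log10_fold:.2f} log10 units)")
```

The MMC-10 induction thus corresponds to a roughly 110-fold (about two orders of magnitude) increase over the spontaneous release seen in the control.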
Detailed analysis of the genome of phage ɸ437. The genome of phage ɸ437, as evidenced from virion population sequencing, was approximately 54 Kb in size, with a GC content of about 60.31%. This is slightly below the GC content of the host bacterium P. terrae BS437, of about 61.78%. Based on RAST annotation, the phage ɸ437 genome was found to contain 90 predicted open reading frames (ORFs), with 63 ORFs being longer than 100 bp; 83 ORFs had the start codon ATG (92%), four GTG (4%) and three TTG (3%). The PP region initially identified in BS437 (using our criteria, see Materials and Methods) was smaller than the sequenced genome of ɸ437. However, in a more recent analysis, the PHAST-identified PP region was about 54 Kb in size. The comparison of the initially identified smaller region with the sequenced ɸ437 genome is shown in Fig. S2.
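The summary statistics quoted here are simple to recompute. The sketch below, which is illustrative and not the annotation pipeline itself, shows a GC-content function and a tally of the reported start codon counts; the function name and the toy sequence are assumptions.

```python
def gc_content(seq: str) -> float:
    """GC content in percent, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


# Start codon counts for the 90 predicted ORFs, as reported in the text
start_codons = {"ATG": 83, "GTG": 4, "TTG": 3}
total_orfs = sum(start_codons.values())
shares = {codon: round(100.0 * n / total_orfs, 1)
          for codon, n in start_codons.items()}

print(total_orfs, shares)
print(gc_content("ATGCGC"))  # toy sequence, not phage data
```

The counts sum to the 90 ORFs reported, and the shares round to the percentages given in the text (92%, 4% and 3%).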
The largest predicted gene in the genome of phage ɸ437 was orf88, of 1,688 bp (563 amino acids, aa). The predicted gene product was identified as a portal protein, which enables DNA passage during ejection and virion assembly. The predicted protein had 35% identity [90% coverage] to a similar one from Xylella phage Sano (AHB12085). The smallest gene (orf31) had only 126 bp (41 aa), and the predicted protein had 61% identity [43% coverage] to a tail fiber protein of Escherichia phage 64795_ec1 (YP_009291518). Interestingly, more than half of the genes of the ɸ437 genome (53 genes, 59%) were predicted to encode hypothetical proteins (Table 1), with no match to designated phage sequences. This indicates that the phage indeed harbors a repertoire of novel genes. To assign functions to the hypothetical gene products, PSI-BLASTP and Phyre2 were used (see Materials and Methods), as detailed in the following.
Predicted genes encoding proteins that determine phage lifestyle. Phage ɸ437 was predicted to have a predominantly temperate lifestyle in its natural setting, as first evidenced by the fact that it was detected as a complete prophage. This tenet was also supported by PHACTS-based and genomic analyses that showed the presence of typical genes involved in lysogeny. First, the phage ɸ437 genome encodes a predicted integrase (orf47), with 33% identity [40% coverage] to Pseudomonas phage D3 integrase (NP_061531). This integrase belongs to the tyrosine recombinase family, a typical representative of which is the phage lambda integrase 36. We also found a tRNA sequence in the intergenic region adjacent to the integrase-encoding gene (Fig. 2b,c). We predict this site to be the phage integration site 12,36. A second piece of evidence for the prophage lifestyle of ɸ437 was the presence of a phage lambda-like repressor gene (orf27), next to an antirepressor gene (orf65), indicating the presence of a system designed to 'hold'/'release' the integrated form.
The phage ɸ437 genome: comparison to related sequences and phylogenetic tree. In this analysis, a holistic approach was used, in which phylogenetic and overall DNA and protein sequence identities were used as the criteria. First, BLASTN analyses of the ɸ437 genome showed no similarity of the whole sequence to sequences present in the viral (tailed-phage) database. Subsequent PSI-BLASTP analyses revealed that proteins encoded by 19 of the 90 genes of the ɸ437 genome showed best hits to proteins encoded by other Burkholderia phages (Table 1). We thus compared the ɸ437 genome sequence to those of known Burkholderia phages (see Materials and Methods) using progressiveMauve 37, pairwise comparisons and nucleotide dot-plot analyses. The progressiveMauve analyses showed non-colinear synteny of the P. terrae phage ɸ437 sequence with those of other Burkholderia phages (Fig. 3). Then, pairwise comparisons of the phage ɸ437 sequence to those of Burkholderia virus E125 (AF447491) and B. pseudomallei 1026b (AY453853) [both with similar genome sizes, i.e. 53.4 Kb and 54.8 Kb] confirmed the similarity, at a very low level, of phage ɸ437 with other Burkholderia phages (Fig. 4). Finally, the dot-plot analyses also showed low similarities among the compared sequences (Fig. S3). Collectively, these results support the BLASTN and PSI-BLASTP analyses.
Phylogenetic analyses were then performed on the basis of selected proteins encoded by ɸ437, using MEGA7 [see Materials and Methods]. We thus analyzed phage hallmark genes, i.e. those encoding (1) lysozyme, (2) the major head capsid protein, (3) the portal, (4) the tail sheath protein, (5) the tail length tape measure protein and (6) the phage terminase large subunit. The closest hits to these proteins were most often proteins predicted from other phages (Fig. 5). The trees thus consistently pointed to a relatedness of ɸ437 to other phages. However, the phage ɸ437 proteins were phylogenetically quite distantly related to similar proteins from other phages. Specifically, the phage ɸ437 encoded 150-aa lysozyme had 40% identity [94% coverage] to similar proteins encoded by Idiomarinaceae phage Phi1M2-2 (YP_009104271), classified in the family Siphoviridae. Moreover, the 354-aa major capsid protein showed 38% identity [99% coverage] to a similar protein encoded by Aurantimonas sp. phage AmM-1 (YP_009146944), which was classified in the order Caudovirales. The 562-aa portal protein had 35% identity [90% coverage] to a similar protein encoded by Xylella phage Sano (AHB12085), classified in the family Siphoviridae. The 496-aa tail sheath protein had 42% identity [100% coverage] to a tail sheath protein from Enterobacteria phage SfI (YP_009147459), classified in the family Myoviridae. The 520-aa tail length tape measure protein had 28% identity [28% coverage] to a similar protein from Burkholderia phage BcepB1A (YP_291174), classified in the family Myoviridae. Finally, the 416-aa phage terminase large subunit had 44% identity [81% coverage] to a similar protein from Acidithiobacillus phage AcaML1 (AFU62879), classified in the family Myoviridae.
These results show an overall consistent yet low level of similarity to proteins from known phages, indicating (1) phage ɸ437 predicted proteins are related to similar ones from phages, and (2) overall, phage ɸ437 is only remotely related to any known phage.
Phage core genes versus predicted morons. Given the large genetic distance of most ɸ437 genes to genes of known phages (Fig. 3), it was difficult to identify morons in the ɸ437 genome sequence. However, some genes with features that were strongly suggestive of morons were found (Fig. 2b). In this study, we applied strict criteria for protein-encoding regions to be considered to constitute a moron: (1) they potentially give fitness advantages to the host and do not constitute phage core genes, and (2) they are flanked by an upstream σ70 promoter and a downstream ρ-independent transcriptional terminator, allowing autonomous transcription 38. Genes meeting criterion (2) were found in several putative intergenic regions (Fig. 2b). As a third criterion (criterion 3), we used the fact that morons often have GC contents different from those of neighboring sequences 38. On this basis, orf64 was singled out as a potential moron; the region was identified as a so-called amrZ (alginate and motility regulator Z)/Arc domain. PSI-BLASTP analysis showed that orf64 has 55% identity [79% coverage] with a similar protein present in Pseudomonas phage SM1 (ALT58107). This result was supported by Phyre2 analysis (Table S3). Furthermore, 55% identity [80% coverage], with 100% confidence, was found with the 'alginate and motility regulator Z' of Pseudomonas aeruginosa. The orf64-encoded transcriptional factor AmrZ was homologous to the Pseudomonas phage SM1 (ALT58107) Arc domain, which had been shown to regulate virulence during infection 39. This factor is also essential for biofilm formation in Pseudomonas aeruginosa.
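The computable parts of the three moron criteria can be combined into a simple screening rule. The Python sketch below is a hypothetical illustration only: the `Candidate` fields, the GC-deviation threshold and all example values are assumptions and do not come from the study. The fitness-advantage part of criterion (1) cannot be derived from sequence features and is therefore left out.

```python
from dataclasses import dataclass

GC_DELTA_THRESHOLD = 2.0  # percentage points; hypothetical cut-off


@dataclass
class Candidate:
    """Sequence-derived features of one ORF (hypothetical container)."""
    name: str
    is_core_gene: bool
    upstream_sigma70_promoter: bool
    downstream_rho_independent_terminator: bool
    gc_percent: float
    flanking_gc_percent: float


def is_moron_candidate(c: Candidate,
                       gc_delta: float = GC_DELTA_THRESHOLD) -> bool:
    """Apply the non-core, autonomous-transcription and GC-deviation checks."""
    autonomous = (c.upstream_sigma70_promoter
                  and c.downstream_rho_independent_terminator)
    gc_deviates = abs(c.gc_percent - c.flanking_gc_percent) >= gc_delta
    return (not c.is_core_gene) and autonomous and gc_deviates


# All feature values below are invented for illustration
examples = [
    Candidate("orfA", False, True, True, 55.0, 60.5),   # passes all checks
    Candidate("orfB", True, True, True, 55.0, 60.5),    # phage core gene
    Candidate("orfC", False, True, False, 55.0, 60.5),  # no terminator
    Candidate("orfD", False, True, True, 60.0, 60.5),   # GC too similar
]
hits = [c.name for c in examples if is_moron_candidate(c)]
print(hits)
```

A rule of this kind only shortlists candidates; functional evidence, as pursued for orf64 via PSI-BLASTP and Phyre2, is still needed.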

Discussion
In spite of the apparent selection and outgrowth in mycosphere soils of the Paraburkholderia types used as phage hosts, to our surprise we could not detect any phage that was productive (including highly lytic to temperate modes of action) on these. This indicated that such phage populations, if present, were very low in number, so that they were not detectable by the classical DAL or related spot tests. Alternatively, our indicator bacteria (Table S1) may have had effective defense systems against the extant phage populations, which may have included R-M, CRISPR or BREX systems 1,40 . Finally, the conditions that allow such phages to proliferate on DAL plates may not have been established in our screens. We thus set out to analyze the genomes of several selected mycosphere-isolated Paraburkholderia strains for predicted prophage sequences using currently accepted bioinformatics tools.
The analysis of the genomes of our Paraburkholderia strains to identify prophages/phage-like elements (PP) indicated that all of the analyzed sequences contain substantial amounts of prophage regions. Most of the identified PP regions turned out to be remnants of a phage 'history', as previously discussed 12,13. These regions have probably been subjected to (stochastically acting) selective deletion pressures from the host cell, which may indicate their infrequent (re)selection. When phage structural machinery genes become eroded, prophages lose their ability to produce progeny. Such prophages might still be coding and remain functional, as they offer lysogenic conversion to the host cell 11, or they might increasingly represent 'passive genetic cargo' that is not transcribed 12. With respect to the identified phages, such hypotheses clearly need experimental evidence.
A certain prevalence of prophages in the Paraburkholderia genomes was expected, considering that these Paraburkholderia species can inhabit the mycosphere, an environment that has been depicted as a hot spot for HGT processes in soil 27. So far, only a few studies have successfully described phages from Burkholderia (and/or Paraburkholderia) spp. 41-45. However, most phages described were from pathogenic strains isolated from clinical environments, i.e. B. cepacia complex isolates. To the best of our knowledge, no previous studies have as yet focused on Paraburkholderia phages in environmental isolates, especially from the mycosphere. We here singled out the P. terrae strain BS437 phage ɸ437, on the basis of the experimental and computational analyses, as outlined in the foregoing.
Phage ɸ437 was apparently 'spontaneously' released in strain BS437 populations growing in liquid medium, whereas its particle numbers were raised by successful induction with MMC (Fig. 1a). These observations were supported by the concomitant phage coat gene based qPCR analyses and TEM observations (Figs 1b and 2a). However, we did not detect any infective phage particles by the DAL or spot tests applied to phage lysates, which may be due to (1) the absence of infectivity in our phage lysate, or (2) an intrinsic resistance or insusceptibility of host cells to released phages, as previously observed in another study. Notably, 45 strains of Clostridium difficile also failed to show infective phage production using the DAL method 9. The isolation, propagation and downstream analysis of phages from natural samples remain a challenge 46. The absence of detectable phage activity in the spot tests argues against a lysis-from-without scenario under these conditions.
The spontaneous prophage induction that was observed in the liquid controls used [non-MMC induction] (Fig. 1a), if occurring in natural settings, might have an impact on host fitness 10. We hypothesized that ɸ437 might modulate the formation of P. terrae BS437 biofilms on its fungal host strain, which we presume to be akin to P. terrae strain BS001 forming biofilms on Lyophyllum sp. strain Karsten 47. However, experimental work still needs to be done to test this hypothesis. Collectively, the significant decrease of the OD600 in strain BS437 cultures upon MMC induction, the phage progeny observed by TEM, and the increased gene copy number of the ɸ437 major capsid gene strongly indicate that phage ɸ437 was the major, if not only, phage that was released from the genome of P. terrae BS437.
The genomic architecture of ɸ437, compared to Burkholderia virus E125 (AF447491) and B. pseudomallei 1026b (AY453853), indicated a strong conservation of a cluster of functional genes (phage core genes) in the same relative spatial position. Tail (orf70-orf80) and head (orf84-orf90) morphogenesis genes were among the most conserved genes in the ɸ437 genome. This is consistent with data by Morgan et al. 48 and Summer et al. 41, indicating that such conserved genes as well as gene order represent a phage gene repertoire that is fine-tuned to effectively execute key phage functions (as shaped by evolution). Moreover, the key functional genes may be better interchanged in the continuous flux of gene acquisition and recombination in the bacterial host genome. The analyses applied to assign the taxonomic class of ɸ437 showed no large sequence similarity to any known phage sequences in the public database. However, the phylogenetic analysis of the selected phage hallmark genes (phage lysozyme, major capsid, portal, tail sheath, tail length tape measure and phage large terminase subunit) revealed ɸ437 to be most related to phages from the Myoviridae family. Moreover, the morphology of ɸ437 placed it in the Myoviridae. We thus propose ɸ437 as a new member of this family, with unique sequence features that do not relate to any of the currently ICTV-recognized subfamilies or genera.
The integration of phage ɸ437 is not well understood and does not fit classical integration mechanisms. We found that the site/region of integration in the host bacterium and phage genome showed interrupted blocks, regardless of sequence identity. It is noteworthy that comparative studies of lambdoid bacteriophage genomes 11 also revealed mosaicism as a consequence of HGT involving homologous and non-homologous recombination 49,50. Additionally, moron genes have been reported to be common in Burkholderia phages 44. Our analyses found one moron (orf64) that potentially endows the host with a superinfection defense mechanism against other phages, enhances host fitness and enhances biofilm formation. Considering this line of evidence, we hypothesize that the gene product potentially plays a role in the interaction of P. terrae strain BS437 with a host fungus in the mycosphere, including biofilm formation. Although the significance of this potential moron still remains enigmatic at this point, this analysis gives direction for future experiments.

Materials and Methods
Phage isolation from soil and mycosphere samples. Replicate soil and mycosphere samples (Scleroderma citrinum and Galerina spp.) were obtained from a forest in Noordlaren in autumn 2015, and processed as in Zhang et al. 27. Attempts to isolate phage from these samples were made using two methods. In method 1, 0.5 g of each mycosphere sample was added to 5 ml of sterile water, after which the mixtures were vortexed vigorously. After the mixtures had been left to stand for one minute, they were centrifuged at 100 × g (30 s) to sediment coarse soil particles. The collected supernatant was then spun at maximal speed (7,000 × g) for 15 min to remove fine soil particles. Following this, 100 µL was filtered over a Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA); the filtrate was then added to 20 mL of LB (Sigma-Aldrich, St. Louis, MO, USA), together with 200 µL of overnight-grown 'indicator' bacteria (Table S1). The suspensions were incubated overnight at 28 °C.
Method 2 consisted of directly adding 0.5 g of soil or mycosphere sample to 20 mL of LB broth and incubating overnight at 28 °C, to foster bacterial growth and potential phage development. Following incubation, the cultures were centrifuged at maximal speed (7,000 × g) for 10 min at 4 °C to pellet bacterial cells, and the supernatants were filtered over a Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA). One mL of each filtered supernatant was then added to 3 mL of indicator bacteria (Table S1) in LB medium, and incubated overnight at 28 °C. The resulting cultures were then centrifuged at maximum speed for 30 min at 4 °C and the filtered supernatants were used to inoculate subsequent cultures. The procedure was repeated five times, ultimately yielding a suspension that presumably contained phage particles 51.
Prophage identification across genomes. The genomes of the selected Paraburkholderia strains were screened for the presence of prophages by using PHAST 31 (version October 2015), Prophinder/ACLAME 32 (version 04, October 2015) and PhiSpy [PhiSpyNov11_3.2] 33. PHAST and Prophinder identify prophage regions by using a database of known phage genes, sequence identification, tRNA identification (as phages often use tRNAs as target sites for integration), attachment site recognition and gene clustering density measurements (prophage regions can be identified as clusters of phage-like genes within a bacterial genome) 31,32. PhiSpy uses several distinct characteristics of prophages, as outlined in the following. The first is the median length of predicted proteins, as the median protein length in phage regions differs markedly from that of proteins in the rest of the bacterial genome. The second and third are the directionality of the transcription strand and the GC skew, both of which are correlated with the direction of replication. Most consecutive genes in phage genomes tend to be encoded on the same strand, in contrast to consecutive bacterial genes, and any observed change in GC skew might result from the insertion of foreign DNA. Finally, the abundance of unique phage words is used, next to the phage insertion site (attP) and the similarity to known phage proteins 33. We here also applied other criteria to define putative prophage-like (PP) regions: (1) PP regions of sizes below 10 Kb were discarded 5,11 and (2) when a region consistently appeared in all three independent analyses, we used the PHAST results, as PhiSpy was reported to give less consistent results 52.
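Of the sequence features listed above, GC skew is the easiest to illustrate. The sketch below computes (G − C)/(G + C) in sliding windows; it is a generic illustration of the statistic, not PhiSpy's implementation, and the window size, step size and toy sequence are arbitrary assumptions.

```python
def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 if the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windowed_gc_skew(seq: str, window: int = 1000, step: int = 500):
    """Return (window start, GC skew) pairs along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(seq) - window + 1, 1)
    return [(i, gc_skew(seq[i:i + window]))
            for i in range(0, last_start, step)]


# Toy sequence with an abrupt skew reversal, mimicking a foreign insertion
toy = "G" * 20 + "C" * 20
profile = windowed_gc_skew(toy, window=20, step=20)
print(profile)
```

In a real genome scan, a sign change or sharp shift in such a profile relative to the surrounding chromosome would be one of several signals pointing to inserted DNA.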

Bacterial growth and MMC-mediated prophage induction. Paraburkholderia terrae strain BS437 became the focus of this study. It was isolated from the mycosphere of Lyophyllum sp. strain Karsten 28 and is a current reference strain in our laboratory. The strain was grown in LB broth at 28 °C with shaking (180 rpm). Induction with MMC (Sigma-Aldrich, St. Louis, MO, USA) was conducted according to Fortier and Moineau 9, with modifications. Briefly, bacterial cells were introduced into 5 ml of LB medium and incubated overnight at 28 °C (shaking at 180 rpm). The resulting cultures were then transferred (1:100) into replicate Erlenmeyer flasks containing 40 ml of fresh LB medium and growth was monitored until the exponential growth phase (about 10 h of incubation). Thereafter, all cultures were split into two 20 ml cultures. MMC was added to one of these, at final concentrations of either 4 or 10 µg/mL (MMC-4 and MMC-10, respectively), with the 'twin' culture serving as the control. The cultures were incubated and the OD600 was monitored for 24 h. Decreases of the cell density were taken as indications of progressive cell lysis and prophage release. The experiments were done with three biological replicates. The resulting crude lysates were finally filtered over a Whatman 0.22 µm cellulose acetate filter (GE Healthcare Life Sciences, Pittsburgh, PA, USA) and stored at −20 °C until further analysis.

Assessment of host range and indicator bacterial strains. For all phage activity tests, the double agar layer (DAL) method, next to a spot test, was used according to Adams 30, with some modifications. In one effort, we used the mycosphere and bulk soil extracts directly with selected indicator Paraburkholderia strains (Table S1). Suspensions resulting from the fivefold enrichment with the same indicator bacteria were also used.
Quantitative PCR (qPCR). A specific primer set targeting a phage gene was developed to verify the presence of phage ɸ437 in the induced lysate. As the indicator gene, we selected one phage ɸ437-specific gene, encoding the major capsid protein, using the P. terrae BS437 draft sequence 26. Major capsid genes have been used to assess viral diversity (see review by Adriaenssens and Cowan 53). This method followed the approach taken to quantify ten closely related lambdoid phages of Escherichia coli strain K-12 34,35.
Here, we treated the induced lysates and the control (not treated with MMC) with DNase to remove any host genomic DNA (confirmed by host-specific PCR). Using the ɸ437-specific primer set, a 198 bp band was produced from P. terrae BS437 DNA, whereas no bands were amplified from genomic DNA of P. terrae strains BS001, BS007, BS110, 17804T or P. hospita DSMZ 17164T and P. caribensis DSMZ 1323T (Fig. S1a). This primer set was then used to detect and quantify phage progeny in the induced lysates as described 34,35.
Briefly, induced cultures were centrifuged and filtered over Whatman 0.22 µm Puradisc cellulose acetate syringe filters (GE Healthcare Life Sciences, Pittsburgh, PA, USA) to remove bacterial cells and debris. A drop of chloroform was added to 10-fold diluted filtrates, which were then centrifuged at 2,700 × g for 10 min at 4 °C. Then, 2 units of DNase I endonuclease (Sigma-Aldrich, St. Louis, MO, USA) with 1.3 µL of 10x reaction buffer (Sigma-Aldrich, St. Louis, MO, USA) were added to 10 µL of lysate, and the mixture was kept at 37 °C for 1 h. Subsequently, 1.5 µL of stop solution (Sigma-Aldrich, St. Louis, MO, USA) was added and the mixture was incubated at 95 °C for 30 min to inactivate DNase I and to open up the phage capsids. The resulting suspensions were then diluted 10-fold and stored at −20 °C for later analysis. Primers specific for the ɸ437 major capsid protein gene were used (PP1.437_ca1F: 5′-CACGATGACACGATCCACAC-3′; PP1.437_ca1R: 5′-GAGAACCATGCCCTGAACC-3′). The qPCR reaction mixtures consisted of 12.5 µL SYBR Green (Applied Biosystems, CA, USA), 0.75 µL of each primer (Eurogentec, Liège, Belgium), 10 µL ultrapure water and 1 µL sample, for a total reaction volume of 25 µL. Amplification and detection of the ɸ437 product were performed on an ABI 7300 instrument (ThermoFisher Scientific, Waltham, MA, USA) with the following cycling conditions: denaturation at 95 °C for 30 s, annealing at 60 °C for 1 min and elongation at 72 °C for 60 s. The qPCR efficiency was 106%.
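A qPCR efficiency is commonly derived from the slope of a standard curve (quantification cycle plotted against log10 of template quantity) through the relation E = 10^(−1/slope) − 1. The sketch below applies that relation. The text does not state how the 106% value was obtained, so the standard-curve approach, the function names and the example slope are assumptions.

```python
import math


def efficiency_from_slope(slope):
    """Amplification efficiency (1.0 = 100%) from a standard-curve slope.

    The slope is that of Cq versus log10(template quantity), so it is negative.
    """
    if slope >= 0:
        raise ValueError("a standard-curve slope must be negative")
    return 10 ** (-1.0 / slope) - 1.0


def slope_from_efficiency(efficiency):
    """Inverse relation: the slope implied by a given efficiency."""
    return -1.0 / math.log10(1.0 + efficiency)


# Perfect doubling per cycle corresponds to a slope of about -3.32.
print(round(efficiency_from_slope(-3.3219), 3))
# Slope implied by the 106% efficiency reported in the text.
print(round(slope_from_efficiency(1.06), 3))
```

Under this relation, an efficiency slightly above 100% corresponds to a slope slightly shallower than −3.32.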
Examination of the presence of prophage within indicator hosts. Experiments were performed to test the potential integration of ɸ437 (Fig. S1), using spot tests with ɸ437-containing suspensions (titer estimated at 10^8 per mL) on several Paraburkholderia strains (P. terrae BS001, BS007, BS110, 17804 T , P. hospita DSMZ 17164 T and P. caribensis DSMZ 1323 T ) as described above. The top and bottom parts of each spot were later streaked onto fresh R2A medium and incubated overnight at 28 °C. A colony PCR-based test targeting the ɸ437 major capsid protein gene (198 bp) was used, and 20 single colonies from each strain were tested. The isolated DNA of ɸ437 and the phage suspension produced from strain BS437 were used as positive controls, whereas the unspotted strains and E. coli K-12 were used as negative controls. The test was repeated for the potential host strain BS007, with 50 additional single colonies.
Phage particle concentration by polyethylene glycol (PEG) 8000. The induced phage particles were purified according to the PEG method of Sambrook and Russell 54 with the following modifications. Induced phage lysate was centrifuged at 11,000 × g for 15 min at 4 °C, and the supernatants were then filtered over Whatman 0.22 µm Puradisc cellulose acetate syringe filters (GE Healthcare Life Sciences, Pittsburgh, PA, USA). NaCl (29.2 g) was dissolved in 500 mL of lysate to a final concentration of 1 M, and the mixture was kept on ice for 1 h. Solid polyethylene glycol (PEG) 8000 was added to a final concentration of 10% (w/v) and the mixture was stored overnight at 4 °C to allow the phage particles to precipitate. The PEG-precipitated lysate was then centrifuged at 11,000 × g for 10 min at 4 °C (Sorvall SLA-1500 rotor). The supernatant was removed down to a residual volume of 20 mL, and 10x SM buffer (10 mM NaCl, 50 mM Tris, 10 mM MgSO 4 , and 0.1% gelatin) was added for storage and later analysis.
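The quantities given above can be checked with simple arithmetic. The sketch assumes a molar mass of 58.44 g/mol for NaCl, takes the working volume to be the 500 mL stated for the lysate, and ignores the small volume change on dissolving the solids; the function names are illustrative.

```python
NACL_MOLAR_MASS = 58.44  # g/mol, standard value for NaCl


def molarity(mass_g, molar_mass_g_per_mol, volume_ml):
    """Molar concentration (mol/L) of a solute dissolved in a given volume."""
    return (mass_g / molar_mass_g_per_mol) / (volume_ml / 1000.0)


def mass_for_percent_wv(percent_wv, volume_ml):
    """Grams of solid needed for a given % (w/v), i.e. g per 100 mL."""
    return percent_wv / 100.0 * volume_ml


# 29.2 g NaCl in 500 mL lysate, as stated in the text.
print(round(molarity(29.2, NACL_MOLAR_MASS, 500), 3))
# PEG 8000 needed for 10% (w/v), assuming the same 500 mL volume.
print(mass_for_percent_wv(10, 500))
```

The first value comes out very close to 1 M, in agreement with the stated final concentration.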
Phage DNA extraction and sequencing. Phage DNA extraction was performed with a Phage DNA Isolation Kit (Norgen Biotek Corp., ON, Canada) following the manufacturer's protocol, with one slight modification: DNase I was inactivated at 80 °C for 10 min. In addition, PCR amplification of the 16S rRNA gene, using the universal primer set 16SFP/16SRP 55 , was performed to confirm the absence of genomic DNA in the phage DNA extracts. Aliquots of the amplification products were electrophoresed in 1% agarose gels, stained with ethidium bromide and visualized under UV illumination.
Phage DNA was sequenced on the Illumina HiSeq 2500 platform (paired-end) by BaseClear (Leiden, Netherlands). The libraries were prepared as Illumina Nextera XT genomic libraries. Quality analysis of the FASTQ sequence reads was done using the Illumina Casava pipeline version 1.8.3. The initial quality assessment was based on data passing the Illumina Chastity filter. Subsequently, reads containing PhiX control signal were removed using an in-house filtering protocol. In addition, reads containing (partial) adapters were clipped (down to a minimum read length of 50 bp). The second quality assessment was performed on the remaining reads using the FASTQC quality control tool version 0.10.0. After filtering, the sample yielded 7,078,049 reads (166 MB) with an average quality score of 37.45. Reads were then aligned and successfully assembled using the CLC Genomics Workbench 9 (Aarhus, Denmark) with the default parameters: mismatch cost 2, insertion cost 3, deletion cost 3, length fraction 0.5 and similarity 0.9.
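For orientation, a Phred quality score Q corresponds to an estimated base-calling error probability of 10^(−Q/10). The sketch below converts the average quality reported above. Treating an averaged score in this way is a simplification, and the function name is illustrative.

```python
def phred_to_error_probability(q):
    """Estimated base-calling error probability for a Phred score q."""
    return 10 ** (-q / 10.0)


# Average quality reported in the text.
p = phred_to_error_probability(37.45)
print(f"{p:.2e}")
```

On this scale, a score above 30 implies fewer than one expected miscall per 1,000 bases.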
RAST (Rapid Annotation using Subsystem Technology) was subsequently used to annotate the sequenced genome 56 . Predicted hypothetical proteins were checked with PSI-BLASTP and the Phyre2 program 57 . Predicted amino acid sequences of genes with assigned function [and of those without] were analyzed against the non-redundant (nr) NCBI database and the tailed phages database by PSI-BLASTP. Phyre2 was used to predict secondary and tertiary structures (Table S2). To predict the phage lifestyle, PHACTS was used; this tool uses a novel similarity algorithm to create a training set from phages of known lifestyle, and a random forest classifier that combines a multitude of decision trees 58 . σ 70 promoters in the phage genome were predicted using a promoter prediction tool (http://www.fruitfly.org/seq_tools/promoter.html), and ρ-independent terminators were identified using the Arnold terminator-finding program 59 . tRNA genes in the phage genome were searched for using tRNAscan-SE 60 . The attachment (att) sites were analyzed using the motif-finding tool MEME 61 . The PROBIUS prediction tool 62 was used to predict transmembrane regions and signal peptides in the proteins encoded by the ɸ437 genome.
Transmission electron microscopy (TEM). Viral particles were detected, and viral morphology was examined, by TEM (Philips CM10). The phage stocks were applied directly onto carbon-coated nitrocellulose grids and allowed to settle for about one minute. Excess liquid was drained with filter paper, and the grids were negatively stained with 1% uranyl acetate, washed and dried before immediate observation in the TEM.