Isolation, Characterization and Genomic Analysis of a Novel Bacteriophage VB_EcoS-Golestan Infecting Multidrug-Resistant Escherichia coli Isolated from Urinary Tract Infection

Escherichia coli (E. coli) is one of the most common uropathogenic bacteria. The emergence of multidrug resistance among these bacteria has become a worldwide public health problem that requires alternative treatment approaches such as phage therapy. In this study, phage VB_EcoS-Golestan, a member of the Siphoviridae family with high lytic ability against E. coli isolates, was isolated from wastewater. It showed a large burst size of about 100 plaque-forming units per infected cell, a rapid adsorption time, and high resistance to a broad range of pH values and temperatures. Bioinformatics analysis of the genomic sequence suggests that VB_EcoS-Golestan is a new phage closely related to Escherichia phages in the Kagunavirus genus, Guernseyvirinae subfamily of Siphoviridae. The genome is 44829 bp long and encodes 78 putative ORFs, no tRNAs, 7 potential promoter sequences and 13 Rho-factor-independent terminators. No lysogeny-related genes were detected in the VB_EcoS-Golestan genome. Overall, VB_EcoS-Golestan might be used as a potential treatment for controlling E. coli-mediated urinary tract infection; however, further studies are essential to ensure its safety.


Table column headings: Escherichia coli isolate codes; names and number of antibiotics to which isolates were resistant.

... attachment of the bacteria to the bladder cell surface 5. The frequencies of the studied virulence genes, alone or in combination, are presented in Supplementary Table S2. The sequences of the fimH, pap, sfa, and afa adhesion factors detected in the isolates are deposited in GenBank under accession numbers MG041766, LC373009, LC373010, and LC373216, respectively. The most common virulence gene, detected in all of the UTI isolates, was fimH. The next most frequent virulence genes were pap and sfa, present in 78.8% (41 out of 52) and 69.2% (36 out of 52) of the isolates, respectively. Finally, afa was the least frequent virulence gene, found in only 7.7% (4 out of 52) of the isolates. All isolates harbored the adhesin genes either singly or in combination. Thirty-four isolates (65.3%) were positive for fimH, pap and sfa together. Only three of the isolates (5.7%) were positive for all four genes. It has been shown that bacterial cell surface structures can serve as receptors for bacteriophages 26. These structures can be classified according to their structural characteristics and have several roles, including acting as virulence factors 26,27. Recognition of these receptors by a bacteriophage determines the specificity of the phage and its host range. In UPEC isolates, virulence factors such as fimbriae are good receptors for the binding of bacteriophages that have tail fibers 26. Among the isolated E. coli, 25 isolates carrying the pap gene (61%) were sensitive to the VB_EcoS-Golestan phage. Of the 36 isolates harboring the sfa gene, 23 (63.9%) were susceptible to the phage. Moreover, all isolates harboring the afa gene (4 isolates) were sensitive to the lytic activity of the phage. Furthermore, the 3 isolates encoding all of the adhesion genes (pap, sfa, fimH, and afa) were also susceptible to the VB_EcoS-Golestan phage (Table 1).
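The prevalence figures above are simple proportions of the 52 isolates. The following is a minimal sketch of that arithmetic, using only the counts quoted in the text; the function name `prevalence` and the dictionary layout are illustrative assumptions, not part of the study's workflow.

```python
def prevalence(positive: int, total: int) -> float:
    """Return the percentage of isolates carrying a gene, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * positive / total, 1)


# Counts quoted in the text (out of 52 UTI isolates).
TOTAL_ISOLATES = 52
gene_counts = {"fimH": 52, "pap": 41, "sfa": 36, "afa": 4}

# Phage-sensitive isolates among the carriers of each gene, as quoted in the text.
sensitive_among_carriers = {"pap": (25, 41), "sfa": (23, 36), "afa": (4, 4)}

for gene, count in gene_counts.items():
    print(f"{gene}: {count}/{TOTAL_ISOLATES} = {prevalence(count, TOTAL_ISOLATES)}%")

for gene, (sensitive, carriers) in sensitive_among_carriers.items():
    print(f"{gene} carriers sensitive to phage: {prevalence(sensitive, carriers)}%")
```

Rounding to one decimal is a presentation choice made here for the sketch.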
These results indicate a significant correlation between the presence of virulence factors and bacterial sensitivity to the phage (P < 0.05). In other words, the presence of these adhesin proteins on the surface of E. coli isolates can increase the chance of phage attachment to the host bacterium. Therefore, E. coli strains with modified or weakly expressed receptors might be resistant to bacteriophage infection 26.

Stability of the VB_EcoS-Golestan phage. The thermal stability of phage VB_EcoS-Golestan is shown in Supplementary Fig. S2a. The maximum stability was observed from 37 to 45 °C. The activity of the phage decreased with increasing temperature, and the phage was fully inactivated at 75 °C after one hour of incubation. The phage showed maximum stability at pH 7 and 8, at which no significant differences were observed in the phage titers after 1 h and 24 h incubation. The phage was also stable at pH values from 5 to 10, with only a negligible decrease in titer after 1 h and 24 h incubation compared with the values recorded at pH 7 and 8. However, a significant reduction in the phage titer was observed at pH 3 and 11 after 1 h and 24 h incubation, and the phage was completely inactivated at pH 2 and 12 (see Supplementary Fig. S2b). These data indicate that VB_EcoS-Golestan is highly stable over a wide range of temperatures and pH conditions, which is advantageous for the potential application of this phage in phage therapy in different environmental settings 28.

Cationic ions and phage adsorption rate. Subjecting the phage to 10 mmol/L of Mg2+ (MgCl2) or Ca2+ (CaCl2) resulted in a significant increase in the adsorption rate compared to the control (two-way ANOVA; P < 0.05, Fig. 2). About 66% (65.8%) of the phage particles were adsorbed to Escherichia coli 333 cells within 5 min in the control mixture (without metal ions).
With magnesium chloride or calcium chloride added, the adsorption rates were 89.4% and 85.3%, respectively. The highest phage adsorption occurred after 15 minutes: 99.5% in the control and 99.8% in the samples containing cations. Thereafter, no prominent changes were observed. In other words, these cations stabilized the interaction of the bacteriophage with its host bacterium. Previous studies reported that cofactors such as Ca2+/Mg2+ ions can stabilize the fragile interface of the virion with its receptors 28-31. Such stabilization enhances phage infectivity, which can lead to a higher lysis yield in phage therapy.

One-step growth curve. The latent period and the burst size of the VB_EcoS-Golestan phage were determined by a one-step growth test. The latent period was approximately 40 min and the burst size was about 100 plaque-forming units (pfu) per cell (see Supplementary Fig. S3). The burst size of a phage is closely related to its propagation, and an adequate burst size is a desirable characteristic for an effective lytic bacteriophage. Accordingly, phages with a short latent period and a large burst size have been suggested as suitable candidates for phage therapy 14. The observed burst size and relatively short latent period of VB_EcoS-Golestan are therefore desirable characteristics for its potential application in phage therapy.
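The adsorption percentages and the burst size reported above both derive from plaque titers. The sketch below shows one common way to compute them; the function names and all titer values are hypothetical and chosen only to reproduce numbers of the same magnitude as those in the text. The burst-size convention used here (plateau titer divided by the titer of infected cells during the latent period) is an assumption, since the text does not spell out the formula.

```python
def percent_adsorbed(initial_titer: float, free_titer: float) -> float:
    """Percentage of phage adsorbed, from the initial and the unadsorbed (free) titer."""
    if initial_titer <= 0:
        raise ValueError("initial_titer must be positive")
    return 100.0 * (initial_titer - free_titer) / initial_titer


def burst_size(plateau_titer: float, infected_cell_titer: float) -> float:
    """Average number of phage particles released per infected cell."""
    if infected_cell_titer <= 0:
        raise ValueError("infected_cell_titer must be positive")
    return plateau_titer / infected_cell_titer


# Hypothetical example: 1.0e6 pfu/ml added, 3.42e5 pfu/ml still free after 5 min.
print(f"adsorbed after 5 min: {percent_adsorbed(1.0e6, 3.42e5):.1f}%")

# Hypothetical example: 2.0e5 infected cells/ml yielding 2.0e7 pfu/ml at the plateau.
print(f"burst size: {burst_size(2.0e7, 2.0e5):.0f} pfu/cell")
```

In practice the titers would come from triplicate plaque counts, and the plateau and latent-period values would be read from the one-step growth curve.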
In vitro lytic activity of the phage. The lytic activity of the phage was examined against an E. coli 333 culture in its exponential growth phase (OD600 = 0.4) with different MOIs of VB_EcoS-Golestan. The highest MOI (MOI 10) resulted in the highest lytic activity within the first hour, with an approximately 1.5 log decrease in the titer of E. coli 333 from 10^8 cfu/ml to about 10^7 cfu/ml (P < 0.0001). After 2 h, MOIs of 0.1, 1 and 10 led to an approximately 3.5 log decrease in the bacterial titer to about 10^4 cfu/ml (P < 0.0001). The recorded bacterial titer remained almost constant afterward until 8 h of incubation. An MOI of 0.01 also decreased the bacterial titer to the same level (10^4 cfu/ml), but only after 3 h of incubation. Lower MOIs (0.001 and 0.0001) resulted in a moderate decrease in the bacterial titer within the early hours (see Supplementary Fig. S4a). Therefore, higher phage concentrations resulted in a faster reduction in the bacterial count, which could be due to an increased attachment rate at higher phage titers 32,33.
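The "log decrease" values quoted above are base-10 logarithms of the ratio between the starting and the remaining viable count. A minimal sketch of that calculation follows; the function name is an assumption and the cfu/ml values are illustrative rather than the study's raw counts.

```python
import math


def log_reduction(initial_cfu_per_ml: float, final_cfu_per_ml: float) -> float:
    """Log10 reduction in viable count between two time points."""
    if initial_cfu_per_ml <= 0 or final_cfu_per_ml <= 0:
        raise ValueError("titers must be positive")
    return math.log10(initial_cfu_per_ml / final_cfu_per_ml)


# Illustrative values only: a culture falling from 1e8 to 3.2e4 cfu/ml.
reduction = log_reduction(1e8, 3.2e4)
print(f"log reduction: {reduction:.2f}")
```

A reduction of 1 log corresponds to a tenfold drop in viable cells, so differences that look small on this scale are large in absolute cell numbers.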
Further incubation to 24 h resulted in a significant increase in the bacterial titer in both control and phage-treated samples. Even at this point, the recorded bacterial titers in the treated samples were about 2 log lower than in the control (P < 0.0001). The lytic effect of the phage at different MOIs against the host bacterium was also measured by optical absorbance (OD600), and the results were consistent with the bacterial cell counts (see Supplementary Fig. S4b). In in vivo application of a phage, reducing the bacterial population to non-infective doses provides more time for the innate immune system to overcome the infection 34,35. Determining the optimal titer of a phage is therefore a helpful approach to enhance phage infectivity against its host, especially during the first hours of treatment 32. The observed decrease in the cell count of the host bacterium over the first three hours of exposure to the VB_EcoS-Golestan phage, and the maintenance of this trend up to 8 hours of incubation, is thus a significant feature for the potential application of this phage in phage therapy. However, as demonstrated in Supplementary Fig. S4b, the cell density of the host bacterium increased after 24 h. This increase could be due to the selection of bacterial cells to which the phage did not adsorb, resulting in overgrowth of the resistant phenotype, or to the emergence of mutated strains in the population of the host bacterium 14,27. Whatever the reason, this is a critical issue that must be addressed. One effective strategy is the use of a phage cocktail, which can control the host bacteria and inhibit the possible emergence of phage-resistant phenotypes 25,34-37. Combination therapy, i.e. the simultaneous use of antibiotics and bacteriophages, is another solution to this issue 25,38.

Scientific Reports | (2020) 10:7690 | https://doi.org/10.1038/s41598-020-63048-x

Restriction profile. EcoRI, EcoRV, NdeI, PstI, BamHI, and HindIII digested the phage genome (see Supplementary Fig. S5). The restriction profiles were studied using Sequenti X Gel Analyzer software 39. This analysis indicated that the phage was a dsDNA virus with a genome size of approximately 45 kb.
Genomic analysis. The complete genome sequence of VB_EcoS-Golestan revealed that it does not harbor any harmful genes, such as those associated with antibiotic resistance, lysogeny, toxins or other virulence factors. This suggests that VB_EcoS-Golestan can be regarded as a virulent phage against E. coli.
The genome of phage VB_EcoS-Golestan is 44829 bp in length with a G + C content of 50.6%, which is similar to the majority of available E. coli genome sequences, whose GC contents range from 50 to 52%. The genome consists of 78 open reading frames (ORFs) (Supplementary Table S3), most of them located on the plus strand (64.1%, 50 ORFs). All ORFs begin with an ATG codon except for ORFs 24, 29, 68 and 99, which start with TTG. Three stop codons were found in the predicted ORFs: TGA was the most common (50%, 39 ORFs), followed by TAA (40%, 31 ORFs) and TAG (10%, 8 ORFs) (Supplementary Table S3). No tRNA was detected in the genome using tRNAscan-SE and GtRNAdb. Furthermore, seven transcriptional promoter sequences were identified by PHIRE software (Supplementary Table S4). Thirteen Rho-factor-independent terminators were detected in the genome of VB_EcoS-Golestan using ARNold (Supplementary Table S5); these were assessed according to their location, the presence of a U-rich tail, and a strongly predicted stem-loop secondary structure (ΔG ≤ −10 kcal mol^−1) as calculated by MFold 31.
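Summary statistics such as G + C content and stop-codon usage can be recomputed directly from sequence data. The sketch below illustrates both on toy sequences; the function names and example ORFs are hypothetical, and in practice the input would be the deposited genome and its annotated coding sequences.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def stop_codon_usage(orfs: list[str]) -> Counter:
    """Count the terminal stop codon of each ORF (coding strand, 5' to 3')."""
    usage = Counter()
    for orf in orfs:
        codon = orf.upper()[-3:]
        if codon in STOP_CODONS:
            usage[codon] += 1
    return usage


# Toy ORFs for illustration only; not taken from the phage genome.
toy_orfs = ["ATGAAACCCTGA", "TTGGGGTAA", "ATGCCCTAG", "ATGTTTTGA"]
usage = stop_codon_usage(toy_orfs)
for codon, count in usage.most_common():
    print(f"{codon}: {count} ({100.0 * count / len(toy_orfs):.0f}%)")
print(f"GC content of first ORF: {gc_content(toy_orfs[0]):.1f}%")
```

ORFs on the minus strand would need to be reverse-complemented before their stop codons are tallied this way.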
Of the 78 ORFs, 26 were similar to functionally annotated genes in GenBank, and 46 were similar to hypothetical proteins of unknown function. The remaining six had no similarity to any protein available in the NCBI database; they were therefore unique ORFs of this phage and were registered as hypothetical proteins in GenBank. The VB_EcoS-Golestan genome is organized in separate functional modules containing genes involved in structure and packaging (10 ORFs), replication and regulation (11 ORFs) and cell lysis (5 ORFs) (Fig. 3).
Eleven protein bands, representing virion structural components, were seen in a Coomassie-stained SDS-polyacrylamide gel, with sizes ranging from 25 to 150 kDa (Fig. 4). A predominant polypeptide band of about 35 kDa is suggestive of the major capsid protein, owing to the high copy number of this protein in the capsid. The detected molecular mass corresponds to the predicted molecular weight of this protein. Blastp analysis demonstrated that the VB_EcoS-Golestan major capsid protein resembles the major capsid proteins of Escherichia phages ST2, K1-dep(4), K1-dep(3), K1-ind(3), K1-ind(2) and K1-ind(1) (sequence identity ranging from 96% to 98%), which belong to the Kagunavirus genus, Guernseyvirinae subfamily, Siphoviridae family, according to the ICTV classification of phages.
The DNA packaging system of tailed phages contains a heterodimeric terminase composed of large and small subunits: the small subunit is responsible for DNA binding, whereas the large subunit, which interacts with the prohead, is responsible for binding and cleaving the concatemeric phage DNA into single genome units 31. In VB_EcoS-Golestan, the products of ORF1 and ORF78 were predicted as the large and small terminase proteins, respectively. These proteins have 96% and 99% identity to the large and small terminase proteins of Escherichia phage ST2, respectively. Four ORFs (4, 16, 16 and 29) were predicted as tail proteins, with amino acid sequence identities to orthologous genes of Escherichia phages within the Kagunavirus genus ranging from 80% to 99%.
Tail fibers play a very important role in initiating the binding of the phage to its bacterial receptors and thus contribute to host specificity 40. The tail fiber protein encoded by ORF30 had 88% sequence identity with the tail fiber protein of Escherichia phage LM33-P1. ORF25 of the VB_EcoS-Golestan genome, the second largest gene in the genome, encodes a tape measure protein (TMP) and contains a HI15114 region at its N-terminal end. As a multifunctional protein, the TMP has roles in determining the length of the tail (in collaboration with the tail assembly chaperones), linking the capsid to the distal regions of the tail, and genome delivery 40,41. ORFs 3 and 27 were predicted as hypothetical proteins, although their putative conserved domains are involved in bacteriophage assembly. ORF27 contained a DUF1833 (pfam08875) domain, which was predicted as a tail assembly chaperone involved in tail assembly. As described above, this ORF, in concert with ORF25 (which encodes the tape measure protein), is responsible for determining the tail length. ORF3 contained a phage SPP1 domain (TIGR01641) and a phage-Mu-F (pfam04233) region toward its C-terminus. These domains are involved in viral head morphogenesis in double-stranded DNA bacteriophages.
Replication and regulation proteins. Eleven genes in the VB_EcoS-Golestan genome were predicted to play a role in replication and regulation. ORF20 encodes an acid phosphatase containing a HAD-PNKP-C (cd07502) family domain. This family consists of the C-terminal domain of the bifunctional enzyme T4 polynucleotide kinase/phosphatase (PNKP). The PNKP phosphatase domain can catalyze the hydrolytic removal of the 3′-phosphoryl group of RNA, DNA, and deoxynucleoside 3′-monophosphates 42. ORF31, which encodes an exonuclease subunit SbcD, comprises a PRK10966 domain and an N-terminal DUF4140 (pfam13600) domain. It showed 100% identity with the exonuclease subunit SbcD of Escherichia phage P AB-2017. The helicase and replicative helicase/primase encoded by ORFs 33 and 50 showed the highest similarity to G AB-2017 ORFs 66 and 52 (98% identity), respectively, which are engaged in replication, recombination, and repair of the phage 43. The product of ORF39 was predicted as a DNA polymerase containing a DNA-pol-A superfamily domain, with 94% similarity to those of Escherichia phages K1-ind(3), G AB-2017, K1-dep(1) and K1-dep(4). The VRR-NUC protein encoded by ORF36 is associated with the PD-(D/E)XK nuclease superfamily protein (ORF43); this superfamily includes restriction-modification enzymes. ORFs 48 and 53 encode helix-turn-helix family DNA-binding proteins, which are engaged in the regulation of DNA replication, transcription, telomere maintenance and repair. Helix-turn-helix proteins are also involved in the specific recognition of the viral genome at the beginning of DNA packaging during virus assembly 44.

Cell wall lysis proteins. The dsDNA phages of eubacteria use endolysins, or muralytic enzymes, together with holin, a small membrane protein, to degrade the peptidoglycan of the bacterial cell wall 45.
In VB_EcoS-Golestan, the genes recognized as playing a role in host cell wall lysis included putative class II (ORF66) and class I (ORF67) holins located upstream of a putative endolysin (ORF68) that contains autolysin (cd00737), muramidase (COG3772) and phage lysozyme (pfam00959) domains. Another protein that contributes to the lysis of gram-negative bacteria is spanin. This protein is engaged in outer membrane disruption and also catalyzes the fusion of the outer and inner membranes in gram-negative bacteria 45. In VB_EcoS-Golestan, the spanin protein is encoded by ORFs 7 and 8. All of the ORFs encoding proteins involved in cell wall lysis of VB_EcoS-Golestan ...

... these bacteriophages belong to the Kagunavirus genus, Guernseyvirinae subfamily, Siphoviridae family, according to the ICTV classification of phages. Progressive multiple genome alignments were calculated using Mauve (Fig. 5) and EasyFig (Fig. 6) software to determine the relatedness of the VB_EcoS-Golestan genome to the homologous phages (mentioned above); they show a considerable relationship between VB_EcoS-Golestan and other Escherichia phages within the Kagunavirus genus.
To determine the exact taxonomic position of the phage, the major capsid and DNA polymerase proteins of VB_EcoS-Golestan and related phages were analyzed using the "One Click" mode of the phylogeny.fr server (Fig. 7). The phylogenetic tree confirms the high homology of phage VB_EcoS-Golestan with Escherichia phages in ...

Figure caption fragment (genome alignment): ... (2) and K1ind1 (from bottom to top). Genome similarity is represented by a similarity plot within the colored blocks, with the height of the plot proportional to the average nucleotide identity. The white regions represent fragments that were not aligned or contained sequence elements specific to a particular genome.

VB_EcoS-Golestan is a virulent phage that belongs to the Kagunavirus genus of the Guernseyvirinae subfamily, Siphoviridae family. This lytic bacteriophage had a broad host range against both antibiotic-sensitive and multidrug-resistant UPEC isolates, a rapid adsorption time, a large burst size, and high stability over a wide range of pH values and temperatures, which makes it a promising agent against E. coli infections. Moreover, annotation of its whole genome sequence confirmed that its genome contains no toxin, lysogeny, antibiotic resistance or other virulence factor genes. Therefore, VB_EcoS-Golestan is a potential agent for phage therapy of UTI caused by E. coli.

Materials and methods
Bacterial isolation. Fifty-two E. coli isolates, obtained from UTI cases with colony counts of ≥10^5 CFU/ml, were collected from hospitals in the city of Gorgan, Golestan province, Iran. This study was approved by the local ethics committee (Golestan University of Medical Sciences) (IR.GOMS.REC.1394.209). Informed consent was obtained from all participants and/or their legal guardians. All sampling was performed in compliance with relevant laws and institutional guidelines and in accordance with the ethical standards of the Helsinki Declaration. All of the E. coli isolates were subjected to biochemical characterization following Mahon et al. 46. All isolates were cultured in brain heart infusion (BHI) broth and stored at −70 °C until further use.

Detection of virulence factors in the bacterial isolates.
Bacterial DNA was isolated using the phenol-chloroform method. The presence of the virulence factors fimH, pap, sfa, and afa in the E. coli isolates was detected by PCR using specific primer sets designed for these adhesion genes, as described previously 48. The PCR product of each positive sample was then sequenced by Sanger sequencing (Macrogen, South Korea), identified with the BLAST alignment tool and deposited in GenBank. The results were used to identify any correlation between the presence of virulence factors and the observed sensitivity to the phage.

Phage isolation. Municipal wastewater samples were collected from the city of Gorgan, Golestan province, Iran. Twenty ml of the supernatant from centrifuged (12,000 × g, 10 min) wastewater was mixed with 20 ml of 2X BHI broth containing the E. coli isolates (in exponential phase, OD600 = 1). After 24 h incubation at 37 °C, the suspension was centrifuged and the supernatant was filtered using a sterile syringe filter with a 0.22 µm pore size (Gilson, UK). To determine the presence of phage, 10 µl of the filtrate and 100 µl of the E. coli isolate were mixed with melted 0.7% soft top agar and poured onto a plate of brain heart infusion agar. Plaques were identified after overnight incubation at 37 °C. A single plaque on the bacterial lawn was picked and mixed with 20 ml of the isolated E. coli suspension, then incubated at 37 °C for 18 h. The double-layer plaque assay was then carried out. This procedure was repeated three times in order to obtain a pure stock of the isolated phage 29. Multidrug-resistant E. coli isolate 333 was used as the host for phage isolation.
Phage purification. The phage suspension (~10^10 pfu/ml) was centrifuged for 15 min at 13,000 × g and the supernatant was filtered (0.22 µm, Gilson, UK). DNase (1 µg/µl) and RNase (1 µg/µl) were then added to the filtrate (1 h at 37 °C) to remove any bacterial DNA and RNA. NaCl and polyethylene glycol (PEG) 8000 were added to the phage suspension at final concentrations of 1 M and 10%, respectively. The mixture was stored at 4 °C overnight. The phage was precipitated by centrifugation for 30 min at 13,000 × g and 4 °C. Two ml of SM buffer (2% gelatin, 5 ml; 1 M Tris-Cl pH 7.5, 50 ml; MgSO4·7H2O, 2 g; NaCl, 5.8 g; and ddH2O to 1,000 ml) was used to re-suspend the pellet. The concentrated phage was loaded on a glycerol step gradient (SM buffer with 40 and 5 percent glycerol) and subjected to ultracentrifugation for 2 h at 150,000 × g and 4 °C (Beckman L5-65 ultracentrifuge, SW28 rotor). The pellet was re-suspended in SM buffer and stored at 4 °C until further use 43,49.

Determination of the host range. The lytic activity of the isolated phage was examined against the 52 clinical isolates of E. coli using a standard spot assay (Table 1). Briefly, 10 µl of the purified phage was spotted in the center of a double agar overlay culture of each isolate and incubated at 37 °C. After overnight incubation, isolates producing a clear lytic zone were considered susceptible to phage-mediated lysis 14. Several gram-negative and gram-positive standard bacteria (obtained from the Department of Microbiology, Golestan University of Medical Sciences, Iran) were also used to investigate the phage host range.

Electron microscopy. A drop of the purified phage (~10^12 pfu/ml) was spotted on a carbon-coated copper grid. Then, 10 µl of 2% uranyl acetate was added to the surface of the grid for 30 s, and the excess was removed with filter paper 50. The fixed sample was studied using a FEI Philips TEM, CM-10 (Japan).

Phage stability. Thermal and pH stability tests were conducted as previously described 32.
For the thermal stability test, the phage suspensions were incubated at 37 (control), 45, 50, 55, 60, 70 and 75 °C for 1 h, and its pH ...

Analysis of calcium and magnesium ions effects on adsorption rate. To evaluate the effects of cations on the phage, MgCl2 and CaCl2 (each at a final concentration of 10 mM) were added to the phage-infected cultures. Samples were collected at 0, 5, 10, 15 and 20 min to determine the unadsorbed phage titer, which was reported as a percentage of the initial phage count 28,32.

One-step growth. The latent period and phage burst size were determined by a one-step growth test following Li et al. (2014) with some modifications. In brief, E. coli cells (E. coli 333 isolate) were pelleted by centrifugation and re-suspended in fresh BHI broth (2 ml) (~10^9 cfu/ml). The phage was added at a multiplicity of infection (MOI) of 0.01 and allowed to adsorb for 15-20 min at 37 °C, and the mixture was then centrifuged at 13,000 × g for 1 min. The pellet was re-suspended in twenty milliliters of BHI broth and incubated at 37 °C. Samples (100 µl) were collected at 10 min intervals for up to 120 min and then titered using the soft agar overlay method 30.
Bacteriolytic characteristic of the phage. The bacteriolytic activity of the phage at different MOIs was determined using a modified version of the method from our previous study 32. Ten milliliters of BHI broth was inoculated with 300 microliters of the overnight host culture and incubated at 37 °C until reaching an optical density (OD600) of 0.4 (early logarithmic phase). The phage was added to the bacterial culture at MOIs of 0.0001, 0.001, 0.01, 0.1, 1 and 10, and the cultures were incubated at 37 °C. Samples were collected at 1, 2, 3, 4, 5, 6, 7, 8 and 24 h and measured by optical densitometry (Eppendorf BioPhotometer plus, Germany) at 600 nm. Moreover, 100 µl of each sample was diluted and cultured on 2% blood agar to quantify the bacterial titer (cell count) 32.
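Setting up an MOI series like the one above requires converting each target MOI into a volume of phage stock. The following sketch shows the arithmetic under stated assumptions: the function name, the culture density of 1e8 cfu/ml and the stock titer of 1e10 pfu/ml are hypothetical values chosen for illustration and are not reported for this step in the text.

```python
def phage_volume_ml(moi: float, culture_volume_ml: float,
                    cells_per_ml: float, stock_pfu_per_ml: float) -> float:
    """Volume of phage stock (ml) needed to reach a target MOI in a culture."""
    if stock_pfu_per_ml <= 0:
        raise ValueError("stock_pfu_per_ml must be positive")
    total_cells = culture_volume_ml * cells_per_ml
    required_pfu = moi * total_cells
    return required_pfu / stock_pfu_per_ml


# Hypothetical setup: 10 ml culture at 1e8 cfu/ml, phage stock at 1e10 pfu/ml.
for moi in (0.0001, 0.001, 0.01, 0.1, 1, 10):
    volume_ul = 1000.0 * phage_volume_ml(moi, 10.0, 1e8, 1e10)
    print(f"MOI {moi}: {volume_ul:.3g} µl of stock")
```

For the lowest MOIs the required volume becomes too small to pipette accurately, so the stock would normally be serially diluted first.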
To determine the relatedness of the phage genome to the homologous phages, Mauve 61 and EasyFig 62 software were used for progressive multiple genome alignment and for comparison of the nucleic acid and amino acid sequences of the phage with those of the homologous phage sequences available in the NCBI database. Phylogenetic analysis of the phage DNA polymerase and major capsid protein was performed using the "One Click" tool of the phylogeny.fr server (http://www.phylogeny.fr/) 63. The complete genomic sequence of the phage was submitted to the NCBI database under accession no. MG099933.1.

Statistical analysis.
The experiments were done in triplicate when required, and the mean ± SD was reported. The t-test, two-way ANOVA and repeated-measures one-way ANOVA were used for statistical analyses in GraphPad Prism 6.05 software. Prevalence rates were compared using the Pearson chi-square test in SPSS software 16.0. A P value ≤ 0.05 was considered significant.
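As an illustration of the prevalence comparison mentioned above, the sketch below computes a Pearson chi-square statistic for a 2×2 table using only the Python standard library. It is not the SPSS procedure used in the study: the function name and the example counts are hypothetical, and no continuity correction is applied. For one degree of freedom, the P value equals erfc(sqrt(chi2 / 2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = gene present / absent, columns = phage sensitive / resistant.
chi2, p = chi_square_2x2(20, 5, 5, 20)
print(f"chi-square = {chi2:.2f}, P = {p:.2g}")
```

With small expected cell counts, an exact test would usually be preferred over this approximation.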