Mycobacteriophages are viruses that infect mycobacteria. More than 1,400 mycobacteriophage genomes have been sequenced, coding for over one hundred thousand proteins of unknown function. Here we investigate mycobacteriophage Giles-host protein-protein interactions (PPIs) using yeast two-hybrid screening (Y2H). A total of 25 reproducible PPIs were found for a selected set of 10 Giles proteins, including a putative virion assembly protein (gp17), the phage integrase (gp29), the endolysin (gp31), the phage repressor (gp47), and six proteins of unknown function (gp34, gp35, gp54, gp56, gp64, and gp65). We note that overexpression of the proteins is toxic to M. smegmatis, although whether this toxicity and the associated changes in cellular morphology are related to the putative interactions revealed in the Y2H screen is unclear.
Bacteriophages are the most abundant and diverse biological entities, with an estimated 10³¹ phage particles in the biosphere1. Mycobacteriophages are viruses of mycobacterial hosts, including Mycobacterium smegmatis and Mycobacterium tuberculosis, the causative agent of tuberculosis2. More than 1,400 completely sequenced mycobacteriophage genomes have been described (http://phagesdb.org)3, which not only have facilitated development of tools for mycobacterial genetics, but also may have therapeutic potential4. However, these genomes display high genetic diversity and encode an abundance of genes of unknown function5,6.
Determining the functions of phage genes will elucidate their mechanism of infection7. Efficient phage DNA replication is metabolically demanding, and phages often reprogram host nucleotide metabolism to their own benefit8. Transcriptomics and metabolomics studies in cyanobacteria show how phage can reroute the host metabolism, such as towards de novo fatty-acid synthesis, or to generate conditions suitable for virus assembly9. The increased rate of fatty acid biosynthesis, including triacylglycerol (TAG), may be a common strategy of viruses: lipid droplets/bodies that mainly contain TAGs serve as a source of energy during phage assembly10. This lipid remodeling may be an evolutionarily conserved strategy used by viruses to hijack host cell machinery. Also, phages are known to exploit other host biological processes including stress response and host replication11.
Many phage proteins regulate the host cell machinery by protein-protein interactions (PPIs) to propagate their progeny12,13. This offers an advantage to the phage by facilitating the production of suitable conditions for phage propagation14. For example, viruses can modulate the host glycome either by regulating host glycosyltransferases or by producing their own glycosyltransferases. The virus-encoded glycosyltransferases are predicted to be involved in a variety of virus–host interactions15. Other phage-encoded homologues of host proteins have been shown to act as endonucleases, sigma factors, RNases or heat-shock proteins16,17,18. However, only a few phages have been systematically studied for molecular interactions between phage and host proteins19.
Mycobacteriophage Giles is a temperate phage that forms stable lysogens in M. smegmatis20. It has a 53,746 bp genome and contains 78 putative protein-coding genes. The repressor (gene 47) is positioned approximately 65% of the genome length from the left genome end, and genes to its left include the virion structure and assembly genes, the integration cassette, and the lysis genes. Many of the genes to its right are of unknown function, but include those coding for a recombination system (RecET-like), DnaQ, DNA methylase, RuvC, and WhiB20. Transcriptomic studies show that the genes to the right of the repressor are expressed early in lytic growth, and those in the left part of the genome are expressed late in lytic growth20. A broad search to define Giles genes needed for lytic growth showed that more than half of the non-structural genes are dispensable for plaque formation, although many show minor defects in phage production20. These genetic and transcriptomic analyses support further analysis of mycobacteriophage Giles to elucidate protein-protein interactions21.
Here, we implemented a search for Giles-encoded proteins that interact with host proteins and may play roles in reprogramming of the host machinery. Ten Giles proteins were screened against host proteins using a yeast two-hybrid (Y2H) approach, and a network of 25 interactions was identified. Several of these interactions were pursued with phenotypic screens.
Selection of Giles proteins
A set of 10 Giles-encoded proteins (Table 1) was selected for screening against an M. smegmatis genomic library in a Y2H screen (Fig. 1A). These proteins represent a variety of expression and functional features, and include a putative virion assembly protein (gp17), the phage integrase (gp29), the endolysin (gp31), the phage repressor (gp47), and six proteins of unknown function (gp34, gp35, gp54, gp56, gp64, and gp65). Prior studies have shown that the genes encoding the integrase (29) and repressor (47), together with genes 54 and 56, are not required for lytic growth22. In contrast, gene 64 is essential and is implicated in phage DNA replication22. The endolysin is required for lytic growth, and the inability to delete genes 34, 35, and 65 suggests that these may also be required22. All ten proteins have few known interactions with other Giles-encoded proteins21, hence we speculated that they are likely to interact with the host.
Library screens detect interactors for Giles proteins
Baits were screened against a custom-made M. smegmatis genomic library as described in the Methods. Approximately eight positive clones for each bait (Giles protein) were selected and sequenced to identify interacting prey partners from the host, except for baits gp54, gp56 and gp64, for which 12–16 clones each were sequenced. Positive clones were then retested using an array-based Y2H screen (Fig. 1B), and a total of 78 positive clones were sequenced. After removing redundant sequences, we identified 59 Giles-host PPIs, and these were retested in an independent Y2H experiment using freshly prepared clones for the host prey proteins (Fig. 1B); the sequences of the host proteins (interactors) are shown in Table S1. The interactions that were reproducible in the retest screens were used to construct a Giles-host network of 25 interactions (Fig. 2).
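The filtering steps described above (collapsing sequenced clones into unique bait-prey pairs, then keeping only the pairs that retest positive) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the record layout, the function names, and the toy bait and prey labels are all hypothetical.

```python
from collections import Counter


def unique_pairs(clones):
    """Collapse sequenced clones into unique (bait, prey) pairs.

    Returns a Counter mapping each pair to the number of clones supporting it.
    """
    return Counter((clone["bait"], clone["prey"]) for clone in clones)


def reproducible_pairs(pairs, retest_results):
    """Keep only the pairs that scored positive in an independent retest.

    Pairs missing from retest_results are treated as not reproduced.
    """
    return sorted(pair for pair in pairs if retest_results.get(pair, False))


# Hypothetical toy data; labels and outcomes are invented for illustration.
toy_clones = [
    {"bait": "baitA", "prey": "preyX"},
    {"bait": "baitA", "prey": "preyX"},  # sibling clone of the same pair
    {"bait": "baitA", "prey": "preyY"},
    {"bait": "baitB", "prey": "preyZ"},
]
toy_retest = {
    ("baitA", "preyX"): True,
    ("baitA", "preyY"): False,
    ("baitB", "preyZ"): True,
}

toy_pairs = unique_pairs(toy_clones)
toy_network = reproducible_pairs(toy_pairs, toy_retest)
```

In this toy case four clones collapse to three unique pairs, two of which survive the retest. Keeping the clone counts alongside the pairs makes it easy to see which interactions were supported by more than one clone.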
Size and domains of interacting fragments
For four of the putative Giles-host interactions, more than one independent interacting clone was isolated (two each for Giles gp17 and gp56, and two and four for the two host proteins interacting with gp54; Fig. 3A). However, the host clones are identical and thus likely represent sibling clones in the library.
Multiple positive prey clones encoding the same fragments of interacting proteins were found for three Giles proteins (gp17, gp54 and gp56). For example, gp54 interacts with glutamate synthase 1 (MSMEG_6459) and a MalT-like (maltose-transcriptional) regulator (MSMEG_4430), for which two and four fragments encoding the same protein region were found, respectively. These fragments helped to identify the interacting domain or region within the interactors. For example, only the GltS domain of glutamate synthase 1 (MSMEG_6459) was found to interact with gp54 (Fig. 3A). However, neither the library nor the screens were saturated, so additional or overlapping fragments may have been missed. The interaction domain (or the protein fragment encoded by the clone sequence) for each interactor is shown in Table 1. Given that our prey library was size fractionated (see Methods), most interacting domains were in the range of 100–200 amino acids (Fig. 3B).
Phage proteins targeting similar processes in different hosts
Although the Giles-host protein interactions seem robust and reproducible in the Y2H screen, they could be spurious positive hits resulting from ‘sticky’ protein interactions, or they may reflect biologically relevant interactions involved in the growth of phage Giles. Because phage-host interactomes have been previously described for Streptococcus phages Cp-1 and Dp-1 and for E. coli phage lambda, we searched for interactions that are shared between these data sets (Table 2). In general, the interactomes are quite different from each other, likely reflecting the genetic diversity of these phages. However, phage proteins attacking the same host pathway or protein may reflect shared infection strategies. For instance, we note that both Giles gp17 and Dp-1 gp9 appear to interact with the host PhoU protein23. Whether this interaction is relevant for the phage remains unclear, however, given that Giles gp17 and Dp-1 gp9 are not known to be functionally related (the function of Dp-1 gp9 is unknown; Giles gp17 is a putative tail assembly protein).
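The cross-interactome comparison described above amounts to grouping phage proteins by the host target they bind and reporting targets hit by more than one phage. A minimal sketch in Python follows, assuming each interactome is available as a set of (phage protein, host target) pairs keyed by a shared host annotation such as the protein name; the function name and data layout are hypothetical. Only the two PhoU pairs come from the text; the remaining entries are placeholders.

```python
from collections import defaultdict


def shared_host_targets(interactomes):
    """Return host targets bound by proteins from at least two different phages.

    interactomes maps a phage name to a set of (phage_protein, host_target) pairs.
    The result maps each shared target to a sorted list of (phage, phage_protein).
    """
    by_target = defaultdict(set)
    for phage, pairs in interactomes.items():
        for phage_protein, host_target in pairs:
            by_target[host_target].add((phage, phage_protein))
    return {
        target: sorted(hits)
        for target, hits in by_target.items()
        if len({phage for phage, _ in hits}) > 1
    }


# The PhoU pairs are from the text; "target_1" and "target_2" are placeholders.
example_interactomes = {
    "Giles": {("gp17", "PhoU"), ("gp54", "target_1")},
    "Dp-1": {("gp9", "PhoU"), ("gpX", "target_2")},
}

shared = shared_host_targets(example_interactomes)
```

Matching on a common annotation is the simplest choice; a comparison across hosts as different as Streptococcus, E. coli and M. smegmatis would in practice need an orthology or functional mapping between host proteins, which this sketch does not attempt.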
The Giles gp54 - MSMEG_4430 and gp64 - MSMEG_3746 interactions are not required for phage infections
To further explore the relevance of the Giles-host interactions, we determined whether Giles mutants with deletions of interacting non-essential genes had altered plating efficiencies on M. smegmatis mutants with deletions of the genes encoding the interacting host proteins (Fig. 4). However, no changes in plating efficiencies were observed, raising doubts as to whether these interactions are biologically relevant, although it remains plausible that some interactions are involved in roles that are not reflected in the plating assay.
Phenotypes of Giles protein overexpression in bacteria
It has been shown previously that overexpression of phage proteins can be toxic to growth of the host and that, for at least some of these proteins, the toxicity is mediated by interactions with host proteins13. We therefore overexpressed eight of the Giles proteins in M. smegmatis and determined whether expression inhibits growth and whether morphological changes occur in the cells. All eight genes inhibited M. smegmatis growth when overexpressed (Fig. 5A), but it is unclear whether the toxicity is a result of non-specific consequences of overexpression.
We observed that overexpression of all eight Giles proteins induced some cell lengthening, although no more than three-fold at the maximum (Fig. 5B). Because all of the genes behaved similarly, this is likely a non-specific result of general stresses placed on the cells under these conditions. Moreover, similar changes were observed when the same genes were expressed in E. coli, further supporting these as non-specific consequences of overexpression (Fig. 5C). We note that overexpression of Giles gp29 appears to induce some cell branching, perhaps as a consequence of DNA damage associated with non-specific DNA cleavage by the Giles integrase (Fig. 5C). Other cell deformations, including polar bulging and improper septal positioning, are observed when Giles genes 17, 64, and 54 are overexpressed (Fig. 5D, S3). We note that although such changes could be associated with inhibition of cell function mediated through the interactions identified in the Y2H screen, it remains possible that they reflect interactions between phage and host proteins that were not identified in the Y2H experiment.
Y2H data are often considered unreliable and fraught with a large number of false positives. The reasons for this are two-fold. First, many false positives can result from non-reproducible growth in the Y2H screen itself. Second, false positives can arise from proteins that are either ‘sticky’ or improperly folded. The first explanation seems unlikely for the Giles-host interactions described here, as the initial hits were extensively retested and irreproducible positives eliminated. We can also rule out ‘stickiness’, as this typically results in a large number of interactions, which we did not find. Improper folding of the mycobacteriophage-encoded proteins in yeast may be more likely, although we have no evidence that phage proteins fold less well than others. We would argue that many interactions found in our study do in fact happen but may not be physiologically relevant. Given that irrelevant interactions are unlikely to have any negative impact on phage replication, there should be little selective pressure to lose such interactions in phage evolution. Finally, it is also likely that some potential interactions were not identified, as false negatives are also common in Y2H screens24,25.
We note that for most of the proposed interactions, it is difficult to envisage what role they play in the biology of the phage. For example, the interaction between Giles gp17 and M. smegmatis PhoU (MSMEG_5776) is one of the strongest (as measured by a 3-amino-triazole (3-AT) concentration of 50 mM) and seemingly robust in the Y2H screen. However, Giles gp17 is a putative tail chaperone protein, and is unlikely to play any role in phosphate metabolism during infection.
One of the more plausible interactions we observed is between Giles gp64 and the host CTP synthase (MSMEG_3746). Deletion of Giles 64 results in a defect in phage DNA replication22, and because CTP synthase is involved with nucleotide metabolism, interactions between the two proteins would conceivably play a role in phage DNA replication or its control. We note, however, that MSMEG_3746 is not essential for M. smegmatis growth, and it is more likely that Giles gp64 plays a more direct role in phage DNA replication. Nonetheless, this interaction might be worthwhile examining further to determine if the two proteins interact biochemically.
In summary, we have described here an initial screen to identify phage Giles-encoded proteins that interact with M. smegmatis proteins. It is plausible that some of these are relevant to the growth of phage Giles, although our screen, like other Y2H screens, produced many positive clones that may be physiologically irrelevant and thus will require substantial further analysis to elucidate which are of greatest interest. The phenotypes resulting from overexpression of Giles phage proteins are consistent with at least some of these resulting from interactions with host proteins and inactivation of their function, but it remains to be seen if these are the same as those identified in the Y2H experiment, or whether they result from different interactions that were missed as false negatives in the Y2H screen.
Materials and Methods
Bacterial strains and growth conditions
M. smegmatis mc²4517 was cultured in Difco™ Middlebrook 7H9 broth (BD) supplemented with ADC (5 g/L albumin, 2 g/L dextrose, 3 g/L catalase) and 0.05% Tween-80. Selection was performed with 100 μg/ml hygromycin and 50 μg/ml kanamycin in both liquid and solid media26. E. coli TOP10 and DH5α were used for cloning and were cultured in LB broth or on LB agar. E. coli was selected with 150 μg/ml hygromycin and 35 μg/ml chloramphenicol in both liquid and solid media. All strains were grown at 37 °C. For protein expression, E. coli BL21 cells were used. All expression experiments were done at 30 °C unless otherwise mentioned. For vector details, see Mehla et al. (2015)27.
Construction of a random genomic Y2H prey library of M. smegmatis
A random genomic fragment library from M. smegmatis mc²155 was constructed as follows. First, genomic DNA was isolated from a stationary 1 L M. smegmatis culture grown in 7H9 medium using the Belisle (1998) protocol28. Subsequently, 100 µg DNA was partially digested with AluI, after which the 400–1500 bp fraction was extracted from a 1% agarose gel. Blunt DNA fragments (200 ng) were then ligated into 0.5 µg pENTR 1A (Thermo Scientific) (1:1 molar ratio), which had been cut with DraI and EcoRV and dephosphorylated. The ligation mixture was transformed into electrocompetent E. coli MegaX DH10B cells (Thermo Scientific). Transformants (5.4 × 10⁶, 159× redundancy of the 6.99 Mbp genome) were pooled and stored at −80 °C (20% glycerol). Next, pENTR 1A/M. smegmatis library plasmid DNA was isolated from a 4 ml overnight culture, and 500 ng was subcloned using a Gateway LR reaction into 500 ng pGADT7g yeast two-hybrid prey vector, which was also transformed into electrocompetent E. coli MegaX DH10B cells. Again, 1.7 × 10⁶ transformants (8× redundancy, taking into account the correct reading frame fusion between the Gal4p activation domain and the inserted coding sequence) were pooled and used for a random genomic fragment library prep (average length of 950 bp).
The ORFs for the Giles proteins were cloned into the Gateway compatible Y2H vector pGBGT7g using LR reaction of Gateway cloning as per supplier’s instructions (Invitrogen). The ORFs encoding Giles proteins, cloned as baits into pGBGT7g, were transformed into Y2H strain AH10929. For protein expression in E. coli and M. smegmatis, the ORFs encoding Giles proteins were also cloned in expression specific shuttle vector pDESTsmg26 using Gateway LR reactions.
Yeast Two-Hybrid Screening
To characterize Giles-host interactions, we screened 10 phage proteins against a random genomic library of M. smegmatis. We used a Y2H library screening approach followed by array-based Y2H screens to verify the interactions found in the library screen30. Thus, Giles-host protein interactions were detected using both library- and array-based screens. The background growth was suppressed by 3-AT (3-amino-triazole) in Y2H screens to minimize the rate of false positives. The 3-AT score was calculated for PPIs as described previously21.
Genomic library screens
The constructed random genomic library (see section above) was transformed into Y2H mating-compatible yeast strain Y187 and screened against the selected Giles phage proteins30. The Giles phage proteins were selected based on their essentiality22 and the coverage in our recently published Giles interactome21. Interacting preys from positive clones from library screens were identified by colony PCR and sequencing. Sequencing was done using a single forward primer at Eurofins Genomics, Louisville KY. The sequences were then analyzed for prey identification using blastN against the M. smegmatis genome. Colonies with no sequence reads were removed at this step.
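The prey identification step above can be illustrated with a short Python sketch that picks the best genomic hit for each sequenced clone. The sketch assumes that blastN was run with tabular output (`-outfmt 6`), whose twelve default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore; the output format, the function name, the clone labels and the example rows are assumptions made for illustration and are not taken from the study.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast_tabular_text):
    """Return the highest-bitscore hit for each query (clone) sequence.

    Clones with no hit do not appear in the result, so they drop out here.
    """
    best = {}
    reader = csv.DictReader(
        io.StringIO(blast_tabular_text),
        fieldnames=BLAST6_COLUMNS,
        delimiter="\t",
    )
    for row in reader:
        score = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or score > current["bitscore"]:
            best[row["qseqid"]] = {
                "sseqid": row["sseqid"],
                "sstart": int(row["sstart"]),
                "send": int(row["send"]),
                "bitscore": score,
            }
    return best


# Hypothetical example rows (two hits for clone_01, one for clone_02).
example_rows = [
    ["clone_01", "chromosome", "99.5", "400", "2", "0", "1", "400", "1000", "1399", "0.0", "730"],
    ["clone_01", "chromosome", "85.0", "120", "18", "0", "10", "129", "50000", "50119", "1e-20", "95.3"],
    ["clone_02", "chromosome", "100.0", "350", "0", "0", "1", "350", "20349", "20000", "0.0", "647"],
]
example_text = "\n".join("\t".join(row) for row in example_rows)
hits = best_hits(example_text)
```

A hit on the minus strand is reported with sstart greater than send, as for the hypothetical clone_02. Assigning each hit to a gene would additionally require the genome annotation, and checking that the insert is fused in frame with the activation domain would require the read itself; neither is covered here.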
Array-based Yeast Two-Hybrid
Once the prey proteins were identified from library screening, the plasmids for interacting prey clones were isolated from yeast cells. Briefly, the cells were treated with Zymolyase®-100T (Sunrise Science Products, Inc.) followed by a standard protocol for plasmid isolation as per the supplier’s manual (Macherey-Nagel Inc.). The isolated prey plasmids were then transformed back into the Y2H-compatible yeast strain Y187 as previously described29. Then, the interactions between Giles baits and identified host prey proteins were tested using the array-based Y2H method as previously described27.
In vivo validation of protein-protein interactions
To validate selected Giles-host interactions, M. smegmatis knockouts (KOs) were constructed. Primers MSMEG_XXXX A and B were used to amplify the 5′ end of the gene of interest in M. smegmatis from mc²155 DNA. Next, primers C and D were used to amplify the 3′ end of the gene of interest in M. smegmatis from mc²155 DNA (Table S2). The PCR products of these two reactions were then used, along with a purified hygromycin resistance gene, in a PCR to amplify the 1.3 kb substrate with primers A and D. This substrate was then electroporated into recombineering mc²155 cells, and recombinants were selected on hygromycin-containing media. Colonies were verified by PCR. KO mutants were then plated in a top-agar overlay onto 7H10 hygromycin plates. Phages were diluted in phage buffer and spotted on the overlay. The Giles gene deletion mutants were constructed as reported previously22.
Protein expression in E. coli and Mycobacteria
All the selected phage proteins were expressed both in E. coli and in M. smegmatis mc²4517 (kindly provided by Prof. Shaun Lott) using a Gateway-compatible shuttle vector (pDESTsmg). The vector and methodology details are described elsewhere26. The electrocompetent M. smegmatis cells were prepared in the lab as described previously31.
Briefly, the expression constructs of Giles phage proteins were electroporated into the electrocompetent M. smegmatis cells using a BioRad Gene Pulser (R = 1000 Ω, Q = 25 μF and V = 2.5 kV). The cells were plated on 7H9 medium supplemented with ADC (albumin-dextrose-catalase), Tween, 100 µg/ml hygromycin and 50 µg/ml kanamycin. About 5–6 clones were tested for each phage protein for expression on hard agar. For induction in mycobacteria, acetamide (0.2 mM) was added to the growth media or to solid agar plates.
For expression in E. coli BL21(pLys), the cells were transformed with expression constructs encoding Giles proteins and protein expression was induced using IPTG (0.5 mM) at 30 °C. For E. coli, LB plates with ampicillin (100 µg/ml) and chloramphenicol (35 µg/ml) were used.
Thus, Giles proteins were expressed in M. smegmatis (7H9 medium) and in E. coli (LB medium), in each case both on solid agar and in broth.
Light microscopy and image analysis
The cells were stained with FM4-64 (Synapto Red C2, Biotium Inc.) and DAPI to visualize the cell membrane and nucleoid, respectively. The cells were imaged on an Olympus BX41 microscope at 100x in a dark room. Images were captured with an AmScope MU1400 microscope digital camera. The ImageJ software32 (National Institutes of Health) was used to measure cell dimensions and length.
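Once cell lengths have been measured in ImageJ, the lengthening caused by overexpression can be summarized as a fold change in mean cell length relative to a control strain. The following Python sketch shows one way to do this; it is an illustration only, and the function name, the strain labels and the toy measurements are hypothetical rather than data from this study.

```python
from statistics import mean


def length_fold_change(lengths_by_strain, control):
    """Return mean cell length of each strain divided by the control mean.

    lengths_by_strain maps a strain label to a list of cell lengths
    (any consistent unit, for example micrometres exported from ImageJ).
    """
    control_mean = mean(lengths_by_strain[control])
    return {
        strain: mean(lengths) / control_mean
        for strain, lengths in lengths_by_strain.items()
        if strain != control
    }


# Hypothetical toy measurements, invented for illustration.
toy_lengths = {
    "empty_vector": [2.0, 4.0],
    "construct_A": [6.0, 6.0],
}
toy_fold_changes = length_fold_change(toy_lengths, control="empty_vector")
```

The mean is used here for simplicity; because filamentous cells can skew a length distribution, a median or a per-cell distribution plot may be the more informative summary.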
The protein interactions from this publication have been submitted to the IMEx consortium (http://www.imexconsortium.org) through IntAct33 and assigned the identifier IM-26164.
This work was supported by National Institutes of Health grant R01 GM109895.