Rgg-Shp regulators are important for pneumococcal colonization and invasion through their effect on mannose utilization and capsule synthesis

Microbes communicate with each other by using quorum sensing (QS) systems and modulate their collective ‘behavior’ for in-host colonization and virulence, biofilm formation, and environmental adaptation. The recent increase in genome data availability has revealed several putative QS circuits in microbial pathogens, but many of these have not yet been functionally characterized, despite their possible utility as drug targets. To increase the repertoire of functionally characterized QS systems in bacteria, we studied Rgg144/Shp144 and Rgg939/Shp939, two putative QS systems in the important human pathogen Streptococcus pneumoniae. We find that both of these QS circuits are induced by short hydrophobic peptides (Shp) upon sensing sugars found in the respiratory tract, such as galactose and mannose. Microarray analyses using cultures grown on mannose and galactose revealed that the expression of a large number of genes is controlled by these QS systems, especially genes encoding essential physiological functions and virulence-related genes such as those of the capsular locus. Moreover, the array data revealed evidence for cross-talk between these systems. Finally, these Rgg systems play a key role in colonization and virulence, as deletion mutants of these QS systems are attenuated in mouse models of colonization and pneumonia.


Materials and Methods
Bacterial strains and growth conditions. Strains used in this study are listed in Table S1. Routinely, S. pneumoniae strains were grown in brain heart infusion (BHI) broth, or on blood agar plates supplemented with 5% (v/v) defibrinated horse blood at 37 °C. Chemically defined medium (CDM) supplemented with different sugars was also used for growth of pneumococcal strains. Where appropriate, spectinomycin (100 μg/ml) or kanamycin (250 μg/ml) was added to the culture medium. Escherichia coli strains Top10 (Invitrogen) and DH5α were used for cloning and were grown in Luria broth (LB) or on Luria broth agar with kanamycin (150 μg/ml) or ampicillin (100 μg/ml).
Synthetic peptides. Synthetic peptides were used to test the activity of Shp144 and Shp939. Unlabelled synthetic peptides were purchased from Cova Lab at >95% purity. The amino acid sequences of these peptides are given in Table S2. The unlabelled peptides were reconstituted as 6 mM stocks in dimethyl sulfoxide (DMSO) and stored at −80 °C.

Construction of genetically modified strains, and transcriptional reporters.
To construct the rgg/shp insertion-deletion mutants in strain D39, the splicing by overlap extension (SOEing) PCR method was used as previously described 21,22. Briefly, the genetic locus surrounding the region to be mutated was amplified and fused with a spectinomycin resistance gene using the primers listed in Table S3. Successful insertion-deletion was confirmed by PCR and DNA sequencing. The mutated strains were designated as ∆rgg144 and ∆rgg939.
For the construction of genetically complemented strains, the rgg144 and rgg939 coding sequences and their putative promoter regions were amplified and cloned into pCEP as described previously 22. The amplicons were transformed into ∆rgg144 and ∆rgg939, respectively. The transformants were selected for both spectinomycin and kanamycin resistance, and confirmed by PCR. The complemented strains were designated as ∆rgg144Comp and ∆rgg939Comp. Construction of transcriptional reporters followed the general method described previously 18. After the identification of the putative promoter regions (P) of rgg144 and shp939 using promoter recognition software, these regions were amplified and cloned into an integrative reporter plasmid, pPP2 23.
Glucuronic acid assay. Capsular polysaccharide (CPS) production was quantified by the method described previously 24. Five hundred microliters of pneumococcal culture grown in the presence of 55 mM mannose or glucose from late exponential phase (approximately OD600 1.1 for wild type and 0.7 for the mutants) was mixed with 100 µl of 1% (v/v) Zwittergent 3-14 detergent (Sigma-Aldrich) in 100 mM citric acid (pH 2.0), and then the mixture was incubated at 50 °C for 20 min. The CPS was precipitated with 1 ml of absolute ethanol. The pellet was dissolved in 200 µl distilled water, and 1200 µl 12.5 mM borax (Sigma) in H2SO4 was added. The mixture was vigorously vortexed, boiled for 5 min, and cooled, and then 20 µl 0.15% 3-hydroxydiphenol (Sigma) was added. The absorbance of the mixture at 520 nm was measured, and the glucuronic acid content was determined from a standard curve of glucuronic acid (Sigma).
Scientific REPORTS | (2018) 8:6369 | DOI: 10.1038/s41598-018-24910-1
β-galactosidase activity assay. β-galactosidase activity was measured as described before 22, using cells grown anaerobically in CDM supplemented with 55 mM of selected sugars. Unless otherwise stated, the bacterial cells were harvested in the late-exponential phase of growth.
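Reporter activities in this paper are given in Miller units (MU). The assay itself follows reference 22, so the exact calculation is not restated here; the sketch below assumes the classical Miller formula, MU = 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600). The function names `miller_units` and `fold_induction` are hypothetical helpers introduced only for illustration.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Classical Miller formula (assumed, not taken from the paper's protocol).

    od420: absorbance of o-nitrophenol plus light scattering
    od550: light scattering by cell debris
    od600: culture density at harvest
    minutes: reaction time; volume_ml: culture volume assayed
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def fold_induction(induced_mu, basal_mu):
    """Ratio of induced to basal reporter activity."""
    if basal_mu <= 0:
        raise ValueError("basal_mu must be positive")
    return induced_mu / basal_mu
```

Expressing induction as a ratio of two MU values, as done for the reporter strains in the Results, removes the dependence on the absolute scale of the assay.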
RNA extraction and purification. The extraction of RNA was done as described previously 21,25 . The pneumococcal cultures were grown in CDM supplemented with mannose or galactose under micro-anaerobic conditions until mid-exponential phase. The bacterial cultures were treated with TRIZOL and chloroform, and then precipitated with 2-propanol. Finally, the RNA was treated with amplification grade DNase I, and subsequently purified with an RNeasy Mini kit (Qiagen).
Microarray experiments. S. pneumoniae D39 and its isogenic mutant strains were grown anaerobically in CDM supplemented with either 55 mM galactose or mannose as the sole carbon source. The pneumococcal pellet was harvested at early exponential phase, at an OD600 of approximately 0.3. The experiments were repeated with four biological replicates. The MicroPrep software package was used to obtain the microarray data from the slides. The CyberT implementation of a variant of the t-test (http://bioinformatics.biol.rug.nl/cybert/index.shtml) was performed and false discovery rates (FDRs) were calculated 26. Genes were considered differentially expressed at a Bayesian p-value of <0.001, an FDR of <0.05, and a fold-change cut-off of two. All other procedures for the DNA microarray experiments and data analysis were performed as described before 27.
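The selection criteria stated above (p < 0.001, FDR < 0.05, two-fold change) can be sketched as a simple filter. This is a minimal illustration, not the CyberT pipeline: the `GeneResult` record, the function names, and the convention that `ratio` is mutant over wild type are all assumptions made for this example.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str        # locus tag or gene name
    ratio: float     # assumed convention: mutant / wild-type expression
    p_value: float   # Bayesian p-value from the t-test variant
    fdr: float       # false discovery rate


def is_differentially_expressed(r, p_cut=0.001, fdr_cut=0.05, fold_cut=2.0):
    """Apply the three thresholds; the fold cut-off is symmetric (up or down)."""
    passes_fold = r.ratio >= fold_cut or r.ratio <= 1.0 / fold_cut
    return r.p_value < p_cut and r.fdr < fdr_cut and passes_fold


def split_by_direction(results):
    """Return (higher_in_mutant, lower_in_mutant) gene lists.

    Genes higher in a deletion mutant are candidates for negative regulation
    by the deleted regulator; genes lower in the mutant, for positive regulation.
    """
    selected = [r for r in results if is_differentially_expressed(r)]
    higher = [r.gene for r in selected if r.ratio > 1.0]
    lower = [r.gene for r in selected if r.ratio < 1.0]
    return higher, lower
```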
Microarray data for selected genes was confirmed by quantitative reverse transcriptase PCR as described previously 1 . First strand cDNA was synthesized using approximately 1 μg of DNase-treated total RNA, immediately after isolation, random hexamers and 200 U of SuperScript III reverse transcriptase (Invitrogen) at 42 °C for 55 min. Three independent RNA preparations were used for qRT-PCR analysis.
In vivo virulence studies. To determine the virulence of pneumococcal strains, 8-10-week-old female CD1 outbred mice (Charles River, UK) were lightly anesthetized. For the pneumonia model, a 50 µl inoculum containing approximately 2 × 10^6 CFU in PBS was administered into the nostrils, dropwise 21,28. Mice were monitored for clinical signs (progressively starry coat, hunched appearance and lethargy) 29 for 7 days. The mice that reached the very lethargic stage were considered to have reached the end point of the assay, and were killed humanely. The time to reach this point was considered as the 'survival time'. Mice surviving for 7 days post-infection were deemed to have survived the infection. Median survival time was analyzed by the Mann-Whitney U test. To determine the development of bacteremia in each mouse, approximately 20 µl of venous blood was collected at predetermined time points after infection, and viable counts were determined. For the colonization model, CD1 mice were administered approximately 5 × 10^5 CFU S. pneumoniae/mouse in 20 μl PBS. The colonization of the nasopharynx by pneumococci was determined as described previously 2,30. Briefly, at 0 and 7 days post-infection, mice were deeply anesthetized with 5% (v/v) isoflurane over oxygen and then killed by cervical dislocation. Mice were pinned onto a dissection board face up, and the mandible was removed. After introducing two lateral incisions (left and right) starting from the soft palate toward the pane, the palate was pulled back with forceps. The exposed nasopharyngeal tissue was collected, transferred into 10 ml of sterile PBS, weighed, and then homogenized with an Ultra Turrax blender (Ika-Werke, Staufen im Breisgau, Germany). Viable counts in homogenates then were determined.
Nasopharyngeal tissue was collected and transferred into 5 ml of sterile PBS. Tissue samples were homogenized, and viable counts in homogenates were determined by serial dilution in sterile PBS, and plating on blood agar plates. Data were analyzed by analysis of variance followed by the Bonferroni posttest. P values of <0.05 were considered statistically significant.
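Survival times in the pneumonia model were compared with the Mann-Whitney U test. As a minimal sketch of what that statistic measures, the code below computes the two U values from mid-ranks (ties share the average rank). It returns only the statistic, not a p-value, and the function name `mann_whitney_u` is a hypothetical helper rather than the software used in the study.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two samples, using mid-ranks for tied values."""
    # Tag each observation with its group (0 = a, 1 = b) and sort by value.
    combined = sorted(
        (value, group)
        for group, sample in enumerate((group_a, group_b))
        for value in sample
    )
    n = len(combined)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        # Ranks i+1 .. j are shared; each tied value gets their average.
        mid_rank = (i + 1 + j) / 2.0
        count_a = sum(1 for k in range(i, j) if combined[k][1] == 0)
        rank_sum_a += mid_rank * count_a
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

A U value of zero for one group means every observation in that group is smaller than every observation in the other, which is the most extreme separation the test can register.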
We also evaluated the expression of rgg genes in vivo. Pneumococci in infected tissues were collected and the expression of each gene was determined in the nasopharynx and lungs relative to blood as described previously 21.
Ethics statement. In vivo experiments were performed under appropriate project (permit no. 60/4327) and personal (permit no. 80/10279) licenses in line with the United Kingdom Home Office guidelines under the Animals Scientific Procedures Act 1986, and the University of Leicester ethics committee approval. The protocol was approved by both the U.K. Home Office and the University of Leicester ethics committee. When required, the procedures were carried out under anesthesia with isoflurane. Animals were housed in individually ventilated cages in a controlled environment, and were frequently monitored after infection to minimize suffering. Every effort was made to reduce suffering and mice were humanely culled if they became lethargic.
In silico analyses of the distribution of Rggs. To identify Rggs in strain D39 we searched its genome for homologues of the prototypical Rgg, Streptococcus gordonii SGO0496 (AAA26968.1). To this end we used NCBI to perform a BLASTp search with default parameters and selected all sequences with an e-value below 1e-10. All Rggs identified in D39 are highlighted in the analysis by Fleuchot and colleagues 16. To broaden our search and to analyze the distribution of Rgg across pneumococcal strains and related species, we made use of a set of genomes from thirty-one S. pneumoniae, three Streptococcus pseudopneumoniae, eight Streptococcus mitis, six Streptococcus oralis, and one Streptococcus infantis strains. These genomes have been employed in previous work 9,14, and were selected from the first large-scale pneumococcal pangenome study 31, genomes from PCV-7 immunized children 32, as well as genomes from non-encapsulated strains that make up a distinct phyletic group within pneumococcus 33-36. Combined, these strains capture a variety of multilocus sequence types (MLSTs) and serotypes, as well as strains isolated from different disease states and geographic locations. These genomes were annotated using the RAST server 37. The predicted coding sequences were grouped into clusters of homologues employing a previously described clustering algorithm 38,39. Briefly, clusters are generated by parsing homology searches of all predicted proteins against all possible translations, where a cluster is defined as the group of genes within which each sequence shares at least 70% identity over 70% of its length with one or more of the other genes in the cluster. To identify the Rggs, we selected all clusters where at least one gene was annotated as Rgg, MutR, or GadR. All annotations were confirmed using the CDD NCBI tool, where the C-termini of sequences had hits to the Rgg/GadR/MutR family with e-values lower than 1e-04 40.
Moreover, we employed blastp, using the prototypical Rgg (AAA26968.1) as a query, to search a database of all these genomes for hits with e-values below 1e-10; this output is a subset of the clustering analysis.
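The cluster definition above (each member shares at least 70% identity over 70% of its length with one or more other members) amounts to single-linkage grouping of pairwise hits. The sketch below illustrates that idea with a union-find structure; it is not the published algorithm of references 38 and 39. The input format, the function name `cluster_homologues`, and the choice to require 70% coverage of both sequences in a pair are assumptions made for this example.

```python
def cluster_homologues(lengths, hits, min_identity=70.0, min_coverage=0.70):
    """Group sequences by single linkage over qualifying pairwise hits.

    lengths: dict mapping sequence name -> sequence length
    hits: iterable of (query, subject, percent_identity, alignment_length)
    Returns a list of clusters, each a sorted list of names.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, aln_len in hits:
        if identity < min_identity:
            continue
        # Assumed rule: the alignment must cover enough of both sequences.
        if aln_len / lengths[query] < min_coverage:
            continue
        if aln_len / lengths[subject] < min_coverage:
            continue
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])
```

Because linkage is transitive, two sequences can land in the same cluster without matching each other directly, as long as a chain of qualifying hits connects them.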

Results
Pneumococci encode seven putative Rggs, with variable distribution across the species. Our experimental studies were performed in the well-characterized D39 strain. In the D39 genome, we captured five putative Rggs: SPD0144, SPD0939, SPD0999, SPD1518, and SPD1952 (these correspond to a subset of predicted Rgg-like sequences 16). Their sequences have over 17% identity at the amino acid level to the Rgg prototype, S. gordonii Rgg (Genbank: AAA26968) (see Fig. S1) (www.ncbi.nlm.nih.gov). These sequences encode a putative HTH motif within the first 157 amino acids, a C-terminal Rgg domain, as well as the three conserved amino acids typical of Rggs that correspond to G8, R15 and W153 in the S. gordonii Rgg 41.
To broaden our analysis beyond a single strain, we investigated the distribution of Rgg across pneumococcal strains using a set of thirty-one pneumococcal genomes. These genomes were selected because they consist of highly curated whole-genome sequences and capture a lot of the diversity in the pneumococcal species; we have employed these strains in previous work 9,14 . The pneumococcal set includes genomes used in the first large-scale pneumococcal pangenome study 31 , genomes from PCV-7 immunized children 32 , as well as genomes from non-encapsulated strains that make up a distinct phyletic group within the pneumococcus 33,35,36,42 . Together these strains reflect a large variety of multilocus sequence types (MLSTs) and serotypes, as well as strains isolated from different disease states and geographic locations. The predicted coding sequences from this strain set were annotated with RAST and organized into gene clusters, defined as groups of sequences with 70% identity over 70% of the length 39 . We identified seven clusters with coding sequences annotated as Rgg, MutR, and/or GadR. The CDD NCBI tool was used to identify Rgg C-terminal domains and DNA-binding N-terminal domains in these sequences. Finally, supporting our annotation that these are members of the Rgg family, they share sequence similarity to the Rgg prototype in S. gordonii.
Three clusters, represented by SPD0144, SPD0999, and SPD1952, are present in all the pneumococcal strains. In contrast, the clusters represented by SPD0939 and SPD1518 are present in 54% and 38% of the strains in our pneumococcal set, respectively. Finally, two additional clusters were absent in D39 and are rare across pneumococcal strains; these are present in 19% and 3% of the pneumococcal strains (Fig. 1).
To expand our analysis and determine whether these Rgg are encoded in closely related species, we investigated three S. pseudopneumoniae, eight S. mitis, and six S. oralis genomes, as well as one S. infantis genome as an outgroup (Fig. 1). The orthologues of SPD999 are encoded in all the S. pseudopneumoniae, S. mitis, and S. oralis strains. The orthologues of SPD0144 and SPD1952 are common in these three-related species, and the remaining Rggs are either rare or absent in these related genomes.
Rgg/Shp144 and Rgg/Shp939 are quorum sensing systems. Gram positive bacteria use secreted peptides as signals for QS. A comprehensive in silico analysis of selected species in the genus Streptococcus revealed the presence of Rgg proteins associated with internalized small hydrophobic peptides 6,19. It was found that S. pneumoniae also has homologs of these systems. In this study, we focus on a core Rgg, Rgg/Shp144, and an accessory Rgg, Rgg/Shp939. We hypothesized that shp0144 and shp0939 encode signaling peptides for Rgg144 and Rgg939, respectively. To test this hypothesis, we employed cell-free culture supernatants from the wild type strain, which contains intact copies of rgg and shp, and from the isogenic mutants ∆rgg144, ∆shp144, and ∆rgg939/shp939. These supernatants were mixed with a reporter strain for shp144 that contains a Pshp144-lacZ fusion in the ∆shp144 mutant background. This mutant strain background was used to eliminate induction by the endogenously produced Shp144 (Fig. 2). Fresh uninoculated CDM was used as a negative control. Our results demonstrate that expression of Rgg144 and Shp144 from donor strains is required for transcription of shp144 in the recipient strain, since the activity levels of the reporter were significantly lower when exposed to supernatants from the ∆rgg144 and ∆shp144 strains than from the wild type (p < 0.001). Moreover, the mutation of rgg939/shp939 did not affect the activity level. The β-galactosidase activity of the reporter strain was 445.2 ± 7.0 MU for wild type and 416.5 ± 6.5 MU for the ∆rgg939/shp939. In contrast, the activity was 165.4 ± 2.3 MU, 157.3 ± 8.7 MU and 173.5 ± 3.8 MU (n = 4) for the ∆rgg144, ∆shp144 and CDM, respectively. These data strongly suggest that the products of shp144 and rgg144 determine the levels of a secreted molecule that can induce the shp144 promoter in recipient cells.
To investigate whether Shp144 is the secreted molecule, we utilized a synthetic form of this peptide. In streptococci the activity of Shp is located at the C-terminal ends of the processed peptides and multiple length peptide-pheromone variants have been identified 6,43. Thus we added variously sized synthetic versions of the C terminus of Shp144 to the extracellular milieu of the Pshp144 reporter strain (Fig. 3). A peptide corresponding to the C-terminal 12 amino acids of Shp144 induces a 2.5-fold change in the reporter, relative to the vehicle alone (p < 0.0001). To determine the minimum amino acid sequence length required for Shp144 activity we utilized synthetic peptides of different lengths. Peptides of 8 to 11 amino acids did not induce Pshp144; the peptide of 12 amino acids displayed maximal activity, with decreasing activity observed for peptides of 13-15 amino acids (Fig. 3). Together, these culture supernatant and synthetic peptide experiments show that rgg144 is required for Shp144 activity, and that Shp144 is a secreted peptide capable of autoinduction in producing and neighboring cells.
To investigate whether Rgg939/Shp939 is also a QS system, we performed a parallel set of experiments, using a reporter for Pshp939 (Pshp939-lacZ construct in a ∆shp939 background). Cell-free culture supernatants from the wild type strain did not induce the reporter strains. As the induction of QS systems requires the accumulation of pheromone above a threshold level, it is likely that the secreted Shp939 level in these conditions does not reach the threshold required to trigger QS. However, extracellular addition of a synthetic Shp939 corresponding to the C-terminal 8 residues (SHP939-C8) induced a dramatic increase in Pshp939 activity (Fig. 4). Without synthetic peptide, the β-galactosidase activity of the reporter strain was 3.7 ± 0.3 MU; similar results were obtained when the reporter strain was treated with the negative control, namely a scrambled Shp939-C8Rev peptide. In contrast, in the presence of SHP939-C8 and SHP939-C9, representing 8 and 9 amino acids in the C-terminal end of Shp939, respectively, Pshp939 was significantly induced (p < 0.001). These results strongly suggest that Shp939 is a secreted peptide capable of autoinduction in producing and neighboring cells, and that SHP939-C8 is the most active variant. Finally, our experiments also demonstrate that these SHPs are specific to their cognate Rgg. Synthetic SHP144-C12 does not induce Pshp939 (Fig. 4). Similarly, SHP939-C8 does not induce Pshp144 (Fig. 3).
Having determined that Shp144 and Shp939 are signaling molecules, and identified their most active variants, we then investigated dose dependent induction of Pshp144 and Pshp939. Increasing concentrations of SHP144-C12 and SHP939-C8 led to an increase in Pshp144 and Pshp939 driven β-galactosidase activity. The highest induction was obtained with 250 nM synthetic SHP144-C12 and SHP939-C8 (Figures S2 and S3).

The regulatory interaction between Rggs and their cognate Shp peptides.
To further evaluate the function of Rggs in the regulation of shp144 and shp939, Pshp144-lacZ and Pshp939-lacZ constructs were transformed into the wild type strain D39, and the mutant ∆rgg144. The β-galactosidase activities were determined in CDM with or without addition of SHP144-C12 (Fig. 5A). The basal β-galactosidase activity of the Pshp144-lacZ fusion was 291 ± 3 MU, and increased further with addition of SHP144-C12 (P < 0.001). In stark contrast, the basal activity of the ∆rgg144 was much lower, and moreover it was not induced by SHP144-C12 (p > 0.05). Thus, we conclude that Rgg144 is required for basal levels and for induction of shp144.
Similarly, to determine the function of Rggs in shp939 expression, Pshp939-lacZ fusion was transformed into wild type D39, and the mutant ∆rgg939. The β-galactosidase activity was determined in CDM with or without SHP939-C8 (Fig. 5B). The results showed that the β-galactosidase activity of the Pshp939-lacZ fusion was induced significantly upon addition of SHP939-C8 (p < 0.0001). In contrast, no induction in the ∆rgg939 genetic background could be detected regardless of the addition of SHP939-C8. These findings demonstrate that Rgg939 is required for basal levels and for induction of shp939.
Next, we tested whether Rgg939 influences Pshp144 induction, and conversely whether Rgg144 influences Pshp939. To this end, we compared Pshp144 and Pshp939 activity across wild type, ∆rgg144, ∆rgg939 and ∆rgg144/939 (Fig. 5A,B). Pshp144-lacZ driven β-galactosidase activity was 186 ± 2 MU for the ∆rgg939 strain, and increased 1.8-fold with the addition of SHP144-C12 (P < 0.001). Although Pshp144 could be induced in ∆rgg939 by addition of SHP144-C12, the level of induction was significantly lower than that of wild type (p < 0.01), suggesting that Rgg939 is required for full induction of Pshp144 (Fig. 5A). We also determined Rgg144's role in induction of Pshp939 in the presence of SHP939-C8 (Fig. 5B). It was found that Pshp939 could be induced in ∆rgg144 background, but the level of induction was 2.2 times less than that of wild type (p < 0.01), signifying that Rgg144 is required for full induction of Pshp939 (Fig. 5B). These data indicate a regulatory interaction between these two QS systems.
Rgg144 and Rgg939 are important for mannose metabolism. To evaluate the responsiveness of rgg promoters to different carbon sources, the reporter strains Prgg144-lacZ-wt and Prgg939-lacZ-wt were grown microaerobically in CDM supplemented with glucose, galactose, mannose or N-acetyl glucosamine, and β-galactosidase activity was determined at late exponential phase (Fig. 6). These sugars were used because they are known to be present in complex host glycoproteins in the respiratory tract 44. The results showed that the highest induction of lacZ was obtained when Prgg144-lacZ-wt was grown on mannose (p < 0.0001 compared to glucose), followed by galactose (n = 9, p < 0.0001 compared to glucose) and glucose, while the presence of N-acetyl glucosamine led to the lowest β-galactosidase activity. The induction by mannose was significantly higher than that by galactose (p < 0.05). The Prgg939-lacZ-wt displayed a similar expression profile to Prgg144-lacZ-wt. The highest activity was obtained on mannose and the lowest on N-acetyl glucosamine.
To further substantiate the role of Rggs in mannose metabolism, the wild type D39 strain and its isogenic rgg/shp mutants were incubated in CDM supplied with 1% (w/v) glucose, galactose, mannose, or GlcNAc as the primary carbon source. The growth profiles of the mutants were similar to that of the wild type on glucose, galactose, and GlcNAc. However, when mannose was used as the sole carbon source, ∆rgg144, ∆rgg939 and ∆rgg144/939 displayed a lower growth yield (highest OD600: 1.0 ± 0.02, 0.9 ± 0.05 and 0.9 ± 0.1, respectively) and a lower growth rate (0.35 ± 0.006, 0.33 ± 0.04 and 0.3 ± 0.014, respectively) than the wild type D39 (yield 1.21 ± 0.007, p < 0.0001; rate 0.395 ± 0.009, p < 0.05) (Fig. 7), showing the importance of Rgg144 and Rgg939 for mannose metabolism. The complemented mutants, on the other hand, had the same growth rate and yield on mannose as the wild type (Figure S4). These results show that the induction of shp promoters depends on the source of carbon and it is very likely that rgg144 and rgg939 play an important role in control of bacterial metabolism when mannose and galactose are abundant sugars.
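The text reports growth yield as the highest OD600 and a growth rate without stating how the rate was derived. One common approach, assumed here purely for illustration, is to take the slope of ln(OD600) against time over the exponential phase by least squares. The helper names `specific_growth_rate` and `growth_yield` are hypothetical.

```python
import math


def specific_growth_rate(times_h, ods):
    """Least-squares slope of ln(OD) versus time (per hour).

    Assumes all readings come from the exponential phase and are positive.
    """
    xs = list(times_h)
    ys = [math.log(od) for od in ods]
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired time/OD readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def growth_yield(ods):
    """Growth yield taken as the maximum optical density reached."""
    return max(ods)
```

Restricting the fit to exponential-phase readings matters, because lag and stationary phase points flatten the slope and underestimate the rate.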

Identification of Rgg regulon.
To reveal the wider influence of Rggs on pneumococcal biology, the genes potentially regulated by Rggs were determined by microarray analysis after growth on mannose and galactose (Tables 1, S4, S5, S6 and S7). For regulon determination, we used galactose and mannose because of the inducibility of rgg genes by these sugars. Regarding Rgg144, 154 genes were differentially expressed in ∆rgg144 versus wildtype on mannose (Table S4); of these, 131 are negatively regulated and 23 are positively regulated. Notable genes repressed by Rgg144 were those putatively involved in (i) replication, recombination and repair, (ii) translation, ribosomal structure and biogenesis, (iii) capsule biosynthesis, (iv) nucleotide transport and metabolism, and (v) those coding for hypothetical proteins. Furthermore, the locus adjacent to Rgg SPD_1518, encoding SPD_1513-SPD_1517, is also negatively regulated by Rgg144. This may indicate a potential regulatory interaction between Rgg144 and Rgg1518. The genes positively regulated by Rgg144 included the adjacent VP1 peptide and downstream genes (SPD_0145-0147), which have been shown to have a role in biofilm formation and virulence and to be regulated by Rgg144 9.
On mannose, 218 genes were differentially regulated in the rgg939 deletion mutant relative to the wildtype (Table S5). Of these, 177 are negatively regulated and 41 positively regulated by Rgg939. There is a substantial overlap, 93 genes, between the genes negatively regulated by Rgg939 and Rgg144. In addition to this overlap, a number of loci were found to be differentially regulated only by Rgg939 (Table 1). These included genes encoding putative cell division proteins (SPD_0007-SPD_0011), iron transport (SPD_0915-SPD_0920), cell membrane biogenesis (SPD_0940-SPD_0950), ATP synthase (SPD_1338-SPD_1340), and choline transport (SPD_1642-SPD_1644). Finally, similar to Rgg144, we found that Rgg939 also influences the expression of genes regulated by other rgg genes. Specifically, Rgg939 upregulates the Rgg144-regulated VP1 locus (SPD_0145-SPD_0147). These interactions suggest cooperative behaviors across these Rggs. Moreover, the regulon overlap suggests that Rgg proteins have a core regulon that may be related to generalized functions of this protein family. Finally, the differences between the regulons demonstrate that each Rgg also has specific roles under the same environmental condition.
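The comparison of the two regulons is, in essence, a set operation on gene lists. The sketch below shows how a shared "core" set and regulator-specific sets could be derived; the function names and the placeholder gene identifiers in the usage note are hypothetical and are not taken from the supplementary tables.

```python
def compare_regulons(regulon_a, regulon_b):
    """Split two gene collections into shared and regulator-specific sets."""
    a, b = set(regulon_a), set(regulon_b)
    return {
        "shared": a & b,   # candidate core regulon
        "only_a": a - b,   # specific to the first regulator
        "only_b": b - a,   # specific to the second regulator
    }


def overlap_fraction(shared_count, regulon_size):
    """Fraction of one regulon that is shared with the other."""
    if regulon_size <= 0:
        raise ValueError("regulon_size must be positive")
    return shared_count / regulon_size
```

With the counts reported above, the 93 shared genes make up about 71% of the 131 genes negatively regulated by Rgg144 (`overlap_fraction(93, 131)`) and about 53% of the 177 genes negatively regulated by Rgg939 (`overlap_fraction(93, 177)`).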
On galactose, the number of differentially expressed genes controlled by the rgg genes was smaller than on mannose (Tables S6 and S7). Rgg144 regulates the adjacent SPD_0144-SPD_0149 and the genes SPD_1514-SPD_1516 that neighbor Rgg1518. Rgg939 regulates the Rgg144-adjacent SPD_0146-SPD_0147 and its neighboring genes SPD_0940-SPD_0950. The direction of Rgg939-regulation of the VP1 locus (SPD_0145-SPD_0147) is sugar dependent, as it is upregulated in mannose and downregulated in galactose. This shows that different Rggs can regulate the same locus, and that under different environmental conditions the same Rgg can act either as a repressor or an activator for the same target gene.
The expression of selected differentially expressed genes for each Rgg regulon on galactose and mannose was further verified by RT-PCR. We found a similar expression trend to that in the microarray data (Table S8). In addition, by using 250-300 bp of upstream sequence from operons differentially expressed in the two conditions, we identified a consensus sequence in silico for Rggs (Figure S5 and File S2). The validity of this consensus sequence for Rgg needs to be verified further in future studies.

Effect of Rggs on capsule synthesis. Capsular polysaccharide (CPS) is the most important pneumococcal virulence factor, protecting the pneumococcus from phagocytosis and playing a crucial role in pneumococcal survival in different environments 45. [...] Rgg939 acted as a repressor of the cps locus, we determined the amount of glucuronic acid produced by the rgg mutants growing on this sugar. In addition, pneumococci grown on glucose were included as control. Glucuronic acid is a major component of the type 2 capsule. Compared with wild type strain D39 (23.5 ± 2.4 µg/10^8 CFU, n = 9), both the rgg144 (37.2 ± 2.1 µg/10^8 CFU, n = 9) and rgg939 (41.9 ± 2.3 µg/10^8 CFU, n = 9) mutants produced more glucuronic acid when pneumococci were cultured on mannose (p < 0.01), but not on glucose (p > 0.05). There was no significant difference in capsule production between the wild type and the complemented mutants (p > 0.05). In addition, we investigated the interaction of recombinant Rgg144 and Rgg939 with the putative promoter region of cps. The results showed that both Rgg144 and Rgg939 interacted directly with the putative promoter region of cps, but not with the nonspecific gyrB promoter, showing the specificity of this interaction (Fig. 8).
To investigate whether Rggs also play a role in disease we evaluated the contribution of both proteins to pneumococcal virulence using a mouse model of pneumonia and septicemia. The median survival time of mice infected intranasally with ∆rgg144, ∆rgg939, and ∆rgg144 + 939 (104 ± 14.2, 98 h ± 15.3, 139 h ± 14.5, 112 h ± 18.8 and 109 h ± 11.2, respectively, n = 10) was significantly greater than the wild type-infected group (46 h ± 3.5, n = 10) (p < 0.01). The introduction of intact copies of rgg144 and rgg939 into the respective mutants reconstituted the virulence of these strains, with the median survival times of mice infected with ∆rgg144Comp (49 h ± 8.8, n = 5) and ∆rgg939Comp (72 h ± 25.5, n = 5) not being significantly different from the wild type-infected cohort (p > 0.05). Thus we conclude that Rggs are not only important in colonization, but also play a role in disease (Fig. 10).
The in vivo role of both QS systems has been further investigated by determining the expression of each rgg in pneumococci recovered from the nasopharynx and the lungs of infected mice relative to their expression in blood. It was found that both rgg144 (2.3-fold ± 0.13 and 2.8-fold ± 0.18, n = 3) and rgg939 (2.1-fold ± 0.10 and 2.4-fold ± 0.12, n = 3) were overexpressed in the lungs and nasopharynx, respectively, compared to their expression in blood.

Discussion
The pneumococcus is exposed to different environments in different host tissues during colonization and invasive disease. The microbe has a high level of adaptive capacity 4,45,46 . These adaptive mechanisms are vital for the in vivo survival of the pneumococcus and therefore represent a viable route for the treatment of pneumococcal infections. However, our knowledge about how the pneumococcus adjusts to changing environments is limited.
In this study we characterized two quorum-sensing systems associated with pneumococcal colonization and virulence. Until recently the repertoire of QS systems linked to the Rgg regulators in S. pneumoniae was limited to the competence regulon and LuxS/AI-2 13. The landmark study of Fleuchot et al. 16 reported a comprehensive list of rgg genes associated with short hydrophobic peptides (Shp) in streptococci. Although some Rgg/Shp pairs were studied in detail in S. pyogenes by the Federle group 10,43,47, and in S. thermophilus 48, the functional roles of pneumococcal Rggs have been largely unknown, except for our recent work detailing the involvement of a peptide in the Rgg144 regulon in biofilm formation and pneumococcal pathogenesis 9, and a recent study by Junges et al. 18 linking Rgg939 to pneumococcal capsule and biofilm synthesis. Here, we have carried out a detailed analysis of Rgg144/Shp144 and Rgg939/Shp939, and demonstrated that these circuits operate as QS systems in the pneumococcus. We demonstrated that SHPs are secreted molecules, and that they are regulated by the cognate Rggs. In future, the secreted nature of SHP molecules can be further confirmed by isolating and purifying the peptide from culture supernatants and analyzing it by mass spectrometry as described previously 16,43.
Both Rgg QS systems were found to be involved in pneumococcal colonization and virulence. The reduction in colonization and virulence in the mutants is very likely due to their inability to efficiently utilize mannose, which is found in N- and O-linked glycans 28,46. This explanation is supported by the fact that the expression of both rgg genes was stimulated by galactose and mannose, and that the absence of Rggs led to reduced utilization of mannose. The role of Rggs in host-derived sugar metabolism was further supported by their higher expression in the respiratory tract, where there are higher levels of galactose and mannose, relative to blood, where glucose is the predominant sugar 44. The involvement of an Rgg/Shp system in mannose metabolism was reported previously in S. pyogenes by the Federle group 49. The pneumococcus has a large repertoire of glycosidases to release these sugars, and has the catabolic pathways to utilize them 2,25,28,46. Despite their responsiveness to mannose and galactose, we did not detect any differentially expressed genes involved directly in mannose or galactose catabolism in the putative Rgg144 and Rgg939 regulons. This led us to put forward the following scenario for the likely in vivo roles of Rggs. We hypothesize that galactose and mannose act as signals to alter the pneumococcal phenotype in vivo. On the surface of the human respiratory tract there is a constant interaction between the pneumococcus and the high molecular weight glycoproteins covering the apical epithelial surfaces of the respiratory tract, such as mucin, which is rich in galactose and mannose. The initial breach of the glycan component of mucin is prevented by the presence of terminal sialic acids. Interestingly, Rgg939 is a repressor of nanA, the gene responsible for the major pneumococcal neuraminidase activity. Lack of access to the underlying sugars ensures that Rggs downregulate a large number of genes involved in protein and capsule synthesis (Tables S3 and S4).
Such an expression profile ensures a lower growth rate and promotes a stable commensal existence on the mucosal surface. However, once the sialic acid is removed, in parallel with a gradual increase in pneumococcal numbers, the microbe will eventually have access to the sugars located 'below' sialic acid, such as galactose and mannose. Access to these sugars will subsequently increase the expression of the cognate shp genes, and hence the synthesis of Shp peptides, which then interact with their Rgg proteins, activating the Rgg/Shp circuits to modulate target gene expression.
Figure 9. Pneumococcal strains without rgg144, rgg939 or rgg144/939 are less able to colonise the nasopharynx. Mice were infected with approximately 5 × 10^5 CFU pneumococci. At day 0 (A) and day 7 (B), five mice were culled, and the number of pneumococci in the nasopharynx was quantified by serial dilutions of nasopharyngeal homogenates. Each column represents the mean of data from five mice. Error bars show the standard error of the mean. Significant differences in bacterial counts relative to the D39 wild type strain were assessed using one-way ANOVA and Tukey's multiple comparisons test (*p < 0.05, **p < 0.01 and ****p < 0.0001).
Scientific REPORTS | (2018) 8:6369 | DOI:10.1038/s41598-018-24910-1
One of the most fundamental impacts of Rggs on pneumococcal colonization was found to be through their role in the control of capsule expression. It has been reported that an increase in capsule production leads to a decrease in pneumococcal colonization ability 50. Hence, we speculate that it is the increased capsule synthesis by the mutants in the nasopharynx that led to the decrease in colonization. The extent of colonization may also be influenced by changes in biofilm formation, as Rgg144 positively regulates VP1, which increases biofilm development 9. Combined, Rggs may increase adherence via downregulation of the capsule and increase biofilm development via upregulation of VP1. This explanation is consistent with the in vitro results, which show an elevated level of capsule synthesis in the mutants compared to the wild type in the presence of mannose.
The positive association between Rgg expression and virulence is less intuitive, given that capsule production has been shown to enhance pneumococcal virulence. This contradiction, the reduced virulence in the mutants despite increased expression of the cps locus, can be explained by different scenarios. Firstly, despite the increased expression of capsule in the less virulent Rgg deletion mutants, Rggs influence other genes that may play a role in virulence. For example, we have shown that the Rgg144-regulated VP1 is a potent virulence factor, thus lower levels of VP1 in the mutant may contribute to the decrease in virulence. Secondly, as our array data showed, the regulation exerted by Rggs is condition specific. Therefore, as the pneumococcus migrates into deeper tissue sites, the mannose it encounters may be lower than the concentration used in our in vitro regulon determination. Our microarray data reveal other possible mechanisms for the reduction in colonization and virulence. For example, we have seen a reduction in expression of the iron transport locus, and changes in the expression of genes for choline binding proteins, ATP synthase, and cell division, which are known to be important for pneumococcal attachment, proliferation and energetics 51. Unlike previous studies 16,43,49, in this study we identified a large number of differentially expressed genes in Rgg mutants relative to the wild type. The direct target regulon of Rgg/SHP systems has often been reported as a small or limited number of genes 16,52. However, under different conditions, a higher number of genes might be modulated by these systems either directly or indirectly. In addition, a certain level of pleiotropy cannot be ruled out.
A recent study by Junges et al. 18 linked Rgg939 to pneumococcal capsule and biofilm synthesis due to Rgg939's regulatory role on the SPD_0940-SPD_0949 locus. This locus contains polysaccharide biosynthesis genes, among which mnaA and mnaB are noteworthy because the proteins they encode play roles in the synthesis of N-acetylmannosaminuronic acid (UDP-ManNAcA), known to be present in the capsular polysaccharides of serotypes 12F and 12A 18. Although it is not clear whether this particular locus contributes to type 2 capsule biosynthesis, our study confirms that Rgg939 is involved in the control of capsule biosynthesis, and that this is due not only to the control exerted over the SPD_0940-SPD_0949 locus but also to Rgg939's direct repressor role on capsule biosynthesis genes, as demonstrated by microarray analysis and EMSA. It is noteworthy that while Junges et al. 18 observed an increase in capsule following Rgg939 overexpression, we observed an increase in capsule when Rgg939 was deleted. It is possible that this reflects differences in conditions, given that we performed experiments in CDM-man, where Rgg939 is highly induced. Contrary to Junges et al., we demonstrated in this study that Rgg939 contributes significantly to pneumococcal colonization and virulence. This difference may be due to differences in the time point of sample analysis in each study. For example, while Junges and colleagues 18 analyzed bacterial load in nasal washes and bronchoalveolar lavage fluid at 24 h post-inoculation, we assayed nasal washes at 1 h and 7 days post-inoculation and survival of the mice for up to 7 days. Our analysis shows that the S. pneumoniae type 2 D39 strain encodes five members of the Rgg family. We found that while Rgg939 is required for full induction of shp144, SHP939 does not induce shp144. This shows that the absence of Rgg939 has genome-wide effects, and that the induction of shp144 is cognate peptide specific.
Strikingly, despite the low similarity between Rgg144 and Rgg939, we observe substantial overlap between the regulons of these Rggs. It should also be noted that, although there is an overlap, there are also unique targets regulated by each Rgg. Rgg regulators are part of the RRNPP (Rgg, regulator gene of glucosyltransferase; Rap, response regulator aspartate phosphatases; NprR, neutral protease regulator; PlcR, phosphatidylinositol-specific phospholipase C gene regulator; PrgX, pheromone-responsive transcription factor) family of proteins 6,19. Structure-function studies showed that Rap, NprR, PlcR, and PrgX employ a structurally similar C-terminal tetratricopeptide (TPR)-like repeat domain to bind their cognate peptide pheromones 19,53. It is possible that, under different environmental conditions, the conserved structural properties of different Rggs allow them to respond to the same stimuli, leading to the regulation of the same genes, while differences in folding pattern or in affinity for the target DNA regulatory elements provide target specificity to each Rgg, resulting in differences in regulon composition. Currently, there is no established paradigm for the mechanism of action of Rggs, and future structure-function studies similar to those done for PlcR and PrgX can test these hypotheses 54,55.
Rgg-like regulators form a conserved family of transcriptional regulators, and copies of different rgg genes are found in the individual genomes of a subset of low-G+C Gram-positive bacteria, including the genera Streptococcus, Listeria and Lactobacillus 16. Even though Rgg has been studied in several streptococci, such as S. pyogenes, S. oralis, S. mutans, S. suis and S. gordonii 6,11,16,41,43, those studies focused mainly on one particular Rgg 7,10, yet many different Rggs exist, even within a single strain. The existence of multiple structural variants of Rgg in each bacterium suggests that individual Rggs perform distinct functions, and indeed studies in other streptococci have shown that Rggs exert control over a wide range of events, including stress responses, nutrient metabolism, bacteriocin production, biofilm formation, quorum sensing and virulence 6,11,16,41,43.
There is an urgent need to identify new microbial targets for anti-infectives, which will allow the development of new classes of antibiotics. Most anti-infectives act by directly inhibiting key central cell functions, namely DNA, protein or cell wall synthesis 56. A different approach is to target virulence factors, metabolic functions or environment-responsive elements 57. Our data clearly show that Rgg144 and Rgg939 can be potential targets for next-generation drugs. One approach is to focus on methods that interfere with the interaction between the signal peptide and Rgg proteins to modulate pneumococcal virulence and growth.