Transcriptional control of central carbon metabolic flux in Bifidobacteria by two functionally similar, yet distinct LacI-type regulators

Bifidobacteria resident in the gastrointestinal tract (GIT) are subject to constantly changing environmental conditions, which require rapid adjustments in gene expression. Here, we show that two predicted LacI-type transcription factors (TFs), designated AraQ and MalR1, are involved in regulating the central, carbohydrate-associated metabolic pathway (the so-called phosphoketolase pathway or bifid shunt) of the gut commensal Bifidobacterium breve UCC2003. These TFs appear to not only control transcription of genes involved in the bifid shunt and each other, but also seem to commonly and directly affect transcription of other TF-encoding genes, as well as genes related to uptake and metabolism of various carbohydrates. This complex and interactive network of AraQ/MalR1-mediated gene regulation provides previously unknown insights into the governance of carbon metabolism in bifidobacteria.


Results
Identification and genetic analysis of araQ and malR1 on the B. breve UCC2003 genome. This study was initially aimed at the functional characterization of six previously described LacI-type transcription factors (TFs), namely AraQ, MalR1, MalR2, MalR3, MalR5 and MalR6 along with their proposed regulation of central metabolic pathways and malto-oligosaccharide utilisation pathways in B. breve UCC2003 40 . In the course of our experiments we found that MalR1 was in fact functionally more similar to AraQ and for this reason our experimental scope was narrowed to specifically characterise AraQ and MalR1. The encoded products of araQ and malR1 represent proteins of 371 and 338 amino acids, respectively, which exhibit 26% overall identity to each other (91% query coverage). Their respective DNA binding domains (encompassing residues 11-57 for AraQ and residues 3-47 for MalR1) exhibit a higher level of identity at 49% (93% query coverage) (BlastP). This may be a reflection of the conserved nature of the DNA-binding helix-turn-helix regions of LacI-type proteins in general 41 , coupled to the finding that, as will be outlined below, AraQ and MalR1 recognize (near) identical operator sequences.

In vitro characterisation of the regulons of AraQ & MalR1 by EMSA analysis. In order to assess
AraQ and MalR1 interaction with their possible operator sequences, electrophoretic mobility shift assays (EMSAs) were carried out using DNA fragments encompassing upstream regions of genes corresponding to the previously predicted AraQ and MalR1 regulons 40 , and to a number of (differentially transcribed) genes that were identified in microarray analyses involving insertion mutants of araQ and malR1 (discussed below). Obtaining purified and active AraQ or MalR1 protein from B. breve initially proved to be troublesome, in line with similar issues observed for other bifidobacterial LacI-type proteins [34][35][36][37]42 . We adapted our original protocol so as to allow purification of active (i.e. DNA-binding competent) AraQ and MalR1 (see Materials and Methods), which were tested for their ability to bind to DNA sequences encompassing the operator sequence or transcription factor binding site (TFBS) previously predicted for AraQ 40 . The ability of AraQ and MalR1 to bind DNA fragments was graded utilising a non-linear, arbitrary scale (see Materials and Methods). This EMSA analysis revealed that AraQ and MalR1 bind with varying degrees of affinity to the promoter regions of 23 and 20 genes, respectively, out of a total of 33 DNA regions tested. Figure 2 presents an example of complete DNA fragment binding, while additional examples can be found in Supplemental Fig. S1. Table 1 summarises all obtained EMSA results, while Fig. 3 is a schematic representation of the results obtained in this study. Interestingly, some of the promoter regions to which AraQ/MalR1 did not appear to bind, are associated with genes that display differential transcription in the microarray analysis of the corresponding araQ/malR1 mutants (see below), indicating that such differential transcription reflects indirect regulatory effects.
Six of the genes/gene clusters whose promoter regions were clearly shown to be bound by AraQ and MalR1 encode enzymes that are part of the bifid shunt as illustrated in the schematic representation of central metabolism in Fig. 3. Several of these genes are essential for growth on glucose 43 (Table 1) and are also highly transcribed during logarithmic growth on glucose as a sole carbon source 44 . The ability of AraQ to bind to the promoter regions of these six genes/gene clusters involved in central metabolism as well as those of certain genes involved in the metabolism of starch-like/-derived carbohydrates is consistent with previous in silico predictions 40 . EMSA analyses furthermore demonstrated that, similar to AraQ, MalR1 binds to promoter regions of genes/gene clusters involved in central carbon metabolism along with genes involved in the metabolism of starch-like/derived carbohydrates (and that the deduced AraQ and MalR1 regulons therefore largely overlap). The ability of MalR1 to transcriptionally control genes that are required for the metabolism of starch-like/-derived carbohydrates is consistent with in silico predictions 40 . However, MalR1's presumed ability to control genes involved in central carbon metabolism was an unexpected finding, as these genes were not predicted to be members of MalR1's predicted regulon 40 .
Based on their binding behaviour AraQ and MalR1 appear to have a similar affinity for the six promoter regions of the central metabolism genes (Table 1 and Fig. 3). Furthermore, AraQ and MalR1 were both found to be capable of binding their own and each other's presumed promoter regions. This finding suggests that these genes are subject to interactive autoregulation, in which they not only control the transcription of their own genes, but also that of the other. Furthermore, as based on their binding behaviour, AraQ and MalR1 appear to control transcription of the LacI-type-encoding genes MalR2, MalR3, MalR5, MalR6 and CarD-like TF ( Table 1).
Determination of the operator sequences recognized by AraQ and MalR1. As shown above, EMSA analyses employing AraQ and MalR1 clearly demonstrate their in-vitro affinity for various promoter regions, including those that correspond to several central metabolic genes. In order to identify the TFBSs that are recognized by either AraQ or MalR1, fragmentation analysis was carried out (Fig. 1). This analysis was performed employing the promoter regions of the following genes: eno (encoding enolase and corresponding to locus tag Bbr_0725) and pyk (encoding pyruvate kinase and corresponding to locus tag Bbr_0757). These genes were selected as they which hold key roles in central carbon metabolism (Fig. 3). DNA fragments encompassing various sections of the promoter region upstream of either eno or pyk were subjected to EMSA analysis employing purified AraQ or MalR1. These analyses revealed that a 47 and a 48-bp DNA segment of the eno and pyk promoter regions, respectively, each harbour an operator site for both AraQ and MalR1 (Fig. 1). Importantly, both of these segments contain previously predicted candidate AraQ binding sites 40 , specifically the 47-bp segment of DNA located upstream of the eno gene contains a 20 bp imperfect palindrome CATTGTGAGC><GCTCACATCA (with "><" indicating the center of the palindrome), while the 48-bp DNA segment located upstream of the pyk gene contains a very similar 20 bp palindrome CAGTGTGAGC><GCTCACAACA.
Next, bioinformatic analysis through the use of a Position Weight Matrix (PWM) approach was employed to identify additional genes which may be regulated by AraQ and MalR1 in B. breve. For this task two PWM's were used (see Materials and Methods section), resulting in the identification of several additional genes/gene clusters that appear to contain this TFBS in their predicted promoter region (though this will require experimental verification). The full list of genes assigned to the AraQ/MalR1 regulon is present in Supplemental Table S1. The majority of these regulated genes are involved in central carbohydrate metabolism and were previously classified as components of the global AraQ regulon in Bifidobacteriaceae 45 . Some B. breve-specific genes of the AraQ/MalR1 regulon include Bbr_0037/0038 (predicted anhydrase/alkyl hydroperoxide reductase), Bbr_0221-Bbr_0222 (iron-uptake system BfeU-BfeO) 46 Table S1). The use of a modified PWM which was designed based only on sequences of promoter regions with which AraQ/MalR1 had a medium or high affinity did not significantly change the regulon composition. Specifically, additional potential TFBSs were found in (intergenic) promoter regions of Bbr_0117/Bbr_0118 (glucosidase/sugar-binding protein), Bbr_0176 (penicillin binding protein), and Bbr_0921 (long-chain fatty acid-CoA ligase) (Supplemental Table S1; the derived AraQ/MalR1 binding motifs as based on these predicted sites are presented in Fig. 1, Panel 3). Some divergence was identified between AraQ and MalR1 specificity for certain TFBSs, although for both TFs the affinity for these is rather low ( Table 1). The construction of a PWM specific for either AraQ or MalR1 was not possible, the primary reason for this is that TFBSs of differentially bound genes contribute significantly less to the overall weight matrix than those associated with central metabolism (such as eno and pyk), which form the core of the regulon. Furthermore, this analysis found that AraQ and MalR1 bind with the highest affinity to promoter regions of genes involved in central metabolism. This observed high affinity may be explained by the conserved G, C, and A in positions 9, 12 and 15, respectively, of the AraQ/MalR1 binding motif (Fig. 1, Panel 3). www.nature.com/scientificreports www.nature.com/scientificreports/ Interestingly, the deduced TFBSs of AraQ and MalR1 are located upstream of recently assigned −10/−35 promoter sites of six bifid shunt-associated genes/gene clusters (Table 1 and Supplemental Table S3) 24 . This TFBS location relative to the −10/−35 sites is therefore typical of a transcriptional activator and in this context it should be noted that all genes encoding the bifid shunt enzymes exhibit a high level of transcription when grown on glucose 24,44 . Further analysis showed that the TFBSs of AraQ/MalR1 are not located at a particular distance from these promoter sequences: Supplemental Table S3 illustrates that the AraQ/MalR1 TFBS is located 16 and 15 bp upstream of the −35 site in the promoter regions of eno and pyk, and 33, 34 and 35 bp upstream of the −35 site of pfl, icfA and gap, respectively. Therefore, the position of the AraQ/MalR1 TFBS relative to the −35 site implies that AraQ/MalR1 bind in multiples of one helical turn plus one half (i.e. 1.5, 3.5 or 6.5 helical turns) away from the transcriptional start sites (TSS). Finally, analysis of the deduced TSS with respect to the location of the TFBSs of AraQ/MalR1 upstream of carD, glgP and cldE revealed that the TFBS is overlapping with predicted −35/−10 regions, indicating that AraQ/MalR1 in these cases act as repressors, being more typical of LacI-type TFs 48 .
Attempts to identify effector molecules for AraQ and MalR1. Bioinformatic and EMSA analyses allowed the deduction of the AraQ/MalR1 regulons and associated operator sequences. We next wanted to gain further insight into the nature of the environmental/metabolic signals that impact on the recognition ability and/ or binding affinity of AraQ and MalR1 to their target DNA sequences. For this purpose, EMSA analyses were performed whereby AraQ or MalR1 was incubated with one of its binding targets, in this case the araQ DNA promoter region, in the presence of each of a variety of possible (available/relevant) effectors (Supplemental Table S4). These effectors were selected for two main reasons: (i) they occupy key positions within the bifid shunt (e.g. Phosphoenolpyruvate, D-erythrose-4-Phosphate and Glyceraldehyde 3-phosphate), which may allow them to act as cues to provide information on the metabolic status of the cell, and (ii) they are known to act as effector molecules in other organisms 45,49 . The results obtained indicate that under the employed experimental conditions none of the effector molecules tested in this study consistently decreased/increased AraQ or MalR1  Table 1, enzymatic steps which are under the regulation of AraQ and MalR1 are indicated by a red or blue dot, respectively. Abbreviation: Agl, alpha glucosidase; Bgl, beta gluosidase; GlkA, glucokinase; CldC, beta glucosidase; ApuB, amylopullunase; MalQ, glucanotransferase; Rk, ribokinase; Gpi, glucose-6-p isomerase; Tal, transaldolase; Tkt, transketolase; R5PI, ribose-5-P isomerase; R5PR, ribose-5-P reductase; GADPH, glyceraldehyde-3-phosphate dehydrogenase; Xfpk, Xylose-5-P/Fructose-6-P phosphoketolase; AckA, acetate kinase; Ald2, alcohol dehydrogenase 2; Pgk, phosphoglycerate kinase; Gpm, phophoglycerate mutase; Eno, enolase; Pyk, pyruvate kinase; Pfl, pyruvate formate lyase; ldh2, lactate dehydrogenase 2. www.nature.com/scientificreports www.nature.com/scientificreports/ ability to bind to DNA fragment encompassing the araQ promoter region (example effector EMSA can be found in Supplemental Fig. S2). It is possible that the in vitro experimental set up did not allow for the identification of the effector molecule(s), or perhaps that the molecule is the product of another metabolic pathway outside of central carbon metabolism.
Transcriptome analysis of B. breve UCC2003-araQ and B. breve UCC2003-malR1. In order to further investigate the role of araQ and malR1 in controlling central carbon metabolism, insertional mutants were created in each of these genes in B. breve UCC2003, resulting in strains B. breve UCC2003-araQ and B. breve UCC2003-malR1 (respectively; see Materials and Methods). These two mutants were then employed to assess transcriptional changes due to either of these mutations using microarray analysis.
The transcriptome of B. breve UCC2003-araQ, when compared to that of B. breve UCC2003 (Supplemental Tables S5 and S6), revealed that 32 genes were transcriptionally up-regulated and 45 were down-regulated above a fold change of 2.0, p-value < 0.001 (see materials and methods for further details). Furthermore microarray data revealed that, when compared to B. breve UCC2003, the previously characterised malto-oligosaccharide uptake system (Bbr_0118-Bbr_0121) 50 is significantly up-regulated in B. breve UCC2003-araQ (4.4, 4.4, 4.5 and 3.9-fold change respectively; p < 0.001). In silico analysis via a PWM-based approach identified AraQ/MalR1 TFBSs upstream of the above-mentioned gene cluster, the location of these TFBSs indicates that AraQ/MalR1 act as repressors of the Bbr_0118-Bbr_0121 gene cluster. Similarly, malR3 (Bbr_0122; specifying a LacI type transcriptional regulator) and apuB (Bbr_0123; encoding an amylopullulanase) were up regulated 3.7 and 4.3-fold, respectively (p < 0.001) in B. breve UCC2003-araQ as compared with the wild type. rbsD (Bbr_1416), encoding a D-ribose pyranase, which is located in a ribose uptake cluster was also upregulated 2-fold (p < 0.001), PWM approach also identified the AraQ/MalR1 TFBS upstream of this cluster.
Most bifid shunt-associated genes whose promoter regions contain binding sites for AraQ/MalR1 were not significantly altered in the transcriptional analysis employing B. breve UCC2003-araQ or B. breve UCC2003-malR1 (when compared to the transcriptome of wild type B. breve UCC2003; Supplemental Tables S5 and S6). This may be due to compensatory activity of each of the two regulators, in which AraQ will functionally compensate for loss of MalR1 (in B. breve UCC2003-malR1) and vice versa (for strain B. breve UCC2003-araQ). Attempts to create a double mutant of both araQ and malR1 were unsuccessful, indicating that these two regulators form an essential transcriptional circuit of the bifid shunt genes.
The transcriptome of B. breve UCC2003-malR1 revealed, when compared to that of the B. breve UCC2003 WT (Supplemental Tables S7 and S8), just a single gene, malE1 (Bbr_0118), to be transcriptionally up-regulated above a fold change of 2.0, while no genes were found to be significantly (fold change of >2.0) down-regulated in the transcriptomic analysis of the B. breve UCC2003-malR1 mutant. In silico analysis via a PWM-based approach identified a MalR1 TFBS upstream of malE1 (Bbr_0118), where the location of this TFBS indicates that MalR1 acts as a repressor of this gene. Two genes, apuB and malG1, that form part of the predicted malR1-governed regulon 40 , were up-regulated to a lower, yet significant degree (1.8 and 1.3-fold change, respectively; P < 0.001). Notably, some of the central metabolic genes were found to be up-regulated at a rather modest, yet statistically significant level (i.e. xfp, eno, tkt, pfl and ldh, whose transcription was determined to be increased by 1.5, 1.3, 1.3, 1.2 and 1.2-fold, respectively; p-values < 0.004).
Distribution of araQ and malR1 homologues among Bifidobacteriaceae members. In order to estimate conservation of the proposed AraQ/MalR1-mediated regulatory circuit in other bifidobacterial species and their close relatives, we analysed the distribution of the araQ and malR1 genes in a set of 70 Bifidobacteriaceae genomes. This was carried out by aligning the 70 Bifidobacteriaceae genomes using a PATtyfam approach and the BLAST tool, followed by alignment with MUSCLE (see materials and methods for more details). The two regulator-encoding genes showed somewhat different patterns of distribution; araQ was fully conserved among all currently known species of the genus Bifidobacterium, while missing in genomes of closely related genera such as Scardovia and Pseudoscardovia (overall occurrence of araQ was determined to be 94%), whereas malR1 was conserved in 50 genomes (overall occurrence 71%) (Supplemental Table S9). Interestingly, the absence of araQ in several Scardovia and Pseudoscardovia genomes coincides with the absence the arabinose utilization operon araBDA, which supports our previous hypothesis that AraQ evolved from a local regulator of arabinose metabolism in an ancestor Bifidobacteriaceae species to a global regulator of central carbohydrate metabolism in Bifidobacterium and Gardnerella by regulon expansion 51 .
A phylogenetic tree illustrating this information was also was built using IQ-TREE 52 utilising an ultrafast bootstrap with 1000 replicates. BglR (Bbr_1659) is a distant LacI family regulator from B. breve UCC2003, this TF was employed as an out group as its function is predicted to be different from those of MalR1 and AraQ (Supplemental Fig. S3).
AraQ has been identified in distant lineages of Actinobacteria (functioning as a local TF) 45 , while orthologs of malR1 (as well as the other malR genes) are confined to genera within the Bifidobacteriaceae family (Supplemental Table S9), implying that MalR1 is unique to this particular taxon and emerged after AraQ in the process of evolution. Although it is likely that MalR1 shares the same regulatory function as AraQ in many bifidobacteria by recognizing the same binding motif, this consideration is based on both the ubiquitous pattern of malR1 distribution (especially when compared with genes of other MalR regulators, which are mosaically distributed among analysed strains and exhibit considerably lower prevalence), along with experimental validation through EMSA analysis. Thus, the possibility that MalR1 operates differently in other bifidobacteria or closely related genera cannot be entirely excluded. We were also unable to identify conserved joint AraQ/MalR1 binding sites in Gardnerella vaginalis ATCC 14019 using a PWM approach, the only analysed strain that lacks araQ yet possesses malR1. This observation suggests that in this bacterium MalR1 recognizes a different binding motif, and presumably works as a local TF regulating maltose/maltodextrin utilization. www.nature.com/scientificreports www.nature.com/scientificreports/ Survival of B. breve UCC2003-araQ and B. breve UCC2003-malR1 when exposed to porcine bile. Several publications have reported that the (in)ability to survive bile stress is linked with shifts in the expression of genes involved in central metabolism and with the concentration of its resulting end products in various Bifidobacterium species [53][54][55][56] . For this reason, B. breve UCC2003 WT, B. breve UCC2003-araQ and B. breve UCC2003-malR1 mutants were analysed for their ability to grow in the absence (control) as compared to the presence of 1% porcine bile with OD 600nm readings taken at 24-hour post inoculation. Figure 4 displays the fold reduction in OD 600nm following 24-hour growth in the presence of 1% porcine bile as compared to the control. A significant fold reduction in OD 600nm readings was observed for all three strains when grown in the presence of porcine bile, as compared to OD 600nm readings without porcine bile. However, the fold reduction in growth of B. breve UCC2003-araQ and B. breve UCC2003-malR1 mutants was more pronounced compared to the wild type strain B. breve UCC2003: in the presence of 1% porcine bile, the B. breve UCC2003-araQ and B. breve UCC2003-malR1 mutants demonstrated a fold reduction of 11.1 (p-value < 0.010) and 6.4 (p-value < 0.010) in OD 600nm readings, respectively, whereas the fold reduction for the control strain B. breve UCC2003 was 3.5.

Discussion
The distal gut does not supply bacteria with a reliable carbon source; available carbohydrates are dependent on many factors such as host diet and other (competing) bacteria present in the GIT. Therefore, it is necessary for bifidobacteria to be capable of utilising a multitude of carbon sources and to do so efficiently in order to be competitive. Gene regulation is one way in which bifidobacteria ensure that their energy generation is in line with available carbohydrate substrates, and that energy and biosynthetic demands are kept in balance.
Previous analysis of the AraQ regulatory network in 39 bifidobacterial strains predicted that the conserved core of its regulon comprises genes involved in central carbohydrate metabolism 45 . In the current study, we experimentally validated these predictions. Additionally, it appears that AraQ and MalR1 together impose global shifts in gene transcription in B. breve UCC2003 through their apparent ability to regulate a number of TFs (i.e. CarD, MalR2, MalR3 and MalR5) and consequently the regulons of these TFs. AraQ and MalR1 were also found to control the transcription of key genes of the bifid shunt in B. breve UCC2003 (Fig. 3). Furthermore, AraQ and MalR1 appear to regulate transcription of their own genes (auto-regulation), while also regulating each other's transcription. This mechanism of regulating central carbon metabolism with two LacI-type TFs is reminiscent of that of a previously described system in Corynebacterium glutamicum 57,58 , which utilises GntR1 and GntR2 to regulate many of its genes involved in central metabolism. From the current analysis AraQ and MalR1 are deduced to repress transcription of a number of genes related to sugar uptake, and to activate transcription of genes related to central carbon metabolism, while the opposite was seen for GntR1 and GntR2 58 . Initially, we believed that the functional redundancy of malR1 was the product of a gene duplication event, as was predicted for gntR1 and gntR2 58 . However, GntR1 and GntR2 exhibit 78% amino acid similarity, while AraQ and MalR1 display just 26% similarity. A higher level of similarity would be more in line with a (recent) gene duplication event. www.nature.com/scientificreports www.nature.com/scientificreports/ Additionally, it is also apparent from the phylogenetic analysis that AraQ and MalR1 form two non-orthologous groups (Fig. S3).
Analysis of araQ and malR1 distribution across a set of 70 Bifidobacteriaceae genomes revealed slightly different patterns of distribution with araQ present in 94% and malR1 present in 71% of these genomes tested. This distribution may be explained in a number of ways, with one possible evolutionary scenario being motif switching whereby MalR1 evolved to recognise the binding motif of AraQ. While duplicating the role of AraQ, the function of a maltose/maltodextrin utilization regulator, in that case, may have been transferred to other MalR paralogs. Additionally, it seems that in some bifidobacteria of insect/bird/marmoset origin malR1 was lost probably due to different selective pressures experienced by these microorganisms. A PWM approach was also utilised to identify to identify additional genes which may be regulated by AraQ and MalR1 in B. breve. In some cases, AraQ/MalR1 PWMs failed to identify TFBSs of genes not involved in central metabolism to which recombinant TFs showed weak or medium affinity. This contradiction can be explained either by: (i) limitations of the EMSA approach, namely, non-specific binding of AraQ/MalR1 to sites of other TFs from the LacI family present in promoter regions of these genes; (ii) limitations of the PWM approach, specifically insufficient sensitivity of designed matrices.
The binding sites of AraQ/MalR1, where AraQ/MalR1 are believed to function as activators, were found to bind at a distance of a multiple of one turn plus one half, away from the TSS. This is consistent with findings for another LacI-type family member, CcpA, which may act as a repressor or activator 48 , and which binds 14 bp upstream of the −35 site when functioning as an activator. These authors suggested that CcpA binds to the same side of the helix face as the position of the −35 site. The AraQ/MalR1 TFBS was found (in some cases) to be located at a similar distance upstream of the −35 site implying that AraQ/MalR1 may also bind on the same side of the helix face as the position of the −35 site. One important feature of LacI-type TFs is that they can alter the local DNA structure 48 , which in this case may enhance the affinity of the RNA polymerase holoenzyme to its cognate −35/−10 sites.
Transcriptional analysis for B. breve UCC2003-araQ vs the B. breve UCC2003 WT strain demonstrated that transcription of a number of genes is repressed by AraQ, whereas genes where AraQ is predicted to act as a transcriptional activator (based on the location of the TFBS in relation to the −35 site) showed no significant change in expression. This result is in line with our predictions as insertional mutagenesis should reduce the amount of active AraQ protein in the cell, therefore the expression of those genes which AraQ activates should remain unchanged.
The biological significance of a bacterium producing two distinct proteins which appear to have overlapping functions is currently unknown. We hypothesize that there may be a link between the presence of these two proteins and the stress response of the bacterium. Several publications have reported that the (in)ability to survive bile stress is linked with shifts in the expression of genes involved in central metabolism and with the concentration of its resulting end products in various Bifidobacterium species [53][54][55][56] . For this reason, growth assays were carried out on B. breve UCC2003-araQ and B. breve UCC2003-malR1, along with the wild type control strain B. breve UCC2003, when exposed to porcine bile (Fig. 4). This analysis revealed that there is a statistically significant drop in viable cell count for both B. breve UCC2003-araQ and B. breve UCC2003-malR1, suggesting that this hypothesis may be correct yet requires further in-depth investigation. The mechanism behind this phenotype remains unclear, though one possible cause may be that B. breve UCC2003-araQ is unable to quickly adjust its central metabolism in an optimal way under this stress condition, as previously reported 59 . AraQ and MalR1 binding sites have also been identified upstream of a predicted choline transporter (Bbr_0603) and an iron uptake system (Bbr_0121-0122) 46,47 , indicating that these TFs control a number of genes involved in bile and other cell stress responses. AraQ and MalR1 ability to control central metabolism may also aid survival under bile stress conditions by increasing ATP production from the bifid shunt, while fine tuning the expression on essential central metabolic genes, a role which would benefit from the presence of two distinct TF proteins.
As with all in vitro assays, EMSA and transcriptomic analyses have their limitations and it is important to consider these factors on interpretation of the obtained results. Nonetheless, our findings indicate that AraQ and MalR1 play an important role in regulating the metabolic flux of central metabolism in a large proportion of the Bifidobacteriaceae, presumably contributing to metabolic agility and associated ability to adapt to stressful conditions. In conclusion, this work revealed a novel and apparently unique regulatory mechanism employed by most human-derived bifidobacteria. This regulatory system appears to be crucial to provide adaptability and competitiveness in the constantly changing environment of the gastrointestinal tract.
Nucleotide sequence analysis. Sequence data were obtained from the Artemis-mediated 61  www.nature.com/scientificreports www.nature.com/scientificreports/ nih.gov) utilising the basic local alignment search tool (BLAST). Sequence analysis was performed employing the Seqbuilder and Seqman programs of the DNASTAR software package (DNASTAR, Madison, WI). Protein functions were assigned with the use of BLASTP, and homology detection and structure prediction by HMM-HMM comparison and HHpred 63,64 .
For comparative studies, a non-redundant set of 70 Bifidobacteriaceae genomes was created. The search for AraQ and MalR1 orthologs in these genomes was carried out using the PATtyfam approach and the BLAST tool integrated into the SEED genomic platform 65 . Sequences of orthologs were collected into one file and then aligned via MUSCLE 66 . The alignment was subsequently manually filtered to reduce the number of gaps. A phylogenetic tree was built using IQ-TREE 52 with the following parameters: automatic substitution model selection, ultrafast bootstrap with 1000 replicates, and employing BglR (Bbr_1659), a distant LacI family regulator from B. breve UCC2003 with predicted different function, as an outgroup. DNA manipulations. DNA manipulations were carried out based on previously published protocols 67 .
Restriction enzymes and T4 DNA ligase were obtained from Roche Diagnostics (Basel, Switzerland), and were used according to the manufacturer's instructions. PCRs were performed using either Q5 ® High-Fidelity DNA polymerase (New England Biolabs, Hertfordshire, UK) or Extensor Long Range PCR Enzyme master mix (Thermo Scientific, Glouchester, UK). Synthetic oligonucleotides were synthesized by Eurofins (Ebersberg, Germany) and are listed in supplemental Table S2 and S10. Ird-labelled synthetic oligonucleotides were provided by IDT (Integrated DNA technologies, Dresden, Germany) and are listed in Supplemental Table S2. PCR products were purified with the use of a High-Pure PCR product purification kit (Roche, Basel, Switzerland). Plasmid DNA was introduced into E. coli and B. breve by electroporation, and large-scale preparation of chromosomal DNA from B. breve was performed as described previously 68 . Plasmid DNA was obtained from B. breve and E. coli using the Roche High Pure plasmid isolation kit (Roche Diagnostics, Basel, Switzerland). An initial lysis step was performed using 30 mg ml −1 of lysozyme for 30 min at 37 °C as part of the plasmid purification protocol for B. breve.

Construction of B. breve UCC2003 insertion mutants.
Internal fragments of Bbr_0411 (designated here as araQ; fragment used was 402 bp in length, representing codons 74 through to 208 of the 371 codons of this gene) and Bbr_1846 (designated here as malR1; fragment used was 504 bp in length, representing codons 102 through to 267 of the 338 codons of this gene) were amplified by PCR using B. breve UCC2003 chromosomal DNA as a template (primers employed are listed in Supplemental Table S10). Insertional mutagenesis was carried out as previously described 69 . The presence of the tetracycline resistance cassette was confirmed by colony PCR employing primer combination TetF and TetR, while site-specific recombination of potential Tet-resistant mutants was confirmed by colony PCR using a combination of the TetR primer and a primer located upstream of the recombination site in araQ or malR1 in the chromosome of B. breve UCC2003 (see Supplemental  . Insertional mutants and the WT strain were cultivated in mMRS medium supplemented with 0.5% ribose until an OD 600nm of ∼0.6 was achieved. We employed ribose for such transcriptome analyses as this sugar has been utilized previously as a reference carbohydrate for various metabolic studies 37,[70][71][72] . Furthermore, transcriptome analysis for B. breve UCC2003-araQ and B. breve UCC2003 WT was carried out when these strains were grown on lactose as a sole carbon source, which generated similar results as compared to ribose-grown cells except for the expected sugar-specific differences. Cells were harvested by centrifugation at 10,000 rpm for 2 min at room temperature and immediately frozen at −80 °C prior to RNA isolation. DNA microarrays containing oligonucleotide primers representing each of the annotated genes on the genome of B. breve UCC2003 were designed by and obtained from Agilent Technologies (Palo Alto, CA, USA). Cell disruption, RNA isolation, RNA quality control, and cDNA synthesis and labelling were performed as described previously 73 . The labelled cDNA was hybridized using the Agilent Gene Expression hybridization kit (part number 5188-5242) as described in the Agilent Two-ColorMicroarrayBased Gene Expression Analysis v4.0 manual (G4140-90050). Following hybridization, the microarrays were washed in accordance with Agilent's standard procedures and scanned using an Agilent DNA microarray scanner (model G2565A). The generated scans were converted to data files with Agilent's Feature Extraction software (v9.5). The DNA microarray data sets were processed as previously described 53,74,75 . Differential expression tests were performed with the Cyber-T implementation of a variant of the t-test 76 . A gene was considered to exhibit a significantly different transcription level relative to the control when p < 0.001 and an expression ratio of >2 or <0.25. The microarray data sets obtained in this study have been deposited in NCBI's Gene Expression Omnibus database and are accessible through GEO series accession number GSE108949.

AraQ and MalR1 expression and purification.
For the construction of plasmids pQE60 + araQ and pQE60 + malR1, DNA fragments encompassing araQ (corresponding to locus tag Bbr_0411) and malR1 (corresponding to locus tag Bbr_1846) were generated by PCR amplification employing chromosomal DNA of B. breve UCC2003 as a template, Q5 high-fidelity DNA polymerase, and primers araQ_F and araQ_R, and mal-R1_F and malR1_R, respectively (Supplemental Table S2). An in-frame His10-encoding sequence is contained within the 3' end of the pQE60 construct to facilitate downstream protein purification. The araQ-encompassing PCR product was digested with NcoI and BglII, while the malR1-encompassing PCR product was digested with NcoI and BamHI. The digested PCR products were ligated into a similarly digested pQE60, an IPTG-inducible translational fusion plasmid. The ligation mixtures were introduced into E. coli XL1-Blue or E. coli EC101 by electro-transformation, and transformants were then selected on the basis of Ampicillin resistance (Amp R ). The plasmid contents of a number of Amp R transformants were screened by restriction analysis, and the integrity of positively identified clones was verified by sequencing. One verified clone of plasmid pQE60 + araQ and pQE60 + malR1 (i.e. plasmid pQE60 in which either araQ or malR1, respectively, was cloned) ( Table 2) was then selected for protein expression and purification purposes.
E. coli XL1-BLUE was utilised for heterologous expression of AraQ and MalR1. E. coli XL1-BLUE strains containing either pQE60 + araQ or pQE60 + malR1 were inoculated at 2% in LB medium and grown until an OD 600nm of ∼0.5 was reached, at which point protein expression was induced by the addition of IPTG 25 . Following incubation for a further two hours cells were harvested by centrifugation and re-suspended in EMSA buffer (see below). Bacterial cells were disrupted by bead beating in a mini-bead beater (BioSpec Products, Bartlesville, OK, USA) using glass beads. Cellular debris was removed by centrifugation to produce a crude cell extract. Recombinant AraQ and MalR1 proteins, each tagged with an incorporated C-terminal His 10 sequence, were purified from a crude cell extract using a nickel-nitrilotriacetic acid column (Qiagen, Hilden, Germany) according to the manufacturer's instructions (QIAexpressionist, June 2003). Lysis, wash and elution buffers were supplemented with 10% glycerol as this considerably improved the (binding) activity of AraQ and MalR1. Elution fractions were analysed by SDS-polyacrylamide gel electrophoresis, as described previously 77  Electrophoretic mobility shift assay (EMSA). Promoter regions of genes of interest were amplified by PCR utilising individual primer pairs, of which one or both were 5' Ird-700-labelled (provided by IDT, Dresden, Germany) as listed in supplemental Table S3. Electrophoretic mobility shift assays (EMSAs) were performed as described previously 78  www.nature.com/scientificreports www.nature.com/scientificreports/ at 37 °C prior to loading onto a 6% non-denaturing polyacrylamide (PAA) gel prepared in TAE buffer (40 mM Tris-acetate [pH 8.0], 2 mM EDTA) and bound/unbound DNA fragments were then separated by electrophoresis on a 0.5X to 2.0X gradient of TAE at 100 V for 90 min in an Atto Mini PAGE system (Atto Bioscience and Biotechnology, Tokyo, Japan).
Signals and percentage binding inferred from the Integrated Intensity (II) were detected/calculated using an Odyssey infrared imaging system (Li-Cor Biosciences, United Kingdom, Ltd., Cambridge, United Kingdom), and images were electronically captured with the use of Odyssey software v3.0. The ability of AraQ and MalR1 to bind DNA fragments was assessed employing an arbitrary, non-linear scale which spanned from no affinity, low, medium to high affinity (and annotated as −, +, ++, +++, respectively). This non-linear scale was correlated to the percentage of DNA which was bound as compared to unbound target DNA within a given reaction with -representing no DNA bound, + represents up to 15% DNA bound, ++ represents 15% -50% DNA bound and +++ represents 50-100% DNA bound. In some cases, when a DNA fragment was shown to exhibit binding affinity, the binding site was more precisely determined by generating sub-fragments by PCR, followed by EMSAs as described above. To identify possible effectors for AraQ or MalR1, 10 mM of a particular compound (Supplemental Table S4) was added to the binding reaction mixture.

Motif searches.
Motif searches were carried out using Position Weight Matrix (PWM) approach as described previously 40 . Several PWMs were implemented: (i) a PWM used for the AraQ regulon reconstruction in bifidobacteria based on a previous study 40 ; (ii) a new AraQ/MalR1 PWM which was built using the EMSA data as a training set (based on promoter regions with which AraQ/MalR1 had a medium affinity (++) or high affinity (+++). A graphical representation of the identified motifs was obtained using WebLOGO software 79 .
Phenotypic analysis on bile. In order to assess the effects of bile on growth, B. breve UCC2003 and its isogenic derivatives B. breve UCC2003-araQ and B. breve UCC2003-malR1 were inoculated at 1% (v/v) from stock into mMRS medium supplemented with 1% maltose and 0.5% (v/v) L-cysteine HCl and were cultured overnight under anaerobic conditions at 37 °C. The above strains were then inoculated at 1% (v/v) into mMRS medium containing 1% maltose, 0.5% (v/v) L-cysteine HCl along with increasing concentration of porcine bile at 0, 1 and 2% (v/v). mMRS without the addition of a carbohydrate source served as a negative control. Bacteria were evaluated for their ability to grow in the presence of bile with Optical density (OD 600nm ) readings taken at 10 and 24 hr post-inoculation. Samples were assessed as biologically independent triplicates. Statistical analysis was carried out using the Microsoft Excel Data Analysis ToolPak. The statistical significance was determined for the two mutants, by comparing the difference in their OD 600nm readings as compared with the WT control when exposed to bile.

Data availability
All data generated and/or analysed during this study are included in this article and its Supplementary Information files.