Transcriptional portrait of M. bovis BCG during biofilm production shows genes differentially expressed during intercellular aggregation and substrate attachment

Mycobacterium tuberculosis and M. smegmatis form drug-tolerant biofilms through dedicated genetic programs. In support of a stepwise process regulating biofilm production in mycobacteria, it was shown elsewhere that lsr2 participates in intercellular aggregation, while groEL1 was required for biofilm maturation in M. smegmatis. Here, by means of RNA-Seq, we monitored the early steps of biofilm production in M. bovis BCG, to distinguish intercellular aggregation from attachment to a surface. Genes encoding the transcriptional regulators dosR and BCG0114 (Rv0081) were significantly regulated and responded differently to intercellular aggregation and surface attachment. Moreover, an M. tuberculosis H37Rv deletion mutant in the Rv3134c-dosS-dosR regulon formed less biofilm than wild-type M. tuberculosis, a phenotype reverted upon reintroduction of this operon into the mutant. Combining RT-qPCR with microbiological assays (colony and surface pellicle morphologies, biofilm quantification, Ziehl–Neelsen staining, growth curve and replication of planktonic cells), we found that BCG0642c affected biofilm production and replication of planktonic BCG, whereas ethR affected only phenotypes linked to planktonic cells despite its downregulation at the intercellular aggregation step. Our results provide evidence for a stage-dependent expression of genes that contribute to biofilm production in slow-growing mycobacteria.


Results
Transcriptional profiling during intercellular aggregation and substrate attachment. Our model of biofilm production by BCG consists of four distinct stages, defined by visual inspection as cultures progress in Sauton medium from planktonic cells to mature biofilms. BCG starts as free-swimming bacteria (24 h) that, in the absence of detergent, form microcolonies of aggregated cells that are readily visible (7 days). Later, these aggregates attach to the plastic wells (10 days) and finally produce mature surface pellicles that cover the entire air-liquid interface as well as part of the plastic wells (14 days). We previously demonstrated that these steps can be detected visually during BCG biofilm formation at these time points 9 . To reduce variation between experiments, we always started biofilm cultures with cells adjusted to an OD600nm of 0.03.
To investigate how BCG responds to intercellular aggregation, we used differential gene expression analysis to compare the transcriptome of 7-day-old cultures (visible intercellular aggregation) with that of 24 h cultures (basal transcriptome of BCG planktonic cells). Next, we interrogated how BCG specifically responds to substrate attachment by comparing the transcriptome of 10-day-old cultures (visible attachment to the plastic walls) with that of 7-day-old cultures (visible intercellular aggregation). The BCG Pasteur 1173P2 genome, used as a reference, has 4,109 protein- and RNA-encoding genes. In our assays, we detected differential gene expression [whether considered significant (Log2 fold change ≥ 1 or ≤ −1 together with p < 0.05) or not] when comparing intercellular aggregation with growth as planktonic cells. We found mostly gene downregulation during this transition [1,503 upregulated (37.5% of the coding potential), 2,605 downregulated (65%), Supplementary Table 1]. For substrate attachment, as compared with intercellular aggregation, the response was less biased [1,975 (49.3%) upregulated and 2,132 (53.2%) downregulated, Supplementary Table 1].
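The significance criteria described above (absolute Log2 fold change of at least 1 together with p < 0.05) can be sketched as a small classifier. This is an illustrative sketch only, not the pipeline used in this work: the names `GeneResult` and `classify` are hypothetical, the lsr2 fold change is the one reported in the text, and the p values and the second gene are placeholders.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fc: float   # Log2 fold change versus the previous growth stage
    p_value: float


def classify(result: GeneResult, fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene as 'up', 'down', or 'not DE'.

    A gene counts as differentially expressed only when BOTH criteria hold:
    |Log2 FC| >= fc_cutoff and p < p_cutoff.
    """
    if result.p_value >= p_cutoff or abs(result.log2_fc) < fc_cutoff:
        return "not DE"
    return "up" if result.log2_fc > 0 else "down"


# lsr2: Log2 FC of 0.95 as reported in the text; the p value is a placeholder.
lsr2 = GeneResult("lsr2", 0.95, 0.01)
# Hypothetical gene that passes both cutoffs.
example = GeneResult("geneX", -2.3, 0.001)

for r in (lsr2, example):
    print(r.gene, classify(r))
```

Under these cutoffs a gene such as lsr2, whose change is statistically significant but below the fold-change threshold, is reported as not DE.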
The total of differentially expressed genes showing a statistically significant difference (Log2 fold change ≥ 1 or ≤ −1, p < 0.05) with respect to the reference, previous growth stage are shown in Supplementary  14 , although their temporal requirement during biofilm production was not elucidated. Later, lsr2 was implicated in intercellular aggregation 1 . We found that PE1, nirB, and lsr2 were moderately upregulated (Log2 FC = 0.6, 0.7, and 0.95, respectively, Supplementary Table 1) during the transition from planktonic to intercellular aggregation, while their expression moderately decreased (Log2 FC = −0.88, −0.64, and −0.8, respectively, Supplementary Table 1) during substratum attachment. In both instances, the FC set up in our screening to find differentially expressed genes (Log2 ≥ 1) was not reached, and therefore these 3 genes were not considered as DE in our analyses, although we acknowledge that the p-value found for these genes was statistically significant and below the threshold of p ≤ 0.05. As for PE5 and pks1, neither of these genes complied with the FC and p value criteria set up here to be considered as DE. GroEL1 was reported to be required for biofilm production in M. smegmatis via its binding to KasA and regulation of mycolic acids synthesis and biofilm maturation 6 . On the other hand, in BCG GL2, deletion of groEL1 produced thinner surface pellicles, devoid of PDIM and with 2-carbon longer mycolic acids 15 , therefore implicating a more complex role for this chaperone in modulation of the cell surface for biofilm production in mycobacteria. In our work, transcription of groEL1 was found to be significantly repressed during the transition from planktonic to intercellular aggregation (Supplementary Table 1), while it was significantly induced after substratum attachment (Supplementary Table 1).
Scientific Reports | (2020) 10:12578 | https://doi.org/10.1038/s41598-020-69152-2 www.nature.com/scientificreports/
In agreement with our recent report 9 , genes involved in mycolic acid biosynthesis (kasA, kasB, acpM, fas) were significantly induced after substratum attachment (Supplementary  Table 1), therefore confirming their upregulation during biofilm formation in BCG.
Taken together, the most significant changes that BCG experiences during the early stages of biofilm production were the downregulation of genes of the DosR-regulon during intercellular aggregation, and their upregulation upon substrate attachment. Further, we found a plausible explanation for the temporary requirement of genes already reported to be required for biofilm production (PE1, nirB, lsr2, groEL1, kasA, kasB, fas, and acpM) in mycobacteria.
We then evaluated the expression of BCG0114, dosR, BCG0642c, ethR, and BCG3766c by RT-qPCR. dosR and BCG0114 were specifically downregulated at the intercellular aggregation step, while both were upregulated at the substrate attachment stage (Supplementary Table 1), and as part of the DosR-regulon their role in regulating expression of other genes has been described 12,13,16 . BCG0642c, which encodes a conserved hypothetical protein with a PhdYeFM antitoxin domain, was significantly upregulated only during intercellular aggregation and downregulated (p = 0.06) at the substrate attachment step (Supplementary Table 1). ethR was specifically downregulated during surface attachment, while BCG3766c, which encodes a conserved hypothetical proline-rich protein, was close to significant downregulation during surface attachment and almost reached the criteria to be considered significantly affected during the intercellular aggregation step (FC 0.76, p = 0.022).
We sought to evaluate and validate the expression of these 5 selected genes by RT-qPCR at the same stages as we did for RNA-Seq analyses (Fig. 1a). Given that the mean Ct value for each gene of interest with respect to the reference gene, rrs, showed a statistically significant difference between the 24 h time point compared to the remaining ones (p < 0.0001 compared with 7 days, 10 days, and 14 days, one-way ANOVA followed by Tukey's multiple comparison test), we were able only to compare differential gene expression as measured by RNA-Seq to that of the substrate attachment (10 days) versus intercellular aggregation (7 days) step as determined by RT-qPCR. Transcription of the reference gene, rrs, was found to be non-significantly different for biofilm samples (mean Ct values of 9.9, 9.7, and 10.7 at days 7, 10, and 14, respectively, with p values 0.9724 for the 10 vs. 7 days comparison, 0.2921 for the 14 vs. 7 days comparison, and 0.282 for the 10 vs. 14 days comparison, determined by the Brown-Forsythe and Welch ANOVA test followed by Dunnett's post-hoc comparison). Similarly, rrs transcription was not statistically different for planktonic cultures, with mean Ct values of 8.9 (log phase) and 9.4 (stationary phase cultures, p = 0.1409 after a two-tailed, unpaired t test with Welch correction).
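For readers unfamiliar with how Ct values normalized to a reference gene translate into a fold change, the following is a minimal sketch using the widely used 2^-ΔΔCt calculation. The text does not state which quantification formula was applied, so this method, the assumed 100% amplification efficiency, the function name `fold_change_ddct`, and the target-gene Ct values are all assumptions made for illustration; only the rrs Ct means (9.9 at day 7, 9.7 at day 10) come from the text.

```python
def fold_change_ddct(ct_target_test: float, ct_ref_test: float,
                     ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative expression by the 2^-ddCt method (assumes ~100% PCR efficiency).

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(test) - dCt(control)
    """
    d_ct_test = ct_target_test - ct_ref_test
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(d_ct_test - d_ct_ctrl)


# rrs Ct means from the text: 9.9 (7 days, control) and 9.7 (10 days, test).
# Target-gene Ct values below are hypothetical placeholders.
fc = fold_change_ddct(ct_target_test=20.8, ct_ref_test=9.7,
                      ct_target_ctrl=22.0, ct_ref_ctrl=9.9)
print(f"fold change at 10 days vs. 7 days: {fc:.2f}")  # about 2-fold up
```

A stable reference gene matters here because any shift in its Ct between conditions propagates directly into ΔΔCt, which is why the 24 h time point could not be compared in this way.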
Further evaluating gene expression by RT-qPCR, we noticed that upregulation of BCG0114 and dosR, and downregulation of ethR and BCG0642c, both of which started at the substrate attachment step, were maintained during biofilm maturation (Fig. 1a). In contrast, induction of BCG3766c was found only during substrate attachment, with no further change during biofilm maturation (Fig. 1a).
To complete our gene expression analyses, we used RT-qPCR to monitor the expression of BCG0114, dosR, BCG0642c, ethR, and BCG3766c in planktonic cultures of BCG at early-log and stationary phase (Fig. 1b). Using early-log phase cultures as a reference, we observed that expression of all genes in stationary cultures followed the same pattern as they did during intercellular aggregation (BCG0114 and dosR upregulated; BCG0642c, ethR, and BCG3766c downregulated). Hence, RT-qPCR validated differential expression for 4 out of 5 genes selected from RNA-Seq assays. Moreover, it showed that expression of the 5 selected targets in stationary phase planktonic cultures resembled the pattern found during intercellular aggregation.
Phenotypic changes in multicellular BCG aggregates derived from increased expression of dosR, BCG0114, BCG0642c, ethR, and BCG3766c. Having confirmed that dosR, BCG0114, BCG0642c, ethR, and BCG3766c showed differential expression specifically at either the intercellular aggregation or substrate attachment steps, we next evaluated the effect of increasing their expression by inserting an additional single copy of each one of them into BCG Pasteur via pMV361, under the control of the strong promoter hsp60 17 . Expression of other genes from this promoter has already been shown by us to result in downstream changes at the transcriptomic 18 and proteomic levels, altered infectivity in vitro 19 , and improved immunogenicity or vaccine efficacy in vivo 20 . Using this approach, we uncoupled gene transcription from the temporary differential expression observed in RNA-Seq and RT-qPCR assays.
Our initial assessment of the phenotypic changes in BCG was focused on those related to multicellular aggregates, such as colony morphology, surface pellicle appearance, Ziehl-Neelsen staining of biofilm samples, and biofilm formed at 10 (substrate attachment) and 14 days (biofilm maturation) as measured by crystal violet staining. Regarding colony morphology (Fig. 2a) complex, such as: irregular form, waxy dry appearance, wrinkled and rough surface with irregular margins. We also found some differences among the strains, as follows: BCG strains harboring additional copies of BCG0114 (BCG::BCG0114) and dosR (BCG::dosR) showed a grayish color, unlike the other strains, which presented a light yellowish color. Also, both strains produced smaller and flatter colonies compared with the others. The strain with an additional copy of BCG3766c (BCG::BCG3766c) was the one most similar to the wild type strain harboring the empty vector (BCG::pMV361); the only difference was its smaller size, with both of them showing an irregular elevation in the center of the colony. Finally, strains harboring an additional copy of ethR (BCG::ethR) or BCG0642c (BCG::BCG0642c) were very similar to each other, the main difference being their size (Fig. 2a).
When we followed biofilm formation in multiwell plates, we found no noticeable difference at the substrate attachment step (10 days, Fig. 2c) and only minor variations in surface pellicles when biofilms were mature (14 days, Fig. 2d), although a lower rugosity was consistently observed for both BCG::BCG0114 and BCG::dosR cultures (Fig. 2d). No major changes were observed in acid-fastness or intercellular adherence of the different BCG strains in mature biofilms, with the exception of some metachromatic-like granules present in BCG::ethR and BCG::BCG0642c (Fig. 2e).
Biofilm formed by any of the BCG strains at the substrate attachment step showed no quantitative difference compared with wild type BCG (Fig. 2f), and only BCG::BCG0642c produced more mature biofilm than wild type BCG (p < 0.0001) with BCG::BCG3766c almost reaching significance for a reduced production of this structure (p = 0.0521, One-Way ANOVA followed by Dunnett's multiple comparison test) (Fig. 2g). We also evaluated the capacity to produce biofilms by M. tuberculosis strains with different dosR contents and found that a mutant lacking this operon (H37Rv dosR KO) was affected in its biofilm formation (p = 0.0426, One-Way ANOVA followed by Dunnett's multiple comparison test) with capacity being restored to wild-type levels upon reinsertion of the gene into the chromosome (H37Rv dosR KO::Comp, Fig. 2h).
In summary, compared to wild type BCG, hsp60-driven expression of dosR and BCG0114, led to smaller colonies on agar, with changes in color and elevation, and also produced smoother surface pellicles with no quantitative effect on biofilm production. Moreover, deletion of dosR in M. tuberculosis H37Rv reduced biofilm production (Fig. 2). hsp60-driven expression of BCG0642c enhanced biofilm production with formation of metachromatic-like granules in acid-fast bacteria (Fig. 2). hsp60-driven expression of BCG3766c reduced colony size and surface pellicle rugosity, although less than dosR or BCG0114, and tended to decrease biofilm production (Fig. 2). ethR expression from hsp60 did not produce any detectable difference in the assays performed here (Fig. 2).

Phenotypic changes in planktonic BCG cells derived from increased expression of dosR, BCG0114, BCG0642c, ethR, and BCG3766c. After evaluating the effect of hsp60-driven expression of dosR, BCG0114, BCG0642c, ethR, and BCG3766c in BCG multicellular phenotypes, we next evaluated the effect of these genes in planktonic BCG cultures. We did not see any major difference in acid-fastness of early-log phase bacteria, except for the presence of metachromatic-like granules in BCG::BCG0642c (Fig. 3a), as occurred for this strain in biofilm cultures (Fig. 2). An apparent tendency to form tight bundles was also noticed for BCG::ethR and BCG::BCG3766c, although this was not quantified. Regarding the growth curve of planktonic BCG cultures, we monitored apparent growth by reading OD600nm every 24 h. The most pronounced differences, observed as an apparent growth delay in OD600nm readings, occurred when BCG had hsp60-driven expression of either dosR (significant differences on days 2 and 3, and from day 5 to day 12) or BCG0114, which differed significantly at all time points except at the start of the culture (Fig. 3b). Differences in OD600nm coincided with changes in doubling time for the BCG::dosR strain as compared with BCG::pMV361 at days 5  Even though OD600nm for BCG::BCG0114 suggested a marked growth defect as compared with wild type BCG harboring the empty vector (BCG::pMV361), doubling time indicated that there was indeed a significant difference between these strains, but only at day 6 of culture (BCG::pMV361, 55.17 ± 8 h vs. BCG::BCG0114, 88.65 ± 6.66 h; p = 0.0131). On the other hand, apparent growth of BCG::ethR significantly differed from wild type BCG on day 0 (start of the culture) and day 4 (early log-phase) (Fig. 3b), although their doubling times differed only at day 5 of culture (BCG::pMV361, 35.6 ± 4.18 h vs. BCG::ethR, 43.37 ± 5.04 h; p = 0.038).
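As an illustration of how a doubling time can be derived from two consecutive OD600nm readings, the sketch below uses the standard exponential-growth relation t_d = Δt · ln 2 / ln(OD2/OD1). The text does not specify how doubling times were computed, so this formula, the function name `doubling_time_h`, and the example readings are assumptions for illustration only.

```python
import math


def doubling_time_h(od_start: float, od_end: float, interval_h: float) -> float:
    """Doubling time (hours) between two OD readings, assuming exponential growth.

    t_d = interval * ln(2) / ln(od_end / od_start)
    """
    if od_start <= 0 or od_end <= 0:
        raise ValueError("OD readings must be positive")
    if od_end <= od_start:
        raise ValueError("no net growth between readings; doubling time undefined")
    return interval_h * math.log(2) / math.log(od_end / od_start)


# Hypothetical readings: OD600nm rises from 0.10 to 0.40 over 48 h,
# i.e. two doublings, so the doubling time is about 24 h.
print(round(doubling_time_h(0.10, 0.40, 48.0), 2))
```

Because OD600nm also reflects clumping and cell size, a longer apparent doubling time does not necessarily mean fewer viable bacteria, which is why the CFU counts are reported alongside the growth curves.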
OD600nm readings of BCG0642c significantly differed on days 0, 2, 4, and from day 10 to day 12 (stationary phase) (Fig. 3b). Next, we compared bacterial replication as colony-forming units (CFU) per milliliter at 3 stages: day 0 (start of the culture), day 4 (early-log phase), and day 10 (stationary phase). We found that despite the apparent growth delay produced in BCG upon hsp60-driven expression of dosR or BCG0114, specifically in the mid-log to stationary phase transition (Fig. 3b), no significant difference in terms of bacterial numbers was found in any of the three stages evaluated here (Fig. 3c). On the other hand, in agreement with differences found in the growth curve at day 4, BCG::ethR had higher CFU/mL than wild type BCG (p < 0.0001 for early-log cultures, Fig. 3c). Also in agreement with changes observed in the growth curve was the replication of early-log and stationary phase cultures of BCG::BCG0642c (p = 0.0022 for early-log, and p = 0.0125 for stationary phase cultures, Fig. 3c).
Figure 3. Ziehl-Neelsen staining of the different BCG strains sampled from mid-log phase planktonic cultures in 7H9 OADC 0.05% Tween 80 (a). Growth (as OD600nm readings) of each recombinant BCG was compared with that of parental BCG harboring the empty vector pMV361 (BCG WT::pMV361) (b). Two-Way ANOVA followed by Dunnett's multiple comparison test was used to compare apparent growth, using four biological duplicates; p values for the recombinant strains compared with BCG::pMV361 being described in results. Bacterial replication (CFU/mL) (c) was compared by 2-Way ANOVA followed by Tukey's multiple comparison test, using four biological duplicates. Each experiment was repeated independently two times, and one representative result is shown. Brackets encompass the comparisons for which statistically significant p values are shown on top of the bars depicting the means.
No significant change was detected in replication of BCG::BCG3766c compared with BCG::pMV361 (Fig. 3c).

Discussion
Bacteria must accurately regulate growth and stress resilience. The formation of biofilms contributes to stress survival, since these dense multicellular aggregates, in which cells are embedded in an extracellular matrix of self-produced polymers, represent a self-constructed protective 'niche' 21 that yet remains metabolically active even after reaching maturity 22 .
In M. tuberculosis complex bacteria, biofilm production in vitro has been shown to harbor drug-tolerant bacteria 7 and to be genetically linked to this phenotype 8 . Drug-tolerant mycobacteria may comprise a fraction of the population, as mature M. tuberculosis biofilms showed sensitivity to antibiotic treatment when assessed with microcalorimetry (IMC) and tunable diode laser absorption spectroscopy 22 . Drug tolerance has also been found to occur in ex vivo caseum 23 and is thought to contribute to persistence after drug treatment in vivo 24 . In fact, the lungs of chronically infected mice harbored a subpopulation of nongrowing but metabolically active M. tuberculosis 25 .
Evidence reported over the last decade associates the capacity of M. tuberculosis-complex bacteria to produce biofilms with virulence in ex vivo or in vivo models 26 . Pang et al. 14 found several genes that were required for biofilm production using a "formation/no formation" readout, while Yang et al. 1 utilized a genetic approach to propose a temporal order for development of mycobacterial biofilms using M. smegmatis as a model.
In this work, we performed an unbiased, whole-transcriptome analysis aimed at finding genes differentially expressed during intercellular aggregation and substrate attachment. The rationale was that altering the expression levels of such genes may result in either major or subtle changes during biofilm formation by BCG. This contrasts with both the strategy of Pang et al. 14 , based on a "formation/no formation" readout in microtiter plates, and the one used by Yang et al. 1 , which relied on a clever yet static approach, as these authors screened only one time point to look for M. smegmatis mutants with altered capacity to produce biofilms within a syringe-based model.
It is worth noting that we utilized a panel of 5 DE tools to identify gene expression changes. Selection of genes with potential relevance for the intercellular aggregation and substrate attachment steps during biofilm production by BCG were based on their up-or down-regulation by a twofold or greater change coupled with p < 0.05 after multivariate analysis. Using these criteria, we observed that the most significant changes that BCG experiences at the early stages of biofilm production are the downregulation of part of the DosR-regulon during intercellular aggregation, and their upregulation upon substrate attachment. Therefore, we selected 2 genes from this regulon: dosR itself, and BCG0114 (homologous to Rv0081), and characterized the effects of the strong expression of these genes from the hsp60 promoter both during multicellular and planktonic growth.
DosR has been shown to respond to oxygen limitation and to the presence of nitric oxide in vitro 12,13 . We found that only the absence of dosR in M. tuberculosis, as opposed to its increased expression in BCG, results in reduced biofilm production. We hypothesize that this may be explained by slow-growing mycobacteria requiring DosR to adapt metabolically to low O2 levels, as already reported 27 . This result is also in agreement with the reduced dosR transcription and reduced biofilm production observed in a double mutant devoid of the exopolyphosphatase genes 28 and also in an mtrB mutant 29 .
Changes in surface pellicle appearance during biofilm production upon dosR or BCG0114 expression from hsp60 constitute a subtle phenotype that may go unnoticed when assessing transposon insertion mutants with a "formation/no formation" readout, as we found no quantitative effect on biofilm production. The relevance of this biofilm-specific change for other in vitro or in vivo phenotypes remains to be determined. We contend that altering either dosR or BCG0114 expression results in unique, biofilm-specific phenotypes, as during planktonic growth their expression from hsp60 delayed logarithmic growth (Fig. 3b) but had no effect on bacterial replication at the beginning, early-log, and stationary phases (Fig. 3c). This difference might be explained, at least to some extent, by differences in oxygen availability in biofilms, as has been suggested 9 . In support of this hypothesis, it is worth noting that the already complex regulatory network utilized by M. tuberculosis to respond to hypoxia 16 has just been refined using a comprehensive genome-wide transcription factor binding map and network topology analysis. This unraveled the M. tuberculosis response during adaptation to varying oxygen levels (normoxia, depletion, early-, mid-, and late hypoxia, and resuscitation) and further supported the role of Rv0081 (BCG0114), DosR, and Lsr2 in adaptation to oxygen availability 30 , regulators that were differentially expressed at distinct stages of biofilm production in BCG (Table 1).
BCG0642c, which encodes a conserved hypothetical protein with a PhdYeFM antitoxin domain, was significantly upregulated only during the intercellular aggregation step and tended towards downregulation (p = 0.06) at the substrate attachment step (Supplementary Table 1). To date, only the structure and function as an antitoxin of VapB4 (Rv0596c, the ortholog of BCG0642c) have been described 31 , with no other role identified thus far. However, toxin-antitoxin modules have been shown to play a major role in persister formation in many model systems 32 ; it therefore seems reasonable to find at least one of these genes differentially expressed and contributing to biofilm production in BCG (Fig. 2). We acknowledge that a unique, biofilm-specific role for BCG0642c cannot be claimed at this point, given that its expression from hsp60 also affected planktonic replication, positively during early-log phase and negatively in stationary phase planktonic BCG cultures (Fig. 3).
ethR was specifically downregulated during surface attachment (Supplementary Table 1) but its expression from hsp60 resulted in no change during biofilm production (Fig. 2) yet it favored growth and replication during early-log phase (Fig. 3) to mammalian cells 33 but its capacity to produce biofilm was not evaluated in that work. Nevertheless, another report stated that ethR did not participate in biofilm production in M. tuberculosis 34 , which seems in agreement with our findings in BCG. The last gene we evaluated was BCG3766c, which encodes a conserved hypothetical proline-rich protein (Supplementary Table 1). This gene was significantly downregulated during surface attachment and was also downregulated during the intercellular aggregation step (FC 0.76, p = 0.022). This may explain why its expression from hsp60 tended to reduce biofilm production (Fig. 2).
We also found differential expression for a number of other genes that we did not further evaluate in this work, including among the most significantly upregulated genes sigE, fadE23 (fatty-acid-CoA ligase, involved in sulfolipid production) 10 , hupB (DNA binding protein), BCG3929 (Rv3866, espG), ppsC (involved in PDIM synthesis), BCG1191 (Rv1130, prpD, 2-methylcitrate dehydratase), and BCG1826 (Rv1794, part of the ESX-5 secretion system) 11 . For BCG3929 (Rv3866, espG), a deletion of espG in M. marinum reduced sliding motility and biofilm formation 35 . BCG3929 upregulation during intercellular aggregation as compared to planktonic growth (Table 2) may explain the defect of the M. marinum mutant in biofilm formation. Biofilm formation by BCG in the presence of the histone methyltransferase SUV39H1 was reduced, an effect proposed to occur via trimethylation of HupB 36 . This suggests a positive effect for this DNA-binding protein for biofilm production, and it is in agreement with hupB upregulation during intercellular aggregation (Table 2).
Rv3385c (orthologous to BCG3454c) was shown to be repressed in mature biofilms formed upon exposure to DTT as compared to late-log cultures of M. tuberculosis 37 . Redox conditions intervene in modulating M. tuberculosis pathogenesis, including activity of DosR 38 , which, to add further complexity to the mechanisms of gene regulation driven by this transcriptional regulator, was recently shown to be positively affected by c-di-GMP binding in M. smegmatis in response, precisely, to oxidative stress 39 .
Biofilm-specific proteins were recognized by antibodies present in sera from M. tuberculosis infected guinea pigs 40 . Of the antigenic proteins reported in that study, we found ceoB and BCG2013 (Rv1996) significantly repressed during the transition from planktonic to intercellular aggregation (Supplementary Table 3), while they were significantly induced after substratum attachment (Supplementary Table 4). Moreover, BCG2232 (Rv2216) and TB39.8 (BCG0050c, Rv0020c, FhaA) were affected specifically after substratum attachment (Log2 FC = 1 and −0.74, respectively, Supplementary Tables 4 and 5). The fact that some biofilm-specific proteins that were recognized in vivo had their encoding genes differentially expressed during biofilm production in vitro by BCG further strengthens the notion of biofilms mimicking aspects found during TB pathogenesis 26 .
Taken together, our results show that dosR and BCG0114 were expressed in a temporal order during mycobacterial biofilm formation to produce biofilm-specific changes, which most likely are triggered in response to varying oxygen levels within biofilms. Furthermore, we provide a potential explanation for a stage-dependent expression of additional genes previously reported to contribute to biofilm production in mycobacteria and suggest new targets that can be assessed for their particular contribution to this phenotype.

Methods
Bacterial strains, growth conditions and RNA extraction. M. bovis BCG Pasteur strain (ATCC 35734) or M. tuberculosis H37Rv and derivatives with deletion of the Rv3134c-dosR-dosS operon (referred to as H37Rv dosR KO in Fig. 2) and its complemented strain (referred to as H37Rv dosR::KO::Comp in Fig. 2) 27 were used in this study. Planktonic cultures were grown in Middlebrook 7H9 liquid media (BD) with 10% OADC, 0.2% glycerol, and 25 µg/mL of kanamycin, at 37 °C, 100 rpm. Colony-forming units per milliliter (CFU/mL) were determined by plating serial dilutions of samples onto Middlebrook 7H10 agar plates supplemented with 10% OADC, 0.5% glycerol, and 25 µg/mL kanamycin. Biofilms (which include bacteria attached to the plastic wells and surface pellicles) for RNA extraction of BCG strains were cultured in Sauton media as already reported 9 . After 1, 7, 10 and 14 days of incubation, the whole surface pellicle and biofilm attached to the wells (these samples are referred to as "biofilms") were harvested with a scraper from two culture flasks and transferred into 50 mL tubes that were immediately frozen at −70 °C. From frozen samples, we performed RNA extraction and purification as already reported 41 , and shipped these samples to Arizona State University for RNA-Seq analyses. The experiment was repeated three (7, 10, and 14 days cultures) or four times (24 h cultures, because of the low biomass present at this time point), to produce independent replicates.
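The CFU/mL determination by serial dilution and plating mentioned above follows a simple back-calculation, sketched here for illustration. The plated volume, dilution step, and colony count are hypothetical values (the text does not report them), and the function name `cfu_per_ml` is likewise an assumption.

```python
def cfu_per_ml(colonies: int, dilution_exponent: int, plated_volume_ml: float) -> float:
    """Back-calculate CFU/mL of the undiluted culture.

    colonies          -- colonies counted on the plate
    dilution_exponent -- n for a plate prepared from the 10^-n dilution
    plated_volume_ml  -- volume of that dilution spread on the plate
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * (10 ** dilution_exponent) / plated_volume_ml


# Hypothetical plate: 42 colonies from 0.1 mL of the 10^-4 dilution,
# which corresponds to roughly 4.2e6 CFU/mL in the original culture.
print(f"{cfu_per_ml(42, 4, 0.1):.2e} CFU/mL")
```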
Temporal expression profiling during biofilm production by BCG. RNA was used to prepare cDNA using Nugen's Ovation RNA-Seq System via single primer isothermal amplification (Catalogue # 7102-A01), automated on the BRAVO NGS liquid handler (Agilent). cDNA was quantified on the Nanodrop (Thermo Fisher Scientific). Using Kapa Biosystem's DNA Hyper Plus library preparation kit (KK8514), cDNA was enzymatically sheared to approximately 150 bp fragments, end repaired and A-tailed. Adapters with unique indexes compatible with Illumina (IDT #00989130v2) were ligated on each sample individually; samples were then cleaned using Kapa pure beads (Kapa Biosciences, KK8002), followed by amplification with Kapa's HIFI enzyme (KK2502). Using Agilent's Tapestation, we analyzed the fragment size of each library, and quantified the libraries by qPCR (KAPA Library Quantification Kit, KK4835) on a Quantstudio 5 (Thermo Fisher Scientific). Next, we pooled the multiplexed libraries and sequenced them on a 2 × 75 flow cell on the NextSeq500 platform (Illumina) at the ASU's Genomics Core facility.
RNA-seq analysis. Raw FASTQ read data were processed using the R package DuffyNGS as described previously 42 . Briefly, raw reads were passed through a 3-stage alignment pipeline: (1) a prealignment step, to remove unwanted transcripts, such as rRNA; (2) a main genomic alignment step against the genome of interest; and (3) a splice junction alignment step, compared with an index of standard and alternative exon splice junctions. Reads were aligned to M. bovis BCG str. Pasteur (1173P2) with Bowtie2 43 , using the command line option "very-sensitive". BAM files from stages (2) and (3) were combined into read depth wiggle tracks that recorded both multiply mapped and uniquely mapped reads to each of the forward and reverse strands of the reference genome at single-nucleotide resolution. Next, gene transcript abundance was measured by summing total reads found inside annotated gene boundaries, expressed as both RPKM and raw read counts. RNA-seq data (raw fastq files and read counts) have been deposited in the GEO repository under accession number GSE150030.
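For reference, the RPKM unit mentioned above (reads per kilobase of transcript per million mapped reads) can be computed from raw read counts as sketched below. This is the standard definition of RPKM, not code taken from DuffyNGS; the function name `rpkm` and the example numbers are illustrative.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads over a 1,000 bp gene in a library of
# 10 million mapped reads.
print(rpkm(500, 1000, 10_000_000))  # 50.0
```

Normalizing by both gene length and library size makes abundances comparable across genes and samples, whereas raw read counts are what count-based differential expression tools generally take as input.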
Differentially expressed genes. A panel of 5 DE tools was used to identify gene expression changes between 1-week old biofilm samples and 24-h samples (to determine genes affected or necessary for intercellular aggregation or cell-to-cell attachment) or 10-day old biofilm samples and 1-week old biofilm samples (to determine genes affected or necessary for substratum attachment to start building up the mature biofilm). The tools included (1)   hsp60 promoter in pMV361 17 . Identity and fidelity of the ORFs was confirmed by DNA sequencing. Amplify4 for MacOS was used to design and test primers. DNA Strider 3.0 for MacOS was used for virtual cloning and plasmid characterizations. Sequence fidelity of cloned ORFs was evaluated using BLAST alignments both locally with DNA Strider 3.0 for MacOS and by direct comparison with genome sequences of BCG Pasteur 1173P2 (https://www.genome.jp/kegg-bin/show_organism?org=mbb). Recombinant plasmids were transformed into BCG by electroporation and selected on Middlebrook 7H10 (BD) OADC (BD-BBL) with 0.5% glycerol (Sigma) agar plates containing 25 µg/mL of kanamycin (Sigma).
Quantification of biofilm by crystal violet staining. All mycobacterial strains were cultured in Sauton media, started at OD600nm 0.03, in 24-well (M. tuberculosis) or 48-well non-treated tissue culture plates (BCG strains), and were incubated at 37 °C, 5% CO2. Each strain was inoculated into 6 different wells, and experiments were repeated three times for statistical analysis. After 10 days (BCG strains) and 14 days (BCG strains and M. tuberculosis strains) of incubation, liquid media was removed and the whole surface pellicle and biofilm attached to the wells (these samples are referred to as "biofilms") were retained. Plates were baked at 30 °C for 24 h, and 1 mL of 100% methanol was added to each well and incubated at room temperature for 15 min. Then, methanol was removed, and plates were dried at 37 °C for 15 min. Crystal violet (CV) was added to each well and incubated at 37 °C for 5 min. CV was removed and each well was washed four times with deionized water. Plates were dried at 37 °C for 15 min. Dye was extracted with 30% acetic acid for 15 min at 37 °C. Then the extract from each well was diluted 1:10 (M. tuberculosis), 1:20 (BCG, 10 days cultures), or 1:40 (BCG, 14 days cultures) in 30% acetic acid and read at OD550nm.
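Because the extracts were read at different dilutions, readings are only comparable across sample types after accounting for the dilution. A minimal sketch of that correction follows; the text does not state whether readings were reported raw or dilution-corrected, so the correction itself is an assumption. The dilution factors come from the protocol above, while the dictionary keys, the function name `corrected_od550`, and the example reading are hypothetical.

```python
# Dilution factors taken from the protocol: 1:10, 1:20, and 1:40.
DILUTION_FACTOR = {
    "Mtb_14d": 10,   # M. tuberculosis, 14-day cultures
    "BCG_10d": 20,   # BCG, 10-day cultures
    "BCG_14d": 40,   # BCG, 14-day cultures
}


def corrected_od550(measured_od: float, sample_type: str) -> float:
    """Scale a diluted OD550nm reading back to the undiluted extract."""
    if measured_od < 0:
        raise ValueError("OD reading cannot be negative")
    return measured_od * DILUTION_FACTOR[sample_type]


# Hypothetical reading of 0.35 for a 14-day BCG well (1:40 dilution).
print(round(corrected_od550(0.35, "BCG_14d"), 2))
```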
Statistics. Data distributions for qPCR, CFU, and biofilm quantification were analyzed using the Anderson-Darling and Shapiro-Wilk tests, and were found to follow a normal distribution in all instances. Growth (as OD600nm readings) of rBCG strains was compared with that of parental BCG harboring the empty vector pMV361 (BCG WT::pMV361) using Two-Way ANOVA followed by Dunnett's multiple comparison test. Growth (as doubling time) was compared by One-Way ANOVA (logarithmic phase cultures) or Brown-Forsythe and Welch ANOVA (stationary phase cultures) followed by Dunnett's multiple comparison tests. Bacterial replication (CFU/mL) was compared by Two-Way ANOVA followed by Tukey's multiple comparison test. For quantification of biofilms, statistical significance was determined using One-Way ANOVA followed by Dunnett's multiple comparison test. For RT-qPCR analyses, Brown-Forsythe and Welch ANOVA followed by Dunnett's multiple comparison test was used for comparing biofilm samples; multiple t tests followed by the Holm-Sidak multiple comparison test were used for comparing planktonic samples. GraphPad Prism 8 for MacOS was used for performing statistical analyses. Assays were conducted on three independent occasions, and the number of replicates per experiment is indicated in each figure legend.