Abstract
Bacteria often respond to dynamically changing environments by regulating gene expression. Despite this regulation being critically important for growth and survival, little is known about how selection shapes gene regulation in natural populations. To better understand the role natural selection plays in shaping bacterial gene regulation, here we compare differences in the regulatory behaviour of naturally segregating promoter variants from Escherichia coli (which have been subject to natural selection) to randomly mutated promoter variants (which have never been exposed to natural selection). We quantify gene expression phenotypes (expression level, plasticity and noise) for hundreds of promoter variants across multiple environments and show that segregating promoter variants are enriched for mutations with minimal effects on expression level. In many promoters, we infer that there is strong selection to maintain high levels of plasticity, and direct selection to decrease or increase cell-to-cell variability in expression. Taken together, these results expand our knowledge of how gene regulation is affected by natural selection and highlight the power of comparing naturally segregating polymorphisms to de novo random mutations to quantify the action of selection.
Data availability
The data files can be accessed through the Figshare repository with the identifier https://doi.org/10.6084/m9.figshare.c.5517228.v6
Code availability
All scripts with access to original data files are available through the Zenodo repository with the identifier https://doi.org/10.5281/zenodo.6494122
References
Hawkins, J. S. et al. Mismatch-CRISPRi reveals the co-varying expression–fitness relationships of essential genes in Escherichia coli and Bacillus subtilis. Cell Syst. 11, 523–535 (2020).
Keren, L. et al. Massively parallel interrogation of the effects of gene expression levels on fitness. Cell 166, 1282–1294 (2016).
Dekel, E. & Alon, U. Optimality and evolutionary tuning of the expression level of a protein. Nature 436, 588–592 (2005).
Bedford, T. & Hartl, D. L. Optimization of gene expression by natural selection. Proc. Natl Acad. Sci. USA 106, 1133–1138 (2009).
Rohlfs, R. V., Harrigan, P. & Nielsen, R. Modeling gene expression evolution with an extended Ornstein–Uhlenbeck process accounting for within-species variation. Mol. Biol. Evol. 31, 201–211 (2014).
de Boer, C. G. et al. Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat. Biotechnol. 38, 56–65 (2020).
Ireland, W. T. et al. Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time. eLife 9, e55308 (2020).
Brewster, R. C., Jones, D. L. & Phillips, R. Tuning promoter strength through RNA polymerase binding site design in Escherichia coli. PLoS Comput. Biol. 8, e1002811 (2012).
Brewster, R. C. et al. The transcription factor titration effect dictates level of gene expression. Cell 156, 1312–1323 (2014).
Gertz, J., Siggia, E. D. & Cohen, B. A. Analysis of combinatorial cis-regulation in synthetic and genomic promoters. Nature 457, 215–218 (2009).
Maeda, Y. T. & Sano, M. Regulatory dynamics of synthetic gene networks with positive feedback. J. Mol. Biol. 359, 1107–1124 (2006).
Mangan, S., Itzkovitz, S., Zaslaver, A. & Alon, U. The incoherent feed-forward loop accelerates the response-time of the gal system of Escherichia coli. J. Mol. Biol. 356, 1073–1081 (2006).
Rosenfeld, N., Elowitz, M. B. & Alon, U. Negative autoregulation speeds the response times of transcription networks. J. Mol. Biol. 323, 785–793 (2002).
Duveau, F., Yuan, D. C., Metzger, B. P. H., Hodgins-Davis, A. & Wittkopp, P. J. Effects of mutation and selection on plasticity of a promoter activity in Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 114, E11218–E11227 (2017).
Hill, M. S., Vande Zande, P. & Wittkopp, P. J. Molecular and evolutionary processes generating variation in gene expression. Nat. Rev. Genet. 22, 203–215 (2021).
López-Maury, L., Marguerat, S. & Bähler, J. Tuning gene expression to changing environments: from rapid responses to evolutionary adaptation. Nat. Rev. Genet. 9, 583–593 (2008).
Basu, S., Mehreja, R., Thiberge, S., Chen, M.-T. & Weiss, R. Spatiotemporal control of gene expression with pulse-generating networks. Proc. Natl Acad. Sci. USA 101, 6355–6360 (2004).
Becskei, A. & Serrano, L. Engineering stability in gene networks by autoregulation. Nature 405, 590–593 (2000).
Eisen, H., Brachet, P., Pereira da Silva, L. & Jacob, F. Regulation of repressor expression in λ. Proc. Natl Acad. Sci. USA 66, 855–862 (1970).
Kalir, S., Mangan, S. & Alon, U. A coherent feed‐forward loop with a SUM input function prolongs flagella expression in Escherichia coli. Mol. Syst. Biol. 1, 2005.0006 (2005). https://doi.org/10.1038/msb4100010
Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980–11985 (2003).
Novick, A. & Weiner, M. Enzyme induction as an all-or-none phenomenon. Proc. Natl Acad. Sci. USA 43, 553–566 (1957).
Shen-Orr, S. S., Milo, R., Mangan, S. & Alon, U. Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31, 64–68 (2002).
Smits, W. K., Kuipers, O. P. & Veening, J.-W. Phenotypic variation in bacteria: the role of feedback regulation. Nat. Rev. Microbiol. 4, 259–271 (2006).
Madan Babu, M., Teichmann, S. A. & Aravind, L. Evolutionary dynamics of prokaryotic transcriptional regulatory networks. J. Mol. Biol. 358, 614–633 (2006).
Mayo, A. E., Setty, Y., Shavit, S., Zaslaver, A. & Alon, U. Plasticity of the cis-regulatory input function of a gene. PLoS Biol. 4, e45 (2006).
Metzger, B. P. H. & Wittkopp, P. J. Compensatory trans-regulatory alleles minimizing variation in TDH3 expression are common within Saccharomyces cerevisiae. Evol. Lett. 3, 448–461 (2019).
Schaerli, Y. et al. Synthetic circuits reveal how mechanisms of gene regulatory networks constrain evolution. Mol. Syst. Biol. 14, e8102 (2018).
Bar-Even, A. et al. Noise in protein expression scales with natural protein abundance. Nat. Genet. 38, 636–643 (2006).
Elowitz, M. B., Levine, A. J., Siggia, E. D. & Swain, P. S. Stochastic gene expression in a single cell. Science 297, 1183–1186 (2002).
Rossi, N. A., El Meouche, I. & Dunlop, M. J. Forecasting cell fate during antibiotic exposure using stochastic gene expression. Commun. Biol. 2, 259 (2019).
Silander, O. K. et al. A genome-wide analysis of promoter-mediated phenotypic noise in Escherichia coli. PLoS Genet. 8, e1002443 (2012).
Süel, G. M., Kulkarni, R. P., Dworkin, J., Garcia-Ojalvo, J. & Elowitz, M. B. Tunability and noise dependence in differentiation dynamics. Science 315, 1716–1719 (2007).
Taniguchi, Y. et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538 (2010).
Wolf, L., Silander, O. K. & van Nimwegen, E. Expression noise facilitates the evolution of gene regulation. eLife 4, e05856 (2015).
Urchueguía, A. et al. Genome-wide gene expression noise in Escherichia coli is condition-dependent and determined by propagation of noise through the regulatory network. PLoS Biol. 19, e3001491 (2021).
Duveau, F. et al. Fitness effects of altering gene expression noise in Saccharomyces cerevisiae. eLife 7, e37272 (2018).
Metzger, B. P. H., Yuan, D. C., Gruber, J. D., Duveau, F. & Wittkopp, P. J. Selection on noise constrains variation in a eukaryotic promoter. Nature 521, 344–347 (2015).
Govers, S. K., Adam, A., Blockeel, H. & Aertsen, A. Rapid phenotypic individualization of bacterial sister cells. Sci. Rep. 7, 8473 (2017).
Kotte, O., Volkmer, B., Radzikowski, J. L. & Heinemann, M. Phenotypic bistability in Escherichia coli’s central carbon metabolism. Mol. Syst. Biol. 10, 736 (2014).
Ronin, I., Katsowich, N., Rosenshine, I. & Balaban, N. Q. A long-term epigenetic memory switch controls bacterial virulence bimodality. eLife 6, e19599 (2017).
Acar, M., Mettetal, J. T. & van Oudenaarden, A. Stochastic switching as a survival strategy in fluctuating environments. Nat. Genet. 40, 471–475 (2008).
Veening, J.-W., Smits, W. K. & Kuipers, O. P. Bistability, epigenetics, and bet-hedging in bacteria. Annu. Rev. Microbiol. 62, 193–210 (2008).
Lewis, K. Persister cells, dormancy and infectious disease. Nat. Rev. Microbiol. 5, 48–56 (2007).
Ishii, S., Ksoll, W. B., Hicks, R. E. & Sadowsky, M. J. Presence and growth of naturalized Escherichia coli in temperate soils from Lake Superior watersheds. Appl. Environ. Microbiol. 72, 612–621 (2006).
Sakoparnig, T., Field, C. & van Nimwegen, E. Whole genome phylogenies reflect the distributions of recombination rates for many bacterial species. eLife 10, e65366 (2021).
Santos-Zavaleta, A. et al. RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12. Nucleic Acids Res. 47, D212–D220 (2019).
Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462 (1997).
Breckell, G. & Silander, O. K. Complete genome sequences of 47 environmental isolates of Escherichia coli. Microbiol. Resour. Announc. 9, e00222-20 (2020).
Watterson, G. A. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7, 256–276 (1975).
Monod, J. The growth of bacterial cultures. Annu. Rev. Microbiol. 3, 371–394 (1949).
Denver, D. R. et al. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat. Genet. 37, 544–548 (2005).
Belliveau, N. M. et al. Systematic approach for dissecting the molecular mechanisms of transcriptional regulation in bacteria. Proc. Natl Acad. Sci. USA 115, E4796–E4805 (2018).
Kinney, J. B. & McCandlish, D. M. Massively parallel assays and quantitative sequence–function relationships. Annu. Rev. Genomics Hum. Genet. 20, 99–127 (2019).
Harley, C. B. & Reynolds, R. P. Analysis of E. coli promoter sequences. Nucleic Acids Res. 15, 2343–2361 (1987).
Hornung, G. et al. Noise–mean relationship in mutated promoters. Genome Res. 22, 2409–2417 (2012).
Hodgins-Davis, A., Duveau, F., Walker, E. A. & Wittkopp, P. J. Empirical measures of mutational effects define neutral models of regulatory evolution in Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 116, 21085–21093 (2019).
Schmiedel, J. M., Carey, L. B. & Lehner, B. Empirical mean–noise fitness landscapes reveal the fitness impact of gene expression noise. Nat. Commun. 10, 3180 (2019).
Poelwijk, F. J., de Vos, M. G. J. & Tans, S. J. Tradeoffs and optimality in the evolution of gene regulation. Cell 146, 462–470 (2011).
Bernstein, M. R., Zdraljevic, S., Andersen, E. C. & Rockman, M. V. Tightly linked antagonistic-effect loci underlie polygenic phenotypic variation in C. elegans. Evol. Lett. 3, 462–473 (2019).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Notredame, C., Higgins, D. G. & Heringa, J. T-coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217 (2000).
Serres, M. H. & Riley, M. MultiFun, a multifunctional classification scheme for Escherichia coli K-12 gene products. Microb. Comp. Genomics 5, 205–222 (2000).
Zaslaver, A. et al. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat. Methods 3, 623–628 (2006).
Li, C. et al. FastCloning: a highly simplified, purification-free, sequence- and ligation-independent PCR cloning method. BMC Biotechnol. 11, 92 (2011).
Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343–345 (2009).
Karp, P. D. et al. The EcoCyc Database. EcoSal Plus https://doi.org/10.1128/ecosalplus.ESP-0006-2018 (2018).
Acknowledgements
We thank T. Cooper, A. Sajuthi and N. Freed for valuable comments on the final draft of this manuscript. We are also grateful to S. Pearless and B. Morampalli for sequencing several plasmid constructs. This work was supported by a Marsden Grant—Royal Society Te Apārangi MAU1703 awarded to O.K.S. The funder had no role in study design, data collection and interpretation or the decision to submit the work for publication.
Author information
Contributions
M.V. and O.K.S. conceived the project and designed the experiments and analyses. O.K.S. supervised the project. M.V. performed all experiments and all analyses. M.V. wrote the paper with contributions from O.K.S.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks Bianca Sclavi, Mo Siddiq and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Experimental design to assay the effects of segregating and random mutations on gene expression.
a) We isolated ten promoters (aldA, yhjX, mtr, aceB, lacZ, dctA, cdd, ptsG, purA and tpiA) originating from MG1655. b) We then PCR-amplified variants of these ten promoters segregating among environmental E. coli isolates from DNA pools. The average number of mutations across all segregating variants (as compared to MG1655) is 7.2 (ranging from 1 to 12.7 for individual promoters). c) We also performed random PCR mutagenesis on each of the ten MG1655 promoters with a target mutation rate of 1.5 mutations per promoter sequence. d) We cloned the resulting PCR amplicons (both segregating and random) into the pUA66 vector upstream of GFPmut2 (ref. 64, Zaslaver et al.). We Sanger-sequenced all promoter variants to confirm the presence and location of mutations. Of the mutagenized variants, only those containing 1 to 3 SNPs were used for further phenotypic assays. e) We then cultured each of these individual promoter variants (1000 in total) in three different environments in triplicate, and f) quantified the modal population expression and modal coefficient of variation using flow cytometry.
Extended Data Fig. 2 Comparison of intergenic regions (IGR) genetic variation measures with and without flanking open reading frames (ORFs).
a) and b) Correlation in sequence variation between IGRs and promoters (IGRs with 100 bp of flanking ORF regions). Panel a shows Watterson’s estimator θ normalized by alignment length, and panel b shows the average pairwise nucleotide diversity π.
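The two diversity measures named in this caption can be illustrated with a minimal Python sketch. It assumes an ungapped alignment of equal-length sequences and treats every character as an ordinary base; the function names and the toy alignment are hypothetical and are not taken from the authors' scripts.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns that contain more than one distinct character."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def watterson_theta_per_site(alignment):
    """Watterson's estimator S / a_n, normalized by the alignment length."""
    n = len(alignment)
    a_n = sum(1.0 / i for i in range(1, n))  # harmonic sum over n - 1 terms
    return segregating_sites(alignment) / a_n / len(alignment[0])


def nucleotide_diversity_per_site(alignment):
    """Average number of pairwise differences (pi), per aligned site."""
    pairs = list(combinations(alignment, 2))
    differences = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return differences / len(pairs) / len(alignment[0])


# Hypothetical toy alignment: three sequences, four sites, two segregating sites.
toy_alignment = ["AAAA", "AAAT", "AATT"]
theta = watterson_theta_per_site(toy_alignment)
pi = nucleotide_diversity_per_site(toy_alignment)
```

Real alignments contain gaps and ambiguous bases, which would need explicit handling that this sketch leaves out.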
Extended Data Fig. 3 Segregating genetic variation in promoters correlates with variation in expression levels.
a) and b) Standard deviations in modal expression levels from segregating variants are correlated with the genetic variation of the promoter (IGR with 100 bp flanking regions). Panel a shows the correlation with π and panel b shows the correlation with the number of segregating promoter variants cloned and used for the phenotypic assay. For each promoter, the standard deviation of modal population expression was measured in three environments (three points per promoter, Table 2). The rho and p-values were calculated using Spearman’s correlation test (two-sided).
Extended Data Fig. 4 Comparison of expression levels from promoters in pairs of environments.
All x and y axes represent the modal population expression level in the particular environment noted on the axis label. The blue dotted lines indicate equal expression levels in both environments, that is, no phenotypic plasticity. The further from the line a promoter is, the higher the absolute difference is in expression from the promoter between the two environments (that is, the higher its phenotypic plasticity).
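As a small illustration of the reading described in this caption, the sketch below computes the absolute difference in modal expression between two environments and the corresponding perpendicular distance from the line of equal expression. The function names and any values passed to them are hypothetical; the caption does not state the units or scale of the expression axes.

```python
import math


def plasticity(expr_env_a, expr_env_b):
    """Absolute difference in modal expression between two environments."""
    return abs(expr_env_a - expr_env_b)


def distance_from_identity_line(expr_env_a, expr_env_b):
    """Perpendicular distance of the point (a, b) from the line y = x."""
    return abs(expr_env_a - expr_env_b) / math.sqrt(2)
```

A promoter that lies on the dotted line gives zero for both quantities. The two measures differ only by the constant factor √2, so they rank promoters identically.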
Extended Data Fig. 5 Fits of smoothing splines to modal population expression and modal coefficient of variation (mCV).
A smoothing spline was fitted to all variants (segregating and mutagenized) in each environment. The term “noise” refers to the vertical deviation of each variant, that is, the deviation of its mCV from the fitted spline. The mCV is a measure analogous to the coefficient of variation. It was calculated as the standard deviation of log-transformed expression levels (stdev) divided by the modal population expression level (mode). We observed qualitatively similar patterns for most promoters and growth conditions, in which noise decreased monotonically as expression increased. However, there were exceptions to this pattern, most notably lacZ in glucose and galactose and dctA in L-malic acid; in both cases noise increased as expression increased. For lacZ this is likely because, in glucose (and galactose), for most promoters no cells express above background fluorescence, whereas for a small number of promoters a few cells do. This increases noise while having little effect on modal expression levels. For dctA, however, we speculate that there may in fact be selection for higher noise, resulting in the monotonically increasing relationship we observed and the widely divergent noise levels (Fig. 6).
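The mCV definition given in this caption can be sketched in Python as follows. Several details are assumptions made for illustration and are not stated in the caption: the logarithm is taken to base 10, the mode is estimated on the log scale as the midpoint of the fullest histogram bin, and the fitted spline value is supplied by the caller rather than computed here.

```python
import math
import statistics


def modal_value(values, n_bins=20):
    """Estimate the mode as the midpoint of the fullest histogram bin (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    fullest = counts.index(max(counts))
    return lo + (fullest + 0.5) * width


def modal_cv(fluorescence, n_bins=20):
    """Standard deviation of log-transformed expression divided by the modal expression."""
    log_values = [math.log10(f) for f in fluorescence]  # base 10 is an assumption
    return statistics.stdev(log_values) / modal_value(log_values, n_bins)


def noise(mcv, fitted_mcv):
    """Vertical deviation of a variant's mCV from the fitted trend value."""
    return mcv - fitted_mcv


# Hypothetical single-cell fluorescence values for one variant.
cells = [100, 1000, 1000, 1000, 10000]
variant_mcv = modal_cv(cells, n_bins=4)
```

Here `fitted_mcv` stands for the value of the smoothing spline at the variant's modal expression level. Fitting that spline would require a numerical library and is outside this sketch.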
Extended Data Fig. 6 Overall selection pressure.
We calculated a cumulative z-score for each promoter variant, indicating its deviation from the average promoter behaviour across all phenotypes (expression level, plasticity and noise), and compared these scores between segregating and random variants. The numbers above each pair of segregating and random variants are p-values from two-sided Wilcoxon rank-sum tests for differences between the two groups. Numbers in bold indicate significant p-values. The horizontal black lines indicate the median cumulative z-score of each group. The promoters are arranged in decreasing order of segregating genetic variation (θ). The MG1655 variant of the mtr promoter was omitted from the calculation due to a SNP in GFP (Supplementary Note). The inset shows the full scale of cumulative z-scores on the y-axis, to allow comparison of the high values in the aceB and purA promoters.
Extended Data Fig. 7 Cell gating strategy from flow cytometry data.
a) Raw data from the flow cytometer for one sample. Each point is an individual event recorded by the flow cytometer, the majority of which are expected to be cells. b) The point of highest kernel density of forward and side scatter values, displayed as a red cross. c) Removal of events that are too far from the highest kernel density point, which ensured compactness of the final gating step. d) Final gating step. The function ellipsoidGate from the flowCore R package was used to isolate the densest homogeneous population within the sample.
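The three gating steps in this caption can be sketched in plain Python. This is a stand-in for illustration only: the authors used the flowCore R package, whereas the code below substitutes a brute-force Gaussian kernel density estimate, a fixed-radius filter and a Mahalanobis-distance ellipse. The bandwidth, radius, cutoff and event coordinates are all hypothetical.

```python
import math


def densest_point(events, bandwidth):
    """Return the event with the highest Gaussian kernel density estimate."""
    def density(p):
        return sum(
            math.exp(-((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) / (2 * bandwidth ** 2))
            for q in events
        )
    return max(events, key=density)


def within_radius(events, centre, radius):
    """Drop events that lie further than `radius` from the density peak."""
    return [e for e in events if math.dist(e, centre) <= radius]


def ellipse_gate(events, n_sd=2.0):
    """Keep events whose Mahalanobis distance from the mean is at most n_sd."""
    n = len(events)
    mx = sum(x for x, _ in events) / n
    my = sum(y for _, y in events) / n
    sxx = sum((x - mx) ** 2 for x, _ in events) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in events) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in events) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2 x 2 covariance matrix
    kept = []
    for x, y in events:
        dx, dy = x - mx, y - my
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 <= n_sd ** 2:
            kept.append((x, y))
    return kept


# Hypothetical (forward scatter, side scatter) events: one cluster plus two stray events.
events = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (5.2, 5.1),
          (4.8, 4.9), (5.0, 5.2), (9.0, 1.0), (1.0, 9.0)]
peak = densest_point(events, bandwidth=0.5)
compact = within_radius(events, peak, radius=1.0)
gated = ellipse_gate(compact, n_sd=2.0)
```

The density estimate compares every pair of events, so it is only practical for small illustrative inputs. Real flow cytometry samples would need a gridded or library-based estimator.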
Supplementary information
Supplementary Note
Supplementary Tables
Supplementary Table 1: Genetic variability measures of all IGRs and ORFs present in at least 130 out of 153 environmental E. coli isolates.
Supplementary Table 2: Characteristics of promoters selected for phenotypic assays. The functional groups of downstream genes were obtained using MultiFun. *For the full description of the assay environments see Supplementary Table 3.
Supplementary Table 3: Promoter–environment combinations. All environments included M9 minimal salt media supplemented with MgSO4, CaCl2 and 50 µg ml−1 of kanamycin.
Supplementary Table 4: Primers and oligos used in this work. Bold sequences indicate regions homologous to the ends of the PCR-amplified pUA66 vector used in DNA assembly. For the lacZ and yhjX promoters, two versions of the reverse primer exist, differing by a single SNP in the downstream lacZ gene (C69A) and by two SNPs in the upstream yhjY gene (G615A and G624A), respectively.
About this article
Cite this article
Vlková, M., Silander, O.K. Gene regulation in Escherichia coli is commonly selected for both high plasticity and low noise. Nat Ecol Evol 6, 1165–1179 (2022). https://doi.org/10.1038/s41559-022-01783-2
DOI: https://doi.org/10.1038/s41559-022-01783-2