Transcription initiates at promoters, DNA regions recognized by a DNA-dependent RNA polymerase. We previously identified horizontally acquired Escherichia coli promoters from which the direction of transcription was unclear. In the present study, we show that more than half of these promoters are bidirectional and drive divergent transcription. Using genome-scale approaches, we demonstrate that 19% of all transcription start sites detected in E. coli are associated with a bidirectional promoter. Bidirectional promoters are similarly common in diverse bacteria and archaea, and have inherent symmetry: specific bases required for transcription initiation are reciprocally co-located on opposite DNA strands. Bidirectional promoters enable co-regulation of divergent genes and are enriched in both intergenic and horizontally acquired regions. Divergent transcription is conserved among bacteria, archaea and eukaryotes, but the underlying mechanisms for bidirectionality are different.
Mejía-Almonte, C. et al. Redefining fundamental concepts of transcription initiation in bacteria. Nat. Rev. Genet. https://doi.org/10.1038/s41576-020-0254-8 (2020).
Browning, D. F. & Busby, S. J. W. The regulation of bacterial transcription initiation. Nat. Rev. Microbiol. 2, 57–65 (2004).
Haberle, V. & Stark, A. Eukaryotic core promoters and the functional basis of transcription initiation. Nat. Rev. Mol. Cell Biol. 19, 621–637 (2018).
Bae, B., Feklistov, A., Lass-Napiorkowska, A., Landick, R. & Darst, S. A. Structure of a bacterial RNA polymerase holoenzyme open promoter complex. eLife 4, e08504 (2015).
Feklistov, A. & Darst, S. A. Structural basis for promoter −10 element recognition by the bacterial RNA polymerase σ subunit. Cell 147, 1257–1269 (2011).
Kramm, K., Engel, C. & Grohmann, D. Transcription initiation factor TBP: old friend new questions. Biochem. Soc. Trans. 47, 411–423 (2019).
Butler, J. E. F. The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev. 16, 2583–2592 (2002).
Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).
Seila, A. C. et al. Divergent transcription from active promoters. Science 322, 1849–1851 (2008).
Preker, P. et al. RNA exosome depletion reveals transcription upstream of active human promoters. Science 322, 1851–1854 (2008).
He, Y., Vogelstein, B., Velculescu, V. E., Papadopoulos, N. & Kinzler, K. W. The antisense transcriptomes of human cells. Science 322, 1855–1857 (2008).
Neil, H. et al. Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457, 1038–1042 (2009).
Scruggs, B. S. et al. Bidirectional transcription arises from two distinct hubs of transcription factor binding and active chromatin. Mol. Cell 58, 1101–1112 (2015).
Rege, M. et al. Chromatin dynamics and the RNA exosome function in concert to regulate transcriptional homeostasis. Cell Rep. 13, 1610–1622 (2015).
Wu, X. & Sharp, P. A. Divergent transcription: a driving force for new gene origination? Cell 155, 990–996 (2013).
Jin, Y., Eser, U., Struhl, K. & Churchman, L. S. The ground state and evolution of promoter region directionality. Cell 170, 889–898 (2017).
Dame, R. T., Rashid, F.-Z. M. & Grainger, D. C. Chromosome organization in bacteria: mechanistic insights into genome structure and function. Nat. Rev. Genet. 21, 227–242 (2019).
Browning, D. F. & Busby, S. J. W. Local and global regulation of transcription initiation in bacteria. Nat. Rev. Microbiol. 14, 638–650 (2016).
Singh, S. S. et al. Widespread suppression of intragenic transcription initiation by H-NS. Genes Dev. 28, 214–219 (2014).
Mitra, P., Ghosh, G., Hafeezunnisa, M. & Sen, R. Rho protein: roles and mechanisms. Annu. Rev. Microbiol. 71, 687–709 (2017).
Keseler, I. M. et al. The EcoCyc database: reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res. 45, D543–D550 (2017).
Ettwiller, L., Buswell, J., Yigit, E. & Schildkraut, I. A novel enrichment strategy reveals unprecedented number of novel transcription start sites at single base resolution in a model prokaryote and the gut microbiome. BMC Genom. 17, 199 (2016).
Thomason, M. K. et al. Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli. J. Bacteriol. 197, 18–28 (2015).
Singh, S. S., Typas, A., Hengge, R. & Grainger, D. C. Escherichia coli σ70 senses sequence and conformation of the promoter spacer region. Nucleic Acids Res. 39, 5109–5118 (2011).
Warman, E. A., Singh, S. S., Gubieda, A. G. & Grainger, D. C. A non-canonical promoter element drives spurious transcription of horizontally acquired bacterial genes. Nucleic Acids Res. 48, 4891–4901 (2020).
Santos-Zavaleta, A. et al. A unified resource for transcriptional regulation in Escherichia coli K-12 incorporating high-throughput-generated binding data into RegulonDB version 10.0. BMC Biol. 16, 91 (2018).
Mendoza-Vargas, A. et al. Genome-wide identification of transcription start sites, promoters and transcription factor binding sites in E. coli. PLoS ONE 4, e7526 (2009).
Gill, E. E. et al. High-throughput detection of RNA processing in bacteria. BMC Genom. 19, 223 (2018).
Papenfort, K., Förstner, K. U., Cong, J. P., Sharma, C. M. & Bassler, B. L. Differential RNA-seq of Vibrio cholerae identifies the VqmR small RNA as a regulator of biofilm formation. Proc. Natl Acad. Sci. USA 112, E766–E775 (2015).
Kröger, C. et al. The primary transcriptome, small RNAs and regulation of antimicrobial resistance in Acinetobacter baumannii ATCC 17978. Nucleic Acids Res. 46, 9684–9698 (2018).
Sharma, C. M. et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 464, 250–255 (2010).
Cortes, T. et al. Genome-wide mapping of transcriptional start sites defines an extensive leaderless transcriptome in Mycobacterium tuberculosis. Cell Rep. 5, 1121–1131 (2013).
Jeong, Y. et al. The dynamic transcriptional and translational landscape of the model antibiotic producer Streptomyces coelicolor A3(2). Nat. Commun. 7, 11605 (2016).
Fan, B. et al. dRNA-seq reveals genomewide TSSs and noncoding RNAs of plant beneficial rhizobacterium Bacillus amyloliquefaciens FZB42. PLoS ONE 10, e0142002 (2015).
Decker, K. B. & Hinton, D. M. Transcription regulation at the core: similarities among bacterial, archaeal, and eukaryotic RNA polymerases. Annu. Rev. Microbiol. 67, 113–139 (2013).
Grünberger, F. et al. Next generation DNA-seq and differential RNA-seq allow re-annotation of the Pyrococcus furiosus DSM 3638 genome and provide insights into archaeal antisense transcription. Front. Microbiol. 10, 1603 (2019).
Babski, J. et al. Genome-wide identification of transcriptional start sites in the haloarchaeon Haloferax volcanii based on differential RNA-Seq (dRNA-Seq). BMC Genom. 17, 629 (2016).
Jäger, D., Förstner, K. U., Sharma, C. M., Santangelo, T. J. & Reeve, J. N. Primary transcriptome map of the hyperthermophilic archaeon Thermococcus kodakarensis. BMC Genom. 15, 684 (2014).
Lamberte, L. E. et al. Horizontally acquired AT-rich genes in Escherichia coli cause toxicity by sequestering RNA polymerase. Nat. Microbiol. 2, 16249 (2017).
Chen, J. et al. Stepwise promoter melting by bacterial RNA polymerase. Mol. Cell 78, 275–288.e6 (2020).
Miller, J. Experiments in Molecular Genetics (Cold Spring Harbor Laboratory, 1972).
Haycocks, J. R. J. & Grainger, D. C. Unusually situated binding sites for bacterial transcription factors can have hidden functionality. PLoS ONE 11, e0157016 (2016).
Dugar, G. et al. High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. PLoS Genet. 9, e1003495 (2013).
Singh, N. & Wade, J. T. Identification of regulatory RNA in bacterial genomes by genome-scale mapping of transcription start sites. Methods Mol. Biol. 1103, 1–10 (2014).
Crooks, G. E., Hon, G., Chandonia, J.-M. & Brenner, S. E. WebLogo: a sequence logo generator. Genome Res. 14, 1188–1190, https://doi.org/10.1101/gr.849004 (2004).
Haycocks, J. R. J. et al. The quorum sensing transcription factor AphA directly regulates natural competence in Vibrio cholerae. PLoS Genet. 15, e1008362 (2019).
Kolb, A., Kotlarz, D., Kusano, S. & Ishihama, A. Selectivity of the Escherichia coli RNA polymerase Eσ38 for overlapping promoters and ability to support CRP activation. Nucleic Acids Res. 23, 819–826 (1995).
Savery, N. J. et al. Transcription activation at class II CRP-dependent promoters: identification of determinants in the C-terminal domain of the RNA polymerase α subunit. EMBO J. 17, 3439–3447 (1998).
Stead, M. B. et al. RNAsnap™: a rapid, quantitative and inexpensive, method for isolating total RNA from bacteria. Nucleic Acids Res. 40, e156 (2012).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Afgan, E. et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 46, W537–W544 (2018).
Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
Forrest, D., James, K., Yuzenkova, Y. & Zenkin, N. Single-peptide DNA-dependent RNA polymerase homologous to multi-subunit RNA polymerase. Nat. Commun. 8, 15774 (2017).
Podell, S., Gaasterland, T. & Allen, E. E. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm. BMC Bioinform. 9, 419 (2008).
Kahramanoglou, C. et al. Direct and indirect effects of H-NS and Fis on global gene expression control in Escherichia coli. Nucleic Acids Res. 39, 2073–2091 (2011).
Narayanan, A. et al. Cryo-EM structure of Escherichia coli σ70 RNA polymerase and promoter DNA complex revealed a role of σ non-conserved region during the open complex formation. J. Biol. Chem. 293, 7367–7375 (2018).
This work was funded by a Leverhulme Trust project grant (no. RPG-2018-198) and a Wellcome Trust Investigator award (no. 212193/Z/18/Z) to D.C.G.
The authors declare no competing interests.
Peer review information: Nature Microbiology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Binding patterns for H-NS (peach) and σ70 (purple or green) are derived from ChIP-seq assays (ref. 19). The RNA 5′ ends associated with σ70 binding (purple or green) were identified by PPP-seq (ref. 19). Genes are shown as red block arrows. Each ChIP-seq dataset is plotted on a scale of 0 to 360 reads on each strand. The TSS data show a read depth of between 0 and 100 on each strand. The AT-content fluctuates between 30% and 75%.
a, β-galactosidase activity derived from canonical promoters that are listed in EcoCyc (ref. 21) and have no transcription start site in the reverse orientation detectable in any of three separate studies (refs. 19, 22, 23). Data are presented as mean values (n = 3 independent experiments) ± SD, and individual data points are overlaid as dot plots. b, Direction of transcription from cloned DNA fragments. c, Average forward or reverse β-galactosidase activity of all DNA fragments.
Extended Data Fig. 3 Sequences of cryptic RNA polymerase binding sites associated with divergent transcription.
The figure shows promoter DNA sequences (black typeface) and part of the plasmid DNA backbone (grey typeface). The promoter −10 (red) and −35 (green) elements are highlighted on each DNA strand and transcription start sites are denoted by a bent arrow. Sites of mutations and deletions (Δ) are boxed. The sequences are in the ‘a’ orientation as indicated in Fig. 1. When in the ‘b’ orientation the DNA sequence encompassed by black typeface is the reverse complement. Oligonucleotide D49724, used in primer extension analysis, is indicated by a half arrow and binds to the corresponding sequence in grey bold typeface.
a, The graph shows the number of divergent TSSs separated by different distances. The majority of bottom strand transcription start sites occur 18 bp upstream of top strand RNA initiation sites. However, peaks in the occurrence of divergent TSSs also occur elsewhere. These positions are denoted by green data points. The red data point indicates a sharp decrease in the occurrence of divergent TSSs. The symmetry score increases at spacing intervals where the promoter PWM identifies matching sequences overlapping on each DNA strand. b, DNA sequence motifs associated with each preferred spacing between divergent TSSs. Motifs were generated by aligning sequences according to the top strand TSS. The configuration of promoter −10 elements and TSSs, indicated by each motif, is shown below the respective sequence logo. Key positions within the −10 elements, and TSSs, are underlined. c, Products of in vitro transcription using the illustrated DNA templates. The RNAI transcript is derived from the replication origin of the plasmid DNA template. A representative example of two independent experiments is shown. d, Products of in vitro transcription from the intragenic yigG promoter. Note that products produced in vitro match those produced in vivo (Fig. 1d). A representative example of two independent experiments is shown.
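As a minimal sketch of how the spacing histogram in panel a could be tabulated, the Python function below counts divergent TSS pairs by separation distance. It assumes linear genome coordinates in which the top strand is transcribed in the direction of increasing coordinate, so a bottom-strand TSS "upstream" of a top-strand TSS sits at a lower coordinate. The function name, the `max_dist` cut-off and the toy coordinates are illustrative assumptions, not the authors' pipeline, and wrap-around on a circular chromosome is ignored.

```python
from bisect import bisect_left
from collections import Counter


def divergent_spacing_counts(top_tss, bottom_tss, max_dist=100):
    """Count divergent TSS pairs by separation distance (in bp).

    top_tss:    coordinates of TSSs on the top strand
    bottom_tss: coordinates of TSSs on the bottom strand
    A pair is treated as divergent when the bottom-strand TSS lies
    1..max_dist bp upstream (lower coordinate) of the top-strand TSS.
    """
    bottom_sorted = sorted(set(bottom_tss))
    counts = Counter()
    for top in set(top_tss):
        # Slice of bottom-strand TSSs in the window [top - max_dist, top)
        lo = bisect_left(bottom_sorted, top - max_dist)
        hi = bisect_left(bottom_sorted, top)
        for bottom in bottom_sorted[lo:hi]:
            counts[top - bottom] += 1
    return counts


# Toy coordinates (hypothetical, for illustration only)
top = [1000, 2000, 3000, 4000]
bottom = [982, 1982, 2990, 5000]
counts = divergent_spacing_counts(top, bottom)
print(sorted(counts.items()))  # [(10, 1), (18, 2)]
```

In this toy input two pairs are separated by 18 bp and one by 10 bp; the bottom-strand site at 5000 is downstream of every top-strand site and is therefore not counted.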
We mapped transcription start sites globally in Bacillus subtilis using cappable-seq. The general properties of B. subtilis promoters were compared with those identified in E. coli. a, Distances between promoter −10 elements and transcription start sites (TSSs) in Escherichia coli and B. subtilis. b,c, DNA sequence motifs associated with unidirectional TSSs in E. coli and B. subtilis. d,e, Positioning of TSSs with respect to coding DNA sequences in E. coli and B. subtilis.
The figure shows the number of divergent transcription start sites separated by different distances (black line in graph). The data point in each graph, corresponding to the preferred configuration of divergent start sites, is green. The pale blue data indicate predicted promoter overlap (that is, symmetry) derived from a position weight matrix (PWM) search of each DNA strand. The R² values indicate the degree of correlation between the computational prediction and the experimental data shown. The DNA motifs adjacent to each graph were generated by aligning promoter −10 hexamer sequences. For V. cholerae, P. aeruginosa, A. baumannii, M. tuberculosis and S. coelicolor, we aligned −10 elements from start sites separated by 17, 18 or 19 bp. In the case of H. pylori, we aligned −10 elements from start sites 15, 16 or 17 bp apart. Note that all of these distances typically involve the same configuration of −10 elements because the distance between the −10 hexamer and transcription start site is variable (see Supplementary information Fig. 4). For B. subtilis and B. amyloliquefaciens, we aligned start sites separated by 10, 11 or 12 bp.
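The PWM search of each DNA strand mentioned above can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the five aligned hexamers used to build the matrix, the log-odds scoring against a uniform background, the pseudocount and the toy sequence are hypothetical and are not the PWM or scoring scheme used in this study. The sketch only shows the general idea of scoring the top strand and its reverse complement with the same matrix.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def build_pwm(sites, pseudocount=1.0):
    """Log-odds PWM (uniform background) from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
                    for base in "ACGT"})
    return pwm


def scan(seq, pwm):
    """PWM score for every window start along seq."""
    width = len(pwm)
    return [sum(pwm[j][seq[i + j]] for j in range(width))
            for i in range(len(seq) - width + 1)]


def best_hit(seq, pwm):
    """Return (start, score) of the highest-scoring window."""
    scores = scan(seq, pwm)
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Hypothetical aligned -10 hexamers (toy training set, not real data)
toy_sites = ["TATAAT", "TATAAT", "TACAAT", "TATACT", "GATAAT"]
pwm = build_pwm(toy_sites)

# Toy sequence carrying TATAAT on the top strand and, as ATTATA,
# the same hexamer on the bottom strand
seq = "GCGCATTATAGCGCTATAATGCGC"
top_start, top_score = best_hit(seq, pwm)
bottom_start, bottom_score = best_hit(reverse_complement(seq), pwm)
print(top_start, bottom_start)
```

Because the toy sequence equals its own reverse complement, the best match is found at the same offset on both strands, which is the kind of sequence symmetry the pale blue data are intended to capture.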
Extended Data Fig. 7 Bidirectional promoters in the archaea Thermococcus kodakarensis and Haloferax volcanii have a shared TATA box.
a,b, The figure shows the number of divergent transcription start sites separated by different distances (black line in graph). The data points in each graph, corresponding to the preferred configurations of divergent start sites, are green. The pale blue data indicate promoter sequence symmetry. The R² values indicate the degree of correlation between the computational prediction and the experimental data shown. b, DNA sequences associated with divergent TSSs in Haloferax volcanii separated by 62, 53 or 34 bp were aligned according to the position of the TSS on the top DNA strand. The inferred configuration of key elements is shown above each motif.
a, Detection of directional and bidirectional promoters by PPP-seq (ref. 19). The pie charts show the fractions of each promoter type detected in H-NS bound or H-NS free regions of the E. coli genome in the presence and absence of H-NS. b, Distribution of position weight matrix (PWM) scores for bidirectional promoters in different sections of the E. coli genome. Higher scores indicate a better match to the PWM describing bidirectional promoters. The bounds of the box represent the first and third quartiles and the centre line is the median. Whiskers extend to 1.5 times the interquartile range.
Extended Data Fig. 9 The ratio of directional to bidirectional promoters is similar in different bacteria.
We used multiple Escherichia coli TSS maps to identify bidirectional promoters (corresponding to divergent TSSs separated by between 7 and 25 bp). We noticed that the proportion of TSSs from bidirectional promoters was much smaller in datasets with fewer total TSSs. This is expected: the chance of detecting both transcripts from a given bidirectional promoter is much smaller for less complete TSS maps. For instance, 19% of TSSs in the combined E. coli TSS map (28,107 TSSs in total) were derived from a bidirectional promoter. In contrast, this value was only 5% for TSSs identified by PPP-seq (ref. 19; 4,846 TSSs in total). The number of total and divergent TSSs, for each E. coli TSS map, is plotted in orange; the relationship is not linear. For comparison, we generated a probability model using a mock TSS map for E. coli. The artificial map consisted of 28,107 randomly selected E. coli genome co-ordinates as TSSs. Of these, 19% of positions on the bottom strand were set to be between 7 and 25 bp upstream of a top strand co-ordinate (that is, the mock data exactly emulated the combined TSS composition of the genuine experimental data for E. coli). We then randomly selected sub-populations of genome co-ordinates from the mock TSS map and determined how many pairs of top and bottom strand positions remained separated by between 7 and 25 bp. These data are plotted in pale blue. Consistent with our logic, the relationship was not linear and resembled the real experimental data in orange. We also determined the number of divergent TSS pairs amongst those TSSs detected by both dRNA-seq and cappable-seq (5,593 TSSs in total). This data point also fell precisely on the trend line generated by the individual and combined data sets. Hence, excluding TSSs not identified by multiple methods does not alter the frequency at which divergent TSS pairs are detected.
Finally, we plotted experimentally determined TSS maps for different bacteria (all TSS numbers were normalised for genome size). Crucially, these organisms have been subject to much less scrutiny than E. coli. Hence, the total number of TSSs identified for each bacterium is comparatively small. Even so, it is clear that all data points fall close to the orange and pale blue trend lines. Hence, the fraction of promoters that are bidirectional must be broadly similar in different bacterial species.
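The mock-map subsampling described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, the genome length of 4,641,652 bp (E. coli K-12 MG1655) is assumed rather than taken from the text, the 19% is interpreted as the fraction of all mock TSSs that sit in a divergent pair, and unpaired mock TSSs are rejected if they would create a pair by chance so that the full map contains that fraction by construction.

```python
import random


def _creates_pair(tss, pos, strand, min_gap, max_gap):
    """True if adding (pos, strand) would form a divergent pair with an existing TSS."""
    gaps = range(min_gap, max_gap + 1)
    if strand == "+":
        return any((pos - g, "-") in tss for g in gaps)
    return any((pos + g, "+") in tss for g in gaps)


def make_mock_map(n_tss, paired_fraction, genome_len, rng, min_gap=7, max_gap=25):
    """Mock TSS map as a set of (coordinate, strand) tuples."""
    n_pairs = round(n_tss * paired_fraction / 2)
    tss = set()
    # Designed divergent pairs: bottom-strand TSS 7..25 bp upstream of a top-strand TSS
    while len(tss) < 2 * n_pairs:
        top = rng.randrange(max_gap, genome_len)
        bottom = top - rng.randint(min_gap, max_gap)
        if (top, "+") not in tss and (bottom, "-") not in tss:
            tss.add((top, "+"))
            tss.add((bottom, "-"))
    # Remaining TSSs: random positions that do not create additional pairs
    while len(tss) < n_tss:
        pos, strand = rng.randrange(genome_len), rng.choice("+-")
        if not _creates_pair(tss, pos, strand, min_gap, max_gap):
            tss.add((pos, strand))
    return tss


def fraction_divergent(tss, min_gap=7, max_gap=25):
    """Fraction of TSSs that belong to at least one divergent pair."""
    in_pair = set()
    for pos, strand in tss:
        if strand != "+":
            continue
        for gap in range(min_gap, max_gap + 1):
            partner = (pos - gap, "-")
            if partner in tss:
                in_pair.add((pos, strand))
                in_pair.add(partner)
    return len(in_pair) / len(tss) if tss else 0.0


def subsample_curve(tss, fractions, rng):
    """(subset size, divergent fraction) for random subsets of the mock map."""
    ordered = sorted(tss)  # fixed order so a seeded rng gives reproducible subsets
    curve = []
    for f in fractions:
        subset = set(rng.sample(ordered, round(f * len(ordered))))
        curve.append((len(subset), fraction_divergent(subset)))
    return curve


rng = random.Random(0)
mock = make_mock_map(n_tss=28_107, paired_fraction=0.19,
                     genome_len=4_641_652, rng=rng)
curve = subsample_curve(mock, [0.1, 0.25, 0.5, 1.0], rng)
for n, frac in curve:
    print(n, round(frac, 3))
```

In this sketch each TSS is retained with probability of roughly f when a fraction f of the map is sampled, so both members of a pair are retained with probability of roughly f². The number of detected divergent TSSs therefore falls faster than the total number of TSSs, which is the non-linear relationship described in the text.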
Supplementary Table 1. Location and sequence of bidirectional promoters in E. coli.
Supplementary Table 2. Enrichment of TSSs in RegulonDB among TSSs in the combined RNA-seq dataset.
Supplementary Table 3. Positions of TSSs across the B. subtilis genome (NC_000964.3).
Supplementary Table 4. Strains, plasmids and oligonucleotides.
Numerical source data for Fig. 1a,f.
Unprocessed gel image for Fig. 1d.
Unprocessed gel image for Fig. 3c.
Unprocessed gel images for Fig. 5b,c,f.
Numerical source data for Fig. 5d.
Numerical source data for Extended Data Fig. 2a.
Unprocessed gel images for Extended Data Fig. 4c,d.
Numerical source data.
Numerical source data.
Numerical source data for Extended Data Fig. 8b,c.
Numerical source data.
Numerical source data for Extended Data Fig. 4a.
Numerical source data for Extended Data Fig. 5a.
Cite this article
Warman, E.A., Forrest, D., Guest, T. et al. Widespread divergent transcription from bacterial and archaeal promoters is a consequence of DNA-sequence symmetry. Nat Microbiol 6, 746–756 (2021). https://doi.org/10.1038/s41564-021-00898-9