Abstract
The human genome is pervasively transcribed, yet only a small fraction is coding. Here we address whether this non-coding transcription arises at promoters, and detail the interactions of initiation factors TATA box binding protein (TBP), transcription factor IIB (TFIIB) and RNA polymerase (Pol) II. Using ChIP-exo (chromatin immunoprecipitation with lambda exonuclease digestion followed by high-throughput sequencing), we identify approximately 160,000 transcription initiation complexes across the human K562 genome, and more in other cancer genomes. Only about 5% associate with messenger RNA genes. The remainder associates with non-polyadenylated non-coding transcription. Regardless, Pol II moves into a transcriptionally paused state, and TBP and TFIIB remain at the promoter. Remarkably, the vast majority of locations contain, in constrained positions, the four core promoter elements: upstream TFIIB recognition element (BREu), TATA, downstream TFIIB recognition element (BREd), and initiator element (INR). All but the INR also reside at Pol III promoters, where TBP makes similar contacts. This comprehensive and high-resolution genome-wide detection of the initiation machinery produces a consolidated view of transcription initiation events from yeast to humans at Pol II/III TATA-containing/TATA-less coding and non-coding genes.
Change history
02 October 2013
Minor changes were made to the core promoter consensus sequences.
References
Buratowski, S., Hahn, S., Guarente, L. & Sharp, P. A. Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549–561 (1989)
Lagrange, T., Kapanidis, A. N., Tang, H., Reinberg, D. & Ebright, R. H. New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB. Genes Dev. 12, 34–44 (1998)
Deng, W. & Roberts, S. G. A core promoter element downstream of the TATA box that is recognized by TFIIB. Genes Dev. 19, 2418–2423 (2005)
Juven-Gershon, T. & Kadonaga, J. T. Regulation of gene expression via the core promoter and the basal transcriptional machinery. Dev. Biol. 339, 225–229 (2010)
Kostrewa, D. et al. RNA polymerase II–TFIIB structure and mechanism of transcription initiation. Nature 462, 323–330 (2009)
He, Y., Fang, J., Taatjes, D. J. & Nogales, E. Structural visualization of key steps in human transcription initiation. Nature 495, 481–486 (2013)
Ptashne, M. & Gann, A. Transcriptional activation by recruitment. Nature 386, 569–577 (1997)
Vannini, A. & Cramer, P. Conservation between the RNA polymerase I, II, and III transcription initiation machineries. Mol. Cell 45, 439–446 (2012)
Kim, T. H. et al. A high-resolution map of active promoters in the human genome. Nature 436, 876–880 (2005)
Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genet. 38, 626–635 (2006)
Gilmour, D. S. & Lis, J. T. RNA polymerase II interacts with the promoter region of the noninduced hsp70 gene in Drosophila melanogaster cells. Mol. Cell. Biol. 6, 3984–3989 (1986)
Guenther, M. G., Levine, S. S., Boyer, L. A., Jaenisch, R. & Young, R. A. A chromatin landmark and transcription initiation at most promoters in human cells. Cell 130, 77–88 (2007)
Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013)
Kapranov, P., Willingham, A. T. & Gingeras, T. R. Genome-wide transcription and the implications for genomic organization. Nature Rev. Genet. 8, 413–423 (2007)
Rhee, H. S. & Pugh, B. F. Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution. Cell 147, 1408–1419 (2011)
Rhee, H. S. & Pugh, B. F. Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature 483, 295–301 (2012)
Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011)
Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61–D65 (2007)
He, Y., Vogelstein, B., Velculescu, V. E., Papadopoulos, N. & Kinzler, K. W. The antisense transcriptomes of human cells. Science 322, 1855–1857 (2008)
Seila, A. C. et al. Divergent transcription from active promoters. Science 322, 1849–1851 (2008)
Core, L. J. & Lis, J. T. Transcription regulation through promoter-proximal pausing of RNA polymerase II. Science 319, 1791–1792 (2008)
Fenouil, R. et al. CpG islands and GC content dictate nucleosome depletion in a transcription-independent manner at mammalian promoters. Genome Res. 22, 2399–2408 (2012)
Rozenberg, J. M. et al. All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues. BMC Genomics 9, 67 (2008)
Sainsbury, S., Niesser, J. & Cramer, P. Structure and function of the initially transcribing RNA polymerase II–TFIIB complex. Nature 493, 437–440 (2013)
Basehoar, A. D., Zanton, S. J. & Pugh, B. F. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116, 699–709 (2004)
Singer, V. L., Wobbe, C. R. & Struhl, K. A wide variety of DNA sequences can functionally replace a yeast TATA element for transcriptional activation. Genes Dev. 4, 636–645 (1990)
Smale, S. T. & Baltimore, D. The “initiator” as a transcription control element. Cell 57, 103–113 (1989)
Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010)
Hamada, M., Huang, Y., Lowe, T. M. & Maraia, R. J. Widespread use of TATA elements in the core promoters for RNA polymerases III, II, and I in fission yeast. Mol. Cell. Biol. 21, 6870–6881 (2001)
Geiduschek, E. P. & Tocchini-Valentini, G. P. Transcription by RNA polymerase III. Annu. Rev. Biochem. 57, 873–914 (1988)
White, R. J. & Jackson, S. P. Mechanism of TATA-binding protein recruitment to a TATA-less class III promoter. Cell 71, 1041–1053 (1992)
Carrière, L. et al. Genomic binding of Pol III transcription machinery and relationship with TFIIS transcription factor distribution in mouse embryonic stem cells. Nucleic Acids Res. 40, 270–283 (2012)
Verrijzer, C. P., Chen, J. L., Yokomori, K. & Tjian, R. Binding of TAFs to core elements directs promoter selectivity by RNA polymerase II. Cell 81, 1115–1125 (1995)
Kapranov, P. & St Laurent, G. Dark matter RNA: existence, function, and controversy. Front. Genet. 3, 60 (2012)
Eichler, E. E. et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nature Rev. Genet. 11, 446–450 (2010)
Rhee, H. S. & Pugh, B. F. ChIP-exo method for identifying genomic location of DNA-binding proteins with near-single-nucleotide accuracy. Curr. Protoc. Mol. Biol. Chapter 21, Unit 21.24 (2012)
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009)
Berger, M. F. et al. Integrative analysis of the melanoma transcriptome. Genome Res. 20, 413–427 (2010)
Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)
Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012)
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
Albert, I., Wachi, S., Jiang, C. & Pugh, B. F. GeneTrack–a genomic data processing and visualization framework. Bioinformatics 24, 1305–1306 (2008)
Acknowledgements
We thank R. Reja, S. Mahony, P. Albert and Y. Li for bioinformatic assistance, and M. Cousar and K.-Y. Chan-Salis for experimental support. This work was supported by National Institutes of Health grant GM059055.
Author information
Contributions
B.J.V. performed the experiments and conducted data analyses. B.J.V. and B.F.P. conceived the experiments, analyses and co-wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Extended data figures and tables
Extended Data Figure 1 Validation of ChIP-exo data and association with ENCODE annotated regions.
a, Pie chart of all 159,117 TFIIB-bound locations in K562 cells parsed into ENCODE-annotated regions. b, Venn overlap among mRNA genes having TBP or TFIIB locations (<500 bp from their TSS) and genes with measured polyadenylated mRNA levels detected by RNA-seq (ref. 38). Data thresholding may contribute to non-overlapping sets. c, Moving average (100-gene) of mRNA levels versus TFIIB/TBP/Pol II occupancy levels on a median-centred log2 scale.
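The smoothing described in panel c can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, genes are assumed to be sorted by mRNA level before smoothing, and all input values are assumed to be strictly positive (the legend does not say how zeros were handled).

```python
import math
from statistics import median


def median_centred_log2(values):
    """Log2-transform positive values, then subtract the median of the logs."""
    logs = [math.log2(v) for v in values]
    centre = median(logs)
    return [x - centre for x in logs]


def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


def occupancy_vs_expression(mrna, occupancy, window=100):
    """Sort genes by mRNA level (an assumption), then smooth both series."""
    pairs = sorted(zip(mrna, occupancy))
    m = median_centred_log2([p[0] for p in pairs])
    o = median_centred_log2([p[1] for p in pairs])
    return moving_average(m, window), moving_average(o, window)
```

With `window=100` this corresponds to the 100-gene moving average named in the legend; plotting the two returned lists against each other would give a curve of the kind described.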
Extended Data Figure 2 Distribution of TFIIB/TBP/Pol II in CpG islands that overlap mRNA TSSs.
a, Peak-pair distribution for TFIIB, TBP and Pol II at the 5,095 CpG islands that overlap with the mRNA TSSs from Fig. 1b (78% overlap), and with the direction of transcription to the right. Rows are linked, and sorted by CpG island length. CpG island borders are indicated by blue and red bars, respectively. b, Averaged data from a. c, All 159,117 TFIIB locations were sorted by location, and inter-TFIIB distances calculated (red trace). Data were then sorted by distance, and the standard deviation of adjacent TFIIB occupancy ratios was calculated on a sliding window of 30 values. Peak calling parameters preclude detection of two separate TFIIB locations less than approximately 40 bp apart. Those that were 40–70 bp apart were correlated, whereas those >70 bp apart were less correlated or uncorrelated.
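The calculation in panel c can be sketched roughly as below. The data layout (tuples of chromosome, position, occupancy), the function names and the use of a log2 ratio are all assumptions made for illustration; the legend states only that a standard deviation of adjacent occupancy ratios was taken over a sliding window of 30 values.

```python
import math
from statistics import stdev


def adjacent_pairs(sites):
    """For neighbouring sites on the same chromosome, return (distance, log2 ratio).

    `sites` is a list of (chrom, position, occupancy) tuples with occupancy > 0.
    """
    ordered = sorted(sites, key=lambda s: (s[0], s[1]))
    pairs = []
    for (c1, p1, o1), (c2, p2, o2) in zip(ordered, ordered[1:]):
        if c1 == c2:
            pairs.append((p2 - p1, math.log2(o2 / o1)))
    return pairs


def sliding_sd_by_distance(pairs, window=30):
    """Sort pairs by distance, then take the SD of ratios in each sliding window.

    Each result is (distance at the middle index of the window, SD of the window).
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    ordered = sorted(pairs, key=lambda p: p[0])
    dists = [d for d, _ in ordered]
    ratios = [r for _, r in ordered]
    result = []
    for i in range(len(ordered) - window + 1):
        result.append((dists[i + window // 2], stdev(ratios[i:i + window])))
    return result
```

A low standard deviation at short distances would indicate that neighbouring sites have similar occupancy, which is one way to read the "correlated" versus "uncorrelated" distinction drawn in the legend.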
Extended Data Figure 3 Properties of CPEs associated with RefSeq genes.
a, Average TFIIB and TBP occupancy parsed by the number of mismatches to the TATA consensus. b, Distribution of each candidate CPE relative to each other.
Extended Data Figure 4 CPEs at non-coding loci bound by TFIIB.
a, Bar graph showing the percentage of all 150,754 putative ‘non-coding’ TFIIB binding locations (>500 bp from an annotated RefSeq TSS) that have the indicated number of CPEs. b, Distribution of ChIP-exo peaks on each strand relative to the indicated CPE, for 150,754 putative non-coding TFIIB locations. Opposite strand traces (red) are inverted. c, Distribution of TBP (purple) and Pol II (black) peak-pair midpoints relative to the TATA motif midpoint derived from the 150,754 TFIIB putative non-coding locations. d, TFIIB occupancy versus percentage of locations that code for proteins. All 159,117 TFIIB locations were sorted by occupancy level, and the percentage of locations linked to an annotated RefSeq feature was plotted as a moving average.
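The 500 bp rule used to call a location putatively non-coding, and the arithmetic linking the two location counts in these legends to the "about 5%" figure in the abstract, can be sketched as follows. The function names are hypothetical, the example works on a single chromosome for simplicity, and treating a distance of exactly 500 bp as coding is an assumption (the legends give only "<500 bp" and ">500 bp").

```python
from bisect import bisect_left


def nearest_tss_distance(position, sorted_tss):
    """Distance in bp from `position` to the closest TSS in a sorted list."""
    i = bisect_left(sorted_tss, position)
    candidates = []
    if i < len(sorted_tss):
        candidates.append(sorted_tss[i] - position)
    if i > 0:
        candidates.append(position - sorted_tss[i - 1])
    return min(candidates) if candidates else None


def is_putative_noncoding(position, sorted_tss, cutoff=500):
    """True when no annotated TSS lies within `cutoff` bp of the location."""
    d = nearest_tss_distance(position, sorted_tss)
    return d is None or d > cutoff


# Counts taken from the legends: all TFIIB locations and the putative non-coding subset.
TOTAL_LOCATIONS = 159_117
NONCODING_LOCATIONS = 150_754

coding_locations = TOTAL_LOCATIONS - NONCODING_LOCATIONS
coding_percent = 100 * coding_locations / TOTAL_LOCATIONS
print(coding_locations, round(coding_percent, 1))
```

The difference between the two counts is 8,363 locations, or roughly 5.3% of the total, consistent with the abstract's statement that only about 5% of complexes associate with messenger RNA genes.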
Extended Data Figure 5 Enrichment of different RNA fractions at 159,117 TFIIB locations throughout the human genome.
Frequency distribution of RNA 5′ ends for poly(A)+ RNA (ref. 38) (top) and ENCODE project RNA fractions (ref. 40) as indicated to the far left. Traces in the left panels are separated by sense (blue) and antisense (red, inverted) orientations relative to the corresponding mRNA TSS, which is directed to the right. Because the TSS orientation is not known for the poly(A)− ncRNA loci, positive and negative strand tags were plotted relative to the TFIIB midpoint. The percentage of putative TFIIB locations that exist within 2 kb of an RNA tag is indicated in the top right corner of each plot.
Extended Data Figure 6 TFIIB core promoter distances.
Candidate CPEs at varying distances from all 159,117 TFIIB locations, for the indicated spacing variants (not all possible combinations were tested). Digits within the spacing variant schematic reflect the base-pair spacing (N) between elements. CPEs with high P values (weaker matches to the PSPM matrix) are drawn with thin lines, whereas those with low P values (stronger matches, <3 × 10^−4) are drawn with thick lines.
Extended Data Figure 7 Promoter complexes across cancer cell lines.
a, b, Occupancy levels for TFIIB linked to coding genes (a) and non-coding regions (b) in the indicated cell type were normalized by column. The colour scales represent the range of average-centred, log2-transformed values within each respective column. Detection in all four cell types defines group 1. Groups 2–4 were parsed by k-means clustering. Rows were sorted within groups based on TFIIB occupancy averaged across the four cell types (yellow, black, cyan and grey denote high, medium, low and zero occupancy, respectively). For clarity in b, TFIIB locations that were detected in only one cell line were excluded from clustering. Columns were hierarchically clustered. The MCF7 data set had 20–30% of the coverage of other cell lines (reported in Supplementary Data 3), which probably accounts for an excessive number of zero-occupancy loci (grey).
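Two of the steps named in this legend, the per-column average-centred log2 transform and the selection of group 1, can be sketched as below. The function names are hypothetical, and leaving zero-occupancy entries as missing values (rather than adding a pseudocount) is an assumption; the k-means and hierarchical clustering steps are not reproduced here.

```python
import math


def centre_column_log2(column):
    """Log2-transform nonzero values and subtract their mean; zeros become None."""
    logs = [math.log2(v) if v > 0 else None for v in column]
    present = [x for x in logs if x is not None]
    if not present:
        return logs
    mean = sum(present) / len(present)
    return [None if x is None else x - mean for x in logs]


def detected_in_all(rows):
    """Indices of rows with nonzero occupancy in every column (group 1)."""
    return [i for i, row in enumerate(rows) if all(v > 0 for v in row)]
```

Here each row is one TFIIB location and each column one cell line, so `detected_in_all` returns the locations found in all four cell types when given four-column rows.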
Supplementary information
Supplementary Data 1
This file contains Supplementary Data 1a. (XLSX 23538 kb)
Supplementary Data 2
This file contains Supplementary Data 1b. (XLSX 22234 kb)
Supplementary Data 3
This file contains Supplementary Data 2. (XLS 26 kb)
Supplementary Data 4
This file contains Supplementary Data 3. (XLS 29157 kb)
Supplementary Data 5
This file contains Supplementary Data 4. (XLSX 532 kb)
Cite this article
Venters, B., Pugh, B. Genomic organization of human transcription initiation complexes. Nature 502, 53–58 (2013). https://doi.org/10.1038/nature12535
DOI: https://doi.org/10.1038/nature12535