Mammalian promoters can be separated into two classes: conserved TATA box–enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3′ UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
We thank the following individuals for discussion, encouragement and technical assistance: H. Atsui, A. Hasegawa, K. Hayashida, H. Himei, F. Hori, C. Kawazu, M. Kojima, K. Waki, M. Aoki, K. Murakami, M. Murata, M. Nishikawa, H. Nishiyori, K. Nomura, M. Ohno, H. Sato, Y. Shigemoto, N. Suzuki, Y. Takeda and K. Yoshida. We especially thank A. Wada, T. Ogawa, M. Muramatsu, A. Kira and all the members of RIKEN Yokohama Research Promotion Division for supporting and encouraging the project. We also thank the Laboratory of Genome Exploration Research Group for secretarial and technical assistance, and Yokohama City University, who provided human samples and computational resources of the RIKEN Super Combined Cluster (RSCC). This work was mainly supported by a Research Grant for the Genome Network Project from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT), the RIKEN Genome Exploration Research Project from MEXT (to Y.H.), the Advanced and Innovational Research Program in Life Science (to Y.H.), the National Project on Protein Structural and Functional Analysis from MEXT (to Y.H.), a Presidential Research Grant for Intersystem Collaboration of RIKEN (to P.C. and Y.H.) and a grant from the Sixth Framework Program of the European Commission (to P.C.).
The authors declare no competing financial interests.
Mapping CAGE starting sites to the genome. (PDF 590 kb)
Assessment of exonic promoter activity. (PDF 247 kb)
Conservation of promoters and TSS shapes over evolution. (PDF 881 kb)
Initiation site properties and evolutionary changes. (PDF 1028 kb)
Sequence pattern distributions for different classes of promoters. (PDF 347 kb)
Alternative promoters and transcription start sites in 3′ UTRs. (PDF 1166 kb)
CAGE validation examples. (PDF 983 kb)
Definition of TCs and mRNA assignments of TCs. (PDF 88 kb)
Detailed description of the data sets. (PDF 120 kb)
Substitution rate estimates for mouse and human promoters. (PDF 307 kb)
Functional and tissue specificity overrepresentation for different shape classes. (PDF 193 kb)
Internet links to publicly available resources and data sets. (PDF 126 kb)
CAGE reproducibility statistics. (PDF 220 kb)
Overrepresentation index of TFBS in macrophage promoters. (PDF 31 kb)
Overrepresentation and underrepresentation index of TFBS in macrophage promoters, detailed view. (PDF 135 kb)