The transcription unit architecture of the Escherichia coli genome


Bacterial genomes are organized by structural and functional elements, including promoters, transcription start and termination sites, open reading frames, regulatory noncoding regions, untranslated regions and transcription units. Here, we iteratively integrate high-throughput, genome-wide measurements of RNA polymerase binding locations and mRNA transcript abundance, 5′ sequences and translation into proteins to determine the organizational structure of the Escherichia coli K-12 MG1655 genome. Integration of the organizational elements provides an experimentally annotated transcription unit architecture, including alternative transcription start sites, 5′ untranslated region, boundaries and open reading frames of each transcription unit. A total of 4,661 transcription units were identified, representing an increase of >530% over current knowledge. This comprehensive transcription unit architecture allows for the elucidation of condition-specific uses of alternative sigma factors at the genome scale. Furthermore, the transcription unit architecture provides a foundation on which to construct genome-scale transcriptional and translational regulatory networks.
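As a rough reading aid for the ">530%" figure, the arithmetic can be sketched as follows. This assumes the figure is a plain percent increase, so that the new total equals the prior count times (1 + 5.30). The prior count is not stated in the abstract, and the function name `implied_baseline` is hypothetical, introduced only for this illustration.

```python
def implied_baseline(total_units: int, percent_increase: float) -> float:
    """Return the prior count implied by a new total and a percent increase.

    Assumes total = prior * (1 + percent_increase / 100).
    """
    return total_units / (1.0 + percent_increase / 100.0)


# 4,661 transcription units and an increase of more than 530% (both from the abstract).
# Because the increase is stated as "greater than", the result is an upper bound.
upper_bound = implied_baseline(4661, 530.0)
print(f"Implied prior count: fewer than about {upper_bound:.0f} transcription units")
```

Under that assumption, the previously known set would have held somewhat fewer than the printed value; the abstract gives only the percentage, so this is an inference rather than a reported number.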

Figure 1: Flowchart of the systematic iterative integration process.
Figure 2: Integration of the organizational components.
Figure 3: Modular units.
Figure 4: Determination of transcription units and use of alternative TSSs.

Accession codes


Gene Expression Omnibus



The authors thank Derek Lovley at the University of Massachusetts, Amherst for his insightful discussion and Marc Abrams for editing the manuscript. Proteomics experiments were performed using EMSL, a national scientific user facility sponsored by the Department of Energy's Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory. This work was supported by the US National Institutes of Health Grant GM062791 and by the Office of Science (BER), US Department of Energy, cooperative agreement DE-FC02-02ER63446.

Author information


B.-K.C., K.Z., Y.Q., E.M.K. and B.Ø.P. conceived and designed experiments. B.-K.C., Y.S.P., Y.G. and E.M.K. performed genome-scale experiments. All data analyses were performed by B.-K.C., K.Z., Y.Q., Y.S.P. and C.L.B. The manuscript was written by B.-K.C., K.Z. and B.Ø.P.

Corresponding author

Correspondence to Bernhard Ø Palsson.

Supplementary information

Supplementary Text and Figures

Supplementary Figs. 1–9 and Supplementary Tables 1–14 (PDF 26256 kb)

About this article

Cite this article

Cho, BK., Zengler, K., Qiu, Y. et al. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol 27, 1043–1049 (2009).
