Abstract
Bacterial genomes are organized by structural and functional elements, including promoters, transcription start and termination sites, open reading frames, regulatory noncoding regions, untranslated regions and transcription units. Here, we iteratively integrate high-throughput, genome-wide measurements of RNA polymerase binding locations, mRNA transcript abundance, 5′ sequences and translation into proteins to determine the organizational structure of the Escherichia coli K-12 MG1655 genome. Integration of the organizational elements provides an experimentally annotated transcription unit architecture, including the alternative transcription start sites, 5′ untranslated regions, boundaries and open reading frames of each transcription unit. A total of 4,661 transcription units were identified, representing an increase of >530% over current knowledge. This comprehensive transcription unit architecture allows for the elucidation of condition-specific uses of alternative sigma factors at the genome scale. Furthermore, the transcription unit architecture provides a foundation on which to construct genome-scale transcriptional and translational regulatory networks.
Acknowledgements
The authors thank Derek Lovley at the University of Massachusetts, Amherst for his insightful discussion and Marc Abrams for editing the manuscript. Proteomics experiments were performed using EMSL, a national scientific user facility sponsored by the Department of Energy's Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory. This work was supported by the US National Institutes of Health Grant GM062791 and by the Office of Science (BER), US Department of Energy, cooperative agreement DE-FC02-02ER63446.
Author information
Contributions
B.-K.C., K.Z., Y.Q., E.M.K. and B.Ø.P. conceived and designed experiments. B.-K.C., Y.S.P., Y.G. and E.M.K. performed genome-scale experiments. All data analyses were performed by B.-K.C., K.Z., Y.Q., Y.S.P. and C.L.B. The manuscript was written by B.-K.C., K.Z. and B.Ø.P.
Supplementary information
Supplementary Text and Figures
Supplementary Figs. 1–9 and Supplementary Tables 1–14 (PDF 26256 kb)
About this article
Cite this article
Cho, BK., Zengler, K., Qiu, Y. et al. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol 27, 1043–1049 (2009). https://doi.org/10.1038/nbt.1582
DOI: https://doi.org/10.1038/nbt.1582