Resources abstract

Nature Biotechnology 27, 1043 - 1049 (2009)
Published online: 1 November 2009 | doi:10.1038/nbt.1582

The transcription unit architecture of the Escherichia coli genome

Byung-Kwan Cho1, Karsten Zengler1, Yu Qiu1, Young Seoub Park1, Eric M Knight1,3, Christian L Barrett1, Yuan Gao2 & Bernhard Ø Palsson1

Bacterial genomes are organized by structural and functional elements, including promoters, transcription start and termination sites, open reading frames, regulatory noncoding regions, untranslated regions and transcription units. Here, we iteratively integrate high-throughput, genome-wide measurements of RNA polymerase binding locations and mRNA transcript abundance, 5′ sequences and translation into proteins to determine the organizational structure of the Escherichia coli K-12 MG1655 genome. Integration of the organizational elements provides an experimentally annotated transcription unit architecture, including alternative transcription start sites, 5′ untranslated region, boundaries and open reading frames of each transcription unit. A total of 4,661 transcription units were identified, representing an increase of >530% over current knowledge. This comprehensive transcription unit architecture allows for the elucidation of condition-specific uses of alternative sigma factors at the genome scale. Furthermore, the transcription unit architecture provides a foundation on which to construct genome-scale transcriptional and translational regulatory networks.

  1. Department of Bioengineering, University of California, San Diego, La Jolla, California, USA.
  2. Center for the Study of Biological Complexity and Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA.
  3. Present address: Center for Systems Biology, University of Iceland, Vatnsmyravegar, Reykjavik, Iceland.

Correspondence to: Bernhard Ø Palsson1