Letters to Nature

Nature 417, 851-854 (20 June 2002) | doi:10.1038/nature00831; Received 9 November 2001; Accepted 15 April 2002

A global analysis of Caenorhabditis elegans operons

Thomas Blumenthal¹, Donald Evans¹, Christopher D. Link², Alessandro Guffanti³, Daniel Lawson³, Jean Thierry-Mieg⁴, Danielle Thierry-Mieg⁴, Wei Lu Chiu⁵, Kyle Duke⁶, Moni Kiraly⁶ & Stuart K. Kim⁶

  1. Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Box B121, 4200 E. 9th Avenue, Denver, Colorado 80262, USA
  2. Institute of Behavioral Genetics, Box 447, University of Colorado, Boulder, Colorado 80309, USA
  3. The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
  4. Gene Network Laboratory, National Institute of Genetics, Mishima 411, Japan, and National Center for Biotechnology Information, Bethesda, Maryland, USA
  5. Department of Molecular Sciences and Technologies, Pfizer Global Research & Development—Ann Arbor, 2800 Plymouth Road, Ann Arbor, Michigan 48105, USA
  6. Departments of Developmental Biology and Genetics, Stanford University Medical Center, 279 Campus Drive, Stanford, California 94305, USA

Correspondence and requests for materials should be addressed to T.B. (e-mail: tom.blumenthal@uchsc.edu).

The nematode worm Caenorhabditis elegans and its relatives are unique among animals in having operons¹. Operons are regulated multigene transcription units, in which polycistronic pre-messenger RNA (pre-mRNA coding for multiple peptides) is processed to monocistronic mRNAs. This occurs by 3′ end formation and trans-splicing using the specialized SL2 small nuclear ribonucleoprotein particle² for downstream mRNAs¹. Previously, the correlation between downstream location in an operon and SL2 trans-splicing has been strong, but anecdotal³. Although only 28 operons have been reported, the complete sequence of the C. elegans genome reveals numerous gene clusters⁴. To determine how many of these clusters represent operons, we probed full-genome microarrays for SL2-containing mRNAs. We found significant enrichment for about 1,200 genes, including most of a group of several hundred genes represented by complementary DNAs that contain SL2 sequence. Analysis of their genomic arrangements indicates that >90% are downstream genes, falling in 790 distinct operons. Our evidence indicates that the genome contains at least 1,000 operons, 2–8 genes long, that contain about 15% of all C. elegans genes. Numerous examples of co-transcription of genes encoding functionally related proteins are evident. Inspection of the operon list should reveal previously unknown functional relationships.
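
As an illustration only, the idea of combining genomic arrangement with SL2 enrichment can be sketched in a few lines of Python. This is not the procedure used in the study: the `Gene` record, the `candidate_operons` function, the `max_gap` spacing threshold and the rule that every downstream gene must be SL2-enriched are all assumptions made for the example. Genes are assumed to lie on a single chromosome.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str           # hypothetical gene identifier
    strand: str         # "+" or "-"
    start: int          # leftmost coordinate (bp)
    end: int            # rightmost coordinate (bp)
    sl2_enriched: bool  # assumed flag, e.g. derived from a microarray signal


def candidate_operons(genes, max_gap=1000):
    """Return gene-name lists for clusters that look like operons.

    A cluster is a run of adjacent same-strand genes separated by at most
    max_gap bp (an arbitrary illustrative threshold). A cluster is kept
    only if it has at least one downstream gene and all of its downstream
    genes are flagged as SL2-enriched.
    """
    ordered = sorted(genes, key=lambda g: g.start)

    # Step 1: split the chromosome into same-strand, closely spaced runs.
    clusters, current = [], []
    for gene in ordered:
        if (current
                and gene.strand == current[-1].strand
                and gene.start - current[-1].end <= max_gap):
            current.append(gene)
        else:
            if current:
                clusters.append(current)
            current = [gene]
    if current:
        clusters.append(current)

    # Step 2: orient each run 5' to 3' and test its downstream genes.
    operons = []
    for cluster in clusters:
        if cluster[0].strand == "-":
            cluster = cluster[::-1]  # upstream gene has the highest coordinate
        downstream = cluster[1:]
        if downstream and all(g.sl2_enriched for g in downstream):
            operons.append([g.name for g in cluster])
    return operons
```

Requiring every downstream gene to be SL2-enriched is deliberately strict; a real analysis would have to tolerate genes that are missing from the array or fall below the enrichment threshold.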