Post-transcriptional gene regulation frequently occurs through elements in mRNA 3′ untranslated regions (UTRs) [1, 2]. Although crucial roles for 3′UTR-mediated gene regulation have been found in Caenorhabditis elegans [3, 4, 5], most C. elegans genes have lacked annotated 3′UTRs [6, 7]. Here we describe a high-throughput method for reliable identification of polyadenylated RNA termini, and we apply this method, called poly(A)-position profiling by sequencing (3P-Seq), to determine C. elegans 3′UTRs. Compared to standard methods also recently applied to C. elegans UTRs [8], 3P-Seq identified 8,580 additional UTRs while excluding thousands of shorter UTR isoforms that do not seem to be authentic. Analysis of this expanded and corrected data set suggested that the high A/U content of C. elegans 3′UTRs facilitated genome compaction, because the elements specifying cleavage and polyadenylation, which are A/U rich, can more readily emerge in A/U-rich regions. Indeed, 30% of the protein-coding genes have mRNAs with alternative, partially overlapping end regions that generate another 10,480 cleavage and polyadenylation sites that had gone largely unnoticed and represent potential evolutionary intermediates of progressive UTR shortening. Moreover, a third of the convergently transcribed genes use palindromic arrangements of bidirectional elements to specify UTRs with convergent overlap, which also contributes to genome compaction by eliminating regions between genes. Although nematode 3′UTRs have a median length only one-sixth that of mammalian 3′UTRs, they have twice the density of conserved microRNA sites, in part because additional types of seed-complementary sites are preferentially conserved. These findings reveal the influence of cleavage and polyadenylation on the evolution of genome architecture and provide resources for studying post-transcriptional gene regulation.
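The argument that A/U-rich signals emerge more readily in A/U-rich sequence can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the study: it assumes independent, identically distributed bases, uses the canonical AAUAAA hexamer as a stand-in for an A/U-rich processing element, and compares two illustrative compositions (50% and 70% A/U) chosen only for the example. The function names are hypothetical.

```python
def motif_probability(motif: str, base_freqs: dict[str, float]) -> float:
    """Probability that a given position starts an exact match to `motif`,
    assuming each base is drawn independently with the given frequencies."""
    p = 1.0
    for base in motif:
        p *= base_freqs[base]
    return p


def uniform_composition(au_fraction: float) -> dict[str, float]:
    """Split the A/U fraction evenly between A and U, and the remainder
    evenly between G and C (a simplifying assumption)."""
    gc = 1.0 - au_fraction
    return {"A": au_fraction / 2, "U": au_fraction / 2, "G": gc / 2, "C": gc / 2}


SIGNAL = "AAUAAA"  # canonical poly(A) signal hexamer, used here for illustration

# Illustrative compositions only; these are not values reported in the study.
balanced = motif_probability(SIGNAL, uniform_composition(0.50))
au_rich = motif_probability(SIGNAL, uniform_composition(0.70))

print(f"per-position probability at 50% A/U: {balanced:.6f}")
print(f"per-position probability at 70% A/U: {au_rich:.6f}")
print(f"fold increase: {au_rich / balanced:.1f}")
```

Under these assumptions the hexamer is roughly 7.5 times more likely to arise by chance at any given position in the 70% A/U sequence, which conveys why new cleavage and polyadenylation sites might appear more easily in A/U-rich UTRs. Real sequences are not i.i.d., so this is only an order-of-magnitude intuition.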
References
- From birth to death: the complex lives of eukaryotic mRNAs. Science 309, 1514–1518 (2005)
- mRNA localization: gene expression in the spatial dimension. Cell 136, 719–730 (2009)
- Control of the sperm-oocyte switch in Caenorhabditis elegans hermaphrodites by the fem-3 3′ untranslated region. Nature 349, 346–348 (1991)
- Negative regulatory sequences in the lin-14 3′-untranslated region are necessary to generate a temporal switch during Caenorhabditis elegans development. Genes Dev. 5, 1813–1824 (1991)
- 3′ UTRs are the primary regulators of gene expression in the C. elegans germline. Curr. Biol. 18, 1476–1482 (2008)
- WormBase 2007. Nucleic Acids Res. 36, D612–D617 (2008)
- UTRome.org: a platform for 3′UTR biology in C. elegans. Nucleic Acids Res. 36, D57–D62 (2008)
- The landscape of C. elegans 3′UTRs. Science 329, 432–435 (2010)
- Oligo(dT) primer generates a high frequency of truncated cDNAs through internal poly(A) priming during reverse transcription. Proc. Natl Acad. Sci. USA 99, 6152–6156 (2002)
- Massively parallel sequencing of the polyadenylated transcriptome of C. elegans. Genome Res. 19, 657–666 (2009)
- NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61–D65 (2007)
- A functional human poly(A) site requires only a potent DSE and an A-rich upstream sequence. EMBO J. 29, 1523–1536 (2010)
- A complex containing CstF-64 and the SL2 snRNP connects mRNA 3′ end formation and trans-splicing in C. elegans operons. Genes Dev. 15, 2562–2571 (2001)
- Transcriptional collision between convergent genes in budding yeast. Proc. Natl Acad. Sci. USA 99, 8796–8801 (2002)
- PRG-1 and 21U-RNAs interact to form the piRNA complex required for fertility in C. elegans. Mol. Cell 31, 67–78 (2008)
- Mammalian microRNAs: experimental evaluation of novel and previously annotated genes. Genes Dev. 24, 992–1009 (2010)
- Intronic microRNA precursors that bypass Drosha processing. Nature 448, 83–86 (2007)
- Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19, 92–105 (2009)
- Comprehensive discovery of endogenous Argonaute binding sites in Caenorhabditis elegans. Nature Struct. Mol. Biol. 17, 173–179 (2010)
- The microRNA miR-124 controls gene expression in the sensory nervous system of Caenorhabditis elegans. Nucleic Acids Res. 38, 3780–3793 (2010)
- A genome-wide map of conserved microRNA targets in C. elegans. Curr. Biol. 16, 460–471 (2006)
- The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901–906 (2000)
- The Caenorhabditis elegans hunchback-like gene lin-57/hbl-1 controls developmental time and is regulated by microRNAs. Dev. Cell 4, 625–637 (2003)
- A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 33, 201–212 (2005)
- Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008)
- Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science 320, 1643–1647 (2008)
- Progressive lengthening of 3′ untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proc. Natl Acad. Sci. USA 106, 7028–7033 (2009)
- Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009)
- An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858–862 (2001)
- Protein factors in pre-mRNA 3′-end processing. Cell Mol. Life Sci. 65, 1099–1122 (2008)
- Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835–840 (2010)
- Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009)
- Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 35, D26–D31 (2007)
- The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998)
- Trans-splicing and operons. WormBook 25, 1–9 (2005)
- The UCSC genome browser database. Nucleic Acids Res. 31, 51–54 (2003)
- Supplementary Information (16M)
This file contains a Supplementary Discussion, additional references, Supplementary Tables 1–5 and 7–10 (see separate file for Supplementary Table 6) and Supplementary Figures 1–15 with legends.
- Supplementary Dataset 1 (3.6M)
This file contains the coordinates of processed data used in the analyses.
- Supplementary Dataset 2 (1.7M)
This file contains coordinates of UTRs defined in the study. A small fraction (<0.5%) of the UTRs listed in the original file were found to be likely artefacts, so a revised Dataset 2 file was added on 06 January 2011. That revised file was replaced on 18 February 2011, after it was noticed that one of its columns was missing.
- Supplementary Table 6 (214K)
This file contains a table summarizing miRNA sequencing data. The original file was not displaying correctly and was replaced on 06 January 2011.