  • Analysis

Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome

Abstract

In contrast to the fairly reliable and complete annotation of the protein-coding genes in the human genome, comparable information is lacking for noncoding RNAs (ncRNAs). We present a comparative screen of vertebrate genomes for structural noncoding RNAs, which evaluates conserved genomic DNA sequences for signatures of structural conservation of base-pairing patterns and exceptional thermodynamic stability. We predict more than 30,000 structured RNA elements in the human genome, almost 1,000 of which are conserved across all vertebrates. Roughly a third are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions of protein-coding mRNAs and about half are located far away from any known gene. Only a small fraction of these sequences has been described previously. A comparison with recent tiling array data shows that more than 40% of the predicted structured RNAs overlap with experimentally detected sites of transcription. The widespread conservation of secondary structure points to a large number of functional ncRNAs and cis-acting mRNA structures in the human genome.
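The idea of "exceptional thermodynamic stability" can be illustrated with a small sketch. This is not the authors' method: as a stated assumption, it replaces a real folding-energy model with a toy Nussinov-style count of nested base pairs, treats the negative pair count as a pseudo-energy, and compares the native sequence against mononucleotide-shuffled controls. The function names `max_base_pairs` and `stability_z_score`, the shuffle count and the minimum loop size are all hypothetical choices made for illustration.

```python
import random
import statistics

# Watson-Crick pairs plus G-U wobble pairs (toy pairing model, an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Nussinov-style dynamic programme: maximum number of nested base pairs.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: position j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


def stability_z_score(seq, n_shuffles=200, seed=0):
    """Z-score of the pseudo-energy (-pairs) against shuffled controls.

    Negative values mean the native sequence is more 'stable' (more pairs)
    than shuffled sequences of the same base composition.
    """
    rng = random.Random(seed)
    observed = -max_base_pairs(seq)
    background = []
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)
        background.append(-max_base_pairs("".join(bases)))
    sd = statistics.pstdev(background)
    if sd == 0:
        return 0.0
    return (observed - statistics.mean(background)) / sd


if __name__ == "__main__":
    hairpin = "GGGGGGAAAACCCCCC"
    print(max_base_pairs(hairpin), stability_z_score(hairpin))
```

A screen of the kind described in the abstract would combine a stability score of this sort with a measure of structural conservation across aligned species; the sketch covers only the stability half, and only under the simplified pairing model stated above.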

Figure 1: Annotation procedure.
Figure 2: Statistical analysis of predicted structural RNAs.
Figure 3: Selected examples of candidates for novel structural RNAs detected with P > 0.9.

Acknowledgements

This work was supported in part by the Austrian Fonds zur Förderung der Wissenschaftlichen Forschung, Project No. P15893, by the German DFG Bioinformatics Initiative BIZ-6/1-2, and by the Austrian Gen-AU bioinformatics integration network sponsored by bm:bwk.

Author information

Corresponding author

Correspondence to Peter F Stadler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

Northern blot analysis of five H/ACA snoRNA candidates. (PDF 30 kb)

Supplementary Table 1

Detailed results of the native screen and the random control screen. (PDF 12 kb)

Supplementary Table 2

MicroRNAs missing from the input set. (PDF 11 kb)

Supplementary Table 3

H/ACA snoRNAs missing from the input set. (PDF 10 kb)

Supplementary Table 4

Selected ncRNAs from literature with conserved RNA secondary structures detected in our screen. (PDF 11 kb)

Supplementary Table 5

Fifty selected RNAz hits in intergenic regions overlapping with the 'transfrag' transcriptional map. (PDF 347 kb)

About this article

Cite this article

Washietl, S., Hofacker, I., Lukasser, M. et al. Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome. Nat Biotechnol 23, 1383–1390 (2005). https://doi.org/10.1038/nbt1144

  • DOI: https://doi.org/10.1038/nbt1144
