Sequence-based identification of 3D structural modules in RNA with RMDetect

Nature Methods volume 8, pages 513–519 (2011)

Abstract

Structural RNA modules, sets of ordered non-Watson-Crick base pairs embedded between Watson-Crick pairs, have central roles as architectural organizers and sites of ligand binding in RNA molecules, and are recurrently observed in RNA families throughout the phylogeny. Here we describe a computational tool, RNA three-dimensional (3D) modules detection, or RMDetect, for identifying known 3D structural modules in single and multiple RNA sequences in the absence of any other information. Currently, four modules can be searched for: G-bulge loop, kink-turn, C-loop and tandem-GA loop. In control test sequences we found all of the known modules with a false discovery rate of 0.23. Scanning through 1,444 publicly available alignments, we identified 21 yet unreported modules and 141 known modules. RMDetect can be used to refine RNA 2D structure, assemble RNA 3D models, and search and annotate structured RNAs in genomic data.
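The false discovery rate quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definition, false positives divided by all predicted hits; the abstract does not spell out the exact formula used. The function names and the toy hit lists below are hypothetical and are not part of RMDetect or its data.

```python
def false_discovery_rate(predicted, known):
    """Fraction of predicted hits that are not known modules: FP / (FP + TP).

    Assumption: hits are hashable identifiers, here (sequence_id, position).
    """
    predicted = set(predicted)
    known = set(known)
    if not predicted:
        raise ValueError("no predictions, so the false discovery rate is undefined")
    false_positives = predicted - known
    return len(false_positives) / len(predicted)


def sensitivity(predicted, known):
    """Fraction of known modules that were recovered: TP / (TP + FN)."""
    predicted = set(predicted)
    known = set(known)
    if not known:
        raise ValueError("no known modules, so sensitivity is undefined")
    return len(predicted & known) / len(known)


# Hypothetical toy data: three known modules, four predictions (one spurious).
known_hits = {("seqA", 10), ("seqB", 42), ("seqC", 7)}
predicted_hits = known_hits | {("seqD", 99)}

fdr = false_discovery_rate(predicted_hits, known_hits)
recall = sensitivity(predicted_hits, known_hits)
```

In this toy case every known module is recovered while one of the four predictions is spurious, which mirrors the situation the abstract describes: all known modules found, at the cost of some false positives.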

Acknowledgements

We thank R. Backofen for useful suggestions. J.A.C. is supported by the Ph.D. Program in Computational Biology of the Instituto Gulbenkian de Ciência, Portugal (sponsored by Fundação Calouste Gulbenkian, Siemens SA and Fundação para a Ciência e Tecnologia; SFRH/BD/33528/2008).

Author information

Affiliations

  1. Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France.

    • José Almeida Cruz
    • Eric Westhof

Contributions

J.A.C. conceived the algorithms, performed the computations and wrote the manuscript. E.W. conceived the research and wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Eric Westhof.

Supplementary information

PDF files

  1.

    Supplementary Text and Figures

    Supplementary Figures 1–10, Supplementary Table 1, Supplementary Notes 1–6, Supplementary Data 1–3

Zip files

  1.

    Supplementary Software 1

    Rmdetect

About this article

DOI

https://doi.org/10.1038/nmeth.1603
