A compendium of RNA-binding motifs for decoding gene regulation

Abstract

RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.

Figure 1: RNAcompete data for 207 RBPs.
Figure 2: Motifs obtained by RNAcompete for RRM (outer ring) and KH domain proteins (inner ring).
Figure 3: RBD sequence identity enables inference of RNA motifs.
Figure 4: Conservation of motif matches in human RNA regulatory regions.
Figure 5: RBFOX1 is a putative regulator of RNA stability in autism.

Accession codes

Data deposits

Raw and processed microarray data are available at GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE41235. The derived motifs and results of analyses are available at http://hugheslab.ccbr.utoronto.ca/supplementary-data/RNAcompete_eukarya/.

Acknowledgements

We thank H. van Bakel for computational support, A. Ramani and J. Calarco for discussions, Y. Wu, G. Rasanathan, M. Krishnamoorthy, O. Boright, A. Janska, J. Li, S. Talukder, A. Cote and S. Votruba for technical assistance, L. Sutherland for purchasing RBM5 protein and for feedback on the manuscript, S. Jain for software modified to create Fig. 2, and N. Barbosa-Morais for generating cRPKM values from autism RNA-seq data. We thank M. Kiledjian (PCBP1 and PCBP2), J. Stevenin (SRSF2 and SFRS7), S. Richard (QKI), M. Gorospe (TIA1), B. Chabot (SRSF9), A. Berglund (MBNL1), F. Pagani (DAZAP1), A. Bindereif (HNRNPL), M. Freeman (HNRNPK), E. Miska (LIN28A), K. Kohno (YBX1), M. Garcia-Blanco (PTBP1), R. Wharton (PUM-HD), C. Smibert (Vts1p) and M. Blanchette (Hrb27C, Hrb87F and Hrb98DE) for sending published constructs. This work was supported by funding from NIH (1R01HG00570 to T.R.H. and Q.D.M., R01GM084034 to K.W.L.), CIHR (MOP-49451 to T.R.H., MOP-93671 to Q.D.M., MOP-125894 to Q.D.M. and T.R.H., MOP-67011 to B.J.B., and MOP-14409 to H.D.L.), and the Intramural Program of the NIDDK (DK015602-05 to E.P.L.). K.B.C. and S.G. hold NSERC Alexander Graham Bell Canada Graduate Scholarships. M.T.W. was funded by fellowships from CIHR and CIFAR. H.S.N. holds a Charles H. Best Fellowship and was funded partially by awards from CIFAR to T.R.H. and B.J.F. M.I. is the recipient of an HFSP LT Fellowship.

Author information

Contributions

D.R., H.K., K.B.C., M.T.W. and H.S.N. made unique, essential and extensive contributions to the manuscript, and are ordered by amount of time and effort contributed. D.R. and H.K. developed most of the laboratory and computational components of RNAcompete, respectively. D.R., H.Z., A.Y., H.N., L.H.M., S.A.S., C.A.Y., S.M.K., B.N., D.M., W.L., R.S.L. and M.Q. cloned, expressed and purified the proteins. D.R. ran the RNAcompete assays, including data extraction. H.K. and K.B.C. processed the data, H.K. and K.B.C. generated motifs, and H.K., K.B.C., M.T.W. and H.S.N. performed the motif analyses. H.K. assembled the in vivo protein-RNA data sets. L.H.M. and R.K.D. performed and analysed RIP-seq data. K.B.C. developed the supplementary website and Figs 1 and 2 with assistance from H.K. and M.T.W. M.T.W. and M.A. created the cisBP-RNA database. M.T.W., H.S.N. and T.R.H. created Fig. 3. H.S.N. performed the analyses of human splicing, RNA stability data and human sequence conservation, and created Figs 4 and 5. M.I. and S.G. generated and analysed RNA-seq data and S.G. performed reporter-based RNA stability assays. X.L. performed Drosophila data analysis. H.D.L., F.P., A.H.C., R.P.C., B.J.F., R.A.A., K.W.L., L.O.F.P., E.P.L., B.J.B. and A.G.F. helped organize and support the project, and provided feedback on the manuscript. B.J.F., B.J.B. and A.G.F. provided critical advice and commentary on data analysis. Q.D.M. and T.R.H. conceived of the study, supervised the project and wrote the manuscript with contributions from D.R., H.K., K.B.C., B.J.B., A.F. and H.S.N.

Corresponding authors

Correspondence to Quaid D. Morris or Timothy R. Hughes.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Methods, Supplementary Figures 1-6, Supplementary Tables 1-4 and additional references. (PDF 2569 kb)

Supplementary Data 1

This file shows RNA-binding proteins with known consensus motifs. It contains panels for human and Drosophila listing RBPs with known consensus motifs as well as the Pubmed ID of the publication that defined the motif. (XLSX 27 kb)

Supplementary Data 2

The RNAcompete master file. This file contains data on all RNAcompete experiments indexed by motif ID including: name, systematic ID and species of protein queried, the resulting motif, amino acid sequence of plasmid insert, and information on binding conditions used. (XLSX 2614 kb)

Supplementary Data 3

Secondary structure analysis. This file contains data panels in which each row corresponds to a significantly enriched secondary structure context for a given RNAcompete experiment along with P-values and effect sizes. Classification panel summarizes analysis results by motif. (XLSX 30 kb)

Supplementary Data 4

Clustered E-scores. This file contains the data matrix used in Figures 1b and S7. (TXT 16827 kb)

Supplementary Data 5

Comparison of RNAcompete and literature motifs. This file shows the results of comparison with previously defined motifs for RNAcompete RBPs. (XLSX 515 kb)

Supplementary Data 6

AUROC scores for in vivo and in vitro defined motifs on in vivo binding data. This file contains AUROCs for RNAcompete motifs on in vivo binding data described in Table S2, along with motifs learned by Malarkey on these data and AUROC scores for previously defined motifs for these RBPs. (XLSX 19 kb)
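As a rough illustration of the metric reported in this file, the sketch below computes an AUROC from per-sequence motif scores and binary bound/unbound labels using the rank-sum (Mann-Whitney U) identity. It is a minimal sketch, not the authors' pipeline: the function name `auroc` and the toy scores and labels are assumptions made for illustration only.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: motif scores, one per sequence (higher = more likely bound).
    labels: truthy for bound (positive) sequences, falsy for unbound ones.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at index i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items occupy 1-based ranks i+1 .. j; give each the average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1])
        i = j

    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical toy data: four sequences, two of them bound in vivo.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auroc = auroc(toy_scores, toy_labels)
```

An AUROC of 0.5 corresponds to a motif that ranks bound and unbound sequences no better than chance, and 1.0 to a perfect ranking.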

Supplementary Data 7

Post-transcriptional regulation (PTR) analysis in human. This file contains additional details and results of PTR analysis in human including predicted RBP-transcript regulatory networks for splicing and stability analysis. (XLSX 1445 kb)

Supplementary Data 8

Post-transcriptional regulation (PTR) analysis in Drosophila. This file contains details and results of PTR analysis for Drosophila including lists of PTR categories enriched for RNAcompete-derived IUPAC motifs, weights of trained logistic regression classifiers, Drosophila RBP(s) associated with each IUPAC motif, and IUPAC motifs queried. (XLSX 44 kb)
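For readers unfamiliar with IUPAC motifs, the following sketch shows one way such a degenerate motif could be matched against an RNA sequence. It is an illustrative assumption, not the procedure used in the paper: the helper names `iupac_to_regex` and `find_matches` are hypothetical, and the example motif and sequence are made up rather than taken from the supplementary data.

```python
import re

# Standard IUPAC nucleotide codes, written in the RNA alphabet.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}


def iupac_to_regex(motif):
    """Compile an IUPAC motif into a regex that reports overlapping matches."""
    classes = []
    for symbol in motif.upper().replace("T", "U"):
        classes.append("[" + IUPAC_RNA[symbol] + "]")
    # The zero-width lookahead lets finditer report overlapping sites.
    return re.compile("(?=(" + "".join(classes) + "))")


def find_matches(motif, sequence):
    """Return 0-based start positions of every motif match in the sequence."""
    rna = sequence.upper().replace("T", "U")
    return [m.start() for m in iupac_to_regex(motif).finditer(rna)]


# Hypothetical example: Y matches C or U, so both sites are reported.
positions = find_matches("GCAYG", "GCACGGCAUG")
```

DNA-style input is converted to RNA (T to U) before matching, so transcript sequences stored with either alphabet can be scanned the same way.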

Supplementary Data 9

Sources of gene and Pfam models. This file details sources for gene and protein models for all organisms used in cisBP-RNA and in this paper. It also indicates the Pfam models used to scan for RBDs. (XLSX 34 kb)

About this article

Cite this article

Ray, D., Kazan, H., Cook, K. et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature 499, 172–177 (2013). https://doi.org/10.1038/nature12311
