  • Protocol

Motif-based analysis of large nucleotide data sets using MEME-ChIP

Abstract

MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, and sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. It performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP's interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct from and complementary to other online methods.
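
To make the distinction between word-based and weight matrix–based motif representations concrete, the following toy Python sketch may help. It is not the algorithm used by MEME, DREME or any other MEME-ChIP component. The function names (`count_words`, `build_pwm`, `best_match`), the example sequences, the uniform background of 0.25 and the pseudocount of 1 are all assumptions made for illustration. The sketch assumes uppercase DNA containing only A, C, G and T.

```python
import math
from collections import Counter


def count_words(seqs, k):
    """Word-based view: count how many sequences contain each k-mer at least once."""
    counts = Counter()
    for s in seqs:
        # A set ensures each word is counted once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Weight-matrix view: log-odds scores per position from equal-length aligned sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(width):
        column = Counter(site[pos] for site in sites)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_match(pwm, seq):
    """Return (score, offset) of the highest-scoring window of seq under the matrix."""
    width = len(pwm)
    return max(
        (sum(pwm[i][seq[off + i]] for i in range(width)), off)
        for off in range(len(seq) - width + 1)
    )


# Hypothetical input sequences and aligned sites (illustrative only).
seqs = ["CCAGATAAGG", "TTTGATAAGC", "GCAGATAGCC"]
word_counts = count_words(seqs, 4)

pwm = build_pwm(["AGATAA", "TGATAA", "AGATAG"])
score, offset = best_match(pwm, "CCCCAGATAACCCC")
```

In this sketch the word-based view treats a motif as an exact string and asks how many sequences contain it, whereas the weight matrix tolerates variation at individual positions by scoring each base separately.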


Figure 1: MEME-ChIP submission form.
Figure 2: Universal options (expanded).
Figure 3: MEME options (expanded).
Figure 4: DREME options (expanded).
Figure 5: CentriMo options (expanded).
Figure 6: Job verification page.
Figure 7: MEME-ChIP report (top).
Figure 8: MEME-ChIP report (PROGRAMS).
Figure 9: MEME-ChIP report (INPUT FILES and command line).
Figure 10: MEME-ChIP report (MOTIFS).
Figure 11: MEME-ChIP report (top motif group expanded).
Figure 12: JASPAR Gata1 motif.
Figure 13: Tomtom analysis of MEME Gata1 motif (two most similar known motifs).
Figure 14: CentriMo analysis of GATA1 ChIP-seq peaks (top 15 most centrally enriched motifs).
Figure 15: CentriMo analysis reveals peak-calling artifacts.
Figure 16: MEME-ChIP report (fourth motif group expanded).
Figure 17: Tomtom analysis of DREME motifs.
Figure 18: Puf3p MEME-ChIP report (top).
Figure 19: Puf3p MEME-ChIP report (top motif group expanded).
Figure 20: CentriMo analysis of Puf3p PAR-CLIP cross-linking regions (top three most locally enriched motifs).
Figure 21: Puf3p MEME-ChIP report (top four motifs).

Acknowledgements

This work was supported by the US National Institutes of Health awards R01 RR021692, R01 GM103544 and R01 GM098039.

Author information

Contributions

W.M. and W.S.N. wrote the initial draft. T.L.B. conceived the study cases, wrote the anticipated results section and wrote the second draft. W.M. and T.L.B. verified the study cases. W.S.N., W.M. and T.L.B. edited the final manuscript.

Corresponding author

Correspondence to Timothy L Bailey.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Ma, W., Noble, W. & Bailey, T. Motif-based analysis of large nucleotide data sets using MEME-ChIP. Nat Protoc 9, 1428–1450 (2014). https://doi.org/10.1038/nprot.2014.083

  • DOI: https://doi.org/10.1038/nprot.2014.083
