Systematic identification of mammalian regulatory motifs' target genes and functions

Abstract

We developed an algorithm, Lever, that systematically maps metazoan DNA regulatory motifs or motif combinations to sets of genes. Lever assesses whether the motifs are enriched in cis-regulatory modules (CRMs), predicted by our PhylCRM algorithm, in the noncoding sequences surrounding the genes. Lever analysis allows unbiased inference of functional annotations for regulatory motifs and candidate CRMs. Using human myogenic differentiation as a model system, we statistically assessed more than 25,000 pairings of gene sets with motifs or motif combinations. We assigned functional annotations to previously predicted candidate regulatory motifs and identified gene sets that are likely to be co-regulated via shared regulatory motifs. Lever thus moves beyond identifying putative regulatory motifs in mammalian genomes toward understanding their biological roles. The approach is general and can be applied readily to any cell type, gene expression pattern or organism of interest.
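As a rough illustration of what testing one gene set and motif pairing for enrichment can look like, the sketch below ranks genes by a per-gene score and asks whether members of a gene set tend to outscore the remaining genes. The abstract does not state which statistic Lever uses, so the rank-based AUC and the permutation test here are assumptions chosen for illustration, not the authors' method. The function names, the `scores` dictionary (standing in for a per-gene CRM score for one motif) and the gene identifiers are all hypothetical.

```python
import random


def auc_enrichment(scores, gene_set):
    """Rank-based AUC for one gene set (illustrative statistic, an assumption).

    Returns the probability that a randomly chosen gene in `gene_set`
    has a higher score than a randomly chosen gene outside it, with
    ties counted as one half.
    """
    foreground = [s for g, s in scores.items() if g in gene_set]
    background = [s for g, s in scores.items() if g not in gene_set]
    if not foreground or not background:
        raise ValueError("need at least one gene inside and outside the set")
    wins = sum((f > b) + 0.5 * (f == b) for f in foreground for b in background)
    return wins / (len(foreground) * len(background))


def permutation_pvalue(scores, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = auc_enrichment(scores, gene_set)
    genes = list(scores)
    size = sum(1 for g in genes if g in gene_set)
    hits = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(genes, size))
        if auc_enrichment(scores, random_set) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical per-gene scores for a single motif.
    toy_scores = {"geneA": 5.0, "geneB": 4.0, "geneC": 1.0, "geneD": 0.5}
    print(permutation_pvalue(toy_scores, {"geneA", "geneB"}))
```

Screening many gene sets against many motifs, as in the more than 25,000 pairings mentioned above, would repeat such a test per pairing and would then need a multiple-testing correction, which this sketch leaves out.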

Figure 1: Lever schema.
Figure 2: Analysis of the time course of human skeletal muscle differentiation.
Figure 3: Lever screen of 101 myogenic gene sets using a dictionary of 174 motifs.
Figure 4: Experimental validation of computationally predicted CRMs.

Accession codes

Accessions

Gene Expression Omnibus

Acknowledgements

We thank E. Margulies and the ENCODE Multiple Sequence Alignment working group for generously allowing use of their phylogenetic tree before its publication; S. Asthana, S. Sunyaev, G. Kryukov, M. Berger, T. Siggers and A. Aboukhalil for helpful discussions; J. Chee, E. Mathewson and T. Sierra for technical assistance; S. Elledge, A. Friedman, T. Siggers, M. Berger and F. De Masi for critical reading of the manuscript; A. Donner (Brigham & Women's Hospital) for the generous gift of human lens epithelial cells; and K. Cichowski (Brigham & Women's Hospital) for kindly providing lentiviral reagents. This work was funded in part by a PhRMA Foundation Informatics Research Starter Grant (M.L.B.), a William F. Milton Fund Award (M.L.B.), a Harvard-MIT Division of Health Sciences & Technology (HST) Taplin Award (M.L.B.) and US National Institutes of Health (NIH) National Human Genome Research Institute (R01 HG002966 to M.L.B.). J.B.W. was supported in part by an NIH Training Grant T32 HL07627 and NIH Individual National Research Service Award F32 AR051287. A.A.P. was supported in part by a National Defense Science and Engineering Graduate Fellowship from the Department of Defense and an Athinoula Martinos Fellowship from HST. S.A.J. was supported in part by a US National Science Foundation Postdoctoral Research Fellowship in Biological Informatics.

Author information

Contributions

J.B.W. participated in the experimental design, performed the experiments, and participated in the analysis of the results and the drafting of the manuscript. A.A.P. conceived of the PhylCRM scoring algorithm and participated in programming PhylCRM, running PhylCRM analyses, developing Lever, programming Lever, running Lever analyses, analyzing the results and drafting the manuscript. S.A.J. optimized the performance and participated in programming PhylCRM, running PhylCRM analyses, developing Lever, programming Lever, running Lever analyses, analyzing the results and drafting the manuscript. F.S.H. assisted with programming PhylCRM and running PhylCRM analyses. J.L. assisted with the experiments. M.L.B. conceived of the study and participated in the study design, the analysis of the results and the drafting of the manuscript.

Corresponding author

Correspondence to Martha L Bulyk.

Supplementary information

Supplementary Text and Figures

Supplementary figures 1–12, Supplementary Tables 1, 2, 4, Supplementary Methods, Supplementary Results (PDF 7182 kb)

Supplementary Table 3

Statistically significant GM pairs from Lever analyses. (XLS 3676 kb)

About this article

Cite this article

Warner, J., Philippakis, A., Jaeger, S. et al. Systematic identification of mammalian regulatory motifs' target genes and functions. Nat Methods 5, 347–353 (2008). https://doi.org/10.1038/nmeth.1188

  • DOI: https://doi.org/10.1038/nmeth.1188
