Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data

Abstract

Molecular interactions between protein complexes and DNA mediate essential gene-regulatory functions. Uncovering such interactions by chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) has recently become the focus of intense interest. Here we introduce quantitative enrichment of sequence tags (QuEST), a powerful statistical framework based on kernel density estimation that uses ChIP-Seq data to determine the positions where protein complexes contact DNA. Using QuEST, we discovered several thousand binding sites for the human transcription factors SRF, GABP and NRSF at an average resolution of about 20 base pairs. Analyses of the QuEST-identified sequences with the MEME motif-discovery tool revealed DNA binding by cofactors of SRF, providing evidence that cofactor binding specificity can be obtained from ChIP-Seq data. By combining QuEST analyses with Gene Ontology (GO) annotations and expression data, we illustrate how general functions of transcription factors can be inferred.
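
As a rough illustration of the kernel density estimation idea named in the abstract, the Python sketch below smooths sequence-tag positions into a density profile and reports the position of highest density as a candidate binding site. It is not the QuEST implementation. The function names, the Gaussian kernel, the 30 bp bandwidth and the 50 bp strand shift are all assumptions chosen for illustration, and the tag coordinates are made up.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def density_profile(tag_positions, start, end, bandwidth=30.0):
    """Kernel density estimate of tag positions at every base in [start, end).

    Returns a list with one density value per base. An empty tag list
    yields an all-zero profile.
    """
    n = len(tag_positions)
    profile = []
    for x in range(start, end):
        total = sum(gaussian_kernel((x - t) / bandwidth) for t in tag_positions)
        profile.append(total / (n * bandwidth) if n else 0.0)
    return profile


def call_peak(forward_tags, reverse_tags, start, end, shift=50, bandwidth=30.0):
    """Return (position, density) at the maximum of a pooled density profile.

    Forward-strand tags are moved downstream and reverse-strand tags upstream
    by `shift` bases before pooling. The shift value is hypothetical; in a real
    analysis it would have to be estimated from the data.
    """
    pooled = [t + shift for t in forward_tags] + [t - shift for t in reverse_tags]
    profile = density_profile(pooled, start, end, bandwidth)
    best = max(range(len(profile)), key=profile.__getitem__)
    return start + best, profile[best]


if __name__ == "__main__":
    # Made-up tag start coordinates flanking a hypothetical site.
    forward = [940, 950, 960]
    reverse = [1040, 1050, 1060]
    position, height = call_peak(forward, reverse, start=900, end=1100)
    print(position, height)
```

The sketch evaluates the density at single-base steps, which is slow for whole genomes but keeps the logic visible. A practical tool would also need a background model and a significance threshold, neither of which is attempted here.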

Figure 1: QuEST's representation of ChIP-Seq data using density profiles.
Figure 2: Reproducibility and robustness of QuEST results assessed by comparison between two independent NRSF datasets.
Figure 3: Resolution of QuEST as quantified by the distance between CDP peak calls and TFBS motif centers.
Figure 4: Motif analysis results.

Acknowledgements

This work was supported by US National Institutes of Health grants 5 U01 HG003162 and 1 U54-HG004576 to R.M.M., and by funds from the Stanford Pathology and Genetics Ultra-High Throughput Sequencing Initiative. We thank L. Tsavaler for performing the Illumina expression analysis, and W.H. Wong, K. McCue and members of the Sidow lab for valuable discussions and suggestions.

Author information

Contributions

A.V., S.B., An.S. and Ar.S. conceived the QuEST peak calling concept and developed the preliminary statistical framework. A.V. further developed and refined the statistical framework, and implemented QuEST. R.M.M. and D.S.J. devised the ChIP experiments. D.S.J., C.M. and E.A. performed the ChIP experiments. A.V. applied QuEST to the sequence data, and produced all quantitative results. A.V. and Ar.S. wrote the manuscript. A.V., D.S.J., S.B., R.M.M. and Ar.S. edited the manuscript.

Corresponding author

Correspondence to Arend Sidow.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–6, Supplementary Tables 1–3, Supplementary Methods (PDF 2728 kb)

About this article

Cite this article

Valouev, A., Johnson, D., Sundquist, A. et al. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nat Methods 5, 829–834 (2008). https://doi.org/10.1038/nmeth.1246
