  • Analysis

High-resolution computational models of genome binding events

An Erratum to this article was published on 01 October 2006

Abstract

Direct physical information that describes where transcription factors, nucleosomes, modified histones, RNA polymerase II and other key proteins interact with the genome provides an invaluable mechanistic foundation for understanding complex programs of gene regulation. We present a method, joint binding deconvolution (JBD), which uses additional easily obtainable experimental data about chromatin immunoprecipitation (ChIP) to improve the spatial resolution of the transcription factor binding locations inferred from ChIP followed by DNA microarray hybridization (ChIP-Chip) data. Based on this probabilistic model of binding data, we further pursue improved spatial resolution by using sequence information. We produce positional priors that link ChIP-Chip data to sequence data by guiding motif discovery to inferred protein-DNA binding sites. We present results on the yeast transcription factors Gcn4 and Mig2 to demonstrate JBD's spatial resolution capabilities and show that positional priors allow computational discovery of the Mig2 motif when a standard approach fails.
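The idea of using knowledge about the ChIP experiment to sharpen binding-site localization can be illustrated with a toy example. The sketch below is not the JBD algorithm: it assumes a single binding event, a hypothetical triangular influence function with a 300 bp reach (standing in for an influence function derived from DNA fragment length data), and noise-free probe measurements. All function names and parameters are invented for illustration. It scans candidate sites on a 30 bp grid and keeps the one whose predicted probe profile best matches the observed enrichment in the least-squares sense.

```python
# Toy illustration only: NOT the JBD algorithm described in the article.
# The triangular influence function, its 300 bp reach, and every name below
# are assumptions made for this sketch.

def influence(distance_bp, reach_bp=300):
    """Hypothetical influence of a binding event on a probe `distance_bp` away."""
    return max(0.0, 1.0 - abs(distance_bp) / reach_bp)


def predict(probes, site, strength):
    """Expected enrichment at each probe position for one binding event."""
    return [strength * influence(p - site) for p in probes]


def locate_single_event(probes, observed, step_bp=30):
    """Scan candidate sites; fit the event strength by least squares at each."""
    best = None
    for site in range(min(probes), max(probes) + 1, step_bp):
        k = [influence(p - site) for p in probes]
        norm = sum(v * v for v in k)
        if norm == 0.0:
            continue  # no probe is close enough to see this candidate site
        # Closed-form least-squares strength, clipped to be non-negative.
        strength = max(sum(v * o for v, o in zip(k, observed)) / norm, 0.0)
        residual = sum((o - strength * v) ** 2 for v, o in zip(k, observed))
        if best is None or residual < best[0]:
            best = (residual, site, strength)
    return best[1], best[2]


# Probes every 150 bp; the simulated event sits between two probes.
probes = [0, 150, 300, 450, 600, 750, 900]
observed = predict(probes, site=420, strength=2.0)
site, strength = locate_single_event(probes, observed)
print(site, round(strength, 3))
```

In this noise-free setting the search recovers the 420 bp site even though no probe sits there, because the relative enrichment of neighbouring probes constrains the position. A joint probabilistic treatment of many possible events with measurement noise, as the abstract describes, is a substantially harder inference problem than this single-event scan.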

Figure 1: JBD probabilistically models key aspects of ChIP-Chip experiments.
Figure 2: JBD predicts the binding probability of the yeast transcription factor Gcn4 every 30 bp across the entire genome.
Figure 3: Performance of the JBD, Mpeak and Ratio methods.
Figure 4: Positional priors for motif discovery improve robustness to false input DNA sequence regions.

References

  1. Ren, B. et al. Genome-wide location and function of DNA binding proteins. Science 290, 2306–2309 (2000).

  2. Lieb, J., Liu, X., Botstein, D. & Brown, P. Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nat. Genet. 28, 327–334 (2001).

  3. Iyer, V. et al. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409, 533–538 (2001).

  4. Simon, I. et al. Serial regulation of transcriptional regulators in the yeast cell cycle. Cell 106, 697–708 (2001).

  5. Lee, T. et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799–804 (2002).

  6. Horak, C. et al. GATA-1 binding sites mapped in the beta-globin locus by using mammalian ChIP-chip analysis. Proc. Natl. Acad. Sci. USA 99, 2924–2929 (2002).

  7. Weinmann, A., Yan, P., Oberley, M., Huang, T. & Farnham, P. Isolating human transcription factor targets by coupling chromatin immunoprecipitation and CpG island microarray analysis. Genes Dev. 16, 235–244 (2002).

  8. Li, Z. et al. A global transcriptional regulatory role for c-Myc in Burkitt's lymphoma cells. Proc. Natl. Acad. Sci. USA 100, 8164–8169 (2003).

  9. Wells, J., Yan, P., Cechvala, M., Huang, T. & Farnham, P. Identification of novel pRb binding sites using CpG microarrays suggests that E2F recruits pRb to specific genomic sites during S phase. Oncogene 22, 1445–1460 (2003).

  10. Harbison, C.T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).

  11. Cawley, S. et al. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116, 499–509 (2004).

  12. Robert, F. et al. Global position and recruitment of HATs and HDACs in the yeast genome. Mol. Cell 16, 199–209 (2004).

  13. Pokholok, D.K. et al. Genome-wide map of nucleosome acetylation and methylation in yeast. Cell 122, 517–527 (2005).

  14. Wyrick, J. et al. Genome-wide distribution of ORC and MCM proteins in S. cerevisiae: high-resolution mapping of replication origins. Science 294, 2357–2360 (2001).

  15. Gerton, J. et al. Inaugural article: global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 97, 11383–11390 (2000).

  16. Bernstein, B.E. et al. Methylation of histone H3 Lys 4 in coding regions of active genes. Proc. Natl. Acad. Sci. USA 99, 8695–8700 (2002).

  17. Ng, H., Robert, F., Young, R. & Struhl, K. Regulated recruitment of the ATP-dependent chromatin remodeling complex RSC in response to transcriptional repression and activation. Genes Dev. 16, 806–819 (2002).

  18. Robyr, D. et al. Microarray deacetylation maps determine genome-wide functions for yeast histone deacetylases. Cell 109, 437–446 (2002).

  19. Nagy, P., Cleary, M., Brown, P. & Lieb, J. Genomewide demarcation of RNA polymerase II transcription units revealed by physical fractionation of chromatin. Proc. Natl. Acad. Sci. USA 100, 6364–6369 (2003).

  20. Kurdistani, S.K., Tavazoie, S. & Grunstein, M. Mapping global histone acetylation patterns to gene expression. Cell 117, 721–733 (2004).

  21. Bernstein, B.E. et al. Genomic maps and comparative analysis of histone modifications in human and mouse. Cell 120, 169–181 (2005).

  22. Yuan, G. et al. Genome-scale identification of nucleosome positions in S. cerevisiae. Science 309, 626–630 (2005).

  23. Marion, R.M. et al. Sfp1 is a stress- and nutrient-sensitive regulator of ribosomal protein gene expression. Proc. Natl. Acad. Sci. USA 101, 14315–14322 (2004).

  24. Li, X. & Wong, W. Sampling motifs on phylogenetic trees. Proc. Natl. Acad. Sci. USA 102, 9481–9486 (2005).

  25. Hartemink, A.J., Gifford, D.K., Jaakkola, T.S. & Young, R.A. Combining location and expression data for principled discovery of genetic regulatory network models. Proceedings of Pacific Symposium on Biocomputing, (Lihue, Hawaii, January 3–7, 2002) 7, 437–449 (2002).

  26. Bar-Joseph, Z. et al. Computational discovery of gene modules and regulatory networks. Nat. Biotechnol. 21, 1337–1342 (2003).

  27. Luscombe, N. et al. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431, 308–312 (2004).

  28. Buck, M.J., Nobel, A.B. & Lieb, J.D. ChIPOTle: a user-friendly tool for the analysis of ChIP-chip data. Genome Biol. 6, R97 (2005).

  29. Roberts, C. et al. Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science 287, 873–880 (2000).

  30. Keles, S., Dudoit, S., van der Laan, M. & Cawley, S.E. Multiple testing methods for ChIP-Chip high density oligonucleotide array data. Berkeley Electronic Press (June, 2004). http://www.bepress.com/ucbbiostat/paper147

  31. Kim, T.H. et al. A high-resolution map of active promoters in the human genome. Nature 436, 876–880 (2005).

  32. Boyer, L.A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956 (2005).

  33. Bailey, T. & Elkan, C. The value of prior knowledge in discovering motifs with MEME. in Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology, 21–29 (AAAI Press, Menlo Park, CA, 1995).

  34. Wingender, E. et al. The TRANSFAC system on gene expression regulation. Nucleic Acids Res. 29, 281–283 (2001).

  35. Lutfiyya, L. & Johnston, M. Two zinc-finger-containing repressors are responsible for glucose repression of SUC2 expression. Mol. Cell. Biol. 16, 4790–4797 (1996).

  36. Neal, R.M. Probabilistic inference using Markov Chain Monte Carlo methods. Tech. Rep. CRG-TR-93-1, Dept. of Computer Science, University of Toronto (1993).

  37. Brooks, S.P. Markov Chain Monte Carlo method and its application. Statistician 47, 69–100 (1998).

  38. Minka, T.P. Expectation propagation for approximate Bayesian inference. in Proceedings of Uncertainty in Artificial Intelligence 362–369 (2001). http://research.microsoft.com/~minka/papers/ep/minka-ep-uai.pdf

  39. Qi, Y. Extending expectation propagation for graphical models. Ph.D. thesis, MIT (2004). http://www.csail.mit.edu/~alanqi/papers/Qi-PhD-thesis-MIT-04.pdf

  40. Gordon, D.B., Nekludova, L., McCallum, S. & Fraenkel, E. TAMO: a flexible, object-oriented framework for analyzing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21, 3164–3165 (2005).

Acknowledgements

We would like to thank Duncan Odom and Laurie Boyer for providing DNA fragment length data for human transcription factors. This work was funded by the National Institutes of Health under grant number GM-069676.

Author information

Corresponding author

Correspondence to David K Gifford.

Ethics declarations

Competing interests

David Gifford consulted for Agilent Corporation during the course of this research.

Supplementary information

Supplementary Fig. 1

ChIP-Chip experimental protocol. (PDF 54 kb)

Supplementary Fig. 2

Enrichment ratios for three replicates of an untagged control experiment and two replicates of Mig2. (PDF 109 kb)

Supplementary Fig. 3

Motif discovery output. (PDF 117 kb)

Supplementary Fig. 4

Influence function for yeast, human embryonic stem cells and human liver. (PDF 48 kb)

Supplementary Table 1

MIPS categories of Gcn4 targets identified by JBD. (PDF 20 kb)

Supplementary Table 2

Known Gcn4 and Mig2 log-odds weight matrices, as discovered by motif discovery with positional priors from JBD and as previously published by Harbison et al. and Lutfiyya & Johnston. (PDF 25 kb)

Supplementary Table 3

Performance (median distance in bp) of JBD on synthetic data. (PDF 35 kb)

Supplementary Table 4

Standard error of the mean spatial resolution of JBD on synthetic data. (PDF 34 kb)

Supplementary Table 5

Mean distance in nucleotides between binding calls and the Gcn4 motif in 573 promoter regions with a conserved Gcn4 motif. (PDF 20 kb)

Supplementary Discussion (PDF 10 kb)

Supplementary Methods (PDF 72 kb)

Supplementary Data 1 (TXT 0 kb)

Supplementary Data 2 (TXT 8 kb)

Supplementary Data 3 (TXT 3 kb)

About this article

Cite this article

Qi, Y., Rolfe, A., MacIsaac, K. et al. High-resolution computational models of genome binding events. Nat Biotechnol 24, 963–970 (2006). https://doi.org/10.1038/nbt1233

  • DOI: https://doi.org/10.1038/nbt1233
