  • Letter

Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities

Abstract

Transcription factors (TFs) interact with specific DNA regulatory sequences to control gene expression throughout myriad cellular processes. However, the DNA binding specificities of only a small fraction of TFs are sufficiently characterized to predict the sequences that they can and cannot bind. We present a maximally compact, synthetic DNA sequence design for protein binding microarray (PBM) experiments (ref. 1) that represents all possible DNA sequence variants of a given length k (that is, all 'k-mers') on a single, universal microarray. We constructed such all k-mer microarrays covering all 10–base pair (bp) binding sites by converting high-density single-stranded oligonucleotide arrays to double-stranded (ds) DNA arrays. Using these microarrays we comprehensively determined the binding specificities over a full range of affinities for five TFs of different structural classes from yeast, worm, mouse and human. The unbiased coverage of all k-mers permits high-throughput interrogation of binding site preferences, including nucleotide interdependencies, at unprecedented resolution.
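
The idea of packing every k-mer into the shortest possible sequence can be illustrated with a de Bruijn sequence, which the author contributions name as the basis of the design. The following is a minimal sketch only: it assumes the textbook Lyndon-word (FKM) construction rather than the linear feedback shift register approach the authors mention, it ignores reverse complements and the splitting of the sequence across array features, and the function names (`de_bruijn`, `linearize`, `covers_all_kmers`) are hypothetical.

```python
def de_bruijn(alphabet: str, k: int) -> str:
    """Cyclic de Bruijn sequence of order k (FKM / Lyndon-word construction).

    Assumption: this is a standard textbook algorithm, not the authors' method.
    """
    n = len(alphabet)
    a = [0] * (n * k)   # working buffer of symbol indices
    seq = []            # concatenated Lyndon words whose length divides k

    def db(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def linearize(cyclic_seq: str, k: int) -> str:
    """Append the first k-1 symbols so k-mers spanning the wrap-around are kept."""
    return cyclic_seq + cyclic_seq[:k - 1]


def covers_all_kmers(linear_seq: str, alphabet: str, k: int) -> bool:
    """True if every possible k-mer over the alphabet occurs in linear_seq."""
    seen = {linear_seq[i:i + k] for i in range(len(linear_seq) - k + 1)}
    return len(seen) == len(alphabet) ** k


if __name__ == "__main__":
    k = 3  # small k for illustration; the arrays in the text cover k = 10
    linear = linearize(de_bruijn("ACGT", k), k)
    print(len(linear), covers_all_kmers(linear, "ACGT", k))
```

For a four-letter alphabet the cyclic sequence has length 4^k, so the linear form needs only 4^k + k - 1 bases to contain each k-mer exactly once, compared with k × 4^k bases if every k-mer were listed separately.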

Figure 1: Design of a universal microarray for PBM experiments.
Figure 2: Relating PBM signal intensity to individual k-mers.
Figure 3: Determination of motifs and logos for five TFs.

References

  1. Mukherjee, S. et al. Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays. Nat. Genet. 36, 1331–1339 (2004).

  2. Bulyk, M.L., Huang, X., Choo, Y. & Church, G.M. Exploring the DNA-binding specificities of zinc fingers with DNA microarrays. Proc. Natl. Acad. Sci. USA 98, 7158–7163 (2001).

  3. Berger, M.F. & Bulyk, M.L. Protein binding microarrays (PBMs) for rapid, high-throughput characterization of the sequence specificities of DNA-binding proteins. Methods Mol. Biol. 338, 245–260 (2006).

  4. Golomb, S. Shift Register Sequences (Aegean Park Press, Laguna Hills, California, USA, 1967).

  5. Kwan, A.H., Czolij, R., Mackay, J.P. & Crossley, M. Pentaprobe: a comprehensive sequence for the one-step detection of DNA-binding activities. Nucleic Acids Res. 31, e124 (2003).

  6. Linnell, J. et al. Quantitative high-throughput analysis of transcription factor binding specificities. Nucleic Acids Res. 32, e44 (2004).

  7. Oliphant, A.R., Brandl, C.J. & Struhl, K. Defining the sequence specificity of DNA-binding proteins by selecting binding sites from random-sequence oligonucleotides: analysis of yeast GCN4 protein. Mol. Cell. Biol. 9, 2944–2949 (1989).

  8. Sandelin, A., Alkema, W., Engstrom, P., Wasserman, W.W. & Lenhard, B. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 32, D91–D94 (2004).

  9. Liu, J. & Stormo, G.D. Quantitative analysis of EGR proteins binding to DNA: assessing additivity in both the binding site and the protein. BMC Bioinformatics 6, 176 (2005).

  10. Miller, J.C. & Pabo, C.O. Rearrangement of side-chains in a Zif268 mutant highlights the complexities of zinc finger–DNA recognition. J. Mol. Biol. 313, 309–315 (2001).

  11. Bjerve, S. Error bounds for linear combinations of order statistics. Ann. Stat. 5, 357–369 (1977).

  12. Lieb, J.D., Liu, X., Botstein, D. & Brown, P.O. Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nat. Genet. 28, 327–334 (2001).

  13. Harbison, C.T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).

  14. Berg, O.G. & von Hippel, P.H. Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters. J. Mol. Biol. 193, 723–750 (1987).

  15. Bulyk, M.L., Johnson, P.L. & Church, G.M. Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors. Nucleic Acids Res. 30, 1255–1261 (2002).

  16. Jiang, J. & Levine, M. Binding affinities and cooperative interactions with bHLH activators delimit threshold responses to the dorsal gradient morphogen. Cell 72, 741–752 (1993).

  17. Gaudet, J. & Mango, S.E. Regulation of organogenesis by the Caenorhabditis elegans FoxA protein PHA-4. Science 295, 821–825 (2002).

  18. Tanay, A. Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 16, 962–972 (2006).

  19. Warren, C.L. et al. Defining the sequence-recognition profile of DNA-binding molecules. Proc. Natl. Acad. Sci. USA 103, 867–872 (2006).

  20. Singh-Gasson, S. et al. Maskless fabrication of light-directed oligonucleotide microarrays using a digital micromirror array. Nat. Biotechnol. 17, 974–978 (1999).

  21. Blancafort, P., Segal, D.J. & Barbas, C.F., III. Designing transcription factor architectures for drug discovery. Mol. Pharmacol. 66, 1361–1371 (2004).

  22. Philippakis, A.A. et al. Expression-guided in silico evaluation of candidate cis regulatory codes for Drosophila muscle founder cells. PLoS Comput. Biol. 2, e53 (2006).

  23. Braun, P. et al. Proteome-scale purification of human proteins from bacteria. Proc. Natl. Acad. Sci. USA 99, 2654–2659 (2002).

  24. Li, M.Z. & Elledge, S.J. MAGIC, an in vivo genetic method for the rapid construction of recombinant DNA molecules. Nat. Genet. 37, 311–319 (2005).

  25. Dudley, A.M., Aach, J., Steffen, M.A. & Church, G.M. Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc. Natl. Acad. Sci. USA 99, 7554–7559 (2002).

  26. Morton, T.A. & Myszka, D.G. Kinetic analysis of macromolecular interactions using surface plasmon resonance biosensors. Methods Enzymol. 295, 268–294 (1998).

  27. Wilmen, A., Pick, H., Niedenthal, R.K., Sen-Gupta, M. & Hegemann, J.H. The yeast centromere CDEI/Cpf1 complex: differences between in vitro binding and in vivo function. Nucleic Acids Res. 22, 2791–2800 (1994).

  28. Christy, B. & Nathans, D. DNA binding site of the growth factor–inducible protein Zif268. Proc. Natl. Acad. Sci. USA 86, 8737–8741 (1989).

  29. Okkema, P.G. & Fire, A. The Caenorhabditis elegans NK-2 class homeoprotein CEH-22 is involved in combinatorial activation of gene expression in pharyngeal muscle. Development 120, 2175–2186 (1994).

  30. Klemm, J.D., Rould, M.A., Aurora, R., Herr, W. & Pabo, C.O. Crystal structure of the Oct-1 POU domain bound to an octamer site: DNA recognition with tethered DNA-binding modules. Cell 77, 21–32 (1994).

Acknowledgements

We thank T.V.S. Murthy, Leo Brizuela and Josh LaBaer for providing the Cbf1 and Rap1 clones, Gwenael Badis-Breard and Tim Hughes for providing the Zif268 DNA-binding domain clone and Shufen Meng for assistance with the Biacore technology. We also thank Stephen Gisselbrecht, Amy Donner and Rachel McCord for critical reading of the manuscript. This work was funded in part by grants R01 HG003985 and R01 HG003420 from the National Institutes of Health/National Human Genome Research Institute to M.L.B. M.F.B. was supported in part by a National Science Foundation Graduate Research Fellowship. A.A.P. was supported in part by a National Defense Science and Engineering Graduate Fellowship, a National Science Foundation Graduate Research Fellowship and an Athinoula Martinos Fellowship.

Author information

Contributions

M.F.B. participated in the array design, experimental design, analysis of results and drafting of the manuscript, and performed the experiments; A.A.P. conceived the idea of using de Bruijn sequences and participated in the binding site survey, array design, experimental design, analysis of results and drafting of the manuscript; A.M.Q. provided linear feedback shift register expertise and assisted in the array design; F.S.H. participated in the binding site survey; P.W.E. III conceived the concept of the compact universal array; M.L.B. conceived the concept of the compact universal array and participated in the array design, experimental design, analysis of the results and drafting of the manuscript.

Corresponding author

Correspondence to Martha L Bulyk.

Ethics declarations

Competing interests

A subset of the authors (M.L.B., A.A.P. and P.W.E. III) have filed a patent application, through Brigham and Women's Hospital, covering the sequence design described in this paper.

Supplementary information

Supplementary Fig. 1

Survey of binding sites in the JASPAR database. (PDF 570 kb)

Supplementary Fig. 2

Reproducibility of Cy3 dUTP signal intensities. (PDF 512 kb)

Supplementary Fig. 3

Correlation of PBM signal intensities with affinities. (PDF 679 kb)

Supplementary Fig. 4

Effects of binding site position and orientation on PBM signal. (PDF 629 kb)

Supplementary Fig. 5

Comparison of median signal intensities for 28 Zif268 variants for fixed versus variable position, orientation, and flanking sequence. (PDF 654 kb)

Supplementary Fig. 6

Correspondence between median signal intensities for 7-mers on distinct de Bruijn sequences. (PDF 296 kb)

Supplementary Fig. 7

Biacore measurements supporting interdependence between the first two positions of the Cbf1 binding site. (PDF 920 kb)

Supplementary Fig. 8

Minimum number of unique features on an array for different values of k. (PDF 652 kb)

Supplementary Fig. 9

Dependence of Cy3 dUTP incorporation upon sequence context. (PDF 609 kb)

Supplementary Methods (DOC 76 kb)

About this article

Cite this article

Berger, M., Philippakis, A., Qureshi, A. et al. Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat Biotechnol 24, 1429–1435 (2006). https://doi.org/10.1038/nbt1246

  • DOI: https://doi.org/10.1038/nbt1246
