Abstract
Chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP–array) has become a popular procedure for studying genome-wide protein–DNA interactions and transcription regulation. However, it can map the probable protein–DNA interaction loci only to a resolution of 1–2 kilobases. To pinpoint interaction sites down to the base-pair level, we introduce a computational method, Motif Discovery scan (MDscan), that examines the ChIP–array-selected sequences and searches for DNA sequence motifs representing the protein–DNA interaction sites. MDscan combines the advantages of two widely adopted motif search strategies, word enumeration [1–4] and position-specific weight matrix updating [5–9], and incorporates the ChIP–array ranking information to accelerate searches and enhance their success rates. MDscan correctly identified all the experimentally verified motifs from published ChIP–array experiments in yeast [10–13] (STE12, GAL4, RAP1, SCB, MCB, MCM1, SFF, and SWI5), and predicted two motif patterns for the differential binding of Rap1 protein in telomere regions. In our studies, the method was faster and more accurate than several established motif-finding algorithms [5,8,9]. MDscan can be used to find DNA motifs not only in ChIP–array experiments but also in other experiments in which a subgroup of the sequences can be inferred to contain relatively abundant motif sites. The MDscan web server can be accessed at http://BioProspector.stanford.edu/MDscan/.
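The abstract names the two ingredients (word enumeration and position-specific weight matrix updating) and the use of ChIP–array ranking, but not the details. The following Python sketch illustrates how such a combination could look in principle; it is not the MDscan implementation. The function names, the log-odds scoring with a uniform background, the pseudocount, and the `min_match` and `min_fraction` parameters are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"  # assumption: sequences contain only uppercase A/C/G/T


def windows(seq, w):
    """Yield every length-w word in seq, left to right."""
    for i in range(len(seq) - w + 1):
        yield seq[i:i + w]


def weight_matrix(sites, pseudo=0.5, background=0.25):
    """Log-odds weight matrix from aligned sites (uniform background assumed)."""
    total = len(sites) + 4 * pseudo
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({b: math.log2((counts[b] + pseudo) / total / background)
                       for b in BASES})
    return matrix


def score(word, matrix):
    """Sum of per-position log-odds for one word."""
    return sum(col[base] for col, base in zip(matrix, word))


def find_motif(ranked_seqs, w, n_top, min_match, min_fraction=0.6):
    """Hypothetical two-stage search over sequences sorted best-ranked first."""
    top, rest = ranked_seqs[:n_top], ranked_seqs[n_top:]
    top_words = [word for seq in top for word in windows(seq, w)]
    if not top_words:
        return "", []

    # Stage 1 (word enumeration): each word in the top-ranked sequences acts
    # as a seed and collects all top-sequence words matching it at
    # >= min_match positions. Candidates are compared by the summed score of
    # their sites (an illustrative criterion, not the published one).
    best_sites, best_total = [], -math.inf
    for seed in dict.fromkeys(top_words):  # unique seeds, first-seen order
        sites = [x for x in top_words
                 if sum(a == b for a, b in zip(seed, x)) >= min_match]
        matrix = weight_matrix(sites)
        total = sum(score(s, matrix) for s in sites)
        if total > best_total:
            best_sites, best_total = sites, total

    # Stage 2 (matrix updating): walk the lower-ranked sequences in rank order
    # and add each sequence's best window if it scores close enough to the
    # best score the current matrix can give.
    sites = list(best_sites)
    for seq in rest:
        matrix = weight_matrix(sites)
        cutoff = min_fraction * sum(max(col.values()) for col in matrix)
        word = max(windows(seq, w), key=lambda x: score(x, matrix), default=None)
        if word is not None and score(word, matrix) >= cutoff:
            sites.append(word)

    final = weight_matrix(sites)
    consensus = "".join(max(BASES, key=lambda b: col[b]) for col in final)
    return consensus, sites


# Toy input: made-up sequences, listed from highest to lowest rank.
ranked = ["CTGAAACAG", "ATGAAACAT", "GTGAAACAC", "GGTGAAACATT", "CCGCGGCGCC"]
consensus, sites = find_motif(ranked, w=7, n_top=3, min_match=6)
print(consensus, len(sites))
```

Restricting seed enumeration to the top-ranked sequences keeps the candidate set small, which is one plausible way ranking information could speed up a search; the lower-ranked sequences then only refine a matrix that already exists.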
References
1. van Helden, J., Andre, B. & Collado-Vides, J. Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. J. Mol. Biol. 281, 827–842 (1998).
2. Bussemaker, H.J., Li, H. & Siggia, E.D. Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis. Proc. Natl. Acad. Sci. USA 97, 10096–10100 (2000).
3. Sinha, S. & Tompa, M. A statistical method for finding transcription factor binding sites. Proc. Int. Conf. Intell. Syst. Mol. Biol. 8, 344–354 (2000).
4. Vilo, J., Brazma, A., Jonassen, I., Robinson, A. & Ukkonen, E. Mining for putative regulatory elements in the yeast genome using gene expression data. Proc. Int. Conf. Intell. Syst. Mol. Biol. 8, 384–394 (2000).
5. Hertz, G.Z., Hartzell, G.W. & Stormo, G.D. Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Comput. Appl. Biosci. 6, 81–92 (1990).
6. Bailey, T.L. & Elkan, C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2, 28–36 (1994).
7. Liu, J.S., Neuwald, A.F. & Lawrence, C.E. Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Am. Stat. Assoc. 90, 1156–1170 (1995).
8. Roth, F.P., Hughes, J.D., Estep, P.W. & Church, G.M. Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat. Biotechnol. 16, 939–945 (1998).
9. Liu, X., Brutlag, D.L. & Liu, J.S. BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac. Symp. Biocomput. 127–138 (2001).
10. Ren, B. et al. Genome-wide location and function of DNA binding proteins. Science 290, 2306–2309 (2000).
11. Iyer, V.R. et al. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409, 533–538 (2001).
12. Lieb, J.D., Liu, X., Botstein, D. & Brown, P.O. Promoter-specific binding of Rap1p revealed by genome-wide maps of protein-DNA association. Nat. Genet. 28, 327–334 (2001).
13. Simon, I. et al. Serial regulation of transcriptional regulators in the yeast cell cycle. Cell 106, 697–708 (2001).
14. Dolan, J.W., Kirkman, C. & Fields, S. The yeast STE12 protein binds to the DNA sequence mediating pheromone induction. Proc. Natl. Acad. Sci. USA 86, 5703–5707 (1989).
15. Graham, I.R. & Chambers, A. Use of a selection technique to identify the diversity of binding sites for the yeast RAP1 transcription factor. Nucleic Acids Res. 22, 124–130 (1994).
16. Buchman, A.R., Kimmerly, W.J., Rine, J. & Kornberg, R.D. Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae. Mol. Cell Biol. 8, 210–225 (1988).
17. Idrissi, F.Z. & Pina, B. Functional divergence between the half-sites of the DNA-binding sequence for the yeast transcriptional regulator Rap1p. Biochem. J. 341, 477–482 (1999).
18. Spellman, P.T. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9, 3273–3297 (1998).
19. Lawrence, C.E. et al. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262, 208–214 (1993).
20. Liu, J.S. Monte Carlo Strategies in Scientific Computing (Springer, New York, 2001).
Acknowledgements
The authors thank the Brown lab at Stanford (especially Jason D. Lieb) and the Young lab at MIT (especially Bing Ren) for their valuable data and scientific insight. This work is supported by National Human Genome Research Institute grants R01 HGF02235 and R01 HG02518-01, and National Science Foundation grant DMS-0094613.
About this article
Cite this article
Liu, X., Brutlag, D. & Liu, J. An algorithm for finding protein–DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat Biotechnol 20, 835–839 (2002). https://doi.org/10.1038/nbt717
DOI: https://doi.org/10.1038/nbt717
This article is cited by
- Curation, inference, and assessment of a globally reconstructed gene regulatory network for Streptomyces coelicolor. Scientific Reports (2022)
- Genome-scale exploration of transcriptional regulation in the nisin Z producer Lactococcus lactis subsp. lactis IO-1. Scientific Reports (2020)
- An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes. BMC Genomics (2016)
- UpCoT: an integrated pipeline tool for clustering upstream DNA sequences of orthologous genes in prokaryotic genomes. 3 Biotech (2016)
- Visualizing translocation dynamics and nascent transcript errors in paused RNA polymerases in vivo. Genome Biology (2015)