How can we computationally extract an unknown motif from a set of target sequences? What are the principles behind the major motif discovery algorithms? Which of these should we use, and how do we know we've found a 'real' motif?
Cite this article
D'haeseleer, P. How does DNA sequence motif discovery work? Nat Biotechnol 24, 959–961 (2006). https://doi.org/10.1038/nbt0806-959