Abstract
Genome-wide location analysis has become a standard technology to unravel gene regulation networks. The accurate characterization of nucleotide signatures in sequences is key to uncovering the regulatory logic but remains a computational challenge. This protocol describes how best to characterize these signatures (motifs) using the new standalone version of Trawler, which was designed and optimized to analyze chromatin immunoprecipitation (ChIP) data sets. In particular, we describe the three main steps of Trawler_standalone (motif discovery, clustering and visualization) and discuss the appropriate parameters to use in each step, depending on the data set and the biological questions addressed. Compared with five other motif discovery programs, Trawler_standalone is in most cases the fastest algorithm to accurately predict the correct motifs, especially for large data sets. Its running time ranges from a few seconds to several minutes, depending on the size of the data set and the parameters used. This protocol is best suited for bioinformaticians seeking to use Trawler_standalone in a high-throughput manner.
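To make the notion of an "overrepresented" motif concrete, the following is a minimal Python sketch of the general idea: count fixed-length words (k-mers) in a sample set and a background set, then score each word by how far its sample count exceeds what the background frequency predicts. This is not the Trawler_standalone algorithm or its interface. The function names, the binomial z-score, the pseudocount and the default thresholds are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def count_kmers(sequences, k):
    """Count every overlapping k-mer in a list of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented_kmers(sample, background, k=6, min_z=3.0):
    """Return (k-mer, observed count, z-score) tuples, highest z-score first.

    Illustrative assumption: k-mer counts in the sample are treated as
    binomial draws whose success probability is the background frequency.
    """
    sample_counts = count_kmers(sample, k)
    background_counts = count_kmers(background, k)
    n_sample = sum(sample_counts.values())
    n_background = sum(background_counts.values())
    if n_sample == 0 or n_background == 0:
        return []

    hits = []
    for kmer, observed in sample_counts.items():
        # Pseudocount keeps the probability strictly between 0 and 1,
        # so k-mers absent from the background can still be scored.
        p = (background_counts[kmer] + 1) / (n_background + 2)
        expected = n_sample * p
        sd = sqrt(n_sample * p * (1 - p))
        z = (observed - expected) / sd
        if z >= min_z:
            hits.append((kmer, observed, z))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    # Toy data: the sample sequences share the word TGACGTCA, the background does not.
    sample = ["TTGACGTCATT", "GGTGACGTCAGG", "CCTGACGTCACC"]
    background = ["ATATATATATATATATATAT", "GCGCGCGCGCGCGCGCGCGC", "AAAACCCCGGGGTTTTAAAA"]
    for kmer, observed, z in overrepresented_kmers(sample, background):
        print(f"{kmer}\t{observed}\t{z:.2f}")
```

In a real analysis the background set would be much larger and chosen to match the sample in properties such as sequence length and base composition. Overlapping words that score well would then be grouped into motifs, which is the role of the clustering step named in the abstract.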
Acknowledgements
We thank the Wittbrodt lab for fruitful discussions and Florence Besse, Dirk-Dominik Dolle and Mythily Ganapathi for testing Trawler_standalone. This work was supported by FP7-CISSTEM.
Author information
Contributions
Y.H., with the help of M.R., built the standalone distribution; Y.H., M.R., B.P. and L.E. contributed to improving the algorithm; and Y.H., M.R., B.P. and L.E. wrote the paper.
About this article
Cite this article
Haudry, Y., Ramialison, M., Paten, B. et al. Using Trawler_standalone to discover overrepresented motifs in DNA and RNA sequences derived from various experiments including chromatin immunoprecipitation. Nat Protoc 5, 323–334 (2010). https://doi.org/10.1038/nprot.2009.158
DOI: https://doi.org/10.1038/nprot.2009.158