Abstract
This protocol shows how to access the Regulatory Sequence Analysis Tools (RSAT) via a programmatic interface in order to automate the analysis of multiple data sets. We describe the steps for writing a Perl client that connects to the RSAT Web services and implements a workflow to discover putative cis-acting elements in promoters of gene clusters. In the presented example, we apply this workflow to lists of transcription factor target genes resulting from ChIP-chip experiments. For each factor, the protocol predicts the binding motifs by detecting significantly overrepresented hexanucleotides in the target promoters and generates a feature map that displays the positions of putative binding sites along the promoter sequences. This protocol is intended for bioinformaticians and biologists with programming skills (basic knowledge of Perl). Running time is ∼6 min on the example data set.
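To make the idea of "significantly overrepresented hexanucleotides" concrete, the following is a minimal, self-contained sketch in Python. It is not the RSAT implementation and does not call the RSAT Web services. It assumes a uniform background (each hexanucleotide expected with probability 1/4^6 unless a frequency is supplied), counts words on a single strand, and scores each word with a binomial upper-tail probability. The function names, the `alpha` threshold and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import exp, lgamma, log


def count_hexamers(sequences, k=6):
    """Count every overlapping k-mer (hexanucleotides by default), single strand."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            # Skip words containing ambiguous bases such as N.
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))


def overrepresented(sequences, background_freq=None, k=6, alpha=1e-4):
    """Return (word, occurrences, p-value) for words with p-value below alpha.

    background_freq is an optional dict of expected word frequencies;
    missing words fall back to the uniform assumption 1 / 4**k.
    """
    background_freq = background_freq or {}
    counts = count_hexamers(sequences, k)
    total = sum(counts.values())
    hits = []
    for word, occ in counts.items():
        p = background_freq.get(word, 1 / 4 ** k)
        pval = binomial_tail(total, occ, p)
        if pval < alpha:
            hits.append((word, occ, pval))
    return sorted(hits, key=lambda hit: hit[2])


# Toy "promoters" sharing one planted word (made-up data, not from the protocol).
toy_promoters = ["TTCACGTGAA", "GGCACGTGCC", "AACACGTGTA"]
for word, occ, pval in overrepresented(toy_promoters):
    print(word, occ, pval)
```

In a real analysis the background frequencies would come from a calibrated model (for example, word frequencies in all promoters of the organism) rather than the uniform assumption used here, and the threshold would need to account for the number of words tested.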
Acknowledgements
This work was supported by the BioSapiens Network of Excellence funded under the sixth Framework program of the European Communities (LSHG-CT-2003-503265) (O.S. postdoc grant, E.V. research fellowship), the Vrije Universiteit Brussel (Geconcerteerde Onderzoeksactie 29) (M.T.-C. PhD grant) and by the Belgian Program on Interuniversity Attraction Poles, initiated by the Belgian Federal Science Policy Office, project P6/25 (BioMaGNet).
Supplementary information
Supplementary Figure 1
Full version of the feature map showing the positions of the significant hexanucleotides found in promoters of genes from the BAS1_YPD cluster. (PDF 82 kb)
About this article
Cite this article
Sand, O., Thomas-Chollier, M., Vervisch, E. et al. Analyzing multiple data sets by interconnecting RSAT programs via SOAP Web services—an example with ChIP-chip data. Nat Protoc 3, 1604–1615 (2008). https://doi.org/10.1038/nprot.2008.99
DOI: https://doi.org/10.1038/nprot.2008.99