Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules


This protocol shows how to detect putative cis-regulatory elements, and regions enriched in such elements, with the Regulatory Sequence Analysis Tools (RSAT) web server. The approach applies to known transcription factors whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P values and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a case study, the upstream sequence of the Drosophila melanogaster gene even-skipped. The protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
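To make the idea of matrix-based scanning concrete, the sketch below scores every window of a short sequence against a position-specific scoring matrix and attaches a P value to a score. It is an illustration only, not the matrix-scan implementation: the count matrix, the equiprobable background, the pseudo-count of 1 and all function names are assumptions chosen for the example, and the P value is obtained by exhaustive enumeration of all words, which is only practical for very short motifs.

```python
import itertools
import math

# Hypothetical count matrix for a 4-bp motif (one list per base, one entry
# per motif position). These numbers are invented for illustration.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}
# Assumed background model: independent, equiprobable bases.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def weight_matrix(counts, background, pseudo=1.0):
    """Convert counts to log-odds weights, one dict per motif position."""
    width = len(counts["A"])
    weights = []
    for i in range(width):
        total = sum(counts[base][i] for base in counts)
        column = {}
        for base in counts:
            # The pseudo-count is shared among bases according to the background.
            freq = (counts[base][i] + pseudo * background[base]) / (total + pseudo)
            column[base] = math.log(freq / background[base])
        weights.append(column)
    return weights


def score(word, weights):
    """Sum the position-specific weights of an uppercase A/C/G/T word."""
    return sum(weights[i][base] for i, base in enumerate(word))


def scan(sequence, weights):
    """Return (start, site, score) for every window of the sequence."""
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        hits.append((start, site, score(site, weights)))
    return hits


def p_value(threshold, weights, background):
    """Probability that a background word scores at least `threshold`.

    Computed by enumerating all 4**width words, so only for short motifs.
    """
    total = 0.0
    for word in itertools.product("ACGT", repeat=len(weights)):
        if score(word, weights) >= threshold:
            total += math.prod(background[base] for base in word)
    return total


WEIGHTS = weight_matrix(COUNTS, BACKGROUND)
hits = scan("TTACGTAA", WEIGHTS)
best = max(hits, key=lambda hit: hit[2])
print(best[0], best[1], round(best[2], 2), p_value(best[2], WEIGHTS, BACKGROUND))
```

In this toy case the top-scoring window is the consensus word ACGT at offset 2, and it is the only word reaching that score, so its P value under the assumed background is 1/256. Lowering the score threshold admits more words and raises the P value, which is why a threshold on the P value gives direct control over the expected rate of false predictions.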

Figure 1: Representations of the binding specificity for the Krüppel transcription factor of Drosophila melanogaster.
Figure 2: Flowchart showing the links between the Regulatory Sequence Analysis Tools (RSAT) programs used in the protocol.
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7: A matrix-scan result for the detection of individual sites.
Figure 8: Feature-maps for the even-skipped example.


This work was supported by the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture, FRIA (J.-V.T. PhD grant), the Vrije Universiteit Brussel (Geconcerteerde Onderzoeksactie 29) (M.T.-C. PhD grant), and by the BioSapiens Network of Excellence funded under the sixth Framework program of the European Communities (LSHG-CT-2003-503265). The postdoctoral grant of M.D. was funded by the Belgian Program on Interuniversity Attraction Poles, initiated by the Belgian Federal Science Policy Office, project P6/25 (BioMaGNet).

Correspondence to Jacques van Helden.

Turatsinze, JV., Thomas-Chollier, M., Defrance, M. et al. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules. Nat Protoc 3, 1578–1588 (2008).

