The combination of bioinformatic and biological approaches constitutes a powerful method for identifying gene regulatory elements. High-quality genome sequences are available in public databases for several vertebrate species. Comparative cross-species sequence analysis of these genomes shows considerable conservation of noncoding sequences in DNA. Biological analyses show that an unexpectedly high number of the conserved sequences correspond to functional cis-regulatory regions that influence gene transcription. Because research biologists are often unfamiliar with the bioinformatic resources at their disposal, this commentary discusses how to integrate biological and bioinformatic methods in the discovery of gene regulatory regions and includes a tutorial on widely available comparative genomics programs.
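As a rough illustration of the comparative step described above, the sketch below scans a pairwise alignment of two orthologous regions with a sliding window and reports windows whose percent identity meets a threshold. It is a minimal sketch under stated assumptions, not the method of any particular comparative genomics program. The function names are hypothetical, the two sequences are assumed to be already aligned (gaps written as `-`), and the 100 bp window and 70% identity cutoff are illustrative defaults that should be tuned to the species pair being compared.

```python
def conserved_windows(aligned_a, aligned_b, window=100, min_identity=0.70):
    """Return (start, end, identity) for alignment windows at or above min_identity.

    Hypothetical helper: inputs are two rows of an existing pairwise alignment,
    equal in length, with '-' marking gaps. Coordinates are alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = aligned_a.upper(), aligned_b.upper()
    # 1 where both rows carry the same base; a column with a gap never matches
    matches = [int(x == y and x != "-") for x, y in zip(a, b)]
    hits = []
    if len(matches) < window:
        return hits
    count = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # slide the window one column: add the new column, drop the old one
            count += matches[start + window - 1] - matches[start - 1]
        identity = count / window
        if identity >= min_identity:
            hits.append((start, start + window, identity))
    return hits


def merge_windows(hits):
    """Merge overlapping or adjacent windows into (start, end) intervals."""
    merged = []
    for start, end, _ in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Toy alignment: a conserved block followed by a diverged block
    seq_a = "ACGTACGTAC" + "AAAAAAAAAA"
    seq_b = "ACGTACGTAC" + "CCCCCCCCCC"
    hits = conserved_windows(seq_a, seq_b, window=10, min_identity=0.85)
    print(merge_windows(hits))
```

Intervals reported this way are only candidates. Annotated exons still have to be excluded, and the remaining conserved noncoding intervals need biological testing before they can be called regulatory elements.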
Supplementary information
Supplementary Tutorial
Bioinformatics for the Bench Scientist. (HTM 4 kb)
Cite this article
Nardone, J., Lee, D., Ansel, K. et al. Bioinformatics for the 'bench biologist': how to find regulatory regions in genomic DNA. Nat Immunol 5, 768–774 (2004). https://doi.org/10.1038/ni0804-768
DOI: https://doi.org/10.1038/ni0804-768