  • Letter

Human-mouse genome comparisons to locate regulatory sites

Abstract

Elucidating the human transcriptional regulatory network1 is a challenge of the post-genomic era. Technical progress so far is impressive, including detailed understanding of regulatory mechanisms for at least a few genes in multicellular organisms2,3,4, rapid and precise localization of regulatory regions within extensive regions of DNA by means of cross-species comparison5,6,7, and de novo determination of transcription-factor binding specificities from large-scale yeast expression data8. Here we address two problems involved in extending these results to the human genome: first, it has been unclear how many model organism genomes will be needed to delineate most regulatory regions; and second, the discovery of transcription-factor binding sites (response elements) from expression data has not yet been generalized from single-celled organisms to multicellular organisms. We found that 98% (74/75) of experimentally defined sequence-specific binding sites of skeletal-muscle-specific transcription factors are confined to the 19% of human sequences that are most conserved in the orthologous rodent sequences. We also found that, with this restriction, the binding specificities of all three major muscle-specific transcription factors (MYF, SRF and MEF2) can be computationally identified.
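The restriction described in the abstract (keep only the most conserved fraction of the human sequence, then ask how many known binding sites survive) can be illustrated with a minimal sketch. This is not the authors' pipeline: the fixed window size, the per-window identity scores, the site positions, and the function names below are all hypothetical, chosen only to show the bookkeeping.

```python
from typing import List


def conserved_windows(identities: List[float], top_fraction: float) -> List[bool]:
    """Flag the top `top_fraction` of windows, ranked by identity score."""
    n_keep = round(len(identities) * top_fraction)
    # Rank window indices from most to least conserved.
    ranked = sorted(range(len(identities)), key=lambda i: identities[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [i in keep for i in range(len(identities))]


def fraction_of_sites_retained(sites: List[int], flags: List[bool], window: int) -> float:
    """Fraction of site positions that fall inside a flagged window."""
    hits = sum(1 for pos in sites if flags[pos // window])
    return hits / len(sites)


# Hypothetical toy data: ten 100-bp windows with made-up human-rodent
# identity scores, and five made-up binding-site positions.
identities = [0.35, 0.90, 0.40, 0.30, 0.85, 0.20, 0.45, 0.25, 0.50, 0.10]
sites = [120, 180, 430, 460, 710]

flags = conserved_windows(identities, top_fraction=0.2)
retained = fraction_of_sites_retained(sites, flags, window=100)
print(f"Windows kept: {sum(flags)} of {len(flags)}")
print(f"Sites retained: {retained:.0%}")
```

In this toy case, keeping the two most conserved windows (20% of the sequence) retains four of the five sites. A real analysis would derive conservation from an actual human-rodent alignment rather than from fixed windows.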

Figure 1: Probability of alignment for the sequences flanking the 5′ end of the first exon of natriuretic propeptide (NPPA).
Figure 2: Conservation of genomic sequence between humans and rodents for alignments where length(rodent)/length(human) ≥ 0.5.
Figure 3: Three patterns identified in the 5′ flanking sequences of genes selectively expressed in skeletal muscle.

References

  1. Kadonaga, J.T. Eukaryotic transcription: an interlaced network of transcription factors and chromatin-modifying machines. Cell 92, 307–313 (1998).

  2. Orkin, S.H. Regulation of globin gene expression in erythroid cells. Eur. J. Biochem. 231, 271–281 (1995).

  3. Qin, W. et al. Molecular characterization of the creatine kinases and some historical perspectives. Mol. Cell. Biochem. 184, 153–167 (1998).

  4. Yuh, C.H., Bolouri, H. & Davidson, E.H. Genomic cis-regulatory logic: experimental and computational analysis of a sea urchin gene. Science 279, 1896–1902 (1998).

  5. Aparicio, S. et al. Organization of the Fugu rubripes Hox clusters: evidence for continuing evolution of vertebrate Hox complexes. Nature Genet. 16, 79–83 (1997).

  6. Brickner, A.G., Koop, B.F., Aronow, B.J. & Wiginton, D.A. Genomic sequence comparison of the human and mouse adenosine deaminase gene regions. Mamm. Genome 10, 95–101 (1999).

  7. Hardison, R.C., Oeltjen, J. & Miller, W. Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res. 7, 959–966 (1997).

  8. Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J. & Church, G.M. Systematic determination of genetic network architecture. Nature Genet. 22, 281–285 (1999).

  9. Fickett, J.W. & Wasserman, W.W. Discovery and modeling of transcriptional regulatory regions. Curr. Opin. Biotechnol. 11, 19–24 (2000).

  10. Stormo, G.D. & Fields, D.S. Specificity, free energy and information content in protein-DNA interactions. Trends Biochem. Sci. 23, 109–113 (1998).

  11. Werner, T. Models for prediction and recognition of eukaryotic promoters. Mamm. Genome 10, 168–175 (1999).

  12. Fickett, J.W. & Hatzigeorgiou, A.G. Eukaryotic promoter recognition. Genome Res. 7, 861–878 (1997).

  13. Tronche, F., Ringeisen, F., Blumenfeld, M., Yaniv, M. & Pontoglio, M. Analysis of the distribution of binding sites for a tissue-specific transcription factor in the vertebrate genome. J. Mol. Biol. 266, 231–245 (1997).

  14. Duret, L. & Bucher, P. Searching for regulatory elements in human noncoding sequences. Curr. Opin. Struct. Biol. 7, 399–406 (1997).

  15. Koop, B.F. Human and rodent DNA sequence comparisons: a mosaic model of genomic evolution. Trends Genet. 11, 367–371 (1995).

  16. Wasserman, W.W. & Fickett, J.W. Identification of regulatory regions which confer muscle-specific gene expression. J. Mol. Biol. 278, 167–181 (1998).

  17. Battey, J., Jordan, E., Cox, D. & Dove, W. An action plan for mouse genomics. Nature Genet. 21, 73–75 (1999).

  18. Sonnhammer, E.L. & Durbin, R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167, GC1–10 (1995).

  19. Huang, X.Q., Hardison, R.C. & Miller, W. A space-efficient algorithm for local similarities. Comput. Appl. Biosci. 6, 373–381 (1990).

  20. Zhu, J., Liu, J.S. & Lawrence, C.E. Bayesian adaptive sequence alignment algorithms. Bioinformatics 14, 25–39 (1998).

  21. Lania, L., Majello, B. & De Luca, P. Transcriptional regulation by the Sp family proteins. Int. J. Biochem. Cell Biol. 29, 1313–1323 (1997).

  22. Scherf, M., Klingenhoff, A. & Werner, T. Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach. J. Mol. Biol. 297, 599–606 (2000).

  23. Lawrence, C.E. et al. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262, 208–214 (1993).

  24. Liu, J.S., Neuwald, A.F. & Lawrence, C.E. Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Am. Stat. Assoc. 90, 1156–1170 (1995).

  25. Spellman, P.T. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9, 3273–3297 (1998).

  26. Sankoff, D. & Cedergren, R.J. A test for nucleotide sequence homology. J. Mol. Biol. 77, 159–164 (1973).

  27. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  28. Agarwal, P. & States, D.J. A Bayesian evolutionary distance for parametrically aligned sequences. J. Comput. Biol. 3, 1–17 (1996).

  29. Liu, J.S. & Lawrence, C.E. Bayesian inference on biopolymer models. Bioinformatics 15, 38–52 (1999).

  30. Wootton, J.C. & Federhen, S. Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 266, 554–571 (1996).

Acknowledgements

We thank our colleagues at SmithKline Beecham and the Wadsworth Center for input, and the Computational Molecular Biology Core at the Wadsworth Center and I. Auger for assistance. This work was supported by grants from the NIH to J.W.F. (NHGRI R01 HG00981-03) and C.E.L. (NHGRI R01 HG01257).

Author information

Corresponding author

Correspondence to Charles E. Lawrence.

About this article

Cite this article

Wasserman, W., Palumbo, M., Thompson, W. et al. Human-mouse genome comparisons to locate regulatory sites. Nat Genet 26, 225–228 (2000). https://doi.org/10.1038/79965

  • DOI: https://doi.org/10.1038/79965
