Nature Genetics
26, 225 - 228 (2000)
doi:10.1038/79965
Human-mouse genome comparisons to locate regulatory sites
Wyeth W. Wasserman1,3, Michael Palumbo2, William Thompson2, James W. Fickett1 & Charles E. Lawrence2
1 Bioinformatics Group, SmithKline Beecham Pharmaceuticals, King of Prussia, Pennsylvania, USA.
2 Wadsworth Center, New York State Department of Health, Empire State Plaza, PO Box 509, Albany, New York, USA.
3 Present address: Center for Genomics Research, Karolinska Institutet, Stockholm, Sweden.
Correspondence should be addressed to Charles E. Lawrence (lawrence@wadsworth.org).

Elucidating the human transcriptional regulatory network [1]
is a challenge of the post-genomic era. Technical progress so far is impressive,
including detailed understanding of regulatory mechanisms for at least a few
genes in multicellular organisms [2, 3, 4], rapid and precise localization
of regulatory regions within extensive regions of DNA by means of cross-species
comparison [5, 6, 7], and de novo determination of transcription-factor
binding specificities from large-scale yeast expression data [8].
Here we address two problems involved in extending these results to the human
genome: first, it has been unclear how many model organism genomes will be
needed to delineate most regulatory regions; and second, the discovery of
transcription-factor binding sites (response elements) from expression data
has not yet been generalized from single-celled organisms to multicellular
organisms. We found that 98% (74/75) of experimentally defined sequence-specific
binding sites of skeletal-muscle-specific transcription factors are confined
to the 19% of human sequences that are most conserved in the orthologous rodent
sequences. We also found that, with this restriction, the binding specificities
of all three major muscle-specific transcription factors (MYF, SRF and MEF2)
can be computationally identified.