Sequencing and comparison of yeast species to identify genes and regulatory elements


Identifying the functional elements encoded in a genome is one of the principal challenges in modern biology. Comparative genomics should offer a powerful, general approach. Here, we present a comparative analysis of the yeast Saccharomyces cerevisiae based on high-quality draft sequences of three related species (S. paradoxus, S. mikatae and S. bayanus). We first aligned the genomes and characterized their evolution, defining the regions and mechanisms of change. We then developed methods for direct identification of genes and regulatory motifs. The gene analysis yielded a major revision to the yeast gene catalogue, affecting approximately 15% of all genes and reducing the total count by about 500 genes. The motif analysis automatically identified 72 genome-wide elements, including most known regulatory motifs and numerous new motifs. We inferred a putative function for most of these motifs, and provided insights into their combinatorial interactions. The results have implications for genome analysis of diverse organisms, including the human.

Figure 1: Aligned ORFs across four species. A 50-kb segment of S. cerevisiae chromosome VII aligned with orthologous contigs from each of the other three species.
Figure 2: Genome evolution.
Figure 3: Evolutionary tree of the four yeast species.
Figure 4: Spurious ORF rejected by RFC test.
Figure 5: Examples of proposed changes in gene structure.
Figure 6: Conservation in the GAL1-GAL10 intergenic region.
Figure 7: Distribution of motifs by conservation score.


We thank D. Botstein, M. Cherry, K. Dolinski, D. Fisk, S. Weng and other members of the Saccharomyces Genome Database staff for assistance with SGD, for making our data available to the community through SGD, and for discussions; J. Butler, S. Calvo, J. Galagan, D. Jaffe, J. Lehar and L. Jun Ma for technical advice and discussions; the staff of the Whitehead/MIT Center for Genome Research Sequencing Center who generated the shotgun sequence from the three yeast species; T. Lee, N. Rinaldi, R. Young and J. Zeitlinger for sharing data about chromatin immunoprecipitation experiments and for discussions; M. Eisen and A. Gasch for sharing information about gene expression clusters and for discussions; E. Louis and I. Roberts for providing yeast strains and discussions; B. Berger, G. Fink, D. Gifford, S. Lindquist and H. True-Krobb for discussions; and L. Gaffney for assistance with figures.

Supplementary information

Supplementary Figure 1: nucleotide alignment for Figure 4 (PDF 124 kb)

Supplementary Figure 2: nucleotide alignment for Figures 5a, 5b, 5c (PDF 890 kb)

Supplementary Figure 3: nucleotide alignment for Figures 5d, 5e (PDF 1548 kb)

Supplementary methods and index to author’s website (DOC 71 kb)

