Sequencing and comparison of yeast species to identify genes and regulatory elements

Abstract

Identifying the functional elements encoded in a genome is one of the principal challenges in modern biology. Comparative genomics should offer a powerful, general approach. Here, we present a comparative analysis of the yeast Saccharomyces cerevisiae based on high-quality draft sequences of three related species (S. paradoxus, S. mikatae and S. bayanus). We first aligned the genomes and characterized their evolution, defining the regions and mechanisms of change. We then developed methods for direct identification of genes and regulatory motifs. The gene analysis yielded a major revision to the yeast gene catalogue, affecting approximately 15% of all genes and reducing the total count by about 500 genes. The motif analysis automatically identified 72 genome-wide elements, including most known regulatory motifs and numerous new motifs. We inferred a putative function for most of these motifs, and provided insights into their combinatorial interactions. The results have implications for genome analysis of diverse organisms, including the human.

Figure 1: Aligned ORFs across four species. A 50-kb segment of S. cerevisiae chromosome VII aligned with orthologous contigs from each of the other three species.
Figure 2: Genome evolution.
Figure 3: Evolutionary tree of the four yeast species.
Figure 4: Spurious ORF rejected by RFC test.
Figure 5: Examples of proposed changes in gene structure.
Figure 6: Conservation in the GAL1–GAL10 intergenic region.
Figure 7: Distribution of motifs by conservation score.

Acknowledgements

We thank D. Botstein, M. Cherry, K. Dolinski, D. Fisk, S. Weng and other members of the Saccharomyces Genome Database staff for assistance with SGD, for making our data available to the community through SGD, and for discussions; J. Butler, S. Calvo, J. Galagan, D. Jaffe, J. Lehar and L. Jun Ma for technical advice and discussions; the staff of the Whitehead/MIT Center for Genome Research Sequencing Center who generated the shotgun sequence from the three yeast species; T. Lee, N. Rinaldi, R. Young and J. Zeitlinger for sharing data about chromatin immunoprecipitation experiments and for discussions; M. Eisen and A. Gasch for sharing information about gene expression clusters and for discussions; E. Louis and I. Roberts for providing yeast strains and discussions; B. Berger, G. Fink, D. Gifford, S. Lindquist and H. True-Krobb for discussions; and L. Gaffney for assistance with figures.

Author information

Corresponding authors

Correspondence to Manolis Kellis or Eric S. Lander.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

About this article

Cite this article

Kellis, M., Patterson, N., Endrizzi, M. et al. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 241–254 (2003). https://doi.org/10.1038/nature01644

  • DOI: https://doi.org/10.1038/nature01644
