  • Review Article

Treasures and traps in genome-wide data sets: case examples from yeast

Key Points

  • The budding yeast Saccharomyces cerevisiae, because of its compact genome and tractable genetics, has served as a prime model organism for developing and applying large-scale genomic technologies.

  • During the past few years, several comparable genome-wide functional studies have been published. In many cases, the majority of results can be confirmed by completely independent methods, but it has also become apparent that there are limitations and uncertainties associated with these technologies.

  • These problems and possible solutions are discussed by highlighting yeast work in which gene expression, protein abundance, protein–DNA interactions, DNA replication, protein interactions and genome-wide mutagenesis have been investigated.

  • Each method has its own drawbacks and might not be comprehensive.

  • Differences in the genetic background of strains, in experimental set-ups and in analytical methods contribute to inconsistencies between databases. Technical difficulties and experimental variation are also the cause of false-positive and false-negative results. Because of the work involved, some large-scale studies have not been replicated, and appropriate controls might not be feasible. In some cases, it might be difficult to judge the rate of erroneous results.

  • The integration of genome-wide data sets increases fidelity and generates the most relevant information. Computational approaches towards this aim are discussed.
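As a minimal sketch of the integration idea in the last key point, the snippet below keeps only protein–protein interactions that are reported by two independent data sets. The protein names, the two toy lists and the function names are hypothetical placeholders and are not taken from any study discussed in the review. Simple intersection is only one of several possible integration strategies.

```python
# Minimal sketch: keep interactions supported by two independent data sets.
# All identifiers and data below are hypothetical, for illustration only.

def normalize(pairs):
    """Treat each interaction as unordered, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}

def supported_by_both(pairs_a, pairs_b):
    """Return the interactions that appear in both data sets."""
    return normalize(pairs_a) & normalize(pairs_b)

# Toy data standing in for two large-scale screens.
two_hybrid = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
complex_purification = [("P2", "P1"), ("P3", "P6"), ("P5", "P4")]

shared = supported_by_both(two_hybrid, complex_purification)

# Fraction of all reported interactions that both screens agree on.
all_reported = normalize(two_hybrid) | normalize(complex_purification)
overlap_fraction = len(shared) / len(all_reported)

print(sorted(sorted(pair) for pair in shared))
print(overlap_fraction)
```

Storing each pair as a `frozenset` makes the comparison independent of which protein was listed first, which matters when the two screens report their pairs in different orientations.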

Abstract

Since the publication of the Saccharomyces cerevisiae genome sequence, much effort has been dedicated to developing high-throughput techniques to generate comprehensive information about the function and dynamics of all genes in this yeast's genome. These techniques have generated data sets that typically contain large amounts of reliable and valuable biological information. Nevertheless, there are also uncertainties that are associated with such large-scale studies, which we discuss in this review. These uncertainties increase with the complexity of the organism under study. On the basis of the results from yeast, we should learn much from human and mouse genomic data sets. However, as with yeast data sets, they might also contain misleading results.

Figure 1: Strain differences in Saccharomyces cerevisiae.
Figure 2: Detecting the interaction sites of DNA-binding proteins by chromatin immunoprecipitation.
Figure 3: Identifying origins of replication by measuring the timing and progression of DNA replication.
Figure 4: Measuring protein–protein interactions.
Figure 5: Method used to construct S. cerevisiae deletion strains systematically.

Acknowledgements

E.A.W. is supported by a New Scholars Award from The Ellison Medical Foundation. B.G. is supported by a fellowship from the Swiss National Science Foundation.

Author information

Corresponding author

Correspondence to Elizabeth A. Winzeler.

Related links

DATABASES

Saccharomyces Genome Database

Cln1

Gal4

NEJ1

RNR1

RNR3

Ste12

FURTHER INFORMATION

Munich Information Centre for Protein Sciences — Comprehensive Yeast Genome Database

Saccharomyces Genome Database

Saccharomyces Genome Deletion Project

Transposon-Insertion Phenotypes, Localization and Expression in Saccharomyces (TRIPLES)

Yeast Proteome Database

Glossary

TRANSCRIPTOME

The entire mRNA complement that is expressed by an organism. The expression of mRNA transcripts is dynamic and changes under different conditions.

TWO-DIMENSIONAL GEL

A gel-electrophoresis method by which proteins are separated by charge in the first dimension and by size in the second.

MASS SPECTROMETRY

A technique that provides accurate information about the molecular mass of complex molecules. It can identify extremely small amounts of proteins by their mass-fragment spectra.

CODON-USAGE BIAS

A measurement that takes into account the frequency of less commonly used alternative codons in a protein-coding sequence.

LIQUID CHROMATOGRAPHY

A method that separates molecules from mixtures by selective adsorption.

PHEROMONE

A molecule that is produced for signalling between individuals of the same species.

AUTONOMOUS REPLICATION SEQUENCE

(ARS). A consensus sequence that is necessary, but not sufficient, to allow the replication and maintenance of an otherwise non-replicating plasmid in yeast.

MESELSON–STAHL EXPERIMENT

The experiment that was originally used to prove semiconservative DNA replication. Cells were first grown in a medium that contains heavy isotopes and then moved to a medium that contains light isotopes. The newly replicated DNA was analysed by density-gradient centrifugation. Because only heavy–light hybrid DNA was observed after one round of replication, rather than fully heavy or fully light DNA, it was concluded that DNA replicates semiconservatively.

BAIT

In a yeast two-hybrid approach, this is the protein that is fused to the Gal4 DNA-binding domain.

PREY

In a yeast two-hybrid approach, this is the protein that is fused to the Gal4 activating domain.

ORTHOLOGUE

A homologous gene that originated through speciation.

STOICHIOMETRIC

Describes the molar ratio of interacting molecules.

TETRAD DISSECTION

Phenotypic analysis of the four haploid spores that are produced by the meiotic division of a single diploid cell in yeast. In Saccharomyces cerevisiae, the four meiotic products (spores) are enclosed in a single ascus. Because they are physically associated with one another in this 'tetrad', they can be dissected away from each other, allowing phenotypic segregation ratios to be scored precisely.

SYNTHETIC INTERACTIONS

These occur when a double mutant has a phenotype that is more severe or milder than the phenotype of either single-mutant parent. For suppressors (synthetic viable), the double mutant is viable when at least one of the single mutants is not. For synthetic-lethal mutants, the double mutant is inviable under conditions in which both parental mutations are viable.
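The viability rules in this definition can be written as a small decision function. This is only an illustrative sketch: the function name and the returned labels are hypothetical, and it covers just the viable/inviable cases named above, not graded phenotypes that are merely more severe or milder.

```python
# Sketch of the viability rules in the glossary definition.
# Function name and labels are hypothetical, for illustration only.

def classify_double_mutant(single_a_viable, single_b_viable, double_viable):
    """Classify a double mutant from the viability of each genotype."""
    both_singles_viable = single_a_viable and single_b_viable
    if both_singles_viable and not double_viable:
        # Both parental mutations are viable, the combination is not.
        return "synthetic lethal"
    if double_viable and not both_singles_viable:
        # The double mutant survives although at least one single does not.
        return "synthetic viable (suppression)"
    return "no synthetic viability interaction"

print(classify_double_mutant(True, True, False))
print(classify_double_mutant(False, True, True))
```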

COLLATERAL MUTATION

A random second-site mutation that might have been introduced inadvertently during site-directed mutagenesis.

ANEUPLOIDY

The presence of extra copies, or no copies, of some chromosomes.

About this article

Cite this article

Grünenfelder, B., Winzeler, E. Treasures and traps in genome-wide data sets: case examples from yeast. Nat Rev Genet 3, 653–661 (2002). https://doi.org/10.1038/nrg886

  • DOI: https://doi.org/10.1038/nrg886
