Abstract
The availability of over 20 fully sequenced genomes has driven the development of new methods to find protein function and interactions. Here we group proteins by correlated evolution¹, correlated messenger RNA expression patterns² and patterns of domain fusion³ to determine functional relationships among the 6,217 proteins of the yeast Saccharomyces cerevisiae. Using these methods, we discover over 93,000 pairwise links between functionally related yeast proteins. Links between characterized and uncharacterized proteins allow a general function to be assigned to more than half of the 2,557 previously uncharacterized yeast proteins. Examples of functional links are given for a protein family of previously unknown function, a protein whose human homologues are implicated in colon cancer, and the yeast prion Sup35.
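As a rough illustration of how pairwise links from the three kinds of evidence named in the abstract might be pooled, the following Python sketch links toy proteins by identical phylogenetic profiles, by strongly correlated expression, and by a supplied list of domain-fusion pairs. Everything in it is an assumption made for illustration: the protein names, the data, the exact-match rule for profiles, the 0.9 correlation cutoff, and the idea of recording which methods support each pair are not taken from the paper.

```python
from itertools import combinations
from math import sqrt

# Toy data, invented for illustration only (not from the paper).
# Phylogenetic profile: presence (1) or absence (0) of a homologue
# in each of four hypothetical genomes.
profiles = {
    "P1": (1, 0, 1, 1),
    "P2": (1, 0, 1, 1),
    "P3": (0, 1, 0, 1),
    "P4": (0, 1, 1, 1),
}

# mRNA expression levels across five hypothetical conditions.
expression = {
    "P1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "P2": [5.0, 1.0, 4.0, 2.0, 3.0],
    "P3": [2.0, 4.1, 5.9, 8.0, 10.1],
    "P4": [3.0, 3.0, 2.0, 1.0, 3.0],
}

# Pairs assumed to have homologues fused into one protein elsewhere.
fusion_pairs = {frozenset(("P1", "P2")), frozenset(("P2", "P4"))}


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def profile_links(profiles):
    """Link proteins whose presence/absence profiles are identical."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(profiles), 2)
        if profiles[a] == profiles[b]
    }


def expression_links(expression, min_r=0.9):
    """Link proteins whose expression correlates at or above min_r."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(expression), 2)
        if pearson(expression[a], expression[b]) >= min_r
    }


def combine(**methods):
    """Map each linked pair to the set of method names that support it."""
    support = {}
    for name, links in methods.items():
        for pair in links:
            support.setdefault(pair, set()).add(name)
    return support


support = combine(
    profile=profile_links(profiles),
    expression=expression_links(expression),
    fusion=fusion_pairs,
)

# List pairs supported by the most methods first.
for pair, names in sorted(
    support.items(), key=lambda kv: (-len(kv[1]), sorted(kv[0]))
):
    print(sorted(pair), sorted(names))
```

Keeping the set of supporting methods per pair, rather than a bare union of links, makes it easy to rank or filter links by how much independent evidence backs them; that ranking is a design choice of this sketch.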
References
Pellegrini, M., Marcotte, E. M., Thompson, M. J., Eisenberg, D. & Yeates, T. O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA 96, 4285–4288 (1999).
Eisen, M. B., Spellman, P. T., Brown, P. O. & Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95, 14863–14868 (1998).
Marcotte, E. M. et al. Detecting protein function and protein–protein interactions from genome sequences. Science 285, 751–753 (1999).
Mewes, H. W., Hani, J., Pfeiffer, F. & Frishman, D. MIPS: a database for protein sequences and complete genomes. Nucleic Acids Res. 26, 33–37 (1998).
Karp, P., Riley, M., Paley, S. & Pellegrini-Toole, A. EcoCyc: Encyclopedia of Escherichia coli genes and metabolism. Nucleic Acids Res. 26, 50–53 (1998).
The yeast genome directory. Nature 387 (suppl), 1–105 (1997).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res. 27, 49–54 (1999).
Bardosi, A., Eber, S. W., Hendrys, M. & Pekrun, A. Myopathy with altered mitochondria due to a triosephosphate isomerase (TPI) deficiency. Acta Neuropathol. (Berl.) 79, 387–394 (1990).
Wickner, R. B. [URE3] as an altered URE2 protein: evidence for a prion analog in Saccharomyces cerevisiae. Science 264, 566–569 (1994).
Miyaki, M. et al. Germline mutation of MSH6 as the cause of hereditary nonpolyposis colorectal cancer. Nature Genet. 17, 271–272 (1997).
Fishel, R. et al. The human mutator gene homologue MSH2 and its association with hereditary nonpolyposis colon cancer. Cell 75, 1027–1038 (1993).
Kushnirov, V. V. et al. Nucleotide sequence of the Sup2(Sup35) gene of Saccharomyces cerevisiae. Gene 66, 45–54 (1988).
Stansfield, I. et al. The products of the SUP45 (eRF1) and SUP35 genes interact to mediate translation termination in Saccharomyces cerevisiae. EMBO J. 14, 4365–4373 (1995).
Chen, X., Sullivan, D. S. & Huffaker, T. C. Two yeast genes with similarity to TCP-1 are required for microtubule and actin function in vivo. Proc. Natl Acad. Sci. USA 91, 9111–9115 (1994).
Johnson, R. E., Kovvali, G. K., Prakash, L. & Prakash, S. Requirement of the yeast MSH3 and MSH6 genes for MSH2-dependent genomic stability. J. Biol. Chem. 271, 7285–7288 (1996).
Lynch, H. T., Fusaro, R. M. & Lynch, J. F. Cancer genetics in the new era of molecular biology. Ann. NY Acad. Sci. 833, 1–28 (1997).
Papadopoulos, N. et al. Mutations of a MutL homolog in hereditary colon cancer. Science 263, 1625–1629 (1994).
West, M. G., Horne, D. W. & Appling, D. R. Metabolic role of cytoplasmic isozymes of 5,10-methylenetetrahydrofolate dehydrogenase in Saccharomyces cerevisiae. Biochemistry 35, 3122–3132 (1996).
Dandekar, T., Snel, B., Huynen, M. & Bork, P. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci. 23, 324–328 (1998).
Overbeek, R., Fonstein, M., D'Souza, M., Pusch, G. D. & Maltsev, N. The use of gene clusters to infer functional coupling. Proc. Natl Acad. Sci. USA 96, 2896–2901 (1999).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
Spellman, P. T. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9, 3273–3297 (1998).
DeRisi, J. L., Iyer, V. R. & Brown, P. O. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680–686 (1997).
Chu, S. et al. The transcriptional program of sporulation in budding yeast. Science 282, 699–705 (1998).
Myers, L. C., Gustafsson, C. M., Hayashibara, K. C., Brown, P. O. & Kornberg, R. D. Mediator protein mutations that selectively abolish activated transcription. Proc. Natl Acad. Sci. USA 96, 67–72 (1999).
Horton, P. & Nakai, K. Better prediction of protein cellular localization sites with the k nearest neighbors classifier. Intell. Sys. Molec. Biol. 5, 147–152 (1997).
Acknowledgements
This work was supported by a Department of Energy/Oak Ridge Institute for Science and Education Hollaender postdoctoral fellowship (E.M.), a Sloan Foundation/Department of Energy postdoctoral fellowship (M.P.), and grants from the DOE.
About this article
Cite this article
Marcotte, E., Pellegrini, M., Thompson, M. et al. A combined algorithm for genome-wide prediction of protein function. Nature 402, 83–86 (1999). https://doi.org/10.1038/47048
DOI: https://doi.org/10.1038/47048