Gaining confidence in high-throughput protein interaction networks

Abstract

Although genome-scale technologies have benefited from statistical measures of data quality, extracting biologically relevant pathways from high-throughput proteomics data remains a challenge. Here we develop a quantitative method for evaluating proteomics data. We present a logistic regression approach that uses statistical and topological descriptors to predict the biological relevance of protein-protein interactions obtained from high-throughput screens for yeast. Other sources of information, including mRNA expression, genetic interactions and database annotations, are subsequently used to validate the model predictions without bias or cross-pollution. Novel topological statistics show hierarchical organization of the network of high-confidence interactions: protein complex interactions extend one to two links, and genetic interactions represent an even finer scale of organization. Knowledge of the maximum number of links that indicates a significant correlation between protein pairs (correlation distance) enables the integrated analysis of proteomics data with data from genetics and gene expression. The type of analysis presented will be essential for analyzing the growing amount of genomic and proteomics data in model organisms and humans.
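As a rough illustration of the two ideas named in the abstract, the sketch below scores an interaction with a logistic function of its descriptors and counts the links separating two proteins in a network. It is not the authors' implementation: the descriptor names (`n_observations`, `shared_partners`), the weights, the intercept, and the toy network are all hypothetical placeholders, and a real model would fit its coefficients to training data.

```python
import math
from collections import deque


def confidence(descriptors, weights, intercept):
    """Logistic model: map a weighted sum of descriptors to a score in (0, 1)."""
    z = intercept + sum(weights[name] * value for name, value in descriptors.items())
    return 1.0 / (1.0 + math.exp(-z))


def link_distance(edges, source, target):
    """Breadth-first search: fewest links between two proteins, or None if unconnected."""
    neighbors = {}
    for a, b in edges:
        # Interactions are treated as undirected links.
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical descriptors and coefficients, chosen only for illustration.
weights = {"n_observations": 0.8, "shared_partners": 0.5}
score = confidence({"n_observations": 2, "shared_partners": 1}, weights, intercept=-1.5)

# Toy network: A-B-C form a chain, D-E is a separate component.
toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]
print(round(score, 3), link_distance(toy_edges, "A", "C"))
```

Under this reading, a correlation distance of one to two links would correspond to protein pairs for which `link_distance` returns 1 or 2 in the network of high-confidence interactions.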

Figure 1: Network cross-comparison.
Figure 2: Confidence score validation.
Figure 3: Distance-dependent correlations.
Figure 4: Joint analysis of physical and genetic interactions.
Figure 5: Joint analysis of cell division cycle coexpression with physical interactions.
Figure 6: Joint analysis of sporulation-specific expression with physical interactions.

Acknowledgements

J.S.B. acknowledges his colleagues at CuraGen who generated much of the data analyzed here and whose discussions have been enjoyable and productive.

Author information

Corresponding author

Correspondence to Joel S Bader.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Bader, J., Chaudhuri, A., Rothberg, J. et al. Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol 22, 78–85 (2004). https://doi.org/10.1038/nbt924

  • DOI: https://doi.org/10.1038/nbt924
