Abstract
Comprehensive protein-interaction mapping projects are underway for many model species and humans. A key step in these projects is estimating the time, cost and personnel required for obtaining an accurate and complete map. Here we modeled the cost of interaction-map completion for various experimental designs. We showed that current efforts may require up to 20 independent tests covering each protein pair to approach completion. We explored designs for reducing this cost substantially, including prioritization of protein pairs, probability thresholding and interaction prediction. The best experimental designs lowered cost by fourfold overall and >100-fold in early stages of mapping. We demonstrate the best strategy in an ongoing project in Drosophila melanogaster, in which we mapped 450 high-confidence interactions using 47 microtiter plates, versus thousands of plates expected using current designs. This study provides a framework for assessing the feasibility of interaction mapping projects and for future efforts to increase their efficiency.
Acknowledgements
We thank S. Bandyopadhyay for critical reading of the manuscript, I. Bronner, K. Gulyas, B. Mangiola and H. Zhang for expert technical assistance with the two-hybrid assays, and R. Karp and R. Sharan for discussions of earlier versions of this work. This work was supported by US National Institutes of Health grants RR018627, GM070743 and HG001536.
Author information
Contributions
A.S.S. and T.I. formulated the probabilistic model and performed the simulations. J.Y., K.R.G. and R.L.F. generated all new reported Y2H data. A.S.S., R.L.F. and T.I. wrote the paper.
Supplementary information
Supplementary Text and Figures
Supplementary Figure 1, Supplementary Tables 1–2, Supplementary Methods (PDF 525 kb)
Supplementary Data
Results from protein-protein interaction predictions in Drosophila. (XLS 575 kb)
Cite this article
Schwartz, A., Yu, J., Gardenour, K. et al. Cost-effective strategies for completing the interactome. Nat Methods 6, 55–61 (2009). https://doi.org/10.1038/nmeth.1283
DOI: https://doi.org/10.1038/nmeth.1283