Several attempts have been made to systematically map protein-protein interaction, or 'interactome', networks. However, it remains difficult to assess the quality and coverage of existing data sets. Here we describe a framework that uses an empirically-based approach to rigorously dissect quality parameters of currently available human interactome maps. Our results indicate that high-throughput yeast two-hybrid (HT-Y2H) interactions for human proteins are more precise than literature-curated interactions supported by a single publication, suggesting that HT-Y2H is suitable to map a significant portion of the human interactome. We estimate that the human interactome contains ∼130,000 binary interactions, most of which remain to be mapped. Similar to estimates of DNA sequence data quality and genome size early in the Human Genome Project, estimates of protein interaction data quality and interactome size are crucial to establish the magnitude of the task of comprehensive human interactome mapping and to elucidate a path toward this goal.
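As a rough illustration of how an empirical approach can calibrate data set quality, the sketch below estimates the precision of an interaction data set from retest results in an independent assay. It assumes a simple two-component mixture: true interactions score positive at the rate seen for a positive reference set, and false ones at the rate seen for a random reference set. The function name and all rates are hypothetical, and the mixture model is an assumption made here for illustration, not necessarily the authors' exact procedure.

```python
def estimate_precision(observed_rate, prs_rate, rrs_rate):
    """Estimate the fraction of true interactions in a data set.

    Assumed model (illustrative only):
        observed_rate = p * prs_rate + (1 - p) * rrs_rate
    where
        observed_rate: fraction of sampled pairs scoring positive on retest
        prs_rate:      fraction of positive reference pairs scoring positive
        rrs_rate:      fraction of random reference pairs scoring positive
    Solving for p gives the estimated precision, clipped to [0, 1].
    """
    if prs_rate <= rrs_rate:
        raise ValueError("the assay must score reference positives above random pairs")
    p = (observed_rate - rrs_rate) / (prs_rate - rrs_rate)
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    # Hypothetical rates, chosen only to show the arithmetic.
    print(estimate_precision(observed_rate=0.25, prs_rate=0.5, rrs_rate=0.0))
```

Clipping keeps the estimate inside the valid range when sampling noise pushes the observed rate below the random background or above the reference positive rate.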
We thank members of CCSB and the Vidal, Barabasi, Wanker and Tavernier laboratories and S. Sahasrabuddhe, R. Bell, R. Chettier and C. Wiggins for helpful discussions; E. Smith for help generating Figure 1; and Agencourt Biosciences for sequencing assistance. This work was supported by the US National Human Genome Research Institute (2R01HG001715 and 5P50HG004233 to M.V. and F.P.R.), the US National Cancer Institute (5U54CA112952 to J. Nevins, subcontract to M.V.; and 5U01CA105423 to S.H. Orkin, project to M.V.), the US National Institutes of Health (IH U01 A1070499-01 and U56 CA113004 to A.-L.B. and postdoctoral training grant fellowship T32CA09361 to K.V.), the Ellison Foundation (to M.V.), the W.M. Keck Foundation (to M.V.), Dana-Farber Cancer Institute Institute Sponsored Research funds (to M.V.), the US National Science Foundation (ITR DMR-0926737 and IIS-0513650 to A.-L.B.), Deutsches Bundesministerium für Bildung und Forschung (NGFN2, KB-P04T01, KB-P04T03 and 01GR0471 to E.E.W. and U.S.), Deutsche Forschungsgemeinschaft (SFB 577 and SFB618 to E.E.W.), the University of Ghent and the “Fonds Wetenschappelijk Onderzoek–Vlaanderen” (FWO-V) G.0031.06 (GOA12051401 to J. Tavernier) and the National Cancer Institute of Canada (to C.B.). I.L. is a postdoctoral fellow with the FWO-V. M.V. is a “Chercheur Qualifié Honoraire” from the Fonds de la Recherche Scientifique (French Community of Belgium).
Supplementary Figure 1; Supplementary Tables 2,3,5; Supplementary Data 1–4; Supplementary Methods (PDF 2639 kb)
List of interactions in the various datasets used in pairwise test experiments with the MAPPIT and Y2H-CCSB assays (XLS 144 kb)
Scores for the Y2H-CCSB and MAPPIT experiments on the hsPRS-v1 and hsRRS-v1 to compute assay sensitivity and background positive rate and scores on subsets of the LC, MDC-HI1 and CCSB-HI1 interaction datasets (XLS 78 kb)
Identity of the ORFs making up the Y2H-CCSB repeat screens (XLS 396 kb)
Interactions found in the Y2H-CCSB repeat screens (XLS 69 kb)
Interactions found in the Y2H-CCSB repeat screens reported according to MIMIX specifications (XLS 207 kb)
Identity of the ORFs making up the MDC-HI1 search space (space II) (XLS 159 kb)
About this article
Cite this article
Venkatesan, K., Rual, J., Vazquez, A. et al. An empirical framework for binary interactome mapping. Nat Methods 6, 83–90 (2009) doi:10.1038/nmeth.1280