Abstract
High-quality datasets are needed to understand how global and local properties of protein-protein interaction, or 'interactome', networks relate to biological mechanisms, and to guide research on individual proteins. In an evaluation of existing curation of protein interaction experiments reported in the literature, we found that curation can be error-prone and possibly of lower quality than commonly assumed.
Acknowledgements
This work was supported by US National Human Genome Research Institute grants R01 HG001715 to M.V. and F.P.R., P50 HG004233 to M.V. and R01 HG003224 to F.P.R.; by funds from the W.M. Keck Foundation to M.V.; by an award (DBI-0703905) from the National Science Foundation to M.V., J.R.E. and D.E.H.; and by Institute Sponsored Research funds from the Dana-Farber Cancer Institute Strategic Initiative to M.V. and CCSB. [...] is a Chercheur Qualifié Honoraire from the Fonds de la Recherche Scientifique (FRS-FNRS, French Community of Belgium). We thank all members of CCSB for constructive discussions.
Supplementary information
Supplementary Table 1
Yeast recuration. (XLS 47 kb)
Supplementary Table 2
Human LC-multiple recuration. (XLS 250 kb)
Supplementary Table 3
Human literature sampled recuration. (XLS 129 kb)
Supplementary Table 4
Arabidopsis recuration. (XLS 120 kb)
About this article
Cite this article
Cusick, M., Yu, H., Smolyar, A. et al. Literature-curated protein interaction datasets. Nat Methods 6, 39–46 (2009). https://doi.org/10.1038/nmeth.1284
DOI: https://doi.org/10.1038/nmeth.1284
This article is cited by
- Next-generation large-scale binary protein interaction network for Drosophila melanogaster. Nature Communications (2023)
- Meta-analysis defines principles for the design and analysis of co-fractionation mass spectrometry experiments. Nature Methods (2021)
- Genome-wide inference of the Camponotus floridanus protein-protein interaction network using homologous mapping and interacting domain profile pairs. Scientific Reports (2020)
- Openness and trust in data-intensive science: the case of biocuration. Medicine, Health Care and Philosophy (2020)
- IMMAN: an R/Bioconductor package for Interolog protein network reconstruction, mapping and mining analysis. BMC Bioinformatics (2019)