Literature-curated protein interaction datasets

An Addendum to this article was published on 25 November 2009

Abstract

High-quality datasets are needed to understand how global and local properties of protein-protein interaction, or 'interactome', networks relate to biological mechanisms, and to guide research on individual proteins. In an evaluation of existing curation of protein interaction experiments reported in the literature, we found that curation can be error-prone and possibly of lower quality than commonly assumed.

Figure 1: Distribution of the number of published manuscripts supporting each interaction.
Figure 2: Distribution of the publications in literature-curated datasets by the number of interactions reported in the publication.
Figure 3: Overlaps of reported curation for yeast PPIs.
Figure 4: Summary of recuration results.

Acknowledgements

This work was supported by US National Human Genome Research Institute grants R01 HG001715 to M.V. and F.P.R., P50 HG004233 to M.V. and R01 HG003224 to F.P.R.; by funds from the W.M. Keck Foundation to M.V.; by an award (DBI-0703905) from the National Science Foundation to M.V., J.R.E. and D.E.H.; and by Institute Sponsored Research funds from the Dana-Farber Cancer Institute Strategic Initiative to M.V. and CCSB. M.V. is a Chercheur Qualifié Honoraire from the Fonds de la Recherche Scientifique (FRS-FNRS, French Community of Belgium). We thank all members of CCSB for constructive discussions.

Author information

Corresponding authors

Correspondence to Michael E Cusick or Marc Vidal.

Supplementary information

Supplementary Table 1

Yeast recuration. (XLS 47 kb)

Supplementary Table 2

Human LC-multiple recuration. (XLS 250 kb)

Supplementary Table 3

Human literature sampled recuration. (XLS 129 kb)

Supplementary Table 4

Arabidopsis recuration. (XLS 120 kb)

About this article

Cite this article

Cusick, M., Yu, H., Smolyar, A. et al. Literature-curated protein interaction datasets. Nat Methods 6, 39–46 (2009). https://doi.org/10.1038/nmeth.1284

  • DOI: https://doi.org/10.1038/nmeth.1284
