• An Addendum to this article was published on 25 November 2009

Abstract

High-quality datasets are needed to understand how global and local properties of protein-protein interaction, or 'interactome', networks relate to biological mechanisms, and to guide research on individual proteins. In an evaluation of existing curation of protein interaction experiments reported in the literature, we found that curation can be error-prone and possibly of lower quality than commonly assumed.

Acknowledgements

This work was supported by US National Human Genome Research Institute grants R01 HG001715 to M.V. and F.P.R., P50 HG004233 to M.V. and R01 HG003224 to F.P.R.; by funds from the W.M. Keck Foundation to M.V.; by an award (DBI-0703905) from the National Science Foundation to M.V., J.R.E. and D.E.H.; and by Institute Sponsored Research funds from the Dana-Farber Cancer Institute Strategic Initiative to M.V. and CCSB. is a Chercheur Qualifié Honoraire from the Fonds de la Recherche Scientifique (FRS-FNRS, French Community of Belgium). We thank all members of CCSB for constructive discussions.

Author information

Author notes

    • Kavitha Venkatesan
    • Jean-François Rual
    • Heather Borick

    Present addresses: Novartis Institutes for Biomedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA (K.V.), Department of Cell Biology, Harvard Medical School, 240 Longwood Avenue, Boston, Massachusetts 02115, USA (J.-F.R.) and Department of Biological Sciences, 132 Long Hall, Clemson University, Clemson, South Carolina 29634, USA (H.B.).

    • Michael E Cusick
    • Haiyuan Yu

    These authors contributed equally to this work.

Affiliations

  1. Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, Massachusetts 02115, USA.

    • Michael E Cusick
    • Haiyuan Yu
    • Alex Smolyar
    • Kavitha Venkatesan
    • Anne-Ruxandra Carvunis
    • Nicolas Simonis
    • Jean-François Rual
    • Heather Borick
    • Pascal Braun
    • Matija Dreze
    • David E Hill
    • Frederick P Roth
    • Marc Vidal
  2. Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA.

    • Michael E Cusick
    • Haiyuan Yu
    • Alex Smolyar
    • Kavitha Venkatesan
    • Anne-Ruxandra Carvunis
    • Nicolas Simonis
    • Jean-François Rual
    • Heather Borick
    • Pascal Braun
    • Matija Dreze
    • David E Hill
    • Marc Vidal
  3. Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications de Grenoble (TIMC-IMAG), Unité Mixte de Recherche 5525 Centre National de la Recherche Scientifique (CNRS), Faculté de Médecine, Université Joseph Fourier, 38706 La Tronche Cedex, France.

    • Anne-Ruxandra Carvunis
  4. Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix, 61 Rue de Bruxelles, 5000 Namur, Wallonia, Belgium.

    • Jean Vandenhaute
  5. Genomic Analysis Laboratory, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, USA.

    • Mary Galli
    • Junshi Yazaki
    • Joseph R Ecker
  6. Plant Biology Laboratory, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, USA.

    • Junshi Yazaki
    • Joseph R Ecker
  7. Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 250 Longwood Avenue, Boston, Massachusetts 02115, USA.

    • Frederick P Roth

Authors

  1. Michael E Cusick
  2. Haiyuan Yu
  3. Alex Smolyar
  4. Kavitha Venkatesan
  5. Anne-Ruxandra Carvunis
  6. Nicolas Simonis
  7. Jean-François Rual
  8. Heather Borick
  9. Pascal Braun
  10. Matija Dreze
  11. Jean Vandenhaute
  12. Mary Galli
  13. Junshi Yazaki
  14. David E Hill
  15. Joseph R Ecker
  16. Frederick P Roth
  17. Marc Vidal

Corresponding authors

Correspondence to Michael E Cusick or Marc Vidal.

Supplementary information

Excel files

  1. Supplementary Table 1: Yeast recuration.
  2. Supplementary Table 2: Human LC-multiple recuration.
  3. Supplementary Table 3: Human literature sampled recuration.
  4. Supplementary Table 4: Arabidopsis recuration.

About this article

DOI

https://doi.org/10.1038/nmeth.1284