Abstract
A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. Many small-scale and, more recently, large-scale experiments have expanded our understanding of this network. However, the data integration that such experiments require is currently hampered by the fragmentation of publicly available protein interaction data, which exist in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), the Dana-Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), IntAct at the European Bioinformatics Institute (EMBL-EBI, Hinxton, UK), the Molecular Interactions Database (MINT, Rome, Italy), the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
Acknowledgements
This work was supported in part by EU grant number QLRI-CT-2001-00015 under the Research and Technological Development program 'Quality of Life and Management of Living Resources'. The PSI meetings were supported by the Human Proteome Organization. The work at the University of Rome 'Tor Vergata' was supported by grants from Associazione Italiana per la Ricerca sul Cancro and grant GTF02011 from Telethon. M.L. is supported by the European Molecular Biology Laboratory International PhD program and Biotechnology and Biological Sciences Research Council grant 8/C19399. Y.L. and R.Z. are supported by grants 2001AA233031, 2002CB512801 and 110CB510209. M.V.'s laboratory is supported by grants from the US National Cancer Institute and National Human Genome Research Institute. L.M.-P. would like to thank Jens Pedersen, Claudia Bagni, Benedetta Mattei, Elena Santonico, Federico Demasi and Michael Ashburner for contributions to the controlled vocabularies. Emmanuel Cézanne, Sébastien Cros, Claire Even, Nicolas Jolibert, Sandrine Marquès, Christophe Roumegous, Patrick Sablayrolles and René Thomas-Nelson contributed to the development of the PSI XSLT utilities. The collaborative development process was facilitated by the infrastructure provided by SourceForge.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
Institute of Bioinformatics, International Tech Park, Whitefield Road, 560 066 Bangalore, India.
Cite this article
Hermjakob, H., Montecchi-Palazzi, L., Bader, G. et al. The HUPO PSI's Molecular Interaction format—a community standard for the representation of protein interaction data. Nat Biotechnol 22, 177–183 (2004). https://doi.org/10.1038/nbt926
DOI: https://doi.org/10.1038/nbt926