The HUPO PSI's Molecular Interaction format—a community standard for the representation of protein interaction data


A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small-scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).

Figure 1: Graphical representation of the PSI MI format.
Figure 2: PSI MI example file.
Figure 3: 'Interaction detection' controlled vocabulary.
Figure 4: The PIMWalker network visualization tool.
Figure 5: Graphical representation of XML document structure.


This work was supported in part by EU grant number QLRI-CT-2001-00015 under the Research and Technological Development program 'Quality of Life and Management of Living Resources'. The PSI meetings were supported by the Human Proteome Organization. The work at the University of Rome 'Tor Vergata' was supported by grants from Associazione Italiana per la Ricerca sul Cancro and grant GTF02011 from Telethon. M.L. is supported by the European Molecular Biology Laboratory International PhD program and Biotechnology and Biological Sciences Research Council grant 8/C19399. Y.L. and R.Z. are supported by grants 2001AA233031, 2002CB512801, 110CB510209. M.V.'s laboratory is supported by grants from the US National Cancer Institute and National Human Genome Research Institute. L.M.-P. would like to thank Jens Pedersen, Claudia Bagni, Benedetta Mattei, Elena Santonico, Federico Demasi and Michael Ashburner for contributions to the controlled vocabularies. Emmanuel Cézanne, Sébastien Cros, Claire Even, Nicolas Jolibert, Sandrine Marquès, Christophe Roumegous, Patrick Sablayrolles and René Thomas-Nelson contributed to the development of the PSI XSLT utilities. The collaborative development process has been facilitated by the infrastructure provided by SourceForge.



