Structure-based prediction of protein–protein interactions on a genome-wide scale

A Corrigendum to this article was published on 06 March 2013

This article has been updated


The genome-wide identification of pairs of interacting proteins is an important step in the elucidation of cell regulatory mechanisms [1,2]. Much of our present knowledge derives from high-throughput techniques such as the yeast two-hybrid assay and affinity purification [3], as well as from manual curation of experiments on individual systems [4]. A variety of computational approaches based, for example, on sequence homology, gene co-expression and phylogenetic profiles, have also been developed for the genome-wide inference of protein–protein interactions (PPIs) [5,6]. Yet comparative studies suggest that the development of accurate and complete repertoires of PPIs is still in its early stages [7,8,9]. Here we show that three-dimensional structural information can be used to predict PPIs with an accuracy and coverage that are superior to predictions based on non-structural evidence. Moreover, an algorithm, termed PrePPI, which combines structural information with other functional clues, is comparable in accuracy to high-throughput experiments, yielding over 30,000 high-confidence interactions for yeast and over 300,000 for human. Experimental tests of a number of predictions demonstrate the ability of the PrePPI algorithm to identify unexpected PPIs of considerable biological interest. The surprising effectiveness of three-dimensional structural information can be attributed to the use of homology models combined with the exploitation of both close and remote geometric relationships between proteins.
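The abstract says that structural information is combined with other functional clues but does not state how. As a rough illustration only, one common way to merge independent lines of evidence is a naive Bayes product of per-source likelihood ratios. The sketch below assumes that scheme; the function names, evidence labels, likelihood ratios and prior odds are all hypothetical and are not taken from PrePPI.

```python
from math import prod


def combined_likelihood_ratio(evidence_lrs):
    """Multiply per-source likelihood ratios.

    Assumes the evidence sources are conditionally independent
    (the naive Bayes assumption), which is an illustrative choice here.
    """
    return prod(evidence_lrs.values())


def posterior_probability(likelihood_ratio, prior_odds):
    """Convert a likelihood ratio and prior odds into a posterior probability."""
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratios for one candidate protein pair.
evidence = {
    "structural_modelling": 40.0,   # assumed value
    "co_expression": 3.0,           # assumed value
    "phylogenetic_profile": 2.0,    # assumed value
}

lr = combined_likelihood_ratio(evidence)

# Hypothetical prior: 1 true interaction per 1,000 random pairs.
p = posterior_probability(lr, prior_odds=1 / 1000)

print(f"combined LR = {lr:.1f}, posterior probability = {p:.3f}")
```

Under this assumed scheme, a strong structural score can lift a pair well above the prior even when the other clues are weak, and a pair is called high-confidence when the combined ratio passes a chosen cut-off.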

Figure 1: Predicting protein–protein interactions using PrePPI.
Figure 2: ROC curve and Venn diagram for PrePPI predictions and high-throughput experiments in yeast.
Figure 3: Models for the PPI formed between PRKD1 and PRKCE, and EEF1D and VHL using homology models and remote structural relationships.

Change history

  • 06 March 2013

    Nature 490, 556–560 (2012); doi:10.1038/nature11503

    In this Letter, one of the points shown in Fig. 2, Supplementary Figs 8 and 9, and Supplementary Table 4 reflects the presence of interactions that had been erroneously deposited from a previous publication [1] into the IntAct database. We have now used the MINT database to retrieve these interactions, and Fig.


This work is supported by National Institutes of Health grants GM030518 and GM094597 (B.H.), CA121852 (A.C. and B.H.), DK057539 (D.A.), CA082683 (T.H.) and R01NS043915 (T.M.). L.D. thanks the China Scholarship Council for scholarship 2010626059. We thank U. Pieper from A. Sali’s laboratory for help with ModBase, and H. Lee for help with SkyBase.

Q.C.Z., D.P., A.C. and B.H. designed the research; Q.C.Z. performed the computational work; Q.C.Z., D.P., A.C. and B.H. analysed the data; L.D. set up the PrePPI web server; L.Q., Y.S., C.A.T. and B.B. performed co-immunoprecipitation studies; Q.C.Z., D.P., A.C. and B.H. wrote the paper, including text from C.L., D.A., T.H. and T.M.

Correspondence to Andrea Califano or Barry Honig.

Zhang, Q., Petrey, D., Deng, L. et al. Structure-based prediction of protein–protein interactions on a genome-wide scale. Nature 490, 556–560 (2012).

