PIPE4: Fast PPI Predictor for Comprehensive Inter- and Cross-Species Interactomes

The need for larger-scale and increasingly complex protein-protein interaction (PPI) prediction tasks demands that state-of-the-art predictors be highly efficient and adapted to inter- and cross-species predictions. Furthermore, the ability to generate comprehensive interactomes has enabled the appraisal of each PPI in the context of all predictions, leading to further improvements in classification performance in the face of extreme class imbalance using the Reciprocal Perspective (RP) framework. Here we describe the PIPE4 algorithm. Adapting the PIPE3/MP-PIPE sequence preprocessing step led to upwards of 50x speedup, and the new Similarity Weighted Score appropriately normalizes for window frequency when applied to any inter- or cross-species prediction schema. Comprehensive interactomes are generated for three prediction schemas: (1) cross-species predictions, where Arabidopsis thaliana is used as a proxy to predict the comprehensive Glycine max interactome, (2) inter-species predictions between Homo sapiens and HIV1, and (3) a combined schema involving both cross- and inter-species predictions, where both Arabidopsis thaliana and Caenorhabditis elegans are used as proxy species to predict the interactome between Glycine max (the soybean legume) and Heterodera glycines (the soybean cyst nematode). Compared with the state-of-the-art, PIPE4 achieved improved performance, indicating that it should be the method of choice for complex PPI prediction schemas.


Benchmarking the New PIPE Algorithm
Timing experiments were performed using the intra-species predictions for three model organisms: H. sapiens, A. thaliana, and S. cerevisiae. These were selected as representing three conventional applications of PIPE in decreasing order of proteome size. Intra-species predictions were performed to ensure use-case compatibility between the PIPE3 and PIPE4 methods. Each timing experiment reports the time to compute a single PPI, averaged over the entire interactome. The relevant benchmark measures for the intra-species comparison of PIPE3 and PIPE4 are listed in Supplementary Table S1, where the symbol notation is defined in the Methods section of the manuscript. Similarly, the relevant benchmark measures for the combined inter- and cross-species prediction schema are given in Supplementary Table S2. The predictions between H. glycines and G. max would have required ~42 days to compute using PIPE3, whereas they were generated in ~2 days using PIPE4, a 21.1x speedup. Such speedups are necessary if PIPE is to be used iteratively for purposes such as protein engineering, as is done in InSIPS. All benchmark measures are tabulated in Supplementary Table S3.

Cross-Species Validation Experiments
Previous versions of PIPE have shown success in predicting intra-species PPIs in relatively well-studied species with large numbers of experimentally verified PPIs. Cross-species PPI prediction permits experimental data taken from well-studied species to be used to investigate the putative PPI networks of under-studied species which are of research interest yet have very little data available. The following experiments examine best practices for cross-species PPI prediction: from which species to take training data, for which species valid predictions can then be made, and whether combining training species is advantageous. They also include validation experiments that demonstrate why the PIPE score needs to be modified for cross-species prediction.
As described in the main text, the cross-species version of PIPE, PIPE4, keeps track of the species of origin for each protein and normalizes the landscape score by considering only those PPIs reflected in the training data. To exemplify the importance of this normalization factor, consider the following toy example. When using one organism as a proxy for another, the number of possible interactions varies. The original PIPE3 SW score would naively pool the proteomes; however, this does not appropriately represent the true frequency of a window within a proteome. As depicted in Supplementary Fig. 3, the PIPE3 SW score would normalize over both proteomes. The prevalence of a given window can vary dramatically between organisms, so the PIPE4 SW score corrects for this by normalizing only over those proteomes which actually have training data (only A. thaliana in this case). The effect of this normalization becomes more pronounced as the number of species increases, and it does not scale uniformly across all predicted interactions, since each protein window has a varying number of similar proteins. A diverse set of model organisms was considered to examine this change in normalization factor further.
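The contrast in the toy example can be illustrated with a deliberately simplified sketch. It is not the published SW score: the function `normalized_hit_score`, the rule of dividing a hit count by the product of the two window frequencies, and all protein identifiers are assumptions made for illustration. Only the choice of which proteomes contribute to the denominator mirrors the text.

```python
def normalized_hit_score(hits, similar_a, similar_b, species_of, allowed_species=None):
    """Toy normalization (assumed, not the published formula):
    hits / (frequency of window A * frequency of window B).

    allowed_species=None counts similar proteins over the pooled proteomes
    (PIPE3-like); passing a set restricts the count to species that have
    training data (PIPE4-like).
    """
    def frequency(similar_proteins):
        if allowed_species is None:
            return len(similar_proteins)
        return sum(1 for p in similar_proteins if species_of[p] in allowed_species)

    freq_a = frequency(similar_a)
    freq_b = frequency(similar_b)
    if freq_a == 0 or freq_b == 0:
        return 0.0  # no comparable window in the counted proteomes
    return hits / (freq_a * freq_b)


# Hypothetical proteins: "At*" from the training species, "Gm*" from the target.
species_of = {
    "At1": "A. thaliana", "At2": "A. thaliana",
    "Gm1": "G. max", "Gm2": "G. max", "Gm3": "G. max",
}
similar_a = ["At1", "At2", "Gm1", "Gm2"]  # proteins with a window similar to window A
similar_b = ["At1", "Gm3"]                # proteins with a window similar to window B
hits = 2                                  # hits can only come from A. thaliana training PPIs

pooled = normalized_hit_score(hits, similar_a, similar_b, species_of)
training_only = normalized_hit_score(
    hits, similar_a, similar_b, species_of, allowed_species={"A. thaliana"}
)
print(pooled, training_only)
```

In this toy setting the pooled denominator is inflated by G. max proteins that could never contribute a hit, so the same evidence receives a lower score than when only the training proteome is counted.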
The training PPIs for 17 organisms were assembled for the inter- and cross-species experiments (Supplementary Table S4). To control for the amount of available training data, we considered only those organisms with at least 2,000 known PPIs. An equivalently sized set of 2,000 PPIs was randomly subsampled from the set of all known PPIs for each of the resulting eight organisms.
The results from these experiments are summarized using both ROC and PR curves, and the area under the PR curve (AUPRC) and precision at 25% TPR (Pr@25Re) were used as scoring metrics to compare the classification performance of the original PIPE3 SW score and the modified PIPE4 SW score.
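The filtering and subsampling step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `subsample_training_sets`, the representation of a PPI as a tuple of two protein identifiers, and the fixed seed are all hypothetical, and the toy call uses a threshold of 3 instead of 2,000 to stay small.

```python
import random


def subsample_training_sets(known_ppis, min_ppis=2000, sample_size=2000, seed=0):
    """Keep organisms with at least `min_ppis` known PPIs and draw an
    equally sized random subset (without replacement) for each of them."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    subsampled = {}
    for organism, ppis in known_ppis.items():
        unique_ppis = sorted(set(ppis))  # drop duplicates, fix the order
        if len(unique_ppis) >= min_ppis:
            subsampled[organism] = rng.sample(unique_ppis, sample_size)
    return subsampled


# Toy data with hypothetical identifiers.
toy_ppis = {
    "species_A": [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("a3", "a4")],
    "species_B": [("b1", "b2")],  # too few PPIs, will be excluded
}
toy_subsample = subsample_training_sets(toy_ppis, min_ppis=3, sample_size=3)
print(sorted(toy_subsample))
```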
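For readers less familiar with these metrics, the sketch below computes a PR curve, a step-wise area under it (average precision), and the precision at a target recall from ranked scores. The text does not state which AUPRC estimator or tie-handling rule was used, so the choices here are assumptions; the function names and the toy scores are hypothetical.

```python
def pr_points(scores, labels):
    """Return (recall, precision) after each prediction, ranked by descending score.
    Labels are 1 for a known PPI and 0 otherwise; tied scores are not grouped."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points, true_pos = [], 0
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        points.append((true_pos / n_pos, true_pos / rank))
    return points


def average_precision(scores, labels):
    """Step-wise area under the PR curve (one common AUPRC estimator)."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in pr_points(scores, labels):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def precision_at_recall(scores, labels, target_recall):
    """Precision at the first rank where recall reaches `target_recall`."""
    for recall, precision in pr_points(scores, labels):
        if recall >= target_recall:
            return precision
    raise ValueError("target recall is never reached")


# Toy ranking with made-up scores.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
print(average_precision(toy_scores, toy_labels))
print(precision_at_recall(toy_scores, toy_labels, 0.25))
```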

Supplementary Fig. 2. Toy Example Contrasting the PIPE3 and PIPE4 Similarity Weighted Normalization.
Each line between the proteins indicates a single "hit", where a window in one protein is similar to a window within the corresponding protein. Hits only exist within the A. thaliana proteome, as it is a well-studied organism with several training PPIs. In PIPE4, this hit count is normalized only by protein pairs within A. thaliana.

One-to-Many Predictions
The One-to-Many prediction experiments used the training data from one organism to predict interactions for multiple others. This test examined whether the modified SW score normalization affected the performance of cross-species predictions when using a single training species. Here, each of the eight species was used to predict cross-species interactions for the remaining seven species. Results averaged over the test species are summarized in Supplementary Table S5, and a set of example ROC curves for the test species M. musculus is depicted in Supplementary Fig. S4. A Student's paired t-test under the null hypothesis of equal means yielded a difference in means of 0.011 and 0.019 for AUPRC and Pr@15Re respectively, both with p < 0.001.
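The paired comparison can be sketched with the standard paired t-statistic, computed on per-experiment differences between the two scores. The function name `paired_t_statistic` and the metric values below are made up for illustration and are not taken from the supplementary tables. The p-value would then be read from a Student's t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [x - y for x, y in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat, len(diffs) - 1


# Hypothetical AUPRC values for the same four experiments under each score.
pipe4_auprc = [0.30, 0.42, 0.25, 0.38]
pipe3_auprc = [0.28, 0.40, 0.24, 0.35]
mean_diff, t_stat, dof = paired_t_statistic(pipe4_auprc, pipe3_auprc)
print(mean_diff, t_stat, dof)
```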

Many-to-One Predictions
The Many-to-One prediction experiments pooled the training data from multiple species to make predictions for another. This test sought to determine whether the normalization change becomes increasingly pronounced as the number of training species increases. Here, the interactions for each test species were predicted with a model trained using PPIs pooled from the other seven species. To reflect how this approach would be used in practice, all available interactions were used here (no subsampling was performed for any of the eight species). Results over each of the test species are summarized in Supplementary Table S6, and the ROC curve for each test species is depicted in Supplementary Fig. S5. A Student's paired t-test under the null hypothesis of equal means yielded a difference in means of 0.096 and 0.164 for AUPRC and Pr@15Re respectively, both with p < 0.05.
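The Many-to-One design amounts to a leave-one-species-out split. The sketch below only builds the training pools; the function name `many_to_one_splits` and the toy species and protein identifiers are hypothetical, and no PIPE model is trained here.

```python
def many_to_one_splits(ppis_by_species):
    """Yield (test_species, pooled_training_ppis), where the pool holds the
    PPIs of every species except the test species."""
    for test_species in sorted(ppis_by_species):
        pooled = [
            ppi
            for species, ppis in sorted(ppis_by_species.items())
            if species != test_species
            for ppi in ppis
        ]
        yield test_species, pooled


# Toy data with hypothetical identifiers.
toy_species_ppis = {
    "sp1": [("a", "b")],
    "sp2": [("c", "d"), ("e", "f")],
    "sp3": [("g", "h")],
}
toy_splits = dict(many_to_one_splits(toy_species_ppis))
for test_species, pool in toy_splits.items():
    print(test_species, len(pool))
```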