In silico-prediction of protein–protein interactions network about MAPKs and PP2Cs reveals a novel docking site variants in Brachypodium distachyon

Protein-protein interactions (PPIs) underlie the molecular mechanisms of most biological processes. Mitogen-activated protein kinases (MAPKs) can be dephosphorylated by MAPK-specific phosphatases such as PP2C, which are critical to transduce extracellular signals into adaptive and programmed responses. However, the experimental approaches for identifying PPIs are expensive, time-consuming, laborious and challenging. In response, many computational methods have been developed to predict PPIs. Yet, these methods have inherent disadvantages such as high false positive and negative results. Thus, it is crucial to develop in silico approaches for predicting PPIs efficiently and accurately. In this study, we identified PPIs among 16 BdMAPKs and 86 BdPP2Cs in B. distachyon using a novel docking approach. Further, we systematically investigated the docking site (D-site) of BdPP2C which plays a vital role for recognition and docking of BdMAPKs. D-site analysis revealed that there were 96 pairs of PPIs including all BdMAPKs and most BdPP2Cs, which indicated that BdPP2C may play roles in other signaling networks. Moreover, most BdPP2Cs have a D-site for BdMAPKs in our prediction results, which suggested that our method can effectively predict PPIs, as confirmed by their 3D structure. In addition, we validated this methodology with known Arabidopsis and yeast phosphatase-MAPK interactions from the STRING database. The results obtained provide a vital research resource for exploring an accurate network of PPIs between BdMAPKs and BdPP2Cs.

Protein kinases or phosphatases in eukaryotic cells are main basics of signal transduction mechanisms by catalyzing covalent addition or subtraction of phosphate groups to serine and threonine/tyrosine residues in their substrates, which causes the quick and reversible modification of proteins and thus regulates plant living situations to adapt to their environment rapidly and precisely 1 . In particular, mitogen-activated protein kinase (MAPK, also called MPK) is a crucial protein kinase plays vital role in converting environmental and developmental signals into distinct nuclear responses 2,3 . MAPKs are activated by their specific activator MAPK kinases (MAPKKs, MKKs) through double phosphorylation on both conserved threonine (T) and tyrosine(Y) residues located in the kinase activation loop 4,5 . In contrast, inactivation of MAPKs is induced by dephosphorylation of cognate residues by various tyrosine phosphatases 6 , serine/threonine phosphatases(e.g.PP2C) 7 and/or dual-specificity phosphatases (DUSPs) 8 .
The formation of complex between MAPK and its connate activator, substrate, scaffold or inactivator is normally achieved through specific docking interactions 9 . The docking interactions increase the efficiency of all the enzymatic reactions and may help to regulate the specificity of molecular recognition 10 . It was discovered that MAPKs contain a common docking domain (CD domain) that is featured by a cluster of negatively charged amino acids in the C-terminal region outside the catalytic domain that binds the basic residues at the N

Results
Prediction of interaction pairs. Networks of PPIs supply a framework for understanding the biological processes and can give insights into the molecular mechanism inside the cell. Thus the scholarship of PPIs promotes the understanding of biological mechanisms. MAPK is regulated through gene expression between cell receptor and cell response by transcription factions. To better elucidate the molecular mechanism, our knowledge no work in the literature has been executed regarding the interaction prediction of PP2Cs with MAPKs in B. distachyon. MAPKs are prominent components of protein phosphorylation cascades transduce extracellular signals to plant defense responses. Thus this paper identifying the interacting MAPKs with PP2Cs of B. distachyon which is relate to disease signaling process resorting to docking approach. Docking studies is a method which predicts two proteins (PP2C and MAPK) when bound to each other to a complex, the more stable structure (less global energy) higher the probability of their interaction.
We defined that the absolute value of the lowest energy (AVLE) of two proteins is higher than 65.74 that is the maximum value of the interaction between all subgroup B PP2Cs and MPKs, which means them interacting with each other. Docking studies revealed that 96 from 1376 pairs of PPIs, that is 6.98℅, exhibit higher possibility of interaction including all MAPKs and 49 PP2Cs (Fig. 1, Table S1). The results suggested that PP2C may be connected with other signaling pathway in which PP2C also plays vital roles about negative regulator 36,37 . Moreover, not all of BdMAPKs and BdPP2Cs are only connected into a single network component (Fig. 1). Some proteins show highly correlated while others do not. For example, BdMAPK20-1, BdMAPK20-2, BdMAPK20-4 and BdMAPK20-5 interact with more BdPP2Cs than others. Of course, there were also some BdPP2Cs showed stronger interaction with BdMAPKs in the chart, including BdPP2C22, BdPP2C31, BdPP2C34, BdPP2C39, BdPP2C41, BdPP2C45, BdPP2C49, BdPP2C50, BdPP2C71 and BdPP2C84 (Fig. 1, Table S1). These results indicated that our method could be used to predict the PPIs between BdPP2Cs and BdMAPKs, which were effortless and timesaving in silico "Docking" strategy.

Identification of putative D-site in BdPP2Cs.
Docking interaction is the precondition of protein molecular recognition and enzymatic reaction. Except subgroup B PP2C, no MAPK-docking sites have been reported. To investigate the correlation of docking site for MAPK with the prediction results, we further analyzed the sequence features of PP2C such as the existence or position of putative docking site. In order to search for the sequence that conform the rules of D-site in PP2C using hidden Markov model, we have derived a list of potential novel D-site (PP2C) from B. distachyon which also showed interaction with BdMAPK through docking studies (Table 1). There are 34 BdPP2C proteins which also show 73.96% PPIs harboring D-site in identified interaction with BdMAPK (Table 1). D-finder assigned a Viterbi probability score of 2.85E-22 to a putative D-site (core sequence: RRGRRRPDILTI) that it found in residues 29-40 of BdPP2C55 (Table 1). It could be hypothesized that the amino acids surrounding the minimal D-site motif contribute to MAPK function. The results indicated that the prediction PPIs have the structural basis of recognition each other and also clearly revealed the complexity of BdMAPK interaction with several PP2C triggered in response to diverse upstream stimuli. Of course, number of novel candidate BdMAPK negative regulators was predicted and need to be confirmed by in vitro assay.
Model Validation. In order to assess the reliability of prediction results and docking site, we investigated the 3D protein structure of BdMAPK6 in complex with docking peptides from BdPP2C25 (high affinity site, BdPP2C25: RRSLSCKA; BdMAPK6, KMLTFDPRQRITVEGAL-visible regions marked in yellow; Fig. 2). It has reported that AtMAPK6 and AtMAPK4 can be dephosphorylated by AtPP2C1 which is orthologous genes with BdPP2C6 38 . Therefore, we choose them to evaluate the possibility of PPIs between BdPP2Cs and BdMAPKs, which was also used as the criterion of cutoff. It is found that the position of BdPP2C6 in "ϕ-X-ϕ" locate in protuberant edge which is easier to insert into the hydrophobic pockets. Moreover, BdMAPK6 in common docking motif has three side chain docking pockets which are consistent with the interaction situation (Fig. 2). The results suggested that the PPI of BdMAPK6 and BdPP2C25 have structural evidence support. Furthermore, other BdPP2C proteins also show similar structure in docking site for BdMAPKs (Fig. 3). For example, the 3D structure of BdPP2C47 harbors a D-site (consensus sequence: RRRRRLEMRRFRL, Table 1), which are consistent with the situation of interacting BdMPKs (Fig. 3).

Database Validation.
To better verify our prediction method, the AVLE of more PPIs with reported in public STRING database were performed. Because these PPIs about PP2C and MAPK have confirmed through experiments, instead of pure untested computational procedure. We can assess the reliability of our method through comparing the difference of AVLE about phosphatase-MAPK interaction between B. distachyon and other species. Interestingly, the results show that the AVLE of most (35 pairs) Arabidopsis and yeast PPIs less than 65.74, except 4 pairs PPIs (Table 2). For example, the AVLE of AP2C4-AtMPK6 and AP2C1-AtMPK3 is 73.9 and 70.22, respectively ( Table 2). These results suggested that the PPI of BdMAPKs and BdPP2Cs have more favorable evidence support. Because we have already chosen the max AVLE between subgroup B BdPP2Cs and BdMPK3, BdMPK4 or BdMPK6, when the AVLE of most Arabidopsis and yeast PPIs in public database less than 65.74, which indicated that the bigger AVLE of BdPP2C and BdMAPK should even more have protein-protein interaction. That is, these results indirectly suggested that our prediction method have low false positive rate and dependability. Furthermore, in spite of having difference in other species, the same values remain have certain reference value for predicting the PPIs between PP2Cs and MAPKs on other species.   39 . It is one of key problems to understand their roles on signaling under given conditions how they manage to act specifically on the appropriate substrate. In this context, it is critical to know the mechanisms that mediate the interaction between PP2Cs and MAPKs. In general, MAPK-interaction proteins harbor D-sites at the N-terminus region, which can specific bind cognate CD domain of MAPK (Fig. 4B) 11,12 . The D-site is characterized by a cluster of positively charged amino acids outside the catalytic domain, whose consensus sequence is (R/K 1-3 -X 1-6ϕ-X-ϕ) (Fig. 4A,C) 10 . As expected, PP2C-type phosphatases were identified a region matching the consensus MAPK interaction motif (named KIM; 38 . BdPP2C6, BdPP2C25, BdPP2C73 and BdPP2C78 belong to the subgroup B PP2Cs, which also has a typical KIM 40 . In addition, it only has reported that the interactions and structural features of PP2Cs with MAPKs in model plant Arabidopsis. So, we also independently investigated the AVLE of PPIs with BdMPK3/4/6 for the clues of their interaction each other (Table 3). However, other PP2Cs maybe interact with MAPK, although the PP2C protein does not apparently contain the KIM on N-terminal extension, commonly used MAPK docking sites. It is possible that these plant PP2C-type phosphatases are also interacted with MAPK through other residues located within or outside the typical D-site (Table 1). Indeed, another type of D-site named DEF motif which has a consensus FXFP of human and yeast PP2C proteins has been reported 41 . Furthermore, spatial and temporal regulators of PP2C dephosphorylation are probably needed to ensure specific downstream activation of MAPKs. Future identification of all PP2C D-sites and their effects to interaction possibility will be necessary for a detailed understanding of PP2C regulation.  computational approaches have been developed to effectively and accurately predict protein interactions 30,32 , to date we still need develop more effective methods to investigate the PPIs situation of them. However, construction of a PPI network is the first step for the systematically study of a given organism, which not only provides clues for the signaling pathway but also contributes to comprehend protein functions 46 . Therefore, we took a high-throughput approach to understand the PP2C/MAPK signaling pathways. Specially, we identified 96 pairs PPIs and generated a PPI network (Fig. 1). Moreover, our study uncovers a series of putative docking site variants on PP2Cs (Table 1), thus inferring and predicting their functions and roles in MAPK signaling pathways.
Comparison with other computational methods. So far, a variety of computational methods for predicting PPIs have been developed by investigators. They all can accurately predict protein interaction only based on different aspects. Such as, the evolutionary information, protein structure, physicochemical characteristics, weighted sparse representation, and so on. The incorporating evolutionary information and physicochemical characteristics feature extraction method for predicting PPIs using a newly developed discriminative vector machine (DVM) classifier had proposed 31 . The RVM-BiGP that combines the relevance vector machine (RVM) model and Bi-gram Probabilities (BiGP) for PPIs detection from protein sequences had developed using protein evolutionary information 47 . Moreover, the combining weighted sparse representation based classifier (WSRC) and global encoding (GE) of amino acid sequence had also performed 33 . It has reported a method for inference of protein-protein interactions from protein amino acids sequences 32 . While we also propose a novel computational   method for predicting PPIs based on the energy of the protein 3D structure closed. Subsequently, we verified the feasibility of our method through a series of models, including the 3D structure reconstruction, putative protein docking domain prediction and public STRING database validation. For example, BdMAPK6 in common docking motif has three side chain docking pockets which are consistent with the situation of interacting BdPP2C25 which contain a D-site (Fig. 2). The analysis results indicated that our method is feasible for predicting PPIs, and can be used as an effective supplementary tool for future proteomics research in the traditional experimental methods.

Conclusion
In this study, we proposed in silico "Docking" strategy to predict the PPIs network about between 16 BdMAPKs and 86 BdPP2Cs. Docking studies showed 96 pairs of probable PPIs including all BdMAPK and 49 BdPP2C, which indicated that BdPP2C may be involved in other signaling pathway. In addition, 34 BdPP2C proteins harbor D-site in identified interaction with BdMAPK, which means that D-site plays important role in BdPP2C protein bind to BdMAPK. Moreover, we evaluate the reliability of prediction results and docking site by compare the 3D protein structure of BdMAPK6 and BdPP2C25 which can provide the structural support for our study. Furthermore, STRING database also indicated that our prediction results have low false positive rate and dependability. These findings provide insight into docking interactions in MAPK signaling networks and should be related to efforts to predict new MAPK substrates and regulators by in silico-prediction methods.  50 . The minimized energy structures were finally saved as *.pdf files which were checked and validated by PROCHECK, ERRAT and VERIFY_3D. Subsequently, the refined structures of BdMAPKs were taken as receptor and the structures of BdPP2Cs were taken as ligand for the docking studies on the online Patch Dock server which is based on shape complementary principle and keeping parameter value Clustering RMSD:4.0, Complex Type: Default 51,52 . The results obtained were refined using Fire Dock online server which rearranges the interface side chains and adjusts the relative orientation of the molecules 53 . The AVLE is in solutions number 1000. At last, we can obtain the AVLE of all phosphatase-MAPK interaction pairs (Table S2). The prediction framework is shown in Fig. 5.

Selection of threshold as a criterion.
To distinguish the interacting and non-interacting partners between BdMAPKs and BdPP2Cs, we need chose a reasonable value as a criterion. It has previously reported that phosphatase-MAPK interaction is only found in subgroup B PP2Cs for putative D-site with several kinds of MPKs, including MPK3, MPK4 or MPK6 35,38 . It is more important that these interactions with MPK3, MPK4 and MPK6 depends on D-site in PP2C, which have proved through experiments such as yeast two hybrid screen (Y2H), BiFC assays and so on. According to the principle that homologous genes have similar functions, we therefore analyzed the AVLE of the interaction between all the subgroup B PP2Cs (BdPP2C6, BdPP2C25,   (Table 3). For maximum assurance of the accuracy of the prediction results, e.g. the false positive rate, etc. we have chosen the max AVLE (that is 65.74) from them was used as a criterion which was distinguished the interacting and non-interacting partners between BdMAPKs and BdPP2Cs. Since there is a smaller AVLE of phosphatase-MPK interaction pairs that are likely to interact, then which has a greater AVLE that is of course more likely to interact.
Prediction of docking site in BdPP2Cs using hidden Markov model. The docking interactions are contributed to the efficiency of all the enzymatic reactions and the specificity of molecular recognition. In order to search for the docking site for BdMAPKs in BdPP2Cs, we aligned the full-sequence of BdPP2Cs with others that contains conserved D-site using hidden Markov model (HMM). A profile HMM architecture, composed of linked main, insert, and delete states was performed in the programming language Java to implement the computational analysis and prediction. An HMM model of length 19 was designed to be a minimal prescreen that simply checks for a basic residue followed after a spacer of 1-5 residues by a hydrophobic-X-hydrophobic (as defined above) 54 . D-finder is used to select suitable strings and then assigns a standard HMM Viterbi probability score. The code of D-finder is downloadable from http://dfinder.sourceforge.net as a.zip file that contains the Java files along with the original training set file, a sample testing file, and a README file.

Confirmation of docking motif and interolog based protein-protein interaction prediction. The docking-based and interolog method uses domain interaction information and sequence similarity
to infer the potential PPIs, respectively. If two proteins contain an interacting domain pair, it is expected that these two proteins may interact with each other. While if two proteins have interacting homolog in another organism such as Arabidopsis, it is also thought that these two proteins are conserved interaction 55 . Thus, interologs are homologous pairs of protein interactions across different organisms. To assess the PPIs of BdPP2Cs and BdMAPKs, protein 3D structures are used to check their probability of interaction from spatial conformation, especially docking site.
Validation of results against STRING database with our methods. To validate the feasibility and reliability of our methods for phosphatase-MAPK interaction, the appraisement about AVLE of PPIs that have confirmed with experiments in STRING database 56 (https://string-db.org/cgi/input.pl) were performed. Subsequently, we compared the AVLE of other species phosphatase-MAPK interaction, including Arabidopsis and yeast.