An Improved Free Energy Perturbation FEP+ Sampling Protocol for Flexible Ligand-Binding Domains

Recent improvements to the free energy perturbation (FEP) calculations, especially FEP+ , established their utility for pharmaceutical lead optimization. Herein, we propose a modified version of the FEP/REST (i.e., replica exchange with solute tempering) sampling protocol, based on detail studies on several targets by probing a large number of perturbations with different sampling schemes. Improved FEP+ binding affinity predictions for regular flexible-loop motions and considerable structural changes can be obtained by extending the prior to REST (pre-REST) sampling time from 0.24 ns/λ to 5 ns/λ and 2 × 10 ns/λ, respectively. With this new protocol, much more precise ∆∆G values of the individual perturbations, including the sign of the transformations and decreased error were obtained. We extended the REST simulations from 5 ns to 8 ns to achieve reasonable free energy convergence. Implementing REST to the entire ligand as opposed to solely the perturbed region, and also some important flexible protein residues (pREST region) in the ligand binding domain (LBD) has considerably improved the FEP+ results in most of the studied cases. Preliminary molecular dynamics (MD) runs were useful for establishing the correct binding mode of the compounds and thus precise alignment for FEP+ . Our improved protocol may further increase the FEP+ accuracy.


Motivation and Protocol Development
An initial motivation for this study came from our recent work in which we faced significant problems achieving accurate FEP+ results using MD-derived structures. We thus decided to find the best FEP+ sampling protocol that can provide us reasonable free energy predictions [1]. This protocol was developed based on a series of perfluoroalkyl acid (PFAA) ligands that bind to the PPARγ receptor, and in particular used the Perfluoroundecanoic acid (PFuDA)-PPARγ MD-derived structures. Initially, we used an averaged PFuDA-PPAR MD structure obtained by 2×150 ns MD simulation runs in Amber 16 software (see Methods) as a baseline for our FEP+ calculations using six ligands and nine perturbations. Using this MD-derived structure and the default FEP+ protocol provided exceptionally poor results even though the structure reasonably resembled the Decanoic acid (DA)-PPARγ resolved X-ray complex. We obtained an RM SE of more than 3 kcal/mol, errors of 8 kcal/mol, and almost all of the perturbations showed a deviance of approximately 2 kcal/mol compared to experimental data (see Figure S8 in ref [1]).
To find the optimal protocol for our set of ligands for PPAR we ran 2, 5, 10, and 2×10 ns/λ pre-REST simulations, in the context of FEP+, instead of the conventional 0.24 ns/λ. We carefully monitored the energy convergence and used 2, 5, 10, 20, and 50 ns/λ long REST simulations during our tests. System convergence in a solute was also monitored. We paid special attention to transformations that had a high (2-3 kcal/mol) free energy difference (∆∆G) and lower structural similarity. The results from standard FEP+ were used for reference. Execution of these 20 combinations, which were 180 perturbations in total, lead us to conclude that for an averaged MD structure a longer pre-REST simulation time of 2× 10 ns is optimal for obtaining reasonable results (RM SE = 1.86, M U E = 1.51, R 2 = 0.97). All perturbations featured an improvement via our 2×10 ns sampling protocol (Figures S1 and S8 in ref [1], Table S1 (attached excel file)). The free energy converged after 5-8 ns and no further significant improvement after durations of up to 50 ns was observed (data not shown). We also investigated the output trajectories. After execution of two independent 10 ns simulations the PFuDA ligand adopted a conformation which was closer to that of DA, and the ligand-protein contacts more resembled those in the DA X-ray structure compared to the starting structure Figure S2. As expected, the PFuDA ligand adopted a planar conformation. For the remaining sampling schemes, there were either no reasonable predictions of less than 2.0 kcal/mol or more than one of the simulations was poorly converged. It should be noted that in order to save computational time we did not complete all of these simulations, and for some of them we monitored only the perturbation of DA to PFuDA, Perfluorooctanoic acid (PFOA) and Perfluorohexanoic acid (PFHxA) because the standard FEP+ protocol provided the most significant errors mainly for these transformations. Thus, if we observed data similar to the default FEP+ sampling protocol error, we ended the calculations (data not shown). However, completion of only these perturbations provided us significant information for our protocol development because the main source of the error (RM SE and M U E) was generated by these transformations.
Another critical decision in FEP+ calculations based on the MD-derived structures is whether an averaged structure or those from clustering to be used as a starting system. Structures obtained both via averaged MD and clustering are common in the literature. In our case the preliminary conventional (cMD) and accelerated (aMD) [2] results clearly demonstrated the presence of multiple binding modes [1].Therefore, the most probable (most frequently present) binding mode can be more accurately obtained by cluster analysis. Moreover, the averaged and minimized structure may eventually "trap" the system into a deeper local energy minimum which presumably explains the requirement for longer pre-REST simulations. We did not perform separate FEP+ analyses for each binding mode as researchers previously suggested [3] because this is beyond the scope of the current study and consider increased pre-REST sampling as an alternative approach for analyzing considerable ligand-protein interactions. Thus, we repeated the aforementioned procedure with the same combination of simulation times using the PFuDA-PPARγ complex in a low-energy minimum (where the PFuDA conformation more closely resembles those of DA) obtained by clustering. A protocol using 5 ns pre-REST sampling and 8 ns REST simulation lead to greater improvement (RM SE = 1.23, R 2 = 0.9; ( Figure 7A and 7B in ref [1]).
Based on these results we suspect that the long (2×10 ns) and short (5 ns) pre-REST samplings perform different functions and have different applications. The longer simulation protocol is more suitable to systems when major conformational changes are expected (as in PFuDA average structure), whereas the shorter simulation protocol is more suitable for either lesser conformational changes or regular systems (ligands that more closely resemble the Xray structure). In addition to the exception of the aforementioned complexity (large and flexible LBD), PPARγ is also a difficult target for some ligand sets for several reasons. For instance, dimerization in the nuclear receptors is an important factor for LBD conformation, in particular helix 12 [1] [4][5] [6]. In the native state PPAR dimerizes with the retinoid X receptor (R×R) but forms a homodimer in crystallographic experiments. The dimerization interface is situated at helices 6, 7, and 11, which surrounds and modulates the PFuDA binding site, rendering MD simulations in the monomer state less realistic for long-chain ligands that bind to the same cavity. In addition, during the our preliminary MD simulations the transformation of PFuDA from an initial orientation, as obtained via a docking procedure in the Rosi-PPARγ X-ray structure, to those similar to DA introduced considerable changes in the conformation of residues in the LBD [1]. This also affected the mobility of small-chain ligands, their realistic ligand-receptor interactions, and their predicted ∆G values as per FEP+.
Nonetheless, even in this case we clearly showed that more-intensive equilibration per lambda during the pre-REST step considerably improved the ∆∆G values for all of the perturbations. The correct ranking of all ligands (6/6), the good correlation coefficient (R 2 = 0.9), and an RM SE error of 1.2 kcal/mol are reasonable results and may be helpful during the lead optimization process. Thus we decided to test our sampling protocol on several other systems and compare results to the default FEP+ protocol.