Improved fragment-based protein structure prediction by redesign of search heuristics

Difficulty in sampling large and complex conformational spaces remains a key limitation in fragment-based de novo prediction of protein structure. Our previous work has shown that even for small-to-medium-sized proteins, some current methods inadequately sample alternative structures. We have developed two new conformational sampling techniques, one employing a bilevel optimisation framework and the other employing iterated local search. We combine strategies of forced structural perturbation (where some fragment insertions are accepted regardless of their impact on scores) and greedy local optimisation, allowing greater exploration of the available conformational space. Comparisons against the Rosetta Abinitio method indicate that our protocols more frequently generate native-like predictions for many targets, even following the low-resolution phase, using a given set of fragment libraries. By contrasting results across two different fragment sets, we show that our methods are able to better take advantage of high-quality fragments. These improvements can also translate into more reliable identification of near-native structures in a simple clustering-based model selection procedure. We show that when fragment libraries are sufficiently well-constructed, improved breadth of exploration within runs improves prediction accuracy. Our results also suggest that in benchmarking scenarios, a total exclusion of fragments drawn from homologous templates can make performance differences between methods appear less pronounced.

In Rosetta, the product kT is specified as a single parameter whose values are scaled according to typical score values seen. We vary this term using a simple simulated annealing scheme (Kirkpatrick et al., 1983), whereby the Metropolis temperature parameter is gradually decreased from an initial high value. We used this simulated annealing procedure for the AcceptanceCriterion step with a view to encouraging more disruptive moves in the early parts of each low-resolution stage employing our sampling framework. We used initial and final values of 10 and 2 kT units, respectively, for the temperature parameter. The initial value of 10 was chosen based on the analysis by Bowman and Pande (2009), who used a simulated tempering method for conformational sampling together with the low-resolution scoring functions in Rosetta. Their analysis indicated that Rosetta temperature settings of 10-20 kT units were sufficient to realise a high degree of exploration for small proteins. We chose the lower end of this range (10 kT units) to prevent the search from devolving into a purely random search. The final value of 2 kT units corresponds to the standard setting in Rosetta, which leads to relatively conservative exploration. The temperature associated with any given AcceptanceCriterion step is a function of the initial and final temperature settings, as well as the predetermined total number of score function evaluations in the current low-resolution stage. The value of the temperature parameter is reset at the beginning of each low-resolution stage employing bilevel optimisation. The temperature value T_t at a given score function evaluation number t is given by an exponential cooling schedule:

T_t = T_i (T_f / T_i)^(t/n)

where n is the total number of score function evaluations in the current stage (n ≥ t), T_i is the initial temperature value, and T_f is the final temperature value. Thus, the annealing proceeds by exponential decay until the end of a given stage, when t = n and the value of the temperature parameter T_t equals T_f.
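This exponential schedule (decaying from T_i to T_f so that the temperature equals T_f exactly at t = n) can be sketched as follows; the function name and defaults are illustrative, not taken from our implementation:

```python
def annealing_temperature(t, n, T_i=10.0, T_f=2.0):
    """Exponential cooling: T_t = T_i * (T_f / T_i) ** (t / n).

    t   -- current score function evaluation number (0 <= t <= n)
    n   -- total score function evaluations in the current stage
    T_i -- initial temperature (kT units)
    T_f -- final temperature (kT units)
    """
    return T_i * (T_f / T_i) ** (t / n)
```

At t = 0 this returns T_i, and at t = n it returns T_f, with a smooth exponential decay in between.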
One potential problem with methods based on temperature control (such as simulated annealing) is that different proteins may require tailored balances between exploration and exploitation, and some parameter tuning may be necessary to achieve optimal results in individual cases. For example, QUARK (Xu and Zhang, 2012) sets run length and temperature settings for its replica exchange procedure as a function of target length. Similar strategies have been employed in FRAGFOLD (Jones et al., 2005; Kosciolek and Jones, 2014). Parameter adaptation is an area that can be investigated in future work.
We will now discuss the details of how the Perturbation and LocalSearch steps are implemented for each of our protocols.

S1.1.1 Implementation for the bilevel protocol
In the bilevel protocol, the Perturbation operator comprises a single fragment insertion in a segment of the protein structure containing loop residues, while the LocalSearch operator repeatedly alters non-loop residues. The LocalSearch operator forms the "lower-level" optimiser, while the "upper-level" optimiser can be seen as all components in lines 2-10 of Algorithm 1 in the main text, leaving out the LocalSearch step. The lower-level optimiser operates within the constraints set up by the upper-level optimiser. The constraints here correspond to the values of structural parameters (torsion angles) set by the Perturbation step in the loop residues. The AcceptanceCriterion step corresponds to the acceptance criterion of the upper-level optimiser. Thus, following the usual approach taken in bilevel optimisation, a solution can be accepted by the upper-level optimiser only when it is an optimum in the lower-level optimisation problem.
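The division of labour between the two levels can be illustrated with a toy sketch. Here a "structure" is simply a list of torsion-angle-like values and the score is a stand-in function; all names (perturb, local_search, bilevel_step) are hypothetical and do not correspond to our actual operators, which act via fragment insertions:

```python
import math
import random

def perturb(x, loop_idx, rng):
    """Upper-level move: randomise the 'loop' coordinates (stands in for a
    single fragment insertion restricted to loop residues)."""
    y = list(x)
    for i in loop_idx:
        y[i] = rng.uniform(-180.0, 180.0)
    return y

def local_search(x, free_idx, score, rng, steps=50):
    """Lower-level greedy optimiser: may alter only the non-loop coordinates,
    i.e. it works within the constraints fixed by the Perturbation step."""
    best = list(x)
    for _ in range(steps):
        cand = list(best)
        i = rng.choice(free_idx)
        cand[i] += rng.uniform(-10.0, 10.0)
        if score(cand) < score(best):   # greedy: keep only improvements
            best = cand
    return best

def bilevel_step(x, score, loop_idx, free_idx, T, rng):
    """One upper-level iteration: Perturbation -> LocalSearch ->
    AcceptanceCriterion (Metropolis test at temperature T)."""
    cand = local_search(perturb(x, loop_idx, rng), free_idx, score, rng)
    delta = score(cand) - score(x)
    if delta <= 0 or rng.random() < math.exp(-delta / T):
        return cand   # accepted local minimum (LMin)
    return x          # rejected; keep the current structure
```

The key property mirrored here is that only solutions that are (approximate) optima of the lower-level problem ever reach the upper-level acceptance test.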
The bilevel protocol depends on prior knowledge of which residues are involved in loop regions. Given a native structure, secondary structure (SS) is typically assigned using a classification algorithm such as DSSP (Kabsch and Sander, 1983). However, for the bilevel protocol to be useful in blind prediction, we must rely on a SS prediction program to provide these assignments. For proof of principle, using a single SS prediction allows us to rigidly define the boundaries of SS elements and loops, and allows us to identify cases where inaccurate SS prediction may be an important factor affecting predictive ability. We used PSIPRED (Jones, 1999) to provide the bilevel protocol with SS assignments, since PSIPRED is also the sole SS prediction used to inform the choice of fragments available in the fragment set in each insertion window (Kandathil et al., 2016). The fragment picking process can alternatively be configured to use a quota system employing three prediction methods (Gront et al., 2011). The use of a consensus prediction from independent SS prediction programs, and/or frameworks that take the probabilities output by these methods into account, are possibilities for future development. Table S1 summarises the rules we defined for whether and how to apply a proposed fragment insertion in the Perturbation step of the bilevel protocol, depending on the arrangement of loops and SS elements in the current insertion window. These rules were defined so as to direct the search towards altering an appreciable number of loop residues in any single Perturbation step; alternative sets of rules consistent with the bilevel framework can be devised. Terminal loops are excluded from the set of loops that can be altered by the Perturbation steps.
This is done to prevent a large number of moves being spent altering these regions; we identified this behaviour as one potential cause of poor sampling performance in our previous work (Kandathil et al., 2016). Currently, we implement the SS-dependent move framework as a "filtering" step, applied after an insertion window has been chosen at random along the chain. In other words, once a fragment insertion has been proposed in the Perturbation step for a given insertion window, we evaluate the PSIPRED prediction for this window to determine whether the fragment should be applied, according to the rules in Table S1. The LocalSearch steps are only allowed to change non-loop residues in any insertion window. The use of these rules leads to effective fragment insertions with lengths less than or equal to the length of the fragments being used. It has been suggested that such moves can provide advantages in fragment-based structure prediction (Handl et al., 2012).
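The filtering step can be sketched in a few lines. Only the one rule from Table S1 that is reproduced in this text (reject a move whose window contains no loop residues) is encoded here; the function name is hypothetical, and a 3-state SS string with H/E/C symbols is assumed:

```python
import random

def propose_insertion(ss_pred, frag_len, rng):
    """Choose a random insertion window along the chain, then filter the
    proposed move against the predicted SS for that window.

    ss_pred  -- 3-state SS prediction string ('H' helix, 'E' strand, 'C' loop)
    frag_len -- fragment length in residues (9 in our runs)
    Returns the accepted window start index, or None if the move is rejected.
    """
    start = rng.randrange(len(ss_pred) - frag_len + 1)
    window = ss_pred[start:start + frag_len]
    if 'C' not in window:      # no loop residues in insert -> reject move
        return None
    return start               # window accepted; apply the fragment here
```

The full rule set additionally excludes terminal loops and distinguishes between different arrangements of loops and SS elements within the window.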

S1.1.2 Implementation for ILS protocol
The iterated local search (ILS) protocol can be seen as a modified version of the bilevel protocol, in which both the Perturbation and LocalSearch steps are allowed to alter any part of the chain. In the ILS protocol, the changes made to the structure by the Perturbation steps are not enforced as a constraint. In other words, the LocalSearch operator is allowed to affect residues that were altered by the Perturbation step, thus departing from the bilevel framework, in which this is not allowed. All move sizes in the ILS protocol are equal to the fragment length (9 residues), and each run of the ILS protocol uses the same total number of scoring function evaluations as a run of the bilevel protocol. All other aspects, such as the simulated annealing framework and its associated parameters, are kept identical.

Table S1: Rules for applying a proposed fragment insertion in the Perturbation steps, depending on the predicted local three-state secondary structure (SS) of the target protein. In the first column, the extents of the proposed fragment insertion are shown above a depiction of local secondary structure. Coloured boxes represent secondary structure elements such as α helices and β strands; these are joined by line segments representing loops. The remaining two columns describe each condition and the rule for accepting or rejecting the proposed move.

Illustration    Description                       Accept or reject?
[figure]        No loop residues in insert        Reject move
</gr-replace>
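The ILS variant described in S1.1.2 can be sketched in the same toy representation as the bilevel example (a list of torsion-angle-like values with a stand-in score); the forced perturbation is always applied, and the subsequent greedy local search may touch any coordinate, including those just perturbed. All names and parameter values here are illustrative:

```python
import math
import random

def iterated_local_search(score, n_coords, n_iters, T=2.0, frag_len=9, seed=0):
    """Toy ILS: a forced perturbation over a random window of frag_len
    coordinates, a greedy local search over all coordinates, then a
    Metropolis acceptance test at temperature T."""
    rng = random.Random(seed)
    x = [rng.uniform(-180.0, 180.0) for _ in range(n_coords)]
    for _ in range(n_iters):
        cand = list(x)
        # Perturbation: always applied, window of fragment length
        start = rng.randrange(n_coords - frag_len + 1)
        for i in range(start, start + frag_len):
            cand[i] = rng.uniform(-180.0, 180.0)
        # LocalSearch: greedy, unconstrained (may undo the perturbation)
        for _ in range(50):
            trial = list(cand)
            i = rng.randrange(n_coords)
            trial[i] += rng.uniform(-10.0, 10.0)
            if score(trial) < score(cand):
                cand = trial
        # AcceptanceCriterion: Metropolis test
        delta = score(cand) - score(x)
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            x = cand
    return x
```

The only structural difference from the bilevel sketch is that no coordinates are held fixed during the local search.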

S1.2 Solution archiving
For both protocols, we keep and update a collection or archive of the best LMin structures encountered as the search proceeds (UpdateArchive in Algorithm 1 in the main text). Currently, we define the "best" LMins as those with the lowest Rosetta scores in the current stage. Once stored in this archive, solutions are not used to guide the search in any way, although this is another area that can be explored in future work. The purpose of storing a set of structures that meet some criteria is to retain multiple local minima that are potentially diverse in structure. This could be particularly useful in more difficult prediction scenarios, which are characterised by an energy landscape with many distinct basins of nearly equal energy. Although storing low-energy solutions does not guarantee that a structurally diverse set of solutions will be captured, we implemented a framework that allows us to employ different criteria to define the set of solutions that should be retained. The desired size of the archive is set in advance, and every LMin encountered during the search is considered for addition to the archive. We currently save the 10 best LMin structures encountered. The first LMins encountered in stage 2 are added to the archive to create the initial solution set. Subsequently encountered LMin structures are added to the archive if their score is lower than that of the highest-scoring LMin currently present in the archive. In this way, the archive is constantly updated to contain a set number of the lowest-scoring structures in any stage. Whenever the scoring function is changed during the search (e.g. when transitioning between low-resolution stages), all solutions in the archive are re-evaluated with the new scoring function, ensuring that any subsequent comparison between the archive and a new LMin is based on the same scoring function.
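The archiving logic described above can be sketched as a small bounded container. The class and method names are illustrative (our implementation is not in Python), and structures are treated as opaque objects scored by a supplied function:

```python
class SolutionArchive:
    """Bounded archive of the lowest-scoring local minima (LMins).

    Solutions are stored as (score, structure) pairs; when the scoring
    function changes between stages, rescore() re-evaluates all entries so
    that later comparisons use a consistent scoring function.
    """
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.entries = []            # list of (score, structure)

    def consider(self, structure, score_fn):
        """Offer a new LMin to the archive."""
        s = score_fn(structure)
        if len(self.entries) < self.capacity:
            self.entries.append((s, structure))   # fill the initial set
        elif s < max(e[0] for e in self.entries):
            # replace the highest-scoring (worst) archived LMin
            worst = max(range(len(self.entries)),
                        key=lambda i: self.entries[i][0])
            self.entries[worst] = (s, structure)

    def rescore(self, score_fn):
        """Re-evaluate all archived solutions under a new scoring function."""
        self.entries = [(score_fn(x), x) for _, x in self.entries]
```

A priority queue would make replacement O(log k) rather than O(k), but for an archive of 10 entries the simple scan is adequate.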
S3 Results following all-atom refinement and clustering

Figure S1 shows distributions of Cα RMSD from the native for a total of 50,000 all-atom decoys generated per target and sampling protocol. The sampling protocols in this data made use of the newer fragment sets. These distributions are highly similar to those seen pre-refinement (red distributions in Figure S2), suggesting that there is no widespread deterioration in decoy quality following refinement.

Figure S1: Kernel density plots of the distribution of Cα RMSD from the native (Ångströms, x-axis), for Rosetta (gold), the bilevel (magenta) and ILS protocol (blue), following all-atom refinement. PDB codes, SCOP secondary structure class and length in amino acid residues (aa) are given in the title of each plot. Targets are ordered by PDB identifier. Each distribution uses RMSD data from 50,000 decoys. Vertical lines represent the five-number summary (minimum, first quartile, median, third quartile and maximum) of each distribution (figure continued on next page). Figure S1 (continued from previous page).

S4 Comparison of predictive accuracy achieved by three protocols using different fragment sets

Figure S2 shows the influence of fragment library quality on RMSD distributions, comparing Rosetta and our protocols, each using either the older or newer fragment sets for a given target. For the bilevel protocol, each run made use of the SS prediction generated with the respective fragment set. The red distributions of RMSD in this figure are generated using the same data as Figure 3 in the main text.

S5 The ILS protocol is more robust to inaccurate secondary structure predictions than the bilevel protocol

The key difference between the bilevel and ILS protocols is that the bilevel protocol makes use of predicted secondary structure (SS) information to inform how fragment insertions are carried out, whereas the ILS protocol does not use such information. Typically, both of our protocols show similar performance in terms of predictive accuracy distributions. However, from Figure S2, it can be seen that when the older fragment set for the target 1c8cA is used, the ILS protocol realises a much more favourable distribution of predictive accuracy than the bilevel protocol, in contrast to the usually comparable results achieved by the two protocols.
The target 1c8cA has poorly-predicted SS assignments in both the older and newer fragment sets (Figure S3, top row). Since the bilevel protocol relies on these SS assignments, we hypothesised that the poor SS prediction could be a reason for the reduced performance realised by the bilevel protocol for this target with the older fragment set. We therefore ran another round of predictions using the "true" SS assignment for 1c8cA instead of the PSIPRED prediction. The true assignment was derived from the 8-state DSSP assignment for the native structure, which was converted to a 3-state assignment using the same procedure used to generate training data for PSIPRED, thus simulating a 100% accurate PSIPRED assignment. The results of this set of prediction runs are shown in the middle row of Figure S3. When the correct SS assignment is used, the bilevel protocol more closely matches the performance of the ILS protocol, in terms of both RMSD and score values. Thus, poor SS prediction can limit the predictive performance of the bilevel protocol. Because the ILS protocol does not make use of predicted SS information during the search, it is more robust to inaccuracies in SS prediction. However, since SS predictions are utilised during fragment picking, the ILS protocol may not perform well in a scenario where inaccurate SS prediction strongly biases a fragment set away from native-like structures. Inaccurate SS prediction does not automatically result in fragment sets incompatible with near-native structures, as is evident from the case of the target 1c8cA discussed here. In Rosetta, SS predictions are just one factor taken into account when choosing fragments. Local sequence and structure profiles tend to have a stronger influence on the choice of fragments in any window, and so fragment sets can be compatible with near-native structures even if the predicted SS is incorrect.
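For reference, an 8-to-3-state reduction of a DSSP string can be written as a one-line mapping. The exact reduction used to build PSIPRED's training data may differ in detail; the sketch below uses the widely adopted convention (H, G and I map to helix; E and B to strand; everything else to coil), and the names are illustrative:

```python
# Common 8-state -> 3-state DSSP reduction (one convention among several).
DSSP8_TO_3 = {'H': 'H', 'G': 'H', 'I': 'H',   # helix types -> H
              'E': 'E', 'B': 'E'}             # strand types -> E

def dssp8_to_3(ss8):
    """Map an 8-state DSSP string to 3 states; unlisted codes become coil."""
    return ''.join(DSSP8_TO_3.get(c, 'C') for c in ss8)
```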
It may be possible to improve robustness to poor SS prediction in the bilevel protocol by evaluating consensus between predictions from multiple SS prediction methods, and possibly utilising confidence scores reported by these methods.

Figure S2: Kernel density plots of Cα RMSD from the native structure (Å) for all 59 targets, comparing decoy sets generated using the older and newer fragment libraries (blue and red distributions, respectively), for Rosetta, the bilevel protocol and the ILS protocol (left, centre and right panels, respectively). Each distribution comprises data from 1000 low-resolution decoys sampled using identical scoring function setups. Vertical lines represent the five-number summary of each distribution (figure continued on next two pages).

Figure S3: Effects of fragment library and SS prediction on predictive performance on target 1c8cA. The left and right columns compare data generated using the older and newer fragment sets, respectively. Top row: cartoon representation of the native structure, coloured according to PSIPRED SS assignments derived while generating the respective fragment set. Colours represent regions predicted as α helices (blue), β strands (orange) and loops (green). Middle row: plots of Rosetta score3 value in Rosetta Energy Units (REU) versus Cα RMSD from the native, comparing the performance of the bilevel protocol when using old and new fragments, and when given PSIPRED or DSSP SS assignments. Distributions of score and RMSD values are also shown as kernel density 'beanplots' with five-number summaries marked. In the latter case, the bilevel protocol is able to locate native-like decoys much more reliably with the old fragment set, almost matching the ILS protocol in performance. The same improvement is not seen when the newer fragments are used, suggesting that the newer fragment set is less enriched for native-like states. Bottom row: score versus Cα RMSD plots for Rosetta and the ILS protocol; these protocols do not use SS information during prediction. Each dataset for each protocol comprises 1000 low-resolution decoys.
NB: Portions of the bottom two figures appear in a paper submitted to the 2017 Genetic and Evolutionary Computation Conference Companion (Handl et al., 2017).