Millisecond dynamics of BTK reveal kinome-wide conformational plasticity within the apo kinase domain

Bruton tyrosine kinase (BTK) is a key enzyme in B-cell development whose improper regulation causes severe immunodeficiency diseases. Design of selective BTK therapeutics would benefit from improved, in-silico structural modeling of the kinase’s solution ensemble. However, this remains challenging due to the immense computational cost of sampling events on biological timescales. In this work, we combine multi-millisecond molecular dynamics (MD) simulations with Markov state models (MSMs) to report on the thermodynamics, kinetics, and accessible states of BTK’s kinase domain. Our conformational landscape links the active state to several inactive states, connected via a structurally diverse intermediate. Our calculations predict a kinome-wide conformational plasticity, and indicate the presence of several new potentially druggable BTK states. We further find that the population of these states and the kinetics of their inter-conversion are modulated by protonation of an aspartate residue, establishing the power of MD & MSMs in predicting effects of chemical perturbations.


Complete Methods:
Initial Setup and homology modeling: We downloaded all 23 publically available BTK pdbs from the protein databank 1 . We then used Modeller 2 to mutate out all the sequences to the human sequence (Uniprot ID Q06187) while building in missing loops and residues. We also used Modeller to mutate ASP539 to its protonated form ASH. Hydrogen atoms were added to both the deprotonated and protonated structures using the t-leap module contained with the Amber tools suite [3][4][5] . All the structures were solvated in a water box with 10 Å padding on all sides. Chloride and sodium ions were added to neutralize the charge and set up the final ionic concentration to 150mM. The Amber99sb-ildn 6 force field was used to model protein dynamics in conjunction with the TIP3P 7 water model. This process led to the creation of 23 starting crystallographic structures for each of the two systems.
Minimization: We minimized the system in two steps using Amber. In the first step, the protein was held fixed with 500 kcal/mol restraints while the water and counter ions were minimized for 1,000 steps using steepest gradient descent followed by 3,000 steps of conjugate gradient descent. We then released the restraints on the proteins and minimized the entire system for 2,000 steps using steepest gradient descent followed by another 2,000 steps of conjugate gradient descent. We then loaded the minimized Amber topology and input coordinates into OpenMM 8 for simulation setup and equilibration.
Equilibration: All production runs for regular molecular dynamics simulations were run at a pressure of 1 atm and temperature of 300 K. The Langevin integrator with a friction coefficient of 1/ps and a 2 fs time step were used. A Monte Carlo barostat with an interval of 25 frames was used to maintain the pressure at 1 atm. Frames were saved every 200 ps. Long-range electrostatics were dealt with using the Particle mesh Ewald 9 algorithm with a 10 Å cutoff for the production runs on Folding@home 10 . Hydrogen bonds were constrained using the LINCS 11 algorithm for the Folding@home production runs. All simulations were equilibrated for 1 ns before beginning production runs.
Production MD Simulations: For each of the 46 starting configurations, we started 35 simulations with randomized velocities sampled from a Boltzmann distribution. We also performed a round of adaptive sampling where we reseeded simulations from configurations sampled along high free energy paths for both systems. Overall, we collected 887 μs (1,065 trajectories with 395 ns median length) for the BTK-ASP simulations and 818 μs (1,075 trajectories with 400 ns median length) for BTK-ASH simulations. We then built a joint Markov model for the entire 1.7 ms of data. The supporting information contains more statistics for the individual production simulations.
Markov state model: Building a MSM requires identification of metastable kinetically similar states. This splitting of the phase space is followed by counting the transitions between those states as observed in our trajectories at a Markovian (memory free) lag time. This means that the probability of moving to the next state only depends upon the current state and nothing before it. The conditional independence means that we only require local equilibrium 12 within the state to be able to infer global properties, allowing us to parallelize sampling across hundreds of machines. This transition model can be summarized using the following equation: where is the probability distribution at time "t" while + is the probability distribution after a Markovian lagtime . Spectral decomposition of the MSM transition matrix was used to estimate the equilibrium populations and dynamical processes connecting those Markov states. The relaxation timescales for these dynamical processes can be obtained by using the following transformation on the associated eigenvalue = − The MSM framework avoids the collective variable problem of many enhanced sampling algorithms by only using unbiased simulations in the full state space of the protein. We note that all of the trajectories used or presented in the paper were obtained via regular MD sampling. While, this means that MSMs generally require more unbiased sampling, it has the advantage of NOT specifying CVs or having to worry about hidden degrees of freedom. In fact, thanks to the tICA 13 algorithm, our simulations tell(SI Figure 6-7) us what the important degrees of freedom are. The trick to getting MSM to work is to start from as many available crystal structures and to re-seed simulations from lowly populated states, both of which were employed in this paper.
After sampling the MD trajectories using Folding@home, a total of 2,140 trajectories were vectorized using the protein dihedrals and closest contacts distances. While previous modeling work on kinases with MSMs 14 used RMSD-based metrics, recent advances in dimensionality reduction techniques 13,15 now allow us to learn a kinetically relevant structure-based distance metric directly from the MD trajectories. We used the sine and cosine values of the > dihedrals for the protein. The trigonometric transform was necessary to account to dihedral periodicity within our algorithms. For the contacts, we used the closest heavy atom distance between all Nchoose-2 residues from the following list of residues : GLY393, TRP395, GLU396, ILE397, PRO399, This list was designed to explicitly include the following conserved kinase core residues Trp395, Glu396, Ile397, Lys430, Glu445, Asp539, Phe540, Gly541, His519, Arg520, Asp521, Arg544, Tyr551, Phe559, Pro560, and Val561. This subsampling was necessary to reduce the number of contacts to a computationally manageable subset. This feature selection led to each frame being represented as a feature vector of length 5,532. We preprocessed the data by setting the mean of each feature to 0 and variance to 1. We used the same feature selection scheme for both the BTK ASP and the BTK ASH datasets. We then reduced the dimensionality of the datasets using time structure independent component analysis (tICA) 13,15 . tICA seeks to find a set of linear combinations of features that de-correlate the slowest (at a certain lag time) while minimizing their correlation. This is done by solving the following generalized eigenvalue problem: = (3) where are the associated eigevectors (tICs), the eigenvalues, and is the covariance matrix is the time lagged correlation matrix whose ij element is defined as In equations 4 and 5, Ε … is the average/expectation over the entire ensemble. The aim of equation 3 is to find the slowest/most highly auto-correlated set of coordinates ( ) with in our dataset at a certain lag time. Here, is the tICA lagtime and can be different from Markovian lagtime. The tICA-transformed dataset was clustered using the K-means algorithm. We then used the cluster labeled dataset to build a MSM.
It is worth noting that we built a single tICA and K-Means model for the deprotonated and protonated ensembles. Given the state-space equivalence, this allows us to explicitly compare the thermodynamics and kinetics of these systems. We also note for all projections, the deprotonated (BTK ASP ) ensemble's highest populated state was assigned an absolute free energy of 0 kcal/mol and all other free energies were reported relative to that state. Based upon previous work 14,16 and the convergence of the implied timescales plot (Supporting figure 1) for 50-500 state models, we chose a Markovian lag time of 80 ns. For all the other hyper-parameters within the model, we turned to cross validation.

Hyperparameter cross validation:
Recently McGibbon et al. 17,18 formulated a cross-validation scheme for the selection of hyperparameters for Markov state models. The methodology requires maximization of the generalized matrix Rayleigh quotient (GMRQ) across training and testing sets. The maximization of the test set GMRQ leads to the finding of a rank-m projection that best captures the slow dynamics of the system. For Markov models, this has two tunable parameters: the value of m and the Markovian lag time. Based upon previous work and initial modeling, these were set to 5 and 80 ns, respectively. In order to quickly fit models, the trajectories were subsampled to every 8 ns. The GMRQ was used to optimize over the tICA lag time (8 ns-4 µs), number of tICA components (between 1 and 10), choice of kinetic mapping 19 (yes/no) and number of cluster centers (50-500). We used 5-fold shuffle split with a test size of half the data to ensure that the method doesn't overfit to the training set. After randomly sampling several thousand models from this hyperparameter space, we picked the highest scoring model with GMRQ score of 5.85 (out of a theoretical maximum of 6 with an unknown true upper bound). The parameters for the best model are given below:

Hyper parameter
Value in best model tICA lagtime 208 ns tICA number of components 3 tICA kinetic mapping True Number of clusters 190 Table 1: GMRQ scored hyper parameters for the best model.
After we determined the optimal model given the current amount of sampling, we retrained the model on the entire set of trajectories. For the reported tICA model, we used a sparse variant of tICA 20 for increased interpretability (Supporting Figure 6-7).
The Markov transition matrix was fit via maximum likelihood estimation (MLE) with reversibility and ergodicity constraints. This procedure discarded no data for both ensembles, indicating converged sampling. To obtain error bars for the equilibrium populations, 200 rounds of bootstrapping were performed over the original set of trajectories. We performed standard bootstrapping (N=200) where for each bootstrap sample, we randomly sampled T (T=1065 for BTK-ASP, 1075 for BTK-ASH) trajectories, with re-replacement, from our ensemble. We explicitly kept the final micro-state definitions to allow for direct comparisons. We then fit a new MSM to this perturbed dataset. Due to this sampling-with-replacement, each bootstrap sample produced a slightly different MSM and thus different estimates for both the thermodynamics and kinetics, allowing us to measure the uncertainty within our model (Supporting Figure 2). All populations and free energies reported are from the MLE populations with the 95% confidence interval for the free energies coming from the 200 rounds of bootstrap sampling. The bootstrapping was used to find empirical estimates for the standard deviations of equilibrium population of a state which were then used to obtain the upper and lower bound for the populations using where is the MLE estimate for state's population while is the boostrap estimate for the standard deviation of the population. Repeating this analysis for all 190 states gives us the error estimate in the population, and thus the error in the estimated free energy, for each of those states. The mean and errors can then be propagated into projections (Supporting figure 12) for further analysis.
The trajectories were featurized and analyzed using the MDTraj 21 package while tICA dimensionality reduction and Markov modeling were performed using MSMBuilder 22 . Most of the analysis was performed within the IPython/Jupyter scientific environment 23 with extensive use of the matplotlib 24 , and scikit-learn libraries 25 . All protein images were generated using visual molecular dynamics (VMD) 26 , all protein surfaces were rendered using SURF 27 ,and secondary structure was assigned using STRIDE 28 as implemented in VMD.

Model interpretation:
The models were primarily analyzed using techniques laid out in previous papers 15,29 . To further query the model, we sampled a 800 µs long kinetic Monte Carlo trajectory (10,000 frames at a lagtime of 80 ns) from the Markovian transition matrix.  Table 2: Summary of motif positioning for active, Src-like and DFG out kinase states. All, except the active, states are catalytically inactive. See Figure 8 below for more details.        kl to be <4 Å 1 st tIC value <0, and 2 nd tIC value > 2 . For the Src-like states, we picked all states whose centroids had a tIC value <1 in the first dimension, less than 0 in the 2 nd dimension, and had an A-loop RMSD to the folded structure be <3 Å. For the DFG out state, we picked all states whose 1 st tIC value was higher than 2.     Supplementary Note 1: While the Möbitz 30 classification scheme serves as an excellent comparison to our current simulation set, we chose not to explicitly cluster only along those coordinates. Instead we chose to project our trajectories unto several thousand randomly selected contacts and all side chain dihedrals and letting the tICA 13 algorithm identify the slowest coordinates for us. This is done to explicitly reduce the chance of accidently biasing our final MSM. However, our tICA model was able to pick up both the DFG in to DFG out coordinate and C-helix out to C-helix in as the two slowest  (Figure 7 above) as being highly correlated with activation, a metric that the Möbitz 30 scheme fails to account for. Therefore, all results presented in the paper explicitly use the tICA model with the state definitions outlined in Supplementary Figure 8 (above). However, we provide a classification of the starting pdbs in Table 2 (below) and representative BTK-MD configurations for several of the Möbitz 30 states in Figure 16 (below).   Supporting Note 2: DFG-ASP539 undergoes a large-scale conformational change from a mostly hydrophilic environment to a mostly hydrophobic environment (Figure 19 below) when going from the DFG in to the DFG out state. Previous experimental and computational 31,32 work suggests that the DFG-Aspartate residue might be protonated in the DFG out state to reduce the free-energy cost of burying a charged residue inside the protein's core. To ascertain the likelihood of ASP539 protonation, we used the H++ server 33 to calculate the pKa of DFG-ASP539 in the only reported DFG out crystal structure 34 3PJ3. To that end, we removed the protein (heavy atom only) from 3PJ3 and built the missing residues using Modeller. Based upon the choice of the internal dielectric, the web service estimated the pKa from 5.9 (internal dielectric 10, external dielectric 80, pH 7.2) to 9.6 (internal dielectric 4, external dielectric 80, pH 7). This is several pH units away from the reported solution pKa of ~4, indicating ASP539's proclivity for being protonated in the DFG out state. Other titratable residues such as His519 or Asp521 had pKa<4 in several differing starting configurations indicating that the residues are unlikely to protonate at neutral pH of 7.2. We therefore left those residues in their de-protonated states. This is not too surprising because the local environment around these residues is mostly solvent exposed in both the DFG in and DFG out states. Figure 19: DFG-Asp539 goes from a solvent exposed environment to mostly hydrophobic environment upon DFG flip. Left, DFG-Asp539 is solvated by several water molecules in the Srclike inactive DFG in state. Right, protonated DFG-Asp539 is surrounded by several hydrophobic valine and methionine residues. In both cases, we used VMD's atom selection language to find all water and protein residues within 3 A of Asp539.  Figure 20: Projection of 1.8ms (20% of data obtained by subsampling to 1ns) of BTK-ASP and BTK-ASH datasets unto three dominant tICs, shows that we have significant number of partial and complete crossovers along each coordinate. Note that the MSM method doesn't require any single trajectory to completely cross over a coordinate but can use the information from partial crossovers to gain global equilibrium statistics. Each color corresponds to a different trajectory. The panels on the right trace the trajectories as a function of simulation time. Figure 21: Convergence of MSMs as a function of trajectory length shows that the MSM thermodynamics and kinetics are robust. For both ensembles, we restricted each individual trajectory's state assignments from 50 to 95% of the its final length. We next re-computed the MSM for the sub-sampled data and compared the population (via L-2 norm), longest timescales, and projections along dominant tICs to the complete data solution(gold). As it can be seen, our models converge to within 0.5kcal/mol of their final estimates even at 50% of the overall data.