Multisecond ligand dissociation dynamics from atomistic simulations

Coarse-graining of fully atomistic molecular dynamics simulations is a long-standing goal in order to allow the description of processes occurring on biologically relevant timescales. For example, the prediction of pathways, rates and rate-limiting steps in protein-ligand unbinding is crucial for modern drug discovery. To achieve the enhanced sampling, we perform dissipation-corrected targeted molecular dynamics simulations, which yield free energy and friction profiles of molecular processes under consideration. Subsequently, we use these fields to perform temperature-boosted Langevin simulations which account for the desired kinetics occurring on multisecond timescales and beyond. Adopting the dissociation of solvated sodium chloride, trypsin-benzamidine and Hsp90-inhibitor protein-ligand complexes as test problems, we reproduce rates from molecular dynamics simulation and experiments within a factor of 2–20, and dissociation constants within a factor of 1–4. Analysis of friction profiles reveals that binding and unbinding dynamics are mediated by changes of the surrounding hydration shells in all investigated systems.


I. INTRODUCTION
Classical molecular dynamics (MD) simulations in principle allow us to describe biomolecular processes in atomistic detail 1 .Prime examples include the study of protein complex formation 2 and protein-ligand binding and unbinding 3,4 , which constitute key steps in biomolecular function.Apart from structural analysis, the prediction of kinetic properties has recently become of interest, since optimized ligand binding and unbinding kinetics have been linked to better drug efficacies [5][6][7] .Since these processes typically occur on timescales from milliseconds to hours, however, they are out of reach for unbiased allatom MD simulations which currently reach microsecond timescales.To account for rare biomolecular processes, a number of enhanced sampling techniques [8][9][10][11][12][13][14][15][16] have been proposed.These approaches all entail the application of a bias to the system in order to enforce motion along a usually one-dimensional reaction coordinate x, such as the protein-ligand distance.
While the majority of the above methods focuses on the calculation of the free energy profile ∆G(x), several approaches have recently been suggested that combine enhanced sampling with a reconstruction of the dynamics of the process [17][18][19] .In this vein, we recently proposed dissipation-corrected targeted MD (dcTMD), which exerts a pulling force on the system along reaction coordinate x via a moving distance constraint 20 .By combining a Langevin equation analysis with a cumulant expansion of Jarzynski's equality 21 .dcTMD yields both ∆G(x) and the friction field Γ(x).The latter reflects interactions of a) Electronic mail: steffen.wolf@physik.uni-freiburg.deb) Electronic mail: stock@physik.uni-freiburg.de.
the system with degrees of freedom orthogonal to those which define the free energy.In this work, we go one step further and use ∆G(x) and Γ(x) to run Langevin simulations, which describe the coarse-grained dynamics along the reaction coordinate and reveal timescales and mechanisms of the considered process.Moreover, we introduce the concept of "temperature boosting" of the Langevin equation, which allows us -without further approximations-to speed up the calculations by several orders of magnitude.

II. THEORY A. Dissipation-corrected targeted molecular dynamics
To set the stage, we briefly review the working equations of dcTMD derived in Ref. 20. TMD as developed by Schlitter 22 uses a constraint force f c that results in a moving distance constraint x = x 0 + v c t with a constant velocity v c .The main assumption underlying dcTMD is that this nonequilibrium process can be described by a memory-free Langevin equation 1 , which contains the Newtonian force −dG/dx, the friction force −Γ(x) ẋ, as well as a stochastic force with white noise ξ(t), that is assumed to be of zero mean, ξ = 0, delta-correlated, ξ(t)ξ(t ) = δ(t − t ), and Gaussian distributed.Since the constraint force f c imposes a constant velocity on the system ( ẋ = v c ), the total force mẍ vanishes.Performing an ensemble average . . . of Eq. ( 1) No. of bridging water molecules

A B
FIG. 1. Dissociation of NaCl in water.(A) Free energy profiles ∆G(x) along the interion distance x, obtained from a 1 µs long unbiased MD trajectory at 293 K (dashed green line) and 1000 × 1 ns TMD runs (full blue line).Also shown is the average work W (x) calculated from the TMD simulations (black).(B) Friction profile Γ(x) obtained from TMD (red) together with the average number of water molecules (black), that connect the Na + and Cl − ions in a common hydration shell. 28er many TMD runs, we thus obtain the relation 20 Here the first term W (x) = x x0 f c (x ) dx represents the averaged external work performed on the system, and the second term corresponds to the dissipated work W diss (x) of the process expressed in terms of the friction Γ(x).
While the friction in principle can be calculated in various ways 23,24 , it proves advantageous to calculate Γ(x) directly from TMD simulations.To this end, we invoke Jarzynski's identity 21 , e −∆G(x)/kBT = e −W (x)/kBT , the second-order cumulant expansion of which gives Eq. ( 2) with W diss (x) = δW 2 (x) /k B T .Expressing work fluctuations δW in terms of the fluctuating force δf c , we obtain for the friction 20 which is readily evaluated directly from the TMD simulations.
As discussed in Ref. 20, the derivation of Langevin equation (1) assumes that the pulling speed v c is slow compared to the timescale of the bath fluctuations, such that the effect of f c can be considered as a slow adiabatic change 25 , which leaves the system virtually at equilibrium at all times.This means that the free energy (2) and the friction (3) determined by the nonequilibrium TMD simulations correspond to their equilibrium results.As a consequence, we can use ∆G(x) and Γ(x) to describe the unbiased motion of the system via Langevin equation (1) for f c = 0. Numerical propagation of the unbiased Langevin equation then accounts for the coarse grained dynamics of the system, and yields the timescales of the considered process.
The theory developed above rests on two main assumptions.For one, we have assumed that Langevin equation (1) provides an appropriate description of nonequilibrium TMD simulations, and applies as well to the unbiased motion (f c = 0) of the system.This means that, due to a timescale separation of slow pulling speed and fast bath fluctuations, the constraint force f c enters this equation merely as an additive term.Secondly, we have invoked a cumulant expansion to derive friction coefficient (3), which is valid under the assumption that the distribution of W within the ensemble is Gaussian.While this assumption may break down if the system of interest follows multiple reaction paths, we have recently shown that we can systematically perform a separation of an ensemble of dcTMD trajectories into classes, each corresponding to a pathway 26 .To identify these pathways, a nonequilibrium principal component analysis of proteinligand distances may be performed 26 .Alternatively, path separation can be based on geometric distances between individual trajectories, making use of the NeighborNet algorithm 27 to cluster trajectories into classes (and thus pathways) according to these distances.

B. T -boosting
The speed-up of Langevin equation (1) compared to an unbiased all-atom MD simulation is due to the drastic coarse graining of the Langevin model (one instead of 3N degrees of freedom, N being the number of all atoms).Since the numerical integration of the Langevin equation typically requires a time step of a few femtoseconds (see Table S1), however, we still need to propagate Eq. (1) for 100 • 10 15 steps to sufficiently sample a process occurring on a timescale of seconds, which is prohibitive for standard computing resources.
As a further way to speed up calculations, we note that the temperature T enters Eq. ( 1) via the stochastic force, indicating that temperature is the driving force of the Langevin dynamics.That is, when we consider a process described by a transition rate k and increase the temperature from T 1 to T 2 , the corresponding rates k 1 and k 2 are related by the transition state expression where ∆G = denotes the transition state energy and β i = 1/k B T i is the inverse temperature.That is, by in-creasing the temperature we also increase the number n of observed transition events according to n 2 /n 1 = k 2 /k 1 .
To exploit this relationship for dcTMD, we proceed as follows.First we employ dcTMD to calculate the Langevin fields ∆G(x) and Γ(x) at a temperature of interest T 1 .Using these fields, we then run a Langevin simulation at some higher temperature T 2 , which results in an increased transition rate k 2 and number of events n 2 .In particular, we choose a temperature high enough to sample a sufficient number of events (N 100) for some given simulation length.In the final step, we use Eq. ( 4) to calculate the transition rate k 1 at the desired temperature T 1 .
It should be noted that the above described procedure, henceforth termed "T -boosting," involves no assumptions or approximations.It exploits the fact that we calculate fields ∆G(x) and Γ(x) at the same temperature for which we eventually want to calculate the rate.On the other hand, we wish to stress that a Langevin simulation run at T 2 using fields obtained at T 1 in general does not reflect the coarse-grained dynamics of an MD simulation run at T 2 , but can only be used to recover k 1 from k 2 , because the fields ∆G(x) and Γ(x) obtained from MD do depend on temperature.
In practice, we perform T -boosting calculations at several temperatures T 2 in increments of 25 K to 50 K and choose the smallest T 2 such that N 100 transitions occur.In the Supporting Information we derive an analytic expression of the extrapolation error as a function of boosting temperatures and achieved number of transitions, from which the necessary length of the individual Langevin simulations can be estimated, in order to achieve a desired extrapolation error.One-dimensional Langevin simulations require little computational effort (1 ms of simulation time at a 5 fs time step take ∼6 hours of wall-clock time on a single CPU) and are trivial to parallelize in the form of independent short runs.Hence the extrapolation error due to boosting can easily be pushed below 10% and is thus negligible in comparison to systematic errors coming from the dcTMD field estimates.

A. Ion dissociation of NaCl in water
To illustrate the above developed theoretical concepts and test the validity of the underlying approximations, we first consider sodium chloride in water as a simple yet nontrivial model system.For this system, detailed dcTMD as well as long unbiased MD simulations are available 20 , making it a suitable benchmark system for our approach.
To start with, Fig. 1A shows the free energy profiles ∆G(x) along the interionic distance x, whose first maximum at x ≈ 0.4 nm corresponds to the bindingunbinding transition of the two ions, i.e., the loss of di-rect van der Waals contact and the formation of a common hydration shell.The second smaller maximum at x ≈ 0.6 nm reflects the transition from a common to two separate hydration shells 28 .We find that results for ∆G(x) obtained from a 1 µs long unbiased MD trajectory and from dcTMD simulations (1000 × 1 ns runs with v c = 1 m/s) match perfectly.Since the average work W (x) of the nonequilibrium simulations is seen to significantly overestimate the free energy at large distances, the dissipation correction W diss in Eq. ( 2) is obviously of importance.Figure 1B shows the underlying friction profile Γ(x) obtained from dcTMD, which in part deviates from the lineshape of the free energy.While we also find a maximum at x ≈ 0.4 nm, the behavior of Γ(x) is remarkably different for larger distances 0.5 x 0.7 nm, where a region of elevated friction can be found before dropping to lower values at x 0.8 nm.Interestingly, these features of Γ(x) match well the changes of the average number of water molecules bridging both ions 28 .This indicates that the increased friction in Eq. ( 3) is mainly caused by force fluctuations associated with the build-up of a hydration shell 20 .
The dynamics of ion dissociation and association can be described by their mean waiting times (or the respective rates) shown in Figure 2A.For the chosen ion concentration and resulting effective simulation box size (see SI), the unbiased MD simulation at 293 K yields mean dissociation and association times of τ D = 1/k D = 120 ps and τ A = 1/k A = 850 ps, respectively.Using fields ∆G(x) and Γ(x) obtained from TMD, the numerical integration of Langevin equation ( 1) for 1 µs results in τ D = 420 ps and τ A = 3040 ps.That is, the Langevin predictions overestimate the true values by a factor of about four, which may be caused by various issues.For one, to be of practical use, the Langevin model was deliberately kept quite simple.It includes neither an explicit solvent coordinate to account for the complex dynamics of the solvent 28 , nor does it account for non-Markovian memory effects as in a generalized Langevin equation 1 .Moreover, we note that the calculation of the friction Γ(x) via Eq.(3) uses constraints, which move with a pulling velocity v c that is small compared to the transition timescale.Unlike unbiased simulations, where ions and water molecules move with comparable velocities, in TMD simulations the ions are therefore artificially held in place.This causes additional interactions with water molecules, which have the effect of increasing the effective friction 31 .This finding is supported by calculations using the data-driven Langevin approach 32 , which estimates friction coefficients based on unbiased MD simulations that are smaller than the ones obtained from TMD for x 0.7 nm (Fig. S1).Considering the simplicity of the Langevin model and the approximate calculation of the friction coefficient by TMD, overall we are content with a factor 4 deviation of the predicted kinetics.
Waiting times on a nanosecond timescale as found for solvated NaCl are readily sampled by microsecond long Langevin trajectories at room temperature.Nonetheless,  29,30 .Bars represent the standard error of the mean.Tables below comprise corresponding rates with M being the molarity, i.e., mol/l, and reference values.For trypsin-benzamidine and Hsp90, rate constants were fitted according to Eq. ( 4) at 290 K and 300 K, respectively.
to illustrate the validity of the T -boosting approach suggested above, we performed a series of Langevin simulations for eight temperatures ranging from 290 to 420 K and plotted the resulting dissociation and association times as a function of the inverse temperature (Fig. 2).
A fit to Eq. ( 4) yields transition state free energies ∆G = of 13 kJ/mol and 12 kJ/mol for ion dissociation and association, respectively, which agree well with barrier heights of the free energy profile in Fig. 1A extracted from both biased and unbiased simulations.Moreover, dissociation and association times obtained from the extrapolated T -boosted Langevin simulations (τ D = 370 ps, τ A = 3050 ps) agree excellently with the directly calculated values (see Table S1).This indicates that hightemperature Langevin simulations can indeed be extrapolated to obtain low-temperature transition rates.

B. Trypsin-benzamidine
Let us now consider the prediction of free energies, friction profiles and kinetics in protein-ligand systems.With binding and unbinding times on a range between milliseconds and hours, these systems cannot generally be studied by sufficent sampling using unbiased simulations, owing to the limited capabilities of current computational hardware.The first system we focus on is the inhibitor benzamidine bound to trypsin 29,33,34 , which represents a well-established model problem to test enhanced sampling techniques 19,[35][36][37][38][39][40] .The slowest dynamics in this system is found in the unbinding process, which occurs on a scale of milliseconds 29 .To capture the kinetics of the unbinding process, so far Markov state models 35,36 , metadynamics 37 .Brownian dynamics 38 and adaptive enhanced sampling methods 19,39,40 have been employed.
Here we combined dcTMD simulations and a subsequent nonequilibrium principal component analysis 26 to identify the dominant dissociation pathways of ligands during unbinding from their host proteins (see SI Methods).Figure 3 shows TMD snapshots of the structural evolution along this pathway of benzamidine, its free energy profile ∆G(x), and the associated friction Γ(x).Starting from the bound state (x 1 = 0 nm), ∆G(x) exhibits a single maximum at x 2 ≈ 0.46 nm, before it reaches the dissociated state for x x 4 = 0.75 nm.The calculated binding free energy of ∼ 27 kJ/mol compares well to the experimental value of 28.0 ± 0.2 kJ/mol 29 .In line with the findings of Tiwary et al. 37 .the maximum of ∆G(x) reflects the rupture of the Asp189-benzamidine salt bridge, which represents the most important contact of the bound ligand.Following right after, the friction profile Γ(x) reaches its maximum at x 3 ≈ 0.54 nm, where the charged side chain of benzamidine becomes hydrated with water molecules.Similarly to NaCl, the friction peak coincides with the increase in the average number of hydrogen bonds between benzamidine and bulk water.Being a ligand-binding system, however, the peak in friction is slightly shifted to higher x, because the ligand acts as a "plug" for the binding site, and first needs to be (at least partially) removed in order to allow water flowing in.As for the dissociation of NaCl in water, enhanced friction during unbinding appears to be directly linked to a rearrangement of the host hydration shell.
To calculate rates k on and k off describing the binding and unbinding of benzamidine from trypsin, we performed 5 ms long Langevin simulations along the dominant pathways at thirteen temperatures per system ranging from 380-900 K.As shown in Fig. 2B, the resulting rates are well fitted (R 2 ≥ 0.91) by T -boosting expression (4).For trypsin-benzamidine, the fit yields energy barriers ∆G = off = 39 kJ/mol and ∆G = on = 15 kJ/mol, which agree well with the corresponding values ∼ 44 and 17 kJ/mol obtained from Fig. 3B.Representing the resulting number of transitions as a function of the inverse temperature, we find that at 380 K only ∼ 13 events happen during a millisecond.That is, to obtain statistically converged rates at 290 K would require Langevin simulations at 290 K on a timescale of seconds.Using temperature boosting with Eq. ( 4), on the other hand, our high-temperature millisecond Langevin simulations readily yield converged transition rates at 290 K (see Figure 2), that is, k on = 1.9 • 10 7 s -1 M -1 and k off = 5.3 • 10 2 s -1 , which underestimate the experimental values 29 k on = 2.9 • 10 7 s -1 M -1 and k off = 6.0 • 10 2 s -1 by a factor of 1.2-1.5.We note that the experimental value of k off is given as 6.0 ± 3.0 • 10 2 M -1 s -1 in some publications 19,37 .While we could not find the source of this error estimate, it would imply a direct match of our calculations and the experiment.
As the extrapolation error due to T -boosting is negligible (see SI), the observed error is mainly caused by the approximate calculation of free energy and friction fields by dcTMD.In the case of NaCl, we have shown that reliable estimates of the fields (with errors 1 k B T ) require an ensemble of at least 500 simulations 20 , although the means of ∆G and Γ appear to converge already for ∼100 trajectories.In a similar vein, by performing a Jackknife "leave-one-out" analysis 41 , for trypsin-benzamidine, we obtain an error of ∼ 2 k B T for 150 trajectories (Fig. S2).Interestingly, the error of the main free energy barrier is typically comparatively small, because the friction and thus variance of W increase directly after the barrier.As a consequence, the sampling error of k off is small compared to that of k on and the binding free energy.We note that if the experimental binding affinity K D = k off /k on is known, it can be used as a further constraint on the error of the free energy and friction fields.

C. Hsp90-inhibitor
The second protein complex of interest is the Nterminal domain of heat shock protein 90 (Hsp90) bound to a resorcinol scaffold-based inhibitor (compound 1j in Ref. 42).This protein has recently been established as a test system for investigating the molecular effects influencing binding kinetics 30,[42][43][44] , and the selected inhibitor unbinds on a scale of half a minute.From the overall appearance of free energy and friction profiles (Fig. 4), this protein-ligand complex exhibits clear similarities to the case of trypsin-benzamidine.That is, the main transition barrier is also found at x 2 ≈ 0.5 nm, which stems from the ligand pushing between two helices at this point in order to escape the binding site.Moreover, the friction peaks at x 2 ≈ 0.6 nm with a side maximum at x 3 ≈ 0.8 nm, which again coincides with the establishment of a hydration sphere.We note that the ligand is again bound to the protein via a hydrogen bond to an aspartate (Asp93) and at a position that is open to the bulk water.The free energy in the unbound state of ∼ 45 kJ/mol compares well to the experimental value of 40.7 ± 0.2 kJ/mol 30 .
To calculate rates k on and k off describing the binding and unbinding of the resorcinol inhibitor from Hsp90, we again performed 5 ms long Langevin simulations along the dissociation pathway at fourteen different temperatures ranging from 700-1350 K. Fits result in energy barriers of ∆G = off = 71 kJ/mol and ∆G = on = 24 kJ/mol.The The inhibitor is bound to the protein in a cleft of the protein surface via a hydrogen bond to Asp93.dcTMD calculations of (B) free energy ∆G(x) and (C) friction Γ(x) together with the mean number of hydrogen bonds between inhibitor and water.
Color code as in Fig. 3.
agreement with the ∆G = off ≈ 71 kJ/mol and ∆G = on ≈ 26 kJ/mol from dcTMD simulation is good, as are the predictions at 300 K (see Fig. 2C), which yield k on = 2.3 • 10 5 s -1 M -1 and k off = 3.6 • 10 −3 s -1 .The predictions again underestimate the experimental 30 values k on = 4.8 ± 0.2 • 10 5 s -1 M -1 and k off = 3.4 ± 0.2 • 10 −2 s -1 by a factor of 2-10, which is a fair agreement considering that we attempt to predict unbinding times on a time scale of half a minute from sub-µs MD simulations.We attribute the larger deviation in comparison to trypsin to issues with the sampling of the correct unbinding pathways: especially unbinding rates in the range of minutes to hours fall into the same timescale as slow conformational dynamics of host proteins 30 , requiring a sufficient sampling of the conformational space of the protein as a prerequisite for dcTMD pulling simulations.

IV. CONCLUSIONS
Using free energy and friction profiles from dcTMD, we have shown that T -boosted Langevin simulations yield binding and unbinding rates which are comparable to results from atomistic equilibrium MD and experiments.Adopting solvated sodium chloride, trypsin-benzamidine and Hsp90-inhibitor as test systems, the method underestimates the correct rates by a factor of 1.2-10.Considering that only sub-µs MD runs and comparatively inexpensive Langevin simulations are required, this is comparable to the best accuracies currently achieved by enhanced sampling methods 3 .As the extrapolation error due to T -boosting is negligible, the error is mainly caused by the approximate calculation of free energy and friction fields by dcTMD.In particular, the identification of the correct unbinding pathways of protein-ligand complexes can be challenging and requires further investigation.We have shown that friction profiles may yield additional insight into molecular mechanisms of unbinding processes, which are not reflected in the free energies.Although the three investigated molecular systems differ significantly, in all cases friction was found to be governed by the dynamics of solvation shells.
V. METHODS Detailed information on system preparation, ligand parameterization, MD and Langevin simulations and pathway separation can be found in the SI.

A. MD simulations
All simulations employed Gromacs v2018 (Ref.45) in a CPU/GPU hybrid implementation, using the Am-ber99SB force field 46,47 and the TIP3P water model 48 .For each system, 10 2 -10 3 dcTMD calculations 20 at pulling velocity v c = 1 m/s were performed to calculate free energy ∆G(x) and friction Γ(x).For the NaCl-water system, dcTMD as well as unbiased MD simulations were taken from Ref. 20. Trypsin-benzamidin complex simulations are based on the 1.7 Å X-ray crystal structure with PDB ID 3PTB (Ref.33).Simulation systems of the Hsp90-inhibitor complex were taken from Ref. 42.

B. Langevin simulations
Langevin simulations employed the integration scheme by Bussi and Parrinello 49 , using a time step of 5 fs.As system mass m, the reduced mass of the NaCl dimer (13.88 g/mol), the trypsin-benzamidin (120.15g/mol) and Hsp90-inhibitor (288.73 g/mol) complexes were used.

Supporting Information (SI)
Computational details on simulations, simulation system preparations, pathway separation and Langevin dynamics; waiting time distribution of NaCl dissociation and association; uncertainty for prediction of rates with T -boosting; friction estimate by data-driven Langevin equation; two Supporting Figures; one Supporting Table.

Pathway separation
In the case of trypsin-benzamidine, pathway separation was performed by employing nonequilibrium principal component analysis 25 using a protein-ligand distance covariance matrix 26 .Trajectories were projected onto the first two principal components and sorted according to pathways by visual inspection.The dcTMD correction was then carried out separately for such bundles of trajectories.Performing 200 pulling simulations, we found 84 trajectories to constiture the major unbinding pathway, for which free energy and friction profiles were converged, and whose free energy difference between bound and unbound state matches well the value known from experiment (see Fig. S2).
For the Hsp90-inhibitor complex, we employed a path separation based on geometric distances between individual trajectories 27 .After aligning trajectories with the protein's C α atoms as fit reference, we calculated the matrix of means over time of the root mean square distance between individual trajectories.We then applied the NeighborNet algorithm 28 to the matrix to cluster trajectories according to distances.From the considered 400 trajectories, the cluster that gave a free energy difference that was in best agreement with experiment (see Fig. S2) was taken by 93 single trajectories.

Langevin simulations
Langevin simulations used the integration scheme by Bussi and Parrinello 8 .Simulations were carried out with integrator step sizes ∆t = 1, 2, 5 and 10 fs at a constant temperature of 300 K, or at 5 fs integrator step size and temperatures between 293.15 K to 420 K for NaCl, 380 K to 900 K for trypsin-benzamidine and 700 K to 1350 K for the Hsp90-inhibitor complex.See Table S1 for the convergence of binding and unbinding times of NaCl with respect to the time step.Simulations were run for 1 µs of simulated time for NaCl and 5 milliseconds for protein-ligand systems.System coordinates were written out each 1 ps for NaCl and each 1 ns for protein-ligand systems.
The gradient of the potential of mean force was approximated as Input free energy and fiction profiles obtained from dcTMD were smoothed with a Gauss filter (σ = 10 and 40 ps, respectively).As friction fields in some cases still exhibited negative friction values after smoothing, we used the absolute values |Γ(x)| as input for simulations.For x, we used a resolution of 1 pm.For compensation of data borders, we employed "fully reflective" boundary conditions: If the system jumped over a boundary x max at any time step by a distance a, it was put back to x = x max − a, and its velocity sign reversed.Mean waiting times were calculated by defining geometric cores 29 : For NaCl, the free energy surface was separated into the bound state x < 0.31 nm and unbound state x > 0.43.For trypsin-benzamidine, we used a bound state x < 0.3 nm and unbound state x > 0.6 nm, while for the Hsp90-inhibitor complex, we applied cores of x < 0.3 nm and x > 0.9 nm.As the native units of k on are s −1 M −1 , all according rates were scaled by the "concentration" of ion or protein-ligand pairs in an effective volume of 4 3 πx 3 end , amounting to a molarity of 0.2 M for NaCl Langevin simulations and 50 mM for protein-ligand Langevin simulations.

Data evaluation
Minimal distance evaluation was performed using the MDanalysis Python library 30 , nonequilibrium principal component analysis was carried out using the fastpca program 31 .Data evaluation was carried out using a Jupyter notebook ?employing the numpy 33 , scipy 34 and astropy 35 Python libraries.Graphs were plotted using the matplotlib 36 Python library, molecular structures were displayed via PyMOL 37 .

Uncertainty for prediction of rates with temperature boosting
To estimate the extrapolation error of T -boosting, we assume that the waiting time τ of an unbinding or binding process is exponentially distributed, where τ is a function of temperature T .Hence the expectation τ is given by its mean τ plus/minus the error of the mean where N (T ) denotes the number of simulated transitions.By changing to the dimensionless rate k = t 0 / τ (with t 0 being some timescale, e.g., ns) and employing Gaussian error propagation to lowest order 38 , we obtain accordingly and where T i denotes a discrete set of temperatures at which simulations are performed.Employing error propagation, we estimate the uncertainty of ln(k) at T ref = 300 K as This yields for the desired relative uncertainty of the average waitingtime To illustrate the typical magnitude of this uncertainty, we assume that we perform 10 Langevin simulations of length t LE at different temperatures T i (i = 0, ... , 9) The first three temperatures (T 0 , T 1 and T 2 ) are chosen such that we observe ≈ 10 2 transitions during the simulation time t LE .Similarly, we assume to observe 10 3 transitions at T 3 , T 4 and T 5 , 10 4 transitions at T 6 , T 7 and T 8 and 10 5 transitions at T 9 .Using the case of T 0 = T ref = 300 K, since we assumed 10 2 transitions at T 0 = T ref during the Langevin simulation time t LE , the observed rate at 300 K is k = 10 2 /t LE .Chosing t LE = 5 ms, the corresponding rate is k(300K) = 1/50µs, and we obtain for the error of the rate σ ln(k(T ref )) = 7.7%.Alternatively, when we assume that we need to choose T 0 ≈ 450 K in order to achieve 10 2 transitions, employing the boosting relation (4) in the main text and t LE = 5 ms, the observed rate at 300 K is k(300K) = 0.063 ms −1 with an error of 10.6%.Considering our Langevin simulations of trypsin described in the main text, where we used T -boosting at 13 temperatures from 380-900 K, the error at 300 K is estimated to be σ ln(k(T ref )) = 3.3%.For Hsp90, the system with the highest considered barrier, we obtain σ ln(k(T ref )) = 11.0%at 300 K using Langevin simulations at 14 temperatures from 700-1350 K.As the overestimation of friction factors due to usage of constraints (see main text) results in a underestimation of rates by a factor of ∼ 4, and as the error of free energy profiles enters Eq. [4] in the exponent, the extrapolation error due to T -boosting can easily be made negligible in all practical cases.Jackknife ("leave-one-out") analysis 40 of free energy profiles and friction for (A) trypsin-benzamidine and (B) Hsp90-inhibitor complex.Error bars denote the Jackknife standard error obtained for various numbers of TMD trajectories ("samples").Color code in (A): 52 samples in black, 84 in blue, 117 in red, 148 in cyan.Color code in (B): 30 in black, 50 in blue, 93 in red.The calculated binding free energies of ∼ 27 kJ/mol (Trypsin) and ∼ 45 kJ/mol (Hsp90) compare well to the experimental values of 28.0 ± 0.2 kJ/mol 20 and 40.7 ± 0.2 kJ/mol 41 , respectively.

S13
SUPPORTING TABLES TABLE S1.Convergence of the Bussi-Parrinello Langevin equation integration scheme 8 with respect to the integration time step ∆t.Shown are dissociation and association times and corresponding number of transitions obtained for NaCl at T = 293 K, as well as results from fully atomistic MD simulations (single trajectory of 1 µs, 1 fs step size).Errors denote the standard error of the mean.A time step of 5 fs appears as the longest time step that results in suitable dissociation and association times.∆t (fs) dissociation time (ps) no. of transitions association time (ps) no. of transitions 0.

4 FIG. 3 .
FIG.3.Unbinding of benzamidine from trypsin.(A) TMD snapshots of the structural evolution in trypsin along the dominant dissociation pathway, showing protein surface in gray, benzamidine as van der Waals spheres, Asp189 and water molecules as sticks.Benzamidine is bound to the protein in a cleft of the protein surface via a bidental salt bridge to Asp189.dcTMD calculations of (B) free energy ∆G(x) and (C) friction Γ(x) together with the mean number of hydrogen bonds between benzamidine and water.

FIG. 4 .
FIG. 4. Unbinding of an inhibitor from the N-terminal domain of Hsp90.(A) Structural evolution along the dissociation pathway in Hsp90, showing protein surface in gray, inhibitor as van der Waals spheres, Asp93 and water molecules as sticks.The inhibitor is bound to the protein in a cleft of the protein surface via a hydrogen bond to Asp93.dcTMD calculations of (B) free energy ∆G(x) and (C) friction Γ(x) together with the mean number of hydrogen bonds between inhibitor and water.Color code as in Fig.3.

)
Due to the rate expression k ∝ e −∆G/kBT with transition state energy ∆G, ln(k) depends linearly on 1/T , i.e., ln(k(T )) = a T + b. (S09) Linear regression theory 38 yields estimates for a and b as well as uncertainties FIG.S5.Friction calculated from data-driven LE (dLE)39  in comparison to friction profiles from dcTMD.