Dynamical origins of heat capacity changes in enzyme-catalysed reactions

Heat capacity changes are emerging as essential for explaining the temperature dependence of enzyme-catalysed reaction rates. This has important implications for enzyme kinetics, thermoadaptation and evolution, but the physical basis of these heat capacity changes is unknown. Here we show by a combination of experiment and simulation, for two quite distinct enzymes (dimeric ketosteroid isomerase and monomeric alpha-glucosidase), that the activation heat capacity change for the catalysed reaction can be predicted through atomistic molecular dynamics simulations. The simulations reveal subtle and surprising underlying dynamical changes: tightening of loops around the active site is observed, along with changes in energetic fluctuations across the whole enzyme including important contributions from oligomeric neighbours and domains distal to the active site. This has general implications for understanding enzyme catalysis and demonstrating a direct connection between functionally important microscopic dynamics and macroscopically measurable quantities.

. Reaction mechanism of KSI. Conversion of 5-androstene-3,7-dione into 4-androstene-3,7-dione occurs through two consecutive proton transfers. The first proton transfer is from C4 to Asp38 1 , and the second proton transfer is from Asp38 to C6 2 . In the intermediate state (used in the simulations as a proxy for the proton transfer transition states), Asp38 is protonated and hydrogen bonds donated by Tyr14 and Asp99 (indicated) stabilize the dienolate oxygen.

Supplementary Figure 2. Reaction mechanism of MalL 3 and TS analogue used in simulations.
Cleavage of isomaltose (glc(α1-6)glc) into two D-glucose units catalysed by MalL proceeds via a two-step mechanism. In the first step, the +1 glucose unit is released, and a covalent glycosyl-enzyme intermediate is formed between Asp199 and the retained glucose unit. In the second step, the second glucose is released via nucleophilic attack by water. The retained glucose passes through a contorted half chair conformation as the anomeric carbon adopts a more planar conformation at the TS 4,5 (not shown). Top pane: transition state (TS) for the first MalL reaction step showing an elongated bond during the cleavage step and the oxocarbenium ion character. The charge and elongated bond are mimicked in the TS analogue via a charged nitrogen and methyl bridge, respectively.

Supplementary Note 2: Simulation protocols
For both KSI and MalL, the same simulation protocol was used. Histidine tautomers and Asn/His side-chain rotations of 180° were assigned according to the optimal hydrogen bond network 10 . For KSI, the following were rotated: Asn2, Asn104, His100 and His122 in chain A, and Asn19, Asn57 and Asn104 in chain B. No rotations were required for MalL. After addition of solvent (in tleap from AmberTools 11 with a closeness parameter of 0.9) and randomisation of the Na + ions (minimum distance between protein and Na + of 10 Å, minimum distance between Na + ions 5 Å), a minimisation procedure was started consisting of: 300 steps minimisation of water, ions and hydrogens only; 50 ps MD simulation in the NVT ensemble at 300 K of water and ions only (positional restraint on protein atoms: 25 kcal·mol 1 ·Å 2 ); minimisation of the whole system with positional restraints on C atoms (5 kcal·mol 1 ·Å 2 ) for 300 steps. Heating (random velocity assignment at 25 K followed by heating to 300 K for KSI and 320 K for MalL) was then performed in 20 ps simulation using Langevin dynamics for temperature control with a 1 ps −1 collision frequency (maintaining 5 kcal·mol 1 ·Å 2 positional restraints on C atoms). In four consecutive 10 ps simulations under the same conditions, the positional restraints on C atoms were reduced to 4, 3, 2, and 1 kcal·mol 1 ·Å 2 . Subsequently, 1 ns of equilibration was performed in the NPT ensemble at 1 atm, using the Berendsen barostat (1 ps pressure relaxation time) and Langevin dynamics for temperature control (1 ps −1 collision frequency). Production simulation (500 ns) was then performed in the NVT ensemble with the Berendsen thermostat and loose temperature coupling (10 ps time constant), to limit the influence of the thermostat whilst avoiding temperature drift. In all dynamics simulations, the default direct-space cutoff was used for non-bonded interactions, with particle-mesh Ewald summation for long-range electrostatic interactions. Dynamics simulations were run with pmemd.cuda on GPUs, using the SPFP precision model 12 .

Supplementary Note 3: Restraints in simulations
For KSI, restraints were required to keep Asp38 in a reactive (Michaelis-complex) conformation, in line with proton abstraction from C4 by Asp38 (Supplementary Figure 1). Without any restraints, the deprotonated Asp38 swings out into solvent within the initial 1ns NPT equilibration (dOD2Asp38-C4substrate increased to 6.6-10.1; Supplementary Figure 3); once solvent exposed, the reactivity is severely limited 13 . Only after this initial change, instability of the loop that contains Asp38 occurs: the main-chain hydrogen bonds from Gly37 and Val40 (in the loop) to Ala114 are lost in the first 10 ns of simulation (donor-acceptor distances increase from <4 Å after NPT equilibration to >4.7 Å after 10 ns in all cases). The tendency of Asp38 to swing out into solvent may be a force-field limitation, or a consequence of the conformation of a deprotonated Asp38 directed to the C4-C5 double bond being a short-lived event prior to reaction (or a combination of both). To address whether the observed behaviour can be mitigated by using a more recent, improved protein force field, simulations of the reactant state (2 independent runs of 50 ns) without restraints were performed with Amber ff14SB. Essentially, the behaviour is the same as observed before (with ff99SB-ILDN): initial movement of Asp38 away from the substrate into solvent happens within the initial 1 ns NPT equilibration, and subsequently, the loop becomes less stable. Although the loop hydrogen bonds and structure may change less rapidly than with ff99SB-ILDN, hydrogen bonds still break within the first ~50 ns of simulation (Supplementary Figure 3d). This suggests that the movement of Asp38 into solvent followed by instability of the loop is not just an artefact of the ff99SB-ILDN force field used. To keep Asp38 in a position relevant for the reactant state, a set of six restraints was used (Supplementary Figure 3). To allow reasonable sampling of the conformational fluctuations, 1) all restraints are one-sided (quasi-)harmonic, i.e. no restraint is applied if a distance stays within a certain range, 2) the restraint force-constant is mild, 25 kcal·mol 1 ·Å 2 , and 3) all restraints are outside the range that is typically sampled in simulations of the IS state without restraints present. As a consequence, the simulations of IS with restraints in place that were used for analysis (for consistency), typically have no (or a very small) energy contribution from the restraints. Despite the restraints imposed, Asp38 was still observed to move away from its Michaelis complex position in some simulations, i.e. the shortest possible proton transfer distance (dOH) moved beyond 4 Å and was consistently >4 Å for the remaining simulation time. Trajectories where this occurred were omitted from further analysis. For MalL, initially no restraints were applied, but after many 10s of ns of simulation in some trajectories, the isomaltose moved out of the active site area. To avoid this from happening, a restraint was placed on the centre-of-mass distance between the heavy atoms of the 1 glucose unit on the one hand, and the C atoms of Asp199 and Asp332 on the other. This one-sided (quasi-)harmonic restraint is active when the distance is 10 Å or larger, with a force-constant of 15 kcal·mol 1 ·Å 2 . It is only activated occasionally (<2% of the time), and stops the isomaltose substrate from drifting away. Figure 3. Restraints required to maintain a Michaelis complex conformation in KSI simulations. a, Example snapshot from simulation of KSI with the reactant bound without restraints, indicating that Asp38 and the surrounding loop swing away from the bound substrate, into solvent. Distance between Asp38 C and C4 on the substrate is indicated, alongside the hydrogen bond donor-acceptor distances between Tyr14, Asp99 and O3 (see Supplementary Figure 1). Hydrogens are omitted for clarity. b, Starting structure for simulation (with inhibitor bound, from PDB 1OHP) with distances on which restraints are placed to keep Asp38 in a reactive conformation indicated with dashed lines. Distances measured in the structure are shown in black, and distance where the one-sided harmonic restraint comes into effect in red. c, Example snapshots (ff99SB-ILDN, without restraints) indicating the sequential movement of Asp38 moving out into solvent (orange backbone, with the loop still anchored by backbone hydrogen bonds between Glu37-Ala114 and Val40-Ala114taken after 1 ns NPT equilibration), prior to further loop movement (dark red, taken after 10 ns of production simulation). Starting structure in green. d, Example distance plots in simulations with ff99SB-ILDN, ff14SB and ff99SB-ILDN with restraints. Change in loop structure (Cα RMSD of residues 37-42 after alignment on residues 1-116) and the presence/loss of backbone hydrogen bonds between Glu37-Ala114 and Val40-Ala114 (donor-acceptor distance) are indicated.   Table 2). Analysis of the trajectory indicates a significantly higher Cα RMSD than the other replicate runs, especially in the latter half of the simulation. Increases in RMSD from 120 ns onwards correspond to anomalous orientation of loops 288-300 and 380-420 (Supplementary Figure  8). Usually, the helix-loop-helix comprising of residues 380-420 sits into the active site; Glu389 in this region has been identified as significant for substrate recognition 14 . For a majority of the run, this interaction is broken as loop 288-300 flips into the active site, excluding helix-loop-helix 380-420. Based on the clustering results, high RMSD and the trapping of the structure in an anomalous loop configuration, this simulation was replaced with a new trajectory for the purposes of ΔC ǂ P analysis.

Supplementary Note 5: Conformational clustering
For KSI, all conformations sampled were divided into two separate conformational clusters for both simulated states. This was done on the basis that for both states, only two highly similar significant clusters are present (Figure 2 and Supplementary Table 4). To ensure all sampled conformations are included in the analysis, the K-means algorithm was used (see Methods). Because in this case, substantially different conformational clusters exist (with substantially different variances; Figure 2), it is relevant to analyse these clusters separately: reactions take place very rapidly if the energy barrier to reaction is overcome (e.g. proton transfer on the order of ~fs) and a transition state is (by definition) very short-lived. The enzyme conformation will thus not change from one cluster to another on this timescale. From experiment, ∆ ‡ is inferred from steady-state kinetics and its dependence on temperature, involving a very large number of individual enzyme molecules. If the simulations are sufficiently accurate, clustering will identify different relevant conformations (and their fractions) in the overall population of enzymes present in experiment.
Combined clustering of the 20 MalL simulations (10 for each state) using the hierarchical agglomerative algorithm with a minimum cluster distance (epsilon) of 2.1 Å leads to 9 clusters with an overall contribution >1%. Only the top two clusters are sampled significantly in both states (of the remaining clusters, one has a 0.58% contribution of the minor contributing state; all others are 0.08% or less).

Supplementary Note 6: Calculation of variances and ∆ ‡ with solvent
The contribution of solvent to the variance of the force-field energies for the two states (from which ∆ ‡ is calculated) may be significant, e.g. due to electrostatic screening that solvent would provide. In an attempt to include possible solvent effects, we originally intended to calculate ∆ ‡ values based on variances of energies obtained with a layer of explicit solvent (approximately the 1 st and 2 nd solvation shell) around the protein structures. However, there is no clear way to select which water m olecules should be included in such a calculation, and errors in the energies of water modelled by force-field approaches (see e.g. 15 ) may be a further source of error for the resulting heat capacities. We therefore used a Poisson-Boltzmann implicit solvent model (as implemented in Amber), see Supplementary  Table 4. The resulting ∆ ‡ value for MalL converges less well with moving variance window size, but for both enzymes the values are qualitatively similar to those obtained without solvent. We can thus conclude that no large error is introduced by excluding contributions of solvent in calculation of the difference in variance between two states, from which ∆ ‡ is calculated. * indicates that the ligand is included. a Standard deviation calculated as the sum of the standard deviations of RS and IS variances computed using a leave-one-out procedure (one simulation of each state is omitted at a time). b Standard deviation calculated from the sum of the standard deviations of RS and TSA moving variances for each of the 10 simulations. c Variances obtained from force-field energies calculated after deleting all amino-acid side-chains. d For consistency, the moving variance window was again 70 ns (in line with the calculations without implicit solvent), but ∆ ‡ does not converge well with window size (with 60 or 80 ns windows, ∆ ‡ is −3.54 and −4.06 kJ·mol 1 ·K 1 , respectively). e PBSA energy calculations were performed with sander from AmberTools 16. Total electrostatic energy and forces were calculated with the particle-particle-particle mesh (P3M) procedure, and the cut-off distance for van der Waals interactions was 8 Å. The 'mbondi' set of atomic radii was used.