Intriguing role of water in protein-ligand binding studied by neutron crystallography on trypsin complexes

Hydrogen bonds are key interactions determining protein-ligand binding affinity and are therefore fundamental to any biological process. Unfortunately, explicit structural information about hydrogen positions, and thus H-bonds, in protein-ligand complexes is extremely rare, and the important role of water during binding likewise remains poorly understood. Here, we report neutron structures of trypsin determined at very high resolutions (≤1.5 Å) in the uncomplexed and inhibited states, complemented by X-ray and thermodynamic data and computer simulations. Our structures show the precise geometry of the H-bonds between the protein and the inhibitors N-amidinopiperidine and benzamidine, along with the dynamics of the residual solvation pattern. Prior to binding, the ligand-free binding pocket is occupied by water molecules characterized by a paucity of H-bonds and high mobility, resulting in an imperfect hydration of the critical residue Asp189. This phenomenon likely constitutes a key factor fueling ligand binding via water displacement and helps to improve our current view of how water influences protein–ligand recognition.

Occupancy maps for water oxygen and hydrogen atoms are shown at the 55% level as red and yellow meshes, respectively. While the ligand and selected protein residues are shown (Panel A) in stick representation as present in the first frame of the MD trajectory following the equilibration phase, water molecule stick models were derived from a superimposition of the trypsin:N-amidinopiperidine XN structure onto this frame. Panel B provides a more detailed view of water molecule W1 located on top of Tyr228.
restraining potential was switched on to keep the ligand within this volume. Asp189 is shown as cyan ball-and-stick model. (B) Description of those regions N-amidinopiperidine visits most frequently. Trypsin is represented by the transparent gray surface model with the view onto the horizontally-oriented binding cleft of this protease. An occupancy map for the guanidino carbon of N-amidinopiperidine is depicted at the 1.8% level in gray. While Asp189 defines the location of the S1 pocket, Tyr39 and Trp215 are located close to the S2' and S3/4 pockets, respectively.
(C) Development of the CV1 and CV2 parameters along the simulation. While values every 10 ps are plotted in gray, the blue line represents the sliding average over 1 ns intervals. States in which the ligand can be considered to be bound as evaluated by CV1 ≤ 5 Å and 120° < CV2 < 180° and corresponding to minimum a in Figure 5A, are marked in red on the x-axis. Similarly, states in which the ligand is almost bound (5 Å < CV1 ≤ 7 Å and 120° < CV2 < 180°, minimum b in Figure 5A) are highlighted in orange. (D) Stability of rmsd values during the simulation.
The backbone rmsd values, extracted every 10 ps and calculated following a least-squares fit of all backbone atoms onto the first frame, are plotted in gray, while the sliding average over 1 ns intervals is overlaid in blue. (E) Projection of the free energy surface onto CV1 and CV2, respectively. The Gibbs free energy of binding as derived from our direct ITC titration experiments performed in three different buffers is shown in red as an experimental reference (buffer-corrected value from Supplementary Table 2). Please note that the steep increase in energy at large CV1 values originates from the applied restraining potential (see panel A).
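The state assignment and smoothing described in panels C and D can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code behind the figure: the function names, the array layout and the keyword arguments are assumptions. Only the thresholds (CV1 ≤ 5 Å for bound, 5 Å < CV1 ≤ 7 Å for almost bound, each with 120° < CV2 < 180°) and the window of 100 frames (1 ns at one value per 10 ps) are taken from the caption.

```python
import numpy as np

def classify_states(cv1, cv2, bound_angle=(120.0, 180.0), almost_angle=(120.0, 180.0)):
    """Label each frame as 'bound', 'almost_bound' or 'other'.

    cv1: distances in Å, cv2: angles in degrees (one value per frame).
    The angle windows are open intervals and are hypothetical keyword
    arguments so that other ligands can use different ranges.
    """
    cv1 = np.asarray(cv1, dtype=float)
    cv2 = np.asarray(cv2, dtype=float)
    in_bound_angle = (cv2 > bound_angle[0]) & (cv2 < bound_angle[1])
    in_almost_angle = (cv2 > almost_angle[0]) & (cv2 < almost_angle[1])
    labels = np.full(cv1.shape, "other", dtype=object)
    labels[(cv1 > 5.0) & (cv1 <= 7.0) & in_almost_angle] = "almost_bound"
    labels[(cv1 <= 5.0) & in_bound_angle] = "bound"
    return labels

def sliding_average(values, window=100):
    """Unweighted moving average; 100 frames correspond to 1 ns at 10 ps spacing."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie fully inside the series
    return np.convolve(values, kernel, mode="valid")
```

For a series of N values the smoothed curve has N - window + 1 points, so it is slightly shorter than the raw trace it is overlaid on.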
trypsin:benzamidine complex. (A) Description of those regions benzamidine visits most frequently. Trypsin is represented by the transparent gray surface model with the view onto the horizontally-oriented binding cleft of this protease. An occupancy map for the amidino-group carbon of benzamidine is depicted at the 1.8% level in gray. While Asp189 defines the location of the S1 pocket, Tyr39 and Trp215 are located close to the S2' and S3/4 pockets, respectively.
(B) Stability of rmsd values during the simulation. The backbone rmsd values, extracted every 10 ps and calculated following a least-squares fit of all backbone atoms onto the first frame, are plotted in gray, while the sliding average over 1 ns intervals is overlaid in blue. (C) Development of the CV1 and CV2 parameters along the simulation. While values every 10 ps are plotted in gray, the blue line represents the sliding average over 1 ns intervals. States in which the ligand can be considered to be bound, as evaluated by CV1 ≤ 5 Å and 125° < CV2 < 180° and corresponding to minimum a in Figure 5B, are marked in red on the x-axis. Similarly, states in which the ligand is almost bound (5 Å < CV1 ≤ 7 Å and 100° < CV2 < 125°, minimum b in
that the His57-Nδ and Ser214-Oγ atoms are deuterated to a large extent. An H/D occupancy refinement was performed at these sites because they are not fully solvent-exposed and are involved in H-bonding. The resulting occupancies of H and D atoms are indicated in the figure. In contrast, much lower nuclear density peaks were observed for the His57-Nε and Ser195-Oγ atoms, indicating either hindered H/D exchange or incomplete protonation. Since these atoms are directly solvent-exposed and not involved in a highly stable H-bond that might explain a slow H/D exchange (all analyzed crystals were soaked in deuterated mother liquor for at least two weeks), we expected both atoms to be fully H/D exchanged. Accordingly, we modeled only a deuteron at these sites and refined its occupancy. The results, which are highlighted in the figure, indicate that His57-Nε is partially protonated, leading to a positive charge on this amino acid, while Ser195 might partially exist in a deprotonated and thus nucleophilic, substrate-reactive state. A sulfate ion, which originates from the crystallization solution, was found to be present close to Ser195 and His57, albeit with high mobility and/or partial occupancy.
The sulfate might mimic the negative charge of the oxyanion generated during substrate turnover. The above-described findings are supported by the 2mFo-DFc electron and nuclear density maps shown at the 2σ level as transparent gray surface and gray mesh, respectively. While deuterons appear as positive peaks in the 2mFo-DFc nuclear density map, hydrogens result in negative peaks due to the negative neutron scattering length of the H atom.
In the figure, the 2mFo-DFc nuclear density map is therefore additionally shown at the -2σ level in the form of red dots.
Supplementary Table 1. Inhibition of trypsin and thrombin by derivatives of N-amidinopiperidine and benzamidine (arithmetic mean ± standard deviation from measurements performed at least in triplicate).
[Table cross-references: see PDB code 5MO2 in Supplementary Table 4; see PDB code 5MO0 in Supplementary Table 4; X-ray part: see PDB code 5MNF in Supplementary Table 6; see PDB code 5MNO in Supplementary Table 6; see PDB code 5MNH in Supplementary]
Supplementary Table 9. Damping factors for individual molecular motions, defined as the ratio between the solvent residence time found in the protein (p) and the bulk solvent (b) environment.

Translational and Rotational Stability of W1 water molecules
The W1 water molecule is located on top of the phenyl moiety of the Tyr228 residue in trypsin (same numbering in related proteins such as thrombin). Since it is buried deep within the S1 pocket and many of the known binders displace or interact with this water molecule, W1 may  Supplementary Table 8).
Generally, the damping factor, defined in the following as the ratio of the solvent residence time in the protein environment to the solvent residence time in bulk solvent, is comparable for each residence time observed for the W1 water molecules (see Supplementary Table 9). Compared with the crystal structures, the spatial positions of the W1 water molecules were, on average, not well reproduced: they were missed by approximately 2.82 Å when considering only the nearest water molecules of the MD simulation (see Supplementary Table 10). By comparison, the crystallographic water molecule W7 was reproduced much better, with a deviation of only 1.27 Å from its crystallographic reference position. Presumably, W7 is fixed within the interior of the protein and is therefore restrained in its translational and orientational degrees of freedom. The orientations of water molecules at W1, however, do not seem to favor either of the two crystallographic conformations A or B, as suggested by the RMSDH-H values found for the two conformations (see Supplementary Table 10). This observation agrees with the fact that both orientations of W1 refined to nearly identical occupancy values. In contrast, the water molecules that interact with Asp189, namely W2 and W3, are highly stable in space and have a translational lifetime as long as 87.6 ps. Their orientational lifetime is much longer, even when compared with the translational lifetime of the W1 water molecules. Furthermore, the damping factors for water molecules next to Asp189 range from 14.0 for translational damping to 24.1 for the RY vector. The side chain of Asp189 is surrounded on average by 1.7 water molecules during the MD simulation. The position of the crystallographic water molecule W2 was not well reproduced by MD in either configuration when considering only the nearest water molecule.
In contrast, W3 was spatially reproduced very well (on average 1.12 Å as nearest distance), which is even closer than in the case of W7. However, its orientations are not reproduced equally well by the simulation. Here too, no clear preference for either of the two orientations found in the crystal structure is indicated. As a reference, His57, an important residue of the catalytic triad of the protein, was investigated. It is located at the surface of the protein and is highly accessible to water molecules from the bulk phase. We therefore assume that the dynamic properties of the water molecules interacting with His57 do not differ greatly from those around the same residue in a completely solvent-exposed situation. As indicated by the damping factors of His57, all of which are close to unity, this residue does indeed obey bulk-like solvation dynamics.
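The damping factor used throughout this comparison is a plain ratio of residence times, which the following minimal Python sketch makes explicit. The function names are hypothetical, and the worked number rests on an assumption: that the translational lifetime of 87.6 ps and the translational damping factor of 14.0 quoted for the Asp189 waters refer to the same quantity. Under that reading, the implied bulk residence time is roughly 6.3 ps; this value is an inference for illustration and is not stated in the text.

```python
def damping_factor(tau_protein_ps, tau_bulk_ps):
    """Ratio of the residence time in the protein to that in bulk solvent."""
    if tau_bulk_ps <= 0:
        raise ValueError("bulk residence time must be positive")
    return tau_protein_ps / tau_bulk_ps

def implied_bulk_time(tau_protein_ps, damping):
    """Invert the definition: bulk residence time implied by a damping factor."""
    if damping <= 0:
        raise ValueError("damping factor must be positive")
    return tau_protein_ps / damping

# Assumption: 87.6 ps and 14.0 describe the same translational motion.
tau_bulk_ps = implied_bulk_time(87.6, 14.0)

# A residue with bulk-like solvation has equal residence times in both
# environments, which gives a damping factor of 1.
bulk_like = damping_factor(tau_bulk_ps, tau_bulk_ps)
```

A damping factor close to unity, as reported for His57, therefore means that water near the residue exchanges about as fast as in bulk solvent, whereas values of 14.0 to 24.1 indicate a slowdown of more than an order of magnitude.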
Given the high occupancy of almost 75% and the fast translational dynamics in the first hydration layer of Tyr228 during the MD simulation, water molecules must rapidly enter and leave this hydration layer. Furthermore, all orientational lifetimes for W1 water molecules are comparable to each other within one standard deviation. Moreover, the fast decay of orientational states matches the observation of multiple orientational states of W1 in the crystallographic part of this study. In contrast, water molecules W2 and W3, both attached to Asp189, follow slow exchange dynamics. Given the high occupancy of 96% (1.71 water molecules on average) in the first hydration layer of Asp189, it can be assumed that W2 and W3 do indeed exchange slowly. Their high damping factors suggest that the shape and electrostatic distribution of the S1 pocket have a significant impact on these water molecules. It is quite remarkable that, although water molecules at the W1 position are within hydrogen-bonding distance of W2 and W3, they obey completely different dynamics.
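The comparison between crystallographic water sites and the simulation (for example 2.82 Å for W1 versus 1.27 Å for W7) relies on the distance to the nearest MD water molecule. A minimal Python sketch of such a measure is given below. It is an illustration under stated assumptions, not the procedure used in this study: the function name and input layout are hypothetical, every frame is assumed to be already superimposed onto the crystal structure and to contain at least one water oxygen, and periodic boundary conditions are ignored.

```python
import numpy as np

def nearest_water_deviation(reference_xyz, water_xyz_per_frame):
    """Mean distance (Å) from a crystallographic water site to the
    nearest MD water oxygen, averaged over all frames.

    reference_xyz: (3,) coordinates of the crystallographic water oxygen.
    water_xyz_per_frame: iterable of (n_waters, 3) arrays, one per frame.
    """
    ref = np.asarray(reference_xyz, dtype=float)
    nearest = []
    for frame in water_xyz_per_frame:
        coords = np.asarray(frame, dtype=float).reshape(-1, 3)
        # Euclidean distance of every water oxygen to the reference site
        dists = np.linalg.norm(coords - ref, axis=1)
        nearest.append(dists.min())
    return float(np.mean(nearest))

# Toy input with two frames; the coordinates are made up for illustration.
frames = [
    [[1.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
    [[0.0, 0.0, 2.0], [5.0, 5.0, 5.0]],
]
mean_dev = nearest_water_deviation([0.0, 0.0, 0.0], frames)
```

Because only the closest water in each frame is counted, the measure says nothing about which individual molecule occupies the site, so a small deviation is compatible with both a long-lived water and rapid exchange.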