N+-C-H···O Hydrogen bonds in protein-ligand complexes

In the context of drug design, C-H···O hydrogen bonds have received little attention so far, mostly because they are considered weak relative to other noncovalent interactions such as O-H···O hydrogen bonds, π/π interactions, and van der Waals interactions. Herein, we demonstrate the significance of hydrogen bonds between C-H groups adjacent to an ammonium cation and an oxygen atom (N+-C-H···O hydrogen bonds) in protein-ligand complexes. Quantum chemical calculations revealed details on the strength and geometrical requirements of these N+-C-H···O hydrogen bonds, and a subsequent survey of the Protein Data Bank (PDB) based on these criteria suggested that numerous protein-ligand complexes contain such N+-C-H···O hydrogen bonds. An ensuing experimental investigation into the G9a-like protein (GLP)-inhibitor complex demonstrated that N+-C-H···O hydrogen bonds affect the activity of the inhibitors against the target enzyme. These results should provide the basis for the use of N+-C-H···O hydrogen bonds in drug discovery.


Results and Discussion
Establishment of criteria for the presence of N+-C-H···O hydrogen bonds in protein-ligand complexes based on quantum chemical calculations. Before N+-C-H···O hydrogen bonds can be examined in protein-ligand complexes, selection criteria have to be established for such interactions, as general criteria for determining their existence in protein-ligand complexes remain elusive. Although a few computational studies on N+-C-H···O hydrogen bonds have been reported 12,17, theoretical information on these interactions remains scarce, especially regarding the character of the hydrogen-acceptor oxygen atoms and the relationship between the energy and the geometry of these interactions. In this study, we simulated complexes of two molecules bound via N+-C-H···O hydrogen bonds in order to estimate the strength and geometry of such hydrogen bonds between proteins and ligands. Specifically, we used N-methylacetamide (1), propanoate (2), ethanol (3), and phenol (4) as hydrogen acceptor models for the protein peptide bond/Asn/Gln, Asp/Glu, Ser/Thr, and Tyr, respectively, and monomethylammonium (5), trimethylammonium (6), or N-methylpiperidinium (7) as hydrogen donor models for ligands that contain protonated aliphatic amines. Geometry optimizations and single-point calculations were carried out at the M06-2X/6-311++G** level of theory 25,26. The results of this computational study allowed us to establish selection criteria for N+-C-H···O hydrogen bonds (Table 1). Initially, we analyzed the charges on the hydrogen donors of the N+-C-H···O hydrogen bond models, considering that electrostatic forces control the geometry of heteroatom-hydrogen bonds 27.
A natural bond orbital (NBO) analysis 28 of 5 indicated that the positive charge is distributed over the H atoms (formal charge on H1: 0.237; H2: 0.447), including those of the CH 3 group, rather than being localized on the N atom (−0.688) ( Supplementary Fig. S2). The NBO analysis also showed that the formal charge on the H atoms of the CH 3 group in 5 (0.237) is more positive than that of monomethylamine (8; 0.185/0.159) or ethane (9; 0.195). These results suggest high potential for the N + -C-H group to act as a hydrogen donor in hydrogen bonds.
Subsequently, we simulated the optimized structures of complexes A1, B1, and C1, i.e., complexes of 1, 2, and 3 with 6, as well as that of complex D1, which consists of 4 and 7, in the gas phase (Fig. 1). For the optimized complex of 4 with 6, only CH/π interactions were found, which is comparable to the results of a previously reported similar model of phenol/SMeEt2+ (Supplementary Fig. S3) 16. The H···O distances between one H atom of the CH3 groups and one O atom of each hydrogen acceptor are shorter than the sum of the van der Waals radii of hydrogen and oxygen (2.72 Å), which suggests the formation of intermolecular hydrogen bonds between these atoms (Fig. 1 and Table S1). In addition, counterpoise-corrected interaction energies 29 of −22.01, −106.20, −13.49, and −15.46 kcal/mol were calculated for A1, B1, C1, and D1, respectively. We also simulated the optimized structures of complexes A1-D1 in water (Supplementary Fig. S4 and Table S1). Although the interaction energies/H···O distances in water were slightly higher/longer than those in the gas phase, we did not observe any significant differences between the two phases. These results suggest that the results in the gas phase should at least qualitatively correlate with those in water or in a ligand-binding pocket, as the latter generally does not contain many water molecules. Therefore, we performed all subsequent calculations in the gas phase. Next, we compared the energies of the N+-C-H···O hydrogen bonds to those of interactions that are often considered in drug design. This comparison showed that the energies of the N+-C-H···O hydrogen bonds are low relative to those of heteroatom-hydrogen bonds, π/π interactions, cation/π interactions, or CH/π interaction models (Supplementary Fig. S5).
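The counterpoise-corrected interaction energies quoted above can be illustrated with a minimal sketch of the standard Boys-Bernardi arithmetic. The function name, argument names, and the three energies below are hypothetical and serve only to show the bookkeeping; they are not values from this study, and the sketch makes no claim about the exact workflow used in the calculations.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard unit conversion


def counterpoise_interaction_energy(e_complex, e_donor_ghost, e_acceptor_ghost):
    """Counterpoise-corrected interaction energy in kcal/mol.

    All inputs are total energies in hartree at the geometry of the complex.
    Assumption: both fragment energies were computed in the full basis set
    of the complex, i.e. with ghost functions placed on the partner.
    """
    delta = e_complex - e_donor_ghost - e_acceptor_ghost
    return delta * HARTREE_TO_KCAL_PER_MOL


# Hypothetical energies (hartree), chosen only to illustrate the arithmetic.
e_int = counterpoise_interaction_energy(-423.50, -175.28, -248.20)
print(f"E_int = {e_int:.2f} kcal/mol")
```

A negative value indicates an attractive interaction, which is the sense in which "low" interaction energies are discussed in the text.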
Subsequently, we compared A1-D1 with A2-D2 and A3-D3, in which 6 in A1-C1 is replaced with trimethylamine (10) or 2,2-dimethylpropane (11), and 7 in D1 is replaced with N-methylpiperidine (12) or methyl cyclohexane (13) (Supplementary Fig. S6 and Table S1). The conformations of A2 and A3 are significantly different from that of A1 ( Supplementary Fig. S6A). The H···O distances in A2-D2 and A3-D3 are longer than those in A1-D1 (Fig. 1, Supplementary Fig. S6, and Table S1). Additionally, the interaction energies of A1-D1 are lower than those of A2-D2 and A3-D3 (Fig. 1, Supplementary Fig. S6, and Table S1). These results indicate that the nitrogen cation is important for the formation of the C-H···O hydrogen bond, as well as for its strength. Finally, we also estimated the influence of an N-H···O hydrogen bond formed involving the ammonium moiety on the C-H···O hydrogen bonds of A1-D1 (Supplementary Fig. S7 and Table S1). For these simulations, we used a water molecule as a hydrogen-bond acceptor for the N-H···O hydrogen bond. The obtained results suggest that even if ideal heteroatom-hydrogen bonds are formed between the ammonium cation and the water molecule, the geometries and energies of the C-H···O hydrogen bonds of A1-D1 are hardly influenced by the heteroatom-hydrogen bonds.
In order to establish selection criteria for N+-C-H···O hydrogen bonds, we used A4-D4, which consist of 1-4 and 5, as simple models for N+-C-H···O hydrogen bonds, and examined the dependence of their interaction energy on the geometry. Specifically, we measured the H···O distance (dHO), the C-H···O angle (ψ), the H···O=C/H···O-C angle (ξ), and the H-elevation angle (θ) 30 (Figs 2A, 3A, 4A and 5A). Initially, we examined the dependence of the interaction energy on the H···O distance in A4, whose C-H···O, H···O=C, and H-elevation angles were kept constant (Fig. 2B). For H···O distances in the range of 2.0-2.7 Å in A4, low interaction energies (−16.19 to −7.3 kcal/mol) were calculated (Fig. 2B), which is similar to the case of O-H···O hydrogen bonds (Supplementary Fig. S8F). The distance dependence in B4-D4 was estimated in a similar fashion. As B4 is basically formed by ionic interactions between an anion and a cation, its interaction energy was significantly lower in the H···O distance region of 1.4-2.7 Å (Fig. 3B). The distance dependence of C4 and D4 was similar to that of A4 (low […]). This is also reflected in our calculations, which indicate that interaction energies are low for H···O distances <2.7 Å, while strong N+-C-H···O hydrogen bonds were estimated for H···O distances <2.4 Å (Supplementary Fig. S9).
Subsequently, we tested the dependence of the interaction energy of A4 on the C-H···O angle (ψ). For 90° < ψ < 180°, the interaction energy is relatively low (Fig. 2C), which is similar to the case of the other hydrogen acceptors (B4: 90° < ψ < 180°; C4 and D4: 105° < ψ < 180°) (Figs 3C, 4C and 5C). However, the trends for the changes of the interaction energy in N+-C-H···O hydrogen bonds differ from those in O-H···O hydrogen bonds. While the interaction energy of O-H···O hydrogen bonds decreases for O-H···O angles of ~180° (Supplementary Fig. S8G), a similar trend could not be established for N+-C-H···O hydrogen bonds, although some cases involving 3 and 4 revealed a preference for ψ ≈ 180° (Figs 4C and 5C). The results of these calculations can be rationalized by examining the electrostatic potential maps of the hydrogen donors. The electrostatic potential map of 5 suggests that the positive charges on the H atoms of the N+-C-H groups are widely distributed (Supplementary Fig. S2A), while the positive charge of the hydroxyl hydrogen atom of, e.g., ethanol is limited to its O-H line (Supplementary Fig. S2F). Therefore, the ψ angles of N+-C-H···O hydrogen bonds may vary over a wide range, whereas the O-H···O angles of O-H···O hydrogen bonds are required to be ~180°. Additionally, O-H···O hydrogen bonds can be weakened by exchange repulsion between the O-H group and the H···O hydrogen bond when the O-H···O angle is ~90°, while the exchange repulsion in N+-C-H···O hydrogen bonds should be weaker in comparison.
Then, we tested the dependence of the interaction energies on the H···O=C/H···O-C angles (ξ). Calculating the interaction energies, we discovered that these energies depend on the type of hydrogen acceptors. In the case of amide acceptors, i.e., for 105° < ξ < 180°, the interaction energies are low and the preferred angles are ξ > 135° ( Fig. 2D and Supplementary Fig. S9A). Conversely, heteroatom-hydrogen bonds between ethanol and N-methylacetamide showed preferences for 105° < ξ < 120° (Supplementary Fig. S8H). In the case of carboxylate acceptors, a wide angle range (90° < ξ < 270°) was permissible, and B4 exhibited a preference for ξ > 240° (Fig. 3D). This should be attributed to the fact that the H atom can interact with two O atoms of the carboxylate moiety. In the case of alcohol acceptors, the ξ angles strongly affect the interaction energies (permissible angle range: 105° < ξ < 150°, preferential angle range: ξ = 105°-120°) (Figs 4D and 5D and Supplementary Fig. S9C,D). For carboxylate and alcohol acceptors, the properties of the ξ angles of the N + -C-H···O hydrogen bonds are similar to those of the O-H···O hydrogen bonds ( Supplementary Fig. S8H).
The ξ and θ angles of the N+-C-H···O hydrogen bonds are also affected by the negative charge on the O atom of the hydrogen acceptor. The negative charges in N-methylacetamide (1) and propanoate (2) are widely […] (Supplementary Fig. S2F,G). Because the H atom in the N+-C-H···O hydrogen bonds with alcohol acceptors shows positional preferences in proximity to the planes, ξ = 105°-120° and θ = ~0° were estimated (Figs 4D,E and 5D,E and Supplementary Fig. S9). Based on the results of these quantum chemical calculations, we were able to establish, for the first time, selection criteria for N+-C-H···O hydrogen bonds (Table 1).
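The geometric descriptors used in this section can be computed from atomic coordinates as in the following minimal sketch. The ranges in `PERMISSIBLE` are the permissible ranges quoted in the prose of this section for amide and alcohol acceptors, not a copy of Table 1; the function names, the dictionary layout, and the example coordinates are hypothetical. Two simplifications are assumed: carboxylate acceptors are omitted because their ξ range (90°-270°) cannot be expressed with an unsigned three-atom angle, and the upper angle bound is treated as inclusive so that a perfectly linear contact is accepted.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def distance(a, b):
    """Distance between two points (same unit as the input, here Å)."""
    return _norm(_sub(a, b))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex (range 0-180)."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


# Permissible ranges as quoted in the prose (assumption: not taken from Table 1).
PERMISSIBLE = {
    "amide": {"d_HO_max": 2.7, "psi": (90.0, 180.0), "xi": (105.0, 180.0)},
    "alcohol": {"d_HO_max": 2.7, "psi": (105.0, 180.0), "xi": (105.0, 150.0)},
}


def describe_contact(c_donor, h, o, c_acceptor):
    """Return d_HO, psi (C-H...O) and xi (H...O=C or H...O-C) for one contact."""
    return {
        "d_HO": distance(h, o),
        "psi": angle_deg(c_donor, h, o),
        "xi": angle_deg(h, o, c_acceptor),
    }


def within_permissible(desc, acceptor):
    lim = PERMISSIBLE[acceptor]
    return (
        desc["d_HO"] < lim["d_HO_max"]
        and lim["psi"][0] < desc["psi"] <= lim["psi"][1]
        and lim["xi"][0] < desc["xi"] <= lim["xi"][1]
    )


# Hypothetical coordinates (Å): a linear C-H...O contact with xi close to 140 degrees.
contact = describe_contact(
    c_donor=(-1.09, 0.0, 0.0), h=(0.0, 0.0, 0.0),
    o=(2.3, 0.0, 0.0), c_acceptor=(3.242, 0.791, 0.0),
)
print(contact, within_permissible(contact, "amide"))
```

The H-elevation angle θ is left out of this sketch because it additionally requires the dihedral angles around the acceptor plane.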

PDB survey.
To verify the existence of N + -C-H···O hydrogen bonds in protein-ligand complexes, we examined protein-ligand interactions in X-ray structures registered on the PDB. In this survey, we analyzed structures with a resolution of ≤2.80 Å. The positions of hydrogen atoms were determined using the drug discovery studio 3.5 software, as information on hydrogen positions in X-ray structures is generally not included. Therefore, the survey was conducted not only in accordance with the selection criteria established in the previous section (Table 1), but also under consideration of the following two points: (i) as the positions of the H atoms in protein structures cannot be determined accurately, we also used a carbon criteria set (C···O distance, N-C···O, and C···O=C/C···O-C angles, as well as C-elevation angles; Supplementary Table S2), which is based on the criteria shown in Table 1; (ii) H-and C-elevation angles of the interaction with the O atoms of Ser/Thr/Tyr were not surveyed, due to the difficulties associated with the determination of the position of the H atoms of the hydroxyl groups. In this study, we analyzed 159 X-ray crystal structures of complexes with ligands that contain aliphatic amines. This survey of 373 carbon atoms that are covalently bound to an aliphatic amino group in 159 ligands allowed us to identify 135 N + -C-H···O hydrogen bond interactions in 86 structures (Fig. 6, Supplementary Fig. S10, and Supplementary Tables S3-S6) and representative examples are shown in Supplementary Figs S11-S17.
The distribution of the C···O (d CO ) and H···O (d HO ) bond distances (Fig. 6A,B) shows that ~40% of the 135 interactions include distances of <3.2 Å and <2.4 Å, respectively. According to our calculations (Figs 2B, 3B, 4B, 5B and Table 1), and considering that strong hydrogen bonds (X-H···Y) exhibit bond lengths that are shorter than the sum of van der Waals radii of X and Y (sum of van der Waals radii for C and O: 3.22 Å) 31 , these N + -C-H···O hydrogen bonds can hence be classified as strong hydrogen bonds.
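The share of contacts below a given distance cutoff can be tallied as in the following minimal sketch. The cutoff constants are the values quoted in the text; the function name and the list of C···O distances are hypothetical and are not data from the survey.

```python
VDW_SUM_C_O = 3.22  # Å, sum of van der Waals radii of C and O quoted in the text
SHORT_C_O = 3.2     # Å, C...O cutoff quoted for the distance distribution


def fraction_below(distances, cutoff):
    """Fraction of contacts whose distance is strictly below the cutoff."""
    if not distances:
        raise ValueError("no distances given")
    return sum(d < cutoff for d in distances) / len(distances)


# Hypothetical C...O distances (Å), for illustration only.
d_co = [3.05, 3.18, 3.25, 3.40, 3.31]
print(f"{fraction_below(d_co, SHORT_C_O):.0%} of contacts have d_CO < {SHORT_C_O} Å")
```

The same function applies to H···O distances with the corresponding cutoffs (2.4 Å and 2.72 Å).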
Finally, we focused on the elevation angles of amide and carboxylate acceptors. The H-elevation (θ) and C-elevation (φ) angles are correlated to a certain degree (Supplementary Fig. S18A). Among 65 cases of amide acceptors, we found 36 contacts with small elevation angles (θ < 30°, φ < 30°), whereas contacts of carboxylate acceptors exhibited a wide distribution of these angles (Fig. 6E and Supplementary Fig. S18A). The results of this survey are consistent with those of our calculations (Figs 2E and 3E). It is especially noteworthy that we found 12 contacts among the Asp/Glu acceptors with θ < 30° and ξ > 240° (Supplementary Fig. S18B). According to our calculations, these contacts should be classified as especially strong N+-C-H···O hydrogen bonds, as the H atom can interact with both O atoms of the carboxylate group (Fig. 3D and Supplementary Fig. S17). In summary, the PDB survey allowed us to identify numerous examples of N+-C-H···O hydrogen bonds formed between proteins and their ligands. In combination with our energy calculations, these results suggest that N+-C-H···O hydrogen bonds should contribute significantly to the formation of protein-ligand complexes and to the activity of the ligand.

Experimental investigation.
To experimentally validate the importance of N+-C-H···O hydrogen bonds in protein-ligand complexes, we investigated whether such hydrogen bonds between proteins and ligands affect the activity of the ligands. Among the X-ray crystal structures in the aforementioned PDB survey, we focused on the complex between inhibitor 14a and G9a-like protein (GLP), a histone methyltransferase (Fig. 7A) 32. Its X-ray crystal structure suggests that the C-H groups adjacent to the nitrogen atom in the dimethylamino group of 14a engage in three C-H···O hydrogen bonds with two Asp residues of GLP (Fig. 7A-C). In order to examine the potential importance of N+-C-H···O hydrogen bonds for the formation of the protein-ligand complex and their effect on the GLP-inhibitory activity of 14a, we designed and synthesized monomethylamine 14b and amine 14c, which lack one or both of the N-methyl groups of 14a, respectively, as well as alkyl compound 14d, in which the nitrogen atom of 14a is replaced by a carbon atom (Fig. 7D). Compounds 14b, 14c, and 14d were expected to form one or two, no, and no C-H···O hydrogen bonds, respectively (Fig. 7E). If the C-H···O hydrogen bonds of 14a are responsible for the GLP-inhibitory activity, the activity of compounds 14b-d should be weaker than that of 14a. Accordingly, 14b-d were also examined with respect to their GLP-inhibitory activity (Fig. 7E and Supplementary Fig. S19). The parent compound 14a exhibits dose-dependent inhibitory activity against GLP (IC50 = 0.156 μM). Relative to 14a, 14b showed significantly reduced GLP-inhibitory activity (IC50 = 0.664 μM) in the 0.1-1 μM concentration range. In the same concentration range, 14c showed decreased GLP-inhibitory activity (IC50 = 1.45 μM) compared to 14a and 14b, while 14d did not show any GLP inhibition up to 3 μM. The latter result is most likely due to the lack of both electrostatic interactions and C-H···O hydrogen bonds between the ligand and GLP.
Moreover, we determined the dissociation constants and thermodynamic parameters of 14a-c by means of isothermal titration calorimetry (ITC). As shown in Supplementary Fig. S20, the dissociation constants of 14a-c depended distinctly on the number of methyl groups (KD: 0.171 μM for dimethyl 14a, 0.617 μM for monomethyl 14b, 0.944 μM for non-methyl 14c), which is consistent with the IC50 values of 14a-c. The thermodynamic parameters also depended on the number of methyl groups: the ΔH values for 14a, 14b, and 14c were −10.7, −8.58, and −7.04 kcal/mol, respectively; the −TΔS values were 1.42, 0.10, and −1.19 kcal/mol, respectively (Supplementary Fig. S20). The ITC data thus reveal that the removal of each methyl group results in both an unfavorable enthalpy change and a favorable entropy change. However, the change in enthalpy was larger than that in entropy, which led to decreased binding affinity (increased KD and ΔG). It is well known that enthalpically favorable interactions such as hydrogen bonds can incur an entropic penalty by restricting the degrees of freedom of water molecules and of the protein conformation 33. It is therefore possible that the methyl groups of 14a and 14b lower ΔH by fixing the conformation of GLP through the formation of N+-C-H···O hydrogen bonds, which at the same time reduces TΔS by decreasing the degrees of freedom of water molecules and of the protein conformation, with the enthalpic effect outweighing the entropic one. Taken together, the observed GLP-inhibitory activities, dissociation constants, and thermodynamic parameters for 14a-d strongly suggest that the N+-C-H···O hydrogen bonds in the GLP-ligand complex are responsible for the GLP-inhibitory activity, although the effect might not be attributable exclusively to these hydrogen bonds.
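The internal consistency of the ITC data can be checked with the standard relations ΔG = ΔH − TΔS and KD = exp(ΔG/RT), as in the following minimal sketch. The ΔH and −TΔS values are those reported above; the temperature of 298.15 K is an assumption, since the measurement temperature is not stated in this section, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

# (delta_H, minus_T_delta_S) in kcal/mol, as reported in the text.
ITC = {"14a": (-10.7, 1.42), "14b": (-8.58, 0.10), "14c": (-7.04, -1.19)}


def gibbs_energy(delta_h, minus_t_delta_s):
    """Binding free energy: delta_G = delta_H + (-T * delta_S)."""
    return delta_h + minus_t_delta_s


def kd_from_delta_g(delta_g, temperature_k=298.15):
    """Dissociation constant in mol/L from delta_G (kcal/mol).

    Assumption: 298.15 K, because the ITC temperature is not given here.
    """
    return math.exp(delta_g / (R_KCAL * temperature_k))


kd_um = {}
for name, (dh, mtds) in ITC.items():
    dg = gibbs_energy(dh, mtds)
    kd_um[name] = kd_from_delta_g(dg) * 1e6  # convert mol/L to μM
    print(f"{name}: delta_G = {dg:.2f} kcal/mol, K_D = {kd_um[name]:.3f} μM")
```

Under this temperature assumption, the back-calculated KD values come out at roughly 0.16, 0.61, and 0.93 μM, close to the reported 0.171, 0.617, and 0.944 μM, and they reproduce the ranking 14a < 14b < 14c.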
In addition, we investigated a ligand, whose N + -CH 3 groups do not engage in N + -C-H···O hydrogen bonds with any protein-based oxygen atom. The X-ray crystal structure of tyrosine kinase with Ig and EGF homology domains-2 (tie-2) complexed to its inhibitor 15a indicates that the methylene group that is in conjugation with the aliphatic amino group forms an N + -C-H···O hydrogen bond, but the methyl groups do not (Fig. 8A-C) 34 . We furthermore prepared monomethylamine 15b, amine 15c, and alkyl compound 15d (Fig. 8D), and evaluated their inhibitory activity. As expected, we did not observe any significant difference among the IC 50 values of 15a-c (Fig. 8E), although the activity of 15d was lower than that of 15a-c. In their entirety, the experimental results (Figs 7 and 8) highlight the importance of the N + -C-H···O hydrogen bonds in protein-ligand complexes and for the activity of the ligands.

Conclusions
Herein, we have demonstrated for the first time the significance of N+-C-H···O hydrogen bonds in protein-ligand complexes by establishing selection criteria based on quantum chemical calculations, a PDB survey, and experimental investigations. Our calculations revealed that hydrogen bonds can be formed between the H atom of N+-C-H groups and the oxygen atoms of amides, carboxylates, or alcohols. The low interaction energies of these bonds are comparable to those of heteroatom-hydrogen bonds, π/π interactions, cation/π interactions, or CH/π interactions, all of which are routinely considered in drug design. Based on the geometric analysis of the quantum chemical calculations, we theoretically established selection criteria for N+-C-H···O hydrogen bonds. A PDB survey of X-ray structures based on the criteria thus obtained revealed that numerous protein-ligand complexes contain such N+-C-H···O hydrogen bonds. Thus, we used a simple method to qualitatively and comprehensively estimate N+-C-H···O hydrogen bonds based on a minimum of required information, although more detailed data, such as environmental interactions, should be helpful for a further understanding of N+-C-H···O hydrogen bonds. Finally, we experimentally corroborated our hypothesis that the presence and magnitude of N+-C-H···O hydrogen bonds strongly affect the activity of the ligands in protein-ligand complexes. The results of this study should thus help to further the understanding of ligand recognition by proteins, which should be beneficial for drug design.

Methods
Quantum chemical calculations. All calculations were performed using the Gaussian 09 package 35 .
Structural optimizations and single-point calculations were carried out using the M06-2X variant of density functional theory (DFT) 36, which is often used for long-range interactions, with the 6-311++G** basis set in the gas phase or in the water phase 37. Structures were considered minima when all harmonic frequencies were positive. All interaction energies were corrected for basis set superposition errors (BSSE) by the counterpoise procedure 38. To gauge the accuracy of the M06-2X/6-311++G** level of theory, the binding energies thus obtained were compared to those obtained from MP2/aug-cc-pVTZ (Supplementary Table S7). Natural bond orbital (NBO) 39,40 analyses were performed via the procedures contained within Gaussian 09. Electrostatic potential surfaces were created using the GaussView 5.0 software package. The electrostatic potential for each structure was mapped onto a total electron density surface. The borders for the criteria were determined based on the energy/angle slope and the lowest energy; the borders were set at the point where the slope is >1.5% of the lowest energy of each curve, or where the energy is <60% of the lowest energy of each curve. […] H···O=C-O(N) and H···O=C-C. Similarly, the C-elevation angle (φ) was calculated by sinφ = sinβ · sinδ, whereby δ refers to the average of the two corresponding dihedral angles C···O=C-O(N) and C···O=C-C.
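The relation sinφ = sinβ · sinδ given above can be evaluated as in the following minimal sketch. The function name is hypothetical. Two assumptions are made: β is taken to be the C···O=C angle, which this section does not define explicitly, and δ is passed in directly, because the text does not specify how the two dihedral angles are brought to a common reference before averaging.

```python
import math


def elevation_angle_deg(beta_deg, delta_deg):
    """Elevation angle in degrees from sin(phi) = sin(beta) * sin(delta).

    beta_deg:  angle at the acceptor O atom (assumed: C...O=C), in degrees.
    delta_deg: averaged dihedral angle (assumed to be already averaged
               by the caller), in degrees.
    """
    s = math.sin(math.radians(beta_deg)) * math.sin(math.radians(delta_deg))
    s = max(-1.0, min(1.0, s))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.asin(s))


# Hypothetical input: beta = 120 degrees, delta = 30 degrees.
phi = elevation_angle_deg(120.0, 30.0)
print(f"phi = {phi:.1f} degrees")
```

For δ = 0° the atom lies in the acceptor plane and the elevation angle is 0°, irrespective of β.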
GLP activity assay. The GLP activity assay was carried out using a GLP Chemiluminescent Assay Kit (Catalog #53007, BPS Bioscience, Inc.). Microwells were rehydrated by adding 200 μL of Tris-buffered saline with Tween 20 (TBS-T: 1× TBS, pH = 8.0, containing 0.05% Tween-20) to every well, followed by incubation at room temperature for 45 minutes. After removing TBS-T, the inhibitors were incubated in the microwells in the presence of 2 μM SAM and 40 ng of GLP in the supplied buffer (120 min, room temperature, total volume: 50 μL).
After the enzymatic reaction, every well was washed three times with TBS-T (100 μL) and blocked for 10 min with blocking buffer. Then, 100 μL of primary antibody solution (1:400 dilution) was added to the microwells, followed by incubation (2 h). The wells were then washed three times with TBS-T (100 μL), incubated (2 h, room temperature) with sheep secondary antibody (1:1000 dilution), and again washed three times with TBS-T (100 μL). The chemiluminescence of the wells, to which detection reagents were added, was measured on a chemiluminescence reader (ARVO X3 Multilabel Plate Reader), and the values of % inhibition were calculated from the chemiluminescence readings of inhibited wells relative to those of control wells.
[…] (1N). The absorbance at 450 nm of the wells, to which detection reagents were added, was measured in a chemiluminescence reader (ARVO X3 Multilabel Plate Reader), and the values of % inhibition were calculated from the absorbance readings of inhibited wells relative to those of control wells.
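The calculation of % inhibition from plate readings relative to control wells can be sketched as follows. The exact formula is not given in the text, so the common form 100 × (1 − inhibited/control) is assumed here; the optional blank subtraction, the function name, and the example readings are likewise assumptions made for illustration.

```python
def percent_inhibition(signal_inhibited, signal_control, signal_blank=0.0):
    """Percent inhibition relative to an uninhibited control well.

    Assumption: % inhibition = 100 * (1 - (inhibited - blank) / (control - blank)).
    The blank (no-enzyme) subtraction is optional and not described in the text.
    """
    window = signal_control - signal_blank
    if window <= 0:
        raise ValueError("control signal must exceed the blank signal")
    return 100.0 * (1.0 - (signal_inhibited - signal_blank) / window)


# Hypothetical readings (arbitrary units), for illustration only.
print(percent_inhibition(2500.0, 10000.0))
```

The same function applies to chemiluminescence and absorbance readings, since only ratios of signals enter the calculation.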