Dissecting C−H∙∙∙π and N−H∙∙∙π Interactions in Two Proteins Using a Combined Experimental and Computational Approach

C−H∙∙∙π and N−H∙∙∙π interactions can have an important contribution for protein stability. However, direct measurements of these interactions in proteins are rarely reported. In this work, we combined the mutant cycle experiments and molecular dynamics (MD) simulations to characterize C−H∙∙∙π and N−H∙∙∙π interactions and their cooperativity in two model proteins. It is shown that the average C−H∙∙∙π interaction per residue pair is ~ −0.5 kcal/mol while the N−H∙∙∙π interaction is slightly stronger. The triple mutant box measurement indicates that N−H∙∙∙π∙∙∙C−H∙∙∙π and C−H∙∙∙π∙∙∙C−H∙∙∙π can have a positive or negative cooperativity. MD simulations suggest that the cooperativity, depending on the local environment of the interactions, mainly arises from the geometric rearrangement when the nearby interaction is perturbed.

suggesting a positive cooperativity. This is opposite to the findings of an earlier computational study where the negative cooperativity is concluded for the same complexes 25 . The C−H•••π and N−H•••π cooperativity in proteins remains largely unexplored.
In this work, we attempt to measure the C−H•••π and N−H•••π interactions in protein GB3 and staphylococcal nuclease (SNase). GB3 is the third immunoglobulin binding domain of protein G, a model protein that has been extensively studied 26 . SNase is an enzyme that hydrolyzes nucleotides in DNA or RNA. A stable mutant of SNase, Δ + PHS, is selected as the test system 27 (Table 1). Therefore, a total of 25 C−H•••π and 6 N−H•••π interactions were measured.
The folding free energies ΔG of all proteins were derived from the denaturation curves. The values of [D] 50% , m values for the wild type and mutant proteins are given in Supplementary Table S1. The magnitude of noncovalent interactions in the two proteins GB3 and Δ + PHS was obtained using the double mutant cycle (DMC) analysis 28 . The values of C−H•••π interactions are shown in Table 1, ranging from +0.31 (unfavorable) to −0.85 (favorable) kcal/mol, with 22 out of 25 showing favorable interactions. The three small positive interaction energies may come from the secondary interactions, i.e., the interaction changes from the surrounding residues caused by mutations (a caveat of the DMC experiment). The residual secondary interactions can contribute to the measured XH•••π energy which may change the sign of the energy (to repulsive) if it is small. The interaction energy of N−H•••π ranges from −0.15 to −0.86 kcal/mol. According to DMC, it is preferable to mutate the two side chains x and y in the X−H•••π pair to alanine residues to completely remove the interactions between the two. However, eliminating an aromatic residue in a protein core can be detrimental to protein stability. Instead, we only mutated the aromatic side chain (y) to a leucine (y′) which is still hydrophobic but disrupts the X−H•••π interaction (see more details in Materials and Methods). For the X−H component (x), conservative mutations are introduced (x′) to remove the X−H•••π interaction and maintain the protein folding at the same time. These mutations may create residual pairwise side chain interactions in x′y′, xy′, and x′y. Furthermore, for a residue like leucine (for example, in L5−F30) which has two CH3 and one CH, it can form multiple C−H•••π interactions which complicate the interpretation of the experimental results. These problems can be solved with the assist of MD simulations.
Benchmark of MD simulations. MD simulations were performed for all the experimentally measured mutants with three commonly used force fields, Amber99sb 29 , Charmm27 30 , and Gromos53a6 31 . The experimental C−H•••π and N−H•••π interaction energies were used as a benchmark to evaluate the accuracy of different force fields. The root mean square deviation (RMSD) between the experimental and predicted X−H•••π interactions was calculated: where N is 31, the total number of measured residue pairs that form X−H•••π interactions, ΔΔG exp is the experimental X−H•••π interaction energy, and ΔΔE MD is the calculated interaction energy. Charmm27 appears to perform better than the other two force fields. Its RMSD value is 0.27 kcal/mol (after removing two apparent outliers), while the RMSDs of Amber99sb and Gromos53a6 are 0.41 and 0.47 kcal/mol, respectively (Fig. 2). Thus, the trajectories produced using Charmm27 were selected for further analyses.

Geometric parameters of c−H•••π and n−H•••π interactions.
The reasonable correlation between the interaction energies from MD simulations and experiments encourages us to investigate the geometric parameters that are important for C−H•••π and N−H•••π interactions. The pairwise interaction energy ΔΔE CH3•••π between a CH3 group and a aromatic ring was calculated for all the C−H•••π interactions identified above. Two geometric parameters 15 d CX and ω are defined for the C−H•••π interaction, where d CX is the distance of the methyl carbon to the center of mass of the aromatic ring (X), and ω is the ∠C−H−X angle (Fig. 3A). Since there are three methyl hydrogens, the one with the largest ∠C−H−X angle is defined as ω. The same geometric parameters can also be defined for N−H•••π interactions (Fig. 3B). The 3D plot of (d CX , ω) versus ΔΔE CH3•••π shows that the geometries with shorter d CX and larger ω have more negative interaction energies (Fig. 3C). The distance appears to be   www.nature.com/scientificreports www.nature.com/scientificreports/ the most important parameter, with the energy dropping quickly as the distance decreases. Meanwhile, the angle ω can also be important. The average ΔE CH3•••π for all the C−H•••π interactions is −0.36 kcal/mol. The number of N−H•••π interactions is less than that of C−H•••π, and they appear to be stronger than C−H•••π interactions with the same geometric parameters. cooperativity mechanism from MD simulations. The ∆ΔΔG coop are in a good correlation with the computational ∆ΔΔE (cooperativity energy, see more details in Materials and Methods), although the absolute value of ∆ΔΔE is generally larger than that of ∆ΔΔG coop (Fig. 4). One likely cause is that the entropic contribution, which is not calculated in MD simulations, may offset the large change of ∆ΔΔE. The entropy calculation is far more difficult (less reliable) and thus not pursued. As discussed above, the residual interactions caused by the experimental non-alanine mutations complicate the interpretation of ∆ΔΔG coop . To solve this problem, we rebuilt TMBs by mutating the three side chains, for example L25, F34, and V74 in L25−F34−V74, to alanines systematically in MD simulations. The cooperativity energy ∆ΔΔE′ was calculated for the residue groups listed above with the same procedure (Fig. 5A). The cooperativity from ∆ΔΔE′ generally agrees with that from ∆ΔΔE, except that L5−Y33−T16 and I7−Y33−T16 show a weak negative instead of positive cooperativity.
The cooperativity energy ∆ΔΔE′ varies from −0.39 to 0.16 kcal/mol (Fig. 5A). Although they appear to be small, the percentagewise ∆ΔΔE′ (∆ΔΔE′ divided by the average of the two C−H•••π interactions in  can vary from −40% (cooperative) to +60% (anticooperative) (Fig. 5B). So it is obvious that cooperativity can be very important for C−H•••π and N−H•••π interactions in an interaction network. To further understand the origin of cooperativity, the geometric changes in the TMB are investigated. It is known that d CX or d NX (Fig. 3) is an important parameter for C−H•••π or N−H•••π. Using L5−Y33−T16 as an example, Δd, the change of d CX , was calculated by

Discussion
DMC experiments are commonly used to measure residue−residue interactions, such as salt bridges and hydrogen bonds 32,33 . However, measuring C−H•••π interactions in the protein interior using DMC can be challenging because removing an aromatic side chain can destabilize and even unfold the protein. In this work, we only mutate the aromatic residue to leucine which maintains the protein folding and removes the C−H•••π interaction. Two very stable proteins GB3 and Δ + PHS were selected for the purpose. One caveat of the F or Y to L mutation is that residual interactions with leucine complicate the data interpretation. Molecular dynamics simulations were used to decompose the various contributions and help us focus on the C−H•••π interactions. The good agreement between experimental and computational interaction energies validates the procedure which provides important insights about the C−H•••π and N−H•••π interactions.
The energy of C−H•••π interactions obtained from the DMC experiments of two proteins in this work is smaller than ~ −0.9 kcal/mol, with an average of ~ −0.5 kcal/mol. This C−H•••π interaction strength is generally weaker than those reported for small molecules [17][18][19][20][21] . It is likely that different interactions compete with each other in proteins so that the C−H•••π interaction of a specific residue pair is not in an optimum geometry. This is evident from the interaction energy landscape of methyl−aromatic ring pair (Fig. 3). The lower corner, with d CX of ~0.4 nm and ω of ~165°, has the lowest interaction energy in the plot. But many C−H•••π pairs are clustered around d CX of ~0.4−0.6 nm and ω of ~120°−150°. The optimal d CX of 0.4 nm is close to the distance obtained from the quantum mechanical calculations 9 . For C−H•••π pairs with larger d CX , the C−H group moves away from the top of the aromatic ring to form a side-by-side configuration which has an optimal d CX of ~0.5 nm, as suggested from the QM calculations 9 . The non-optimum geometry also implies that different C−H•••π interactions with the same aromatic ring are interdependent. A small perturbation of one C−H•••π pair may affect the geometry of another C−H•••π nearby which creates the cooperativity effect.
The cooperativity analysis from TMB clearly suggests that the C−H•••π…C−H•••π and C−H•••π…N−H•••π can be either cooperative or anticooperative (Fig. 4). Although in the experimental TMB analysis, the cooperativity information is contaminated by the residual interactions in the mutants, the computational TMB analysis where the residual interactions are removed suggests that the side chain C−H•••π and N−H•••π interactions have a major contribution to the experimentally determined ∆ΔΔG (Figs. 4 and 5). Moreover, the d CX or d NX distance change Δd is an important indicator for the cooperativity. But when comparing the computational cooperativity energy ∆ΔΔE′ and Δd, the linear correlation between the two is only moderate, suggesting that the distance change is not the only contributor to the cooperativity change. The change of angles such as ω may also play a role.
Two simpler cooperativity models were built using two methane and one benzene molecules, with methanes on the same side (MMB) or opposite side (MBM) of the benzene. The cooperativity energies of MMB and MBM models were calculated at the MP2/aug-cc-pvtz level 34 . According to the quantum mechanical (QM) calculations, the cooperativity energy of MMB is 0.74 kcal/mol, indicating that C−H•••π…C−H•••π is anticooperative in this www.nature.com/scientificreports www.nature.com/scientificreports/ model, while the cooperativity energy of MBM is 0.03 kcal/mol, suggesting that there is no cooperativity in this model. Similar to the result in the MD simulations, the geometric reorganization occurs in the MMB model where the two methanes compete for the binding site. No such competition exists in the MBM model where the cooperativity energy is close to zero. The QM calculations highlight the importance of geometric reorganization to cooperativity.

conclusion
In this study, we measured the strength of C−H•••π and N−H•••π interactions in GB3 and SNase. The C−H•••π interaction is about 0.3 to −0.9 kcal/mol whereas the N−H•••π interaction is about −0.2 to −0.9 kcal/mol. The energy decomposition from MD simulations helps determine the C−H•••π and N−H•••π interactions for individual methyl−aromatic and amino−aromatic pairs and identify important geometric parameters d C(N)X and ω. The experimental TMB analysis suggests that the cooperativity of X−H•••π interactions can be either positive or negative, depending on the local environment. The cooperativity trend is successfully captured by MD simulations where the cooperativity energy can reach ~ −40% to 60% of C−H•••π or N−H•••π interactions, highlighting its importance in proteins. The geometric rearrangement is the main cause for the cooperative interactions. It is worth noting that the C−H•••π and N−H•••π interactions and the cooperativity were only measured for two proteins GB3 and Δ + PHS. More measurements will be needed to see whether the conclusions also hold for other proteins. But we expect that the mechanism behind the interactions is universal for all protein molecules.

Materials and Methods
Protein expression and purification. The wild type and mutants of GB3 and Δ + PHS were prepared with the PCR-based site-directed mutagenesis on vector pET-11b. These plasmids were transformed into the E. coli strain BL21 (DE3) cells for protein expression. The purification procedure for GB3 and its variants has been described previously 35 . Δ + PHS and its variants were purified using the same procedure as described by Shortle and Meeker 36 . thermodynamic stability measurements. All the denaturation measurements were performed using a HITACHI f-4600 fluorescence Spectrophotometer. Mixtures consisted of up to 6.0 M GdnHCl and 50 µM proteins (final concentration) were incubated for 30 min at 30 °C. The signal intensity at 340 nm for GB3 and 348 nm for SNase was extracted and fitted using the following equation, where S is the measured Fluo 340nm or Fluo 348nm , α N and α U are the intercepts and β N and β U are the slopes of the www.nature.com/scientificreports www.nature.com/scientificreports/  42 (or SPC water when the Gromos53a6 force field was used) in a rectangular box, and counter ions were used to neutralize the system. 500,000 steps of energy minimization followed by 1 ns MD simulation at constant pressure (1 atm) and temperature (303 K) were performed to equilibrate the system before the production running. Three 10 ns MD production runs with different random starting velocities were performed with snapshots saved every 50 ps which were then used in the data analysis and error estimation. All backbone heavy atoms are restrained in the equilibrium and production runs. Temperature was regulated by a modified Berendsen thermostat 43 and pressure was controlled by the extended ensemble Parrinello-Rahman approach 44,45 . The long-range electrostatic interactions were evaluated by the Particle mesh Ewald method 46,47 . The nonbonded pair list cutoff was 10 Å and the list was updated every 10 fs. The LINCS algorithm 48 was used to constrain all bonds linked to hydrogen in the protein, whereas the SETTLE algorithm 49 was used to constrain bonds and angles of water molecules, allowing a time step of 2 fs. In the energy decomposition analysis, only the interaction energy between the paired residues of C−H•••π or N−H•••π was calculated. The computational interaction energy ΔΔE was calculated by,