The nucleophilic amino group of lysine is central for histone lysine methyltransferase catalysis

Histone lysine methyltransferases (KMTs) are biomedically important epigenetic enzymes that catalyze the transfer of a methyl group from S-adenosylmethionine to the nucleophilic ε-amino group of lysine in histone tails and core histones. Understanding the chemical basis of KMT catalysis is important for discerning its complex biology in disease and its structure-function relationships, and for designing specific inhibitors with therapeutic potential. Here we examine histone peptides that possess the simplest lysine analogs with different nucleophilic character as substrates for human KMTs. Combined MALDI-TOF MS experiments, NMR analyses, and molecular dynamics and free-energy simulations based on a quantum mechanics/molecular mechanics (QM/MM) potential provide experimental and theoretical evidence that KMTs can catalyze methylation of primary amine-containing N-nucleophiles, but do not methylate related amide- or guanidine-containing N-nucleophiles or simple O- and C-nucleophiles. The results demonstrate a broader, but still limited, substrate scope for KMT catalysis, and contribute to the rational design of selective epigenetic inhibitors.

Lysine methyltransferases (KMTs) play important roles in epigenetics. Here human KMTs are shown through a combination of experimental and computational methods to methylate non-natural histone peptides containing various non-lysine nucleophiles.

Histone proteins undergo numerous posttranslational modifications, which lead to a dynamic and complex epigenetic landscape that regulates the activity of genes in humans and other eukaryotes [1–3]. Nε-methylation of lysine residues has been found on unstructured N-terminal histone tails as well as on core histones, and can lead to transcriptional activation or repression, depending on the locations of the methylated lysine residues and their methylation states [4–7]. As established for most histone posttranslational modifications, lysine methylation is a dynamic process that is regulated by three classes of epigenetic proteins, i.e., histone lysine methyltransferases (KMTs), histone lysine demethylases (KDMs), and Nε-methyllysine binding epigenetic proteins [1]. Histone Nε-lysine methylation is catalyzed by S-adenosylmethionine (SAM)-dependent histone lysine methyltransferases, which also catalyze the methylation of many non-histone proteins of biomedical importance (Fig. 1a) [8]. KMT-catalyzed methylation of lysine residues can produce monomethyllysine (Kme), dimethyllysine (Kme2), and trimethyllysine (Kme3) residues with an unaltered positive charge and an increased hydrophobicity (Fig. 1a) [5,9,10]. With the exception of DOT1L, all characterized KMTs contain a highly conserved SET (Su(var)3-9, enhancer-of-zeste, trithorax) domain, which is essential for the enzymatic activity [8,11].
Mechanistic and structural studies revealed basic molecular requirements for KMT-catalyzed Nε-lysine methylation of histones and other proteins [8,11–16]. Binding of the SAM cosubstrate to a methyltransferase precedes histone substrate association, which induces a significant conformational change of the post-SET domain (located at the C terminus of the SET domain). Upon formation of the ternary complex, the lysine side chain occupies a narrow, hydrophobic channel, most often composed of side chains of Phe and Tyr residues (Fig. 1b, c). The nucleophilic ε-amino group of lysine is positioned toward the electrophilic methyl group of SAM (Fig. 1b, c). Although lysine exists in a protonated form under physiological conditions, an active-site Tyr or a water channel has been proposed to act as a general base for deprotonation of the ammonium ion of lysine, thus leading to a neutral form of lysine with a strong nucleophilic character [17,18]. Examinations of active-site variants demonstrated that Phe and Tyr residues control the final methylation state of lysine [15]. Kinetic isotope effect measurements provided evidence that the transfer of the methyl group from SAM to Nε-lysine takes place via an SN2 reaction [19,20].
To understand the mechanism of KMT-catalyzed SAM-dependent methylation of lysine residues at an unprecedented level of detail, recent work explored the KMT active-site residues that possibly play an important role in enzymatic catalysis [15]. Moreover, examinations of sterically demanding SAM analogs as cosubstrates for recombinantly expressed KMTs and in cellular assays showed that KMTs catalyze not only the methylation reaction, but also related alkylation reactions [21–23]. At present, however, it is still unknown whether KMTs can catalyze methylation of lysine analogs that possess nucleophilic groups other than the Nε amino group. Here we report integrated experimental and computational studies on human KMT-catalyzed methylation of the simplest lysine analogs that possess N-, O-, and C-nucleophiles (Fig. 1d). The results demonstrate that, whereas O- and C-nucleophiles are not methylated, some of the N-nucleophiles can be substrates for KMTs.

Fig. 1 Histone lysine methyltransferase-catalyzed methylation of lysine. a Methylation of lysine residue by histone lysine methyltransferases using SAM as a methyl donor. b View on the crystal structure of SETD8 with H4K20 peptide (magenta) and SAH (yellow). c View on the crystal structure of GLP with H3K9me peptide (magenta) and SAH (yellow). d Unnatural lysine analogs possessing functionalities with a different nucleophilic character.

Results
Selection of lysine analogs. Lysine's structural elements comprise L-stereochemistry, four methylene groups, and the nucleophilic and basic Nε amino group at the end of the side chain. Our recent studies demonstrated that the L-stereochemistry, the chain length, and the main chain all determine the degree of KMT-catalyzed methylation, highlighting that lysine possesses the optimal L-configuration and side-chain length for efficient KMT-catalyzed methylation [24–27]. The central question that needs to be addressed concerns the importance of the nucleophilic character of lysine's Nε amino group for KMT catalysis. Therefore, determining how the simplest chemical perturbations of the lysine side chain affect methyl transfer, while maintaining the L-configuration and the same chain length, would advance our fundamental understanding of KMT catalysis and form the basis for designing chemical probes for KMTs. Our panel of lysine analogs includes: (a) primary N-nucleophiles Kaza and Koxy; (b) resonance-stabilized N-nucleophiles hGln and nArg; (c) O-nucleophiles KOH and KCOOH; and (d) C-nucleophiles Kalkyne and Kalkene (Fig. 1d). The synthesis of the required building block Fmoc-Kaza(Boc)2-OH 1 was based on an established protocol (Fig. 2) [28]. Perbenzylation of L-glutamic acid 9 to intermediate 10, followed by selective DIBAL-H-mediated reduction of the side chain ester to the alcohol, afforded 11 in 87% yield. Subsequently, Swern oxidation furnished the corresponding amino aldehyde 12, which was then treated with tert-butyl carbazate to give the intermediate hydrazone 13 in excellent 93% yield. Next, sodium cyanoborohydride was used to reduce the hydrazone to hydrazide 14 in 87% yield. The synthesis was continued with the Boc-protection of the ε-nitrogen, yielding compound 15, deprotection of the benzyl groups, and a final Fmoc-protection to generate the appropriate unnatural building block 1 (77% yield over two steps).
The synthetic route for the preparation of Fmoc-Koxy(Boc)-OH 2 followed the same first two steps as for 1, according to a previously described protocol (Fig. 2) [29]. The preparation of the hydroxylamine-containing amino acid proceeded through the activation of the alcohol with N-hydroxyphthalimide in a Mitsunobu reaction to give compound 16. Deprotection of the phthalimide to produce a benzylated oxylysine, followed by installation of the Boc group, afforded 17 in 93% yield. Removal of the benzyl groups, followed by direct Fmoc-protection of the α-amino group, yielded the final Fmoc-Koxy(Boc)-OH 2 in 66% yield. Benzylation of the carboxylic group of Fmoc-allylglycine-OH 18, followed by olefin cross metathesis with allyl tert-butyl ether, produced alkene 19 in 41% yield. A subsequent hydrogenation step furnished the novel unnatural building block 3 in 48% yield. Similarly, the synthesis of compound 4 was accomplished in three steps from 18 (Fig. 2). Benzylation of carboxylic acid 18, followed by cross metathesis of the allyl side chain with tert-butyl acrylate, yielded product 20. Subsequently, reduction of the resulting alkene via catalytic hydrogenation, with concomitant generation of the free acid, liberated the building block 4 in excellent 87% yield. The building blocks 5, 6, 7, and 8 were commercially available.
To examine whether the unnatural amino acids are methylated by human KMTs, histone peptides that contain the above eight lysine analogs (Fig. 1d) were synthesized by solid-phase peptide synthesis (SPPS) (H4 residues 13-27, GGAKRHRX20VLRDNIQ; H3 residues 1-15, ARTKQTARX9STGGKA). We also synthesized 15-mer H4K20me for NMR studies and 14-mer H3K9 for competition experiments in MALDI-TOF MS studies. All histone peptides were obtained in high purity by preparative HPLC, and all purified histone peptides were further examined by analytical HPLC and ESI-MS (Supplementary Figs. 1-12 and Supplementary Tables 1 and 2).
Examining KMT-catalyzed methylation by MALDI-TOF MS. Histone peptides that bear lysine and its unnatural analogs were examined as substrates for KMTs employing MALDI-TOF MS assays. We previously established the standard conditions under which the natural histone sequences are efficiently methylated by various human KMTs [24]. Under standard conditions, a mixture of KMT enzyme (2 µM), H3/H4 histone peptide (100 µM), and SAM (200 µM for monomethylation and 500 µM for trimethylation) in 50 mM Tris buffer (pH 8.0) was incubated for 1 h at 37 °C, and then examined by MALDI-TOF MS. SETD8 (SET domain containing lysine methyltransferase 8) catalyzed almost quantitative monomethylation of H4K20 (Fig. 3a, top panel, and Supplementary Fig. 13), whereas G9a and GLP (G9a-Like Protein) catalyzed the predominant formation of H3K9me3 (Fig. 3b, c, top panels, and Supplementary Figs. 14 and 15). Initially, we examined whether SETD8 could catalyze the methylation of the H4Kaza20 and H4Koxy20 peptides. In the presence of SETD8 and SAM, these two peptides underwent quantitative monomethylation (H4Kaza20me and H4Koxy20me), similar to the H4K20 peptide (Fig. 3a and Supplementary Fig. 13). To verify that the formation of monomethylated species is due to the enzyme activity in the presence of SAM, control experiments were carried out in the absence of SETD8 and SAM with both unnatural substrates. As expected, no methylation was detected (Supplementary Figs. 16 and 17). Enzyme kinetics analyses revealed that H4K20, H4Kaza20, and H4Koxy20 exhibit similar kinetic parameters for SETD8-catalyzed methylation reactions (Supplementary Fig. 18). Measured substrate efficiencies (kcat/Km) were: 38.5 mM⁻¹ min⁻¹ for H4K20, 38.9 mM⁻¹ min⁻¹ for H4Kaza20, and 30.3 mM⁻¹ min⁻¹ for H4Koxy20 (Table 1).
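To make the relation between kcat, Km, and the substrate efficiency kcat/Km concrete, the following Python sketch assumes simple Michaelis-Menten behaviour. The kcat and Km inputs are hypothetical round values of the same order as those in Table 1 (they are not the fitted parameters of any particular peptide), and the function names are illustrative only. The enzyme and peptide concentrations are those of the standard assay.

```python
def catalytic_efficiency(kcat_per_min, km_uM):
    """Return kcat/Km in mM^-1 min^-1, given kcat in min^-1 and Km in µM."""
    km_mM = km_uM / 1000.0  # convert µM to mM
    return kcat_per_min / km_mM


def initial_rate(kcat_per_min, km_uM, enzyme_uM, substrate_uM):
    """Michaelis-Menten initial rate in µM min^-1 (no product inhibition assumed)."""
    return kcat_per_min * enzyme_uM * substrate_uM / (km_uM + substrate_uM)


# Hypothetical illustrative parameters (assumption, not Table 1 entries).
KCAT = 4.5   # min^-1
KM = 125.0   # µM

# Standard assay concentrations quoted in the text.
ENZYME = 2.0      # µM KMT
PEPTIDE = 100.0   # µM histone peptide

if __name__ == "__main__":
    print("kcat/Km (mM^-1 min^-1):", catalytic_efficiency(KCAT, KM))
    print("initial rate (µM min^-1):", initial_rate(KCAT, KM, ENZYME, PEPTIDE))
```

Because the peptide concentration in the standard assay is close to Km, the enzyme operates at roughly half of its maximal rate, so small differences in Km between substrates translate into only modest differences in initial rate.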
Both kcat values (4.14-4.89 min⁻¹) and Km values (114-137 µM) were comparable for H4K20, H4Kaza20, and H4Koxy20, indicating similar substrate binding affinity and efficiency of the methyl transfer reaction (Table 1). We subsequently screened the remaining six peptides as possible substrates for SETD8 (Supplementary Fig. 19). H4nArg20 (positively charged) and H4hGln20 (neutral) did not undergo SETD8-catalyzed methylation within limits of detection (<5%), demonstrating that not all N-nucleophiles act as substrates for KMTs. Next, enzymatic reactions were undertaken to determine whether the replacement of the positively charged Nε-amino group of lysine by the simplest O-nucleophiles (i.e., -OH and -COOH) leads to SETD8-catalyzed methylation. However, no O-methylation was detected within detection limits by MALDI-TOF MS. Similarly, the electron-rich triple and double bonds did not react with SAM in the presence of SETD8. Moreover, prolonged incubation (1 and 5 h) in the presence of high concentrations of SETD8 (10 µM) and SAM (1 mM) also did not lead to observable formation of methylated products (Supplementary Figs. 20 and 21). Attention was then focused on the examination of H3Kaza9 and H3Koxy9 as possible substrates for G9a and GLP. Strikingly, H3Kaza9 underwent predominant formation of the trimethylated product H3Kaza9me3 in the presence of G9a; the same methylation level was observed with the natural H3K9 forming H3K9me3 under standard conditions (Fig. 3b, …

Competition studies were then carried out to determine whether the eight lysine analogs that display different nucleophilic character inhibit G9a-catalyzed trimethylation of the 14-mer H3K9 peptide. The examination was performed in the presence of equimolar amounts of each of the 15-mer histone peptides that possess unnatural lysine analogs and the 14-mer H3K9 natural sequence.
The degree of methylation of 14-mer H3K9 during the competition experiments was compared to a control sample in the absence of competing peptide. Experiments revealed that all these analogs were able to bind the active site of G9a, thus leading to a partial inhibition of G9a, as manifested by a reduced intensity of 14-mer H3K9me3 and increased intensities of H3K9me2 and H3K9me, and in some cases H3K9 signals. Following these competition experiments, we carried out additional inhibition studies aimed at providing IC50 values. All unmethylated histone peptides were screened for inhibition at 100 µM concentration in the presence of G9a or GLP (100 nM), SAM (20 µM), and 14-mer H3K9 peptide (5 µM) in glycine assay buffer (pH 8.8) at 37 °C for 35 min. The initial rates of 14-mer peptide trimethylation in the samples were compared to a control sample in the absence of unmethylated histone peptides. MALDI-TOF MS data showed no significant (IC50 > 100 µM) inhibition of 14-mer H3K9 peptide trimethylation by the unnatural histone peptides within the …

Examining KMT-catalyzed methylation by NMR spectroscopy. After investigating the methylation of different nucleophilic analogs of lysine in the presence of SETD8, G9a, and GLP, we then carried out detailed NMR studies to provide additional information about the level and site of methylation. Before NMR analysis of SETD8- and G9a-catalyzed methylation of lysine analogs, all synthetic peptides were fully characterized by 1D and 2D NMR analyses. The 1H NMR spectrum of the reaction mixture that contains the H4K20 peptide (400 µM), SAM (2 mM), and SETD8 (8 µM) in Tris-d11 buffer (50 mM, pD 8.0) showed a downfield shift for the CH2ε (2.94 ppm) of K20, a characteristic new singlet (2.62 ppm) for NMe, and a triplet for SAH (2.61 ppm) (Fig. 4a and Supplementary Fig. 50). This result is consistent with previously reported data on SETD7-catalyzed monomethylation of H3K4 (ref. 30).
The Heteronuclear Single Quantum Coherence (HSQC) NMR experiment showed that chemical shifts change upon the installation of the methyl group on Nε. The 1H NMR resonance at 2.94 ppm (13C: 48.3 ppm) is lysine CH2ε, and the 1H resonance at 2.62 ppm (13C: 32.0 ppm) is lysine NMe (Fig. 4d); these results were further supplemented with Total Correlation Spectroscopy (TOCSY) and Heteronuclear Multiple Bond Correlation (HMBC) analyses (Supplementary Fig. 51). To provide unambiguous evidence for the methylation of H4K20 by …

We then performed the analysis of methylation of H4Kaza20 and H4Koxy20 by SETD8. The appearance of new singlet resonances at 2.48 ppm for H4Kaza20 and 2.55 ppm for H4Koxy20 indicated that SETD8 catalyzed methylation of both lysine analogs (Fig. 4b, c, respectively). The 1H-13C HSQC spectrum revealed that the 1H resonance at 2.76 ppm (13C: 47.8 ppm) is H4Kaza20me CH2δ, and the 1H resonance at 2.48 ppm (13C: 35.1 ppm) is NMe of H4Kaza20me (Fig. 4e). The site of methylation was confirmed by HMBC analysis; the NMe resonance at 2.48 ppm did not show correlation with CH2δ at 47.8 ppm (Supplementary Fig. 55). A correct coupling network for H4Kaza20me was supported by TOCSY analysis (Supplementary Fig. 55). The 1H-13C HSQC spectrum showed that the 1H resonance at 3.65 ppm (13C: 71.9 ppm) is H4Koxy20me CH2δ, and the 1H resonance at 2.55 ppm (13C: 37.6 ppm) is NMe of H4Koxy20me (Fig. 4f). The NMe resonance at 2.55 ppm did not show correlation with CH2δ at 71.9 ppm in the HMBC spectrum, while the TOCSY spectrum further supported the coupling network for H4Koxy20me (Supplementary Fig. 56). The integral ratio between the arginine CH2δ at 3.11 ppm and the NMe resonance at 2.48 ppm (H4Kaza20me)/2.55 ppm (H4Koxy20me) was 6:3, whereas the ratio between arginine CH2δ and SAH-CH2γ at 2.61 ppm was 6:2.
These results suggest that the methylation of H4Kaza20/H4Koxy20 by SETD8 is quantitative and also that the conversion of SAM to SAH is tightly coupled with the enzymatic process (Supplementary Figs. 57 and 58).
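The stoichiometric reading of these integral ratios can be written out as a short calculation. The Python sketch below is only an illustration, assuming that the arginine CH2δ signal serves as the internal reference and that the H4 peptide (GGAKRHRX20VLRDNIQ) contributes three arginines, i.e., six CH2δ protons; the function name is hypothetical.

```python
def groups_per_peptide(signal_integral, protons_per_group,
                       reference_integral, reference_protons):
    """Number of groups per peptide, normalised to an internal reference signal.

    signal_integral    : integral of the resonance of interest (e.g., NMe)
    protons_per_group  : protons contributing per group (3 for one N-methyl)
    reference_integral : integral of the reference resonance (Arg CH2-delta)
    reference_protons  : protons per peptide in the reference (3 Arg x 2 H = 6)
    """
    integral_per_proton = reference_integral / reference_protons
    return (signal_integral / protons_per_group) / integral_per_proton


# Integral ratios reported in the text: Arg CH2-delta : NMe = 6:3 and
# Arg CH2-delta : SAH CH2-gamma = 6:2.
ARG_INTEGRAL, ARG_PROTONS = 6.0, 6

methyl_groups = groups_per_peptide(3.0, 3, ARG_INTEGRAL, ARG_PROTONS)
sah_molecules = groups_per_peptide(2.0, 2, ARG_INTEGRAL, ARG_PROTONS)

if __name__ == "__main__":
    print("N-methyl groups per peptide:", methyl_groups)
    print("SAH molecules per peptide:", sah_molecules)
```

Under these assumptions, both ratios correspond to one methyl group installed and one SAH formed per peptide, which is the basis for describing the monomethylation as quantitative and coupled to SAM consumption.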
Next, G9a-catalyzed methylation of H3K9 peptides was investigated by NMR. We first performed the reaction of H3K9 with G9a and SAM. The NMR results of the reaction mixture that contains G9a (8 µM), the H3K9 peptide (400 µM), and SAM (2 mM) in Tris-d11 buffer (50 mM, pD 8.0) after 1 h at 37 °C largely mirrored the results from experiments using functionally related GLP (Supplementary Fig. 6a, d, and Supplementary Fig. 59) [25]. This observation was further supported by 2D TOCSY and HMBC analyses (Supplementary Fig. 60). Under the same conditions, we then investigated the site and level of G9a-catalyzed methylation of H3Kaza9 and H3Koxy9. The 1H NMR spectrum of the enzymatic reaction of H3Kaza9 showed a new singlet resonance at 3.25 ppm (9H) and a triplet resonance at 2.61 ppm (2H), which corresponds to SAH-CH2γ. In agreement with MALDI-TOF MS data, this result implies that H3Kaza9 underwent trimethylation by G9a (Fig. 5b). The 1H-13C HSQC spectrum showed a chemical shift change at 2.98 ppm (13C: 42.1 ppm, CH2δ), and the 1H resonance at 3.25 ppm (13C: 54.1 ppm, NMe3) derived from H3Kaza9me3 (Fig. 5e). Additional HMBC and TOCSY analyses confirmed that G9a-catalyzed trimethylation of Kaza occurs on the terminal amine (Supplementary Fig. 61). The NMR spectrum of G9a-catalyzed methylation of H3Koxy9 showed a singlet resonance at 2.52 ppm (6H) and a triplet resonance at 2.61 ppm (2H) (SAH-CH2γ) (Fig. 5c and Supplementary Fig. 62) … (Fig. 5f). HMBC analysis did not show correlation between the NMe2 resonance at 2.52 ppm and CH2δ at 70.4 ppm, and TOCSY showed the proper coupling network (Supplementary Fig. 63). After confirming that the H4K20, H4Kaza20, and H4Koxy20 peptides, as well as the H3K9, H3Kaza9, and H3Koxy9 peptides, underwent methylation in the presence of SETD8 and G9a, respectively, we carried out additional 1H NMR analyses with histone peptides that bear the other six unnatural lysine analogs.
The absence of new characteristic resonances (singlets) in the spectra of enzymatic reactions with these analogs and the lack of an indicative triplet for SAH-CH2γ at 2.61 ppm imply that lysine analogs that bear N-amide, N-guanidine, O-nucleophiles, and C-nucleophiles are not methylated (Supplementary Figs. 64-69). Collectively, our NMR observations indicate that H4K20, H4Kaza20, and H4Koxy20 act as substrates for SETD8, and that H3K9, H3Kaza9, and H3Koxy9 are substrates for G9a. Other N-, O-, and C-nucleophiles in our panel of simplest lysine analogs, however, were not methylated by SETD8 and G9a. It is worth stressing that these NMR findings are in complete agreement with results from our MALDI-TOF-based assays.
Quantum mechanics/molecular mechanics studies. Computer simulations can provide additional information concerning the energetic and structural origins of KMT activities on different lysine analogs. Previous computational studies have shown that the free-energy barriers for the methyl transfers are the key determinants of product specificity and of whether the enzymes are active in catalyzing certain methylation processes [17]. The QM/MM MD and free-energy simulations were performed for SETD8 (GLP) complexed with H4K20 (H3K9) and the analogs containing the N-nucleophiles. The free-energy profiles for the first, second, and third methylation reactions in SETD8 involving lysine, methyllysine, and dimethyllysine in histone substrates, respectively, are plotted in Fig. 6a. The general trend of the free-energy barriers for the SETD8-catalyzed methyl transfers obtained here is quite similar to that obtained in our previous study [31]; i.e., the free-energy barrier increases by about 8 kcal mol⁻¹ in Fig. 6a and by 6.5 kcal mol⁻¹ in the earlier study in going from monomethylation to dimethylation. Both results suggest that the enzyme is a monomethyltransferase. Some structural information obtained from the simulations is also provided here. The average active-site structure of the reactant complex for the first methylation is given in Fig. 6b (Supplementary Fig. 70), which shows that the lone pair of electrons on Nε of the target lysine (based on the sp3 hybridization) is well aligned with the transferable methyl group of SAM, with a relatively short r(CM···Nε) distance (~3.4 Å). Supplementary Fig. 71c shows that, for the reactant complex of the third methylation, the average distance between Nε and the methyl group (~4.4 Å) became significantly larger than that for the first methylation, and the S-CH3 group of SAM cannot be well aligned with the lone pair of electrons on Nε for the third methyl transfer. Thus, the efficiency of the corresponding methyl transfer is likely to be significantly compromised. This conclusion is consistent with the results in Fig. 6a, which show that the free-energy profile before reaching the transition state (TS) is shifted to the left of that of the first methylation and that the free-energy barrier for the third methylation is increased by ~5 kcal mol⁻¹.
A similar discussion applies to the second methylation reaction involving the monomethyllysine substrate (Supplementary Fig. 71a). Figure 6c shows that the interactions with the substrate are strengthened near the transition state, which may contribute to lowering the free-energy barrier for the methyl transfer.
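The geometric quantities discussed here can be illustrated with a small calculation of the reaction coordinate used for the free-energy profiles, R = r(CM···Sδ) − r(CM···Nε). The Python sketch below is a minimal illustration rather than the simulation protocol: the coordinates describe a hypothetical collinear S···C···N arrangement, the C-S bond length of 1.8 Å is an assumed typical value, and only the ~3.4 Å CM···Nε distance is taken from the text.

```python
import math


def reaction_coordinate(c_m, s_delta, n_eps):
    """R = r(CM...S_delta) - r(CM...N_eps), in the units of the input coordinates.

    Negative values correspond to reactant-like geometries (methyl still bound
    to sulfur); R increases as the methyl group moves toward the nitrogen.
    """
    return math.dist(c_m, s_delta) - math.dist(c_m, n_eps)


# Hypothetical collinear geometry in Å (assumption for illustration only).
S_DELTA = (0.0, 0.0, 0.0)        # sulfur of SAM
C_M = (1.8, 0.0, 0.0)            # transferable methyl carbon, assumed C-S = 1.8 Å
N_EPS = (1.8 + 3.4, 0.0, 0.0)    # acceptor nitrogen, ~3.4 Å from CM (from the text)

if __name__ == "__main__":
    print("R (Å):", reaction_coordinate(C_M, S_DELTA, N_EPS))
```

With this sign convention, a reactant complex with a longer CM···Nε distance, as described for the third methylation, starts at a more negative value of R.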
The free-energy profile for the methylation reaction involving H4Kaza20 in SETD8 is given in Fig. 6d, which shows a free-energy barrier of 18.0 kcal mol⁻¹. The free-energy profile for the methylation reaction involving H4Koxy20 in SETD8 is given in Fig. 6g. Comparison of Fig. 6a, g shows that there is a small increase in the free-energy barrier (by about 2 kcal mol⁻¹) in going from K20 to Koxy20 for the first methylation reaction. Although Koxy is an α-nucleophile, this result seems to be consistent with the fact that the oxygen atom in Koxy20 is more electronegative than the carbon atom and therefore has a higher tendency to draw electrons from the neighboring N atom. This would decrease the nucleophilicity of NH2 and make it more difficult to accept the methyl group from SAM. Nevertheless, the increase of the barrier is rather small, and this is consistent with the enzyme kinetics data given above (Table 1), which showed that the monomethylation reaction can occur for K20, Kaza20, and Koxy20. Figure 6h shows that the active-site structure for the reactant complex with Koxy20 is quite similar to that with K20 (Fig. 6b). For instance, the distances between the methyl donor (CM) and acceptor (N) are 3.36 Å and 3.37 Å for the cases involving Koxy20 and K20, respectively, and in both cases the lone pair of electrons on Nε is well aligned with the methyl group of SAM. Tyr245 forms a relatively stronger hydrogen bond with the ε-amino group of Koxy20 (i.e., a hydrogen bond distance of 3.06 Å compared to 3.20 Å in Fig. 6b). This observation is consistent with the suggestion that there is a decrease of nucleophilicity for NH2 (i.e., H may carry a more positive partial charge and be able to form a stronger hydrogen bond). The structure near the transition state with Koxy20 (Fig. 6i) is also quite similar to that with K20 (Fig. 6c).
It should be pointed out that the changes of reactivity measured experimentally or free-energy barriers determined computationally can result from different factors, including, but not limited to, alterations of the electronic structures among different substrates (leading to different intrinsic reactivity) and the structural fits of substrates to the active sites. The free-energy profiles for the first methylation reactions involving H4hGln20 (Fig. 6j) and H4nArg20 in SETD8 (Fig. 6m, Supplementary Fig. 72), respectively, show that the free-energy barriers are very high (e.g., 43 kcal mol⁻¹ for H4hGln20), suggesting that the methylation reactions cannot occur for these two substrate analogs, consistent with the experimental observations.

Fig. 6 Computational analyses on SETD8-catalyzed methylation. a Free-energy (potential of mean force) profiles for the first, second, and third methylation reactions in SETD8 involving K, Kme, and Kme2, respectively, as a function of the reaction coordinate [R = r(CM···Sδ) − r(CM···Nε)]; the designation of CM, Sδ, and Nε is shown in Fig. 6b. First methylation: blue line with a free-energy barrier of 19.4 kcal mol⁻¹; second methylation: gray line with a free-energy barrier of 27.5 kcal mol⁻¹; third methylation: orange line with a free-energy barrier of 24.1 kcal mol⁻¹. b Representative active-site structure of the reactant complex of SETD8 for the first methylation containing SAM and lysine. Non-relevant hydrogen atoms are not shown for clarity. SETD8 is shown in sticks, and SAM and lysine are in balls and sticks. Some average distances are given in Å. c Representative active-site structure near the transition state of the SETD8 complex for the first methylation of H4K20. d Free-energy profile for the first methylation reaction involving H4Kaza20. e Representative active-site structure of the reactant complex of SETD8 for the first methylation containing SAM and H4Kaza20. f Representative active-site structure near the transition state of the SETD8 complex for the first methylation of H4Kaza20. g Free-energy profile for the first methylation reaction involving H4Koxy20. h Representative active-site structure of the reactant complex of SETD8 for the first methylation containing SAM and H4Koxy20. i Representative active-site structure near the transition state of the SETD8 complex for the first methylation of H4Koxy20. j Free-energy profile for the first methylation reaction involving H4hGln20. k Representative active-site structure of the reactant complex of SETD8 for the first methylation containing SAM and H4hGln20. l Representative active-site structure near the transition state of the SETD8 complex for the first methylation of H4hGln20. m Free-energy profile for the first methylation reaction to Nη1 involving H4nArg20. n Representative active-site structure of the reactant complex of SETD8 for the first methylation containing SAM and H4nArg20. o Representative active-site structure near the transition state of the SETD8 complex for the first methylation of H4nArg20.

The free-energy profiles for the first, second, and third methylation reactions in GLP involving K, Kme, and Kme2 as the substrates, respectively, are given in Fig. 7a. The free-energy barriers for the three methylation reactions are quite similar and all rather low: 17.0, 17.8, and 17.0 kcal mol⁻¹ for the first, second, and third methylation, respectively. The similar and low free-energy barriers indicate that all three methylation reactions occur. The results are consistent with the experimental observations from this work, but different from some previous computational investigations. In our earlier study [32], we found that GLP could only produce mono- and dimethyllysine products.
Nevertheless, the earlier simulations were based on older SCC-DFTB parameters with an empirical scaling and two different X-ray structures for different methylation reactions (which might lead to some inconsistency). Interestingly, the active-site structure for the reactant complex of the third methylation (Fig. 7b, Supplementary Fig. 73) seems to be rather similar to that obtained in the earlier simulations, and both structures, obtained here and earlier, showed that the lone pair of electrons could not be well aligned with the transferable methyl group, with an average r(CM···Nε) distance of about 4.1-4.5 Å. The similar free-energy barriers in Fig. 7a for all three methylation reactions in GLP suggest that some additional transition-state stabilization may exist for the third methylation reaction (to offset the poor reactant structure for the third methyl transfer) (Supplementary Fig. 74). Figure 7c shows that the active-site structure near the transition state is stabilized through strengthened CH···O interactions as well as by the presence of cation-π interactions involving F1209 and Y1124. A similar explanation has been used to understand the substrate/product specificities of Suv4-20h2 (ref. 33).
The free-energy profiles for the first, second, and third methylation reactions in GLP involving H3Kaza9, H3Kaza9me, and H3Kaza9me2, respectively, are shown in Fig. 7d; the active-site structures for the reactant complexes of the second and third methylation reactions are given in Fig. 7e, f, respectively (Supplementary Fig. 75). The free-energy barriers for all three methylation reactions are similar and rather low (15.9-17.2 kcal mol⁻¹), indicating that GLP is a trimethyltransferase for H3Kaza9, in agreement with the experimental data. The free-energy profiles for the first, second, and third methylation reactions of H3Koxy9 in the presence of GLP are given in Fig. 7g. While the free-energy barriers for the first and second methylation reactions are close to each other, with a difference of only ~1 kcal mol⁻¹, the barrier for the third methylation is significantly higher (i.e., about 4 kcal mol⁻¹ higher than that of the first methylation) (Fig. 7g, Supplementary Fig. 76). The results support the experimental observations that GLP can only catalyze mono- and dimethylation of H3Koxy9. The free-energy profile for the first methylation reaction involving H3hGln9 in GLP is given in Fig. 7j; the active-site structures for the reactant complex and near the transition state are shown in Fig. 7k, l, respectively. As is evident from Fig. 7j, the free-energy barrier is as high as 42 kcal mol⁻¹, implying that the methylation cannot occur on H3hGln9 in the presence of GLP, in line with our experimental observations.
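To give a feel for what differences of a few kcal mol⁻¹ in the computed barriers mean, the barriers can be converted into approximate rate constants with the Eyring equation of transition-state theory, k = (kB·T/h)·exp(−ΔG‡/RT). The Python sketch below is a rough illustration only: it assumes a transmission coefficient of 1 and a temperature of 310.15 K (the 37 °C assay temperature, which is not necessarily the simulation temperature), and computed barriers of this kind are not expected to reproduce experimental kcat values quantitatively. The function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J K^-1
H = 6.62607015e-34      # Planck constant, J s
R_GAS = 1.987204e-3     # gas constant, kcal mol^-1 K^-1


def eyring_rate(barrier_kcal_mol, temperature_k=310.15):
    """Eyring rate constant (s^-1) for a free-energy barrier in kcal mol^-1.

    Assumes a transmission coefficient of 1 (illustrative assumption).
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_mol / (R_GAS * temperature_k))


def relative_rate(barrier, reference_barrier, temperature_k=310.15):
    """Rate for `barrier` divided by the rate for `reference_barrier`."""
    delta = barrier - reference_barrier
    return math.exp(-delta / (R_GAS * temperature_k))


# Barriers (kcal mol^-1) quoted in the text for SETD8.
SETD8_BARRIERS = {"K20 (first)": 19.4, "Kme (second)": 27.5,
                  "Kme2 (third)": 24.1, "Kaza20": 18.0, "hGln20": 43.0}

if __name__ == "__main__":
    for label, barrier in SETD8_BARRIERS.items():
        print(f"{label}: k ~ {eyring_rate(barrier):.3g} s^-1, "
              f"relative to K20 = {relative_rate(barrier, 19.4):.3g}")
```

On this estimate, each additional 1.4 kcal mol⁻¹ of barrier slows the reaction by roughly one order of magnitude at 37 °C, which is why a barrier above 40 kcal mol⁻¹ corresponds to no observable turnover while barriers within 1-2 kcal mol⁻¹ of each other are compatible with similar kinetic parameters.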

Discussion
Understanding the molecular origin of enzyme catalysis that plays essential roles in human health and disease is important from a basic molecular perspective as well as from a biomedical perspective. Despite ongoing examinations of basic biomolecular requirements that define the activity of numerous enzymes, an in-depth understanding of the underlying chemical mechanisms that control the enzymatic methylation of lysine and other residues remains incomplete. SAM-dependent methyltransferases represent a widespread and important class of enzymes that catalyze N-, O-, and C-methylation reactions in all kingdoms of life [34,35]. Our work highlights that cooperative experimental and computational investigations enable the exploration of the chemical foundation for human KMT-catalyzed methylation of histones that possess lysine and its simplest analogs at an unprecedented level of molecular detail. The nucleophilic character and the basicity of lysine and analogous N-, O-, and C-nucleophiles, as well as the conformations of the substrates at the active sites, appear to define whether the enzymatic methylation takes place or not. In comparison with lysine, protonated forms of Kaza and Koxy are slightly stronger acids that undergo easier deprotonation by KMTs [36,37], but their unprotonated forms are somewhat poorer nucleophiles than lysine [38,39]. The nucleophilic characters and the binding conformations of Kaza and Koxy (as demonstrated by computer simulations) may therefore contribute to the observation that Kaza and Koxy can in general undergo KMT-catalyzed methylation to a similar degree as lysine. The lack of methylation of nArg by KMTs indicates that the Tyr-rich active sites of KMTs may not be able to deprotonate the weakly acidic guanidinium cation of nArg and that the electron lone pair of nArg may not be able to align well with the methyl group of SAM for the methyl transfer.
Arginine methylation is instead catalyzed by functionally related arginine methyltransferases (RMTs), which have different active sites with a well-aligned methyl donor and acceptor and which contain negatively charged Glu residues for deprotonating the weakly acidic guanidinium group of Arg during the methyl transfer process (Supplementary Fig. 77) 40,41. Our experimental observations that K OH does not undergo methylation by SETD8, G9a, and GLP suggest that deprotonation of the very poorly acidic hydroxyl group cannot take place in the KMT active site, thus leading to an inactive substrate (OH is a much poorer nucleophile than O −) 42.
The elucidation of the chemical foundation of epigenetics remains one of the great challenges of modern biomolecular sciences. It is envisaged that current chemical biology approaches will contribute to an advanced understanding of biomolecular recognition and enzyme-catalyzed posttranslational modifications on histones and other proteins 5,9,43-46. Toward this aim, our integrated synthetic, enzymatic, and computational studies demonstrate that the biocatalytic scope of biomedically important KMTs is limited to N-methylation, and that the nucleophilic character and related basicity of the functional group importantly contribute to the efficiency of the enzymatic methylation reaction.

Fig. 7 Computational analyses on GLP-catalyzed methylation. a Free-energy profiles for the first, second, and third methylation reactions in GLP involving K, Kme, and Kme2, respectively. The color scheme for the profiles is the same as in a. Free-energy barrier of the first methylation: 17.0 kcal mol⁻¹; second methylation: 17.8 kcal mol⁻¹; third methylation: 17.0 kcal mol⁻¹. b Representative active-site structure of the reactant complex of GLP for the third methylation containing SAM and H3K9me2. Non-relevant hydrogen atoms are not shown here for clarity. c Representative active-site structure near the transition state of the GLP complex for the third methylation of H3K9me2. d Free-energy profiles for the first, second, and third methylation reactions in GLP involving H3K aza 9, H3K aza 9me, and H3K aza 9me2, respectively. Free-energy barrier of the first methylation: 15.9 kcal mol⁻¹; second methylation: 17.2 kcal mol⁻¹; third methylation: 16.5 kcal mol⁻¹. e Representative active-site structure of the reactant complex of GLP for the second methylation containing SAM and H3K aza 9me. f Representative active-site structure of the reactant complex of GLP for the third methylation containing SAM and H3K aza 9me2. g Free-energy profiles for the first, second, and third methylation reactions in GLP involving H3K oxy 9, H3K oxy 9me, and H3K oxy 9me2, respectively. Free-energy barrier of the first methylation: 17.5 kcal mol⁻¹; second methylation: 18.8 kcal mol⁻¹; third methylation: 21.3 kcal mol⁻¹. h Representative active-site structure of the reactant complex of GLP for the second methylation containing SAM and H3K oxy 9me. i Representative active-site structure of the reactant complex of GLP for the third methylation containing SAM and H3K oxy 9me2. j Free-energy profile for the first methylation reaction involving H3hGln9. k Representative active-site structure of the reactant complex of GLP for the first methylation containing SAM and H3hGln9. l Representative active-site structure near the transition state of the GLP complex for the first methylation of H3hGln9.

Methods
Solid-phase synthesis of histone peptides. Histone peptides bearing lysine and its analogs were synthesized on Wang resin using Fmoc solid-phase peptide synthesis (SPPS). Coupling of the amino acids was carried out for 1 h at room temperature with 3.0 equiv. of the desired amino acid, 3.6 equiv. of 1 M N-hydroxybenzotriazole (HOBt) in DMF, and 3.3 equiv. of N,N′-diisopropylcarbodiimide (DIPCDI). Fmoc-protected nucleophilic lysine analogs were coupled overnight. Deprotection of the Fmoc groups was carried out with piperidine in DMF (20%, v/v) for 30 min. After each coupling and deprotection step, a Kaiser test was done to ensure completion of the reaction. After the final Fmoc removal, the peptides were cleaved from the resin with mild cleaving reagents, to ensure that the acid-labile protecting groups remained intact. Cleavage was performed with a mixture of 95% trifluoroacetic acid (TFA), 2.5% triisopropylsilane (TIS), and 2.5% water for 4 h at room temperature. Crude peptides were purified by reverse-phase HPLC.
Fractions containing the pure peptide were collected, frozen, and lyophilized to afford the product as an off-white solid. The purity of the histone peptides was examined by analytical HPLC, and the predicted masses were confirmed by MALDI-TOF MS, LC-MS, and ESI-MS. Results of the characterization of histone peptides are presented in Supplementary Figs. 3-12.
HPLC and ESI-MS analyses of histone peptides. Lyophilized crude H3 and H4 peptides were purified by prep-HPLC on a Phenomenex® Gemini-NX 3u C-18 110A reversed-phase column (150 × 21.2 mm) using gradient elution at a constant flow rate of 10 mL min⁻¹ and a temperature of 30 °C. A typical run for all histone peptides was performed as follows: C-18 reverse-phase column; after 3 min at 3% B, a gradient of 3-15% over 12 min was introduced, followed by a gradient of 15-30% over 17 min and from 30 to 100% B over 19 min, proceeding with 100 to 100% over 21 min, finalized by 3 min at 100% CH3CN (total runtime 30 min). Solvent A is 0.1% TFA in H2O; Solvent B is 0.1% TFA in acetonitrile. The amount of sample applied to the preparative column was 10-15 mg in 1 mL of MilliQ water (100 µL injection per run). The crude peptide samples were filtered through syringe filters (0.22 µm, Screening Devices B.V, The Netherlands) prior to injection onto the column. H3 peptides eluted at 8-11 min, whereas H4 peptides eluted at 15-20 min. Pure fractions containing product were combined, frozen, and freeze-dried overnight to produce pure histone peptides as an off-white solid. Lyophilization was achieved using an ilShin Freeze Dryer (ilShin, Ede, The Netherlands). The purified peptides were characterized by analytical HPLC, MALDI-MS, and LC-MS. Analytical HPLC was performed on a Shimadzu LC-2010A HPLC system (Shimadzu, Kyoto, Japan) using an RP C-18 column from Phenomenex (Prodigy ODS3, particle size 5 µm, pore size 110 Å, length 150 mm, internal diameter 4.60 mm). Linear gradients of acetonitrile (+0.1% TFA) into H2O (+0.1% TFA) were run at a flow rate of 1 mL min⁻¹ over 50 min. A peptide concentration of 1.0 mg mL⁻¹ in MilliQ water offered optimal resolution and separation with the following gradient: after 1 min at 5%, a gradient of 5 to 100% over 30 min was introduced, followed by 5 min at 100% and a gradient of 100 to 5% in 5 min.
Histone peptides were detected at a wavelength of 214 nm. The retention time of each peptide is shown at the top of the corresponding peak in the HPLC chromatogram. The MilliQ water used was purified using a WaterPro PS Polisher (Labconco), set to 18.2 MΩ cm. Mass spectrometric analyses of the H3 and H4 peptides were carried out by ESI-MS (Thermo Finnigan LCQ Advantage Max) operating in positive ionization mode, which was performed on a Thermo Finnigan LCQ-Fleet ESI-ion trap (Thermo Fisher, Breda, The Netherlands) equipped with a Phenomenex Gemini-NX C-18 column, 50 × 2.0 mm, particle size 3 µm (Phenomenex, Utrecht, The Netherlands). Linear gradients of acetonitrile (+0.1% formic acid) into H2O (+0.1% formic acid) were run at a flow rate of 0.2 mL min⁻¹ over 50 min. Ions were scanned in a range of m/z 50-2000 in MS mode. Multiply charged molecular-related ions of each peptide were detected. The observed masses matched the predicted peptide masses, which are summarized in Supplementary Table 2.
Expression and purification of the KMTs. The expression and purification of SETD8 (residues 186-352), G9a (residues 913-1193), and GLP (residues 951-1235) were carried out as previously described 25. Briefly, the WT enzymes were recombinantly expressed in E. coli Rosetta BL21 (DE3)pLysS cells, using LB broth supplemented with kanamycin and chloramphenicol. The cultures were induced with isopropyl β-D-1-thiogalactopyranoside (IPTG). Cells were harvested by centrifugation and lysed, and the expressed proteins were purified using a Ni-NTA affinity column and size-exclusion chromatography on an AKTA system. Protein purity was monitored by SDS-PAGE, and the concentrations were determined using the Nanodrop DeNovix DS-11 spectrophotometer.
Methyltransferase activity assays. Under standard conditions, methyltransferase activity assays were carried out in a 50 µL final volume for 1 h at 37 °C and analyzed by MALDI-TOF MS. Assay conditions for the selected KMTs are described here. For SETD8, the reaction contained enzyme (2 µM), H4 peptide (GGAKRHRK 20 VLRDNIQ) or any of its unnatural analogs (100 µM), and SAM (200 µM) in 50 mM Tris-HCl (pH 8.0). For assays at high concentration and long incubation, SETD8 (10 µM) and SAM (1 mM) were used. For G9a and GLP, the reaction contained enzyme (2 µM), H3 peptide (ARTKQTARK 9 STGGKA) or any of its unnatural analogs (100 µM), and an excess of SAM (500 µM) in 50 mM Tris-HCl (pH 8). For assays at longer incubation time and high concentration, G9a or GLP (10 µM) and SAM (1 mM) were used. Samples were incubated in 1.5 mL Eppendorf vials in a thermomixer. A 5 µL aliquot of the reaction was quenched with 5 µL of MeOH to stop the enzymatic reaction before analysis by MALDI-MS. The samples were spotted on a stainless steel MALDI plate (MS 96 target ground steel BC of Bruker, Germany). The mass spectra were measured in the positive reflector mode using α-cyano-4-hydroxycinnamic acid matrix. Monomethylation was observed as a mass shift of +14 Da, dimethylation as +28 Da, and trimethylation as +42 Da. The MALDI-MS data were annotated using FlexAnalysis software (Bruker Daltonics, Germany). Enzymatic assays for methylated substrates were carried out in five repeats (distinct samples), whereas those for the unmethylated histone peptides were carried out in triplicate (distinct samples). The assays applied in this work directly measure, by mass shifts, the substrate activity of the peptides for SETD8, G9a, and GLP. Notably, no-enzyme and no-SAM controls were carried out to ensure that the conditions of the MALDI-TOF MS analysis did not affect the observable methylation states.
Laser power was adjusted to slightly above the threshold to obtain high resolution and signal/noise ratios. Each measurement was obtained by accumulating three spectra collected at different positions on the plate, 100 shots per position.
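The mass-shift readout described above (+14, +28, and +42 Da for mono-, di-, and trimethylation) can be sketched in a few lines of Python. This is a minimal illustration and not the FlexAnalysis workflow used in this work; the function name `methylation_state`, the 0.5 Da tolerance, and the example m/z values are all hypothetical. The monoisotopic mass added per methylation (CH2, 14.01565 Da) is used in place of the nominal +14 Da.

```python
# Net monoisotopic mass added per methylation (replacement of H by CH3 = +CH2)
METHYL_SHIFT_DA = 14.01565


def methylation_state(observed_mz, unmodified_mz, tolerance_da=0.5, max_methyls=3):
    """Return the number of methyl groups (0-3) implied by a mass shift.

    Returns None when the shift does not match any methylation state within
    the tolerance. Assumes singly charged ions, so m/z shifts equal mass shifts.
    """
    shift = observed_mz - unmodified_mz
    n_methyl = round(shift / METHYL_SHIFT_DA)
    if 0 <= n_methyl <= max_methyls:
        if abs(shift - n_methyl * METHYL_SHIFT_DA) <= tolerance_da:
            return n_methyl
    return None


# Hypothetical peak list for a peptide with an unmodified [M+H]+ of 1500.00
labels = {0: "unmethylated", 1: "me1", 2: "me2", 3: "me3", None: "unassigned"}
for peak in (1500.00, 1514.02, 1528.03, 1542.05, 1522.00):
    print(f"m/z {peak:.2f}: {labels[methylation_state(peak, 1500.00)]}")
```

A peak such as the hypothetical m/z 1522.00 (+22 Da, for instance a sodium adduct) is left unassigned rather than forced into the nearest methylation state.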
The kinetic assays for SETD8-catalyzed methylation of histone peptides were carried out using a MALDI-TOF MS assay to determine the initial velocities for the first methylation reaction 47. A solution of histone peptide (0-300 μM) was added to a solution of SAM (3 μM) in assay buffer (50 mM Tris, pH 8.0) at room temperature (final volume of 100 μL). The reaction was then initiated by the addition of SETD8 (2 µM) and shaken for 10 min. The enzyme activity was quickly quenched by the addition of methanol:water (1:1). The different reaction mixtures were aliquoted and mixed with α-cyano-4-hydroxycinnamic acid matrix prior to measurement. All experiments were carried out in replicates (distinct samples). The enzymatic activity was determined by taking the peak areas of each methylation state, including all isotopes and adducts, and is expressed relative to a control reaction in which no monomethylation is present, using the FlexAnalysis software. Kinetic values were extrapolated by plotting initial reaction velocities against peptide concentrations, using GraphPad Prism 5.
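As an illustration of how kinetic parameters can be obtained from a plot of initial velocity against peptide concentration, the Python sketch below fits the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), through the Hanes-Woolf linearization ([S]/v against [S]). This is an assumption-laden sketch: the text does not state which kinetic model was fitted in GraphPad Prism, a linearized fit is swapped in here for nonlinear regression to keep the example dependency-free, and the function names and the synthetic data (Vmax = 1.0, Km = 50 µM) are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(concs, rates):
    """Estimate (vmax, km) from the Hanes-Woolf plot of [S]/v against [S].

    The straight line has slope 1/vmax and intercept km/vmax. Points with
    zero concentration or zero velocity are skipped because [S]/v is undefined.
    """
    pairs = [(s, s / v) for s, v in zip(concs, rates) if s > 0 and v > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two usable data points")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data over the 0-300 uM peptide range used in the assay
peptide_um = [0, 10, 25, 50, 100, 200, 300]
velocities = [michaelis_menten(s, vmax=1.0, km=50.0) for s in peptide_um]

vmax_fit, km_fit = fit_hanes_woolf(peptide_um, velocities)
print(f"Vmax ~ {vmax_fit:.3f} (arbitrary units), Km ~ {km_fit:.1f} uM")
```

With real, noisy data, nonlinear regression on the untransformed velocities is generally preferable, because linearized plots distort the error structure.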
Methyltransferase inhibition assays. The inhibition assays were performed in 20 µL in Eppendorf vials in triplicate (distinct samples) as previously described 48. Unnatural histone peptide (0-100 µM final concentration) was preincubated with G9a or GLP (100 nM final concentration) for 5 min at 37 °C in 18 µL of 50 mM glycine pH 8.8 containing 2.5% glycerol as assay buffer. The reaction was initiated by the addition of 2 µL of a pre-mixture of SAM (20 µM final concentration of 200 μM stock) and 14-mer histone peptide (5 µM final concentration of 100 μM stock) to afford a final reaction volume of 20 µL. The enzymatic reaction was incubated for an additional 30 min. The reaction was then quenched by the addition of 20 µL of MeOH. A 2 µL aliquot of the quenched reaction was mixed with 2 µL of matrix solution (5.0 mg mL⁻¹ of α-CHCA in 50% acetonitrile/H2O, 0.1% TFA) and spotted on the MALDI plate for crystallization. The enzymatic activity was determined by taking the peak areas of each methylation state, including all isotopes and adducts, and is expressed relative to a control reaction in which no unnatural histone peptide is present, using the FlexAnalysis software. The half-maximal inhibitory concentrations (IC50) were calculated, and the inhibition curves drawn, using nonlinear regression in GraphPad Prism 5.
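For orientation, an IC50 can be roughly estimated from relative activities by interpolating, on a logarithmic concentration axis, between the two data points that bracket 50% activity. The Python sketch below shows this simpler estimate; it is not the nonlinear regression performed in GraphPad Prism, and the function name `ic50_log_interpolation` and the example data (generated from an ideal inhibition curve with an IC50 of 10 µM) are hypothetical.

```python
import math


def ic50_log_interpolation(concs_um, activities_pct):
    """Estimate IC50 by log-linear interpolation across the 50% activity crossing.

    activities_pct are activities relative to the no-inhibitor control (100%).
    Zero concentrations are skipped because log10(0) is undefined.
    Returns None when the data never cross 50%.
    """
    points = sorted((c, a) for c, a in zip(concs_um, activities_pct) if c > 0)
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical relative activities from an ideal curve: 100 / (1 + [I]/10 uM)
inhibitor_um = [0, 1, 5, 20, 100]
activity_pct = [100.0 / (1.0 + c / 10.0) for c in inhibitor_um]

print(f"Estimated IC50 ~ {ic50_log_interpolation(inhibitor_um, activity_pct):.1f} uM")
```

The function returns None when no pair of points brackets 50% activity, which is the expected outcome for a peptide that does not inhibit within the tested 0-100 µM range.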
NMR experiments. For the NMR experiments of SETD8 with H4K20 peptides, samples (300 µL final volume) were prepared containing SETD8 (8 µM), peptide (400 µM, diluted from a 2 mM stock in 50 mM Tris-D11·HCl at pD 8.0, supplemented with D2O), and SAM (2 mM, diluted from a 10 mM stock in 50 mM Tris-D11·HCl at pD 8.0, supplemented with D2O). After incubation for 1 h at 37 °C in an Eppendorf vial using a thermomixer, the reaction mixture was transferred into the NMR tube, diluted to 550 μL with Tris-D11·HCl buffer, and recorded by ¹H NMR at 298 K. For the NMR experiments of G9a with H3K9 peptides, similar conditions were applied. For each NMR experiment, an identical incubation was run in parallel but without enzyme as a control. NMR spectra were recorded using a Bruker Avance III-500 MHz magnet equipped with the Prodigy BB cryoprobe. Water suppression was performed by presaturation, and the 1D spectra were acquired with 128 or 256 transients and a relaxation delay of 4 s. 2D TOCSY spectra were acquired with presaturation of the water resonance using 1k points per transient, an 8.3 kHz spin-lock for 100 ms, 56 transients per increment with a relaxation delay of 2 s, and 512 increments with a sweep width of 10 ppm in each dimension. 2D ¹H-¹³C multiplicity-edited HSQC spectra were acquired using 1k points per transient, 64 transients per increment, a relaxation delay of 2 s, and 512 increments. The ¹³C sweep width spanned from −10 to 130 ppm. ¹H NMR characterization of substrates prior to enzymatic catalysis was performed using a 30° excitation pulse, 16-128 transients per compound, and a relaxation delay of 8 s. ¹H-¹³C spectra of the substrates were recorded using a 30° excitation pulse, 512-4096 transients per compound, and a relaxation delay of 2 s. ¹H and ¹³C chemical shifts were externally referenced to TMS based on the lock frequency of the solvent. NMR enzymatic experiments were conducted at 310 K. MestreNova was used to process the 1D and 2D NMR data.