Structural Analysis of Glycine Sarcosine N-methyltransferase from Methanohalophilus portucalensis Reveals Mechanistic Insights into the Regulation of Methyltransferase Activity

Methyltransferases play crucial roles in many cellular processes, and various regulatory mechanisms have evolved to control their activities. For methyltransferases involved in biosynthetic pathways, regulation via feedback inhibition is a commonly employed strategy to prevent excessive accumulation of the pathways’ end products. To date, no biosynthetic methyltransferases have been characterized by X-ray crystallography in complex with their corresponding end product. Here, we report the crystal structures of the glycine sarcosine N-methyltransferase from the halophilic archaeon Methanohalophilus portucalensis (MpGSMT), which represents the first structural elucidation of the GSMT methyltransferase family. As the first enzyme in the biosynthetic pathway of the osmoprotectant betaine, MpGSMT catalyzes N-methylation of glycine and sarcosine, and its activity is feedback-inhibited by the end product betaine. A structural analysis revealed that, despite the simultaneous presence of both substrate (sarcosine) and cofactor (S-adenosyl-L-homocysteine; SAH), the enzyme was likely crystallized in an inactive conformation, as additional structural changes are required to complete the active site assembly. Consistent with this interpretation, the bound SAH can be replaced by the methyl donor S-adenosyl-L-methionine without triggering the methylation reaction. Furthermore, the observed conformational state was found to harbor a betaine-binding site, suggesting that betaine may inhibit MpGSMT activity by trapping the enzyme in an inactive form. This work implicates a structural basis by which feedback inhibition of biosynthetic methyltransferases may be achieved.


Results
The structure of MpGSMT-sarcosine-SAH ternary complex in a catalytically inactivated state. To understand how MpGSMT is capable of catalyzing the N-methylation reaction with both glycine and sarcosine (N-monomethyl glycine), and how this enzyme is subjected to feedback inhibition by betaine, we first determined the crystal structure of MpGSMT in complex with the sarcosine substrate and the SAH cofactor at 2.47 Å resolution (Table 1; Fig. 1a). Consistent with previous sequence-based predictions 26,27, our structural analysis validated that MpGSMT indeed belongs to the Class I methyltransferases, with a characteristic seven-stranded Rossmann-like α/β catalytic core responsible for substrate and cofactor binding (Fig. 1a). As expected, a structural comparison carried out using the DALI program showed that the overall structure of MpGSMT most closely resembles vertebrate GNMTs (Z-scores over 20) 24,29,31-36. The MpGSMT and a rat GNMT structure 29 can be superimposed with an RMSD of 2.1 Å over 230 structurally equivalent Cα atom pairs. Similar to GNMT, the catalytic core of MpGSMT is further elaborated by a helical N-terminal region that precedes the β1 strand and a lid domain composed of a four-stranded antiparallel β sheet that inserts between the β5 strand and αE helix (Fig. 1).
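As a reading aid for the RMSD figure quoted above, the following minimal sketch shows how an RMSD over paired Cα positions is computed once two structures have already been superimposed. The function name and the toy coordinates are hypothetical and are not taken from any deposited structure; the superposition step itself (performed by DALI in the analysis above) is not reproduced.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units) between two paired
    coordinate lists that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in Å: every pair is offset by 2 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (2.0, 0.0, 2.0)]
print(round(rmsd(model_a, model_b), 2))  # 2.0
```

In practice the paired atoms would be the structurally equivalent Cα atoms identified by the alignment, and the value depends on which pairs are included.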
Clear electron density is visible for bound sarcosine and SAH (Supplementary Fig. S2), which allows both molecules to be precisely positioned in the active site of MpGSMT. The cofactor-binding pocket mainly consists of residues from three universally present and conserved elements in Class I methyltransferases that interact with the three moieties of SAH (L-homocysteine, ribose, and adenine). These conserved elements are the glycine-rich motif (E/DXXXGXG; residues 65-73) and the short loop region between β3 and αC (residues 113-116), which interact with the L-homocysteine and adenine, respectively, and an acidic loop (residues 88-91) that contacts both the ribose and adenine (Fig. 2a). Residues are arranged to accommodate the shape of SAH and these three regions participate in ionic interactions, direct and water-bridged hydrogen bonds, van der Waals contacts and aromatic stacking to stabilize the bound cofactor. In addition, Arg43 from the N-terminal region and Leu132 from the β4-H3 loop also contribute to cofactor binding by anchoring the carboxyl and amino groups of SAH, respectively. Notably, whereas structural analyses of other SAM-dependent methyltransferases have revealed the presence of a functionally significant interaction between an active site-located tyrosine and the positively charged sulfonium of the cofactor 7,29,37-41, such an interaction was not observed in the MpGSMT-sarcosine-SAH ternary complex. The functional relevance of this finding will be discussed below.
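The glycine-rich motif pattern E/DXXXGXG mentioned above can be expressed as a regular expression. The sketch below is illustrative only: the toy sequence is hypothetical (it is not the MpGSMT sequence), and the helper name is an assumption introduced for this example.

```python
import re

# E/DXXXGXG: Glu or Asp, any three residues, Gly, any residue, Gly.
GLY_RICH_MOTIF = re.compile(r"[ED].{3}G.G")


def find_motif(sequence):
    """Return (start, end, match) tuples with 1-based, inclusive residue
    numbers for each non-overlapping occurrence of the motif."""
    return [(m.start() + 1, m.end(), m.group())
            for m in GLY_RICH_MOTIF.finditer(sequence)]


# Hypothetical one-letter-code sequence used only to exercise the pattern.
toy_sequence = "MKTLLVDVACGTGRHSI"
print(find_motif(toy_sequence))  # [(7, 13, 'DVACGTG')]
```

Because `finditer` reports non-overlapping matches, overlapping occurrences would need a lookahead-based pattern; that refinement is omitted here.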
The substrate-binding pocket, where sarcosine or glycine can be recognized and positioned for N-methylation, involves residues from the lid and the core domains (Fig. 2b). The observed interactions between MpGSMT and sarcosine indicate that both sarcosine and glycine can be specifically oriented by forming multiple contacts with the surrounding residues, including a salt bridge and a hydrogen bond formed, respectively, between the carboxyl group and the guanidino group of Arg167 and the phenolic hydroxyl of Tyr206; polar interactions between the amino group and the main-chain carbonyl and the side-chain amide of Asn134; and van der Waals contacts with Asn134, His138, and Met218. Among the residues involved in substrate binding, Arg167 is located at the C-terminal end of β5 and is spatially equivalent to the Arg169 of A. halophytica GSMT 42, implicating a pivotal role for this residue in recognizing glycine and its N-methylated derivatives.
Scientific Reports | 6:38071 | DOI: 10.1038/srep38071
In the ligand-binding mode observed in the MpGSMT-sarcosine-SAH structure, the amino group of sarcosine points directly at the sulfonium of the cofactor and approaches a position that appears suitable for engaging in the subsequent in-line attack on the S-methyl group of SAM (Fig. 2c). Intriguingly, in comparison with other SAM-dependent methyltransferases whose structures were determined in the presence of both the substrate and SAH (or the product and SAM) 29,43, the MpGSMT ternary complex structure revealed unexpected differences. First, it is well documented that methyltransferases undergo extensive conformational changes upon substrate and cofactor binding. The apo forms of SAM-dependent methyltransferases frequently exhibit an open conformation in which the N-terminal region and the lid domain are distant from the core domain 36,44-46. Hence, active site residues are predominantly solvent-exposed and ready to interact with the incoming substrate and cofactor. In contrast, the enzyme-substrate-cofactor ternary complexes tend to adopt a closed conformation in which the N-terminal region and the lid domain close upon the bound substrate and cofactor. This substrate/cofactor-induced conformational change not only places the methyl-accepting group and the donor methylsulfonium in proximity but also creates a narrow tunnel between the two reacting groups to facilitate methyl transfer. However, despite the concomitant presence of both sarcosine and SAH, the MpGSMT active site remains partially open; the N-terminal H1 helix points away from the core domain without covering the substrate-binding pocket (Supplementary Fig. S3). Moreover, the distance between the methyl-accepting nitrogen of sarcosine and the cofactor's sulfonium (the "N-S distance") was 6.3 Å in the MpGSMT structure (Fig. 2c), which is considerably longer than the N-S distances of 3.6-4.5 Å observed in other ternary complex structures 43,47-49. Based on the observed partially open active site, the absence of a conserved polar/ionic interaction with the cofactor's sulfonium, and the wide separation of the methyl-accepting nitrogen and sulfonium in our structure, we speculated that the structural state of MpGSMT reported here may correspond to a functionally inactivated conformation of the enzyme.
Table 1 footnotes: [...] where the average intensity <I> is taken over all symmetry-equivalent measurements, and I_hkl is the measured intensity for any given reflection. R_free = R_cryst for a randomly selected subset (5%) of the data that were not used for minimization of the crystallographic residual. c Categories were defined by PHENIX 56.
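The N-S distance comparison described above amounts to measuring an interatomic distance and checking it against the 3.6-4.5 Å range quoted for other ternary complexes. A minimal sketch follows; the coordinates are hypothetical values chosen only to reproduce a 6.3 Å separation, and the function and label names are assumptions made for this illustration.

```python
import math

# Range (in Å) quoted in the text for N-S distances in other ternary complexes.
REPORTED_RANGE = (3.6, 4.5)


def classify_ns_distance(distance, reported_range=REPORTED_RANGE):
    """Label a nitrogen-sulfur distance relative to the reported range."""
    low, high = reported_range
    if distance > high:
        return "longer than reported range"
    if distance < low:
        return "shorter than reported range"
    return "within reported range"


# Hypothetical coordinates (Å); real values would come from the refined model.
nitrogen = (0.0, 0.0, 0.0)
sulfur = (6.3, 0.0, 0.0)

ns_distance = math.dist(nitrogen, sulfur)  # Euclidean distance, Python 3.8+
print(f"{ns_distance:.1f} Å: {classify_ns_distance(ns_distance)}")
```

The classification is only a bookkeeping aid; the interpretation of an inactive state in the text also rests on the partially open active site and the missing sulfonium interaction.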
To test the validity of our interpretation, we investigated the feasibility of preparing crystals of the MpGSMT-sarcosine-SAM ternary complex. To this end, we first prepared crystals of the MpGSMT-SAM binary complex (Table 1, Supplementary Fig. S2b) and transferred them to a substitute mother liquor that contained sarcosine and SAM. If the observed MpGSMT structure does represent a catalytically inactivated state, we expected to observe the presence of both sarcosine and SAM in the active site without conversion to the products dimethylglycine and SAH. Indeed, X-ray diffraction analysis of the resulting crystals showed clear electron density corresponding to sarcosine and SAM in the active site (Table 1, Supplementary Fig. S2c). The B-factors of the atoms that constitute sarcosine and SAM are comparable to those of the surrounding protein residues, suggesting full occupancy for both ligands. Superimposition analysis revealed that the structures of MpGSMT-sarcosine-SAH and MpGSMT-sarcosine-SAM are highly isomorphous; no significant structural differences were observed, except for the minor repositioning of sarcosine in response to the presence/absence of the sulfonium-linked methyl group (Supplementary Fig. S4). Our finding that crystals of the MpGSMT-sarcosine-SAM ternary complex can be successfully prepared strongly indicates that the structure reported here represents a catalytically inactivated state of MpGSMT.

Figure 2. (a) The cofactor-binding pocket is shaped by residues from the glycine-rich motif, the acidic loop and the loop region between β3 and αC. The guanidino group of Arg43 and the main-chain carbonyl group of Leu132 also contribute to cofactor binding by interacting with the carboxyl and amino groups of the cofactor's methionine/homocysteine moiety, respectively. (b) Structure of the substrate-binding pocket. The bound substrate (sarcosine) is stabilized by a salt bridge and an H-bond between the carboxyl group and Arg167 and Tyr206, an H-bond between the amino group and Asn134, and van der Waals interactions with His138 and Met218. (c) Structure of the MpGSMT active site. In this MpGSMT-sarcosine-SAH structure, the methyl-accepting nitrogen atom and the cofactor's sulfonium ion are 6.3 Å apart (blue dashed line), suggesting that the nitrogen is not yet in a "near attack" position to effectively initiate methyl transfer. The predicted location of the sulfur-attached methyl group is shown as a red asterisk. Salt bridges and H-bonds are shown as red dashed lines.

To our knowledge, this is the first observation of a methyltransferase in its catalytically inactivated form despite the presence of both substrate and cofactor. Given that MpGSMT is capable of switching between an inactivated and activated state depending on the ionic strength of the environment and the intracellular concentration of betaine 25, we speculated that the observed MpGSMT structure may, in fact, represent a functionally relevant, inactivated state of the enzyme that contains crucial structural information regarding the regulation of methyltransferase activity. The validity of this interpretation is addressed further in the following sections.

Structural basis of betaine-mediated feedback inhibition of MpGSMT.
To control the throughput of the betaine biosynthetic pathway, the activity of MpGSMT is known to be feedback inhibited when the end product betaine reaches a sufficiently high concentration 20,25. However, the mechanism by which betaine regulates MpGSMT activity has remained elusive. We speculated that betaine may exert its inhibitory function by preferentially interacting with, and thus stabilizing, the inactivated state of MpGSMT. Given the assumption that the MpGSMT structure determined in this study may represent a functionally inactivated form of the enzyme, we tested whether this structure may possess a betaine-binding site by soaking the crystals in a substitute mother liquor containing 1.7 M betaine. The use of such a high concentration of betaine for soaking is physiologically justifiable because upon hyperosmotic shock, betaine-mediated feedback inhibition only engages when its intracellular level approaches approximately 1 M; betaine concentrations as high as 2 M are needed to completely abolish MpGSMT activity 20,25. An unbiased Fo-Fc omit map calculated after initial refinement cycles, using a data set collected from a betaine-soaked crystal with just the protein as the source of phases, clearly shows extra density, which can be readily interpreted as a bound betaine molecule (Fig. 3a). This result demonstrates that the observed MpGSMT structure indeed harbors a pocket for interacting directly with betaine.
The betaine-binding pocket is composed of residues from the H1 helix (Asp35), H1-H2 loop (Ile38, Asp39), H2 helix (Trp40), and αB helix (Asn100, His104) (Fig. 3b). Remarkably, whereas the MpGSMT-sarcosine-SAH structure shows that the carboxyl group of the active site-bound sarcosine is extensively involved in the mediation of the enzyme-substrate interaction (Fig. 2b), no significant interaction was observed between the carboxyl group of betaine and surrounding residues. The bound betaine is mainly stabilized via van der Waals interactions with surrounding residues and the formation of a unique cation-π interaction between the positively charged trimethylated amino group and the indole side chain of Trp40. A similar charge-aromatic ring interaction was also reported in the structures of betaine transporters 50-52. The lack of engagement with the carboxyl group suggests that betaine binds only weakly to this pocket and explains why inhibition of MpGSMT activity requires a high concentration of betaine. Although the affinity between betaine and MpGSMT appears weak, the interaction is nevertheless highly specific, as no secondary sites were observed despite the high betaine concentration. The functional significance of the observed betaine-binding pocket is also implicated by the finding that the key interacting residues (Asp35, Trp40, Asn100) are highly conserved among the GSMTs but not among the closely related GNMTs. This suggests that GSMT has evolved to respond to betaine.
Asn100 and His104, two of the residues that constitute the betaine-binding pocket, were mutated to Gln and Lys, respectively, to produce MpGSMT N100Q and MpGSMT H104K for further examining the functional relevance of the observed betaine-binding pocket (Supplementary Fig. S5). Compared to the wild-type enzyme, MpGSMT H104K exhibited increased sensitivity to betaine. It is possible that the longer, positively charged side chain of Lys enhances betaine-mediated inhibition by interacting with the carboxyl group of betaine. In contrast, MpGSMT N100Q displayed a decreased response to betaine, likely due to steric clashes between betaine and the longer Gln side chain. However, the effects of the Asn100 mutation must be interpreted with caution, as this residue appears to serve an important role in cofactor binding by forming hydrogen bonds with Thr70 of the glycine-rich motif. The activity of MpGSMT N100Q is significantly lower than that of the wild-type enzyme. Nevertheless, it appears that both residues are involved in the betaine-mediated regulation of MpGSMT. Because most betaine-interacting elements are located within the N-terminal region of MpGSMT, it is likely that the movement of this region is restricted upon betaine binding, which traps the enzyme in the inactivated state. The successful identification of a specific betaine-binding site not only supports our hypothesis that the observed MpGSMT structure represents a physiologically relevant, inactivated state of the enzyme but also implicates the structural basis by which betaine inhibits MpGSMT function.

Structural modeling of MpGSMT in its catalytically active state.
To provide structural information for MpGSMT in its catalytically active form, we performed an extensive screen of crystallization conditions in conjunction with different combinations of ligands (substrates, products, and cofactors in various concentrations) to identify the distinct conditions under which MpGSMT crystals can be produced (Supplementary Table S1). However, diffraction analysis revealed that all these MpGSMT crystals were isomorphous with the initial crystals of the MpGSMT-sarcosine-SAH complex (Table 1), with the same space group and unit cell parameters. That is, all the newly determined structures corresponded to the inactivated state. This result suggests that it will be challenging to crystallize MpGSMT in the active conformation for structural analysis, possibly because the inactivated form is relatively stable and abundantly populated in solution. Therefore, the crystal structure of GNMT complexed with acetate and SAM (PDBid: 1NBH 29), which has been used to elucidate the catalytic mechanism of GNMT and shares significant sequence identity (28.6%) and structural similarity (DALI Z-score 26.2) with MpGSMT, was used as a template to build a homology model for an alternative conformational state of MpGSMT that could correspond to the active form.
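For readers unfamiliar with how a sequence identity percentage such as the one quoted above is obtained, the sketch below computes identity from a pairwise alignment. It assumes one common convention (identical positions divided by aligned, non-gap columns); other conventions use a different denominator, and the text does not state which was used. The aligned toy strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumption: the denominator is the number of non-gap aligned columns;
    alternative definitions (e.g. shorter sequence length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned (non-gap) columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 7 gap-free columns, 6 of them identical.
toy_a = "MKT-LVDAG"
toy_b = "MRTALV-AG"
print(f"{percent_identity(toy_a, toy_b):.1f}%")  # 85.7%
```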
Guided by the locations of bound acetate and SAM in the GNMT template structure 29, which occupy the substrate and cofactor binding pocket, respectively, the substrate (either glycine or sarcosine) and SAM were readily docked into the homology model to generate a hypothetical model for the MpGSMT-substrate-cofactor ternary complex (Fig. 4). Compared to the inactive MpGSMT, the H1 helix of the N-terminal region undergoes a helix-to-loop transition and moves towards the bound substrate and cofactor. This conformational change brings additional residues into contact with the substrate and cofactor to shield the active site from bulk solvent. Moreover, a channel is assembled between the amino group of the substrate and the methyl group of SAM; the channel walls are lined by triangularly arranged oxygen atoms (the phenolic oxygen atoms of Tyr26 and Tyr185 and the main-chain carbonyl oxygen of Asn134) (Fig. 5). These three electronegative oxygen atoms may facilitate catalysis by defining the pathway of methyl transfer and stabilizing the developing partial positive charges on the three methyl hydrogens during the reaction. Consistent with the proposed catalytic roles of Tyr26 and Tyr185, the Tyr26→Phe and Tyr185→Phe mutant enzymes exhibited 74- and 5-fold reductions in specific activity, respectively (Fig. 6a). The more pronounced functional impact of Tyr26→Phe may be explained by the modeling-based prediction that the phenolic oxygen of Tyr26 is also properly positioned to stabilize the sulfonium group of SAM (Fig. 5c), an interaction known to facilitate methyl transfer reactions 38-41. It should also be noted that the distance between the methyl-accepting nitrogen and the methyl carbon is shortened from approximately 3.7 Å in the inactivated state to approximately 2.7 Å in the homology model (Fig. 5). This value is comparable to those seen in the structures of other SAM-dependent N-methyltransferases in their active forms 29,43.
Together, these results indicate that the modeled conformational state of MpGSMT indeed harbors the structural features expected for the catalytically active MpGSMT and can be used to successfully predict the role of certain residues in catalysis.
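To put the fold reductions reported above on a more intuitive scale, the following sketch converts an n-fold reduction in specific activity into the fraction of wild-type activity that remains. Only the fold values (74 and 5) come from the text; the function name is an assumption made for this illustration.

```python
def residual_activity_percent(fold_reduction):
    """Percent of wild-type specific activity left after an n-fold reduction."""
    if fold_reduction <= 0:
        raise ValueError("fold reduction must be positive")
    return 100.0 / fold_reduction


# Fold reductions as reported in the text for the two Tyr-to-Phe mutants.
for mutant, fold in (("Y26F", 74), ("Y185F", 5)):
    remaining = residual_activity_percent(fold)
    print(f"{mutant}: {remaining:.1f}% of wild-type specific activity")
# Y26F: 1.4% of wild-type specific activity
# Y185F: 20.0% of wild-type specific activity
```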

Disruption of the H1 helix increases MpGSMT activity.
To further test whether the functional switching of MpGSMT from the inactivated to the active state requires a helix-to-loop transition of the H1 helix of the N-terminal region (Fig. 4), we examined whether reducing the stability of the H1 helix could enhance MpGSMT activity by lowering the threshold of the conformational change. Four residues in the H1 helix (His21, Glu23, Glu24, Leu28) show a high propensity for helix formation and are not directly located in the active site (Figs 2c and 5c). These residues were mutated to amino acids with considerably lower helical propensity (Gly, Thr, Asn and Ser, respectively) to give a tetramutated MpGSMT (MpGSMT 4mut). We chose to use the tetramutant rather than single mutants because simultaneous incorporation of four mutations would be a more effective design to disrupt an α-helix. This approach is considered valid because only a gain-of-function phenotype would allow us to draw a conclusion. It should be noted that none of the four mutation sites are conserved and all four were replaced by residues found in other members of the GSMT family or conserved in the GNMT family. Therefore, the structural integrity of the enzyme should not be significantly affected by our mutant design. Indeed, MpGSMT 4mut was abundantly expressed and purified to homogeneity, and the crystal structure of MpGSMT 4mut was essentially identical to that of the wild-type enzyme, except that the N-terminus of the H1 helix prior to residue Glu25 underwent a helix-to-loop transition (Supplementary Fig. S6). Consistent with our assumption, the introduction of helix-destabilizing mutations increased the activity by 2.36-fold compared to wild-type MpGSMT (Fig. 6b), suggesting that the presence of the H1 helix may be indicative of the catalytically inactivated enzyme and that enzyme activation can be triggered by a conformational change involving the disruption of this helix.

Figure 4. Comparison of MpGSMT-sarcosine-SAH crystal structure and a homology model of MpGSMT constructed using the GNMT-acetate-SAM structure as a template. In the experimentally determined MpGSMT-sarcosine-SAH structure (upper left panel), which adopts a catalytically inactive conformation, the H1 helix is composed of four complete helical turns. In contrast, the corresponding region (red arrowhead) exists as a loop in the modeled, active form of MpGSMT (upper right panel). A key difference between the two structures is that Tyr26 is solvent-exposed in the crystal structure but, in the model, points towards the cofactor and substrate (green sticks) and becomes an integral part of the active site. The polypeptide segment from His21 to Leu28 (highlighted in scarlet) was chosen for introducing helix-destabilizing mutations. Specifically, residues His21, Glu23, Glu24, and Leu28 were mutated to glycine, threonine, asparagine, and serine, respectively. These amino acids are present in other members of the GSMT family or conserved in the GNMT family (lower panel).

Figure 5. [...] Because the H1 helix, which carries Tyr26 and Trp34, points away from the core domain, only one electronegative group (the main-chain carbonyl group of Asn134) is present in the active site (right panel) and the channel that links the substrate and the methyl group is not present, suggesting that this structure likely reflects a catalytically inactive state for MpGSMT. (c) In the MpGSMT structure modeled using GNMT (PDBid: 1NBH; panel (a)) as the template, the N-C distance shortens to 2.7 Å (left panel), and a channel lined by three oxygen atoms from Tyr26, Asn134 and Tyr185 is formed between the methyl group and substrate (right panel). A helix-to-loop transition of the H1 helix allows Tyr26 and Trp34 to enter and complete the assembly of the active site, indicating that the modeled structure may represent MpGSMT in its active conformation.

Discussion
In this work, we report the crystal structures of MpGSMT, which represents the first structural elucidation of the GSMT family of methyltransferases. Interestingly, despite the simultaneous presence of both substrate (sarcosine) and cofactor (SAH) in the structure, the enzyme appears to adopt a catalytically inactivated conformation. Additional structural changes were required to complete the active site assembly (Fig. 2). The validity of this hypothesis is backed by the finding that crystals of MpGSMT-sarcosine-SAM ternary complex can be prepared without triggering the methyl transfer (Fig. 5b). Given that MpGSMT is subjected to feedback inhibition by betaine 25 , we suspected that the observed structural state of MpGSMT may harbor a betaine-binding site if this structure indeed corresponds to a functionally relevant, inactive form of the enzyme. The successful identification of a betaine-binding pocket composed of residues conserved in the GSMT family provides additional support for our interpretation (Fig. 3). Next, to understand how MpGSMT may switch from an inactive state to an active form, a homology model of activated MpGSMT was constructed (Fig. 4) and validated by mutagenesis (Fig. 6). By comparing the crystal structure of the inactivated state and the experimentally tested homology model of the activated state, we obtained new insights into the structural basis by which regulation of MpGSMT activity is achieved.
First, structural comparison revealed that the activation of MpGSMT likely involves a conformational change in the N-terminal region, because residues 21-37 form an H1 helix in the inactive state. These residues are predicted to undergo a helix-to-loop transition upon activation (Fig. 4). This hypothesis is consistent with the finding that MpGSMT activity can be enhanced when helix-disrupting mutations are introduced into the H1 helix (Fig. 6), which favors conversion to the activated state. Our structural analysis also shows that the H1 helix is an integral part of the betaine-binding pocket (Fig. 3). Thus, the presence of betaine is expected to inhibit catalysis by stabilizing the H1 helix against structural transition. As the betaine-binding pocket is present only in the inactive state, due to structural changes in the H1 helix and rearrangement of the betaine-interacting residues (Fig. 3), we speculate that betaine exerts its inhibitory function by binding to and arresting MpGSMT in the inactive state. An extensive survey of the structures of class I methyltransferases currently available in the Protein Data Bank revealed that the N-terminal regions may exist as a loop, β -strand or α -helix. Although no apparent correlation was observed between the intrinsic methyltransferase activity and the structure adopted by the respective N-terminal region, the observed structure-activity connection of MpGSMT nevertheless suggests that the activity of methyltransferases may be regulated by controlling the threshold of active-site assembly.
Although GNMT and GSMT share a high degree of sequence and structural similarity, a major difference between the two is that GNMT only acts on glycine, whereas GSMT can methylate both glycine and sarcosine. Comparing the GNMT structure with the MpGSMT model revealed that the substrate-binding site of MpGSMT is considerably more spacious (Supplementary Fig. S7). Either glycine or sarcosine can be snugly placed into the MpGSMT active site without inducing steric conflict. The extra methyl group of sarcosine can be accommodated in a side pocket and forms favorable van der Waals interactions with His138 and Met218. In contrast, steric clashes between the methyl group and surrounding residues were observed when sarcosine was modeled in the GNMT active site. Our modeling analysis suggests that differences in shape and volume between the active sites of GNMT and GSMT may explain their distinct substrate specificity. Similar conclusions have been drawn from structural studies of the histone lysine methyltransferases 53,54: those capable of catalyzing consecutive rounds of methylation, such as human lysine methyltransferase SET7/9 and a viral histone H3 lysine 27 methyltransferase, are known to possess a larger pocket that allows the binding of either mono- or dimethyllysine.
In conclusion, we report here the first crystal structure of a GSMT methyltransferase family member. In addition, we provide structural and functional evidence that the observed conformational state represents MpGSMT in a catalytically inactivated form. Most importantly, we determined the crystal structure of the MpGSMT-betaine binary complex. This structure offers the first visualization of a biosynthetic methyltransferase in complex with a bona fide feedback inhibitor, that is, the end product of the corresponding biosynthesis pathway. Taken together, this work reveals structural insights into GSMT function and suggests a structural basis by which feedback inhibition of biosynthetic methyltransferases may be achieved.

Methods
Protein preparation. The construction of an expression plasmid for producing full-length MpGSMT (residues 1-263) with an N-terminal His-tag was described previously 25. The QuikChange™ mutagenesis method (Stratagene) was applied to this vector to allow the production of two MpGSMT mutants: MpGSMT Y26F and MpGSMT Y185F. The wild-type and mutant proteins were produced in E. coli strain BL21-CodonPlus(DE3)-RIL (Camr). Briefly, single colonies were inoculated into LB medium containing 50 μg/ml kanamycin and 34 μg/ml chloramphenicol, and cells were grown at 37 °C until the OD600 reached ~0.6. Protein expression was induced by adding isopropyl-β-D-thiogalactopyranoside to a final concentration of 1.0 mM, and the culture was shifted to 30 °C and grown for another 20 hours. The cells were harvested by centrifugation and stored at −80 °C until further use. For protein purification, the cell pellet from 1 liter of culture was resuspended in 10 ml of buffer A (50 mM Tris-HCl pH 7.5, 300 mM KCl, 0.5 mM PMSF, 1 mM β-mercaptoethanol). Following cell disruption by sonication, the crude cell lysate was clarified by centrifugation at 27,216 × g for 1 hour (repeated 3 times) at 4 °C and loaded onto a pre-equilibrated Ni-NTA column. The column was washed with ten column volumes of buffer A, and a stepwise increase in imidazole concentration from 15 mM to 75 mM was used to remove contaminating proteins. The bound MpGSMT was eluted using buffers containing 150 mM and 300 mM imidazole. Fractions containing MpGSMT were pooled and concentrated using an Amicon Ultra device (Millipore; 30 kDa cutoff) and then loaded onto a 16/60 Superdex-200 size-exclusion column (GE Healthcare) pre-equilibrated in gel-filtration buffer (100 mM TES pH 7.3, 2.0 M KCl, 1.0 mM EDTA, 1.0 mM β-mercaptoethanol). The peak fractions containing MpGSMT were pooled and concentrated by ultrafiltration to 6.6 mg/ml for crystallization.

Protein crystallization, data collection and structural determination. Initial crystallization trials were performed with commercially available kits (Hampton Research) using the hanging-drop vapor-diffusion method. To obtain the crystals of the MpGSMT-sarcosine-SAH ternary complex, purified MpGSMT (6.6 mg/ml) was first incubated with 0.1 mM SAH and 0.2 M sarcosine on ice overnight, followed by mixing 1.0 μl of the protein sample with 1.0 μl of reservoir solution containing 0.1 M Tris-HCl pH 8.0, 1.5 M sodium chloride and equilibrated at 4 °C against 200 μl of reservoir solution. The crystals can be obtained in ~3-4 weeks. Before freezing, the crystals were transferred into 20 μl of substitute mother liquor containing 0.1 M TES pH 8.0, 1.8 M NaCl, 0.51 M sarcosine, 0.1 mM SAH and 25% ethylene glycol for 48 hours, and flash-frozen by plunging into liquid nitrogen. Crystals of the MpGSMT-SAM binary complex were prepared by transferring the MpGSMT-sarcosine-SAH ternary complex crystals into a substitute mother liquor containing 0.2 M potassium sodium tartrate tetrahydrate, 23% (w/v) polyethylene glycol 3350, 2 mM SAM and 25% ethylene glycol for 48 hours. Crystals of the MpGSMT-sarcosine-SAM ternary complex were prepared by transferring the MpGSMT-SAM binary complex crystals into a substitute mother liquor containing 0.2 M potassium sodium tartrate tetrahydrate, 23% (w/v) polyethylene glycol 3350, 0.685 M sarcosine, 2 mM SAM and 25% ethylene glycol for 48 hours. Crystals of both the MpGSMT-SAM and MpGSMT-sarcosine-SAM complexes were looped directly from the respective substitute mother liquor and flash-frozen by plunging into liquid nitrogen. To obtain the crystals of betaine-bound MpGSMT, 1.0 μl of purified MpGSMT (6.6 mg/ml) was mixed with 1.0 μl of reservoir solution containing 0.2 M potassium sodium tartrate tetrahydrate, 20% (w/v) polyethylene glycol 3350 and equilibrated at 4 °C against 200 μl of reservoir solution. Single crystals can be obtained in ~3-4 weeks.
Before freezing, 1.0 μ l of 5.0 M betaine was added into the crystallization drop and incubated for ~3 hours at 4 °C, and the betaine-soaked crystals were frozen in liquid nitrogen using Parabar 10312 (Hampton research) as the cryoprotectant. Specifically, theses crystals were transferred to a drop of Parabar 10312 and the residual solvent which surrounded the crystal was removed with the CryoLoop, flash-frozen was achieved by plunging into liquid nitrogen. To obtain the crystals of MpGSMT 4Mut (H21G,E23T,E24N,L28S), purified MpGSMT 4Mut (6.6 mg/ml) was first incubated with 0.1 mM SAH and 50 mM glycine on ice overnight, followed by mixing 1.0 μ l of the protein sample with 1.0 μ l of reservoir solution containing 0.2 M ammonia phosphate monobasic. Before freezing, the crystals were transferred into 20 μ l of substitute mother liquor containing 0.23 M ammonia phosphate monobasic, 0.1 mM SAH and 3.35 M betaine as the cryoprotectant for 24 hours, and flash-frozen by plunging into liquid nitrogen.
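For the betaine soak, the nominal betaine concentration in the drop follows from simple dilution. The sketch below works this out under one stated assumption: that the drop still has its nominal starting volume of 2.0 μl (1.0 μl protein plus 1.0 μl reservoir) when the betaine is added. After weeks of vapor diffusion the real drop volume is probably different, so the result is only a rough estimate.

```python
def conc_after_addition(stock_molar: float, added_ul: float, drop_ul: float) -> float:
    """Concentration (M) of a solute after adding a stock to a drop that lacks it."""
    if added_ul < 0 or drop_ul < 0 or added_ul + drop_ul == 0:
        raise ValueError("volumes must be non-negative and not both zero")
    return stock_molar * added_ul / (added_ul + drop_ul)


# Values from the text: 1.0 ul of 5.0 M betaine added to the crystallization drop.
# ASSUMPTION: the drop volume is the nominal 2.0 ul (no shrinkage on equilibration).
nominal_drop_ul = 2.0
betaine_molar = conc_after_addition(5.0, 1.0, nominal_drop_ul)
print(f"Nominal betaine concentration in the soaked drop: {betaine_molar:.2f} M")
```

If the drop has shrunk during equilibration, the actual betaine concentration will be higher than this nominal figure.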
X-ray diffraction data sets were collected at NSRRC beamlines BL15A1 and BL13C1 (Hsinchu, Taiwan) and SPring-8 beamline BL12B2 (Japan) and processed using the HKL2000 program suite 55. The MpGSMT-sarcosine-SAH structure was solved by molecular replacement using the rat GNMT structure (PDB ID: 1XVA) as the search model. The MR-phased initial model was improved by the AutoBuild module of Phenix 56, followed by rounds of manual adjustment and refinement using Coot 57 and Phenix. The MpGSMT-sarcosine-SAH structure was then used as the search model to solve the MpGSMT-SAM, MpGSMT-sarcosine-SAM, MpGSMT-betaine and MpGSMT 4Mut (H21G, E23T, E24N, L28S) structures. In all cases, the initial mFo-DFc difference electron density maps allowed the bound ligands (sarcosine, SAH, SAM, betaine) to be located and precisely positioned using Coot. The structures then underwent rounds of manual model rebuilding and refinement using Coot and Phenix. Data collection and refinement statistics are summarized in Table 1. All figures were generated using PyMOL 58.
Scientific Reports | 6:38071 | DOI: 10.1038/srep38071
Homology modeling of the active form of MpGSMT. To identify a suitable template for modeling, the crystal structure of MpGSMT reported in this study was used for a DALI structural similarity search. The structure of rat liver GNMT in complex with the cofactor (SAM) and an inhibitor (acetate) (PDB ID: 1NBH, chain D) was chosen because of the extensive similarity between the corresponding catalytic cores and lid domains, and because the active site of GNMT in this structure is in a catalytically active conformation 29. Sequence alignment and homology modeling were performed with Accelrys Discovery Studio (Accelrys Software Inc.). SAM and acetate were placed into the homology model of MpGSMT by transferring the atomic coordinates of the ligands after superimposing the GNMT-acetate-SAM complex onto the homology model.
Introducing mutations that destabilize helix H1 into MpGSMT. The H1-destabilizing mutations (H21G/E23T/E24N/L28S) were introduced into the MpGSMT gene by the SOE (gene Splicing by Overlap Extension) method 59. The primers used for mutagenesis were synthesized by Viogene BioTek Corp., Taiwan (Supplementary Table S2). PCR reactions were performed using the wild-type MpGSMT gene as the template and either the primer pair a/d or b/c to obtain the 5′ and 3′ fragments of the gene, respectively. The two PCR-amplified products were retrieved by gel extraction and mixed. The mixture then underwent one cycle of primer-free extension to generate a small amount of full-length MpGSMT gene carrying the L28S mutation, which served as the template for subsequent PCR amplification cycles using primer pair a/b to produce a large amount of the mutated MpGSMT gene. In the next round of SOE, PCR reactions were performed using the mutated MpGSMT gene as the template and either the primer pair a/f or b/e to obtain the 5′ and 3′ fragments of the gene, respectively. Similarly, the new PCR-amplified fragments were gel-extracted, mixed, and subjected to primer-free extension, followed by additional PCR cycles in the presence of the primer pair a/b to produce the full-length H21G/E23T/E24N/L28S mutated MpGSMT gene. The mutated MpGSMT genes were inserted into pET28a for sequencing and protein production.
Activity assay. The "SAM510: SAM Methyltransferase Assay Kit" (G-Biosciences) was used for measuring the activities of wild-type and mutant MpGSMT. The assays were set up and performed in a 96-well plate according to the manufacturer's instructions. Each reaction mixture (143.5 μl) contained 7 μg of enzyme, 150 mM glycine, 1 mM SAM, 0.1 M KCl, and 100.5 μl of the master mix. The reaction was followed using a microplate spectrophotometer (PowerWave, BioTek Instruments, Inc., USA) to monitor the absorbance change at 510 nm at 37 °C. Absorbance of the reaction mixture was recorded every minute for a total of 1 hour. A radioactivity-based assay described previously was used to measure the activities of MpGSMT Y26F and MpGSMT Y185F 25.
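One common way to reduce a time course like this (absorbance at 510 nm read once per minute) to a single activity value is to fit a straight line and report its slope. The sketch below does this by ordinary least squares. It is an illustration only: the function name and the synthetic readings are hypothetical, and the text does not say how the recorded traces were converted into activities.

```python
def absorbance_rate(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per minute)."""
    n = len(times_min)
    if n != len(absorbances) or n < 2:
        raise ValueError("need two or more paired time/absorbance readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    # Slope = covariance(t, A) / variance(t)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# HYPOTHETICAL data: one reading per minute for 1 hour, as in the assay schedule.
times = list(range(0, 61))
readings = [0.100 + 0.002 * t for t in times]  # synthetic, perfectly linear trace
print(f"Rate: {absorbance_rate(times, readings):.4f} A510 per minute")
```

In practice the fit would be restricted to the linear part of the trace, and a no-enzyme blank would be subtracted before comparing wild-type and mutant enzymes.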

Statistical analysis. Data obtained in this study were analyzed with the Student's t-test as implemented in SigmaPlot. Significance levels of the t-tests are indicated in the graph; three asterisks indicate p < 0.001.
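To make the significance criterion concrete, the following is a minimal sketch of a two-sample Student's t-test in plain Python. It assumes an unpaired, equal-variance (pooled) test; the text does not specify the variant, and the analysis itself was done in SigmaPlot, not with this code. The p-value is obtained by numerically integrating the t-distribution density with Simpson's rule, and the activity values are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


# HYPOTHETICAL relative activities for two enzyme variants (not data from the study).
wild_type = [100.0, 98.5, 101.2]
mutant = [12.1, 10.4, 11.8]
t_stat, dof = student_t(wild_type, mutant)
p = two_sided_p(t_stat, dof)
print(f"t = {t_stat:.2f}, df = {dof}, three asterisks: {p < 0.001}")
```

Only the p < 0.001 threshold is taken from the text; no other significance tiers are implied here.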