Structural basis of human 5,10-methylenetetrahydrofolate reductase (MTHFR) regulation by phosphorylation and S-adenosylmethionine inhibition

The folate and methionine cycles are crucial to the biosynthesis of lipids, nucleotides and proteins, and production of the global methyl donor S-adenosylmethionine (SAM). 5,10-methylenetetrahydrofolate reductase (MTHFR) represents a key regulatory connection between these cycles, generating 5-methyltetrahydrofolate for initiation of the methionine cycle, and undergoing allosteric inhibition by its end product SAM. Our 2.5 Å resolution crystal structure of human MTHFR reveals a unique architecture, appending the well-conserved catalytic TIM-barrel to a eukaryote-only SAM-binding domain. The latter domain of novel fold provides the predominant interface for MTHFR homo-dimerization, positioning the N-terminal serine-rich phosphorylation region into proximity with the C-terminal SAM-binding domain. This explains how MTHFR phosphorylation, identified on 11 N-terminal residues (16-total), increases sensitivity to SAM binding and inhibition. Finally, we demonstrate the 25-amino-acid inter-domain linker enables conformational plasticity and propose it to be a key mediator of SAM regulation.


Introduction
In humans, the folate and methionine cycles both generate products essential to cellular survival. Folate, the major cellular carrier of single carbon units, is required for the synthesis of purines and thymidine monophosphate. Within the methionine cycle, the methylation of homocysteine to methionine by methionine synthase (EC 2.1.1.13) produces an essential amino acid which may be used for protein synthesis or, crucially, be further converted to S-adenosylmethionine (SAM), a vitally important donor for the methylation of DNA, RNA and proteins as well as the creation of numerous methylated compounds. These two cycles intersect at the enzyme 5,10-methylenetetrahydrofolate reductase (MTHFR; EC 1.5.1.20). MTHFR catalyzes the physiologically irreversible reduction of 5,10-methylene-tetrahydrofolate (CH2-THF) to 5-methyl-tetrahydrofolate (CH3-THF), a reaction requiring FAD as cofactor and NADPH as electron donor. Since the product CH3-THF is exclusively used by methionine synthase, and only the demethylated form (THF) may be recycled back to the folate cycle, MTHFR commits THF-bound one-carbon units to the methionine cycle.
In accordance with this essential role, major and minor deficiencies of human MTHFR are the direct or indirect causes of human disease. Severe MTHFR deficiency (MIM #607093) is inherited in an autosomal recessive manner and is the most common inborn error of folate metabolism 1 with ~200 patients known 2 . To date, over 100 different clinically relevant mutations in MTHFR have been described, the majority of which are of the missense type (n=70, >60%) and "private" 2 . Milder enzyme deficiencies, due to single nucleotide polymorphisms (SNPs) of the MTHFR gene, have been associated with various common disorders. The most studied of these is p.Ala222Val (c.665C>T in NM_001330358, commonly annotated as c.677C>T), identified as a risk factor for a large number of multifactorial disorders, including vascular diseases, neurological diseases, various cancers, diabetes and pregnancy loss (see e.g. review by 3 ).
Human MTHFR is a 656 amino acid multi-domain protein (Fig. 1). The catalytic domain is conserved across evolution, and crystal structures of MTHFR from Escherichia coli (E. coli) [4][5][6][7] and Thermus thermophilus 8 , in which the catalytic domain constitutes the entire sequence (Fig. 1), have been solved. These structures reveal the catalytic domain to form a β8α8 (TIM) barrel and have uncovered residues critical for binding the cofactor FAD 4 , the electron donor NADPH (NADH in bacteria 7 ) and the product CH3-THF 5,6 . The bacterial structures, together with activity assays of trypsin-cleaved porcine MTHFR 9 , indicate that the catalytic domain is sufficient for the entire catalytic cycle. Eukaryotic MTHFR orthologs additionally possess a C-terminal regulatory domain that is connected to the catalytic domain by a linker sequence (Fig. 1). This C-terminal domain is able to bind SAM, resulting in allosteric inhibition of enzymatic activity 10 , an effect which is very slow 11 and can be reversed by binding of S-adenosylhomocysteine (SAH) 12,13 , the demethylated form of SAM.
Human MTHFR further contains a 35 amino acid serine-rich region at the very N-terminus which is not found in MTHFR orthologs of bacteria, yeast or even lower animals (Fig. 1). This region has been identified to be multiply phosphorylated following heterologous expression in insect cells 14 and yeast 15 , or following immunoprecipitation from human cancer cell lines 16 . Phosphorylation has been associated with moderately decreased catalytic activity [14][15][16] and increased total inhibition mediated by SAM 14 .
Although phosphorylation mapping of this region has been thus far unsuccessful, scanning mutagenesis has revealed substitution of alanine for threonine at position 34 (p.Thr34Ala) to almost completely block phosphorylation 14,15 , suggesting Thr34 is the priming position. The cellular relevance of this modification remains unclear, although one group has suggested that phosphorylation at Thr34 can be accomplished by CDK1/cyclin B1 16 and at Thr549 by polo-like kinase 1 17 whereby they posit a role in histone methylation and replication.
The repertoire of bacterial MTHFR structures to date does not provide any mechanistic insight into the enzymatic regulation by phosphorylation and SAM binding, because both features are absent in prokaryotes. To this end, we have combined structural, biophysical and biochemical data of human MTHFR to provide a molecular view of MTHFR function and regulation in higher eukaryotes. We have identified specific phosphorylation sites and demonstrate a distinct relationship between phosphorylation, conformational change and SAM inhibition. Further, using our 2.5 Å resolution crystal structure of the almost full-length human protein, we reveal that the regulatory domain utilizes a novel topology to bind SAH/SAM and transmit a catalytic inhibition signal by long range conformational change, most likely through the linker region. These novel insights highlight conformational plasticity as an important mediator of MTHFR regulation.

Identification of phosphorylated residues by mass spectrometry
To examine the phosphorylation status of human (Hs)MTHFR, we generated full-length recombinant human MTHFR (HsMTHFR 1-656) by baculovirus expression in Sf9 cells. Mass spectrometry-based phosphorylation mapping (with 92% coverage) identified 16 separate phosphorylation sites in HsMTHFR 1-656 following purification from Sf9 cells (called here "as purified") (Fig. 2A). All phosphorylation sites were considered to have partial occupancy, since no residues were phosphorylated in every tryptic peptide analysed (data not shown). Of these, 11 phosphorylated amino acids (Ser9, Ser10, Ser18, Ser20, Ser21, Ser23, Ser25, Ser26, Ser29, Ser30, Thr34) were within the N-terminal serine-rich region, including the putative phosphorylation determining residue Thr34 (Fig. 2A). Additionally, we found phosphorylation of three further amino acids in the catalytic domain (Tyr90, Thr94, Ser103) and two in the regulatory domain (Ser394, Thr451). Up to ten phosphorylation sites were identified to be occupied simultaneously, whereby treatment with calf intestine alkaline phosphatase (CIP) resulted in removal of 9 (Fig. 2B) or 10 (Fig. 2C) phosphate groups, as identified by denaturing and native mass spectrometry, respectively. To examine the importance of the N-terminal serine-rich region to global protein phosphorylation, we produced recombinant HsMTHFR 38-644 , which removes the N-terminal 37 amino acids, including the entire serine-rich region (Fig. 1), as well as the poorly conserved C-terminal 12 amino acids predicted to be of high disorder (Suppl. Fig. 1). As purified HsMTHFR 38-644 was not found to be phosphorylated by phosphorylation mapping (Suppl. Fig. 2A) or native mass spectrometry (Suppl. Fig. 2B), and treatment with CIP did not alter the protein molecular mass (Fig. 2D). Therefore, the primary determinant of HsMTHFR phosphorylation resides within the N-terminus.
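The site tally above can be checked with a short script. The following is a minimal sketch, not the analysis used in this study: it groups the 16 reported positions by region and estimates the mass lost when 9 or 10 phosphate groups are removed. Only the Ser-rich region (aa 1-37) and the linker (aa 338-362) boundaries are stated in the text; the catalytic and regulatory domain boundaries, and all function and variable names, are assumptions made for illustration.

```python
# Phosphorylation sites listed in the text for as purified HsMTHFR 1-656.
PHOSPHOSITES = [
    9, 10, 18, 20, 21, 23, 25, 26, 29, 30, 34,  # N-terminal Ser-rich region
    90, 94, 103,                                 # catalytic domain
    394, 451,                                    # regulatory domain
]

# ASSUMED inclusive residue boundaries. Only aa 1-37 and aa 338-362 are
# stated in the text; the other two ranges are inferred for illustration.
REGIONS = {
    "N-terminal Ser-rich region": (1, 37),
    "catalytic domain": (38, 337),
    "linker": (338, 362),
    "regulatory domain": (363, 656),
}

# Average mass of one HPO3 unit, the net mass added per phosphorylation.
HPO3_AVERAGE_MASS_DA = 79.98


def region_of(position):
    """Return the name of the region containing a residue position."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1-656")


def tally_by_region(positions):
    """Count phosphosites per region."""
    counts = {name: 0 for name in REGIONS}
    for position in positions:
        counts[region_of(position)] += 1
    return counts


def mass_shift_on_dephosphorylation(n_groups):
    """Approximate mass (Da) lost when n phosphate groups are removed."""
    return n_groups * HPO3_AVERAGE_MASS_DA


if __name__ == "__main__":
    for region, count in tally_by_region(PHOSPHOSITES).items():
        print(f"{region}: {count}")
    for n in (9, 10):
        shift = mass_shift_on_dephosphorylation(n)
        print(f"{n} phosphate groups removed: about {shift:.0f} Da lighter")
```

Under these assumed boundaries the tally gives 11, 3 and 2 sites for the N-terminal region, catalytic domain and regulatory domain, matching the counts reported above.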

Phosphorylation does not alter MTHFR kinetic parameters
Phosphorylation has been described to alter MTHFR kinetics, resulting in moderately decreased catalytic activity as measured by the NADPH-menadione oxidoreductase assay 14,16 . To investigate this more thoroughly, we used a very sensitive HPLC-based activity assay which monitors the full enzymatic reaction in the physiological direction and allows determination of kinetic values 18 . Interestingly, we found no increase in the specific activity of MTHFR proteins following addition of exogenous FAD to the assay buffer (Table 1). Since FAD is required as cofactor for the MTHFR reaction, this suggests the cofactor was already bound to the as purified protein, presumably acquired during cellular expression. This is consistent with native mass spectrometry, which identified monomeric and dimeric forms of as purified HsMTHFR 1-656 which, in addition to phosphorylation, contained equivalent units of FAD (Fig. 2C). HsMTHFR 38-644 also presented as monomeric and dimeric forms bound to equivalent units of FAD (Suppl. Fig. 2B), suggesting phosphorylation has no effect in this regard. Supplementation with FAD, however, helped rescue activity either during, or to a lesser extent following, incubation of these proteins at 46°C for 5 minutes (Table 1). Therefore, this cofactor, which is important for protein stability, may be lost under heat treatment. In our experiments, HsMTHFR 1-656 was markedly more sensitive to heat inactivation than HsMTHFR 38-644 , but this heat sensitivity was not affected by the phosphorylation state of the protein, and therefore more likely reflects overall protein stability.

MTHFR phosphorylation increases protection of and sensitivity to SAM
In addition to phosphorylation and FAD, native mass spectrometry identified the dimeric form of as purified HsMTHFR 1-656 to contain 0, 1 or 2 units of SAM (Fig. 2E). Like FAD, SAM was likely acquired during cellular expression. However, following CIP treatment, the SAM bound to MTHFR was found to degrade to SAH, a chemical transition which did not occur during mock treatment of the protein (Fig.  2E). Correspondingly, as purified dimeric HsMTHFR 38-644 , which is not phosphorylated, was found to be bound to 0, 1 or 2 units of SAH, but not SAM (Suppl. Fig. 2C). Thus, phosphorylated MTHFR appeared to protect thermally unstable SAM from degradation to SAH, while the non-phosphorylated protein was unable to perform this function.
Phosphorylation has been identified to affect the maximum degree of inhibition of MTHFR by SAM, whereby phosphorylated protein was found to be maximally ~80% inhibited, while phosphatase treated protein was maximally ~60% inhibited 14 . At high concentrations of SAM (>200 µM), we were able to inhibit all recombinant HsMTHFR proteins by over 95%, regardless of the phosphorylation state (Fig. 3). However, at low SAM concentrations we found phosphorylated HsMTHFR 1-656 to be more sensitive to SAM inhibition than HsMTHFR 38-644 and dephosphorylated HsMTHFR 1-656 (Fig. 3). Further analysis revealed as purified and mock treated HsMTHFR 1-656 to have inhibition constants (Ki) of ~3 µM, while CIP treated HsMTHFR 1-656 was approximately 2-fold less sensitive to SAM inhibition, and HsMTHFR 38-644 7-fold less sensitive (Fig. 3, inset; Table 1). Thus, although phosphorylation does not directly affect MTHFR enzymatic activity, it increases the protein's sensitivity to SAM inhibition.
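To make the fold differences concrete, the sketch below converts them into approximate Ki values and evaluates a deliberately simple one-site hyperbolic inhibition model, in which the inhibited fraction equals [SAM] / ([SAM] + Ki). This model is an assumption chosen for illustration only; it is not the fitting procedure behind Fig. 3 or Table 1, and it ignores the slow onset of SAM inhibition. The names and the choice of 3 µM as the reference Ki are likewise illustrative.

```python
# Approximate Ki of as purified (phosphorylated) HsMTHFR 1-656, from the text.
KI_PHOSPHORYLATED_UM = 3.0

# Fold reduction in SAM sensitivity relative to as purified protein (from the text).
FOLD_LESS_SENSITIVE = {
    "as purified HsMTHFR 1-656": 1.0,
    "CIP treated HsMTHFR 1-656": 2.0,
    "HsMTHFR 38-644": 7.0,
}


def apparent_ki(fold_less_sensitive, reference_ki=KI_PHOSPHORYLATED_UM):
    """Approximate Ki (µM) implied by a fold change in sensitivity."""
    return reference_ki * fold_less_sensitive


def fraction_inhibited(sam_um, ki_um):
    """ASSUMED one-site hyperbolic model: fraction = [SAM] / ([SAM] + Ki)."""
    if sam_um < 0 or ki_um <= 0:
        raise ValueError("concentration must be >= 0 and Ki must be > 0")
    return sam_um / (sam_um + ki_um)


if __name__ == "__main__":
    for protein, fold in FOLD_LESS_SENSITIVE.items():
        ki = apparent_ki(fold)
        low = fraction_inhibited(3.0, ki)
        print(f"{protein}: Ki ~ {ki:.0f} µM, "
              f"inhibited fraction at 3 µM SAM ~ {low:.2f}")
```

Under this simplified model, the difference between protein forms is largest at SAM concentrations near the Ki values and shrinks as SAM approaches saturating levels, which is the qualitative pattern described above.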

Human MTHFR has evolved an extensive linker to connect and interact with its two domains
We determined the 2.5 Å resolution structure of HsMTHFR 38-644 in complex with FAD and SAH by multiple-wavelength anomalous dispersion using the selenomethionine (SeMet) derivatized protein ( Table 2). The identity of both ligands is guided by well-defined electron density (Suppl. The HsMTHFR 38-644 structure allows the mapping of the 70 inherited missense mutations known to cause severe MTHFR deficiency, which lie on 64 different residues of the polypeptide (Suppl. Fig. 5). Twice as many of the mutation sites are found in the catalytic domain (n=38) as the regulatory domain (20), with the remainder (6) found in the linker region. By proportion, however, the linker region has a higher density (24% of the sequence) of mutation sites than the catalytic (11%) and regulatory (7%) domains. Additionally, a number of sites in the catalytic and regulatory domains are in direct contact with the linker region. Further, the most severe mutations, those found either homozygously or in conjunction with a truncating mutation to result in enzymatic activity below 1.5% of control activity in patient fibroblasts 20 , cluster in the catalytic domain and the first two aa of the linker region, most of which are located where the linker meets the catalytic domain (Suppl. Fig. 5). Together, this analysis underscores the importance of the linker region to proper protein function.
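The per-region mutation densities quoted above follow from dividing the number of mutated residues by the region length. The sketch below illustrates the arithmetic. The linker boundaries (aa 338-362) are those used elsewhere in the text, while the catalytic (aa 1-337, here including the N-terminal extension) and regulatory (aa 363-656) ranges are assumptions made for this example, as are all names.

```python
# Number of residues carrying disease-causing missense mutations (from the text).
MUTATION_SITES = {"catalytic": 38, "linker": 6, "regulatory": 20}

# ASSUMED inclusive residue ranges; only the linker range is stated in the text.
REGION_BOUNDS = {
    "catalytic": (1, 337),
    "linker": (338, 362),
    "regulatory": (363, 656),
}


def region_length(region):
    """Number of residues in a region (inclusive boundaries)."""
    start, end = REGION_BOUNDS[region]
    return end - start + 1


def site_density(region):
    """Fraction of residues in a region that are mutation sites."""
    return MUTATION_SITES[region] / region_length(region)


if __name__ == "__main__":
    for region in MUTATION_SITES:
        print(f"{region}: {MUTATION_SITES[region]} sites / "
              f"{region_length(region)} residues = {site_density(region):.1%}")
```

With these assumed boundaries the densities round to 11% (catalytic), 24% (linker) and 7% (regulatory), in line with the proportions stated above.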

The HsMTHFR 38-644 structure displays an asymmetric dimer with inter-domain flexibility
The HsMTHFR 38-644 structure reveals a homodimer (Fig. 4B), consistent with native mass spectrometry (Suppl. Fig. 2B) and previous investigation of mammalian MTHFR by size exclusion chromatography and scanning transmission electron microscopy 9 . It was previously thought that MTHFR homodimerizes in a head-to-tail manner, where the regulatory domain of one subunit interacts with the catalytic domain of the other subunit 13 . Unexpectedly, in our structure dimerization is mediated almost entirely by the regulatory domain (Fig. 4B), although the first ordered residue in chain A (Glu40) is located around 5-6 Å from the regulatory domain of chain B (e.g. Glu553, Arg567). The N-terminal sequence that is either not present (Ser-rich phosphorylation region, aa 1-37), or present but disordered (aa 38-39), in the HsMTHFR 38-644 structure will likely project towards the interface of the two regulatory domains (Fig. 4C), and may contribute further to the dimer contacts.
The essential interfacial residues from the regulatory domain are contributed predominantly from the two central β-sheets, including a β-turn (β11-β12), strand β16, and the loop encompassing Asn386-Asn391 (Suppl. Fig. 6), which together bury ~1330 Å² of accessible surface. Half of the sites of missense mutations in the regulatory domain causing MTHFR deficiency (n=10, Suppl. Fig. 5) either participate in, or are within two residues of, the dimerization site.
Within the homodimer, each of the two catalytic domains is presented away from the dimeric interface and their active sites are at opposite ends of the overall shape and face away from each other (Fig. 4B). In this arrangement, the catalytic domain is not involved in oligomerization, unlike bacterial and archaeal MTHFR proteins (Suppl. Fig. 7). That said, the N-terminus of the HsMTHFR 38-644 construct projects towards the dimer interface. A direct consequence of the dimeric architecture is that the HsMTHFR catalytic domain displays a large degree of flexibility in its orientation relative to the regulatory domain. This is reflected in our structure, whereby the catalytic domain of one dimer subunit (chain A) is ordered, while that of the other dimer subunit (chain B) is highly disordered, to the extent that only main chain atoms of amino acids 40-58, 129-134 and 155-342 in chain B could be modeled.

Dynamics of MTHFR observed by solution scattering
Our HsMTHFR 38-644 crystal structure has captured a snapshot of an asymmetric dimer whereby the two catalytic domains have different orientations with respect to their own regulatory domains (Suppl. Fig. 8). We applied small-angle X-ray scattering (SAXS) to better understand the conformations assumed by the protein in solution. Superimposition of the theoretical scattering curve back-calculated from the crystal structure dimer against experimental data obtained from HsMTHFR 38-644 in solution revealed a poor fit (χ² 14.8; Fig. 5A and B), suggesting this is not the predominant conformation in solution. However, by employing CORAL 21 to simulate relaxation of the relative orientations of the catalytic and regulatory domains (by allowing flexibility in residues 338-345 of the linker), and thus also permitting rigid body movement of these subunits relative to each other, we obtained a significantly improved fit (χ² 5.5; Fig. 5A). Here, the best model was also represented by an asymmetric dimer, but in this case with catalytic domains extended and rotated in comparison to the regulatory domains (Fig. 5B). Thus, consistent with our finding from the crystal structure, HsMTHFR retains a significant degree of intra- and inter-domain conformational flexibility in solution.
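For readers unfamiliar with how such goodness-of-fit values arise, the sketch below computes a reduced χ² between an experimental and a model scattering curve, using a commonly used definition: the model intensities are scaled to the data by weighted least squares, and the squared, error-normalized residuals are averaged over N-1 points. This is an illustrative assumption; the text does not state the exact formula or software settings behind the reported values, and the function names are hypothetical.

```python
def scale_factor(i_exp, i_calc, sigma):
    """Weighted least-squares scale that best maps i_calc onto i_exp."""
    numerator = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    denominator = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    return numerator / denominator


def reduced_chi_squared(i_exp, i_calc, sigma):
    """Reduced chi-squared between experimental and scaled model intensities.

    i_exp  : experimental intensities I(q)
    i_calc : model intensities at the same q values
    sigma  : experimental errors (all > 0)
    """
    if not (len(i_exp) == len(i_calc) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    if len(i_exp) < 2:
        raise ValueError("at least two data points are required")
    k = scale_factor(i_exp, i_calc, sigma)
    total = sum(((e - k * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return total / (len(i_exp) - 1)


if __name__ == "__main__":
    # Toy numbers only, not data from this study.
    experimental = [1.0, 2.0, 3.0]
    model = [1.0, 1.0, 1.0]
    errors = [1.0, 1.0, 1.0]
    print(reduced_chi_squared(experimental, model, errors))
```

A value near 1 indicates agreement within experimental error, so a drop from 14.8 to 5.5 reflects a better, though still imperfect, description of the solution data.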
To further investigate the influence of phosphorylation on protein conformation, we next collected SAXS data for full-length HsMTHFR 1-656 as purified (i.e. phosphorylated and bound with SAM) and treated with CIP (i.e. dephosphorylated and bound with SAH). Superimposition of the experimental scattering curves from these two conditions indicates that as purified HsMTHFR 1-656 has slightly larger dimensions than CIP-treated HsMTHFR 1-656 , as indicated by its larger LogI(0) (Fig. 5C), although both protein forms are consistent with a dimeric configuration. To clarify the difference in overall shape, we fitted the theoretical scattering curves of the HsMTHFR 38-644 rigid body CORAL models to the SAXS experimental data of phosphorylated (χ² of 22.2 ± 0.4) and dephosphorylated (χ² of 30.7 ± 1.8) MTHFR. Since the fits were clearly different, the protein conformations represented by the experimental data are also different. Likewise, the theoretical scattering curve of HsMTHFR 38-644 observed in the crystal presents a good fit to the experimental data of dephosphorylated MTHFR (χ² of 5.9), but not to that of phosphorylated MTHFR (χ² of 11.6). These data were further corroborated by charge radius analysis of native phosphorylated and dephosphorylated HsMTHFR 1-656 by electrospray ionization mass spectrometry, which found a significantly different charge distribution of protein ions between the two protein forms, indicating altered flexibility (Suppl. Fig. 9). Altogether, we interpret that the phosphorylated SAM-bound form of the protein presents a different conformation to the dephosphorylated SAH-bound form.

Subtle features of the eukaryotic catalytic domain provide for NADPH specificity
The MTHFR catalytic domain adopts a TIM-barrel structure evolutionarily conserved across all kingdoms. In addition to HsMTHFR 38-644 , we further determined the catalytic domain structure of the yeast homolog MET12 (ScMET12 1-301) to 1.56 Å resolution (Table 2). This enables for the first time a structural comparison across mammalian (HsMTHFR), low-eukaryotic (ScMET12) and bacterial (E. coli, H. influenzae, T. thermophilus) orthologues. Consistent with their sequence conservation (Suppl. Fig. 10), the catalytic domains have highly superimposable folds (main chain RMSD: 1.85 Å), although distinct local differences are found in low homology loop regions (Fig. 6A, a-b) and helices (Fig. 6A, c-d).
In HsMTHFR 38-644 , clear electron density for FAD was observed in the TIM barrel of chain A (Suppl. The bi bi kinetic mechanism of MTHFR necessitates the electron donor NAD(P)H and substrate CH2-THF to interact in turn with FAD for transfer of the reducing equivalent, and hence to share the same binding site. In our structures, the FAD ligand adopts a conformation poised to expose the si face of the isoalloxazine ring for the incoming NADPH and CH2-THF. However, instead of trapping the electron donor or substrate (despite multiple attempts at co-crystallization), the binding site in ScMET12 1-301 and subunit A of HsMTHFR 38-644 is blocked by a crystal packing interaction from a nearby symmetry mate, making π-π stacking interactions with the FAD ligand (Suppl. Fig. 11). By contrast, no crystal packing interaction is found in the chain B binding site of HsMTHFR 38-644 , explaining the overall mobility and disorder of its catalytic domain.
Superimposing the HsMTHFR 38-644 structure with that of EcMTHFR bound with NADH (Fig. 6C) and CH3-THF (Fig. 6D) demonstrates that the human enzyme has largely preserved the same shared binding site found in prokaryotes, with Gln228, Gln267, Lys270, Leu271 and Leu323 likely to be important for interacting with both NAD(P)H and CH3-THF. EcMTHFR preferentially utilizes NADH 23 , and its NADH-bound structure reveals a highly uncommon bent conformation 24 for the electron donor, where the nicotinamide ring stacks over the adenine base to mediate π-π interactions 7 . Our activity assay of HsMTHFR 38-644 and HsMTHFR 1-656 clearly demonstrates an ~100-fold preference for NADPH compared to NADH as electron donor (Table 1), in agreement with previous enzyme studies of pig 11,25 and rat 11 MTHFRs.
Within the HsMTHFR active site, we did not identify any obvious differentiating features surrounding the modeled NADH that could indicate how the extra 2'-monophosphate group on the NADPH ribose is accommodated (Suppl. Fig. 12). It is also unclear whether HsMTHFR actually binds NADPH in a similar manner as EcMTHFR binds NADH, considering there is only one report in the literature documenting a compact stacked conformation for NADPH 26 . Modelling an NADPH ligand with such a stacked conformation onto the HsMTHFR 38-644 structure reveals severe steric clashes with helix α8 (not shown), which creates the floor of the NAD(P)H binding site (e.g. via Gln267, Lys270 and Leu271). Helix α8 is poorly aligned with bacterial and low eukaryotic orthologues in both amino acid sequence (Suppl. Fig. 10) and structural topology (Fig. 6A). The equivalent helix in EcMTHFR harbours the residue Phe223, which is crucial to NADH binding 7 and moves to accommodate substrate release 5 . Notably, this residue is not conserved in HsMTHFR and ScMET12, being replaced by Leu268 and Ala230, respectively (Suppl. Fig. 10). Therefore, given its position and mobility, we propose that residue(s) on helix α8 in HsMTHFR may play a role in the specificity for NADPH and likely also substrate binding/release.

A novel fold for the SAM-binding regulatory domain
The HsMTHFR 38-644 structure provides the first view of the 3D arrangement of the regulatory domain unique to eukaryotic MTHFR. The core of this fold comprises two mixed β-sheets of 5 strands each (β9↑-β17↑-β16↓-β12↑-β11↓ and β10↓-β13↑-β18↓-β14↑-β15↓) (Suppl. Fig. 4). Strand  Fig. 13). Further, a DALI search of this domain (Holm and Laakso 2016) did not yield any structural homolog, and we found no existing annotation in PFAM/CATH/SCOP databases and no sequence for this domain beyond eukaryotic MTHFR homologs. Therefore, this appears to be a novel fold utilized only by MTHFR for SAM binding/inhibition. In our structure, SAH is bound in an extended conformation within the part of the regulatory domain (Fig. 7A) that faces the catalytic domain. Indeed, part of the binding site is constituted by the linker region itself. The ligand is sandwiched between the loop segment preceding hydrogen-bond to the SAH adenine moiety, while Glu463 (99%) and Thr464 (62%) fixate the ribose hydroxyl groups. The strongest sequence conservation in the SAH binding site is found around the homocysteine moiety, including Pro348 (invariant) and Trp349 (99%) from the linker region, as well as Thr560 and Thr573 (both invariant) at the start and end of the β15-β16 turn. The SAH homocysteine sulfur atom is loosely contacted by Glu463 (3.8 Å) and Ala368 (3.7 Å). SAM is expected to bind to the same site in the regulatory domain, in a similar extended configuration as SAH and requiring the same set of binding residues. However, the additional methyl group at the sulphonium centre of SAM would create a steric clash with the Ala368 position of the structure (inter-residue distance ~2.0 Å between heteroatoms, and <1.5 Å between hydrogen atoms) (Suppl. Fig. 14). Although not strictly conserved (45% of 150 orthologues), conservation of Ala368 follows a similar evolutionary pattern as the MTHFR domain organisation (Fig. 1): in higher animals alanine is invariant; lower animals may accommodate a serine; while lower eukaryotes often incorporate a bulky residue (e.g. lysine) (see Suppl. Fig. 15). Therefore, in higher organisms such as humans, SAM binding likely results in conformational rearrangement of the loop region containing Ala368 to accommodate its methyl moiety.

SAM-dependent conformational change is mediated by the linker region
Since there is no direct interface between the active site of the catalytic domain and any part of the regulatory domain (Fig. 4), SAM binding must elicit enzymatic inhibition via a conformational change propagated from the regulatory to catalytic domain. The most likely effector of this conformational change is the extended linker region (defined as aa 338-362), since it makes multiple contacts to both the regulatory and catalytic domains (Fig. 4) and forms part of the SAM/SAH binding site (Fig. 7A). To investigate the potential of this region to elicit conformational change following SAM binding, we generated recombinant HsMTHFR proteins consisting of the regulatory domain alone attached to progressively shorter linker regions, where the N-terminus of these constructs would become Pro348 (HsMTHFR 348-656), Arg357 (HsMTHFR 357-656) and Arg377 (HsMTHFR 377-656) (Suppl. Fig. 1; Fig. 7B). All three constructs are sufficient to bind SAM and SAH, as demonstrated by dose-dependent increases in thermostability by differential scanning fluorimetry when exposed to increasing concentrations of each ligand (Suppl. Fig. 16A). This again reinforces the catalytic and regulatory domains as separate binding modules for their cognate ligands (FAD/NADPH/CH3-THF vs SAM/SAH respectively).
We employed analytical size exclusion chromatography (aSEC) as a means to study solution behaviour of the MTHFR regulatory domain in response to SAM/SAH binding. Exposure of MTHFR 348-656 to either SAH or SAM resulted in shifts of elution volume (Ve) compared to as purified (apo-) protein, which we interpreted as changes in the overall protein conformation, rather than changes in oligomeric states. Our assumption is based on the native mass spectrometry data (Fig. 2C; Suppl. Fig. 2B) showing that SAM and SAH do not alter the oligomeric states observed for the HsMTHFR 1-656 and HsMTHFR 38-644 proteins. Importantly, for MTHFR 348-656 , SAM resulted in a leftward Ve shift (suggestive of a larger hydrodynamic volume), and SAH a rightward shift (suggestive of a smaller hydrodynamic volume) (Fig. 7B). By contrast, MTHFR 357-656 showed conformational change only when exposed to SAM, and MTHFR 377-656 did not change conformation when exposed to either ligand (Fig. 7B). A similar pattern of results was observed when using purified recombinant mouse MTHFR of the same protein boundaries (Suppl. Fig. 16A and B). Therefore, we conclude that residues within 357-377 must contribute to conformational change upon SAM binding.
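The logic of the truncation series can be laid out explicitly. The sketch below is an illustrative aid rather than part of the analysis: for each construct it reports how many residues of the linker (aa 338-362, as defined above) are retained and whether Ala368, the residue positioned next to the ligand sulfur atom in the structure, is still present. The construct boundaries are taken from the text; the function and variable names are hypothetical.

```python
# Linker boundaries as defined in the text (inclusive).
LINKER = (338, 362)

# Residue positioned next to the SAH/SAM sulfur atom in the structure.
ALA368 = 368

# First residue of each regulatory domain construct; all end at residue 656.
CONSTRUCT_STARTS = {
    "HsMTHFR 348-656": 348,
    "HsMTHFR 357-656": 357,
    "HsMTHFR 377-656": 377,
}
CONSTRUCT_END = 656


def linker_residues_retained(start):
    """Number of linker residues still present in a construct."""
    first_retained = max(start, LINKER[0])
    return max(0, LINKER[1] - first_retained + 1)


def contains_residue(start, position, end=CONSTRUCT_END):
    """True if a residue position lies within the construct boundaries."""
    return start <= position <= end


if __name__ == "__main__":
    for name, start in CONSTRUCT_STARTS.items():
        print(f"{name}: {linker_residues_retained(start)} linker residues, "
              f"Ala368 present: {contains_residue(start, ALA368)}")
```

The two constructs that respond to SAM (348-656 and 357-656) both retain Ala368, whereas the unresponsive 377-656 construct lacks it along with the whole linker, which is consistent with the conclusion drawn above.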
Next we carried out site-directed mutagenesis to define residues involved in SAM binding and/or SAM-mediated conformational change as observed in the aSEC experiment. We reasoned that mutation of Glu463 (which hydrogen-bonds a ribose oxygen) could lead to loss of SAH/SAM binding, and thus conformational change. Indeed, conservative mutation of Glu463 to either aspartate (p.E463D) or glutamine (p.E463Q) on MTHFR 348-656 resulted in protein that could no longer bind SAM (Suppl. Fig. 16C), nor change conformation in its presence (Fig. 7C). We further hypothesized that mutation of Ala368 (in close proximity to the SAM/SAH sulphonium centre) to a smaller residue (glycine: p.A368G) may not have an effect on binding or conformational change, while mutation to a larger residue (leucine: p.A368L) might reduce the ability of the linker region to sense SAM binding. Correspondingly, p.A368L resulted in protein which retained the ability to bind SAM, but was less sensitive to conformational change in its presence, while p.A368G did not change either of these properties (Fig. 7C, Suppl. Fig. 16C). These experiments pinpoint Glu463 as crucial to SAM binding, and Ala368 to SAM sensing, representing a mechanism that could transmit a ligand-bound signal from the regulatory to the catalytic domain of the protein.

Discussion
Catalytic regulation by phosphorylation and SAM binding distinguishes human MTHFR from its bacterial (which do not have phosphorylation or SAM binding regions) and lower eukaryotic (which do not have a phosphorylation region) counterparts. Until now, the molecular basis of how these two allosteric events modulate the catalytic machinery was entirely unknown, due to the absence of a structural context. Our structure-guided study has now provided two major discoveries in this area: (1) identification of an extensive linker region involved in both SAM binding and purveying the binding signal to inhibit catalysis by conformational change; and (2) demonstration of the concerted effects of phosphorylation and SAM binding, individually mediated by regions more than 300 amino acids apart.

Long-distance cross-talk between phosphorylation and SAM-binding
This work presents the first mapping of the entire phosphorylation landscape of HsMTHFR, revealing phosphorylated Ser/Thr not only at the far N-terminus (n=11) as predicted from the sequence, but also within the catalytic (3) and regulatory (2) domains. Many of the N-terminal phosphorylation sites identified are consistent with previous mutation analysis 14 , including Thr34 [14][15][16] . The phosphorylated residues detected in the catalytic and regulatory domains were not reported before. Interestingly, two phosphorylated Ser are located within the FAD binding site, although their physiological significance is currently unclear. Contrary to the recent observation of Li et al. 17 we did not identify phosphorylation of Thr549.
An important finding with regard to MTHFR phosphorylation is that it does not directly alter the catalytic parameters of the enzyme, as determined by a sensitive HPLC-based activity assay. Perhaps this is not too surprising, since the first ordered residue of the structure, Glu40 (i.e. immediately following the phosphorylation region aa 1-37), is far removed from the catalytic site. Instead, MTHFR phosphorylation exerts a long-range influence on the SAM binding status at the regulatory domain some 300 amino acids away, by causing an increased sensitivity of the enzyme to SAM inhibition, but with no overall change in total SAM inhibition. Phosphorylation likely enhances SAM sensitivity in two interdependent ways. Firstly, it enables protection of bound SAM from spontaneous degradation to SAH, a phenomenon widely observed for SAM-bound enzymes in vitro 28 (unpublished observations) and in crystallo 29 , to avoid dis-inhibition by SAH. Secondly, phosphorylation could induce a conformational change to the protein that primes an inhibition-ready state. The SAM Ki differences between phosphorylated and dephosphorylated protein, while relatively small (2-3 µM versus 6-7 µM), are likely to be physiologically relevant. Intracellular SAM concentrations are reported to be 1-3 µM in human cells 30,31 , and the mTORC1-linked starvation sensor SAMTOR, which recognizes SAM for nutrient sensing, has a SAM dissociation constant of 7 µM 32 .
It is not immediately clear if global phosphorylation or phosphorylation of only specific residues contributed to the results we found. Evolutionary conservation of the identified phosphorylation sites varies from absolute invariance through to yeast (e.g. Ser394), to poor conservation even among animals (e.g. Ser9 and Ser10) (Suppl. Fig. 15). Truncated recombinant HsMTHFR 38-644 was not identified to be phosphorylated by mass spectrometry or crystallography, suggesting that phosphorylation at the far N-terminus primes the other phosphorylation events within the catalytic and regulatory domains. This is consistent with previous observations 14-16 that removal of Thr34 results in non-phosphorylated protein in vitro. It remains to be seen whether phospho-Thr34 alone, or other sites in the Ser-rich region, primes other phosphorylation events in vivo.

Inter-domain conformational change is integral to MTHFR regulatory properties
We identified the MTHFR regulatory domain to constitute a novel SAM binding fold whose appendage to the well conserved catalytic TIM-barrel is a relatively recent and contained evolutionary event. A similar phenomenon of domain organisation is found in several eukaryotic enzymes, for example those involved in amino acid metabolism (e.g. cystathionine β-synthase, CBS 33 ; phenylalanine hydroxylase, PAH 34 ), whereby the additional metabolite-binding modules, not found in their bacterial counterparts, serve to fine-tune catalysis in response to the more intricate higher eukaryotic metabolic and signalling cues. We propose that MTHFR belongs to this class of allosteric enzymes that share a common mechanism: to regulate catalysis through steric sequestration of the catalytic site, in a ligand-dependent manner (SAM for MTHFR and CBS; phenylalanine for PAH).
In all of these enzymes, inter-domain conformational change is central to the allosteric mechanism, bringing about a rearrangement of the relative orientation between the regulatory and catalytic domains. Often this requires the flexibility of an inter-domain linker that adopts different conformations to mediate the domain-domain rearrangement. Our data are consistent with the MTHFR linker region being an indispensable component of this mechanism, as supported by the concentration of deleterious disease mutations found in this region. The MTHFR linker does more than physically join the two domains: it actively partakes in the allosteric mechanism by (i) acting as a SAM sensor that contributes to binding the effector ligand, and (ii) purveying the SAM-bound inhibition signal to the catalytic domain. The MTHFR linker is aptly suited for these dual roles, as it makes extensive contacts with the regulatory domain (e.g. SAM binding site) and catalytic domain (e.g. helices α3 and α4).
Our domain truncation and mutagenesis experiments coupled with size exclusion chromatography have dissected the regions responsible for SAM/SAH binding and binding-mediated conformational change. Notably, aa 348-376 contribute to, but are not essential for, the SAM/SAH binding site, while aa 357-376 are essential for SAM-mediated conformational change. Within aa 357-376, Ala368 is in the direct vicinity of the SAM/SAH sulphonium centre, and its mutation to a bulkier residue blocked conformational change without affecting SAM/SAH binding. We therefore posit that SAM binding causes a change in the linker conformation (e.g. via Ala368), which in turn translates to a change in the catalytic domain, resulting in decreased enzyme activity.
Although our crystal structure represents a static snapshot of the enzyme state (likely a dis-inhibited state due to SAH binding), evidence for inter-domain conformational changes is provided by the following data. Firstly, SAXS analysis of SAM-bound phosphorylated protein and SAH-bound dephosphorylated protein reveals inter-domain flexibility, consistent with subtle but distinguishable changes to the protein dimensions. Secondly, the two chains in the crystal asymmetric unit show varying intrinsic order of the catalytic domain with respect to the regulatory domain. Additional genetic data from our lab are also in accord 20 , as patient fibroblasts homozygous for p.His354Tyr, a linker residue which contacts helix α3 in the catalytic domain, exhibited a 5-fold decrease in Ki for SAM.
In what respects, then, could the SAM-bound signal influence the catalytic domain, given that its kinetic parameters remain largely unaltered? One possibility is an effect on the stability or integrity of FAD, the essential cofactor. We observed that supplementation with FAD rescued the activity of our recombinant MTHFR. This is indicative of cofactor loss, in agreement with previous findings 13 , and suggestive of FAD being only loosely bound, as exemplified in chain B of our structure. Furthermore, a number of MTHFR mutations 20,22 and polymorphisms 13 have been shown to affect FAD responsiveness. It is therefore possible that the inter-domain flexibility we observed, communicated by the SAM-bound signal, would alter the orientation of the catalytic domains with respect to the rest of the protein, in a similar manner to the multi-domain enzymes CBS and PAH 35 . Such structural conformations are supported by overlays between chains A and B in our structure, and between apo- and holo-subunits in TtMTHFR 8 . In the case of MTHFR, the active site could consequently be more sequestered (so that FAD remains bound) or more exposed and mobile (leading to FAD loss).

SAM-binding and phosphorylation act in concert as on/off and dimmer switches, respectively
The SAM/SAH ratio is regarded as an indicator of a cell's methylation potential, reflecting its capacity to perform DNA methylation or to create compounds which require methyl groups for assimilation. In the face of a low SAM/SAH ratio, meaning methyl donor deficiency, MTHFR is dis-inhibited, increasing the production of CH 3 -THF to improve throughput of the methionine cycle and replenish SAM levels. Conversely, a high SAM/SAH ratio means abundant methylation capacity, in which case SAM-mediated allosteric inhibition of MTHFR turns off CH 3 -THF production, thereby lowering methionine cycle activity and, concomitantly, the generation of SAM. This "on/off" switch is especially powerful at high SAM levels, as illustrated by almost complete inhibition of recombinant HsMTHFR at > 200 µM SAM. Although such concentrations are unlikely to be seen inside the cell, HsMTHFR has been further outfitted with a "dimmer" switch, whereby protein phosphorylation increases sensitivity to SAM-mediated inhibition at normal (1-10 µM) cellular SAM levels. In this regard, phosphorylation allows linkage of the methionine cycle to other cellular pathways (e.g. cell cycle) through specific kinase activities (as suggested by 16,17 ).
The clear correlation we observed between phosphorylated MTHFR and SAM binding in solution (vs dephosphorylated MTHFR and SAH binding) leads us to conclude that the two regulatory properties act in concert. In fact, the architecture of the HsMTHFR homodimer is well suited to facilitate this correlation. (1) The dimeric interface is entirely constituted by the regulatory domain to form a scaffold, while leaving each SAM binding site on a different face for its sensing and signal transmission functions; (2) The lack of contacts between inter-monomeric catalytic domains allows for the intrinsic intra-monomeric mobility with respect to the regulatory domains for signal propagation; (3) Importantly, this dimeric configuration brings the N- and C-termini of the polypeptide into proximity, projecting the phosphorylation region close to the regulatory domain dimer interface.

Concluding Remarks
In summary, we provide the first structural view of a eukaryotic MTHFR, pointing to the linker region playing a direct role in allosteric inhibition following SAM binding, and to phosphorylation as a means to modulate SAM inhibition sensitivity. Modulating such fine control over the level of a key metabolite may be of pharmacological interest, including in cancer metabolism 36,37 . Our work here constitutes a strong starting point for future, more precise investigation by structural, biochemical, and cellular studies, for example towards: identification of the kinase(s) responsible for MTHFR phosphorylation in vivo; combining different structural methods to delineate conformational changes of the entire protein; and revealing the molecular basis of its specificity for NADPH.

Crystallization, Structural Determination and Analysis
Purified native ScMET12 1-302 as well as SeMet-derivatized and native HsMTHFR 38-644 were concentrated to 15-20 mg/ml, and crystals were grown by sitting drop vapour diffusion at 20°C. The mother liquor conditions are summarized in Table 1. Crystals were cryo-protected in mother liquor containing ethylene glycol (25% v/v) and flash-cooled in liquid nitrogen. X-ray diffraction data were collected at the Diamond Light Source and processed using XIA2 38 . The HsMTHFR 38-644 structure was solved by selenium multi-wavelength anomalous diffraction phasing using autoSHARP 39 , and subjected to automated building with BUCCANEER 40 . The SeMet model was used to solve the native structure of HsMTHFR 38-644 by molecular replacement using PHASER 41 . This structure was refined using PHENIX 42 , followed by manual rebuilding in COOT 43 . Phases for ScMET12 1-302 were calculated by molecular replacement using 3APY as model. Atomic coordinates and structure factors for both ScMET12 1-302 (accession code: To Be Deposited) and HsMTHFR 38-644 (accession code: 6CFX) have been deposited in the Protein Data Bank. Data collection and refinement statistics are summarized in Table 1.

MTHFR Assay
All enzymatic assays, including SAM inhibition and thermolability, were performed using the physiological forward assay described by Suormala et al. 18 with modifications as described by Rummel et al. 44 and Burda et al. 20,22 . Only minor adaptations were made for use with pure protein, including using the substrate CH 2 -THF at a concentration of 75 µM, reducing the assay time to 7 minutes and adding BSA to keep purified proteins stable. Prior to assay, purified proteins were diluted from 15-20 mg/ml to 1 mg/ml in 10 mM HEPES buffer pH 7.4, 5% glycerol and 500 mM NaCl, followed by successive dilutions of 1:100 and 1:32 in 10 mM potassium phosphate, pH 6.6 plus 5 mg/ml BSA, to a final MTHFR concentration of 312.5 ng/ml. All Km values were derived using a non-linear fit of Michaelis-Menten kinetics in GraphPad Prism (v6.07). For SAM inhibition, purified SAM 45 was used. The Ki was estimated from a plot of log(inhibitor) vs. response with a four-parameter curve fit, as performed in GraphPad Prism (v6.07).
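The dilution arithmetic and the shape of the inhibition curve described above can be sketched as follows. The four-parameter equation shown is the commonly used variable-slope form for log(inhibitor) vs. response; the exact model settings used in GraphPad Prism are not stated here, and the function names and example parameter values are hypothetical.

```python
import math

def dilute(concentration: float, factor: float) -> float:
    """Concentration after a 1:factor dilution."""
    if factor <= 0:
        raise ValueError("dilution factor must be positive")
    return concentration / factor

# 1 mg/ml working stock expressed in ng/ml, then 1:100 followed by 1:32.
stock_ng_per_ml = 1.0 * 1_000_000
final_ng_per_ml = dilute(dilute(stock_ng_per_ml, 100), 32)
print(f"Final MTHFR concentration: {final_ng_per_ml} ng/ml")  # 312.5 ng/ml

def four_param_logistic(log_x: float, top: float, bottom: float,
                        log_ic50: float, hill: float) -> float:
    """Variable-slope dose-response curve (ASSUMED standard form).

    log_x and log_ic50 are log10 concentrations; hill is negative for an
    inhibitor, so the response falls from `top` to `bottom` as log_x rises.
    """
    return bottom + (top - bottom) / (1.0 + 10 ** ((log_ic50 - log_x) * hill))

# Hypothetical parameters: 100% activity without SAM, 0% at saturation,
# midpoint at 5 µM SAM, Hill slope of -1.
midpoint = four_param_logistic(math.log10(5.0), top=100.0, bottom=0.0,
                               log_ic50=math.log10(5.0), hill=-1.0)
print(f"Remaining activity at the midpoint: {midpoint:.1f}%")
```

In practice the four parameters would be obtained by non-linear regression against the measured activities rather than set by hand as they are here.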

Solution Analysis
Analytical gel filtration to assess changes in conformation was performed in the presence or absence of 250 µM SAH or SAM (both Sigma-Aldrich) as described previously 46 . Differential scanning fluorimetry in the presence of 0-250 µM SAM and SAH was performed as described previously 47,48 . C. The data were processed and analyzed using the ATSAS program package 21 . The radius of gyration Rg and forward scattering I(0) were calculated by Guinier approximation. The maximum particle dimension Dmax and P(r) function were evaluated using the program GNOM 49 . To demonstrate the absence of concentration-dependent aggregation and interparticle interference in both SAXS experiments, we inspected Rg over the elution peaks and performed our analysis only on a selection of frames in which Rg was most stable. Overall, such stability of Rg over the range of concentrations observed in the SEC elution indicates that there were no concentration-dependent effects or interparticle interference. The ab initio model was derived using DAMMIF 50 . Ten individual models were created, then overlaid and averaged using DAMAVER 51 . CORAL rigid body modeling was performed by defining residues 338-345 as a flexible linker and allowing the catalytic subunits to move while keeping the regulatory subunits fixed.
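For readers unfamiliar with the Guinier approximation, the sketch below shows how Rg and I(0) follow from a straight-line fit of ln I(q) against q², using ln I(q) = ln I(0) - (Rg²/3)·q² at low angles (commonly taken as q·Rg below about 1.3). It is a plain least-squares illustration on synthetic data, not the ATSAS implementation; the function name and the example values (Rg = 35 Å, I(0) = 1000) are hypothetical and are not results from this study.

```python
import math

def guinier_fit(q_values, intensities):
    """Estimate (Rg, I0) by linear regression of ln I against q^2.

    Uses ln I(q) = ln I0 - (Rg^2 / 3) * q^2; callers should restrict the
    input to the low-angle Guinier region.
    """
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data are not in a Guinier regime")
    return math.sqrt(-3.0 * slope), math.exp(intercept)

# Synthetic, noise-free example (hypothetical values, q in 1/Å).
RG_TRUE, I0_TRUE = 35.0, 1000.0
q = [0.005 + 0.003 * k for k in range(11)]            # q * Rg stays below 1.3
i_q = [I0_TRUE * math.exp(-(RG_TRUE * qi) ** 2 / 3.0) for qi in q]

rg_est, i0_est = guinier_fit(q, i_q)
print(f"Rg = {rg_est:.2f} Å, I(0) = {i0_est:.1f}")
```

Repeating such a fit frame by frame across an elution peak is one way to check that Rg stays stable over the concentration range.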

Mass spectrometry
Native (intact) mass spectrometry was performed as outlined 52 . Denaturing mass spectrometry and phosphorylation mapping were performed as described 53 . For further details see Supplementary Methods.

A. Phosphorylation mapping of HsMTHFR 1-656 . The protein sequence is given as amino acids in single letter code, including the C-terminal His/flag tag (underlined). Black font represents amino acids identified by the mass spectrometer (covered), blue font represents amino acids not identified (not covered), red font represents phosphorylated amino acids. Domains are coloured as in Figure 1. B. Dephosphorylation of HsMTHFR 1-656 following treatment with CIP. Treatment time at 37°C is given. The large number above each peak represents the number of phosphate groups attached; the small number represents atomic mass. Proteins were analyzed by denaturing mass spectrometry. amu: atomic mass unit. C. Native mass spectrometry analysis of HsMTHFR

Inhibition of MTHFR catalytic activity following pre-incubation with various concentrations of SAM. Remaining activity represents percentage of activity compared to MTHFR incubated without SAM. Inset: Replot of percent activity remaining against SAM concentration transformed by log10 to reveal differences between truncated (HsMTHFR 38-644 ) and dephosphorylated full-length (HsMTHFR 1-656 CIP) protein with phosphorylated full-length protein (HsMTHFR 1-656 ; HsMTHFR 1-656 mock). Inhibitory constants (Ki) for SAM were calculated from this graph as described in the Materials and Methods and are provided in Table 1.
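Assigning a number of phosphate groups to a peak in denaturing mass spectrometry rests on simple mass arithmetic: each phosphorylation adds one HPO3 group, about 79.98 Da in average mass. The sketch below shows that calculation; the function name and the example masses are hypothetical and are not measurements from this study.

```python
# Average mass added per phosphorylation event (one HPO3 group), in Da.
PHOSPHATE_SHIFT_DA = 79.98

def estimate_phosphate_count(observed_mass_da: float,
                             unmodified_mass_da: float) -> int:
    """Nearest whole number of phosphate groups explaining the mass shift."""
    shift = observed_mass_da - unmodified_mass_da
    if shift < -PHOSPHATE_SHIFT_DA / 2:
        raise ValueError("observed mass is below the unmodified mass")
    return max(0, round(shift / PHOSPHATE_SHIFT_DA))

# Hypothetical example: an unmodified protein of 70,000 Da and a peak
# shifted by eleven phosphate groups.
unmodified = 70_000.0
observed = unmodified + 11 * PHOSPHATE_SHIFT_DA
print(estimate_phosphate_count(observed, unmodified))
```

A time course of phosphatase treatment would be expected to show this count stepping down towards zero as the peaks shift to lower mass.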