Structural basis of the substrate preference towards CMP for a thymidylate synthase MilA involved in mildiomycin biosynthesis

Modified pyrimidine monophosphates such as methyl dCMP (mdCMP), hydroxymethyl dUMP (hmdUMP) and hmdCMP in some phages are synthesized by a large group of enzymes termed thymidylate synthases (TS). Thymidylate is a nucleotide required for DNA synthesis, and thus TS is an important drug target. In the biosynthetic pathway of the nucleoside fungicide mildiomycin isolated from Streptomyces rimofaciens ZJU5119, a cytidylate (CMP) hydroxymethylase, MilA, catalyzes the conversion of CMP into 5-hydroxymethyl CMP (hmCMP) with a catalytic efficiency (kcat/KM) 5-fold higher than that for deoxycytidylate (dCMP). MilA is thus the first enzyme of the TS superfamily shown to prefer CMP to dCMP. Here, we determined the crystal structures of MilA and its complexes with various substrates including CMP, dCMP and hmCMP. Comparing these structures to those of dCMP hydroxymethylase (CH) from T4 phage and TS from Escherichia coli revealed that two residues in the active site of CH and TS, a serine and an arginine, are replaced in MilA by an alanine and a lysine, Ala176 and Lys133, respectively. The A176S/K133R double mutation of MilA reversed its substrate preference from CMP to dCMP. This is the first study reporting the evolution of the conserved TS in substrate selection from DNA metabolism to secondary nucleoside biosynthesis.

5-Hydroxymethyl cytosine (5hmC), also known as the 'sixth base', was discovered in mammalian and T-even phage DNA 1,2. 5hmC in mammalian DNA is produced post-replicatively by the Tet-catalyzed oxidation of 5-methyl cytosine (5mC) 3,4. In T-even phage, the deoxycytidylate (dCMP) hydroxymethylase (CH) transfers the methylene group from methylene-tetrahydrofolate (CH2THF) to the C5 atom of dCMP and then uses a solvent water molecule to hydrate the methylene group, generating hydroxymethyl dCMP (hmdCMP) 5, a precursor that is incorporated into DNA during replication 6. Thereafter, its hydroxymethyl group serves as a substrate for glucosylation to form glucosylhydroxymethylated DNA, which avoids cleavage by the host restriction systems 7,8.
Some biologically active nucleoside antibiotics, such as bacimethrin 9 , 5-hydroxymethyl blasticidin S 10 and mildiomycin 11 , also contain 5hmC moieties that are all derived from hmCMP. We previously demonstrated that MilA, a CMP hydroxymethylase in the mildiomycin biosynthetic gene cluster in Streptomyces rimofaciens ZJU5119, can convert CMP to hmCMP 12 . HmCMP is then hydrolyzed by MilB to 5-hydroxymethylcytosine (5hmC) 13 , which is finally incorporated into mildiomycin.
MilA and CH are akin to the superfamily of thymidylate synthases (TS), which transfers a methyl group from CH 2 THF to dUMP to form dTMP in the de novo thymidylate synthesis pathway and, hence, DNA synthesis 14 . TS is one of the most conserved enzymes in nucleotide metabolism across phyla and therefore is an important drug target. TS from phage T4 (T4 TS) is involved in coordinating DNA synthesis in infected Escherichia coli cells 15 . Extensive biochemical and structural studies on TS have provided a wealth of information regarding its catalytic mechanism, specific interactions with dUMP and folate analogs, and stability 14,16,17 . The structures of TS and CH resemble each other very well, with a root-mean-square-deviation (RMSD) of 1.849 Å for 127 aligned Cα atoms, despite only 24% of sequence identity between them.
Since TS is responsible for the production of dTMP, one of the building blocks for DNA synthesis, it has been extensively studied as a target for cancer chemotherapy 18. A number of structures of TS in complex with various fragments of substrates, either in the presence or in the absence of cofactor analogues, are available 19. These studies revealed that the cofactor triggered closure of the active site, that the pyrimidine ring of the substrate dUMP directed its binding orientation at the active site, that the ribose sugar moiety contributed to the enzyme's substrate specificity, and that the glycosidic linkage was critical for the precise localization of the substrate 19. However, structural studies on how TS superfamily members differentiate between ribosyl and 2′-deoxyribosyl substrates are relatively limited, probably due in part to the lack of enzymes in this family biased towards ribosyl substrates. One report in this regard is that the binding affinity of TS for uridine monophosphate (UMP) is 40 times lower than that for dUMP 20. All other known members of the TS family, such as 2′-deoxyuridylate hydroxymethylase (dUH) from phage SPO1 21, dCMP hydroxymethylase (CH) from phage T4 22 and dCMP methylase from phage Xp12 23, are specific for 2′-deoxynucleotides.
Several structural studies on ribose recognition specificity involved in pyrimidine nucleotide metabolism have been reported previously. The human mitochondrial deoxyribonucleotidase mdN prefers the 2′ -deoxyribose form of nucleoside monophosphate. In the structure of mdN, a hydrophobic pitch surrounding the 2′ position of the sugar moiety produces an energetically unfavorable environment for the 2′ -hydroxyl group of ribonucleoside 5′ -monophosphates 24 . Another case of deoxyribose preference is deoxyribonucleoside kinase (dNK) from Drosophila melanogaster. In the structure of dNK, the crowded surrounding in the 2′ -position of the substrate sugar leads to steric hindrance against the 2′ -hydroxyl group and hence makes ribose forms of nucleosides less favorable than deoxyribose forms 25 . A rare case of ribose preference is human uridine-cytidine kinase (UCK). It has high specificity for the 2′ -hydroxyl group of pyrimidine ribonucleosides and does not phosphorylate deoxyribose forms 26,27 . Comparison of ligand-free and -bound structures of UCK suggested that the ribose needs to be tightly bound to the enzyme in advance and then triggers a considerable conformational change to form the binding site. Poor binding of the deoxyribose sugar moiety cannot produce the induced fit required for the following base recognition and phosphorylation processes 28 . On the other hand, bacterial CMP kinase phosphorylates dCMP nearly as efficiently as CMP. Its structures in complexes with CMP or dCMP showed that Arg181 forms hydrogen bonds with the 3′ -hydroxyl of sugar moiety while Asp185 could be hydrogen bonded to both 3′ -and 2′ -hydroxyl group 29 . There is no hydrophobic pitch or steric hindrance around the 2′ -position of the substrate sugar; and unlike UCK, no induced fit is required for base binding. 
In addition, it was reported that a single Y639F mutation in the T7 RNA polymerase resulted in an ~20 fold loss of its specificity for NTP over dNTP [30][31][32] ; while a single residue Glu710 of E. coli DNA polymerase I (Klenow fragment) dictated its specificity for dNTP by sterically blocking the 2′ -hydroxyl of an incoming NTP 33 . Besides, the stringency of dNTP over NTP for the MoMLV reverse transcriptase was relaxed from 10,000-fold to merely 30-fold by its F155V mutation 32 , and the dNTP/ddNTP specificities of DNA polymerases of the pol I family could be switched simply by mutating a phenylalanine residue (corresponding to Phe762 for Klenow fragment) to a tyrosine residue 32 .
In this study, we demonstrated that MilA has a substrate preference for CMP (k cat /K M = 39.2 mM −1 min −1 ) over dCMP (k cat /K M = 7.84 mM −1 min −1 ), and thus offers an opportunity to investigate the mechanism by which conserved TS evolves the preference for ribosyl over 2′ -deoxyribosyl groups. The crystal structures of apo MilA, MilA in complexes with CMP, dCMP and hmCMP were determined. Sequence and structure analyses suggested that the selectivity of ribosyl substrates by MilA is attributed to Ala176 and Lys133′ from the other chain of the dimer in the ribose-binding pocket. Mutation of A176S/K133R of MilA resulted in a reversal of substrate preference from CMP to dCMP.

Results and Discussion
Substrate preference of MilA for CMP. We previously reported that MilA could only convert CMP into hmCMP, but could not take dCMP as its substrate 12. Although MilA shares only 26% sequence identity with CH, it was assayed for hydroxymethylation activity with dCMP as substrate. Unexpectedly, liquid chromatography-mass spectrometry (LC-MS) detected the ion corresponding to the product hmdCMP ([M + H]+ mass = 338, retention time Rt = 16.5 min); however, its UV absorption peak overlapped with that of tetrahydrofolate (THFA) (Rt = 16.8 min) (Fig. S1A). To compare substrate preference, equal concentrations of CMP and dCMP were added to the same reaction system with MilA to compete with each other, and hmdCMP and THFA were completely separated using an optimized elution condition in high-performance liquid chromatography (HPLC) analysis. Our results clearly showed that MilA had a strong preference for CMP over dCMP (Fig. 1 and Fig. S1B). The kinetic parameters for MilA were determined with either CMP or dCMP as its substrate (Table 1, Fig. S2). The KM for CMP was 0.0719 mM, 3.4-fold lower than that for dCMP (KM = 0.245 mM), demonstrating that CMP was a better substrate than dCMP for MilA. The kcat/KM for hmCMP production was 39.2 mM⁻¹ min⁻¹, 5-fold higher than that for hmdCMP production (kcat/KM = 7.84 mM⁻¹ min⁻¹, Table 1). Prompted by this observation, we compared the structure of MilA with those of CH and other TS members to identify the amino acids of MilA critical for its preference for ribosyl cytidylate.
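The fold differences quoted above follow directly from the reported KM and kcat/KM values. The short Python sketch below reproduces that arithmetic. The kcat values it prints are back-calculated here as (kcat/KM) × KM and are not quoted from the text, so they may differ slightly from the rounded entries in Table 1; the function and variable names are illustrative only.

```python
# Kinetic values quoted in the text for wild-type MilA.
KM_CMP_MM = 0.0719    # KM for CMP, mM
KM_DCMP_MM = 0.245    # KM for dCMP, mM
EFF_CMP = 39.2        # kcat/KM for CMP, mM^-1 min^-1
EFF_DCMP = 7.84       # kcat/KM for dCMP, mM^-1 min^-1


def fold_difference(larger, smaller):
    """Return the ratio larger / smaller."""
    return larger / smaller


def kcat_from_efficiency(efficiency, km):
    """Back-calculate kcat (min^-1) from kcat/KM and KM (derived, not quoted)."""
    return efficiency * km


km_fold = fold_difference(KM_DCMP_MM, KM_CMP_MM)     # how much lower KM is for CMP
eff_fold = fold_difference(EFF_CMP, EFF_DCMP)        # preference for CMP over dCMP
kcat_cmp = kcat_from_efficiency(EFF_CMP, KM_CMP_MM)
kcat_dcmp = kcat_from_efficiency(EFF_DCMP, KM_DCMP_MM)

print(f"KM fold difference (dCMP/CMP): {km_fold:.2f}")
print(f"kcat/KM fold difference (CMP/dCMP): {eff_fold:.2f}")
print(f"derived kcat, CMP: {kcat_cmp:.2f} min^-1; dCMP: {kcat_dcmp:.2f} min^-1")
```

On these numbers, the preference for CMP comes mainly from the lower KM, with a smaller contribution from the derived kcat.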

Structure of MilA.
The structure of C-terminally His-tagged MilA was determined at 2.20 Å resolution using the selenomethionine (SeMet)-substituted MilA-L167M mutant (Table 2). Subsequently, the structures of the MilA-CMP, MilA-dCMP and MilA-hmCMP complexes were refined to 1.65 Å, 2.10 Å and 1.80 Å resolution, respectively (Table 2). In the structures of apo MilA and its complexes with various substrates, MilA is always a homodimer. The non-crystallographic symmetry (NCS) between the two monomers in the crystallographic asymmetric unit is a twofold rotation with no translation. The N-terminal three residues, the C-terminal five residues and residues 232-238 of MilA, as well as the eight residues (LEHHHHHH) introduced by cloning, showed no clear electron density and were presumably disordered in the crystal. The electron density for residues 305-308 was poor in the structure of apo MilA but clearly resolved in the structures of all the MilA-substrate complexes. There is no obvious difference between the structures of CMP-bound MilA and apo MilA, with a root-mean-square deviation (RMSD) of 0.34 Å for 634 aligned Cα atoms. Interestingly, the average B-factor of a loop region around Arg31 (residues 29-33) is dramatically lowered from 43.2 to 24.8 Å² upon CMP binding (Fig. 2A & B).
The homodimer of MilA consists of two essentially identical subunits and has approximate dimensions of 108 Å × 108 Å × 112 Å. A MilA monomer consists of a six-stranded β -sheet, surrounded by thirteen α -helices and four 3 10 -helices (Fig. 2C). MilA possesses a common fold shared by TS and CH. Compared with TS and CH, MilA has an extra domain consisting of five α helices (from α 9 to α 13) in its C-terminal region ( Fig. 2C and D). Each active site of the dimer is contributed asymmetrically by residues from both subunits. The substrate CMP is located very close to the dimer interface (Fig. 2E). All six β -strands within each monomer as well as α -helices α 1, α 5 and α 6 are involved in dimerization ( Fig. 2C and E), in a manner similar to the dimerization patterns of CH and TS.
Structural similarity to T4 CH and bacterial TS. The major parts of the MilA, T4 CH and E. coli TS subunits resemble each other very well, except for some significant structural differences located in the C-terminal region (Fig. 2D). When the C-terminal region is excluded, a superposition of MilA with E. coli TS and T4 CH gives RMSD values of 1.293 Å and 1.209 Å, respectively. E. coli TS has 27 extra C-terminal residues, folded as two short β-strands, a 3₁₀-helix and a long loop, which are absent in CH (Fig. 3A and B). Unlike the structures of CH and E. coli TS, the C-terminal region of MilA consists of five helices (α9-α13) linked by loops (Fig. 3C). The last six residues of E. coli TS move ~4 Å upon binding folate and partly cover the active site 34,35. Therefore, the presumable folate-binding site of T4 CH is more open than that of E. coli TS 22. Some parts of this region are believed to provide an interaction surface for dihydrofolate reductase (DHFR) 36,37. However, DHFR is not functionally required to interact with T4 CH and MilA, since tetrahydrofolate is produced in the hydroxymethylation reactions catalyzed by T4 CH and MilA. Unlike that of TS, the extra C-terminal region of MilA is much bigger and positioned away from the active site. C-terminal truncations of MilA from either residue 235 or residue 249 are both insoluble (data not shown). Presumably, this region could function as a domain to facilitate protein folding.
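For readers less familiar with the RMSD values quoted above, the quantity is the square root of the mean squared distance between paired atoms after superposition. The sketch below computes it in Python for two coordinate lists that are assumed to be already superposed and paired one-to-one; the superposition step itself is not shown, the source does not state which program was used, and the coordinates in the example are invented purely for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same unit as the input) between two paired, pre-superposed
    lists of (x, y, z) coordinates, e.g. aligned C-alpha atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical example: three atoms, each displaced by 1.2 A along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.0, 0.0, 1.2), (1.0, 0.0, 1.2), (0.0, 1.0, 1.2)]
example_rmsd = rmsd(model_a, model_b)
print(f"RMSD = {example_rmsd:.3f} A")
```

Because the value depends on which atoms are paired, an RMSD should always be read together with the number of aligned atoms, as is done in the text.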
CMP binding and ribose specificity. CH, dUH, and TS all prefer the deoxyribose forms of substrates. In contrast, MilA accepts the ribose form more efficiently than the deoxyribose form, which makes it unique. The substrate CMP is bound in a deep active-site pocket of MilA, in a manner similar to the binding of dUMP by T4 CH and TS (Fig. 3). Most of the key amino acids involved in nucleotide recognition between the structures of MilA-CMP and CH-dCMP aligned very well, except for several substitutions of amino acids in the binding pocket.
In the CH structure, His216 and Tyr218 make hydrogen bonds with the 3′-oxygen atom of the 2′-deoxyribose sugar 22. In the crystal structure of CH, the imidazole ring of His216 could be in two different rotameric states. The same is true for the analogous His216 in MilA. We propose that His216 of both MilA and CH probably adopts the more favorable rotameric state shown in Fig. 4B, in which the distance between the ε-nitrogen of His216 and the 3′-oxygen of dCMP is 2.7 Å, rather than 3.4 Å as in the other rotameric state.
The catalytic efficiency (as quantified by the kcat/KM value) of MilA for CMP is 5-fold higher than that for dCMP. In contrast, the kcat/KM value of TS for dUMP is about 300-fold higher than that for UMP (Table 1). This observation immediately raises two questions. First, what is the molecular mechanism by which MilA prefers ribose nucleotide substrates whereas TS favors deoxyribose ones? Second, why does TS have a much higher stringency in substrate specificity (an almost 300-fold difference between the two kinds of substrates) than MilA (a mere 5-fold difference)? Both questions warrant further investigation.
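To put the 5-fold and roughly 300-fold specificity ratios on a common energy scale, a ratio of kcat/KM values can be converted into an apparent free-energy difference with the standard relation ΔΔG = RT ln(ratio). This conversion is not part of the original analysis; it is a rough sketch that assumes both ratios apply at 37 °C (the MilA assay temperature given in the Methods; the temperature of the TS measurements is not stated here).

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal mol^-1 K^-1


def ddg_from_fold(fold, temp_c=37.0):
    """Apparent free-energy difference (kcal/mol) corresponding to a
    fold difference in kcat/KM, using ddG = R * T * ln(fold)."""
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return R_KCAL * (temp_c + 273.15) * math.log(fold)


ddg_mila = ddg_from_fold(5.0)     # MilA, CMP over dCMP
ddg_ts = ddg_from_fold(300.0)     # TS, dUMP over UMP (approximate ratio)
print(f"MilA: {ddg_mila:.2f} kcal/mol; TS: {ddg_ts:.2f} kcal/mol")
```

Under these assumptions the MilA preference corresponds to about 1 kcal/mol and the TS preference to about 3.5 kcal/mol, so the two enzymes differ by an amount comparable to one or two hydrogen bonds.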
Through a comparison of the active-site structures of the TS-dUMP, CH-dCMP, MilA-dCMP and MilA-CMP complexes, we found that the 3′-hydroxyl groups of the sugar moiety of the substrates adopt two different conformations when complexed with MilA or with TS/CH (Figs 5B and S4); the 3′-carbon together with its 3′-OH of the deoxyribose moiety shows a dramatic torsion (with 3′-C set as the vertex, the angle from 6′-O to 3′-O increases from 102.3°/105.3° to 136.4°) in MilA-dCMP relative to TS-dUMP/CH-dCMP (Fig. 5B, panel 1-3). In TS or CH, both of which prefer deoxyribosyl substrates, the 3′-hydroxyl group of dUMP/dCMP makes hydrogen bonds with TS-His207/CH-His216 and TS-Tyr209/CH-Tyr218, respectively (Fig. 5B, panel 1&2). However, in MilA-dCMP, the 3′-hydroxyl group of dCMP forms one hydrogen bond with Lys133′ and another, intramolecular hydrogen bond with the phosphate group (Fig. 5B, panel 3). The main reason for this difference is that Ala176 in MilA is replaced by a serine, Ser167/Ser169, in TS/CH. The extra hydroxyl group of TS-Ser167/CH-Ser169 crowds the space available to the 3′-hydroxyl group of the sugar and prevents the sugar moiety of the substrate from adopting the conformation seen in complex with MilA. It is not hard to imagine that when UMP or CMP attempts to enter the substrate-binding pocket of TS or CH, the 2′-hydroxyl group of the sugar would occupy the space of the 3′-hydroxyl group, and the 3′-hydroxyl group would have to adopt the same conformation as in CMP bound to MilA, in which case the additional hydroxyl group of the TS-Ser167 or CH-Ser169 side chain would give rise to steric hindrance with the 3′-hydroxyl group of the sugar, given the close distance (1.9 Å, as indicated in panel 4 of Fig. 5B). In contrast, Ala176, the residue in MilA corresponding to TS-Ser167/CH-Ser169, has a much smaller methyl side chain and leaves more room. Therefore, both CMP and dCMP are able to fit into the substrate-binding pocket of MilA (Fig. 5A). Hence, MilA can utilize not only CMP but also, like CH and TS, a deoxyribosyl substrate (dCMP). This provides an explanation for the second question raised above. An alternative interpretation might be that TS is actually a better enzyme than MilA in terms of catalytic efficiency. According to the kinetics summarized in Table 1, the catalytic efficiency of TS on dUMP is 4 orders of magnitude higher than that of MilA on CMP, which probably magnifies the stringency of TS in selecting dUMP over UMP relative to that of MilA in selecting CMP over dCMP.
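The angle quoted above is measured at a vertex atom (3′-C) between two flanking atoms. As a minimal sketch of how such an angle can be obtained from three atomic positions, the Python function below computes the angle between the two vectors that leave the vertex. The coordinates in the example are invented to reproduce a 136.4° angle and are not taken from the deposited structures.

```python
import math


def angle_at_vertex(p, vertex, q):
    """Angle in degrees at `vertex` between the directions to points p and q.
    Each argument is an (x, y, z) tuple."""
    v1 = tuple(a - b for a, b in zip(p, vertex))
    v2 = tuple(a - b for a, b in zip(q, vertex))
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("flanking atoms must not coincide with the vertex")
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


# Hypothetical coordinates: vertex at the origin, second atom rotated by 136.4 degrees.
theta = math.radians(136.4)
example_angle = angle_at_vertex((1.0, 0.0, 0.0),
                                (0.0, 0.0, 0.0),
                                (math.cos(theta), math.sin(theta), 0.0))
print(f"angle = {example_angle:.1f} degrees")
```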
In accordance with the structural analysis, mutation of alanine 176 into serine dramatically decreased the activity of MilA towards CMP, but significantly enhanced its catalytic efficiency toward dCMP (Fig. 6). This further confirmed that Ala176 of MilA is critical for its substrate specificity. The fact that MilA-A176S could still catalyze the hydroxymethylation of CMP implies that its substrate-binding pocket can still accommodate CMP. In the structures of TS-dUMP and CH-dCMP, the guanidino side chain of TS-Arg126′ or CH-Arg123′ could bond to three oxygen atoms of the phosphate group without forming any bonds to the ribose moiety (Fig. S3A,B). By contrast, the counterpart residue in MilA is lysine 133′, which formed hydrogen bonds with the 3′-hydroxyl group of dCMP or CMP in the respective protein/substrate complex (Fig. 5A, panel 2&3). It therefore seems that Lys133′ in MilA plays an auxiliary role in ribose specificity. To address this possibility, Lys133′ was further mutated into arginine on the basis of MilA A176S. In the double mutant MilA A176S/K133R, the catalytic efficiency toward CMP was completely eliminated, whereas its efficiency toward dCMP was only slightly affected (Fig. 6).
As for the first question, MilA prefers ribosyl substrates because, in addition to the hydrogen bonds with the 3′-hydroxyl group of CMP, the 2′-hydroxyl group of CMP makes strong hydrogen bonds with Tyr218 and His216 of MilA, with distances of 2.7 Å and 2.8 Å, respectively. These two additional hydrogen bonds contribute to lowering MilA's KM value for CMP compared to that for dCMP. In summary, our structural information strongly implies that the evolution from a serine and an arginine in the active site of TS/CH to an alanine and a lysine in the active site of MilA contributes substantially to the switch of substrate specificity from deoxyribosyl substrates (dUMP/dCMP) to a ribosyl substrate (CMP) (Fig. 5).
Comparison of sequences and identification of critical amino acids. CH, dUH and TS all prefer deoxyribose-containing substrates, while MilA and BcmA accept ribose-containing substrates more efficiently than deoxyribose-containing ones. The substrate-binding sites of MilA and BcmA should therefore differ structurally from those of the other enzymes. We aligned the primary sequences of MilA and BcmA with those of T4 CH, dUH from phage SPO1, and E. coli TS using the Cobalt constraint-based multiple protein alignment tool. The sequence alignment, graphically enhanced with ESPript 3.0 38, shows that most of the critical amino acids in the active site are conserved (Fig. 7). For instance, the reactive nucleophile residue Cys155, the catalytically important residue Glu68, and the ribose-binding residues His216 and Tyr218 of MilA are highly conserved. This conservation points to similar catalytic mechanisms for MilA, CH, dUH and TS.
On the other hand, three amino acids in MilA, Lys133′ , Ala176 and Asp186 are not conserved in all these five proteins. Interestingly, Lys133′ and Ala176 are conserved in MilA and BcmA, which prefer ribosyl-containing substrates; whereas the equivalent residues in enzymes preferring deoxyribosylated substrates are all arginines and serines. The third residue Asp186 is conserved in MilA, BcmA, and T4 CH which utilize cytosine-containing substrates; whereas the equivalent residue in enzymes favoring uracil-containing substrates like TS and dUH are both asparagines. Song et al. have proposed that in analogy with L. casei TS, Asp179 of T4 CH prefers dCMP to dUMP by achieving a proper orientation of the pyrimidine base through a hydrogen bond network for nucleophilic attack by Cys148 and a better stabilization of the reaction intermediates 22,39 , which is consistent with our structure of MilA-CMP.

Site-directed mutagenesis of MilA. The gene encoding wild-type (WT) MilA from Streptomyces rimofaciens ZJU5119 was cloned into the pET28a (Novagen) vector with a C-terminal 6 × His tag. All mutant plasmids were produced by whole-plasmid polymerase chain reaction 40 and verified by sequencing. The plasmids and the primers used in this study are listed in Supplementary Information, Tables S1 and S2.
Protein expression and purification. Proteins were overexpressed in the Escherichia coli strain BL21(DE3) at 16 °C. A 10 ml culture grown overnight from a single colony was inoculated into 1 liter of Luria Broth medium supplied with 50 μg/ml kanamycin and 34 μg/ml chloramphenicol. The culture was incubated at 37 °C to OD600 = 0.6~0.8 and induced with 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for another 20 hours at 16 °C. The cells were harvested, resuspended in 20 ml binding buffer (20 mM sodium phosphate, pH 7.4, 20 mM imidazole and 500 mM sodium chloride), and lysed by sonication in an ice bath. After centrifugation at 16,000 × g for 30 min at 4 °C, the supernatant was applied to a 2 ml Ni-NTA column (Qiagen) pre-equilibrated with the binding buffer. The column was washed with 60 ml binding buffer and 10 ml washing buffer (20 mM sodium phosphate, pH 7.4, 50 mM imidazole and 500 mM sodium chloride). The column was then eluted with 10 ml elution buffer (20 mM sodium phosphate, pH 7.4, 300 mM imidazole and 500 mM sodium chloride). All the eluate was collected and further purified by Superdex 200 gel filtration chromatography (GE Healthcare) equilibrated with 10 mM Tris-HCl, pH 7.4, 100 mM sodium chloride and 2 mM dithiothreitol. The purified proteins were analyzed by sodium dodecyl sulphate-polyacrylamide gel electrophoresis and visualized by Coomassie blue staining, and the protein concentration was determined using the Bradford Protein Assay Kit (Bio-Rad). The combined peak fractions were concentrated to 10 mg/ml. Selenomethionine (SeMet)-substituted MilA-L167M was expressed using the methionine-auxotrophic E. coli strain B834 cultured in M9 medium (carbon source: glucose) and purified similarly, except that 20 mM β-mercaptoethanol was added before sonication.
In vitro enzymatic assays of MilA WT and MilA mutants and analytical high-performance liquid chromatography (HPLC). In vitro assays of recombinant MilA were carried out at 37 °C for 1 h in a total volume of 100 μl that contained Tris-HCl buffer (100 mM, pH 7.5), paraformaldehyde (15 mM), 2-mercaptoethanol (50 mM), tetrahydrofolate (2 mM, pH 7.5), CMP and dCMP (1 mM, pH 7.5) and the corresponding His-tagged MilA or its mutants (10 μg). The reactions were quenched by the addition of trichloroacetic acid (4%) on ice. The products were resolved on an Agilent TC-C18 column (4.6 mm × 250 mm, 5 μm) on an Agilent 1200 HPLC system, using a gradient of methanol in water supplied with formic acid (0.1%) as the mobile phase. The flow rate of the LC eluent was constant at 0.3 ml/min. Chromatograms were recorded using the absorbance at 275 nm. The percentage of methanol (M) at time t varied according to the following scheme, given as (t, M): (0, 3), (30, 3), (31, 90), (35, 90), (36, 3), (45, 3). The accurate masses of the reaction products, whose structures were previously determined by NMR 12, were analyzed by QTOF/MS (Agilent G6530A).
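The gradient scheme above is given only as breakpoints. As an illustrative sketch, the Python function below returns the methanol percentage at an arbitrary time, assuming that the pump interpolates linearly between consecutive breakpoints and that t is in minutes; both points are assumptions, since the text lists only the (t, M) pairs.

```python
# (time, % methanol) breakpoints as listed in the text.
GRADIENT = [(0, 3), (30, 3), (31, 90), (35, 90), (36, 3), (45, 3)]


def methanol_percent(t, table=GRADIENT):
    """Methanol percentage at time t, assuming linear interpolation
    between consecutive breakpoints of the gradient table."""
    if t < table[0][0] or t > table[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, m0), (t1, m1) in zip(table, table[1:]):
        if t0 <= t <= t1:
            return m0 + (m1 - m0) * (t - t0) / (t1 - t0)


for time_point in (0, 16.5, 30.5, 33, 45):
    print(f"t = {time_point:>4}: {methanol_percent(time_point):.1f}% methanol")
```

Read this way, the separation itself runs isocratically at 3% methanol for the first 30 time units, followed by a 90% methanol wash and re-equilibration at 3%.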
Enzymatic kinetic parameters measurement for MilA. Kinetic parameters were determined on the basis of the production of hmCMP/hmdCMP from CMP or dCMP catalysed by WT MilA. The co-substrate N5,N10-methylenetetrahydrofolate (CH2THF) was prepared as described 41. As the concentration of CH2THF is hard to determine, prior to performing the kinetic assay, CH2THF solutions prepared with 2 mM and 5 mM tetrahydrofolate (THFA) were each incubated with 1.6 μM MilA and 2 mM CMP or dCMP. Compared to the reaction with 2 mM THFA, there was no increase of either product when 5 mM THFA was applied, indicating that MilA is saturated with the CH2THF generated from 2 mM starting THFA. On the other hand, 2 mM CMP or dCMP could not be completely converted to product in either reaction. MilA (1.6 μM) was incubated with various concentrations of the substrate in 50 mM Tris-HCl, pH 7.5, for 30 min at 37 °C, and then the reactions (with a total volume of 100 μl) were quenched by the addition of trichloroacetic acid (4%) on ice. After centrifugation at 16,000 × g for 5 min, the samples were analysed by HPLC as described above. The structures of the reaction products were determined by QTOF/MS (Agilent G6530A). Kinetic parameters were calculated by fitting the enzymatic data to the Michaelis-Menten equation by non-linear regression analysis (Prism5; GraphPad Software Inc.).
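To illustrate what fitting rate data to the Michaelis-Menten equation, v = Vmax·[S]/(KM + [S]), involves, the sketch below performs a least-squares fit in plain Python. It is not the algorithm used by Prism; it is a simple stand-in that, for each trial KM, solves for the best Vmax in closed form and then searches over KM with a golden-section search. The substrate concentrations and rates in the example are synthetic, generated from assumed parameters rather than taken from the measurements.

```python
import math


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_michaelis_menten(conc, rate, km_lo=1e-6, km_hi=None, iters=200):
    """Least-squares fit returning (vmax, km).
    For a fixed km the optimal vmax is linear least squares; km itself is
    located by golden-section search on [km_lo, km_hi], which assumes the
    error profile has a single minimum in that bracket."""
    if km_hi is None:
        km_hi = 10.0 * max(conc)

    def best_vmax(km):
        x = [s / (km + s) for s in conc]
        return sum(v * xi for v, xi in zip(rate, x)) / sum(xi * xi for xi in x)

    def sse(km):
        vmax = best_vmax(km)
        return sum((v - michaelis_menten(s, vmax, km)) ** 2
                   for s, v in zip(conc, rate))

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = km_lo, km_hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    km = (a + b) / 2.0
    return best_vmax(km), km


# Synthetic, noise-free data from assumed parameters (illustration only).
TRUE_VMAX, TRUE_KM = 2.8, 0.0719
substrate_mm = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate_mm]

vmax_fit, km_fit = fit_michaelis_menten(substrate_mm, rates)
print(f"fitted Vmax = {vmax_fit:.4f}, fitted KM = {km_fit:.4f} mM")
```

Once Vmax is known, kcat follows by dividing by the enzyme concentration, provided both are expressed in compatible units. With real, noisy data a dedicated regression package is preferable because it also reports standard errors.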
Crystallization. Crystallization trials for full-length MilA were performed at 14 °C using the hanging-drop vapor-diffusion method in 48-well plates. Typically, 1 μl reservoir solution was mixed with 1 μl protein solution and equilibrated against 1 ml reservoir solution. Initial crystallization screening trials were performed using Crystal Screen, Index, PEG/Ion and SaltRx screen kits from Hampton Research. After 2 weeks, small crystals of full-length MilA were obtained from the condition that consists of 30% (w/v) polyethylene glycol 4000, 0.2 M lithium sulfate monohydrate and 0.1 M Tris-HCl, pH 8.5. Longer and thicker crystals were obtained by using 12-20% (w/v) polyethylene glycol 3350. After further optimization, diffracting crystals were obtained from 15% (w/v) polyethylene glycol 3350, 0.08 M lithium sulfate monohydrate and 0.1 M Tris-HCl, pH 8.5, using the hanging-drop vapor-diffusion method in 48-well plates at 14 °C. Given that only two methionine residues are present in MilA, we introduced a L167M mutation into MilA in order to enhance the anomalous diffraction signal. SeMet-MilA-L167M was crystallized at 14 °C in 15% (w/v) polyethylene glycol 3350, 0.08 M lithium sulfate monohydrate and 0.1 M Tris-HCl, pH 8.5. The crystals of MilA-CMP, MilA-dCMP and MilA-hmCMP complexes were obtained by crystallization in the presence of substrates from condition which consists of 0.1 M sodium cacodylate trihydrate, pH 6.5, and 1.4 M sodium acetate trihydrate. The substrate hmCMP was obtained by a one-step conversion of CMP by purified MilA, followed by the purification procedure described as reported 42. Diffraction datasets of all the crystals were collected at the BL17U1 or BL19U1 beamlines at Shanghai Synchrotron Radiation Facility (SSRF) using an ADSC Quantum 315r CCD area detector and a Pilatus 3-6 M CMOS detector, and processed using HKL2000 and HKL3000 43,44.
Structure determination. SeMet-MilA L167M crystals belonged to the P3₂21 space group and contained two molecules in the asymmetric unit. Its structure was determined by the single wavelength anomalous diffraction (SAD) method using PHENIX 45,46. Crystals of apo MilA and MilA complexed with its substrates all belonged to the P3₂21 space group, with two molecules in the asymmetric unit. Their structures were determined by the molecular replacement method with Phaser 47,48, using the structure of SeMet-MilA-L167M as the searching model. Model building was performed by Coot 49 and refinement was performed by REFMAC5 50 and Phenix 51. All the data of collection and refinement statistics are shown in Table 2.
Figure 6 (caption). In the presence of both CMP and dCMP, MilA-WT preferred to hydroxymethylate CMP and just a slight amount of hmdCMP was produced. MilA A176S had dramatically decreased its activity towards CMP, but significantly enhanced its catalytic efficiency toward dCMP. The catalytic efficiency of MilA A176S/K133R toward CMP was completely eliminated, but its efficiency to dCMP is slightly affected.
Figure 7 (caption). […], BcmA from C. botulinum (CB_BcmA), CH from T4 phage (T4_CH), dUH from phage SPO-1 (SP_dUH) and TS from E. coli (EC_TS) were aligned. Conserved residues are highlighted in dark-red background. Residues involved in ribose specificity are indicated with red stars and green triangles. Catalytic residues are represented with red triangles in the bottom. Residues involved in phosphate-binding and base-binding specificities are marked, respectively, with pink triangles and blue star in the bottom. The secondary structure of MilA is shown above the sequences.