Purification and characterization of a novel phloretin-2′-O-glycosyltransferase favoring phloridzin biosynthesis

Phloretin-2′-O-glycosyltransferase (P2′GT) catalyzes the last glycosylation step in the biosynthesis of phloridzin, which contributes to the flavor, color and health benefits of apples and processed apple products. In this work, a novel P2′GT of Malus x domestica (MdP2′GT) with a specific activity of 46.82 μkat/kg protein toward phloretin and uridine diphosphate glucose (UDPG) at an optimal temperature of 30 °C and pH 8.0 was purified to homogeneity from engineered Pichia pastoris broth by anion exchange chromatography, His-Trap affinity chromatography and gel filtration. The purified MdP2′GT carried a low level of N-glycosylation and was secreted as a stable dimer with a molecular mass of 70.7 kDa in its native form. Importantly, MdP2′GT also exhibited activity towards quercetin and adenosine diphosphate glucose (ADPG), kaempferol and UDPG, quercetin and UDP-galactose, isoliquiritigenin and UDPG, and luteolin and UDPG, producing a single glycoside in each case: isoquercitrin, astragalin, hyperoside, isoliquiritin, or cynaroside, respectively. This broad spectrum of activities makes MdP2′GT a promising biocatalyst for the industrial preparation of the corresponding polyphenol glycosides and should simplify their subsequent isolation and purification. Besides, MdP2′GT displayed the lowest Km and the highest kcat/Km for phloretin and UDPG among all previously reported P2′GTs, making it the P2′GT that most strongly favors phloridzin synthesis.

Phloretin and its glycosides belong to a subclass of polyphenols within a larger group of plant-based phenolic bioactive natural products. They exhibit a wide variety of beneficial biological activities, such as anti-oxidant 1, anti-inflammatory 2, anti-cancer 3 and anti-diabetic activities 4. Phloretin, a precursor of phloridzin, exhibits additional pharmacological functions such as anti-tumor 5 and anti-estrogen activities 6, and inhibition of cardiovascular disease 3. However, the use of phloretin as a drug and food additive has been limited by its poor aqueous solubility, chemical and/or biological instability, and low absorbability 1,3,7,8. Attachment of sugar moieties to phloretin by glycosylation can increase its aqueous solubility and in vivo half-life, and typically exerts profound direct or indirect effects on its biological activity 1,3,7,9,10.

Results
Purification of MdP2′GT. The three-step purification process used for MdP2′GT is summarized in Fig. 1 and Table 1. The first step (DEAE anion-exchange chromatography) resulted in five enzymatically active fractions (F1-F5) that differed significantly (P < 0.05) in specific activity (Fig. 1A) and gave an approximately 59-fold purification with 49.85% yield and a specific activity of 6.48 μkat/kg protein for MdP2′GT (Table 1). His-trap affinity chromatography with 80 mM imidazole as the eluent (E6, Fig. 1B) further increased the purification to 301-fold and the specific activity to 33.13 μkat/kg protein, with 13.13% yield (Table 1). Gel filtration was finally performed to remove the imidazole in E6, resulting in three fractions (G1-G3) (Fig. 1C(1)). G3, which exhibited the highest specific activity, was loaded again onto a Superdex 75 gel filtration column and eluted as a single peak (Fig. 1C(2)). Accordingly, approximately 110 mg of purified MdP2′GT was obtained from 1 L of engineered P. pastoris GS115 culture; its specific activity increased from an average of 0.11 μkat/kg protein in the crude broth to 40.45 μkat/kg protein, representing a 368-fold purification with 9.02% recovery. The single peak was further analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and Western blotting, both of which yielded a single homogeneous band (Fig. 1D), indicating that the purity of MdP2′GT was more than 95%. However, the size of purified MdP2′GT appeared larger than the theoretical value (53.9 kDa) deduced from the amino acid sequence of mature MdP2′GT fused to the His6 tag (Fig. S1).
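The fold-purification values quoted above follow directly from the reported specific activities. The short sketch below recomputes them, assuming the usual definition (specific activity of a step divided by that of the crude broth); the function and variable names are illustrative only and are not part of the original work.

```python
# Specific activities (μkat/kg protein) as reported in the text for each step.
SPECIFIC_ACTIVITY = {
    "crude broth": 0.11,
    "DEAE anion exchange": 6.48,
    "His-trap affinity": 33.13,
    "gel filtration": 40.45,
}


def fold_purification(step_activity: float, crude_activity: float) -> float:
    """Specific activity of a step relative to the crude starting material."""
    if crude_activity <= 0:
        raise ValueError("crude specific activity must be positive")
    return step_activity / crude_activity


def percent_yield(step_total_activity: float, crude_total_activity: float) -> float:
    """Total activity recovered in a step as a percentage of the crude total."""
    if crude_total_activity <= 0:
        raise ValueError("crude total activity must be positive")
    return 100.0 * step_total_activity / crude_total_activity


crude = SPECIFIC_ACTIVITY["crude broth"]
folds = {
    step: round(fold_purification(activity, crude))
    for step, activity in SPECIFIC_ACTIVITY.items()
}

for step, fold in folds.items():
    print(f"{step}: {fold}-fold")
```

Yields cannot be recomputed in the same way here, because the total activities per step are given only in Table 1; `percent_yield` is included merely to show the corresponding definition.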

Molecular weight evaluation and glycosylation analysis. According to size exclusion chromatography (SEC) and the protein standard curve (Fig. 2A,B), the molecular weight (MW) of purified MdP2′GT was approximately 70.7 kDa, which is much larger than the theoretical value of 53.9 kDa. To explain this discrepancy, the glycosylation pattern was investigated by deglycosylation with Peptide N-glycosidase F (PNGase F) and Endoglycosidase H (Endo H). Asymmetric tailing of the protein peak in SEC (Fig. 2A2) and the broad signal in SDS-PAGE (Fig. 2C1) suggested a heterogeneous glycosylation pattern in MdP2′GT. The recombinant MdP2′GT was released as a stable dimer with an approximate molecular mass of 65.3 kDa after deglycosylation with PNGase F or Endo H in its native state (Figs 2A3 and 5B,C2,C5). MALDI-TOF mass spectrometric analysis further confirmed that the MdP2′GT signal decreased from 70 kDa to 65 kDa after deglycosylation under native conditions (Fig. 2D1,2). In contrast, under denaturing conditions, deglycosylation of MdP2′GT by PNGase F generated two subunits with MWs of 35.2 kDa and 24.9 kDa (Fig. 2). However, when MdP2′GT was treated with Endo H under identical conditions, only a monomeric molecular mass of 24.5 kDa was observed (Fig. 2). Hence, approximately 23.8% of the molecular mass is attributable to N-linked carbohydrates, but approximately 30.4% of the glycosylation is resistant to PNGase F treatment under denaturing conditions. Additionally, MALDI-TOF mass spectrometric analysis revealed that asparagine residues were the dominant sites for the addition of mannose-containing oligosaccharides and that the candidate sites included Asn77, Asn95, Asn132, and Asn195. Finally, the specific activities of native and natively deglycosylated MdP2′GT toward phloretin and UDPG were determined. Interestingly, removal of the glycosylation from MdP2′GT resulted in no significant change in specific activity (Fig. 2C). Thus, the MdP2′GT purified in this study can be used directly in practice without additional treatments.
Biochemical characterization of recombinant MdP2′GT. Purified MdP2′GT displayed maximal activity at pH 8.0 and retained up to 80% of its maximal activity within the pH range of 7.0-9.0 (Fig. 3A). MdP2′GT remained stable in the pH range of 7.0-9.0 when incubated at 30 °C for 24 h (Fig. 3B). At pH 8.5, the residual activity of MdP2′GT was maximal, at > 90% of its original activity, but at pH < 6.5 the residual activity was reduced to < 50% (Fig. 3B), indicating that purified MdP2′GT preferred alkaline to acidic conditions. Additionally, purified MdP2′GT exhibited higher activity and stability in Tris/HCl buffer than in potassium phosphate buffer at pH 7.0 (Fig. 3A,B). Therefore, subsequent biochemical characterizations were performed in the Tris/HCl buffer system.
Purified MdP2′GT exhibited activity over a broad temperature range from 15 °C to 65 °C, with a maximal activity of 46.18 μkat/kg protein at 30 °C (Fig. 3C). Its activity was not obviously influenced by incubation at temperatures as high as 45 °C for 20 min but decreased at higher temperatures or longer incubation times (Fig. 3D). After incubation at 45 °C for 60 min, the enzyme retained 70% of its original activity, whereas incubation at 55 °C for 60 min decreased its relative activity to < 40%. Under identical conditions at 65 °C, only 2% of the original activity remained (Fig. 3D). As shown in Table 2, enzyme activity was completely inhibited by 5 mM Co2+ and Cu2+, partially inhibited by 5 mM Ca2+, Mn2+, Al3+, and Fe3+, and negligibly affected by 0.5 mM K+, Ni2+ and Ba2+, while 2 mM Mg2+ and 5 mM Fe2+ increased enzyme activity to 130% and 120% of the original activity, respectively. Sodium chloride, used in the purification protocol, had no obvious inhibitory effect even at concentrations as high as 500 mM. EDTA also had no obvious inhibitory effect, suggesting that MdP2′GT is not a metalloenzyme. However, MdP2′GT can be degraded by endogenous Pichia proteases and is susceptible to sulfhydryl oxidation, so adding glycerol, 4-(2-aminoethyl) benzenesulfonyl fluoride (AEBSF, a protease inhibitor) and β-mercaptoethanol (β-ME) or dithiothreitol (DTT) to the buffers used during purification is essential. No obvious effects of 5% glycerol, 0.5 mM AEBSF, 5 mM β-ME and 5 mM DTT on MdP2′GT activity were observed. Because maximum glycoside product concentrations are restricted by the aqueous solubility of the comparably hydrophobic aglycone acceptors, using a co-solvent or conducting the reaction in an aqueous-organic two-phase system can enhance the effective acceptor concentration 7,8. The co-solvents 20% methanol and 20% dimethyl sulfoxide (DMSO) also had no effect on enzyme activity, whereas strong inhibition was observed with 1 mM p-hydroxymercuribenzoate (PHMB) and 1 mM N-ethylmaleimide (NEM), indicating the presence of an SH group in the active site of MdP2′GT. MdP2′GT was also completely inhibited by UDP (a by-product) and strongly inhibited by the other tested product analogs, except uridine, at a final concentration of 5 mM. In addition, given that higher IC50 values indicate weaker inhibition, Cu2+ was the most effective metal ion inhibitor and UDP was the most potent product inhibitor.
Chromatographic data demonstrated that the purified MdP2′GT exhibited additional activities towards the chalcone (isoliquiritigenin), the flavonols (kaempferol and quercetin), and the flavone (luteolin) to produce the corresponding glycosides isoliquiritin, astragalin, hyperoside, isoquercitrin, and cynaroside. Besides, the glycosides formed in the seven donor/acceptor combinations were all O-glycosidic compounds, indicating that MdP2′GT is an O-glycosyltransferase (OGT). Remarkably, only one hydroxy group on the glycosyl acceptor was glycosylated by MdP2′GT, resulting in the production of a single glycoside in each donor/acceptor combination. These unique features suggest that MdP2′GT is a special P2′GT that not only shows high regiospecificity with respect to its sugar attachment site but also accepts a wide range of substrates and does not produce by-products during reactions.
Kinetic parameters of MdP2′GT. The apparent kinetic parameters of MdP2′GT for the seven accepted substrate combinations were determined under optimal conditions to evaluate its substrate preference. As shown in Table 4, MdP2′GT exhibited the lowest Km for UDP-glucose (7.53 μM) and phloretin (0.50 μM). The kcat values for UDPG and phloretin were 21.33 s−1 and 9.87 s−1, respectively, and were the highest among the seven combinations. Accordingly, the calculated catalytic efficiencies (kcat/Km) were the highest for UDPG (2.83 s−1 μM−1) and phloretin (19.73 s−1 μM−1). Besides, the highest specific activity (46.82 μkat/kg protein) was also observed in the UDPG/phloretin combination. Thus, UDPG and phloretin were the most preferred donor/acceptor combination for MdP2′GT among the seven combinations. Compared to reported P2′GTs, MdP2′GT exhibited a lower Km and a higher kcat/Km in the UDPG/phloretin combination than UGT88F1, making these the lowest Km and the highest kcat/Km reported (Table 5), while it displayed relatively lower kcat/Km in the UDPG/isoliquiritigenin and UDPG/luteolin combinations, producing small amounts of the 4-O-glycoside (isoliquiritin) and the 7-O-glycoside (cynaroside) (Table 4). These highly site-specific glycosylation characteristics and divergent product profiles favor the potential use of MdP2′GT as a commercial biocatalyst. Additionally, purified MdP2′GT exhibited optimal activity at pH values of 7.5 to 8.5 and temperatures from 30 °C to 40 °C depending on the combination of substrates. Strikingly, the specific activity and kinetic parameters of MdP2′GT were dramatically influenced by the sugar donor and acceptor combination. For example, MdP2′GT displayed maximal specific activity and optimal kinetic parameters in the UDPG/phloretin combination but only 10% of its maximal specific activity and the poorest kinetic parameters in the ADPG/phloretin combination, while nearly 70% of the maximal specific activity was restored in the ADPG/quercetin combination. This donor-acceptor interaction is beneficial for MdP2′GT as a promising biocatalyst because the affinity and turnover rates toward the substrate can be adjusted by changing the type of NDP-sugar or acceptor substrate.
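As a worked illustration of how the catalytic efficiencies relate to the reported Km and kcat values, the sketch below recomputes kcat/Km for UDPG and phloretin and includes the standard Michaelis-Menten rate law for reference. The function names are illustrative assumptions. Recomputing from the rounded values in the text gives about 19.74 s−1 μM−1 for phloretin, which agrees with the reported 19.73 s−1 μM−1 within rounding.

```python
def michaelis_menten(substrate_um: float, vmax: float, km_um: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation v = Vmax*[S]/(Km+[S])."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_um / (km_um + substrate_um)


def catalytic_efficiency(kcat_per_s: float, km_um: float) -> float:
    """Catalytic efficiency kcat/Km in s^-1 uM^-1."""
    if km_um <= 0:
        raise ValueError("Km must be > 0")
    return kcat_per_s / km_um


# Km (uM) and kcat (s^-1) for the UDPG/phloretin combination, as reported in the text.
efficiency_udpg = catalytic_efficiency(kcat_per_s=21.33, km_um=7.53)
efficiency_phloretin = catalytic_efficiency(kcat_per_s=9.87, km_um=0.50)

print(f"UDPG:      {efficiency_udpg:.2f} s^-1 uM^-1")
print(f"phloretin: {efficiency_phloretin:.2f} s^-1 uM^-1")
```

By construction of the rate law, the predicted rate at a substrate concentration equal to Km is half of Vmax, which is why a lower Km indicates that the enzyme approaches saturation at lower substrate concentrations.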
Molecular modeling and docking. To date, no crystal-based 3D structures are available for P2′GTs 3,14,19-21. The homology-based structure of MdP2′GT was modeled using coordinates from the crystal structure of a plant GT (Accession no. 2vg8A), which shares 34.07% sequence identity with MdP2′GT 27. As expected, MdP2′GT consisted of N- and C-terminal domains with a similar α/β/α fold (often referred to as a Rossmann fold). These two domains were connected by the interdomain linker (Fig. 5A). The "plant secondary product glycosyltransferase (PSPG)" consensus sequence of 44 amino acids, which is found in all Leloir GTs and is thought to be involved in binding of the activated NDP-sugar donor 9,19, was observed in the C-terminal domain (Fig. 5A). We also performed molecular docking studies to elucidate the interactions between MdP2′GT and its preferred substrates (UDPG and phloretin) as well as its most potent product inhibitor (UDP). Five residues (W359, A360, S382, E385, and Y399) within the PSPG motif and two residues (Y55 and S290) outside it were predicted to form hydrogen bonds with the uridine diphosphate moiety of UDPG, and the residues E401, Q402, W380, H15, and T140 were predicted to form hydrogen bonds with the 2″-OH, 3″-OH, and 6″-OH of the glucose residue of UDPG (Fig. 5B). For the sugar acceptor phloretin, the 2′-OH group gave the best fitness score by stabilizing H-bond formation to the sugar donor, the 4′-OH and 4-OH groups were predicted to form H-bonds with the residues P11 and Q189, respectively, and the 6′-OH group pointed in the opposite direction from UDPG (Fig. 5B). It is therefore not surprising that only one 2′-O-glycoside (phloridzin) was synthesized by MdP2′GT in the UDPG/phloretin and ADPG/phloretin combinations. Besides, UDP, a by-product generated during the enzyme reaction, was observed to interact with the same active site residues as the uridine diphosphate moiety of UDPG (Fig. 5C). In this case, UDP will compete with UDPG to bind irreversibly to the active site, leading to a reduction of MdP2′GT's affinity for UDPG.

Discussion
The predominant phenolic compound phloridzin (phloretin 2′-O-glucoside) in Malus species has various beneficial biological and pharmacological activities 3,11,12,19-21. Compelling evidence has suggested that P2′GT is a key enzyme catalysing the rate-limiting step in the phloridzin biosynthetic pathway 12,21. Recently, seven plant P2′GTs (UGT88F1, UGT88F2, UGT71A15 (Accession no. AAZ80472), UGT71K1 (Accession no. ACZ44835), UGT71K2 (Accession no. ACZ44837), UGT71A16 (Accession no. ACZ44836), and DicGT4 (Accession no. BAD5200)) 14,19,20 and one bacterial P2′GT (YjiC, Accession no. AAU40842) 3 have been reported, but they suffer from a narrow substrate range or low regioselectivity. Besides, the reported P2′GTs from Malus display low sequence identities with each other (34.1-46.3%) 20. Therefore, the presence of further phloretin-glycosylating enzymes with novel characteristics cannot be excluded. In this work, a novel P2′GT (MdP2′GT) with high regiospecificity with respect to the sugar attachment site and relatively broad substrate acceptance was purified from engineered P. pastoris GS115, a generally recognized as safe (GRAS) organism, and then characterized. DEAE anion-exchange chromatography, used here for P2′GT purification for the first time to our knowledge, effectively removed more than 50% of impurities and increased the enzyme specific activity approximately 60-fold through effective adsorption, demonstrating that it is a powerful tool for MdP2′GT purification. Besides, DEAE Sepharose is a weak anion-exchange resin showing much higher affinity and capacity for negatively charged biomolecules than for positively charged or neutral ones 28, signifying that MdP2′GT is a negatively charged P2′GT, in good agreement with the deduced pI of 5.8 19,20. However, the molecular mass of the purified MdP2′GT was somewhat larger than the expected MW owing to its low level of glycosylation. Similar results have been observed in other GTs 29,30, where foreign enzymes expressed in P. pastoris were reported to be hyper- or low-glycosylated when directed to secretion. This hyper- or low-glycosylation pattern can increase the MW and may affect the recombinant enzyme activity if the native structure does not contain the sugars. Fortunately, the catalytic activity of purified MdP2′GT was not affected by the low-glycosylation pattern, which might be explained by the glycosylation sites being positioned far from the predicted active site (Fig. 5B). Instead, the low N-glycosylation pattern is favorable because it frequently increases the half-lives of foreign proteins expressed in P. pastoris 30,31. In addition, the simple deglycosylation treatments in the present study might facilitate the determination of favorable conditions for advanced structural research on MdP2′GT, as the flexibility and heterogeneity of carbohydrate moieties can affect crystallization 32,33.
Our data showed that purified MdP2′GT preferred alkaline conditions, suggesting that amino acids ionized at alkaline pH might be present in the catalytic site 34. Similar behavior was observed for UGT88F1, UGT88F2, and YjiC, but not for UGT71A15, UGT71K1, UGT71A16, UGT71K2, and DicGT4. These enzymes have optimal pH values of 7.7, 8.0, 8.0, 6.75, 6.75, 6.75, 6.25, and 6.0, respectively (Table 5) 19,20, and exhibit thermo-stability at temperatures as high as 80 °C 19,20. These differences might result from glycosylation in the amino acid sequences, the binding ability of the substrates to the enzyme active sites, the backbone structure determining the overall shape of the acceptor pocket, or the isoenzyme form and the amino acid residues under ambient conditions 9,31,35. Other factors, such as expression vector and host strain, organisms, apple cultivars, genotype, as well as edaphic and environmental conditions (temperature, salinity, water stress and light intensity) 1,3,14,20, may also affect these properties. However, the observed enzyme inhibition by Co2+ and Cu2+ ions may be due not only to their effects on MdP2′GT itself but also to the destruction of phloretin, since Co2+, Cu2+ and Hg2+ have been reported to damage anthocyanins, polyphenols with properties similar to those of phloretin 36,37. Notably, similar GT inhibition by UDP (here, just 5 mM was sufficient to completely inactivate MdP2′GT) has been reported previously and addressed by glycosyltransferase-catalyzed cascade reactions that use the UDP-dependent conversion of sucrose by sucrose synthase to regenerate the UDP-glucose donor 7,8,37. The sucrose synthase reaction served not only to overcome the UDP inhibitory effect but also to solve the problem of a cost-effective supply of the NDP-sugar substrate.
P2′GTs differ considerably in their substrates and corresponding products: UGT88F1 and UGT88F2 use phloretin as substrate and are strictly regiospecific for position 2′, producing phloridzin, whereas the other reported P2′GTs convert phloretin into a mixture of mono-glycosides, di-glycosides and tri-glucosides 3,14,19-21. In addition to phloretin, UGT71A15, UGT71K1, and UGT71K2 also accept isoliquiritigenin, kaempferol, and quercetin as substrates 20,21, and DicGT4 accepts kaempferol and quercetin as substrates 9. Another polyphenol, luteolin, can be utilized by UGT71A15 and UGT71A16 as well 20. Besides these polyphenols, the reported P2′GTs (except UGT88F1) also accept other compounds as substrates (Table 5) 14,20,38. In the present study, MdP2′GT transferred the glucose moiety from activated UDPG to phloretin specifically at the 2′-OH position to form phloridzin, which is similar to UGT88F1 and UGT88F2 but different from UGT71A15, UGT71K1, UGT71K2, UGT71A16, YjiC, and DicGT4. Similar to other reported P2′GTs (except UGT88F1 and UGT88F2), MdP2′GT exhibited additional high activity towards isoliquiritigenin, kaempferol, quercetin, and luteolin; the corresponding glycosides formed, however, differ markedly. For example, phloretin and isoliquiritigenin differ only in the flexible open-chain three-carbon linker connecting the two aromatic rings, which carries a C7-C8 single bond in phloretin and a C7-C8 double bond in isoliquiritigenin (Fig. 6) 12,23. Given this high structural similarity, the 2′-OH of isoliquiritigenin would be expected to be glycosylated in the UDPG/isoliquiritigenin combination, whereas only the 4-O-glucoside (isoliquiritin) was detected (Fig. 4). This might be because the amount of 2′-O-glucoside was too low to be detected, but it was still not detected in scaled-up reactions.
We speculate that the presence of a C7-C8 double bond on the flexible open-chain three-carbon linker might create steric hindrance at the 2′-OH and make the 4-OH amenable to glycosylation, based on similar findings in previous studies 9,10,39,40. Besides, isoquercitrin and hyperoside, which are native to apples or accumulate there as their aglycones 41,42, could be synthesized by MdP2′GT but not by the reported P2′GTs. Reported P2′GTs such as UGT71A15, UGT71K1, UGT71K2, and DicGT4 can accept quercetin as substrate, but the sugar donor must be UDPG, and the glycosides formed were quercetin-3-O-glucoside (quercitrin) and 7-O-glucoside, not isoquercitrin 14,20. Furthermore, the affinities of MdP2′GT for UDPG and phloretin, the substrates for phloridzin synthesis 12,21, are also relatively high compared to reported P2′GTs. The Km values of MdP2′GT at the optimal pH of 8.0 and temperature of 30 °C are 0.61% to 80.65% of those of reported P2′GTs from other apple cultivars or organisms (Table 5) 19,21. The kcat/Km values of MdP2′GT are 1.25-fold to three orders of magnitude higher than those of reported P2′GTs for phloridzin synthesis (Table 5) 19,21. Therefore, MdP2′GT shows the best enzymatic efficiency and favors phloridzin synthesis the most. MdP2′GT also shows extraordinary catalytic properties for phloridzin synthesis: 370 mg/L phloridzin (850 μM) was obtained from 1.0 mM phloretin with purified MdP2′GT as catalyst, which exceeds all reported phloridzin yields (1.29 × 10−2 mg/L 19, 310 mg/L 7, 1.18 × 10−4 mg/L 21, 5.27 mg/L 3, and 13.2 mg/L 14). These significant differences between MdP2′GT and reported P2′GTs cannot be attributed solely to their primary sequence identity (13.2-92.6%, Table 6), as illustrated by the two reported plant P2′GTs UGT88F1 and UGT88F2.
These two P2′GTs both belong to the UGT88 family and share 99.0% amino acid sequence identity (Table 6) 20, but UGT88F2 displays a higher optimal pH, temperature, and phloridzin yield and exhibits additional activity in hydrolyzing phloridzin to phloretin (Table 5) 7,8,20, whereas UGT88F1 exclusively glycosylates the 2′-OH of phloretin 19. Two other UGTs, UGT73A5 and UGT71F2 from Dorotheanthus bellidiformis, provide an example of the contrary. These two UGTs show ~20% amino acid sequence identity but glucosylate the same set of acceptor molecules forming the same products 9  93.1% amino acid sequence identity (Table 6) but they also use the same compounds as sugar donors and acceptors and produce the same glycosides 20. It has been reported that highly conserved secondary and tertiary structures are the key factors that influence the substrate specificity and kcat/Km values of GTs 9,10,27,43. In structural analysis, MdP2′GT features an α/β/α/β structure (Fig. 5A), while the reported P2′GTs adopt a β/α/β fold in the PSPG motif 14,19,20. This extra α-helix can create a more favorable hydrophobic environment that offers more stabilizing interactions between substrates and MdP2′GT than in reported P2′GTs. Moreover, the glycine residues in the PSPG motif (Fig. S1) have small side chains, which can give substrates enough space to bind in the active pocket of MdP2′GT. These unique structural features might not only give MdP2′GT a lower Km toward UDPG than the reported P2′GTs but might also confer additional activity toward the other UDP-sugar donor, UDP-Gal. The interdomain linker is another crucial factor determining substrate binding owing to its high flexibility with respect to length and sequence in UGTs 9,10,27,40,44.
Both the linker span and sequence of MdP2′GT were the same as those of UGT88F1 and UGT88F2 but different from those of other P2′GTs 3,14,19,20, and MdP2′GT shared high sequence identity with UGT88F1 (92.6%) and UGT88F2 (92.1%) but low identity with other P2′GTs (13.2-33.6%, Table 6), which might explain why ADPG could be utilized as another sugar donor by MdP2′GT, UGT88F1 and UGT88F2 but not by the other P2′GTs. Changes in the PSPG motif and the interdomain linker region that affect the affinity and types of sugar donor at the enzyme binding sites have also been proposed for other crystallized GTs 9,10,39,40. The highly divergent residues in the acceptor pocket confer on GTs large differences in their individual range of acceptors 9,10,27,40. For MdP2′GT, the functional 2′-OH group of phloretin can be positioned near the His residue (H15) that acts as the general base facilitating deprotonation of the acceptor, while reported P2′GTs such as YjiC, UGT71K1 and UGT71K2 carry a Gly residue, UGT71A15 and UGT71A16 carry an Ile residue, and DicGT4 carries a large phenylalanine residue at this position. These residues pointing into the acceptor binding pocket may hamper deprotonation of the acceptor and thereby affect the catalytic efficiency (kcat/Km). Similar findings were observed in previous reports 9,10. Besides, inter- and intradomain interactions, which stabilize the secondary and tertiary structure in the form of S-S bridges, salt bridges and H-bonds 9,40, are important for activity and specificity as well. Furthermore, the differences in kcat/Km between MdP2′GT and reported P2′GTs may also be related to the enzyme conformation stabilized by glycosylation 33. After glycosylation, the usual binding of the enzyme to the substrate is not impeded or prevented by ambient factors, which is commonly reflected by an increase in kcat/Km. Similar results were observed in previous reports 31,35, but opposite results exist as well 29,33.
These opposing studies reported that glycosylation of enzymes is usually accompanied by a reduction in substrate affinity due to steric crowding caused by the presence of carbohydrate molecules near the substrate binding site. This discrepancy needs further investigation.
Among the accepted polyphenols, MdP2′GT preferred phloretin the most, followed by kaempferol and isoliquiritigenin, with luteolin the least preferred when UDPG was used as the sugar donor (Table 4). Compared to phloretin, kaempferol and luteolin contain an extra C ring (Fig. 6) that may affect their binding affinity. The large aromatic residue Tyr399 in the predicted acceptor binding pocket (Fig. 5B) could create steric hindrance to the extra C ring and locally affect the binding affinity. In contrast, there is no steric hindrance to the flexible open-chain three-carbon linker, making the functional 2′-OH group of phloretin more amenable to glycosylation. However, MdP2′GT exhibits a lower kcat/Km toward isoliquiritigenin than toward kaempferol, although isoliquiritigenin has a structure similar to phloretin, with a flexible open-chain three-carbon linker. This discrepancy might be related to the presence of a C7-C8 double bond, the lack of an OH group at the 6′ position of the A ring, and the absence of isoliquiritigenin in apples 41,42. Luteolin lacks one OH group at the C3 position of the C ring but has one extra OH group at the C3′ position of the B ring compared to kaempferol (Fig. 6), which leads to the sugar moiety of UDPG being transferred to the 7-OH group on the A ring. Also, luteolin is not a natural constituent of apple fruit, whereas kaempferol is 41,42. All these differences make MdP2′GT exhibit its lowest kcat/Km toward luteolin. In addition, quercetin contains one extra OH group at the C3′ position of the B ring compared to kaempferol (Fig. 6); we speculate that the residue Tyr399 produces stronger hindrance to quercetin than to kaempferol, making MdP2′GT exhibit a higher kcat/Km toward kaempferol than toward quercetin. However, the kcat/Km of MdP2′GT in the UDPG/kaempferol combination was much lower than that in the ADPG/quercetin and UDP-Gal/quercetin combinations (Table 4).
On the other hand, the results indicate that the activated NDP-sugars may also influence the regioselectivity and catalytic efficiency of Leloir GTs. One previous study also reported that activated NDP-sugars can influence the conformation of the enzyme, resulting in improved bioactivity 45. This is also the first report of donor-acceptor interactions during the final glycosylation step of phloridzin biosynthesis using a phloretin glycosyltransferase originating from apples.
As a novel P2′GT that is highly regiospecific with respect to the sugar attachment site and favors phloridzin biosynthesis the most, MdP2′GT illustrates a distinctive set of key residues for substrate recognition. Moreover, its outstanding regiospecificity and broader substrate acceptance make MdP2′GT a promising catalyst for the industrial preparation of phloridzin, isoquercitrin, hyperoside, and astragalin. Because no information on its crystal structure is available, the catalytic mechanism of MdP2′GT should be further elucidated in future studies by site-directed mutagenesis or by resolving the crystal structure of MdP2′GT or of an MdP2′GT-substrate complex.

Methods
MdP2′GT activity. The MdP2′GT activity assay was performed following a previous method with minor modifications 19. Reaction mixtures consisted of 0.5 ml buffer A (50 mM Tris/HCl, pH 7.5, and 5 mM DTT), 270 μM UDPG, 20 μM phloretin (dissolved in DMSO) and 20 μg purified MdP2′GT. After incubation at 30 °C for 30 min, the reactions were immediately terminated by freezing in liquid N2 and lyophilized. Methanol (0.25 ml) was then added to dissolve the components, and the product phloridzin was quantified by the HPLC-DAD method on a WondaSil® C18 column (4.6 × 250 mm, ID = 5 μm, Shimadzu, Kyoto, Japan) as described in our previous study 42. One unit of MdP2′GT activity was defined as the amount of enzyme needed to produce 1 μmol product per second (s) at 30 °C and pH 7.5, and specific activity was expressed as μkat/kg protein.
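The unit definition above can be made concrete with a short calculation. The sketch below converts an amount of product formed in the assay into a specific activity in μkat/kg protein, using the assay time (30 min) and enzyme amount (20 μg) given in the text. The product amount is a hypothetical illustrative value, not a measured result, and the function name is an assumption.

```python
def specific_activity_ukat_per_kg(
    product_umol: float, time_s: float, protein_ug: float
) -> float:
    """Specific activity in ukat/kg protein.

    1 ukat = 1 umol of product formed per second; 1 ug = 1e-9 kg.
    """
    if time_s <= 0 or protein_ug <= 0:
        raise ValueError("time and protein amount must be positive")
    protein_kg = protein_ug * 1e-9
    return product_umol / time_s / protein_kg


# Assay conditions from the text: 30 min incubation, 20 ug purified enzyme.
ASSAY_TIME_S = 30 * 60
ENZYME_UG = 20.0

# Hypothetical amount of phloridzin quantified by HPLC (illustrative only).
hypothetical_product_umol = 1.7e-3

activity = specific_activity_ukat_per_kg(
    hypothetical_product_umol, ASSAY_TIME_S, ENZYME_UG
)
print(f"{activity:.2f} ukat/kg protein")
```

Under these assumptions, a specific activity in the range reported for the UDPG/phloretin combination corresponds to less than 2 nmol of phloridzin formed per assay.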

Expression and purification of MdP2′GT. The open reading frame of MdP2′GT was sub-cloned into the SnaBI and NotI sites of pPIC9K (Novagen) and transformed into P. pastoris GS115 (Novagen) by electroporation 46,47. Engineered P. pastoris GS115 carrying pPIC9K/MdP2′GT was cultured in a 5 L bioreactor (Applikon® Biotechnology, Foster City, USA) with a working volume of 1 L buffered minimal sorbitol medium (BMSM; 100 mM potassium phosphate, pH 7.0, 1.34% YNB, 4 × 10−5% biotin, and 5.0% sorbitol) in the presence of G418 (4.0 mg/ml) overnight at 28 °C and 280 rpm, and was induced at an optical density (OD600) of 10.0-15.0 with 0.75% methanol at 25 °C for 122 h 46,47. The resulting supernatant was harvested, dialyzed and lyophilized to a powder. The powder was dissolved in buffer B (50 mM Tris/HCl, pH 7.5, 2 mM MgCl2, 5 mM DTT, 0.5 mM AEBSF, and 5% glycerol), filtered through a 0.22-μm membrane (Millipore) and then loaded onto a pre-equilibrated diethyl-aminoethanol (DEAE) Sepharose column (HiTrap-DEAE-FF, 1 ml, Amersham Biosciences) operated with an AKTA purifier system (Amersham Biosciences, Uppsala, Sweden) 28. The column was first washed with 5 column volumes (CVs) of buffer B and then eluted with a linear gradient of NaCl (0 to 0.5 M) in buffer B at a flow rate of 1.0 ml/min. The fractions of interest (identified by activity towards UDPG and phloretin) were collected and dialyzed against buffer C (50 mM Tris/HCl, pH 7.5, 200 mM NaCl, 2 mM MgCl2, 5 mM DTT, and 0.5 mM AEBSF) at 4 °C. The dialyzed solution was further purified under native conditions on a gravity flow column (10 ml, GE Healthcare) containing 1 ml cOmplete His-Tag Purification Resin (Roche CH 05-893-682-001) pre-equilibrated with buffer C 48. The column was washed stepwise with 20 CVs of 40-500 mM imidazole in buffer C.
The fractions of interest, judged by SDS-PAGE and enzyme activity assays, were applied to the gel filtration on a Superdex 75 10/300 GL column (24 ml, Amersham Biosciences) using buffer C as the eluent 49 . The final purified fractions were pooled, desalted, and concentrated using a 10 K membrane ultracentrifugation system (Ultrace-10, Millipore, Massachusetts, USA) 49 . The enzyme purity was estimated by specific activity as described in enzyme activity assay, Coomassie Blue R-250 (Bio-Rad, CA, USA) stained 8% SDS-PAGE, and Western blot with an anti-His antibody 19 , respectively. The total protein was determined by the Bradford method using a Bradford kit with bovine serum albumin as the standard 47,48 .
Molecular weight evaluation. The concentration of the purified MdP2′GT solution was adjusted to 5 mg/ml and loaded onto a Sephacryl S-300 column (GE Healthcare, Piscataway, USA; 1.6 × 60 cm) pre-equilibrated with buffer D (50 mM Tris-HCl, pH 7.5, 200 mM NaCl, and 5 mM DTT) for size-exclusion chromatography (SEC) 31 . Five marker proteins (75 kDa conalbumin, 67 kDa bovine serum albumin, 45 kDa egg albumin, 29 kDa carbonic anhydrase, and 14.9 kDa lysozyme) were used to calibrate the column and generate a standard curve of the partition coefficient Kav (abscissa) against lgMr (ordinate). Kav was calculated based on the molecular mass and the elution volume of each protein; Mr is the molecular mass of the corresponding marker protein.
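The calibration described above can be sketched as follows. The text does not give the formula used for Kav, so this sketch assumes the standard gel-filtration definition Kav = (Ve − V0)/(Vt − V0). The void volume, all elution volumes and the function names are hypothetical placeholders; only the marker masses and the column dimensions come from the text, and the geometric column volume is used as an approximation of Vt.

```python
import math


def k_av(ve, v0, vt):
    """Partition coefficient, assuming K_av = (Ve - V0) / (Vt - V0)."""
    return (ve - v0) / (vt - v0)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_mr(ve, v0, vt, slope, intercept):
    """Molecular mass (kDa) read off the lgMr versus K_av standard curve."""
    return 10 ** (slope * k_av(ve, v0, vt) + intercept)


# Total column volume approximated from the 1.6 x 60 cm geometry (ml).
VT = math.pi * (1.6 / 2) ** 2 * 60
V0 = 40.0  # hypothetical void volume (ml)

# (Mr in kDa from the text, hypothetical elution volume in ml)
MARKERS = [(75.0, 62.0), (67.0, 64.0), (45.0, 70.0), (29.0, 77.0), (14.9, 88.0)]

kavs = [k_av(ve, V0, VT) for _, ve in MARKERS]
logs = [math.log10(mr) for mr, _ in MARKERS]
slope, intercept = fit_line(kavs, logs)

# Hypothetical elution volume of the sample peak.
print(f"estimated Mr: {estimate_mr(63.0, V0, VT, slope, intercept):.1f} kDa")
```

Keeping Kav on the abscissa and lgMr on the ordinate, as in the text, means the fitted line can be evaluated directly at the sample's Kav.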

Glycosylation analysis. Glycosylation analysis was performed by enzyme treatment with PNGase F and Endo H according to the methods described by Luciana Facchinetti et al. 31 and Schlenzig et al. 35 . The sizes of the deglycosylated MdP2′GT, PNGase F and Endo H were analyzed by SDS-PAGE and SEC as described in the molecular weight evaluation section. The native and deglycosylated MdP2′GT bands in the SDS-PAGE gel were excised and analyzed by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF/MS) in the linear mode as described by Schlenzig et al. 35 .
Optimum pH and temperature and stability. The optimal pH was assessed in the pH range of 4.0-10.0 using 100 mM potassium phosphate buffer (pH 4.0-7.0) and Tris/HCl buffer (pH 7.0-10.0) at 30 °C. The optimal temperature was determined in the temperature range of 15-75 °C at pH 8.0. For the pH stability evaluation, MdP2′GT was incubated at 30 °C for 24 h at pH values varying from 6.5 to 9.5. For the temperature stability assessment, MdP2′GT was incubated at pH 8.0 at several temperatures (25, 35, 45, 55, and 65 °C) for different periods of time (0, 10, 20, 30, 40, 50, and 60 min). The residual activity of MdP2′GT was determined as described in the enzyme activity assay section and expressed as the relative activity compared with the initial activity, which was considered 100%.

Effects of metal ions and other compounds. The effects of metal ions (monovalent, divalent, and trivalent metal ions), a chelating agent (EDTA), protectants (glycerol and AEBSF), thiol-containing compounds (β-ME and DTT), thiol inhibitors (PHMB and NEM), substrate co-solvents (methanol and DMSO), substrate analogs (UDP, uridine triphosphate (UTP), uridine monophosphate (UMP), and uridine) and product analogs (trilobatin and 3-hydroxyphloridzin) on the activity of purified MdP2′GT were examined as described in the enzyme activity assay section. The inhibition effects were calculated and expressed as relative activity (%). The half-maximal inhibitory concentration (IC50) values, defined as the concentration of inhibitor required to reduce the original MdP2′GT activity by 50%, were used to evaluate the effects.
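The text does not state how the IC50 values were derived from the relative-activity data. As an illustration only, one common approach is to interpolate linearly on a log10 concentration scale between the two tested concentrations that bracket 50% relative activity. The function name and the example data below are hypothetical.

```python
import math


def ic50_by_interpolation(concs, rel_activity):
    """Estimate IC50 by log-linear interpolation.

    concs: inhibitor concentrations in increasing order (all > 0).
    rel_activity: relative activity (%) at each concentration,
                  with the uninhibited control taken as 100%.
    Returns None if 50% activity is not bracketed by the data.
    """
    points = list(zip(concs, rel_activity))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            # Fraction of the way from c_lo to c_hi (on the log scale)
            # at which the activity crosses 50%.
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical dose-response series (concentration in mM, activity in %).
print(ic50_by_interpolation([0.1, 1.0, 10.0], [92.0, 61.0, 18.0]))
```

A full dose-response fit (for example a four-parameter logistic model) would be an alternative when enough concentrations are available; interpolation is shown here only because it needs no fitting library.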
Substrate specificity assays. Nine potential sugar donors, including UDPG, ADPG, guanosine diphosphate glucose, cytidine diphosphate glucose, thymidine diphosphate glucose, UDP-Gal, UDP-xylose, UDP-rhamnose, and UDP-pentose, and five types of potential sugar acceptors, including hydroxycinnamic acids (chlorogenic acid, caffeic acid, 4-coumaric acid, 3-coumaric acid, and 2-coumaric acid), chalcones (phloretin, isoliquiritigenin, naringenin chalcone, eriodictyol chalcone, and butein), flavanols (catechin, epicatechin, epigallocatechin, epicatechin gallate, and epigallocatechin gallate), flavonols (quercetin, dihydroquercetin, rutin, kaempferol, and myricetin), and flavones (baicalein, luteolin, eriodictyol, chrysin, and apigenin) were evaluated as potential MdP2′GT substrates. For chromatographic identification of the enzyme products, substrate specificity assays were scaled up to a total volume of 20 ml containing approximately 1.0 mg of purified MdP2′GT, 1.0 mM acceptor substrate, and 10.8 mM donor substrate in buffer A. The reactions were performed at 30 °C and pH 7.5 for 15 h and terminated by adding 50 μl of glacial acetic acid. After freezing in liquid N2 and lyophilization, the powder containing the enzyme products was re-dissolved in 10 ml of methanol and then filtered through a 0.22-μm membrane (Millipore). The formed glycosides were first evaluated using the same HPLC-DAD method as described in our previous study 42 . Then, the target HPLC peaks were subjected to LC-MS/MS on a 4000 QTrap system (Applied Biosystems, Foster City, CA, USA) equipped with an electrospray ionization (ESI) interface and an HPLC system (Shimadzu, Tokyo, Japan) 50 . The chemical structures of target glycosides were further confirmed by NMR assays in DMSO-d6 solvent (Sigma-Aldrich, St. Louis, USA) 24 . 1H NMR and 13C NMR spectroscopic analysis was performed on a Bruker Avance 500 MHz spectrometer (Bruker Corp., Billerica, USA). Chemical shifts (δ, ppm) and coupling constants (J) were determined using tetramethylsilane (TMS) as an internal standard.
Kinetic parameters. The kinetic parameters of MdP2′GT were determined by measuring its activity in the presence of various concentrations of sugar donor and acceptor substrates. For the donor substrates, the acceptor substrate concentration was fixed at 1.0 mM, with donor concentrations varying from 27 μM to 25.4 mM. For the acceptor substrates, the donor substrate concentration was fixed at 10.8 mM, with acceptor concentrations varying from 4.0 μM to 2.0 mM. Kinetic constants (Km, Vmax, kcat) were calculated from Lineweaver-Burk plots. All assays were performed in triplicate.
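A Lineweaver-Burk analysis of this kind can be sketched as below. The double-reciprocal form 1/v = (Km/Vmax)(1/[S]) + 1/Vmax is fitted by ordinary least squares, and kcat is taken as Vmax divided by the molar enzyme concentration. The function names and the synthetic rate data are hypothetical; the rates are generated from an assumed Km and Vmax purely to show that the fit recovers them, and are not measurements from this work.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax and return (Km, Vmax)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    vmax = 1.0 / intercept   # y-intercept is 1/Vmax
    km = slope * vmax        # slope is Km/Vmax
    return km, vmax


def k_cat(vmax, enzyme_conc):
    """Turnover number; vmax and enzyme_conc must share concentration units."""
    return vmax / enzyme_conc


# Synthetic, noise-free Michaelis-Menten data (assumed Km = 0.05 mM,
# Vmax = 2.0 in arbitrary rate units) over the 4.0 µM to 2.0 mM range.
TRUE_KM, TRUE_VMAX = 0.05, 2.0
substrate = [0.004, 0.02, 0.1, 0.5, 2.0]  # mM
velocity = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

km, vmax = lineweaver_burk(substrate, velocity)
print(f"Km = {km:.4f} mM, Vmax = {vmax:.4f}")
```

With real triplicate data the double-reciprocal transform weights the low-concentration points heavily, so the fitted constants are sensitive to error in the smallest rates.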