Structural insight into D-xylose utilization by xylose reductase from Scheffersomyces stipitis

Lignocellulosic biomass, of which D-xylose accounts for approximately 35% of the total sugar, has attracted attention as a future energy source for biofuel. To elucidate the molecular mechanism of D-xylose utilization, we determined the crystal structure of D-xylose reductase from Scheffersomyces stipitis (SsXR) at 1.95 Å resolution. We also determined the SsXR structure in complex with the NADPH cofactor and revealed that the protein undergoes an open/closed conformational change upon NADPH binding. The substrate binding pocket of SsXR is somewhat hydrophobic, which seems to result in low binding affinity for the substrate. Phylogenetic tree analysis showed that bacterial/archaeal AKR enzymes annotated as XRs belong to uncharacterized AKR families and might have no XR function, whereas yeast/fungi-derived enzymes, which belong to the same group as SsXR, are candidate XRs for increasing xylose consumption.

from Candida tenuis (CtXR) was reported 36; however, the crystal structure of SsXR, which has been introduced into S. cerevisiae and Y. lipolytica for D-xylose consumption, has not yet been determined. Here, we report the crystal structures of D-xylose reductase from S. stipitis (SsXR) in its apo form and in complex with the NADPH cofactor. We reveal that the protein undergoes an open/closed conformational change upon binding of NADPH and has a somewhat hydrophobic D-xylose binding pocket. Phylogenetic tree analysis showed that bacteria/archaea-derived XR-annotated AKRs might have no XR activity, and that yeast/fungi-derived XRs are candidates for increasing xylose consumption.

Results and Discussion
Overall structure of SsXR. To elucidate the molecular mechanism of SsXR, we determined its crystal structure at 1.95 Å resolution. The refined structure showed good X-ray crystallographic statistics for bond angles, bond lengths, and other geometric parameters (Table 1). The overall structure of SsXR is similar to that of other AKR superfamily enzymes. The SsXR monomer is composed of 15 α-helices (α1-α15) and 10 β-strands (β1-β10) (Fig. 1b) and consists of a core domain and two auxiliary regions (ARs), AR-I and AR-II. The core domain consists of 13 α-helices (α1-α10, α12-α13, and α15) and eight β-strands (β3-β10) and forms a TIM-barrel motif (Figs 1b and 2a). As in the conventional TIM-barrel motif, eight parallel β-strands (β3-β10) are arranged in a cylindrical shape with eight surrounding α-helices (α1, α3, α5-α8, and α12-α13). Four α-helices (α2, α9-α10, and α15) are located at the back of the TIM-barrel motif and contribute to binding of the NADPH cofactor (Fig. 2a). AR-I is composed of two β-strands (β1-β2) and is located on the opposite side of the TIM-barrel. AR-II consists of two α-helices (α11 and α14) and is positioned next to the α12 helix (Fig. 2a). As reported for other enzymes belonging to AKR families, four catalytic residues, Asp43, Tyr48, Lys77, and His110, are conserved in SsXR, and a catalytic mechanism can be proposed 37 (Fig. 2b). It has been reported that AKR enzymes function as monomers, dimers, or tetramers 37, and SsXR is known to exist as a dimer in the presence of 50 mM NaCl 38. In our current structure, there are two SsXR polypeptide chains in the asymmetric unit, forming a dimer. PISA and EPPIC calculations also indicated that the protein exists as a dimer, although the predicted interaction was weak. We then performed size-exclusion chromatography experiments under various NaCl concentrations.
Unexpectedly, SsXR mostly separated into monomers in 50 and 100 mM NaCl and completely separated into monomers in 150 mM NaCl, although it existed as dimers in the absence of NaCl (Figs 2c, S1). Based on these results, we suggest that SsXR exists as a monomer at physiological NaCl concentrations and tends to form a dimer at low NaCl concentrations. Furthermore, to compare the enzyme activity according to the oligomeric state, kinetic values were measured at 0 mM and 150 mM NaCl. Interestingly, all values at 0 mM NaCl were better than those at 150 mM NaCl, suggesting that oligomer formation affects enzyme activity.
Open/closed conformation for cofactor binding of SsXR. AKR Fig. 3a). The NADPH cofactor is bound at the back of the TIM-barrel motif (Fig. 2a). The residues Phe216, Gln219, Glu223, Phe236, Ala253, Lys270, Asn272, Arg276, Glu279, and Asn280 contribute to formation of the nucleotide binding pocket (Fig. 3b). The adenine ring is stabilized by Glu279 and Asn280, and the 2′-phosphate group of NADPH is stabilized by Gln219, Asn272, and Arg276 through hydrogen bonds. The Glu223 and Lys270 residues form hydrogen bonds with the ribose moiety (Fig. 3b). Two serine residues, Ser214 and Ser220, are involved in stabilizing the pyrophosphate moiety, and the nicotinamide ring of NADPH is stabilized by the residues Asp43, Tyr48, His110, Ser165, Asn166, Gln187, and Tyr213 through hydrogen bond networks (Fig. 3c). Interestingly, when we superimposed the SsXR structure in the apo form on that in complex with NADPH, we observed a large structural difference near the NADPH binding site (Fig. 3d). In the NADPH-bound form, the α9 helix and the α9-α10 connecting loop (Gly217-Ser232) were in the closed conformation without crystal contacts, and the Gln219 and Glu223 residues formed hydrogen bonds with the phosphate atom of the phosphoribose moiety (Fig. 3b,d,e). In contrast, in the apo form the region moved away from the NADPH binding site by 11.1 Å and formed an open conformation (Fig. 3d,f). Based on these observations, we propose that SsXR undergoes an open/closed conformational change upon binding of the NADPH cofactor.
D-Xylose binding mode of SsXR. To elucidate the binding mode of the D-xylose substrate, we performed molecular docking simulations of SsXR with D-xylose. The simulations showed that D-xylose fits well into the predicted substrate binding pocket (Fig. 4a). The D-xylose binding pocket is constituted by 10 residues: Trp20, Asp47, Trp79, His110, Phe111, Phe128, Phe221, Leu224, Asn306, and Trp311 (Fig. 4a,b).
The Asp47 residue contributes to the stabilization of two hydroxyl groups (OH2 and OH3), and the aldehyde group of D-xylose is stabilized by Asn306 through hydrogen bonding (Fig. 4b). The residues involved in formation of the D-xylose binding pocket were confirmed by site-directed mutagenesis experiments (Fig. 4c). Most mutants, including W20A, D47A, W79A, H110A, F111A, F128A, F221A, N306A, and W311A, exhibited almost complete loss of enzyme activity, indicating that these residues are crucial for D-xylose binding. However, the L224A mutant retained 35% of the wild-type XR activity, indicating that the Leu224 residue does not influence substrate binding as much as the other residues (Fig. 4b,c). Interestingly, the D-xylose binding pocket is somewhat hydrophobic: seven of the 10 residues that form it are hydrophobic (Fig. 4a,b). Considering that D-xylose is highly hydrophilic, with four hydroxyl groups and one aldehyde group, a pocket optimized for this substrate would be expected to contain more hydrophilic residues. Thus, the hydrophobicity of the substrate binding site of SsXR suggests that the binding affinity of the enzyme for D-xylose is low. We then performed kinetic analysis of the enzyme, and the Km value for D-xylose was as high as 39.4 mM (Table 2). Because the Km value for D-xylose was much higher than that for the NADPH cofactor (0.0277 mM) (Table 2), the bottleneck of SsXR activity appears to be the binding affinity for D-xylose.
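To make the consequence of these Km values concrete, the following minimal Python sketch estimates how far each substrate saturates the enzyme at the concentrations used in the assays described in the Methods (100 mM D-xylose, 0.2 mM NADPH). It assumes simple single-substrate Michaelis-Menten behavior for each substrate with the other held fixed, which is a simplification and not necessarily the kinetic model used for Table 2; the function name is illustrative only.

```python
def fractional_saturation(substrate_mM, km_mM):
    """Return v/Vmax = [S] / (Km + [S]) for simple Michaelis-Menten kinetics."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


# Km values reported in the text (Table 2).
KM_XYLOSE_MM = 39.4
KM_NADPH_MM = 0.0277

# Substrate concentrations taken from the assay conditions in the Methods.
xylose_saturation = fractional_saturation(100.0, KM_XYLOSE_MM)
nadph_saturation = fractional_saturation(0.2, KM_NADPH_MM)

print(f"D-xylose at 100 mM: {xylose_saturation:.1%} of Vmax")
print(f"NADPH at 0.2 mM:    {nadph_saturation:.1%} of Vmax")
```

Under this assumption, even 100 mM D-xylose leaves the enzyme well short of saturation, whereas sub-millimolar NADPH comes considerably closer, in line with the interpretation that D-xylose binding is the limiting factor.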
Phylogenetic tree analysis and classification of XRs. Because utilization of D-xylose by biofuel-producing organisms, such as S. cerevisiae and Y. lipolytica, requires heterologous introduction of XR enzymes, selection of a highly efficient XR is a key issue for high biofuel production 4. To explore XR candidates that can be introduced into microorganisms for xylose consumption, enzymes annotated as XR in the NCBI and UniProt servers were analyzed (Table S1). There are 161 AKR enzymes (67 yeast/fungal, 83 bacterial, and 11 archaeal enzymes) annotated as XR, and these enzymes can be divided into three groups by phylogenetic tree analysis (Fig. 5a). Interestingly, only yeast- and fungi-derived enzymes belonged to the same group as SsXR, and all enzymes from bacteria and archaea belonged to groups different from that of SsXR. Furthermore, amino acid sequence alignment classified the 161 AKR enzymes in the same way as the phylogenetic analysis, and only yeast/fungi-derived enzymes have amino acid sequences similar to that of SsXR (Fig. S2). In particular, when the residues involved in substrate binding of SsXR were compared with each other, there were many gaps in the bacterial/archaeal AKR enzymes (Fig. 5b). These observations indicate that the bacterial/archaeal AKR enzymes annotated as XRs belong to uncharacterized AKR families (UNC AKR I and II) and might have no XR function. To confirm that the enzymes belonging to the UNK AKR families cannot use xylose as a substrate, we measured XR activity using a purified AKR from E. coli, an enzyme belonging to UNK AKR II (NCBI and UniProt accession codes WP_001199831 and C3TM25, respectively), and the enzyme showed no activity against xylose (Fig. 4c). Consequently, we propose that the yeast/fungi-derived enzymes, which belong to the same group as SsXR, can be candidates for XR.
In addition, when we compared the amino acid residues involved in the formation of the xylose binding pocket, only four of the ten residues are conserved in the yeast/fungi-derived XRs, and various amino acids are positioned in the xylose binding pocket (Figs 5b, S2). Therefore, structural and biochemical studies on other yeast/fungi-derived XRs are required to select more efficient XRs and to increase biofuel productivity using lignocellulosic biomass.
We report the crystal structure of SsXR in the apo form and in complex with the NADPH cofactor and provide structural insight into the open/closed conformational change upon cofactor binding. Molecular docking simulations of SsXR with the D-xylose substrate and kinetic analysis of the protein revealed why SsXR shows low binding affinity for the substrate. Moreover, based on phylogenetic tree analysis and detailed amino acid sequence alignment, we propose that various yeast/fungi-derived XRs can be candidates for more efficient XRs. These results may be useful for developing methods with much higher D-xylose utilization from lignocellulosic biomass in biofuel production.

Material and Methods
Cloning, expression and purification. The SsXR coding gene was amplified by PCR from a gene synthesized by Bioneer (Republic of Korea) using the primers forward: 5′-GCGCCATATGCCTTCTATTAAGTTGAACTCTGGTTAC-3′ and reverse: 5′-GCGCCTCGAGTTAGACGAAGATAGGAATCTTGTCC-3′. The PCR product was digested with restriction enzymes (NdeI and XhoI) and sub-cloned into the pProEX-HTa vector (Thermo Fisher Scientific), which contains a 6×His tag with an rTEV protease cleavage site at the N-terminus. The pProEX-HTa:SsXR construct was transformed into an E. coli BL21(DE3)-T1R strain, the transformed E. coli was grown to an OD600 of 0.65 in LB medium containing 100 mg L⁻¹ ampicillin at 310 K, and SsXR protein expression was induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). After 18 h at 293 K, the cells were harvested at 277 K. The cell pellet was resuspended in 40 mM Tris-HCl, pH 8.0 (buffer A) and disrupted by ultrasonication. The cell debris was removed by centrifugation at 13,000 × g for 30 min, and the lysate was applied onto a Ni-NTA agarose column (Qiagen). After washing with 40 mM Tris-HCl, pH 8.0 and 25 mM imidazole (buffer B), the bound proteins were eluted with 40 mM Tris-HCl, pH 8.0 and 300 mM imidazole (buffer C). Finally, trace contaminants were removed by gel filtration using a HiPrep 26/60 Sephacryl S-300 HR column (GE Healthcare Life Sciences) equilibrated with buffer A. The eluted protein had a molecular weight of about 75 kDa, indicating a dimeric structure. The protein was concentrated to 75 mg mL⁻¹ and kept at 193 K for further experiments.

Crystallization of SsXR.
Crystallization of the SsXR protein was initially performed with commercially available sparse-matrix screens, including PEG/Ion I and II and Index (Hampton Research) and Wizard Classic I and II (Rigaku Reagents), using the sitting-drop vapor diffusion method at 295 K. Each experiment consisted of mixing 1.0 μL protein solution (75 mg mL⁻¹, buffer A) with 1.0 μL reservoir solution and then equilibrating against 50 μL reservoir solution. Crystals of the apo form of SsXR were observed in various crystallization conditions. After several steps of crystal improvement, crystals of the best quality appeared in 22% (w/v) polyethylene glycol 3350 and 0.1 M ammonium citrate tribasic, pH 7.0. For crystals of SsXR in complex with the NADPH cofactor, the SsXR protein was prepared in the same manner as for the apo form. The SsXR protein was mixed with 20 mM NADPH cofactor and incubated for 1 h at 277 K. Crystals of SsXR in complex with NADPH were obtained in 20% (w/v) polyethylene glycol 3350 and 8% (v/v) Tacsimate, pH 7.0.
Data collection and structure determination. The crystals of SsXR in its apo form and in complex with NADPH were transferred to a cryo-protectant solution containing 30% (v/v) glycerol in the crystallization buffer. The crystals were mounted in a 100 K nitrogen stream. Data for the apo form and the NADPH complex were collected to resolutions of 1.95 and 2.0 Å, respectively, at beamline 7A of the Pohang Accelerator Laboratory (PAL, Pohang, Republic of Korea), using a Quantum 270 CCD detector (ADSC, USA) 39. All data were indexed, integrated, and scaled using the HKL2000 software package 40. The apo-form crystals of SsXR belonged to the space group P2₁2₁2₁ with unit cell parameters a = 69.248 Å, b = 87.151 Å, c = 122.62 Å, α = β = γ = 90°. Assuming two SsXR molecules in the asymmetric unit, the crystal volume per unit of protein mass was 2.58 Å³ Da⁻¹, which corresponds to a solvent content of approximately 52.29% 41.
The crystals of SsXR in complex with NADPH belonged to the space group P4₂2₁2, with unit cell parameters a = b = 97.654 Å, c = 160.12 Å, α = β = γ = 90°. Assuming two SsXR molecules in the asymmetric unit, the crystal volume per unit of protein mass was 2.73 Å³ Da⁻¹, which corresponds to a solvent content of approximately 54.92%. The apo-form structure of SsXR was determined by molecular replacement with the CCP4 version of MOLREP 42 using the structure of xylose reductase from Candida tenuis (CtXR, PDB code 1Z9A) as a search model 43. Model building was performed manually using WinCoot 44, and refinement was performed with REFMAC5 45. The structure of SsXR in complex with NADPH was determined by molecular replacement using the crystal structure of the apo form of SsXR (PDB code 5Z6U). The data statistics are summarized in Table 1. The SsXR structures in the apo form and in complex with NADPH were deposited in the Protein Data Bank with PDB codes 5Z6U and 5Z6T, respectively.
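The Matthews coefficient and solvent content quoted above follow from the unit cell, the number of molecules per cell, and the protein mass. The Python sketch below illustrates the standard relations V_M = V_cell / (Z × n × M) and solvent fraction ≈ 1 − 1.23 / V_M. The molecular mass used here (36,000 Da) is an assumed round number for illustration, because the exact mass used by the authors is not stated in the text; the function names are likewise illustrative.

```python
def orthogonal_cell_volume(a, b, c):
    """Unit cell volume (Å^3) for a cell with α = β = γ = 90°."""
    return a * b * c


def matthews_coefficient(cell_volume, symmetry_ops, molecules_per_asu, mass_da):
    """V_M (Å^3 per Da): cell volume divided by total protein mass in the cell."""
    return cell_volume / (symmetry_ops * molecules_per_asu * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M using the usual 1.23 constant."""
    return 1.0 - 1.23 / v_m


# Apo form: P2(1)2(1)2(1) has 4 symmetry operators; two molecules per asymmetric unit.
apo_volume = orthogonal_cell_volume(69.248, 87.151, 122.62)

ASSUMED_MASS_DA = 36000.0  # assumption for illustration, not a value from the text
apo_vm_estimate = matthews_coefficient(apo_volume, 4, 2, ASSUMED_MASS_DA)

# Solvent fractions implied by the V_M values reported in the text.
apo_solvent = solvent_fraction(2.58)
complex_solvent = solvent_fraction(2.73)

print(f"apo cell volume: {apo_volume:.0f} Å^3")
print(f"apo V_M with assumed mass: {apo_vm_estimate:.2f} Å^3/Da")
print(f"solvent fractions from reported V_M: {apo_solvent:.3f}, {complex_solvent:.3f}")
```

The solvent fractions computed from the reported V_M values agree with the stated percentages to within rounding; the V_M estimate itself shifts with the assumed mass, so it should be read only as a consistency check.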

Size-exclusion chromatographic analysis.
To investigate the oligomeric state of SsXR, analytical size-exclusion chromatography was performed using a Superdex 200 Increase 10/300 GL column (GE Healthcare Life Sciences) equilibrated with 40 mM Tris-HCl, pH 8.0 and NaCl at 0, 50, 100, or 150 mM. A 0.5 mL protein sample at a concentration of 1 mg mL⁻¹ was analyzed. To calculate the molecular weight of the eluted SsXR sample, a calibration curve was drawn using the standard samples ferritin (440 kDa), conalbumin (75 kDa), carbonic anhydrase (29 kDa), and ribonuclease A (13.7 kDa) (GE Healthcare Life Sciences).
Kinetic analysis. For the kinetic analysis, the SsXR protein was purified in the same manner as the protein for crystallization. The activities of SsXR were determined by measuring the decrease in absorbance at 340 nm (extinction coefficient of 6.22 × 10³ M⁻¹ cm⁻¹). For the kinetic analysis of SsXR with the NADPH and NADH cofactors, enzyme activity was measured in a reaction mixture of 0.5 mL total volume at 303 K. The reaction mixture contained 100 mM Tris-HCl, pH 8.0, 100 mM D-xylose, and various concentrations of NADPH/NADH cofactor from 1 to 200 μM. The reactions were initiated by the addition of enzyme to a final concentration of 0.5 and 2 μM for the analysis of NADPH and NADH, respectively. For the kinetic analysis of SsXR with the D-xylose substrate, enzyme reactions were performed in a reaction mixture of 0.5 mL total volume at 303 K. The reaction mixture contained 100 mM Tris-HCl, pH 8.0, 0.2 mM NADPH, and various concentrations of D-xylose from 1 to 200 mM. The reaction was initiated by the addition of enzyme to a final concentration of 2 μM. The SsXR activity measurements were performed in duplicate.
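As an illustration of how an absorbance trace is converted to a reaction rate in this kind of assay, the following Python sketch applies the Beer-Lambert law with the extinction coefficient given above. The 1 cm path length and the example slope are assumptions chosen for illustration, not values from this study, and the function names are hypothetical.

```python
EPSILON_340 = 6220.0  # M^-1 cm^-1, extinction coefficient stated in the text


def cofactor_rate_uM_per_min(delta_a340_per_min, path_cm=1.0):
    """Convert the A340 slope (per min) to NAD(P)H consumed in µM per min.

    The path length defaults to 1 cm, which is an assumption.
    """
    molar_per_min = abs(delta_a340_per_min) / (EPSILON_340 * path_cm)
    return molar_per_min * 1e6


def apparent_turnover_per_min(rate_uM_per_min, enzyme_uM):
    """Apparent turnover number (per min) at the given enzyme concentration."""
    return rate_uM_per_min / enzyme_uM


# Hypothetical slope: absorbance falling by 0.0622 per minute.
example_rate = cofactor_rate_uM_per_min(-0.0622)
example_turnover = apparent_turnover_per_min(example_rate, enzyme_uM=0.5)

print(f"rate: {example_rate:.2f} µM/min, turnover: {example_turnover:.1f} per min")
```

Rates obtained this way at each substrate concentration are the inputs to a Michaelis-Menten fit; the fitting procedure itself is not specified in the text and is not reproduced here.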
Molecular docking simulation. Molecular docking simulations of D-xylose with the SsXR structure were performed using AutoDock Vina 46. The SsXR structure in complex with NADPH (PDB code 5Z6T) was used, and the D-xylose ligand was prepared using MarvinSketch. The pdbqt files were generated following the AutoDock Vina manual. The side chains of Asp47, Tyr48, and Asn306 were treated as flexible, the grid size for D-xylose was x = 18, y = 36, z = 40, and the grid center was set at x = 17.169, y = 21.352, z = 33.799. The final conformations produced in this simulation were inspected using PyMOL 47. The calculated free energy of binding (ΔG bind) of the final pose was −4.5 kcal/mol.
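For readers who wish to set up a comparable search box, the Python sketch below writes an AutoDock Vina configuration from the grid center and size given above. The file names are hypothetical placeholders, and splitting the receptor into rigid and flexible pdbqt files is an assumption about how the flexible side chains were supplied. Vina interprets the box size in Å; the text does not state the units explicitly.

```python
def build_vina_config(receptor, flex, ligand, center, size, out):
    """Return the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",  # rigid part of the receptor
        f"flex = {flex}",          # flexible side chains
        f"ligand = {ligand}",
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


config_text = build_vina_config(
    receptor="ssxr_rigid.pdbqt",  # hypothetical file name
    flex="ssxr_flex.pdbqt",       # hypothetical: Asp47, Tyr48, Asn306 side chains
    ligand="d_xylose.pdbqt",      # hypothetical file name
    center=(17.169, 21.352, 33.799),
    size=(18, 36, 40),
    out="docked_poses.pdbqt",     # hypothetical file name
)
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option.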
Site-directed mutagenesis and activity assay. Site-specific mutations were created with the QuikChange kit (Stratagene), and sequencing was performed to confirm correct incorporation of the mutations. The mutant proteins were purified in the same manner as the wild type. The activities of SsXR were determined by measuring the decrease in absorbance at 340 nm (extinction coefficient of 6.22 × 10³ M⁻¹ cm⁻¹). Enzyme reactions for the relative activity of SsXR were performed in a reaction sample of 0.5 mL total volume at 303 K. The reaction sample contained 100 mM Tris-HCl, pH 8.0, 100 mM D-xylose, and 200 μM NADPH. The reactions were initiated by the addition of enzyme to a final concentration of 0.5 μM. The SsXR activity assay was performed in duplicate.
Phylogenetic tree analysis of reported XRs. Annotated XR enzymes were identified by protein searches of the National Center for Biotechnology Information (NCBI) and UniProt servers. Multiple sequence alignment was performed using Clustal Omega 48. The evolutionary history was inferred using the Maximum Likelihood method based on the JTT matrix-based model 49. The tree with the highest log likelihood (−19893.7669) is shown. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 161 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 133 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 50.
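The elimination of all positions containing gaps or missing data (complete deletion) can be illustrated with a short Python sketch. It operates on a toy alignment invented for this example, not on the 161-sequence dataset, and it is not MEGA7's implementation; the gap and missing-data symbols ("-" and "?") are assumptions about the input encoding.

```python
def complete_deletion(alignment, excluded="-?"):
    """Drop every alignment column in which any sequence has a gap or missing symbol.

    alignment: dict mapping sequence name to an aligned sequence of equal length.
    Returns a new dict with only the fully occupied columns retained.
    """
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    # Indices of columns where no sequence carries an excluded symbol.
    kept = [
        i for i in range(length)
        if all(seq[i] not in excluded for seq in sequences)
    ]
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences for illustration only).
toy_alignment = {
    "seqA": "MK-LV?A",
    "seqB": "MKTLVGA",
    "seqC": "M-TLVGA",
}
filtered = complete_deletion(toy_alignment)
print(filtered)
```

Because a single gap removes the whole column, highly divergent sequences with many insertions and deletions sharply reduce the number of positions retained, which is consistent with only 133 positions remaining in the final dataset.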