Introduction

The utilization of ethanol produced from plant biomass (so-called “bioethanol”), which is derived from the fixation of atmospheric CO2, as an industrial carbon source and car fuel is one of the most important research issues for the realization of a sustainable global environment. Bioethanol is mainly produced from agricultural crop biomass, which is easy to ferment biologically but competes commercially with food and animal feed resources. Alternatively, “lignocellulosic biomass”, such as wood and agricultural residues, represents an attractive feedstock that consists of cellulose (60% of the mass), hemicellulose (30%), and lignin (10%). Hemicellulose comprises pentoses, such as d-xylose (and l-arabinose), as well as hexoses, and d-xylose accounts for approximately 25% of the total sugar content in lignocellulosic biomass1,2.

Yeast, particularly Saccharomyces cerevisiae, has long been used to produce alcoholic beverages because of its ability to produce high concentrations of ethanol and its high inherent tolerance of ethanol. However, native strains cannot ferment d-xylose as a carbon source. Therefore, many studies have attempted to overcome the limitations associated with the utilization of d-xylose by introducing its metabolic pathway from other microorganisms3,4. Although the biological degradation of d-xylose in microorganisms is classified into phosphorylated and non-phosphorylated pathways5,6, only the former, which is further classified into two different pathways, is used for this purpose. In the “isomerase pathway”, d-xylose is directly converted into d-xylulose by d-xylose isomerase (XI; EC 5.3.1.5) without any cofactors (Fig. 1a). Although this pathway mostly operates in bacteria, a few fungi possess the bacterial type of XI7. Alternatively, in the “oxidoreductase pathway”, d-xylose reductase (XR; EC 1.1.1.21) catalyzes the reduction of the C1 carbonyl group of d-xylose, yielding xylitol as the product (Fig. 1a). Xylitol is then oxidized by xylitol dehydrogenase (XDH; EC 1.1.1.9) to give d-xylulose. In both pathways, xylulokinase (XK; EC 2.7.1.17) phosphorylates d-xylulose into d-xylulose 5-phosphate, which is metabolized further via the pentose-phosphate pathway. Although S. cerevisiae possesses the endogenous oxidoreductase pathway consisting of YHR104w (GRE3), YLR070c (XYL2), and YGR194c (XKS1; XK) as XR, XDH, and XK, respectively, the rate of d-xylose metabolism by strains overexpressing them has not yet reached industrially competitive levels8. Alternatively, XR and XDH genes from the native d-xylose-metabolizing yeast Pichia stipitis (Scheffersomyces stipitis; PsXR and PsXDH) are mostly used together with the endogenous XK gene.

Figure 1

d-Xylose metabolism by yeast and fungi. (a) Metabolic network of d-xylose and l-arabinose. There are two routes of d-xylose metabolism, with XDH being involved in the oxidoreductase pathway. This pathway partially overlaps with l-arabinose metabolism, in which LADH belongs to the same protein family as XDH, described in (b). (b) A phylogenetic tree of the zinc-dependent MDR superfamily group. The number on each branch indicates the bootstrap value. Subfamilies of XDH, SDH, LADH, TDH, and ADH are colored in red, yellow, orange, yellow-green, and cyan, respectively. Open and closed circles at the end of each branch are enzymes in the absence and presence of structural zinc, respectively. XDHs with asterisks were enzymes from thermotolerant yeast. Proteins in the box were purified and characterized in the present study. Underlined proteins were used for discussions in the text.

XDHs from yeast and fungi, including PsXDH, belong to a polyol dehydrogenase (PDH) subfamily in the zinc-dependent group of the medium-chain dehydrogenase/reductase (MDR) superfamily9,10, together with sorbitol dehydrogenase (SDH; EC 1.1.1.14)10,11,12 and l-arabinitol 4-dehydrogenase (LADH; EC 1.1.1.12) from several organisms13,14,15 (Fig. 1b). Among them, LADH is involved in l-arabinose metabolism by fungi, which partially overlaps with the oxidoreductase pathway of d-xylose (Fig. 1a). All MDR enzymes utilize NAD+(H) or NADP+(H) as a cofactor, and one zinc atom with catalytic functions (so-called “catalytic zinc”) is present in the active center of PDH, alcohol dehydrogenase (ADH; EC 1.1.1.1)16,17, and l-threonine dehydrogenase (TDH; EC 1.1.1.103)18. Many PDHs and ADHs also have a second zinc atom (so-called “structural zinc”), which generally coordinates with four cysteine ligands. Since this zinc atom cannot be removed by site-directed mutagenesis without the loss of stable folding or enzyme activity19,20, “why” and “how” MDR enzymes lacking structural zinc appeared during evolution remain unclear.

Although XR-XDH achieves higher metabolic fluxes than XI, the excretion of xylitol occurs during d-xylose fermentation by S. cerevisiae21. An intracellular redox imbalance due to the different coenzyme specificities of XR (with NADPH) and XDH (with NAD+) has been suggested as one of the main factors4,5. Furthermore, a relationship has been reported between stability and intracellular expression levels. Therefore, protein engineering to modify coenzyme specificity and/or increase thermostability represents an attractive approach. A unique NADP+(H)-dependent SDH from insects12 is used as a reference enzyme, through which the complete reversal of coenzyme specificity towards NADP+ is achieved22. On the other hand, the introduction of four cysteine residues provided additional zinc-binding sites and significantly increased thermostability22.

Despite its importance in bioindustry, a crystallographic analysis of PsXDH has not yet been performed, and there is currently no structural evidence for protein engineering, particularly the artificial introduction of structural zinc. We herein report for the first time the crystal structure of PsXDH using a thermostabilized mutant. A second zinc atom coordinated with four artificially introduced cysteine ligands. The substitution of each of the four cysteine ligands with an aspartate in XDH from Schizosaccharomyces pombe contributed to the significantly better maintenance of activity and thermostability than their substitution with a serine, providing a novel hypothesis for how enzymes in the absence of structural zinc, such as PsXDH, appeared.

Results

Overall structure of the PsXDHC4 mutant

The so-called PsXDHC4 mutant was previously constructed by substituting Ser96, Ser99, and Tyr102 with cysteine residues in the wild-type (WT) enzyme (Fig. 2)22. The crystal structure of the apo-form of PsXDHC4 was refined at 2.80 Å resolution after molecular replacement using the coordinates of SDH from sheep liver (PDB ID: 3QE3) as a search model11. Data collection and refinement statistics are summarized in Table 1. Each monomer contained a bidomain architecture composed of a large “catalytic domain” (Thr2-Val163 and Arg300-Pro362) and a smaller “coenzyme-binding domain” (Gly164-Phe299), with a large cleft separating them (Fig. 3a). The former consisted of an α/β fold with a similar structure to that in other MDR enzymes, and the latter had the characteristic α/β Rossmann fold. The region corresponding to Ser121–Gly124 within a long loop between α3 and β7 in the catalytic domain was not built owing to the lack of interpretable electron density.

Figure 2

Four cysteine ligands for structural zinc binding in MDR enzymes. Cysteine, serine, acidic, and basic residues are shadowed in gray, yellow-green, red, and blue, respectively. Proteins with asterisks possess structural zinc based on a crystallographic analysis.

Table 1 Data collection and refinement statistics.
Figure 3

Crystal structure of the PsXDHC4 mutant. (a) Monomer structure. The catalytic domain housing both the catalytic and structural zinc atoms (orange and magenta spheres, respectively) is shown in blue, and the cofactor-binding domain is shown in red. The figure was prepared using PyMOL49. (b) Structure of the biological tetramer generated from crystallographic symmetry. The tetramer is a dimer of dimers (A/B and C/D). The dashed line ellipse indicates two contact regions 1 (c) and 2 between two dimers. In (c), hydrogen bonds are shown as black dashed lines. It is likely that the mutated Arg98 in subunit A can form a salt bridge with Asp141 in subunit D of another dimer, and the mutated Phe101 interacts with Leu109 within the same loop hydrophobically. Superimposed Cα traces of the structures of tetrameric PsXDHC4 (gray) on tetrameric SDH from silverleaf whitefly (1E3J) (d) and dimeric ADH from Arabidopsis thaliana (2CF5) (e).

Analytical size-exclusion chromatographic studies demonstrated that PsXDHC4 formed a homotetramer in solution. The four subunits (A–D) in the biological tetramer generated from crystallographic 222 symmetry were regarded as a dimer of identical A/B and C/D dimers (Fig. 3b). The contact region between the monomers of each dimer was strand β14, which formed a seven-stranded parallel β-sheet (β11–β10–β9–β12–β13–β15–β14), and packed antiparallel with strand β14 of the second monomer. In the tetramer, the A/B dimer contacted the C/D dimer at two regions. Region 1 corresponded to a loop that protruded from the catalytic domain, at which a zinc atom other than the catalytic zinc atom bound, as described below (Fig. 3c). The side chain of Arg97 in subunit A (D) formed a salt bridge with the side chain of Asp141 in the same subunit A (D), which also interacted with the side chain of Lys103 in subunit D (A) of the other dimer; symmetrical interactions are shown in parentheses. Region 2 was located between two loops of the coenzyme-binding domain in subunit A (C), and one helix of the catalytic domain in subunit C (A). Therefore, these two types of contact were achieved by different subunits in the opposing dimers.
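
As an illustration of how a tetramer can be generated from 222 symmetry, the following minimal Python sketch applies the four operators of point group 222 to a single coordinate. It assumes, purely for illustration, that the three twofold axes coincide with the x, y, and z axes and intersect at the origin; the function name `apply_222`, the example coordinate, and the assignment of the labels A–D to the four copies are hypothetical and are not taken from the deposited structure.

```python
# Sketch: generate the four copies of a point related by 222 point symmetry.
# Assumption: the twofold axes lie along x, y and z and meet at the origin.

def apply_222(point):
    """Return the four symmetry mates of `point` under point group 222."""
    x, y, z = point
    return [
        (x, y, z),     # identity
        (-x, -y, z),   # twofold rotation about the z axis
        (-x, y, -z),   # twofold rotation about the y axis
        (x, -y, -z),   # twofold rotation about the x axis
    ]

atom = (10.0, 5.0, 2.0)  # hypothetical atom position (angstroms)
for label, mate in zip("ABCD", apply_222(atom)):  # labels are arbitrary
    print(label, mate)
```

Applying any one of the twofold operators twice returns the original position, which is why only four copies arise.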

As described in “Introduction”, the zinc-dependent MDR superfamily group is phylogenetically classified into three subclasses: PDH, ADH, and TDH (Fig. 1b). As expected, PsXDHC4 showed higher structural and (structure-based) sequence similarity to other PDHs (r.m.s.d. values of 1.3–2.2 Å and sequence identities of 36–42%) than to ADHs and TDHs (r.m.s.d. values of 2.1–3.5 Å and sequence identities of 21–31%) (Fig. 4 and Supplementary Table S1). PDHs and TDHs, as well as ADHs from bacteria and archaea, are homotetramers with the same crystallographic symmetry as PsXDHC4 (Fig. 3d). On the other hand, ADHs from eukaryotes, including plants, tend to be homodimers (Fig. 1b), which correspond to the A/B (or C/D) dimer of tetrameric MDR enzymes; therefore, neither region 1 nor region 2 contributes to the formation of their quaternary structure (Fig. 3e).
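
For readers unfamiliar with the r.m.s.d. values quoted above, the quantity can be sketched as follows. This is a minimal example that assumes the two sets of Cα coordinates are already superimposed and paired one-to-one; it does not perform the superposition or the structure-based alignment itself, and it is not the program used in this study. The function name `rmsd` and the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (angstroms) between paired 3D coordinates.

    Assumes both lists are already superimposed and in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical C-alpha positions for two short, pre-aligned traces.
trace_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_2 = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(trace_1, trace_2))
```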

Figure 4

Stereo view of the superimposed Cα traces of the coenzyme binding domains of PsXDHC4 and SDH from human in complex with the competitive inhibitor (CP-16657213) and NAD+ (1PL6; gray) (a), or LADH from Neurospora crassa in complex with NAD+ (3M6I; gray) (b).

Catalytic zinc binding in the PsXDHC4 mutant

An anomalous difference Fourier map using data collected at the K-edge of zinc (wavelength of 1.280 Å) showed two clear peaks within the electron density map of PsXDHC4 (Fig. 5a,b). One of these peaks, located at the bottom of the catalytic domain, corresponded to catalytic zinc, based on inference from other related structures (Fig. 4). Catalytic zinc was coordinated by interactions with Cys41 (distance of 2.3 Å), His66 (2.3 Å), Glu67 (2.2 Å), and a water molecule (Wat6; 2.6 Å) (Fig. 5a). The carboxyl group of Glu67 further formed a hydrogen bond with the side chain of Lys356, and nearby Glu159 was linked to the zinc atom through the Wat6 molecule. All of these residues (and neighboring Ser43 and Asp44) were highly conserved in PDH members. XDH from the yeast Galactocandida mastotermitis (54% sequence homology with PsXDH) contained ~6 Mg2+ ions, which were selectively removed by dialysis without a loss of activity23. No magnesium ions were found in the crystal structure of the apo-form of PsXDHC4.

Figure 5

Zinc binding in the PsXDHC4 mutant. (a) Amino acid residues involved in chelating the catalytic zinc atom in the active site, together with some neighboring residues. PEG is derived from crystallization solution. Anomalous difference Fourier maps, contoured at 3.0 σ and 10 σ, suggest peaks for the catalytic zinc (a) and structural zinc atoms (b), respectively, and are shown as a blue mesh. (b) Structural zinc binding sites, and comparisons with SDH from human (1PL6) (c), SDH from silverleaf whitefly (1E3J) (d), and ADHs from T. brockii (1YKF) (e), A. pernix (1H2B) (f), and S. solfataricus (1JVB) (g). In (f), there are two structures in the presence of zinc (right panel) and in its absence showing the disulfide bond (left panel).

Second zinc binding in the PsXDHC4 mutant

As described above, PsXDHC4 contained another peak in the anomalous difference Fourier map at the K-edge of zinc (Fig. 5b). Since the binding site was artificially constructed, another metal may have been (concomitantly) present; for example, cobalt can be introduced in place of catalytic zinc in the native ADH enzyme16. On the other hand, in a previous biochemical study, the zinc content of the PsXDHC4 mutant was estimated by atomic absorption spectrophotometry to be ~1.9 mol of zinc/mol of subunit, which was close to 2.022. Furthermore, since zinc compounds were not used in the purification or crystallization protocols, we concluded that the metal must be zinc that was intrinsically contained in the protein. To the best of our knowledge, this is the first structural evidence for artificially introducing structural zinc into MDR superfamily enzymes.

The second zinc atom was bound within a loop that protruded from the catalytic domain of PsXDHC4 (Fig. 3a), at which it was ligated by the enzyme residues Cys96, Cys99, Cys102, and Cys110 (distance of 2.3 Å) (Fig. 5b). Superimposition to the crystal structures of other MDR enzymes revealed that this zinc atom was equivalent to (inherently bound) structural zinc (Fig. 4b), and there was no significant difference in their binding loops regardless of zinc (Fig. 5c–g). SDH from humans and ADH from Thermoanaerobacter brockii possessed a salt bridge and/or hydrogen bond inside and/or outside of this loop (Fig. 5c,e).

When Phe98 and Glu101, which neighbor the four cysteine ligands, were substituted with arginine and phenylalanine residues, respectively, in the PsXDHC4 mutant, the resultant double mutant (C4/F98R/E101F) showed a further increase in thermostability (Fig. 6a)24. Mutated Arg98 in subunit A may have formed a salt bridge with Asp141 in subunit D of the other dimer, and mutated Phe101 hydrophobically interacted with Leu109 within the same loop to significantly enhance dimer-dimer interaction(s) for the formation of tetramers (double-headed dashed arrow) (Fig. 3c).

Figure 6

Courses of thermal inactivation. Purified enzymes were dialyzed against 50 mM Tris–HCl (pH 8.0) at 4 °C overnight. Dialyzed enzymes were incubated for 10 min at each temperature or at the indicated temperature for each time (inset). Enzyme activities were shown as relative average values (n = 3) expressed as a percent of the controls without a heat treatment. The background graph shows the effects of temperature on activity. (a) WT and mutants of PsXDH and ScXDH and WT of SpXDH. (b) Four serine mutants of SpXDH. (c) Four Cys97 mutants. (d) Four aspartate mutants and comparisons with each serine mutant.

Introduction of four cysteine ligands for structural zinc in ScXDH

As described in “Introduction”, the YLR070c protein from S. cerevisiae (45% sequence homology with PsXDH; Fig. 1b) functions as XDH in the endogenous d-xylose pathway8,25, and possesses none of the four cysteine ligands (Fig. 2); therefore, we designed the D99C/S102C/M105C/D113C mutant (referred to as ScXDHC4). The WT and C4 mutant enzymes of ScXDH (and also PsXDH) were successfully expressed in Escherichia coli cells and purified using the same procedure as that for PsXDH. The kcat/Km values of ScXDHC4 and PsXDHC4 for xylitol were 7.3- and 4.3-fold higher, respectively, than those of ScXDHWT and PsXDHWT, and these differences were attributed to 10- and 22-fold higher kcat values, respectively (Table 2). The inactivation of ScXDHC4 or PsXDHC4 was not detected after an incubation at 45 °C for 10 min, whereas the activities of each WT enzyme were decreased to 89 and 48%, respectively, by the same treatment (Fig. 6a). Collectively, the introduction of four cysteine ligands increased activity and thermostability in not only PsXDH, but also ScXDH, which appeared to be due to the introduction of a second zinc atom.

Table 2 Kinetic parameters of WT and mutant XDH enzymes for xylitol.

Serine mutants of four cysteine ligands for structural zinc in SpXDH

The hypothetical protein SPBC1773.05c from S. pombe (40% sequence homology with PsXDH; Fig. 1b) had four cysteine ligands at positions 97, 100, 103, and 111 (Fig. 2); therefore, we selected it as a representative enzyme possessing structural zinc. Its kcat/Km value for xylitol was similar to that of PsXDH (31.3 and 79.9 min−1 mM−1, respectively), suggesting its function as XDH (referred to as SpXDH) (Table 2). To elucidate the physiological role of structural zinc, we initially substituted each cysteine ligand with a serine residue. A gel filtration analysis using the sample purified by Ni–NTA revealed that, compared with the WT, the ratio of the active molecular species with a tetrameric structure was significantly decreased in the serine mutants. Therefore, we performed a kinetic analysis using the fraction with the highest activity (Supplementary Fig. S1).

Among the four serine mutants, the kcat/Km values of the C97S and C103S mutants (29.6 and 14.0 min−1 mM−1) were similar to that of the WT enzyme (68.5 min−1 mM−1), whereas that of the C100S mutant was markedly lower (0.743 min−1 mM−1) and the C111S mutant was completely inactive (Table 2). A heat treatment analysis indicated a significant decrease in their thermostability; the half-life for inactivation at 50 °C was estimated to be less than 1 min (Fig. 6b,d), whereas the inactivation of WT was not detected by the same treatment. Similar results were observed at the optimum temperatures for activity. Collectively, these results indicated that all serine mutations markedly affected thermostability, and that the four cysteine ligands had varying impacts on the levels of activity.
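
A half-life for thermal inactivation can be related to residual activity as in the following minimal Python sketch. It assumes simple first-order (single-exponential) inactivation, which is a common simplification and is not stated in the text to be the model used here; the function names and the numerical inputs are hypothetical and do not correspond to data from Fig. 6.

```python
import math

def rate_constant_from_residual(residual_percent, minutes):
    """First-order inactivation rate constant (min^-1) from residual activity."""
    if not 0.0 < residual_percent <= 100.0 or minutes <= 0.0:
        raise ValueError("need 0 < residual_percent <= 100 and minutes > 0")
    return -math.log(residual_percent / 100.0) / minutes

def half_life(rate_constant):
    """Half-life (min) for first-order decay: t1/2 = ln(2) / k."""
    return math.log(2.0) / rate_constant

def residual_activity(rate_constant, minutes):
    """Percent activity remaining after `minutes` of first-order decay."""
    return 100.0 * math.exp(-rate_constant * minutes)

# Hypothetical input: 25% of the activity remains after 2 min of heating.
k = rate_constant_from_residual(25.0, 2.0)
print(half_life(k))               # half-life in minutes
print(residual_activity(k, 1.0))  # percent activity left after 1 min
```

Under this assumption, a single residual-activity measurement at one time point is enough to estimate a half-life, although a full time course (as in the insets of Fig. 6) would be needed to check whether the decay is actually first order.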

Other mutants of four cysteine ligands for structural zinc in SpXDH

In some (putative) ADH subfamily enzymes in the MDR superfamily, one of the four cysteine ligands was substituted with an aspartate, glutamate, or arginine residue; D-C-C-C, E-C-C-C, or R-C-C-C, respectively (Fig. 2). In the crystal structures of (hyper)thermophilic archaeal enzymes, aspartate and glutamate residues coordinated with structural zinc (Fig. 5f,g)26,27,28,29. Therefore, Cys97 in SpXDH was changed to design the C97D, C97E, and C97R mutants. Among them, the courses of the thermal inactivation of the C97D and C97E mutants were enhanced (Fig. 6c).

An aspartate residue was frequently detected in some substitution patterns of the four cysteine ligands in MDR enzymes: D-S-M-D, D-S-S-D, and R-D-C-S (Fig. 2). Therefore, Cys100, Cys103, and Cys111 in SpXDH were further substituted with an aspartate. The kcat/Km values of the C97D, C100D, and C103D mutants increased and were 57%, 23%, and 65% that of WT, respectively (Table 2). Furthermore, the C111D mutant was significantly active, which differed from the C111S mutant. In the heat treatment analysis at 50 °C, losses in activity of 27%, 50%, and 23% were observed in the C97D, C100D and C103D mutants, respectively, which were less than those in each serine mutant (91%, 100%, and 82%), and their half-lives for inactivation at 50 °C were estimated to be longer than 1 h (Fig. 6d). Similar results were obtained for the optimum temperatures for activity. Regarding the C97D and C103D mutants, samples purified by Ni–NTA contained a large amount of the active tetramer, which may have been due to thermostabilization, as described below (Supplementary Fig. S1).

To investigate the effects of the aspartate ligand in more detail, (the mutated) Cys96 in PsXDHC4 was further changed to an aspartate residue. The resultant C4/S96D mutant (equivalent to the S96D/S99C/Y102C mutant) exhibited similar thermotolerance to WT (Fig. 6a), whereas the kcat/Km value increased by 2.7-fold, which was caused by a marked increase in the kcat value, similar to the C4 mutant (Table 2). In other words, a change from C4 to C4/S96D in PsXDH had similar effects to a change from WT to the C97D mutant in SpXDH, suggesting no difference in function between artificial and inherent structural zinc.

Intracellular expression level of XDH

A Western blot analysis using an anti-(His)6-tag antibody showed that the PsXDHC4, ScXDHC4, and SpXDHWT proteins were more highly expressed in E. coli cells at 37 °C than the PsXDHWT, ScXDHWT, and SpXDHC100S proteins, respectively (Fig. 7a–c). When any of the four cysteine ligands in SpXDH was substituted with a serine residue, translational levels in E. coli cells at 25 °C were higher than those at 37 °C, whereas no significant difference was observed between them (Fig. 7d). Therefore, the peak with a high molecular weight in gel filtration using the sample purified by Ni–NTA appeared to be due to the aggregation of XDH (but not contaminant proteins) (Supplementary Fig. S1). Collectively, these results suggest a relationship between stability in vitro and intracellular expression levels in vivo.

Figure 7

Intracellular expression level by an immunoblot analysis. Fifty micrograms each of cell-free extracts of transformed E. coli (a–d) or S. cerevisiae cells (e) was applied. A Western blot analysis was performed using the ECL Western blotting system (GE Healthcare) and Anti-Penta-His antibody (Qiagen) according to the manufacturer’s instructions.

Discussion

Molecular evolution of structural zinc

Only two studies previously investigated the artificial removal of structural zinc by site-directed mutagenesis, similar to the present study. Mutations in any of the four cysteine ligands to an alanine residue in ββ and χχ ADHs from humans19 or phenylacetaldehyde reductase (long-chain ADH) from Corynebacterium sp.20 resulted in no expression in E. coli cells or a marked decrease in activity to less than 4% that of the WT enzyme, suggesting the impossibility of removing the zinc atom without the loss of stable folding or enzyme activity. Alternatively, the C97S and C103S mutants of SpXDH maintained folding and activity; however, their thermostabilities decreased. In other words, if decreased stability is not a significant issue for enzyme function under physiological conditions, such a mutation may be neutral (but not negative). In spite of the (possible) removal of structural zinc, all SpXDH mutants maintained similar thermostabilities to PsXDHWT and ScXDHWT (Fig. 6a,b).

Any substitution(s) of the four cysteine ligands with an aspartate (and glutamate) residue in SpXDHWT prevented a decrease in thermostability (the C97D, C100D, and C103D mutants), and/or enhanced correct structural folding (the C111D mutant). These acidic residues may have been alternatively coordinated to structural zinc, similar to some MDR enzymes (Fig. 5f,g)26,27,28,29. In other words, the four cysteine ligands may have been primarily modified via an aspartate residue, but not by random mutations in any residue(s). Since the structural lobe surrounding zinc formed one of the major points of contact in the XDH tetramer (Fig. 3b,c), a serine residue with a similar sized side chain to cysteine must have contributed to the maintenance of the integrity of this lobe after the loss of zinc. Therefore, the aspartate and serine residues, which are often found in the substitution patterns of the four cysteine ligands in PDHs, may be traced for the hypothetical evolutionary process; S-S-Y-C, S-S-T-C, D-S-M-D, D-S-S-D, and R-D-C-S (Fig. 2).

The introduction of the four cysteine ligands increased thermostability not only in PsXDH (S-S-Y-C), but also in ScXDH (D-S-M-D) (Fig. 6a). Since C4 mutations reversely mimic the molecular evolution described above, and do not have to be generated by a random mutagenesis method, this strategy may be broadly applicable to other MDR enzymes. XDHs from yeast and fungi are further classified into two groups, which correspond to enzymes in the absence (group 1) or presence of structural zinc (group 2), respectively (Fig. 1b). Among them, group 1 contains some enzymes from “thermotolerant” yeasts with the ability to grow and ferment at higher temperatures (50 °C), including Kluyveromyces marxianus30 and P. angusta31. Their thermostabilities are similar to PsXDHC4, indicating that these properties were acquired later by a strategy other than C4 mutations, such as refining of the structural zinc binding loop24, as described above (Fig. 3c).
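
The relationship between the ligand patterns and the C4 mutations described above can be made explicit with a small Python sketch. The residue positions and mutation strings are those given in the text (Ser96, Ser99, Tyr102, and Cys110 in PsXDH; D99C/S102C/M105C/D113C in ScXDH; Cys97 in SpXDH); the helper names `apply_mutations` and `pattern` are hypothetical and are introduced only for illustration.

```python
import re

def apply_mutations(ligands, mutations):
    """Apply substitutions such as 'S96C/S99C/Y102C' to a {position: residue} map."""
    result = dict(ligands)
    for token in mutations.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"cannot parse mutation: {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if result.get(position) != old:
            raise ValueError(f"position {position} is not {old}")
        result[position] = new
    return result

def pattern(ligands):
    """Format the four ligand positions as a pattern such as 'S-S-Y-C'."""
    return "-".join(ligands[position] for position in sorted(ligands))

ps_xdh = {96: "S", 99: "S", 102: "Y", 110: "C"}   # PsXDH WT (S-S-Y-C)
sc_xdh = {99: "D", 102: "S", 105: "M", 113: "D"}  # ScXDH WT (D-S-M-D)
sp_xdh = {97: "C", 100: "C", 103: "C", 111: "C"}  # SpXDH WT

print(pattern(apply_mutations(ps_xdh, "S96C/S99C/Y102C")))         # PsXDHC4
print(pattern(apply_mutations(sc_xdh, "D99C/S102C/M105C/D113C")))  # ScXDHC4
print(pattern(apply_mutations(sp_xdh, "C97D")))                    # SpXDH C97D
```

Both C4 mutants end with the same four-cysteine pattern found in SpXDH WT, whereas the C97D mutant of SpXDH reproduces the D-C-C-C pattern noted for some ADH subfamily enzymes.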

Application to bioethanol production from lignocellulosic biomass

Although S. cerevisiae co-expressing the PsXR and PsXDH genes ferments (metabolizes) d-xylose, additional genetic introductions and/or deletions have been shown to result in an increased ethanol yield, concomitant with a decreased byproduct yield, including xylitol, glycerol, and acetate32,33,34. Another strategy is to modify the intracellular amount of XDH (and also XR) by plasmid copy numbers and the promoter control35,36. Alternatively, since increases in genetic expression levels may be (partially) compensated for by the intracellular lifetime of the translated protein (Fig. 7a–d), SpXDH may be particularly useful for d-xylose fermentation because of its markedly higher thermostability than not only PsXDHWT, but also PsXDHC4; the optimum temperature for activity was 55–65 °C and thermal inactivation was eventually observed at 70 °C (Fig. 6a). In a preliminary experiment, the SpXDHWT gene was successfully expressed in S. cerevisiae cells under a constitutive phosphoglycerate kinase (PGK) promoter37, whereas no expression of the thermolabile C100S mutant was noted (Fig. 7e).

l-Arabinose accounts for approximately 28% of the hemicellulose fraction of corn fiber (14%). The efficient fermentation of l-arabinose by S. cerevisiae has been achieved by using the bacterial pathway consisting of AraABD38. On the other hand, the co-expression of LADH and l-xylulose reductase genes, involved in the fungal pathway (Fig. 1a), along with PsXR, PsXDH, and ScXK enabled S. cerevisiae to ferment l-arabinose; however, ethanol production occurred at a very low rate39. PsXDH exhibited native activity for l-arabinitol (data not shown), and showed high sequence homology with XDH from Meyerozyma caribbica (70%) (Fig. 1b), which corresponds to “LADH” purified from yeast cells grown on l-arabinose as a sole carbon source40. In this regard, PsXDH is suitable for generating a bifunctional dehydrogenase for xylitol and l-arabinitol, based on the structural data in this study, which is useful for breeding d-xylose and l-arabinose co-fermenting S. cerevisiae.

Materials and methods

Expression and purification of recombinant proteins

The primer sequences used in the present study are shown in Supplementary Table S2. Each (putative) XDH gene of P. stipitis (encoded by the PICST_86924 gene), S. cerevisiae (YLR070c), and S. pombe (SPBC1773.05c) was introduced into pQE-81L (Qiagen), a plasmid vector for conferring an N-terminal (His)6-tag on expressed proteins, to yield pQE-PsXDHWT, pQE-ScXDHWT, and pQE-SpXDHWT, respectively. E. coli strain DH5α harboring the pQE-based vector was grown at 37 °C to a turbidity of 0.8 at 600 nm in LB medium containing ampicillin (50 mg/L). After the addition of 1 mM isopropyl-β-d-thiogalactopyranoside, the culture was grown at 37 °C for 6 h or at 20 °C for 18 h to induce the expression of the respective (His)6-tagged protein. Cells were harvested and resuspended in Buffer A (50 mM sodium phosphate buffer (pH 8.0) containing 300 mM NaCl and 10 mM imidazole). Cells were then disrupted by sonication and the solution was centrifuged. The supernatant was loaded onto a Ni–NTA Superflow column (Qiagen), which was then washed with Buffer B (pH 8.0, Buffer A containing 25 mM imidazole instead of 10 mM imidazole). Enzymes were eluted with Buffer C (pH 8.0, Buffer A containing 250 mM imidazole instead of 10 mM imidazole), and the eluate was loaded onto a HiLoad 16/600 Superdex 200 pg column (GE Healthcare) equilibrated with Buffer D (20 mM Tris–HCl (pH 8.0) containing 150 mM NaCl). The main single-peak fractions were collected and concentrated by ultrafiltration with Amicon Ultra-15 (Millipore).

Site-directed mutagenesis

Several mutants of PsXDH, ScXDH, and SpXDH were constructed by a PCR-based method with the mutated sense and antisense primers (Supplementary Table S2), and pQE-PsXDHWT, pQE-ScXDHWT, or pQE-SpXDHWT as a template, respectively.

Enzyme assay

Dehydrogenase activity for xylitol was measured using a continuous spectrophotometric assay at 340 nm at 30 °C in 50 mM Tris–HCl buffer (pH 8.0) containing 100 mM xylitol and 1 mM NAD+.
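
The conversion of an absorbance change at 340 nm into an enzyme activity can be sketched as follows. This minimal Python example assumes the widely used molar absorption coefficient of NADH at 340 nm (6.22 mM−1 cm−1) and a 1-cm light path; neither value is stated in the text, and the function names, assay volume, and enzyme amount are hypothetical.

```python
# Assumed value: molar absorption coefficient of NADH at 340 nm.
NADH_EXTINCTION_MM = 6.22  # mM^-1 cm^-1

def nadh_rate_mm_per_min(delta_a340_per_min, path_cm=1.0):
    """NADH formation rate in mM/min from the slope of A340 (Beer-Lambert law)."""
    return delta_a340_per_min / (NADH_EXTINCTION_MM * path_cm)

def specific_activity(delta_a340_per_min, assay_volume_ml, enzyme_mg, path_cm=1.0):
    """Specific activity in micromol NADH per min per mg of enzyme."""
    # 1 mM/min is equivalent to 1 micromol per mL per min.
    rate = nadh_rate_mm_per_min(delta_a340_per_min, path_cm)
    return rate * assay_volume_ml / enzyme_mg

# Hypothetical measurement: slope of 0.311 A340/min, 1.0 mL assay, 5 micrograms enzyme.
print(specific_activity(0.311, assay_volume_ml=1.0, enzyme_mg=0.005))
```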

Crystallization and X-ray crystallography

All crystallization trials were performed at 20 °C using the sitting-drop vapor diffusion method. Drops (0.5 μL) of ~ 20 mg/mL PsXDHC4 protein in Buffer D were mixed with equal amounts of reservoir solution, and equilibrated against 70 μL of the same reservoir solution by vapor diffusion. The initial trial was performed using Index HT and Crystal Screen (Hampton Research). The best crystal of PsXDHC4 was obtained within 1 week under the following conditions: 100 mM Hepes–NaOH (pH 7.0), 2 M ammonium sulfate, and 2.5% (w/v) polyethylene glycol 400. The crystals obtained were cryoprotected with reservoir solution supplemented with 15% (w/v) glycerol, and flash-cooled and kept in a stream of nitrogen gas at 100 K during data collection.

Diffraction data were collected with the PILATUS 6M detector of BL45XU at SPring-8 (Hyogo, Japan), and processed with the ZOO system and XDS41,42,43. The structure of the apo-form of PsXDHC4 was solved by the molecular replacement method using the molecular-replacement pipeline program BALBES44 with the structure of SDH from sheep liver (PDB ID 3QE3)11 as the search model. Further model building for all structures was performed manually with COOT45 and crystallographic refinement with PHENIX46. Detailed data collection and processing statistics are shown in Table 1.

Overexpressing XDH genes in S. cerevisiae

Each DNA fragment of (His)6-PsXDHWT, (His)6-PsXDHC4, (His)6-SpXDHWT, and (His)6-SpXDHC100S was amplified by PCR using the pQE-based vector as a template and was then introduced into EcoRI-HindIII sites between the PGK expression cassettes in the plasmid YEpPGK37. S. cerevisiae D452-2 strains (MATa leu2 his3 ura3 can1) harboring the YEpPGK-based vector were grown in minimal medium supplemented with 2% (w/v) glucose as a sole carbon source at 30 °C. Cells were harvested, resuspended in 50 mM Tris–HCl (pH 8.0), and vortexed together with an equal volume of glass beads (diameter of 0.5 mm). Cell debris and glass beads from the cell extract were separated by centrifugation and the remaining supernatant was used for enzyme assessments.

Sequence comparison

Protein sequences were analyzed using the Protein-BLAST and Clustal W programs distributed by the Kyoto Encyclopedia of Genes and Genomes (KEGG) of Japan (www.kegg.jp/kegg/kegg1.html)47,48.
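
The percentage values quoted for sequence comparisons can be illustrated with a minimal Python sketch that computes percent identity from two sequences that have already been aligned (for example, by Clustal W). The convention used here, identical residues divided by the number of aligned columns in which neither sequence has a gap, is only one of several in use and is an assumption; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

# Toy aligned fragments (not taken from any XDH sequence).
print(percent_identity("ACDE-G", "ACEEFG"))
```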