Introduction

Taxol (generic name: paclitaxel), a rare natural product obtained mainly from yew bark, is a well-known blockbuster anticancer drug. It promotes tubulin assembly into microtubules and prevents their disassembly1. However, the natural level of Taxol is extremely low2, while the content of 7-β-xylosyl-10-deacetyltaxol (XDT) can be up to 25 times that of Taxol3. XDT is often regarded as waste during the Taxol extraction process, causing both resource loss and potential environmental pollution. Compared with Taxol, XDT lacks the C10 acetyl group but harbors an additional β-xylosyl group at the C7 position. If the xylosyl group is removed, the resultant 10-deacetyltaxol (DT) can be used as a precursor for Taxol preparation. Therefore, a β-xylosidase that catalyzes the removal of the xylosyl group from XDT (Fig. 1a) would play a prominent role in reducing both resource waste and environmental pollution. However, many commercially available β-xylosidases have been shown to have no activity in releasing the xylosyl residue from XDT4. Recently, we identified two bifunctional β-xylosidases/glucosidases (named LXYL-P1-1 and LXYL-P1-2) from Lentinula edodes, which belong to the GH3 family and can efficiently convert XDT into DT5. By combining LXYL-P1-2 with an engineered acetyltransferase, we constructed an in vitro one-pot reaction system for converting XDT into Taxol6.

Fig. 1

a Chemical transformation of XDT to DT catalyzed by LXYL-P1-2. b Tetramer structure of LXYL-P1-2. The four monomers are colored green, cyan, magenta, and yellow, respectively. The dimer structure of AaBGL1 (PDB code 4IIB) is also shown; its two monomers are colored green and cyan and placed in the same orientation as the corresponding dimer in LXYL-P1-2.

The catalytic specificity and high efficiency of LXYL-P1-2 prompted further investigation of its structure-function relationship. Here we present the crystal structures of LXYL-P1-2 and its complex with XDT. The binding mode of the xylose group sheds light on the catalytic mechanism of GH3 enzymes. The DT binding pocket reveals the structural basis of substrate specificity. Structural comparison of LXYL-P1-2 and tubulin suggests possible common features for designing Taxol binding proteins.

Results

Structures of LXYL-P1-2 in substrate free form

Highly glycosylated proteins, such as LXYL-P1-2, are very difficult to crystallize. To improve the formation of good-quality crystals, LXYL-P1-2 was treated with endoglycosidase before crystallization5. The crystallographic statistics for data collection and structure refinement are summarized in Table 1. The first 35 residues were missing in the electron density map, partially supporting the existence of the KEX27 cleavage site (Ile32Phe33Arg34Arg35). The first amino acid of the mature protein was then verified to be Asp36 by N-terminal sequencing.

Table 1 Statistics of data-collection and refinement.

LXYL-P1-2 exists as a tetramer with 222 symmetry in the asymmetric unit (Fig. 1b) (PDB code 6JBS), consistent with the molecular weight measured by gel filtration. The structures of the four monomers are essentially the same, without any remarkable differences. Each monomer comprises three domains, as in some other GH3 members. Domain 1 folds into a TIM barrel-like structure and contains the residues from Asp36 to Gly365. Domain 2 (residues 398–600) is an α/β sandwich, in which five parallel β-strands and one antiparallel β-strand are sandwiched by five α-helices. Domain 1 and domain 2 are connected by linker 1 (residues 366–397). Domain 3 (residues 664–803) is connected to domain 2 by linker 2 (residues 601–663) and has a fibronectin type III (FnIII) fold. Seven Asn residues (81, 272, 342, 385, 457, 576, and 635) were found to be linked to different types of oligosaccharides even after treatment with endoglycosidase H (Table 2).
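The domain organization described above can be written down as a small lookup table. The following is only an illustrative sketch: the residue ranges and glycosylated Asn positions are taken from the text, while the names `REGIONS`, `GLYCOSYLATED_ASN`, and `region_of` are hypothetical helpers introduced here and are not part of the original analysis.

```python
# Region boundaries as stated in the text (inclusive residue numbers).
REGIONS = [
    ("domain 1 (TIM barrel-like)", 36, 365),
    ("linker 1", 366, 397),
    ("domain 2 (alpha/beta sandwich)", 398, 600),
    ("linker 2", 601, 663),
    ("domain 3 (FnIII)", 664, 803),
]

# Asn residues reported to carry oligosaccharides after endoglycosidase H.
GLYCOSYLATED_ASN = (81, 272, 342, 385, 457, 576, 635)


def region_of(residue):
    """Return the name of the region containing a residue, or None if the
    residue lies outside the modelled chain (for example, before Asp36)."""
    for name, start, end in REGIONS:
        if start <= residue <= end:
            return name
    return None


if __name__ == "__main__":
    # List which region each glycosylated Asn falls into.
    for asn in GLYCOSYLATED_ASN:
        print(f"Asn{asn}: {region_of(asn)}")
```

With this table, the proposed catalytic residues Asp300 and Glu529 map to domain 1 and domain 2, respectively, in line with the two domains jointly forming the active site.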

Table 2 Glycan modification of LXYL-P1-2.

Structure of E529Q mutant co-crystallized with XDT

To obtain the substrate-bound structure, the E529Q mutant was co-crystallized with XDT (PDB code 6KJ0). There are two monomers in the asymmetric unit. The electron density clearly shows the presence of xylose and DT (Fig. 2a). The xylose adopts a pyranose configuration. Surprisingly, the glycosidic bond between xylose and DT was broken in both monomers. Although the E529Q mutant did not show catalytic activity in the standard enzymatic assay, it may still have weak activity sufficient to cleave the glycosidic bond of XDT during the one-week crystallization period. Thus, this structure should be regarded as the enzyme/product complex.

Fig. 2

a The 2Fo-Fc electron density map (blue, 1.0σ) and Fo-Fc omit map (orange, 3.0σ) for xylose and DT. b Comparison of the overall structure of LXYL-P1-2 in the substrate-free form (gray) and the XDT-bound form (green). The moving loop Ile222-Gln229 is shown in red. DT and xylose are shown as stick models. c Close-up view comparing the detailed structure of LXYL-P1-2 in the substrate-free form (gray) and the XDT-bound form (green). The moving loop Ile222-Gln229 is shown in red. DT and xylose are shown as stick models. d Stereo view of the active site comparison of LXYL-P1-2 (green) and HjCel3A (PDB code 3ZYZ, cyan). The residues from LXYL-P1-2, atom C1 of xylose, and atom O7 of DT are labeled. The C1 atom of glucose bound in HjCel3A is also labeled in brackets. e The specific activities of LXYL-P1-2 and its site-directed mutants on XDT. The data represent the means ± s.d., n = 3. *P < 0.05, **P < 0.01, ***P < 0.001 versus LXYL-P1-2 (Control). One-way analysis of variance was performed using SPSS 17.0.

Sequence alignment of LXYL-P1-2 indicates that Asp300 and Glu529 might act as the catalytic nucleophile and the acid/base residue, respectively. In the LXYL-P1-2 structure, the distance between these two residues is 5.5 Å, consistent with the retaining catalytic mechanism proposed for the GH3 family. In the substrate-free enzyme, Glu529 forms hydrogen bonds with the side-chain of Arg218 and the main-chain NH of Ser466. In the complex, the amide group of Gln529 rotates by about 40 degrees to form hydrogen bonds with the side-chain OH of Ser466 and the O7 atom of DT. This supports the role of Glu529 as the acid/base residue acting on the glycosidic bond. Other residues retain the same conformations as in the free state. The side-chains of Asp109, Arg174, Lys207, His208, and Asp300 stabilize the xylose by forming hydrogen bonds (Fig. 2d).
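The distance argument above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the coordinates are hypothetical placeholders chosen only to give a 5.5 Å separation (they are not taken from PDB entry 6JBS), and the reference separations of roughly 5.5 Å for retaining and roughly 10 Å for inverting glycosidases are general textbook approximations rather than values reported in this article.

```python
import math

# Approximate separations between the two catalytic carboxylates.
# Textbook rules of thumb, assumed here for illustration only.
RETAINING_TYPICAL = 5.5   # angstrom
INVERTING_TYPICAL = 10.0  # angstrom


def distance(a, b):
    """Euclidean distance between two (x, y, z) points in angstrom."""
    return math.dist(a, b)


def closer_mechanism(separation):
    """Return the mechanism whose typical separation is nearer to the input."""
    to_retaining = abs(separation - RETAINING_TYPICAL)
    to_inverting = abs(separation - INVERTING_TYPICAL)
    return "retaining" if to_retaining <= to_inverting else "inverting"


# Hypothetical coordinates, NOT from the deposited structure.
asp300_carboxylate = (0.0, 0.0, 0.0)
glu529_carboxylate = (3.3, 4.4, 0.0)

if __name__ == "__main__":
    d = distance(asp300_carboxylate, glu529_carboxylate)
    print(f"separation = {d:.1f} A, closer to a {closer_mechanism(d)} mechanism")
```

In practice the two points would be read from the coordinate file of the structure; the nearest-value comparison is a deliberately crude stand-in for the structural reasoning in the text.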

Compared with the free enzyme, the only prominent conformational change in the overall structure upon XDT binding is the movement of loop Ile222-Gln229, with a maximum displacement of 2.5 Å (Fig. 2b, c). This movement suggests that LXYL-P1-2 adopts an open conformation in the free enzyme and a closed conformation in the substrate-bound state. The DT molecule binds in a pocket formed by several loops: residues 220–232 and 324–328 from domain 1, residues 379–383 from the linker connecting domains 1 and 2, and residues 446–450 and 529–530 from domain 2 (Fig. 3). Therefore, domain 1 and domain 2 together make up the DT binding site. Besides the hydrogen bond between O7 of DT and Glu(Gln)529, no other hydrogen bonds or electrostatic interactions were observed between LXYL-P1-2 and DT. Leu220, Ile222, Ile224, Val227, Ile232, Phe324, Leu325, Ala327, Leu328, Ile379, Thr383, and Ala447 form the hydrophobic wall of the DT binding pocket. The attached oligosaccharide chains do not appear to contact the substrate, which suggests that glycosylation does not directly contribute to the enzymatic activity.

Fig. 3

Sequence alignment. The sequence alignment was performed on the T-Coffee online server23, and the ESPript server was used to display the result24. Cysteine residues are marked with green numbers, and the DT-specific residues are marked with red dots. The XDT binding and catalytic regions are highlighted in yellow.

Site-directed mutations of the XDT binding region

Based on the enzyme-substrate complex structure, a number of residues at the xylose and DT binding sites were subjected to site-directed mutagenesis to examine their influence on enzyme activity. Ala-scanning mutations were carried out, and the activities of the mutants are summarized in Fig. 2e. Unsurprisingly, the mutants of the conserved catalytic residues were all inactive or showed very low activity. Decreased activities were observed for the F324A, L325A, and L328A mutants; these residues are located in the recognition area for the benzoate group of DT. Mutation S228A destroyed the potential hydrogen bond between OG-Ser228 and N-Tyr221 (3.0 Å in the substrate-free structure, 4.5 Å in E529Q), making it easier for loop Ile222-Gln229 to move. Mutation S449A might provide more hydrophobicity in the DT binding pocket. Surprisingly, mutation S91A, which removes the hydrogen bond between Ser91 and Asp109, also increased the enzyme activity. This might indicate that Asp109, which interacts with xylose, needs more flexibility during catalysis.

Discussion

To our knowledge, the crystal structures of LXYL-P1-2 provide the first detailed structural information on a xylosidase capable of hydrolyzing 7-β-xylosyl-10-deacetyltaxol. Structures similar to LXYL-P1-2 were identified using the Dali server8. As expected, the overall fold of LXYL-P1-2 resembles that of some three-domain GH3 members. The RMSD of main-chain atoms is 1.3 Å when LXYL-P1-2 is superimposed on the most similar structure, TnBgl3B9. In the structure of LXYL-P1-2, eight cis-peptide bonds are found at Asp109-Gly110, Ala159-Pro160, Gly164-Pro165, Lys207-His208, Trp209-Ile210, Met319-Pro320, Ala381-Pro382, and Leu419-Pro420. All of these cis-peptide bonds are conserved in many GH3 members except Ala381-Pro382, which makes a sharp turn in the DT binding pocket. In addition, two intramolecular disulfide bonds are formed by Cys266-Cys277 and Cys444-Cys451. These disulfide bonds are also conserved in most GH3 members. Structure comparison also shows that the conformations of the residues around the sugar binding site, including the catalytic nucleophile Asp300 and the acid/base Glu529, are strictly conserved (Fig. 2d). This suggests that LXYL-P1-2 shares the common hydrolytic mechanism of the GH3 family.

Interestingly, the xylose ring is rotated by about 60 degrees compared with the glucose found in other GH3 enzyme structures, such as HjCel3A10. As shown in Fig. 2d, the OH-1, -2, and -3 groups of xylose correspond to the OH-2, -3, and -4 groups of glucose. The OH-1 of glucose points out of the active site, where the +1 group of a substrate could be linked. In LXYL-P1-2, the OH-1 group of xylose points away from DT, and the distance between C1 of xylose and O7 of DT is 4.3 Å. If the xylose were oriented like the glucose in HjCel3A, the distance between C1 and O7 would be only 3.0 Å, a good position for forming the glycosidic bond. Therefore, the glucose soaked into HjCel3A crystals might mimic the substrate binding state, whereas the xylose in LXYL-P1-2, which is generated by hydrolysis of XDT, might represent the product binding state. Although the C1 atom of xylose in LXYL-P1-2 and the C1 atom of glucose in HjCel3A are at different positions, both are close to the nucleophile residue (Asp300 in LXYL-P1-2 and Asp236 in HjCel3A, with C1-OD1 distances of 2.7 Å and 2.8 Å, respectively).
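The correspondence between the hydroxyl positions of xylose and glucose follows from treating the pyranose ring as an idealized hexagon: a rotation of about 60 degrees moves every ring atom onto the site of its neighbor. The sketch below illustrates only this bookkeeping. The function name `site_after_rotation` is a hypothetical helper, the ring is idealized as a regular hexagon, and the direction of rotation is assumed so that it reproduces the correspondence stated in the text.

```python
# Ring atoms of a pyranose in order around the six-membered ring.
RING_ATOMS = ["C1", "C2", "C3", "C4", "C5", "O5"]


def site_after_rotation(atom, steps=1):
    """Return the ring site (named after the reference sugar's atoms) that
    `atom` occupies after rotating the ring by `steps` x 60 degrees.

    Assumes an idealized regular hexagon, so one 60-degree step moves each
    atom onto the site of the next atom in RING_ATOMS.
    """
    index = RING_ATOMS.index(atom)
    return RING_ATOMS[(index + steps) % len(RING_ATOMS)]


if __name__ == "__main__":
    # Xylose carbons bearing OH-1, OH-2 and OH-3, and the glucose sites
    # they coincide with after one 60-degree step.
    for carbon in ("C1", "C2", "C3"):
        print(f"xylose {carbon} -> glucose {site_after_rotation(carbon)} site")
```

Under these assumptions the carbons bearing OH-1, OH-2, and OH-3 of xylose land on the glucose C2, C3, and C4 sites, which matches the correspondence described for Fig. 2d.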

In the proposed retaining mechanism of glycosidases, a covalent intermediate is expected to form between the sugar group of the substrate and the nucleophile residue. The structures of xylose and glucose described above might depict the two states before and after the covalent intermediate stage. To our knowledge, this might be the first report describing the rotation of the sugar ring between the substrate-binding and product-forming states.

Besides the comparison above, the tetramer formation attracted our attention. The catalytic pocket of each monomer faces the outside of the tetramer. Tetramer formation does not appear to affect the active pocket, but may contribute to the high thermal stability of LXYL-P1-2. Two interfaces are defined in the LXYL-P1-2 tetramer. On interface A, His118, Tyr390, Asn392, Arg394, Arg426, Tyr436, Glu479, Gln482, Asn489, and Glu491 form a number of hydrogen bonds between the two monomers, while Y473-I486*, L478-Y436*, and L456-V438* (* indicates a residue from the neighboring monomer) form hydrophobic pairs. On interface B, Arg678-Tyr363* and Asp675-Lys66* form hydrogen bonds. In addition, van der Waals interactions of Ser62-Trp679*, Thr368-Phe710*, and Asn369-Val757* also contribute to the interface. Interface A appears to be conserved in AaBGL1 (Fig. 1b). However, in AaBGL1 the attached oligosaccharides are involved in dimer formation11, whereas in LXYL-P1-2 there is no glycan modification close to this interface. Interface B has not been reported previously in the GH3 family. KmBglI12 and AoβG13 were reported to be tetramers, but their dimer-dimer interfaces differ from that in LXYL-P1-2.

Compared with other GH3 enzymes, the sequence and structural variations come mainly from the loops, especially those forming the substrate binding pocket (Fig. 3). The loops comprising residues 220–232, 324–328, 379–383, 446–450, and 529–530 form the DT binding pocket. As shown in the sequence alignment, the first four loops are conserved between LXYL-P1-1 and LXYL-P1-2 but are quite different from those of other GH3 enzymes. In fact, when the structures of other GH3 enzymes are superimposed on LXYL-P1-2, some of their residues occupy the position of the bound DT, such as Val298 and Phe29 in TnBgl3B9; Trp37, Phe260, and Trp443 in HjCel3A10; and Trp274, Leu295, and Tyr510 in KmBglI12. The DT specificity results from the loop sequences, whose residues contribute hydrophobic side-chains towards the bound DT molecule. This suggests that the pocket size, shape, and hydrophobic environment are critical to DT recognition. As shown in Fig. 2e, mutants in which a large hydrophobic side-chain is removed show decreased enzymatic activity. In contrast, the mutations S91A and S449A of surrounding residues, which increase the hydrophobicity, could improve the catalytic ability.

To confirm the importance of the DT-binding loops, we purified recombinant TnBgl3B, which shares the same sugar binding site but has different DT-binding loops, and tested its activity on PNP-Glc, PNP-Xyl, and XDT. The results showed that although TnBgl3B exhibited considerable activity on PNP-Glc and PNP-Xyl, its activity on XDT was undetectable (Table 3). To find more candidate enzymes with XDT xylosidase activity, we searched the available genome sequences using the ExPASy BLAST server [https://web.expasy.org/blast/]. No proteins other than LXYL-P1-1 and LXYL-P1-2 were found to have similar DT-preferring loops. Therefore, LXYL-P1-1 and LXYL-P1-2 seem to be the only enzymes available to date that could be applied in DT production. The mutants with increased activity show the potential for engineering better enzymes in the future.

Table 3 Specific activity of TnBgl3B and LXYL-P1-2 against PNP-Xyl, PNP-Glc and XDT.

Another key point of this research is the Taxol binding site in LXYL-P1-2. Because of its poor solubility in water, Taxol is usually formulated in a mixture of Cremophor EL and dehydrated ethanol, which, however, may have severe side effects on patients (Ma and Mumper14). Therefore, specific carrier proteins are needed for developing Taxol delivery systems. To date, Nab-paclitaxel (Abraxane®, approved by the FDA in 2005) is the first FDA-approved taxoid formulation based on an albumin nano-delivery system, and a number of novel Taxol nanoparticle formulations are in clinical trials14. The structure of LXYL-P1-2 bound with XDT may provide useful information for designing new Taxol binding proteins.

In the Protein Data Bank, Taxol is found only in tubulin/Taxol complexes, with the highest resolution being 3.5 Å. Our study presents the first high-resolution structure of a Taxol analog bound to a protein. To search for possible common recognition features for DT or Taxol binding, the complexes of LXYL-P1-2/XDT and tubulin/Taxol15 were compared. Although the overall structures of LXYL-P1-2 and tubulin are totally different, the conformations of the DT and Taxol main bodies are superimposable (Fig. 4a, c), and two similar binding pockets could be identified, although the groups bound in them are swapped.

Fig. 4

a Comparison of the overall structures of LXYL-P1-2 (PDB code 6KJ0, green) and tubulin (PDB code 1JFF, cyan). DT and Taxol are shown as yellow stick models. b Comparison of the recognition pockets for the two benzene rings of DT/Taxol in LXYL-P1-2 (PDB code 6KJ0, green) and tubulin (PDB code 1JFF, cyan). DT and Taxol are shown as yellow stick models. The residues involved in DT/Taxol binding are shown as sticks. c Superposition of the DT/Taxol binding pockets of LXYL-P1-2 (green) and tubulin (cyan). DT is shown as a yellow stick model. Taxol is shown as model color stick. The residues involved in DT/Taxol binding are shown as sticks. Labels for tubulin residues are shown in parentheses.

As shown in Fig. 4b, the loop comprising residues 220–232 in LXYL-P1-2 forms the first pocket, which recognizes the benzoylamino group of DT. The non-polar residues Ile222, Ile224, Ile232, and Val227 form a semi-enclosed hydrophobic pocket that half surrounds the benzoylamino group, while the side-chain of Gln229 and the main-chain O of Val227 form the hydrophilic area at the bottom of the pocket. Similarly, the pocket for benzoate binding in tubulin is also a semi-enclosed hydrophobic pocket, formed by Leu217, Leu219, Leu230, and Leu275, while Asp226 and His229 form the hydrophilic bottom.

Loop 324–328 of LXYL-P1-2 forms the second pocket, which binds the benzoate group of DT. This pocket is formed by the side-chain benzene ring of Phe324, the side-chain methyl group of Ala327, the side-chain of Leu325, and the CG2 atom of Thr383. Tubulin has a similar pocket, which is formed by Phe272, Ala233, Val23, and Pro360. In tubulin, however, this pocket binds the phenylpropanoyl group rather than the benzoate.

Further structural analysis indicates that Ile222, Ile224, Ile232, Val227, and Gln229 in LXYL-P1-2 have a spatial distribution similar to that of Leu217, Leu219, Leu230, Leu275, and Asp226 in tubulin (Fig. 4c). Interestingly, the three benzene rings of DT and Taxol also occupy the same spatial positions when superposed (Fig. 4c), which partially supports a conserved spatial distribution of binding pockets in Taxol binding. Besides the structural information, the results of our enzyme activity experiments demonstrate that Phe324 and Leu325 in the benzoate recognition region are critical for substrate recognition (Fig. 2e). Therefore, this spatial distribution of hydrophobic residues would create an interface with two pockets specific for Taxol binding. This is the first structure of a protein other than tubulin bound to a Taxol analog, and LXYL-P1-2 might be the only type of enzyme known to act on XDT; the structure may therefore provide useful information for the design of Taxol analogs.

Methods

Materials and strains

The plasmid pPIC3.5K-lxyl-p1-2 was constructed in our laboratory5, 16. The TnBgl3B gene (GenBank: ABI29899.1) was synthesized by the SynBio Research Platform at Tianjin University (Tianjin, China). Phusion polymerase, restriction enzymes, and T4 ligase were purchased from New England Biolabs (Ipswich, MA). Escherichia coli Transetta (DE3) competent cells and plasmids were purchased from TransGen Biotech (Beijing, China). The pET-28a plasmid was purchased from Novagen (Malaysia). XDT and DT (HPLC purity >99%) were purified in our laboratory. All other chemicals were of analytical grade unless otherwise indicated.

Construction of active-site mutants of LXYL-P1-2 and TnBgl3B recombinant strain

The plasmid pPIC3.5K-lxyl-p1-2 was used as the DNA template. The alanine-scanning mutations and other active-site mutations were all obtained by site-directed mutagenesis with Phusion High-Fidelity DNA Polymerase (NEB) using whole-plasmid amplification PCR. All products were sequenced to ensure that no base changes other than the designed ones were present. The plasmids were extracted, linearized with Sac I, and transformed into Pichia pastoris GS115 competent cells for expression.

The full-length TnBgl3B gene was amplified using a forward primer (F: 5′-AAGGATCCATGGAAAAGGTTAACGAGATC-3′) containing a BamH I site (GGATCC) and a reverse primer (R: 5′-TAGCGGCCGCTTAAGGCTTGAATCTTCTC-3′) containing a Not I site (GCGGCCGC). The PCR products were digested with BamH I and Not I for directional ligation into the vector pET-28a. After ligation, the construct was sequenced and transformed into E. coli Transetta (DE3) competent cells for expression.
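As a quick sanity check on the primer design, the restriction sites can be located in the primer sequences programmatically. The sketch below is illustrative only: the primer sequences are copied from the text, the recognition sequences GGATCC (BamH I) and GCGGCCGC (Not I) are the standard ones for these enzymes, and the names `SITES`, `FORWARD`, `REVERSE`, and `find_site` are hypothetical.

```python
# Standard recognition sequences of the two restriction enzymes.
SITES = {"BamH I": "GGATCC", "Not I": "GCGGCCGC"}

# Primer sequences as given in the text (5' to 3').
FORWARD = "AAGGATCCATGGAAAAGGTTAACGAGATC"
REVERSE = "TAGCGGCCGCTTAAGGCTTGAATCTTCTC"


def find_site(primer, site):
    """Return the 0-based index of the first occurrence of `site` in
    `primer`, or -1 if the recognition sequence is absent."""
    return primer.upper().find(site.upper())


if __name__ == "__main__":
    for label, primer, enzyme in (
        ("forward", FORWARD, "BamH I"),
        ("reverse", REVERSE, "Not I"),
    ):
        index = find_site(primer, SITES[enzyme])
        print(f"{label} primer: {enzyme} site at index {index}")
```

Both sites sit two bases in from the 5′ end, leaving a short overhang of extra bases before each recognition sequence.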

Protein expression, purification and activity assay

The heterologous expression and deglycosylation of LXYL-P1-2 and its mutants were performed as described previously5, 16. E. coli cells carrying the pET-28a-TnBgl3B recombinant plasmid were grown overnight at 37 °C and 200 r.p.m. in 10 ml Luria-Bertani (LB) medium containing kanamycin (50 μg/ml) in a shaking flask. The overnight culture was inoculated into 100 ml fresh LB medium at a final concentration of 1% (v/v) and grown at 37 °C and 200 r.p.m. for 2–3 h until the OD600 reached 0.8. Then, isopropyl-β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 1 mM, and the culture was incubated for an additional 20 h at 24 °C and 200 r.p.m. The protein was purified using Ni Sepharose 6 Fast Flow resin and an Agilent ZORBAX GF-450 gel-filtration column. The enzyme activities of LXYL-P1-2 were tested as reported previously5.
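The volumes implied by this protocol follow from simple dilution arithmetic (C1 × V1 = C2 × V2). The following sketch works through them under stated assumptions: the 1 M IPTG stock concentration is hypothetical (the text gives only the final concentration), the small volume change caused by the additions is ignored, and the function names are illustrative.

```python
def inoculum_volume_ml(culture_volume_ml, percent_v_v):
    """Volume of overnight culture needed for a given % (v/v) inoculum."""
    return culture_volume_ml * percent_v_v / 100.0


def stock_volume_ul(final_conc_mM, culture_volume_ml, stock_conc_mM):
    """Volume of stock (in microliters) giving the desired final
    concentration, from C1 * V1 = C2 * V2. The volume added is assumed
    to be negligible relative to the culture volume."""
    volume_ml = final_conc_mM * culture_volume_ml / stock_conc_mM
    return volume_ml * 1000.0


if __name__ == "__main__":
    # 1% (v/v) inoculum into 100 ml of fresh LB medium.
    print(f"inoculum: {inoculum_volume_ml(100, 1):.1f} ml")
    # 1 mM IPTG in 100 ml, assuming a hypothetical 1 M (1000 mM) stock.
    print(f"IPTG stock: {stock_volume_ul(1, 100, 1000):.0f} ul")
```

With these assumptions, the 100 ml culture needs about 1 ml of overnight culture and about 100 μl of the assumed 1 M IPTG stock.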

Crystallization, data collection, and structure determination

LXYL-P1-2 protein was concentrated to 10 mg/ml for crystallization. Both the native and the complex crystals were grown at 16 °C by the hanging-drop vapor diffusion method. For complex crystallization, the E529Q protein was mixed with XDT before the drops were set up. The native crystals grew in a solution containing 13% PEG3350, 0.1 M Tris-HCl pH 8.5, and 0.2% MgCl2, whereas the complex crystals grew in a solution containing 15% PEG3350, 0.1 M Hepes pH 7.5, and 0.2% MgCl2. Diffraction data were collected at beamline BL17U of the Shanghai Synchrotron Radiation Facility (SSRF)17. Data were processed using HKL200018. The native crystal diffracted to 2.4 Å (PDB code 6JBS), while the E529Q-XDT complex crystal diffracted to 2.27 Å (PDB code 6KJ0).

The native structure was solved by molecular replacement using Phaser19 in the CCP4 suite20, with the TnBgl3B structure (PDB code 2X40) as the search model. The complex structure was solved using the native structure as the search model. Refmac5 was used for structure refinement21. Coot was used for model building22. All structure figures were prepared using PyMOL (https://pymol.org).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.