Introduction

Endo-β(1 → 4)-mannanases (endomannanases, EC 3.2.1.78) are important enzymes, catalysing the degradation of abundant plant β-mannans (hereafter mannan) in nature. Endomannanases are currently used in various applications including plant biomass conversion1, food and feed2,3, detergent formulations4 and oil drilling5. An understanding of the intimate interactions between endomannanases and their substrates is key to optimising their utilisation and industrial performance. Mannan is an abundant type of hemicellulose in nature, primarily found in the secondary plant cell walls of softwood (coniferous trees). Mannans also serve as storage polysaccharides in certain seeds6. Mannans are composed of a linear backbone containing D-mannopyranosyl residues (linear mannans) or D-mannopyranosyl and D-glucopyranosyl residues organised in an alternating manner (glucomannans) linked by β-(1 → 4)-linkages. The backbone can be decorated with α-(1 → 6)-linked D-galactopyranosyl groups (galactomannans or galactoglucomannans) and acetyl groups6,7,8 (examples of galactomannans are shown in Fig. 1). In the secondary plant cell walls of softwood, acetylated galactoglucomannans comprise approximately 25% of the wood dry matter9,10,11. Guar gum, produced from the seeds of the guar plant (Cyamopsis tetragonolobus), and locust bean gum, produced from the seeds of the carob tree (Ceratonia siliqua), are significant sources of galactomannans. Guar gum contains more galactopyranosyl groups (Gal:Man, 1:2) than locust bean gum (Gal:Man, 1:4)6. In locust bean gum, the distribution of galactopyranosyl side-groups is irregular with a high proportion of unsubstituted blocks, whereas in guar gum, the galactopyranosyl groups are more ordered and found mainly in pairs and triplets with few non-substituted regions12 (Fig. 1).

Figure 1

Schematic illustration of the two galactomannans (A) guar gum and (B) locust bean gum, with different degree and pattern of galactose substitutions on the β-mannan backbone12. Sugars shown using the Consortium for Functional Glycomics notation59. Both polymers continue towards the reducing end, having a degree of polymerization around 1500 for locust bean gum and 900 for guar gum12.

Endomannanases are the main enzymes that catalyse depolymerisation of mannan. Endomannanases catalyse cleavage of the β-(1 → 4)-linkages in mannans to produce mannooligosaccharides which may be further processed by e.g. the exo-acting β-mannosidases and α-galactosidases. Soluble substrates are often accessible to all these enzymes, but attack on mannan by endomannanases may also occur on water-insoluble substrate matrices1,2,13. Endomannanases are classified into four glycoside hydrolase (GH) families: 5, 26, 113 and 134 based on sequence similarity14. Endomannanases from families 5, 26, and 113 belong to clan GH-A and share the (β/α)8-TIM barrel fold and catalytic machinery, and catalyse the cleavage of the O-glycosidic bonds in the mannan backbone with net retention of the anomeric configuration15,16,17. In contrast, the newly identified GH134 endomannanases have a lysozyme-like fold and catalyse the hydrolysis of the mannan backbone via an inverting mechanism18. Fungal endomannanases known to date are predominantly categorised in family GH5 with a few in family GH26. Several GH26 endomannanases from different organisms have been characterised (e.g. CfimMan26A from Cellulomonas fimi (2BVY)19, CjapMan26A (1J9Y)20 and CjapMan26C (2VX6)21 from Cellvibrio japonicus, BovaMan26A (4ZXO) and BovaMan26B from Bacteroides ovatus22 and RspeMan26A from a symbiotic protist of the termite Reticulitermes speratus (3WDR)23). Fewer studies have focused on the fungal GH26 enzymes and only one crystal structure is available, namely that of PansMan26A from Podospora anserina, 3ZM824, which carries a family 35 carbohydrate-binding module (CBM35). PansMan26A and the GH26 endomannanase from Aspergillus nidulans, AnidMan26A, were shown to have a significant −4 subsite, and to accommodate galactopyranosyl units not only in the −1 subsite, but also in −2 and +1, in contrast to the GH5 counterparts from A. nidulans AnidMan5A and AnidMan5C24,25,26.
Several fungal GH26 endomannanases were found to have higher initial rates on soluble galactomannans than the tested GH5 endomannanases, with the GH26 endomannanase from Yunnania penicillata, YpenMan26A, having the highest initial hydrolysis rate, closely followed by AnidMan26A and the GH26 endomannanase from Westerdykella sp, Wsp.Man26A1. However, the tested fungal GH26 endomannanases discriminated differently between the soluble mannans1, exemplified by the YpenMan26A and the Wsp.Man26A which both had high initial hydrolysis rates on locust bean gum, but different rates on more heavily substituted galactomannan. While YpenMan26A also showed high hydrolysis rate on guar gum, Wsp.Man26A appeared more restricted by the extra galactose substitutions.

Most fungal GH26 endomannanases have a CBM3524,26,27; a CBM family known to include members that bind β-mannans, uronic acids, β-1,3-galactan or α-1,6-galactopyranosyl residues on carbohydrate polymers28,29. The binding site of CBM35s has been reported to be located in between the loops connecting the β-strands and not on the concave surface presented by the β-strands28,29.

In the present study, the Michaelis-Menten kinetic parameters for YpenMan26A were determined, the crystal structure in complex with a galactomannooligosaccharide was solved, and the amino acids involved in substrate interactions identified. The structure of this unusual fungal wild-type enzyme with no CBM35 was compared to the known PansMan26A structure harbouring a CBM35 and by sequence alignment to seven other fungal GH26 endomannanases. The roles of selected substrate binding amino acids were evaluated from two YpenMan26A mutants, D37T and W110H. The mutations were inspired by the sequence of Wsp.Man26A, an endomannanase seemingly more restricted by galactose substitutions than YpenMan26A.

Results

Y. penicillata possesses at least one protein with endomannanase activity1 (GenBank sequence ID AYU65281). This enzyme, studied in the current paper, has a signal peptide and a GH26 catalytic domain, but no CBM, in contrast to most known fungal GH26 endomannanases, which carry a CBM351,24,27. A gene encoding the catalytic domain, named YpenMan26A, was cloned and expressed in Aspergillus oryzae. Based on a sequence alignment with the sequence of PansMan26A, the two catalytic residues (previously identified for GH26 enzymes30,31), Glu165 and Glu257 in YpenMan26A were identified, with Glu257 being the nucleophile, performing the nucleophilic attack on an anomeric carbon in the mannan backbone, and Glu165 the acid/base, which serves as proton donor and later deprotonates the glycosyl acceptor in the first and second step of the retaining catalytic mechanism respectively15,32. This mechanism is characteristic of Clan GH-A glycosyl hydrolases, such as GH26 endomannanases15. The Michaelis-Menten kinetic parameters with locust bean gum and guar gum were determined for YpenMan26A. Interestingly, the kcat on guar gum (636 s−1) was found to be higher than that on locust bean gum (475 s−1). Previous studies reported a decrease in hydrolytic rate of endomannanases going from less to more substituted galactomannans, such as from locust bean gum to guar gum19,22,33. It is thought that the galactose substitutions cause steric hindrance, making the mannan backbone less accessible to the enzyme6,34. As expected, the KM was also higher on guar gum (2.2 mg/ml) than on locust bean gum (0.6 mg/ml) and the kcat/KM therefore lower on guar gum (289 ml/(mg·s)) than on locust bean gum (792 ml/(mg·s)). Motivated by the desire to see how this enzyme accommodates and interacts with the galactopyranosyl groups in galactomannan, we sought to determine the crystal structure of YpenMan26A in complex with a galactomannooligosaccharide.
A YpenMan26A acid/base substituted variant, E165Q, was made using synthetic oligonucleotides and PCR, replacing the codon GAG at position 165 with CAG. The variant was synthesised and expressed in Aspergillus oryzae. N-Deglycosylation of the purified wild type and the E165Q YpenMan26A mutant using Endoglycosidase H resulted in a small shift (~5 kDa) in the apparent molecular mass on SDS-PAGE (Fig. S1). These results confirm that YpenMan26A is N-glycosylated, in agreement with the GPMAW (Lighthouse data) prediction.
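The catalytic efficiencies quoted above follow directly from the reported kcat and KM values. The short Python sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative choices, not part of the study's analysis pipeline.

```python
# Kinetic parameters for wild-type YpenMan26A as reported in the text.
# kcat is in s^-1 and KM in mg/ml, so kcat/KM is in ml/(mg*s).
PARAMS = {
    "guar gum": {"kcat": 636.0, "km": 2.2},
    "locust bean gum": {"kcat": 475.0, "km": 0.6},
}


def catalytic_efficiency(kcat, km):
    """Return kcat/KM for a positive KM."""
    if km <= 0:
        raise ValueError("KM must be positive")
    return kcat / km


efficiencies = {
    substrate: catalytic_efficiency(p["kcat"], p["km"])
    for substrate, p in PARAMS.items()
}

for substrate, value in efficiencies.items():
    print(f"{substrate}: kcat/KM = {value:.0f} ml/(mg*s)")
```

Although kcat is higher on guar gum, the roughly 3.7-fold higher KM outweighs it, which is why the efficiency is lower on the more substituted substrate.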

Structure of YpenMan26A

The structure of the deglycosylated YpenMan26A acid/base substituted variant E165Q, in complex with an α-62-61-di-galactosyl-mannotriose (MGG), was solved by molecular replacement using the known structure of PansMan26A24 as template, and refined at 1.36 Å resolution (Table 1). A YpenMan26A E165A variant was also cloned but this variant was not successfully expressed. Neither the active YpenMan26A nor the E165Q mutant crystallized as apoenzymes, suggesting that ligand binding resulted in increased stability and/or conformational changes leading to successful crystallogenesis. The YpenMan26A chain can be traced from Ala1 to Val312 without breaks, and forms a (β/α)8-barrel fold (Fig. 2A) as expected. The active site was identified in the groove with the conserved catalytic residue Glu165 (acid/base) mutated to Gln, and the conserved catalytic residue Glu257 (nucleophile) (Fig. 2A)30,31, equivalent to those observed in PansMan26A24. The low-activity YpenMan26A E165Q variant showed an initial rate of hydrolysis of locust bean gum of 40 U/µmole enzyme (equivalent to a “turnover rate” of 0.7 s−1), which was roughly 360-fold lower than the rate exhibited by the wild type enzyme (15050 U/µmole enzyme, equivalent to a “turnover rate” of 251 s−1). 1 U was defined as the amount of endomannanase (in moles) required to release 1 µmole of reducing ends per minute, under the assay conditions specified in Methods. The low activity of the E165Q variant may be a consequence of the acid/base, and not the nucleophile, being substituted. An alternative explanation may be the small risk (about 1/100) of translational misreading error or mis-incorporation of the wrong amino acid (as reported for E. coli)35, since the E165Q variant was made with only a single base change from codon GAG (Glu) to CAG (Gln). There is a single N-glycosylation site at Asn103, located on the external side of the barrel, with a residual N-acetylglucosamine (GlcNAc).
As expected, YpenMan26A shows the highest structural similarity to other endomannanases (of fungal, bacterial and protist origin) in family GH26 (Table 2). Judged from the Z-score (used by the DALI protein structure comparison server36 for ranking of structural matches), YpenMan26A has the greatest structural similarity to PansMan26A (3ZM8) followed by RspeMan26C (3WDR) (Table 2).
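The conversion between the initial rates in U/µmole enzyme and the quoted "turnover rates" for the wild type and the E165Q variant is a division by 60, since 1 U corresponds to 1 µmole of reducing ends released per minute. A minimal Python sketch of that conversion follows; the function and variable names are illustrative.

```python
SECONDS_PER_MINUTE = 60.0


def turnover_per_second(units_per_umol_enzyme):
    """Convert U/umol enzyme (umol product per min per umol enzyme) to s^-1."""
    return units_per_umol_enzyme / SECONDS_PER_MINUTE


# Initial rates on locust bean gum as reported in the text.
wild_type = turnover_per_second(15050.0)
e165q = turnover_per_second(40.0)

print(f"wild type: {wild_type:.0f} s^-1")
print(f"E165Q:     {e165q:.1f} s^-1")
print(f"fold reduction (unrounded rates): {wild_type / e165q:.0f}")
```

With unrounded rates the ratio comes out at about 376; dividing the rounded turnover rates (251 and 0.7 s−1) gives about 359, which matches the "roughly 360-fold" figure in the text.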

Table 1 Data collection and refinement statistics of YpenMan26A.
Figure 2

(A) The structure of YpenMan26A (blue) superimposed with that of PansMan26A (3ZM824, gold). The α-62-61-di-galactosyl-mannotriose (MGG) ligand in YpenMan26A (subsites −4 to −2) is shown as green cylinders and the active residues are shown in shades of pink. (B) Observed electron density for MGG in the −4 to −2 subsites. The positive electron density REFMAC Fo − Fc map, contoured at 3.5 σ (0.37 e Å−3) is shown in blue, with phases calculated prior to the incorporation of any ligand atoms in refinement. (C) The organisation of binding subsites and the MGG ligand in the −4 to −2 subsites of YpenMan26A (blue) compared with PansMan26A (gold). PansMan26A residues are only shown for those residues which differ from YpenMan26A. All panels were drawn using CCP4mg54.

Table 2 The five closest structural matches to YpenMan26A, calculated using the DALI protein structure comparison server36 (excluding duplicates).

Ligand binding to YpenMan26A

Crystals of YpenMan26A E165Q were obtained in the presence of α-64-63-di-galactosyl-mannopentaose (MGGMM) with the aim that the oligosaccharide would span the catalytic site. However, the electron density of the ligand was modelled as MGG situated in the −4 to −2 subsites (Fig. 2C). Since YpenMan26A E165Q was not completely inactive, it is likely that the residual activity has caused hydrolysis of the MGGMM between the backbone monomers in the −1 and +1 subsites, after which MGG migrated to span the subsite −4 to −2, indicating high ligand affinity in these subsites. This is supported by the observation that wild type YpenMan26A also produces MGG as a major hydrolysis product (discussed further below). The electron density of MGG is clear and unambiguous, except for the galactopyranosyl unit in the −3 subsite, which points out of the binding cleft (Fig. 2B). The B values for the galactopyranosyl residue in the −3 subsite are also higher (between 34–63 Å2 for the C atoms), than for the galactopyranosyl unit in the −2 subsite (between 17–30 for the C atoms) or for the mannopyranosyl moieties (between 14–28 for the C atoms). All the interactions between the enzyme and the ligand are clearly defined, except for the flexible galactopyranosyl unit. There is electron density present near the mutated Q165, which is remote from the ligand, and was described as acetate, which fits the density well. There was no acetate in the crystallisation buffer, but most probably it was a contaminant during purification or crystallisation, or was present in the cell growth media, similar to the unknown ligand described as propionate in 5G4Z37.

Like PansMan26A, YpenMan26A has eight large loops that form a deep cleft at the active centre and are involved in binding of the substrate: loop 1 (36–39), loop 2 (60–73), loop 3 (95–131), loop 4 (166–179), loop 5 (207–211), loop 6 (227–235), loop 7 (259–263), and loop 8 (279–291). The −1 and +1 subsites of YpenMan26A are similar to other fungal and bacterial GH26 endomannanases (e.g. PansMan26A, CjapMan26A, CfimMan26A19,20,24) with the conserved residues His164, Trp170, Phe171, Tyr227, Trp279 (Fig. 2C). As described for the homologous enzymes19,20,24, YpenMan26A Tyr227 is involved in a hydrogen bond with the catalytic nucleophile Glu257 whilst the aromatic amino acids Trp170 and Trp279 stabilise the mannopyranose rings at the −1 and +1 subsites, respectively (Fig. 2C). Like PansMan26A, YpenMan26A displays a prominent −4 subsite, with stacking interactions between the mannopyranose ring and two aromatic residues, Trp109 and Trp110, and hydrogen bonds between Asp61, Arg66 and the mannopyranose ring (Fig. 2C). The −3 subsite appears more weakly bound as judged from the ligand-enzyme interactions. In the −2 subsite the two aromatic residues, Phe113 and Tyr114, equivalent to Phe248 and Tyr249 in PansMan26A, stabilise the interactions with the mannopyranose unit. Previously, enzyme interactions with a galactopyranosyl substituent attached to a mannopyranosyl unit within the −1 subsite of CjapMan26C have been described21. Interestingly, because of the captured ligand in the present study, it is possible to identify interactions between the galactopyranose unit and YpenMan26A in the −2 subsite not previously described. Gln36, Asp37, and Asp58 are involved in hydrogen bonds with the galactose residue. Asp37 has a double conformation in the crystal structure, possibly because the amino acid conformation shifts upon ligand binding.
PansMan26A has a Glu172 instead of the Asp37 in YpenMan26A, but otherwise the enzymes have essentially identical environments for interactions with the galactose residue. Out of the six closest structural matches (Table 2), only PansMan26A (3ZM8) accommodates galactopyranosyl residues in the −2 subsite like YpenMan26A. A surface view of YpenMan26A and CjapMan26C (2VX6) with their ligands superimposed (the MGG from YpenMan26A and a bound α-63-galactosyl-mannotetraose (MGMM) in the −2 to +2 subsite of CjapMan26C) shows that the ligands overlap nicely. The data thus indicate accommodation of galactopyranosyl residues in the −3, −2 and −1 subsites of both enzymes (Fig. S2). These superimpositions show that CjapMan26C does not accommodate the galactopyranosyl unit in the −2 subsite, where the moiety is pointing into the enzyme structure, whereas YpenMan26A accommodates galactopyranosyl moieties in −3, −2 and −1 (Fig. S2). The data also show that YpenMan26A has a more open active site than CjapMan26C (Fig. S2).

Design of two YpenMan26A variants – inspired by Wsp.Man26A

A sequence similarity search with the YpenMan26A sequence, using the NCBI protein-protein BLAST (Basic Local Alignment Search Tool at http://www.ncbi.nlm.nih.gov/BLAST/, against the non-redundant protein sequences database)38, identified the A. nidulans GH26 endomannanase (Swissprot ID Q5AWB726) with 67.5% amino acid identity as the closest characterised enzyme. A multiple sequence alignment of 9 fungal GH26 endomannanases showed that the amino acids that take part in ligand binding in YpenMan26A are highly conserved (Fig. 3, red stars) (see later paragraph for discussion of differences between sequences of the GH26 core domains with and without a CBM35). However, Wsp.Man26A has two striking differences compared to YpenMan26A and the other endomannanases. The first is in the −2 subsite (YpenMan26A Asp37), where the analysed endomannanases have either an Asp or a Glu, while Wsp.Man26A has Thr (Fig. 3).

Figure 3

Sequence alignment of the catalytic GH26 core region from 9 fungal GH26 endomannanases. Secondary structure elements for YpenMan26A and PansMan26A are displayed above and below the alignment respectively. The mutated residues D37 and W110 are marked in lilac, and residues involved in ligand binding in the YpenMan26A structure, including the two catalytic residues, are marked with red stars. The α-helix in PansMan26A (α9) which is nearest the CBM35 and which is a surface loop in YpenMan26A is coloured blue. Identical residues are shown in white on red background. Highly similar residues (when the similarity score assigned to one column is above 0.7) are coloured red and framed in a blue box. The GH26 core sequence of YpenMan26A (AYU65281), AnidMan26A (Q5AWB7), Ascobolus stictoideus AstiMan26A (BBW45412), Collariella virescens CvirMan26A (BBW45415), Mycothermus thermophilus MtheMan26A (MH208368), Neoascochyta desmazieri NdesMan26A (MH208367), Myceliophthora thermophila MtMan26A (99077), Wsp.Man26A (MH208369), PansMan26A (B2AEP0) were aligned by MUSCLE55 and the figure was generated using ESPript 3 Web server56.

The second is in the −4 subsite (YpenMan26A Trp110), where the tested endomannanases have Trp or Tyr, while Wsp.Man26A has His (Fig. 3). von Freiesleben et al.1 showed that YpenMan26A and Wsp.Man26A differ in their substrate preferences for locust bean gum and guar gum. YpenMan26A barely discriminated between the two substrates, whilst Wsp.Man26A had approximately four times higher initial hydrolysis rate on locust bean gum than on guar gum (Fig. 4B, data adapted from von Freiesleben et al.1), indicating that this enzyme was more hindered or had less affinity for the increased amount of galactose substitutions in guar gum. In the present study, the hydrolysis product profiles from full conversion of guar gum were analysed using the DNA sequencer-Assisted Saccharide analysis in High throughput (DASH) method26,39 (Fig. 4A).

Figure 4

(A) Product profiles from guar gum hydrolysis by YpenMan26A and Wsp.Man26A. Aligned electropherograms of product profiles at 30% guar gum conversion (max conversion). Migration of oligosaccharides is given in dextran units (DE). A ladder was run containing: mannose (M1, 0.9 DE), mannobiose (M2, 1.87 DE), mannotriose (M3, 2.85 DE), and α-61-galactosyl-mannotriose (MMG, 3.81 DE). Migration of α-galactosyl-mannose (G, 2.10 DE), and α-62-61-di-galactosyl-mannotriose (MGG, 4.10 DE) was determined by von Freiesleben et al.26. (B) Initial reaction rates (U/µmole) by YpenMan26A and Wsp.Man26A on galactomannans. Data are from von Freiesleben et al.26. Hydrolyses were carried out at 37 °C, pH 5 on guar gum (light grey) and locust bean gum (dark grey). Values are given as mean values ± SD (n = 2). (C) The structure of YpenMan26A with MGG in the −4 to −2 subsites. The two differences in ligand binding amino acids between YpenMan26A and a superimposed homology model of Wsp.Man26A are highlighted in blue and orange, respectively.

YpenMan26A produced primarily α-galactosyl-mannose (G, 2.10 DE) and α-62-61-di-galactosyl-mannotriose (MGG, 4.10 DE), whereas Wsp.Man26A in addition produced M2 and M3. To investigate if the difference in ligand interacting amino acids between YpenMan26A and Wsp.Man26A (Fig. 4C) played a role in the observed differences in substrate preference and binding mode, two YpenMan26A mutants, YpenMan26A D37T and YpenMan26A W110H, were designed, expressed and purified to electrophoretic purity (Table 3 and Fig. S3).

Table 3 The wild-type YpenMan26A and the investigated variants.

Kinetics with galactomannans and MGGMM

The Michaelis-Menten kinetic parameters with locust bean gum and guar gum were determined for the two YpenMan26A mutants D37T and W110H and compared with those reported for the wild type enzymes YpenMan26A and Wsp.Man26A (Table 4). The wild-type YpenMan26A had the highest kcat/KM on both substrates, closely followed by the Wsp.Man26A on locust bean gum. The kinetic data for the two wild-type enzymes show that Wsp.Man26A is more compromised on the heavily substituted guar gum than YpenMan26A; these results corroborate our previous data1. The wild type YpenMan26A and the variant D37T had identical KM values on locust bean gum, but D37T had a higher KM than the wild type enzyme on guar gum. This result indicates that the D37T mutant has lower affinity for the galactose residues in the highly substituted guar gum than the wild type enzyme has. The reason that no difference in KM values was observed on locust bean gum as substrate might be due to the presence of unsubstituted blocks of mannan in the locust bean gum mannan12. It is likely that both the wild type and the D37T variant catalyse the degradation of the unsubstituted, more easily accessible, part of the substrate first, so the initial rate reflects the enzyme affinity for the unsubstituted regions of the substrate. Guar gum is known to have no (or few) blocks without substitutions12. Based on the KM value, the YpenMan26A W110H variant appeared to have very low affinity for locust bean gum, when compared to the other enzymes. On guar gum galactomannan it was not possible to determine the kinetic parameters separately, because saturation was not reached, but the low kcat/KM indicates low affinity or low hydrolysis rate.
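When saturation is not reached, only the ratio kcat/KM can be estimated because the Michaelis-Menten rate law reduces to a linear dependence on substrate concentration for [S] much smaller than KM. The Python sketch below illustrates this using the wild-type YpenMan26A guar gum parameters reported earlier (kcat 636 s−1, KM 2.2 mg/ml); the substrate concentrations are arbitrary illustrative values, not assay conditions from the study.

```python
def mm_rate_per_enzyme(s, kcat, km):
    """Michaelis-Menten rate per enzyme molecule: kcat*[S] / (KM + [S])."""
    return kcat * s / (km + s)


def low_substrate_rate_per_enzyme(s, kcat, km):
    """Linear approximation valid for [S] << KM: (kcat/KM)*[S]."""
    return (kcat / km) * s


# Reported for wild-type YpenMan26A on guar gum (s^-1 and mg/ml).
KCAT, KM = 636.0, 2.2

# Illustrative substrate concentrations in mg/ml (assumed, not from the study).
for s in (0.1, 2.2, 22.0):
    full = mm_rate_per_enzyme(s, KCAT, KM)
    linear = low_substrate_rate_per_enzyme(s, KCAT, KM)
    print(f"[S] = {s:5.1f} mg/ml  MM = {full:7.1f} s^-1  linear = {linear:7.1f} s^-1")
```

In the linear regime the data constrain only the slope kcat/KM, so kcat and KM cannot be separated by fitting, which is the situation described for the W110H variant on guar gum.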

Table 4 Kinetic parameters on locust bean gum and guar gum of the wild-type enzymes YpenMan26A and Wsp.Man26A and the variants YpenMan26A D37T and YpenMan26A W110H.

In addition, for Wsp.Man26A, substrate saturation was not fully reached, especially not on guar gum, resulting in relatively high standard deviation. R2 values for the fitted Michaelis-Menten curve for Wsp.Man26A were 0.90 and 0.91 on locust bean gum and guar gum, respectively.

To validate that the increase of KM for YpenMan26A D37T on the highly substituted guar gum galactomannan was caused by the change in the −2 subsite, kcat/KM on MGGMM for the YpenMan26A wild-type and the D37T mutant were determined by following substrate depletion at low substrate concentration (0.1 mM) by MS (Table 5). A novel MS based method with an internal standard was developed to allow these measurements (relevant spectra, extracted ion chromatograms and a standard curve are shown in Fig. S4). The rate of MGGMM depletion followed the equation of Matsui et al.40 (Fig. S5), which was used to determine kcat/KM. It is likely that MGGMM binds from the −4 to the +1 subsite in YpenMan26A, and therefore accommodates the galactopyranosyl residues in the −3 and −2 subsite, as in the X-ray structure (Fig. 2C). This can be assumed because of the dominant M5 productive binding mode for YpenMan26A from subsite −4 to +1 (see next section, Fig. 5) and the demonstrated capability of YpenMan26A to accommodate the galactopyranosyl moiety in the −3 and −2 subsites (Fig. 2). Furthermore, AnidMan26A, which is the closest homologue to YpenMan26A, was found to produce MGGM and M from MGGMM26.
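As a hedged illustration of how kcat/KM can be extracted from substrate depletion, the Python sketch below assumes the usual first-order form for [S] much smaller than KM, ln([S]0/[S]t) = (kcat/KM)·[E]·t, and fits the slope through the origin. The exact form used in the study is given in the cited reference and Fig. S5. The time points and enzyme concentration are hypothetical, and the time course is synthetic, generated from the wild-type value of 84 s−1·mM−1 reported below only to show that the fit recovers it.

```python
import math


def fit_kcat_over_km(times_s, substrate_mM, enzyme_mM):
    """Estimate kcat/KM (s^-1 mM^-1) from a depletion time course.

    Assumes [S] << KM, so ln(S0/St) = (kcat/KM) * [E] * t, and fits the
    slope by least squares through the origin.
    """
    s0 = substrate_mM[0]
    y = [math.log(s0 / s) for s in substrate_mM]
    slope = sum(t * v for t, v in zip(times_s, y)) / sum(t * t for t in times_s)
    return slope / enzyme_mM


# Hypothetical experiment: 0.1 mM starting substrate (as in the text),
# 1e-5 mM enzyme and these sampling times are assumptions for illustration.
TRUE_KCAT_OVER_KM = 84.0   # s^-1 mM^-1, reported wild-type value on MGGMM
ENZYME_MM = 1e-5
TIMES_S = [0.0, 120.0, 240.0, 480.0, 960.0]

# Synthetic, noise-free time course generated from the assumed rate law.
SUBSTRATE_MM = [0.1 * math.exp(-TRUE_KCAT_OVER_KM * ENZYME_MM * t) for t in TIMES_S]

estimate = fit_kcat_over_km(TIMES_S, SUBSTRATE_MM, ENZYME_MM)
print(f"estimated kcat/KM = {estimate:.1f} s^-1 mM^-1")
```

With real data the measured substrate concentrations would replace the synthetic series, and scatter around the fitted line would indicate how well the low-substrate assumption holds.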

Table 5 Kinetic efficiency on MGGMM for YpenMan26A wild type and the variant YpenMan26A D37T.
Figure 5

(A) Relative frequency of the productive binding modes of M5 for the YpenMan26A wild-type and the W110H variant. Each circle represents a mannose unit. The dashed line between subsite −1 and +1 represents hydrolytic cleavage. The outmost numbers on respective side represent the total percentage of produced product, i.e. M4 and M1 or M3 and M2, determined by HPAEC-PAD quantification. These numbers were then combined with the individual ratios of labelled (18O) to unlabelled (16O) products (M4- and M3-species, respectively) (see panel B) to calculate the inner numbers which represent the relative frequency of each productive binding mode for the two enzymes. (B) Mass spectrometry peaks showing the major labelled (18O) hydrolysis product for YpenMan26A wild-type (left) and W110H (right) together with unlabelled (16O) species of the same DP (M4 and M3 for the wild-type and W110H, respectively). From these spectra, a M4/M4-18O ratio of 1:8.9 and a M3/M3-18O ratio of 1:9.2 was calculated. The theoretical mass for M3 with a sodium adduct is 527.159 and the theoretical mass for M4 with a sodium adduct is 689.212.

The D37T variant had a roughly four times lower kcat/KM on MGGMM than the wild type enzyme (19 vs 84 s−1·mM−1, Table 5), showing that the mutant has lower kcat and/or higher KM (probably a combination of both, as for the individual kinetic parameters determined on guar gum). The observed kcat/KM values for the wild-type YpenMan26A and the D37T variant are at the same level as kcat/KM values reported for other fungal endomannanases on M5, which were found to range from 23–163 s−1·mM−1 for the GH5 endomannanases from A. nidulans and Trichoderma reesei41 and to be 22 s−1·mM−1 for PansMan26A24. The bacterial GH26 endomannanase from B. ovatus, BovaMan26A, had a kcat/KM of 247 s−1·mM−1 on M522. The difference between the wild type and D37T emphasises that substitution of Asp37 with Thr decreases the affinity for the galactopyranosyl moiety in the −2 subsite. The lower kcat/KM on MGGMM obtained for the D37T mutant compared to the wild type is consistent with the expected increase in distance between the galactopyranosyl unit and the amino acid residue when Asp is substituted with Thr (Fig. 4C).

Productive binding of M5

M5 hydrolysis product analysis using HPAEC combined with solvent isotope labelling and mass spectrometry (MS) analysis24,42 was used to estimate the relative frequency of productive binding modes for the YpenMan26A wild type and W110H mutant. The HPAEC product quantification showed a clear difference between the wild type and the W110H variant (Fig. S6), with the wild type preferentially producing M4 and M1 (89% relative productive binding frequency) with little formation of M3 and M2 (11%). For the W110H mutant the major hydrolysis products were M3 and M2 (70%) as well as some M4 and M1 (30%). Because two productive binding modes can give rise to the same products (M5 can for example be hydrolysed into M4 and M1 through removal of the reducing end or the non-reducing end mannopyranosyl unit), the HPAEC data were combined with an in situ labelling, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis procedure24,42 where M5 hydrolysis is performed in 18O-water, to obtain product ratios of 18O-labelled versus ordinary 16O-products. The newly formed reducing end will be labelled with 18O (heavy product) while the “leaving group” saccharide (light) of each catalytic event will not. With the MS analysis, it is thus possible to distinguish between M4 produced by M5 binding from subsite −4 to +1 (generating heavy M4) and M5 binding from subsite −1 to +4 (generating light M4). The heavy versus light product ratios obtained for M3 and M4 were used to calculate the relative binding frequencies of binding modes that generate these products, respectively (Fig. 5).

The data show that for wild-type YpenMan26A, the dominant productive M5 binding mode is from subsite −4 to +1 (80% binding frequency) (Fig. 5), but that this mode is significantly reduced (28%) for the W110H mutant. Instead the dominant productive M5 binding mode is shifted to cover subsites −3 to +2 (63% binding frequency). This is probably a consequence of Trp110 in the −4 subsite being changed to His, resulting in a weaker subsite. It is also possible that the W110H substitution has caused a slight change in the global active site fold, resulting in slightly reduced thermostability (Table 3) and decreased activity (Table 4). However, on locust bean gum it is mainly KM and to a smaller extent kcat that changes when comparing the W110H variant with the YpenMan26A wild-type, indicating that the affinity for the substrate is dramatically changed while the hydrolysis rate is affected to a lesser extent. These results suggest that the W110H substitution has caused changes to the binding subsites and not the overall fold.
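The dominant binding frequencies quoted above can be reproduced by combining the HPAEC product percentages with the light-to-heavy MS ratios given in the Figure 5 legend (M4/M4-18O of 1:8.9 for the wild type and M3/M3-18O of 1:9.2 for W110H). The Python sketch below shows that arithmetic; the function names are illustrative, and the calculation is a reconstruction from the reported numbers rather than the authors' own script.

```python
def heavy_fraction(light, heavy):
    """Fraction of a product carrying the 18O label, from a light:heavy ratio."""
    return heavy / (light + heavy)


def split_binding_modes(product_pair_percent, light, heavy):
    """Split an HPAEC product percentage into (heavy-mode, light-mode) percentages.

    The heavy product carries the newly formed reducing end, so for M4 it
    reports binding from -4 to +1, and for M3 binding from -3 to +2.
    """
    heavy_pct = product_pair_percent * heavy_fraction(light, heavy)
    return heavy_pct, product_pair_percent - heavy_pct


# Wild type: 89% of events give M4 + M1; M4/M4-18O = 1:8.9.
wt_heavy, wt_light = split_binding_modes(89.0, 1.0, 8.9)
# W110H: 70% of events give M3 + M2; M3/M3-18O = 1:9.2.
mut_heavy, mut_light = split_binding_modes(70.0, 1.0, 9.2)

print(f"wild type, -4 to +1: {wt_heavy:.0f}%  (-1 to +4: {wt_light:.0f}%)")
print(f"W110H,     -3 to +2: {mut_heavy:.0f}%  (-2 to +3: {mut_light:.0f}%)")
```

The two heavy-mode values round to 80% and 63%, in agreement with the frequencies stated for the wild type and the W110H variant.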

Differences in the catalytic GH26 domain in fungal endomannanases with and without CBM35

Most regions are highly conserved between YpenMan26A and PansMan26A (Fig. 2A,C), but YpenMan26A lacks an N-terminal CBM35 domain. From the superimposition of the two crystal structures (Fig. 2A), it is seen that the main difference in the secondary structure between the core modules of the two enzymes is in the area which approaches the CBM35 of PansMan26A, where PansMan26A has an α-helix and YpenMan26A a surface loop. Interactions occur through water between the Ala402 and Gln404 in the PansMan26A core domain and the Leu58 and the Ser130 in its CBM35 and linker respectively. Couturier et al.24 also report that a hydrophobic patch comprising Leu58 and Leu130 on the surface of the CBM35 stands in front of a cluster of hydrophobic residues, Ala402, Tyr403 and Leu399 of the core domain24. These interactions would not be established if the PansCBM35 were appended to the YpenMan26A, because of differences in the amino acid sequence and the flexible nature of the surface loop. The multiple sequence alignment (Fig. 3) of the GH26 core domains of nine fungal GH26 endomannanases (two wild-type core enzymes, five with an N-terminal CBM35 and two with a CBM35 and a C-terminal CBM1), confirms variation in the region in and around α9 in PansMan26A (Fig. 3, marked blue), the area of the core domain approaching the CBM35. The seven enzymes with a CBM35 have identical sequences to PansMan26A (LQAY, for AstiMan26A it is MQLY), which forms an α-helix in PansMan26A, while the two enzymes with no CBM35, have a different and seemingly more variable sequence (TGGV for YpenMan26A and MRED for AnidMan26A). From this analysis, it seems that co-evolution has occurred between the GH26 core domain and the CBM35. It is likely that the core domain evolved to accommodate and maybe help position the CBM35. When the CBM35 is absent, α9 is not needed.

Discussion

Data presented here add to the current understanding of fungal GH26 endomannanases, which appear to be conserved in their known functional characteristics. Characterised fungal GH26 endomannanases, including YpenMan26A, have a characteristic ligand binding site with a strong −4 subsite, and a dominant M5 binding mode from the −4 to +1 subsite24,26,27, in contrast to at least some fungal GH5 endomannanases (including PansMan5A) which mainly bind M5 from the −3 to the +2 subsite24. To date, the fungal GH26 endomannanases which have been analysed with a focus on the accommodation of galactopyranosyl units, are able to degrade highly substituted galactomannans by allowing accommodation of galactose substitutions at least in the −3, −2, −1 and +1 subsites as judged by biochemical data and crystal structures. The biochemical data include the observations that PansMan26A and AnidMan26A produce α-galactosylmannose (G) as their dominant hydrolysis product from guar gum galactomannan and AnidMan26A catalyses the hydrolysis of MGGMM to MGGM and mannose26. The structural data include the crystal structure of PansMan26A24 and the homology model of AnidMan26A that both show an open active site cleft with space for galactose substitutions26. Furthermore, our current crystal structure of YpenMan26A with bound MGG from the −4 to the −2 subsites and the observation that the amino acids participating in MGG binding in YpenMan26A are highly conserved between studied GH26 endomannanases (Fig. 2), further support this hypothesis. Some fungal GH5 endomannanases, e.g. the TresMan5A from T. reesei, have been found to accommodate galactopyranosyl residues in the −1 subsite43, but not in the −2 and +1 subsites26. Among the bacterial GH26 endomannanases there is a variation in their ability to accommodate multiple galactopyranosyl residues in the active site cleft, exemplified by BovaMan26A and BovaMan26B from Bacteroides ovatus22.

We show that a single mutation in the substrate binding amino acids can result in altered binding modes or substrate affinity, as seen for the YpenMan26A wild-type and mutants investigated in the present study. Of the 17 amino acids involved in ligand binding (including the two catalytic residues), only three residues were not conserved among the nine fungal GH26 endomannanases compared in this study (Fig. 3). At two of these positions, Wsp.Man26A differed from the rest of the endomannanases. Mutation studies showed that W110H shifted the dominant productive M5 binding mode of YpenMan26A from covering the −4 to +1 subsites to the −3 to +2 subsites, emphasising the importance of Trp110 in the strong −4 subsite. The D37T mutation lowered the affinity for a galactopyranosyl unit in the −2 subsite of YpenMan26A. A third variation in ligand binding amino acids among the studied GH26 endomannanases was position Asn280 in YpenMan26A (Fig. 3). This residue is not conserved between the nine fungal GH26 endomannanases, which might indicate that it is not important for ligand binding, or it could contribute to different affinity for galactose in the −2 subsite, similar to the D37T mutation investigated in the present study. Indeed, fungal GH26 endomannanases were shown to have different ratios between their initial rate on locust bean gum and on guar gum1, indicating variations in galactose affinity and/or tolerance, which perhaps can be explained by variations at this position (Asn280 in YpenMan26A, Fig. 3). Detailed knowledge about binding mode and affinity for substitutions in different subsites is important when using these enzymes to produce specific oligosaccharides, e.g. for prebiotics or alkyl mannooligosides.

As seen from the superimposition of YpenMan26A and PansMan26A (Fig. 2A) and the sequence alignment of nine fungal GH26 endomannanases (Fig. 3), the main difference in their catalytic domains appears to be in the area approaching the CBM35 (if present). The GH26 core module of the enzymes with a CBM35 seems to have evolved to harbour this large binding domain (15 kDa) in close proximity to the core, by aid of an α-helix (α9), whereas the wild-type enzymes with no CBM35, YpenMan26A and AnidMan26A, have a less structured surface loop in this area. The α9-helix in PansMan26A is situated with the end of the helix pointing directly into the site where the linker is attached to the CBM35. It is possible that this α-helix plays an important role in positioning the CBM35. It is also possible that the position we see in the crystal structure of PansMan26A is not that of the CBM35 in solution, and it is likely that the core domain and the CBM35 can come into even closer contact, perhaps facilitated by ligand binding. A similar event has been reported for processive GH9 endoglucanases, for which a CBM3c module was shown to align with the catalytic cleft of the GH9 module, presumably forming one functional entity44. The linker in these GH9 cellulases is wrapped around the core domain, similar to the linker in PansMan26A24, and contributes significantly to the positioning of the CBM3c.

Conclusions

This study identified important amino acids for binding galactomannan in the −4 to −2 subsites of YpenMan26A, by solving and analysing its crystal structure in complex with MGG. In particular, the −2 subsite has multiple interactions with the galactopyranosyl side group. The study also highlights the high sequence similarity of known fungal GH26 endomannanases, with conserved ligand binding amino acids in the active site cleft. These results strongly indicate that the capability of accommodating multiple galactopyranosyl side-groups in the binding cleft is conserved among the fungal enzymes in the GH26 family. The two YpenMan26A variants showed that the W110H substitution shifted the dominant M5 binding mode from covering the −4 to +1 subsites to covering the −3 to +2 subsites, whereas the D37T substitution lowered the affinity for galactopyranosyl residues in the −2 subsite. The crystal structure of YpenMan26A has a unique surface loop when compared to the crystal structure of PansMan26A, which appears to be a consequence of the enzyme lacking a CBM35. Known fungal GH26 endomannanases, including YpenMan26A, seem tailored for hydrolysing highly substituted galactomannans. Understanding the intimate enzyme-substrate interactions and the possibilities of changing product profiles and substrate affinities is important for fine-tuned optimization and utilization of these enzymes in industrial applications.

Methods

Materials

Locust bean gum (low viscosity; sodium borohydride reduced), guar gum (medium viscosity), mannobiose (M2), mannotriose (M3), mannotetraose (M4), mannopentaose (M5), α-6¹-galactosyl-mannotriose (MMG), α-6⁴-6³-di-galactosyl-mannopentaose (MGGMM), and α-6²-6³-6⁴-tri-xylosyl-glucotetraose (XXXG) were purchased from Megazyme (Ireland). All other chemicals were purchased from Sigma (Germany), unless otherwise stated. Mobility markers, dextran ladder, and the DASHboard software for DASH analyses were kindly donated by Prof. Paul Dupree (University of Cambridge, UK).

Construction of variants

The gene sequence encoding YpenMan26A (GenBank sequence ID: AYU65281) was used to make the mutated constructs. E165Q was introduced into the gene sequence by PCR using synthetic oligonucleotides replacing the codon GAG position 165 of the mature peptide with CAG. PCR was conducted for the 5′ fragment and 3′ fragment separately using Phusion High-Fidelity DNA Polymerase (ThermoFisher Scientific) under the following conditions: 98 °C 2 min, 35 cycles at 98 °C for 10 sec, 72 °C for 150 sec, followed by 72 °C for 10 min. The PCR products were gel purified and used as template for a second round of PCR, using the gene flanking primers to amplify the full-length gene with the native signal peptide. The full-length PCR product was cloned into pDAu22245, an Aspergillus expression vector under the control of a NA2-tpi double promoter using the BamHI and XhoI restriction sites, and its sequence was determined. The resulting pDAu222-YpenMan26A-E165Q expression vector was transformed into A. oryzae MT3568. MT3568 is an amdS (acetamidase) disrupted derivative of A. oryzae Jal_35546 in which pyrG auxotrophy was restored in the process of inactivating the A. oryzae amdS gene. Secretion of YpenMan26A E165Q in the culture supernatant of the recombinant MT3568 clones was confirmed by SDS-PAGE.

Mutants containing the D37T and W110H substitutions respectively were made as synthetic full-length cDNA constructs with the native signal peptide (ThermoFisher Scientific) cloned into pDAu222 using the BamHI and XhoI restriction sites. For D37T the codon GAC of position 37 of the mature peptide was replaced with ACC. For W110H the codon TGG of position 110 of the mature peptide was replaced with CAC. The constructs were verified by sequencing and the resulting pDAu222 expression vectors were transformed into A. oryzae MT3568. Secretion of mutants in the culture supernatant of recombinant MT3568 clones was confirmed by SDS-PAGE.
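As a quick consistency check on the three codon replacements described above, the following minimal Python sketch looks up each codon in the standard genetic code and counts the nucleotide changes involved. The table contains only the six codons named in the text, and the helper names are hypothetical and not part of the original workflow.

```python
# Standard genetic code, restricted to the six codons named in the text.
CODON_TO_AA = {
    "GAG": "E",  # glutamate
    "CAG": "Q",  # glutamine
    "GAC": "D",  # aspartate
    "ACC": "T",  # threonine
    "TGG": "W",  # tryptophan
    "CAC": "H",  # histidine
}

# (variant name, wild-type codon, replacement codon) as stated in the text.
SUBSTITUTIONS = [
    ("E165Q", "GAG", "CAG"),
    ("D37T", "GAC", "ACC"),
    ("W110H", "TGG", "CAC"),
]


def check_substitution(name, wt_codon, mut_codon):
    """Return True if the codon swap encodes the amino acid change in `name`."""
    wt_aa, mut_aa = name[0], name[-1]
    return CODON_TO_AA[wt_codon] == wt_aa and CODON_TO_AA[mut_codon] == mut_aa


def count_base_changes(wt_codon, mut_codon):
    """Number of nucleotide positions that differ between two codons."""
    return sum(a != b for a, b in zip(wt_codon, mut_codon))


if __name__ == "__main__":
    for name, wt, mut in SUBSTITUTIONS:
        ok = check_substitution(name, wt, mut)
        print(name, wt, "->", mut, "consistent:", ok,
              "base changes:", count_base_changes(wt, mut))
```

All three replacements are consistent with the stated amino acid changes; E165Q needs a single base change, D37T two and W110H three.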

Expression and purification

The fungal wild-type GH26 endomannanases Wsp.Man26A and YpenMan26A, as well as the YpenMan26A mutants D37T, W110H and E165Q, were recombinantly expressed in A. oryzae MT3568, an amdS (acetamidase) disrupted derivative of A. oryzae Jal_35546. The enzymes, wild-types and variants, were purified to electrophoretic purity using hydrophobic interaction and ion exchange chromatography. The inactive YpenMan26A E165Q variant, used for crystallisation, was further purified using size-exclusion chromatography and deglycosylated with Endoglycosidase H (Roche). The identity of the purified endomannanases was validated by mass spectrometry analysis of a tryptic digest of the protein band excised from an SDS-PAGE gel. Protein concentrations were determined by UV absorption at 280 nm using theoretical extinction coefficients (ε). ε at 280 nm was estimated for all proteins with GPMAW 9.20 (Lighthouse Data), based on the mature proteins without modifications.
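The conversion from absorbance at 280 nm to protein concentration follows the Beer-Lambert law, A = ε·c·l. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, and the absorbance, extinction coefficient and molecular mass in the example are placeholder values, not values from this study.

```python
def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c (M) = A280 / (epsilon (M^-1 cm^-1) * path length (cm))."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def mass_concentration(a280, epsilon, molar_mass, path_cm=1.0):
    """Concentration in mg/ml (numerically equal to g/l) for molar_mass in g/mol."""
    return molar_concentration(a280, epsilon, path_cm) * molar_mass


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not measured values).
    a280 = 1.0
    epsilon = 100000.0    # M^-1 cm^-1, hypothetical
    molar_mass = 50000.0  # g/mol, hypothetical
    print(molar_concentration(a280, epsilon))             # mol/l
    print(mass_concentration(a280, epsilon, molar_mass))  # mg/ml
```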

Crystallisation

The inactive YpenMan26A mutant E165Q was concentrated to 48 mg/ml, in 20 mM MES, 125 mM NaCl, pH 6 and aliquoted into 50 µl samples. Aliquots not used for immediate crystallisation trials were flash-frozen in liquid nitrogen and stored at −80 °C. Initial crystallisation screening was carried out using sitting-drop vapour-diffusion with drops set up using a Mosquito Crystal liquid handling robot (TTP LabTech, UK) with 150 nl protein solution plus 150 nl reservoir solution in 96-well format plates (MRC 2-well crystallisation microplate, Swissci, Switzerland) equilibrated against 54 µl reservoir solution. Experiments were carried out at room temperature with several commercial screens, for the protein on its own and in the presence of 5 mM MGGMM. The best hits were obtained in the AmSO4 suite (QIAGEN), for the ligand complex. The conditions were manually optimised in a 24-well Linbro dish, in hanging drop format. The final crystallisation conditions were 2.6–2.8 M ammonium sulphate, 0.1 M Hepes pH 7.0.

Data collection, structure solution and refinement

All computations were carried out using programs from the CCP4 suite v. 7.047. For the MGGMM-YpenMan26A complex, data were collected at the Diamond Light Source beamline I04 to 1.36 Å resolution and processed using xia248. The structure was solved using MOLREP49 with PansMan26A (PDB entry: 3zm8; Couturier et al.24; sequence identity: 47.7%) as template. The structure was refined using REFMAC550 iterated with manual model building/correction in Coot51. The final model was validated using Molprobity52 as part of the Phenix package53. Data-processing and refinement statistics are given in Table 1. Structure figures were prepared using CCP4mg54 or PyMOL v 1.7.20 (DeLano Scientific LLC, San Carlos, CA). The sequence alignments were created with MUSCLE55 and ESPript56.

Homology modelling

The homology model of Wsp.Man26A was generated using the HHPred-Homology server (https://toolkit.tuebingen.mpg.de/#/tools/hhpred)57 with PansMan26A as template (PDB ID: 3ZM824, 54% sequence identity). Model quality was evaluated using the Ramachandran analysis in MolProbity (http://molprobity.biochem.duke.edu/)52. The model of Wsp.Man26A had 96.4% (430/437) of all residues in allowed regions. The model was only used to visualise the mutated amino acids in YpenMan26A, which were inspired by Wsp.Man26A (Fig. 3).

Thermal stability

The thermal stability at pH 5.0 was investigated with Differential Scanning Calorimetry (DSC) following an established protocol26. The thermal midpoint (Tm) was determined as the top of the protein denaturation peak, with an accuracy of ±1 °C.

Initial rates and analysis of product profiles by DASH

The initial rates on locust bean gum and guar gum by the endomannanases were determined with 2.5 mg/ml substrate in 50 mM sodium acetate pH 5.0 at 37 °C. The hydrolytic activity was determined after 15 min in a 200 µl hydrolysis volume. Released reducing sugars were measured with the 4-hydroxybenzoic acid hydrazide (PAHBAH) method described by Lever58, with mannose as standard. All hydrolysis assays were carried out at 7 different endomannanase doses as described elsewhere26. Initial rates were calculated in the initial linear range of the hydrolysis. Guar gum hydrolysis product profiles at high conversion (26–36%) were analysed by DASH after inactivation by heating at 95 °C for 15 min. APTS (9-aminopyrene-1,4,6-trisulfonate) labelling and analysis of the labelled saccharides were carried out as described elsewhere26,39.

Kinetics with locust bean gum and guar gum

The kinetic constants for locust bean gum and guar gum hydrolysis were determined by assaying the initial endomannanase rates at different substrate concentrations (10 to 0.1 mg/ml) using the PAHBAH assay as described above. The enzyme concentrations used for the locust bean gum hydrolysis were 4 nM YpenMan26A wild-type, 4 nM Wsp.Man26A, 4 nM YpenMan26A D37T, and 18 nM YpenMan26A W110H and for the guar gum hydrolysis were 4 nM YpenMan26A, 10 nM Wsp.Man26A, 6 nM YpenMan26A D37T, and 44 nM YpenMan26A W110H. The initial hydrolysis rate, Vi, was plotted as a function of the substrate concentration, [S]. Non-linear regression using the Michaelis-Menten equation was used to determine the values for kcat, KM and kcat/KM.
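To illustrate the regression step, the sketch below fits the Michaelis-Menten equation, Vi = Vmax·[S]/(KM + [S]), by non-linear least squares in plain Python. It is a minimal sketch and not the software used in the study: the Gauss-Newton iteration, the Hanes-Woolf starting guess and all function names are assumptions made for illustration, and kcat is obtained as Vmax divided by the enzyme concentration in matching units.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_guess(s_vals, v_vals):
    """Starting values from a linear fit of [S]/v against [S]."""
    y_vals = [s / v for s, v in zip(s_vals, v_vals)]
    n = len(s_vals)
    mean_s = sum(s_vals) / n
    mean_y = sum(y_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(s_vals, y_vals))
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_s   # equals KM / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


def fit_michaelis_menten(s_vals, v_vals, iterations=50, tol=1e-12):
    """Gauss-Newton least-squares fit; returns (Vmax, KM)."""
    vmax, km = hanes_woolf_guess(s_vals, v_vals)
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_vals, v_vals):
            d_vmax = s / (km + s)                # d(rate)/d(Vmax)
            d_km = -vmax * s / (km + s) ** 2     # d(rate)/d(KM)
            resid = v - mm_rate(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * resid
            b2 += d_km * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the parameter update.
        step_vmax = (b1 * a22 - a12 * b2) / det
        step_km = (a11 * b2 - a12 * b1) / det
        vmax += step_vmax
        km += step_km
        if abs(step_vmax) < tol and abs(step_km) < tol:
            break
    return vmax, km


def kcat_from_vmax(vmax, enzyme_conc):
    """kcat = Vmax / [E]; both must be expressed in matching concentration units."""
    return vmax / enzyme_conc
```

For a polymeric substrate dosed in mg/ml, KM is obtained in mg/ml and kcat/KM in the corresponding mixed units. The undamped Gauss-Newton update is adequate only when the starting guess is close to the optimum, which is why the linearised estimate is used to initialise it.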

Kinetics with MGGMM

kcat/KM was determined by following MGGMM depletion over time at low substrate concentration (0.1 mM), pH 5 and 37 °C, with an online, direct injection, mass spectrometry based assay. Duplicate samples were analysed using an HPLC-MS system with a Dionex Ultimate 3000RS HPLC connected to an ESI-iontrap (AmaZon SL, Bruker Daltonics). The HPLC provided a constant flow of 0.1 ml/min of 50/50 vol-% acetonitrile and 0.1% formic acid. The electrospray was operated in positive ultrascan mode with Multiple Reaction Monitoring (MRM) using a target mass of m/z 800. MRM mode was chosen to selectively follow substrate depletion and an internal standard (XXXG). 100% reaction amplitude was used to ensure fragmentation of the precursor ion. The capillary voltage was set at 4.5 kV, end plate offset was 0.5 kV, nebulizer pressure 3.0 bar, dry gas flow 12.0 l/min, and dry gas temperature was set to 280 °C. Buffer concentration, 1 mM sodium acetate pH 5, was set as low as possible to minimize ion suppression without compromising pH in the reaction. The total reaction volume was 500 µl and the sample was incubated directly in an HPLC-vial in the HPLC-autosampler. The reaction was started by adding enzyme to a final concentration of 2 nM for the wild-type YpenMan26A and 6 nM for the D37T variant. The first sample was taken two min after enzyme addition. Thereafter, sampling was performed every 5.4 min (including sampling procedure), when the autosampler injected 4 µl sample directly into the flow leading to the MS. The enzyme reaction was immediately quenched when entering the flow path because the mobile phase was pH 2.7, and detection occurred approx. 0.5 min after injection. Total acquisition time was set to 4 min. The enzyme reactions were followed for a maximum time period of 50 min, but only data describing the initial phase of the reaction (less than 25% conversion of substrate) were used for estimating kcat/KM. Details on extracted ion chromatograms used for quantification of MGGMM and XXXG can be seen in Fig. S4. Data were analysed and quantified using Compass DataAnalysis 4.2 and Compass QuantAnalysis 2.2 provided by Bruker Daltonics. ln(S0/St) was plotted as a function of time (t) (Fig. S5) and kcat/KM was calculated as described by Matsui et al.40; k = ln(S0/St), where k = ((kcat/KM)·[enzyme])·t, S0 = substrate concentration at time zero and St = substrate concentration at time t.
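The relation ln(S0/St) = (kcat/KM)·[enzyme]·t means that kcat/KM can be read from the slope of ln(S0/St) against time, divided by the enzyme concentration. The Python sketch below shows one way to do this under stated assumptions: the substrate values are taken to be MGGMM signals already normalised to the internal standard, the slope is estimated by ordinary least squares with a free intercept, and the function name and example numbers are hypothetical.

```python
import math


def kcat_over_km(times, substrate, s0, enzyme_conc, max_conversion=0.25):
    """Estimate kcat/KM from substrate depletion at [S] well below KM.

    times        sampling times (e.g. min)
    substrate    substrate amounts at those times (same units as s0)
    s0           substrate amount at time zero
    enzyme_conc  enzyme concentration (e.g. M)
    Only points below max_conversion (fraction of substrate consumed) are used.
    """
    points = [
        (t, math.log(s0 / s))
        for t, s in zip(times, substrate)
        if 1.0 - s / s0 < max_conversion
    ]
    if len(points) < 2:
        raise ValueError("need at least two points in the initial phase")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx            # equals (kcat/KM) * [enzyme]
    return slope / enzyme_conc   # units: 1 / (concentration * time)


if __name__ == "__main__":
    # Synthetic illustration: first sample at 2 min, then every 5.4 min.
    enzyme = 2e-9        # M
    true_value = 3.0e6   # M^-1 min^-1, hypothetical
    times = [2.0 + 5.4 * i for i in range(9)]
    s0 = 0.1             # mM
    substrate = [s0 * math.exp(-true_value * enzyme * t) for t in times]
    print(kcat_over_km(times, substrate, s0, enzyme))
```

A free intercept makes the slope insensitive to a small error in the S0 estimate; forcing the line through the origin is an equally simple alternative when S0 is known accurately.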

Productive M5 binding modes

The hydrolytic cleavage pattern of M5 was determined for the YpenMan26A wild-type and the W110H variant by the previously established 18O-water product labelling methodology24,42. First, M5 hydrolysis products were analysed and quantified by high performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) using a Dionex ICS-5000 with a Carbo-Pac PA-200 column and guard column. For this, double incubations of 1 mM M5 and 50 nM wild-type enzyme or 200 nM W110H mutant in 1.5 mM sodium acetate buffer, pH 5, were stopped by boiling at timed intervals (30 min to 3 h). Data after 30 min incubation for YpenMan26A or 2 h for the W110H mutant (approximately 30% hydrolysis) were used. The quantification allowed distinguishing between productive M5 binding modes that generate M4 and M1 and those that generate M3 and M2. However, HPAEC alone cannot distinguish between the two possible binding modes generating M4 and M1 (i.e. binding either from subsite −4 to +1 or from −1 to +4), nor between the two binding modes that generate M3 and M2 (i.e. binding from subsite −3 to +2 or −2 to +3). Therefore, incubations as above were also set up at 8 °C using 97% H2 18O as stock solvent, reaching 92% 18O-water in the reactions. Duplicate reactions were stopped after 30 min (for wild-type) and 2 h (for W110H) by directly spotting 0.5 μl samples with 0.5 ml matrix (10 mg/ml 2,5-dihydroxybenzoic acid) on a stainless-steel plate, followed by immediate drying with warm air. Spectra were then obtained by MALDI-TOF MS and used to calculate the 18O over 16O product ratios using the monoisotopic peak areas as previously described24,42. Since M5 hydrolysis in 18O-water generates products where the newly formed reducing end becomes 18O-labelled (and other chain ends do not), the 18O over 16O product ratios can be used to calculate the relative frequency of the productive binding modes mentioned above (i.e. M5 binding from subsite −4 to +1 versus subsite −1 to +4, or binding from subsite −3 to +2 versus subsite −2 to +3)24,42. The procedure involves two calculated corrections for the product ratio determination: one for the (M + 2) natural isotope peak of the light (16O) species, which overlaps with the heavy (18O) peak, and a second for the presence of 8% ordinary H2 16O in the hydrolysis reaction.
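The two corrections can be sketched as follows. This is a simplified illustration and not the published procedure24,42: it assumes that the measured (M + 2) peak area is the sum of the 18O-labelled monoisotopic signal and the natural (M + 2) isotope contribution of the 16O species, and that a product released with a newly formed reducing end carries 18O with a probability equal to the 18O-water fraction (0.92). The function name and the natural isotope ratio used in the example are hypothetical; in practice that ratio would be calculated for the specific oligosaccharide ion.

```python
def labelled_mode_fraction(area_light, area_m_plus_2, natural_m2_ratio,
                           h2_18o_fraction=0.92):
    """Fraction of a product formed with a new (18O-labelled) reducing end.

    area_light        monoisotopic peak area of the 16O species (M)
    area_m_plus_2     measured peak area at M + 2
    natural_m2_ratio  natural (M + 2)/M isotope ratio of the 16O species
    h2_18o_fraction   fraction of 18O-water in the reaction
    """
    # Correction 1: remove the natural isotope overlap from the heavy peak.
    area_heavy = max(area_m_plus_2 - natural_m2_ratio * area_light, 0.0)
    observed = area_heavy / (area_heavy + area_light)
    # Correction 2: account for the 16O-water present in the reaction.
    return min(observed / h2_18o_fraction, 1.0)


if __name__ == "__main__":
    # Synthetic illustration: 80% of M4 arises from binding in subsites -4 to +1,
    # so 0.80 * 0.92 of the M4 pool is 18O-labelled.
    total = 100.0
    heavy_true = total * 0.80 * 0.92
    light = total - heavy_true
    ratio = 0.05  # hypothetical natural (M + 2)/M ratio
    measured_m2 = heavy_true + ratio * light
    print(labelled_mode_fraction(light, measured_m2, ratio))
```

Applied to M4, the returned value estimates the share of the M4-generating events that bind from subsite −4 to +1; applied to M3, it estimates the share of the M3-generating events that bind from subsite −3 to +2.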