Crystal structure and substrate interactions of an unusual fungal non-CBM carrying GH26 endo-β-mannanase from Yunnania penicillata

Endo-β(1→4)-mannanases (endomannanases) catalyse degradation of β-mannans, an abundant class of plant polysaccharides. This study investigates structural features and substrate binding of YpenMan26A, a non-CBM carrying endomannanase from Yunnania penicillata. Structural and sequence comparisons to other fungal family GH26 endomannanases showed high sequence similarities and conserved binding residues, indicating that fungal GH26 endomannanases accommodate galactopyranosyl units in the −3 and −2 subsites. Two striking amino acid differences in the active site were found when the YpenMan26A structure was compared to a homology model of Wsp.Man26A from Westerdykella sp. and the sequences of nine other fungal GH26 endomannanases. Two YpenMan26A mutants, W110H and D37T, inspired by the differences observed in Wsp.Man26A, produced a shift in how mannopentaose bound across the active site cleft and a decreased affinity for galactose in the −2 subsite, respectively, compared to wild-type YpenMan26A. YpenMan26A was moreover found to have a flexible surface loop in the position where PansMan26A from Podospora anserina has an α-helix (α9) which interacts with its family 35 CBM. Sequence alignment suggested that the core structure of fungal GH26 endomannanases differs depending on the natural presence of this type of CBM. These new findings have implications for selecting and optimising these enzymes for galactomannan degradation.


Results
Y. penicillata possesses at least one protein with endomannanase activity 1 (GenBank sequence ID AYU65281). This enzyme, studied in the current paper, has a signal peptide and a GH26 catalytic domain, but no CBM, in contrast to most known fungal GH26 endomannanases, which carry a CBM35 1,24,27. A gene encoding the catalytic domain, named YpenMan26A, was cloned and expressed in Aspergillus oryzae. Based on a sequence alignment with PansMan26A, the two catalytic residues (previously identified for GH26 enzymes 30,31), Glu165 and Glu257, were identified in YpenMan26A. Glu257 is the nucleophile, which performs the nucleophilic attack on an anomeric carbon in the mannan backbone, and Glu165 is the acid/base, which serves as proton donor and later deprotonates the glycosyl acceptor in the first and second step of the retaining catalytic mechanism, respectively 15,32. This mechanism is characteristic of Clan GH-A glycosyl hydrolases, such as GH26 endomannanases 15. The Michaelis-Menten kinetic parameters with locust bean gum and guar gum were determined for YpenMan26A. Interestingly, the kcat on guar gum (636 s−1) was found to be higher than that on locust bean gum (475 s−1). Previous studies reported a decrease in the hydrolytic rate of endomannanases going from less to more substituted galactomannans, such as from locust bean gum to guar gum 19,22,33. It is thought that the galactose substitutions cause steric hindrance, making the mannan backbone less accessible to the enzyme 6,34. As expected, the KM was also higher on guar gum (2.2 mg/ml) than on locust bean gum (0.6 mg/ml), and the kcat/KM was therefore lower on guar gum (289 ml/(mg·s)) than on locust bean gum (792 ml/(mg·s)). Motivated by the desire to see how this enzyme accommodates and interacts with the galactopyranosyl groups in galactomannan, we sought to determine the crystal structure of YpenMan26A in complex with a galactomannooligosaccharide. A YpenMan26A acid/base substituted variant, E165Q, was made using synthetic oligonucleotides and PCR, replacing the codon GAG at position 165 with CAG. The variant was synthesised and expressed in Aspergillus oryzae. N-Deglycosylation of the purified wild type and the E165Q YpenMan26A mutant using Endoglycosidase H resulted in a small shift (~5 kDa) in the apparent molecular mass on SDS-PAGE (Fig. S1). These results confirm that YpenMan26A is N-glycosylated, in agreement with the GPMAW (Lighthouse Data) prediction.

Figure caption (fragment): Sugars are shown using the Consortium for Functional Glycomics notation 59. Both polymers continue towards the reducing end, with a degree of polymerization of around 1500 for locust bean gum and 900 for guar gum 12.

Scientific Reports | (2019) 9:2266 | https://doi.org/10.1038/s41598-019-38602-x
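The catalytic efficiencies quoted for wild-type YpenMan26A follow directly from the reported kcat and KM values. The short Python sketch below is only an illustration of that arithmetic; the function and variable names are hypothetical and are not taken from the original study.

```python
# Illustrative check of kcat/KM using the values reported in the text.
# All names here are hypothetical helpers, not part of the original work.

def catalytic_efficiency(kcat_per_s: float, km_mg_per_ml: float) -> float:
    """Return kcat/KM in ml/(mg·s) from kcat (s^-1) and KM (mg/ml)."""
    if km_mg_per_ml <= 0:
        raise ValueError("KM must be positive")
    return kcat_per_s / km_mg_per_ml

# (kcat in s^-1, KM in mg/ml) as reported for wild-type YpenMan26A.
reported = {
    "locust bean gum": (475.0, 0.6),
    "guar gum": (636.0, 2.2),
}

efficiencies = {
    substrate: catalytic_efficiency(kcat, km)
    for substrate, (kcat, km) in reported.items()
}

for substrate, value in efficiencies.items():
    print(f"{substrate}: kcat/KM = {value:.0f} ml/(mg·s)")
```

The higher kcat on guar gum is thus outweighed by the roughly 3.7-fold higher KM, which is why the efficiency is lower on the more substituted substrate.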
Structure of YpenMan26A. The structure of the deglycosylated YpenMan26A acid/base substituted variant E165Q, in complex with an α-6²-6¹-di-galactosyl-mannotriose (MGG), was solved by molecular replacement using the known structure of PansMan26A 24 as template, and refined at 1.36 Å resolution (Table 1). A YpenMan26A E165A variant was also cloned, but this variant was not successfully expressed. Neither the active YpenMan26A nor the E165Q mutant crystallized as apoenzymes, suggesting that ligand binding resulted in increased stability and/or conformational changes leading to successful crystallogenesis. The YpenMan26A chain can be traced from Ala1 to Val312 without breaks, and forms a (β/α)8-barrel fold (Fig. 2A) as expected. The active site was identified in the groove with the conserved catalytic residue Glu165 (acid/base) mutated to Gln, and the conserved catalytic residue Glu257 (nucleophile) (Fig. 2A) 30,31, equivalent to those observed in PansMan26A 24. The low-activity YpenMan26A E165Q variant showed an initial rate of hydrolysis of locust bean gum of 40 U/µmole enzyme (equivalent to a "turnover rate" of 0.7 s−1), which was roughly 360 fold lower than the rate exhibited by the wild type enzyme (15050 U/µmole enzyme, equivalent to a "turnover rate" of 251 s−1). 1 U was defined as the amount of endomannanase (in moles) required to release 1 µmole of reducing ends per minute, under the assay conditions specified in Methods. The residual activity of the E165Q variant may be a consequence of the acid/base, and not the nucleophile, being substituted. An alternative explanation is the small risk (about 1/100) of translational misreading or mis-incorporation of the wrong amino acid (as reported for E. coli) 35, since the E165Q variant was made with only a single base change from codon GAG (Glu) to CAG (Gln).
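The "turnover rates" quoted above can be reproduced from the activity values, since 1 U per µmole of enzyme corresponds to 1 µmole of reducing ends per µmole of enzyme per minute. The following Python sketch only illustrates this unit conversion; the helper name is hypothetical.

```python
# Illustrative unit conversion: U/µmole enzyme -> apparent turnover in s^-1.
# 1 U = 1 µmole reducing ends released per minute, so dividing by 60 gives s^-1.
# The function name is a hypothetical helper, not from the original study.

def turnover_per_second(activity_u_per_umol: float) -> float:
    """Convert an activity in U/µmole enzyme to a turnover rate in s^-1."""
    return activity_u_per_umol / 60.0

wild_type_rate = turnover_per_second(15050.0)  # reported wild-type activity
e165q_rate = turnover_per_second(40.0)         # reported E165Q activity
fold_reduction = wild_type_rate / e165q_rate

print(f"wild type: {wild_type_rate:.0f} s^-1")
print(f"E165Q:     {e165q_rate:.1f} s^-1")
print(f"fold reduction (unrounded activities): {fold_reduction:.0f}")
```

Note that the fold difference depends on rounding: the unrounded activities give a ratio of about 376, whereas the rounded turnover rates (251 and 0.7 s−1) give about 359, in line with the "roughly 360 fold" stated in the text.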
There is a single N-glycosylation site at Asn103, located on the external side of the barrel, with a residual N-acetylglucosamine (GlcNAc). As expected, YpenMan26A shows the highest structural similarity to other endomannanases (of fungal, bacterial and protist origin) in family GH26 (Table 2). Judged from the Z-score (used by the DALI protein structure comparison server 36 for ranking of structural matches), YpenMan26A has the greatest structural similarity to PansMan26A (3ZM8), followed by RspeMan26C (3WRD) (Table 2).
Ligand binding to YpenMan26A. Crystals of YpenMan26A E165Q were obtained in the presence of α-6⁴-6³-di-galactosyl-mannopentaose (MGGMM) with the aim that the oligosaccharide would span the catalytic site. However, the electron density of the ligand was modelled as MGG situated in the −4 to −2 subsites (Fig. 2C). Since YpenMan26A E165Q was not completely inactive, it is likely that the residual activity caused hydrolysis of the MGGMM between the backbone monomers in the −1 and +1 subsites, after which MGG migrated to span subsites −4 to −2, indicating high ligand affinity in these subsites. This is supported by the observation that wild type YpenMan26A also produces MGG as a major hydrolysis product (discussed further below). The electron density of MGG is clear and unambiguous, except for the galactopyranosyl unit in the −3 subsite, which points out of the binding cleft (Fig. 2B). The B values for the galactopyranosyl residue in the −3 subsite are also higher (34-63 Å2 for the C atoms) than for the galactopyranosyl unit in the −2 subsite (17-30 Å2 for the C atoms) or for the mannopyranosyl moieties (14-28 Å2 for the C atoms). All the interactions between the enzyme and the ligand are clearly defined, except for the flexible galactopyranosyl unit. There is electron density present near the mutated Q165, which is remote from the ligand; it was modelled as acetate, which fits the density well. There was no acetate in the crystallisation buffer, but most probably it was a contaminant during purification or crystallisation, or was present in the cell growth media, similar to the unknown ligand described as propionate in 5G4Z 37. [...] 19,20,24) with the conserved residues His164, Trp170, Phe171, Tyr227, Trp279 (Fig. 2C).
As described for the homologous enzymes 19,20,24, YpenMan26A Tyr227 is involved in a hydrogen bond with the catalytic nucleophile Glu257, whilst the aromatic amino acids Trp170 and Trp279 stabilise the mannopyranose rings at the −1 and +1 subsites, respectively (Fig. 2C). Like PansMan26A, YpenMan26A displays a prominent −4 subsite, with stacking interactions between the mannopyranose ring and two aromatic residues, Trp109 and Trp110, and hydrogen bonds between Asp61, Arg66 and the mannopyranose ring (Fig. 2C). The −3 subsite appears to bind more weakly, as judged from the ligand-enzyme interactions. In the −2 subsite the two aromatic residues, Phe113 and Tyr114, equivalent to Phe248 and Tyr249 in PansMan26A, stabilise the interactions with the mannopyranose unit. Previously, enzyme interactions with a galactopyranosyl substituent attached to a mannopyranosyl unit within the −1 subsite of CjapMan26C have been described 21. Interestingly, because of the captured ligand in the present study, it is possible to identify interactions between the galactopyranose unit and YpenMan26A in the −2 subsite that have not previously been described. Gln36, Asp37, and Asp58 are involved in hydrogen bonds with the galactose residue. Asp37 has a double conformation in the crystal structure, possibly because the amino acid conformation shifts upon ligand binding. PansMan26A has a Glu172 instead of the Asp37 in YpenMan26A, but otherwise the enzymes have essentially identical environments for interactions with the galactose residue. Out of the six closest structural matches (Table 2), only PansMan26A (3ZM8) accommodates galactopyranosyl residues in the −2 subsite like YpenMan26A. A surface view of YpenMan26A and CjapMan26C (2VX6) with their ligands superimposed (the MGG from YpenMan26A and a bound α-6³-galactosyl-mannotetraose (MGMM) in the −2 to +2 subsites of CjapMan26C) shows that the ligands overlap well.
The data thus indicate accommodation of galactopyranosyl residues in the −3, −2 and −1 subsites of both enzymes (Fig. S2). These superimpositions show that CjapMan26C does not accommodate the galactopyranosyl unit in the −2 subsite, where the moiety is pointing into the enzyme structure, whereas YpenMan26A accommodates galactopyranosyl moieties in −3, −2 and −1 (Fig. S2). The data also show that YpenMan26A has a more open active site than CjapMan26C (Fig. S2).

Design of two YpenMan26A variants inspired by Wsp.Man26A. A sequence similarity search with the YpenMan26A sequence, using the NCBI protein-protein BLAST (Basic Local Alignment Search Tool at http://www.ncbi.nlm.nih.gov/BLAST/, against the non-redundant protein sequences database) 38, identified the A. nidulans GH26 endomannanase (Swissprot ID Q5AWB7 26) with 67.5% amino acid identity as the closest characterised enzyme. A multiple sequence alignment of nine fungal GH26 endomannanases showed that the amino acids that take part in ligand binding in YpenMan26A are highly conserved (Fig. 3, red stars) (see later paragraph for discussion of differences between sequences of the GH26 core domains with and without a CBM35). However, Wsp.Man26A has two striking differences compared to YpenMan26A and the other endomannanases. The first is in the −2 subsite (YpenMan26A Asp37), where the analysed endomannanases have either an Asp or a Glu, while Wsp.Man26A has Thr (Fig. 3). The second is in the −4 subsite (YpenMan26A Trp110), where the analysed endomannanases have Trp or Tyr, while Wsp.Man26A has His (Fig. 3). von Freiesleben et al. 1 showed that YpenMan26A and Wsp.Man26A differ in their substrate preferences for locust bean gum and guar gum. YpenMan26A barely discriminated between the two substrates, whilst Wsp.Man26A had an approximately four times higher initial hydrolysis rate on locust bean gum than on guar gum (Fig. 4B, data adapted from von Freiesleben et al. 1), indicating that this enzyme was more hindered by, or had less affinity for, the increased amount of galactose substitutions in guar gum. In the present study, the hydrolysis product profiles from full conversion of guar gum were analysed using the DNA sequencer-Assisted Saccharide analysis in High throughput (DASH) method 26,39 (Fig. 4A).

Figure 3. Sequence alignment of the catalytic GH26 core region from 9 fungal GH26 endomannanases. Secondary structure elements for YpenMan26A and PansMan26A are displayed above and below the alignment, respectively. Mutated residues D37 and W110 (lilac) and residues involved in ligand binding (red stars) in the YpenMan26A structure, including the two catalytic residues, are marked. The α-helix in PansMan26A (α9) which is nearest the CBM35 and which is a surface loop in YpenMan26A is coloured blue. Identical residues are shown in white on red background. Highly similar residues (when the similarity score assigned to one column is above 0.7) are coloured red and framed in a blue box. The GH26 core sequence of YpenMan26A (AYU65281), AnidMan26A
Kinetics with galactomannans and MGGMM. The Michaelis-Menten kinetic parameters with locust bean gum and guar gum were determined for the two YpenMan26A mutants D37T and W110H and compared with those reported for the wild type enzymes YpenMan26A and Wsp.Man26A (Table 4). The wild-type YpenMan26A had the highest kcat/KM on both substrates, closely followed by Wsp.Man26A on locust bean gum. The kinetic data for the two wild-type enzymes show that Wsp.Man26A is more compromised on the heavily substituted guar gum than YpenMan26A; these results corroborate our previous data 1. The wild type YpenMan26A and the variant D37T had identical KM values on locust bean gum, but D37T had a higher KM than the wild type enzyme on guar gum. This result indicates that the D37T mutant has lower affinity than the wild type enzyme for the galactose residues in the highly substituted guar gum. The absence of a difference in KM on locust bean gum might be due to the presence of unsubstituted blocks of mannan in the locust bean gum mannan 12. It is likely that both the wild type and the D37T variant catalyse the degradation of the unsubstituted, [...]

To validate that the increase in KM for YpenMan26A D37T on the highly substituted guar gum galactomannan was caused by the change in the −2 subsite, kcat/KM on MGGMM was determined for the YpenMan26A wild-type and the D37T mutant by following substrate depletion at low substrate concentration (0.1 mM) by MS (Table 5). A novel MS based method with an internal standard was developed to allow these measurements (relevant spectra, extracted ion chromatograms and a standard curve are shown in Fig. S4). The reaction rate of MGGMM depletion could be described by the equation of Matsui et al. 40 (Fig. S5), which was used to determine kcat/KM.
It is likely that MGGMM binds from the −4 to the +1 subsite in YpenMan26A, and therefore accommodates the galactopyranosyl residues in the −3 and −2 subsite, as in the X-ray structure (Fig. 2C). This can be assumed because of the dominant M5 productive binding mode for YpenMan26A from subsite −4 to +1 (see next section, Fig. 5) and the demonstrated capability of YpenMan26A to accommodate the galactopyranosyl moiety in the −3 and −2 subsites (Fig. 2). Furthermore, AnidMan26A, which is the closest homologue to YpenMan26A, was found to produce MGGM and M from MGGMM 26 .
The D37T variant had a roughly four times lower kcat/KM on MGGMM than the wild type enzyme (19 vs 84 s−1·mM−1, Table 5), showing that the mutant has lower kcat and/or higher KM (probably a combination of both, as for the individual kinetic parameters determined on guar gum). The observed kcat/KM values for the wild-type YpenMan26A and the D37T variant are at the same level as the kcat/KM values reported for other fungal endomannanases on M5, which were found to range from 23 to 163 s−1·mM−1 for the GH5 endomannanases from A. nidulans and Trichoderma reesei 41 and to be 22 s−1·mM−1 for PansMan26A 24. The bacterial GH26 endomannanase from B. ovatus, BovaMan26A, had a kcat/KM of 247 s−1·mM−1 on M5 22. The lower kcat/KM of the D37T variant emphasises that substitution of Asp37 with Thr decreases the affinity for the galactopyranosyl moiety in the −2 subsite. The lower kcat/KM on MGGMM obtained for the D37T mutant compared to the wild type is consistent with the expected increase in distance between the galactopyranosyl unit and the amino acid residue when Asp is substituted with Thr (Fig. 4C).
Productive binding of M5. M5 hydrolysis product analysis using HPAEC combined with solvent isotope labelling and mass spectrometry (MS) analysis 24,42 was used to estimate the relative frequency of productive binding modes for the YpenMan26A wild type and W110H mutant. The HPAEC product quantification showed a clear difference between the wild type and the W110H variant (Fig. S6), with the wild type preferentially producing M4 and M1 (89% relative productive binding frequency) with little formation of M3 and M2 (11%). For the W110H mutant the major hydrolysis products were M3 and M2 (70%), with some M4 and M1 (30%). Because two productive binding modes can give rise to the same products (M5 can for example be hydrolysed into M4 and M1 through removal of either the reducing end or the non-reducing end mannopyranosyl unit), the HPAEC data were combined with an in situ labelling, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis procedure 24,42, in which M5 hydrolysis is performed in 18O-water to obtain product ratios of 18O-labelled versus ordinary 16O-products. The newly formed reducing end will be labelled with 18O (heavy product), while the "leaving group" saccharide (light) of each catalytic event will not. With the MS analysis, it is thus possible to distinguish between M4 produced by M5 binding from subsite −4 to +1 (generating heavy M4) and M5 binding from subsite −1 to +4 (generating light M4). The heavy versus light product ratios obtained for M3 and M4 were used to calculate the relative binding frequencies of the binding modes that generate these products, respectively (Fig. 5).
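The way the HPAEC product fractions and the 18O labelling ratios combine into binding mode frequencies can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the function name is hypothetical, the 0.89 product fraction is the wild-type value quoted above, and the heavy fractions of M4 and M3 used in the example call are made-up placeholders, not measured values from the study.

```python
# Minimal sketch: combine HPAEC product fractions with heavy/light (18O/16O)
# fractions to estimate relative productive binding mode frequencies for M5.
# Heavy M4 arises from binding in subsites -4 to +1, light M4 from -1 to +4;
# heavy M3 arises from -3 to +2, light M3 from -2 to +3.
# The function name and the example heavy fractions are assumptions.

def binding_mode_frequencies(frac_m4_m1: float,
                             heavy_frac_m4: float,
                             heavy_frac_m3: float) -> dict:
    """Return relative frequencies of the four productive M5 binding modes."""
    for value in (frac_m4_m1, heavy_frac_m4, heavy_frac_m3):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all fractions must lie between 0 and 1")
    frac_m3_m2 = 1.0 - frac_m4_m1
    return {
        "-4 to +1": frac_m4_m1 * heavy_frac_m4,
        "-1 to +4": frac_m4_m1 * (1.0 - heavy_frac_m4),
        "-3 to +2": frac_m3_m2 * heavy_frac_m3,
        "-2 to +3": frac_m3_m2 * (1.0 - heavy_frac_m3),
    }

# 0.89 is the wild-type M4 + M1 fraction from HPAEC; 0.90 and 0.50 are
# placeholder heavy fractions chosen only to demonstrate the calculation.
modes = binding_mode_frequencies(0.89, heavy_frac_m4=0.90, heavy_frac_m3=0.50)
for mode, freq in modes.items():
    print(f"subsites {mode}: {freq:.1%}")
```

With these placeholder inputs the −4 to +1 mode comes out at about 80%, which shows how a dominant heavy M4 signal translates into a dominant −4 to +1 binding mode.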

Table header (values not recovered): Locust bean gum and Guar gum, each with the columns kcat (s−1), KM (mg/ml) and kcat/KM (ml/(mg·s)).

The data show that for wild-type YpenMan26A, the dominant productive M5 binding mode is from subsite −4 to +1 (80% binding frequency) (Fig. 5), but that the frequency of this mode is significantly reduced (28%) for the W110H mutant. Instead, the dominant productive M5 binding mode is shifted to cover subsites −3 to +2 (63% binding frequency). This is probably a consequence of Trp110 in the −4 subsite being changed to His, resulting in a weaker subsite. It is also possible that the W110H substitution has caused a slight change in the global active site fold, resulting in slightly reduced thermostability (Table 3) and decreased activity (Table 4). However, on locust bean gum it is mainly KM, and to a smaller extent kcat, that changes when comparing the W110H variant with the YpenMan26A wild-type, indicating that the affinity for the substrate is dramatically changed while the hydrolysis rate is affected to a lesser extent. These results suggest that the W110H substitution has caused changes to the binding subsites and not to the overall fold.

Differences in the catalytic GH26 domain in fungal endomannanases with and without CBM35.
Most regions are highly conserved between YpenMan26A and PansMan26A (Fig. 2A,C), but YpenMan26A lacks an N-terminal CBM35 domain. From the superimposition of the two crystal structures (Fig. 2A), it is seen that the main difference in the secondary structure between the core modules of the two enzymes is in the area which approaches the CBM35 of PansMan26A, where PansMan26A has an α-helix and YpenMan26A a surface loop. Water-mediated interactions occur between Ala402 and Gln404 in the PansMan26A core domain and Leu58 and Ser130 in its CBM35 and linker, respectively. Couturier et al. 24 also report that a hydrophobic patch comprising Leu58 and Leu130 on the surface of the CBM35 stands in front of a cluster of hydrophobic residues, Ala402, Tyr403 and Leu399, of the core domain. These interactions would not be established if the PansCBM35 were appended to YpenMan26A, because of differences in the amino acid sequence and the flexible nature of the surface loop. The multiple sequence alignment (Fig. 3) of the GH26 core domains of nine fungal GH26 endomannanases (two wild-type core enzymes, five with an N-terminal CBM35 and two with a CBM35 and a C-terminal CBM1) confirms variation in the region in and around α9 in PansMan26A (Fig. 3, marked blue), the area of the core domain approaching the CBM35. The seven enzymes with a CBM35 have sequences identical to that of PansMan26A in this region (LQAY; for AstiMan26A it is MQLY), which forms an α-helix in PansMan26A, while the two enzymes with no CBM35 have a different and seemingly more variable sequence (TGGV for YpenMan26A and MRED for AnidMan26A). From this analysis, it seems that co-evolution has occurred between the GH26 core domain and the CBM35. It is likely that the core domain evolved to accommodate and maybe help position the CBM35. When the CBM35 is absent, α9 is not needed.

Discussion
Data presented here add to the current understanding of fungal GH26 endomannanases, which appear to be conserved in their known functional characteristics. Characterised fungal GH26 endomannanases, including YpenMan26A, have a characteristic ligand binding site with a strong −4 subsite, and a dominant M5 binding mode from the −4 to +1 subsite 24,26,27, in contrast to at least some fungal GH5 endomannanases (including PansMan5A), which mainly bind M5 from the −3 to the +2 subsite 24. To date, the fungal GH26 endomannanases which have been analysed with a focus on the accommodation of galactopyranosyl units are able to degrade highly substituted galactomannans by accommodating galactose substitutions at least in the −3, −2, −1 and +1 subsites, as judged by biochemical data and crystal structures. The biochemical data include the observations that PansMan26A and AnidMan26A produce α-galactosylmannose (G) as their dominant hydrolysis product from guar gum galactomannan and that AnidMan26A catalyses the hydrolysis of MGGMM to MGGM and mannose 26. The structural data include the crystal structure of PansMan26A 24 and the homology model of AnidMan26A, which both show an open active site cleft with space for galactose substitutions 26. Furthermore, our current crystal structure of YpenMan26A with bound MGG from the −4 to the −2 subsites, and the observation that the amino acids participating in MGG binding in YpenMan26A are highly conserved between studied GH26 endomannanases (Fig. 2), further support this hypothesis. Some fungal GH5 endomannanases, e.g. TresMan5A from T. reesei, have been found to accommodate galactopyranosyl residues in the −1 subsite 43, but not in the −2 and +1 subsites 26. Among the bacterial GH26 endomannanases there is variation in the ability to accommodate multiple galactopyranosyl residues in the active site cleft, exemplified by BovaMan26A and BovaMan26B from Bacteroides ovatus 22.
We show that a single mutation in the substrate binding amino acids can result in altered binding modes or substrate affinity, as seen for the YpenMan26A wild-type and mutants investigated in the present study. Of the 17 amino acids involved in ligand binding (including the two catalytic residues), only three residues were not conserved among the nine fungal GH26 endomannanases compared in this study (Fig. 3). At two of these positions, Wsp.Man26A differed from the rest of the endomannanases. Mutation studies showed that W110H shifted the dominant productive M5 binding mode of YpenMan26A from covering the −4 to +1 subsites to the −3 to +2 subsites, emphasising the importance of Trp110 in the strong −4 subsite. The D37T mutation lowered the affinity for a galactopyranosyl unit in the −2 subsite of YpenMan26A. A third variation in ligand binding amino acids among the studied GH26 endomannanases was at position Asn280 in YpenMan26A (Fig. 3). This residue is not conserved between the nine fungal GH26 endomannanases, which might indicate that it is not important for ligand binding, or that it contributes to different affinity for galactose in the −2 subsite, similar to the D37T mutation investigated in the present study. Indeed, fungal GH26 endomannanases were shown to have different ratios between their initial rates on locust bean gum and on guar gum 1, indicating variations in galactose affinity and/or tolerance, which can perhaps be explained by variations at this position (Asn280 in YpenMan26A, Fig. 3). Detailed knowledge about binding mode and affinity for substitutions in different subsites is important when using these enzymes to produce specific oligosaccharides, e.g. for prebiotics or alkyl mannooligosides.
As seen from the superimposition of YpenMan26A and PansMan26A (Fig. 2A) and the sequence alignment of nine fungal GH26 endomannanases (Fig. 3), the main difference in their catalytic domains appears to be in the area approaching the CBM35 (if present). The GH26 core module of the enzymes with a CBM35 seems to have evolved to harbour this big binding domain (15 kDa) in close proximity to the core, by aid of an α-helix (α9), whereas the wild-type enzymes with no CBM35, YpenMan26A and AnidMan26A, have a less structured surface loop in this area. The α9-helix in PansMan26A is situated with the end of the helix pointing directly into the site where the linker is attached to the CBM35. It is possible that this α-helix plays an important role in positioning the CBM35. It is also possible that the position we see in the crystal structure of PansMan26A is not that of the CBM35 in solution, and it is likely that the core domain and the CBM35 can come into even closer contact, perhaps facilitated by ligand binding. A similar event has been reported for processive GH9 endoglucanases, for which a CBM3c module was shown to align with the catalytic cleft of the GH9 module, presumably forming one functional entity 44. The linker in these GH9 cellulases is wrapped around the core domain, similar to the linker in PansMan26A 24, and contributes significantly to the positioning of the CBM3c.

Conclusions
This study identified important amino acids for binding galactomannan in the −4 to −2 subsites of YpenMan26A, by solving and analysing its crystal structure in complex with MGG. The −2 subsite in particular has multiple interactions with the galactopyranosyl side group. The study also highlights the high sequence similarity of known fungal GH26 endomannanases, with conserved ligand binding amino acids in the active site cleft. These results strongly indicate that the capability of accommodating multiple galactopyranosyl side-groups in the binding cleft is conserved among the fungal enzymes in the GH26 family. The two YpenMan26A variants, W110H and D37T, respectively shifted the dominant M5 binding mode from covering the −4 to +1 subsites to covering the −3 to +2 subsites and lowered the affinity for galactopyranosyl residues in the −2 subsite. The crystal structure of YpenMan26A has a unique surface loop when compared to the crystal structure of PansMan26A, which appears to be a consequence of the enzyme lacking a CBM35. Known fungal GH26 endomannanases, including YpenMan26A, seem tailored for hydrolysing highly substituted galactomannans. Understanding the intimate enzyme-substrate interactions and the possibilities of changing product profiles and substrate affinities is important for fine-tuned optimization and utilization of these enzymes in industrial applications.

Construction of variants.
The gene sequence encoding YpenMan26A (GenBank sequence ID: AYU65281) was used to make the mutated constructs. E165Q was introduced into the gene sequence by PCR using synthetic oligonucleotides, replacing the codon GAG at position 165 of the mature peptide with CAG. PCR was conducted for the 5′ fragment and 3′ fragment separately using Phusion High-Fidelity DNA Polymerase (ThermoFisher Scientific) under the following conditions: 98 °C for 2 min, 35 cycles of 98 °C for 10 sec and 72 °C for 150 sec, followed by 72 °C for 10 min. The PCR products were gel purified and used as template for a second round of PCR, using the gene flanking primers to amplify the full-length gene with the native signal peptide. The full-length PCR product was cloned into pDAu222 45, an Aspergillus expression vector under the control of a NA2-tpi double promoter, using the BamHI and XhoI restriction sites, and its sequence was determined. The resulting pDAu222-YpenMan26A-E165Q expression vector was transformed into A. oryzae MT3568. MT3568 is an amdS (acetamidase) disrupted derivative of A. oryzae Jal_355 46 in which pyrG auxotrophy was restored in the process of inactivating the A. oryzae amdS gene. Secretion of YpenMan26A E165Q in the culture supernatant of the recombinant MT3568 clones was confirmed by SDS-PAGE. Mutants containing the D37T and W110H substitutions, respectively, were made as synthetic full-length cDNA constructs with the native signal peptide (ThermoFisher Scientific) and cloned into pDAu222 using the BamHI and XhoI restriction sites. For D37T, the codon GAC at position 37 of the mature peptide was replaced with ACC. For W110H, the codon TGG at position 110 of the mature peptide was replaced with CAC. The constructs were verified by sequencing and the resulting pDAu222 expression vectors were transformed into A. oryzae MT3568. Secretion of mutants in the culture supernatant of recombinant MT3568 clones was confirmed by SDS-PAGE.
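The codon replacements listed above can be checked against the standard genetic code. The Python sketch below is only an illustrative consistency check; the dictionary holds just the six codons mentioned in the text, and all names are hypothetical.

```python
# Illustrative check that each codon replacement encodes the intended
# substitution, and how many bases change. Only the six codons named in the
# text are included (standard genetic code); names are hypothetical.

CODON_TO_AA = {
    "GAG": "E",  # Glu
    "CAG": "Q",  # Gln
    "GAC": "D",  # Asp
    "ACC": "T",  # Thr
    "TGG": "W",  # Trp
    "CAC": "H",  # His
}

# variant name -> (original codon, replacement codon)
SUBSTITUTIONS = {
    "E165Q": ("GAG", "CAG"),
    "D37T": ("GAC", "ACC"),
    "W110H": ("TGG", "CAC"),
}

def base_changes(codon_a: str, codon_b: str) -> int:
    """Count the positions at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))

for name, (old, new) in SUBSTITUTIONS.items():
    encoded = f"{CODON_TO_AA[old]}{name[1:-1]}{CODON_TO_AA[new]}"
    assert encoded == name, f"codons do not encode {name}"
    print(f"{name}: {old} -> {new}, {base_changes(old, new)} base change(s)")
```

The check makes explicit that E165Q differs from the wild-type codon by a single base, whereas D37T and W110H require two and three base changes, respectively.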
Expression and purification. The fungal wild-type GH26 endomannanases Wsp.Man26A and YpenMan26A, as well as the YpenMan26A mutants D37T, W110H and E165Q, were recombinantly expressed in A. oryzae MT3568, an amdS disrupted derivative of A. oryzae Jal_355 46. The enzymes, wild-types and variants, were purified to electrophoretic purity using hydrophobic interaction and ion exchange chromatography. The low-activity YpenMan26A E165Q variant, used for crystallisation, was further purified using size-exclusion chromatography and deglycosylated with Endoglycosidase H (Roche). The identity of the purified endomannanases was validated by mass spectrometry analysis of a tryptic digest of the protein band excised from an SDS-PAGE gel. Protein concentrations were determined by UV absorption at 280 nm using theoretical extinction coefficients (ε). ε at 280 nm of all proteins were estimated by GPMAW 9.20 (Lighthouse Data) and were based on mature proteins without modifications.

Data collection, structure solution and refinement. All computations were carried out using programs from the CCP4 suite v. 7.0 47. For the MGGMM-YpenMan26A complex, data were collected at the Diamond Light Source beamline I04 to 1.36 Å resolution and processed using xia2 48. The structure was solved using MOLREP 49 with PansMan26A (PDB entry: 3zm8; Couturier et al. 24; sequence identity: 47.7%) as template. The structure was refined using REFMAC5 50 iterated with manual model building/correction in Coot 51. The final model was validated using MolProbity 52 as part of the Phenix package 53. Data-processing and refinement statistics are given in Table 1. [...] Man26A (Fig. 3).

Crystallisation. The inactive
Thermal stability. The thermal stability at pH 5.0 was investigated with Differential Scanning Calorimetry (DSC) following an established protocol 26. The thermal midpoint (Tm) was determined as the top of the protein denaturation peak, with an accuracy of ±1 °C.

Initial rates and analysis of product profiles by DASH. The initial rates on locust bean gum and guar gum by the endomannanases were determined with 2.5 mg/ml substrate in 50 mM sodium acetate pH 5.0 at 37 °C. The hydrolytic activity was determined after 15 min in a 200 µl hydrolysis volume. Released reducing sugars were measured with the 4-hydroxybenzoic acid hydrazide (PAHBAH) method described by Lever 58, with mannose as standard. All hydrolysis assays were carried out at 7 different endomannanase doses as described elsewhere 26. Initial rates were calculated in the initial linear range of the hydrolysis. Guar gum hydrolysis product profiles at high conversion (26-36%) were analysed by DASH after inactivation by heating at 95 °C for 15 min. APTS (9-aminopyrene-1,4,6-trisulfonate) labelling and analysis of the labelled saccharides were carried out as described elsewhere 26,39.
Kinetics with locust bean gum and guar gum. The kinetic constants for locust bean gum and guar gum hydrolysis were determined by assaying the initial endomannanase rates at different substrate concentrations (10 to 0.1 mg/ml) using the PAHBAH assay as described above.

100% reaction amplitude was used to ensure fragmentation of the precursor ion. The capillary voltage was set at 4.5 kV, the end plate offset was 0.5 kV, the nebulizer pressure was 3.0 bar, the dry gas flow was 12.0 l/min, and the dry gas temperature was set to 280 °C. The buffer concentration, 1 mM sodium acetate pH 5, was set as low as possible to minimize ion suppression without compromising the pH of the reaction. The total reaction volume was 500 µl and the sample was incubated directly in an HPLC vial in the HPLC autosampler. The reaction was started by adding enzyme to 2 nM and 6 nM for the wild-type YpenMan26A and the D37T variant, respectively. The first sample was taken 2 min after enzyme addition. Thereafter, sampling was performed every 5.4 min (including the sampling procedure), with the autosampler injecting 4 µl of sample directly into the flow leading to the MS. The enzyme reaction was quenched immediately on entering the flow path because the mobile phase was at pH 2.7, and detection occurred approx. 0.5 min after injection. Total acquisition time was set to 4 min. The enzyme reactions were followed for a maximum of 50 min, but only data describing the initial phase of the reaction (less than 25% conversion of substrate) were used for estimating $k_\mathrm{cat}/K_\mathrm{M}$. Details on the extracted ion chromatograms used for quantification of MGGMM and XXXG can be seen in Fig. S4. Data were analysed and quantified using Compass DataAnalysis 4.2 and Compass QuantAnalysis 2.2 provided by Bruker Daltonics. $\ln(S_0/S_t)$ was plotted as a function of time ($t$) (Fig. S5) and $k_\mathrm{cat}/K_\mathrm{M}$ was calculated as described by Matsui et al. 40 : $k = \ln(S_0/S_t) = (k_\mathrm{cat}/K_\mathrm{M}) \cdot [\mathrm{enzyme}] \cdot t$, where $S_0$ is the substrate concentration at time zero and $S_t$ is the substrate concentration at time $t$.
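The relation above can be illustrated with a minimal Python sketch. It assumes that the slope of ln(S0/St) versus time is obtained by a least-squares line forced through the origin, which the text does not specify; the function name `kcat_over_km` and its parameters are hypothetical, and concentrations may be in any unit as long as S0 and St share it.

```python
import math

def kcat_over_km(times_s, s_t, s0, enzyme_molar, max_conversion=0.25):
    """Estimate kcat/KM (M^-1 s^-1) from substrate depletion data.

    Uses ln(S0/St) = (kcat/KM) * [enzyme] * t and keeps only points
    from the initial phase (conversion below max_conversion).
    Assumption: the line is fitted through the origin.
    """
    xs, ys = [], []
    for t, s in zip(times_s, s_t):
        conversion = 1.0 - s / s0
        if s <= 0 or conversion >= max_conversion:
            continue  # outside the initial phase, skip
        xs.append(t)
        ys.append(math.log(s0 / s))
    sum_tt = sum(x * x for x in xs)
    if sum_tt == 0:
        raise ValueError("no usable data points in the initial phase")
    # Least-squares slope of y = slope * t (no intercept term)
    slope = sum(x * y for x, y in zip(xs, ys)) / sum_tt
    return slope / enzyme_molar
```

With sampling every 5.4 min, `times_s` would hold the sampling times in seconds and `enzyme_molar` would be 2e-9 or 6e-9 for the two enzymes described above.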
Productive M5 binding modes. The hydrolytic cleavage pattern of M5 was determined for the YpenMan26A wild-type and the W110H variant by the previously established ¹⁸O-water product labelling methodology 24,42 . First, M5 hydrolysis products were analysed and quantified by high-performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) using a Dionex ICS-5000 with a Carbo-Pac PA-200 column and guard column. For this, duplicate incubations of 1 mM M5 and 50 nM wild-type enzyme or 200 nM W110H mutant in 1.5 mM sodium acetate buffer, pH 5, were stopped by boiling at timed intervals (30 min to 3 h). Data after 30 min incubation for YpenMan26A or 2 h for the W110H mutant (approximately 30% hydrolysis) were used. The quantification made it possible to distinguish between productive M5 binding modes that generate M4 and M1 and those that generate M3 and M2. However, HPAEC alone cannot distinguish between the two possible binding modes generating M4 and M1 (i.e. binding either from subsite −4 to +1 or from −1 to +4), nor between the two binding modes that generate M3 and M2 (i.e. binding from subsite −3 to +2 or −2 to +3). Therefore, incubations as above were also set up at 8 °C using 97% H₂¹⁸O as stock solvent, reaching 92% ¹⁸O-water in the reactions. Duplicate reactions were stopped after 30 min (for wild-type) and 2 h (for W110H) by directly spotting 0.5 μl samples with 0.5 ml matrix (10 mg/ml 2,5-dihydroxybenzoic acid) on a stainless-steel plate, followed by immediate drying with warm air. Spectra were then obtained by MALDI-TOF MS and used to calculate the ¹⁸O over ¹⁶O product ratios using the monoisotopic peak areas as previously described 24,42 . Since M5 hydrolysis in ¹⁸O-water generates products where the newly formed reducing end becomes ¹⁸O-labelled (and other chain ends do not), the ¹⁸O over ¹⁶O product ratios can be used to calculate the relative frequency of the productive binding modes mentioned above (i.e. M5 binding from subsite −4 to +1 versus subsite −1 to +4, or binding from subsite −3 to +2 versus subsite −2 to +3) 24,42 . The procedure involves two calculated corrections for the product ratio determination: one for the (M + 2) natural isotope peak of the light (¹⁶O) species, which overlaps with the heavy (¹⁸O) peak, and a second for the presence of 8% ordinary H₂¹⁶O in the hydrolysis reaction.
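The two corrections and the resulting binding-mode frequencies can be sketched in Python as follows. This is an illustration under stated assumptions and not the published procedure of refs 24,42: the function names are hypothetical, the natural (M + 2)/M ratio of the light species must be supplied by the user, the ¹⁸O fraction is read from the M4 and M3 products, and labelled and unlabelled species are assumed to give the same monoisotopic response.

```python
def labelled_fraction(area_light, area_heavy, m2_over_m_natural,
                      h2_18o_fraction=0.92):
    """Fraction of a product carrying the newly formed reducing end.

    area_light / area_heavy: monoisotopic peak areas of the 16O and
    18O species. m2_over_m_natural: natural (M + 2)/M peak ratio of
    the light species (user-supplied, hypothetical parameter).
    """
    # Correction 1: remove the light species' (M + 2) isotope peak,
    # which overlaps with the heavy peak.
    heavy = max(area_heavy - m2_over_m_natural * area_light, 0.0)
    observed = heavy / (area_light + heavy)
    # Correction 2: only 92% of the water is H2(18)O, so a labelled
    # product shows up as heavy only 92% of the time.
    return min(observed / h2_18o_fraction, 1.0)

def binding_mode_frequencies(frac_m4_m1, f_m4_labelled, f_m3_labelled):
    """Combine the HPAEC product split with the 18O labelling data.

    frac_m4_m1: fraction of cleavages giving M4 + M1 (from HPAEC).
    A labelled M4 implies binding from -4 to +1; a labelled M3
    implies binding from -3 to +2.
    """
    frac_m3_m2 = 1.0 - frac_m4_m1
    return {
        "-4 to +1": frac_m4_m1 * f_m4_labelled,
        "-1 to +4": frac_m4_m1 * (1.0 - f_m4_labelled),
        "-3 to +2": frac_m3_m2 * f_m3_labelled,
        "-2 to +3": frac_m3_m2 * (1.0 - f_m3_labelled),
    }
```

The four returned frequencies sum to one, so each value can be read directly as the relative frequency of that productive binding mode.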

Data Availability
All data generated or analyzed during this study are included in this article and its Supplementary Information file.