Characterization of a novel multidomain CE15-GH8 enzyme encoded by a polysaccharide utilization locus in the human gut bacterium Bacteroides eggerthii

Bacteroidetes are efficient degraders of complex carbohydrates, largely thanks to their use of polysaccharide utilization loci (PULs). Integral to PULs are highly specialized carbohydrate-active enzymes, some of which are multicatalytic enzymes composed of multiple linked domains with discrete functions. We present the biochemical characterization of a multicatalytic enzyme from a large PUL encoded by the gut bacterium Bacteroides eggerthii. The enzyme, BeCE15A-Rex8A, has a rare and novel architecture, with an N-terminal carbohydrate esterase family 15 (CE15) domain and a C-terminal glycoside hydrolase family 8 (GH8) domain. The CE15 domain was identified as a glucuronoyl esterase (GE), though with relatively poor activity on GE model substrates, attributed to key amino acid substitutions in the active site compared to previously studied GEs. The GH8 domain was shown to be a reducing-end xylose-releasing exo-oligoxylanase (Rex), based on its activity on xylooligosaccharides but not on longer xylan chains. The full-length BeCE15A-Rex8A enzyme and the Rex domain were capable of boosting the activity of a commercially available GH11 xylanase on corn cob biomass. Our research adds to the understanding of multicatalytic enzyme architectures and showcases the potential of discovering novel and atypical carbohydrate-active enzymes from mining PULs.


Scientific Reports | (2021) 11:17662 | https://doi.org/10.1038/s41598-021-96659-z | www.nature.com/scientificreports/

[…] range of substrates revealed the C-terminal GH8 domain to be a Rex, and the full-length protein was named BeCE15A-Rex8A. Direct synergy between the two catalytic domains could not be observed on GAX-rich corn cob biomass, possibly attributable to the minimal GE activity, though the Rex domain was able to boost the activity of a GH11 xylanase.
The product of HMPREF1016_02159 in PUL 27 has a very unusual enzyme architecture, encoding a predicted multicatalytic enzyme comprising an N-terminal CE15 domain and a C-terminal GH8 domain, and a very short potential linker region. When compared to sequences in the NCBI protein database, a similar architecture was only found in three other species from the Bacteroides genus (Bacteroides sp. NSJ-48, B. stercoris, and B. gallinarum), each encoding one uncharacterized protein with 89-94% sequence identity (100% seq. coverage) to the B. eggerthii enzyme 46,48 . Furthermore, two homologs with lower similarity were found encoded by the more distantly related Prevotella sp. BP1-148 and Prevotella sp. BP1-145 (55% seq. id., 97% seq. coverage). Of these, only the B. gallinarum enzyme is found in a very large PUL likely targeting xylan (encoding enzymes from e.g. GH10, GH43, GH67, GH115, CE1; Fig. 1b), and the Prevotella enzymes are encoded by two identical small PULs that in addition to the CE15-GH8 enzyme only encode CAZymes from GH3 and CE1 (Fig. 1c) 14 . The CE15-GH8 architecture thus appears confined to the Bacteroidetes phylum and is strongly suggested to be involved in xylan turnover.
The individual domains of the B. eggerthii enzyme were compared to characterized enzymes from CE15 and GH8, respectively. The CE15 domain was most similar to OtCE15B from the soil bacterium Opitutus terrae (seq. […]) 37 . OtCE15B and the CE15 domain investigated here were phylogenetically more closely related to characterized fungal GEs than to other characterized GEs of bacterial origin, and both contained a key disulfide bridge locking the catalytic serine and histidine in place, as is common in fungal GEs (Fig. S1). The catalytic triad was found to be conserved in BeCE15A (Ser230, Glu253, His357). In contrast to CE15, many more members of GH8 have been biochemically characterized 13 . Previous work has shown phylogeny to be a useful tool to predict enzyme specificities in GH8 using a limited number of sequences 49 . As the number of characterized GH8 members has since grown significantly, we constructed a new phylogenetic tree using the catalytic domains of all characterized members of GH8 as well as the B. eggerthii GH8 domain (Fig. 2). The tree was largely in agreement with the previous one, with different specificities mostly clustering into separate clades, including a clade encompassing all xylanases characterized to date. Rex enzymes formed a separate branch within the xylanase clade. The B. eggerthii GH8 domain was found to be most similar to BiRex8A from Bacteroides intestinalis 50 ; the enzymes share 84% sequence identity, which strongly indicates a similar function. BiRex8A was characterized simultaneously with BiXyn8A from the same organism; the latter was shown to be an endo-xylanase, as it hydrolyzed both wheat arabinoxylan and oat spelt xylan into XOs 50 . BiRex8A, on the other hand, showed no xylanase activity but was instead able to release xylose moieties from the reducing end of XOs.
The same study demonstrated that both BiRex8A and BiXyn8A shared the same conserved catalytic residues 50 , which are also conserved in the BeRex8A domain (Glu483, Asp541 and Asp679; Fig. S2).
Biochemical characterization of the BeCE15A-Rex8A CE15 domain. To confirm the putative functions of BeCE15A-Rex8A, the enzyme was heterologously produced in E. coli both as a full-length enzyme (91.5 kDa) and as the individual domains BeCE15A (46.8 kDa; amino acid residues 32-413) and BeRex8A (50.0 kDa; amino acid residues 414-812). BeCE15A-Rex8A and BeCE15A were assayed on the standard GE substrates benzyl glucuronoate (BnzGlcA), allyl glucuronoate (AllylGlcA), methyl glucuronoate (MeGlcA) and methyl galacturonoate (MeGalA) (Fig. 3). In contrast to previously studied GEs, none of the reactions were saturable up to concentrations of 40 mM substrate, precluding determination of either k cat or K M parameters. However, the catalytic efficiency (k cat /K M ) could be determined using linear regression and showed that BeCE15A-Rex8A and BeCE15A were most active on BnzGlcA, with the activity decreasing successively on AllylGlcA, MeGlcA, and MeGalA (Table 1). This is in accordance with other characterized bacterial GEs, and consistent with the hypothesis that GEs prefer bulky substrates that are ester-linked to the O-6 position of a uronic acid moiety 37,51,52 , mimicking lignin or a lignin fragment in LCCs. In GEs, the rate-limiting step has been proposed to be the deacylation of the acyl-enzyme intermediate, given the similar k cat values determined for various enzymes to date 52 . The low k cat /K M values of BeCE15A may thus be a result of high K M values, indicating a poor fit of the model substrates in the active site. The isolated BeCE15A was approximately as active on the model substrates as the full-length enzyme, with roughly equal catalytic efficiencies on BnzGlcA, and 1.5-fold higher catalytic efficiency on AllylGlcA. This indicates that the truncation of BeCE15A-Rex8A into BeCE15A did not negatively affect the GE. 
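The estimation of catalytic efficiency from non-saturating data can be illustrated with a minimal sketch. Under the Michaelis-Menten model, v/[E] = k cat [S]/(K M + [S]), which reduces to v/[E] ≈ (k cat /K M )[S] when [S] is well below K M , so the slope of v/[E] against [S] approximates k cat /K M . The function name, the choice of a least-squares fit through the origin, and the data points below are illustrative assumptions; they are not the fitting procedure or the measurements of this study.

```python
def catalytic_efficiency(substrate_M, rate_per_enzyme_s):
    """Estimate kcat/KM (s^-1 M^-1) as the least-squares slope through the origin.

    substrate_M: substrate concentrations in mol/L (assumed well below KM).
    rate_per_enzyme_s: initial rates divided by enzyme concentration, in s^-1.
    """
    if len(substrate_M) != len(rate_per_enzyme_s) or not substrate_M:
        raise ValueError("need two equally long, non-empty data series")
    numerator = sum(s * v for s, v in zip(substrate_M, rate_per_enzyme_s))
    denominator = sum(s * s for s in substrate_M)
    return numerator / denominator


# Hypothetical, perfectly linear data up to 40 mM substrate (not from the paper).
example_substrate = [0.005, 0.010, 0.020, 0.040]   # mol/L
example_rates = [0.375, 0.750, 1.500, 3.000]       # s^-1
example_efficiency = catalytic_efficiency(example_substrate, example_rates)
```

Because only the slope is fitted, k cat and K M cannot be separated from such data, which matches the limitation described above.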
The observed catalytic efficiencies were minimal compared to the majority of previously studied GEs reported in literature and the activity on BnzGlcA was approximately 500-fold lower than that of TtCE15A from Teredinibacter turnerae, which to date has the highest reported k cat /K M value for this substrate 51 . However, enzymes with even lower k cat /K M values on BnzGlcA than BeCE15A have previously been studied, including the most closely related characterized enzyme OtCE15B from O. terrae (k cat /K M value of 18.6 s −1 M −1 ), which is approximately fourfold lower than that of BeCE15A 37 . OtCE15B is an exception among studied GEs, as it has a tyrosine residue in the equivalent position of the conserved active site arginine residue believed to partake in forming the oxyanion hole and stabilizing the transition state during catalysis 37,52,53 . BeCE15A has an unexpected non-polar phenylalanine residue (Phe231) in equivalent position, which would not be able to electrostatically stabilize the transition state with its side chain (Fig. 4). To investigate whether replacement of this residue with the expected arginine residue would improve the activity on GE model substrates, we constructed an F231R variant of BeCE15A. Instead of increasing the activity, the result was however a complete loss of GE activity. Similarly, a substitution with a tyrosine (F231Y), as present in OtCE15B, also led to a complete loss of GE activity (data not shown).
Biochemical characterization of the BeCE15A-Rex8A GH8 domain. As GH8 is a polyspecific family, the GH8 domain of BeCE15A-Rex8A was assayed on a range of polysaccharides: cellulose, birchwood and beechwood xylan, wheat arabinoxylan, linear ivory nut mannan, mixed linkage β-glucan from barley, as well as starch. No activity could be detected on any of these substrates even after prolonged incubations. Previously studied Rex enzymes have been shown to be either inactive or minimally active on polymeric xylan and are instead active on XOs 44,49,50,54,55 . Similarly, BeRex8A was able to hydrolyze XOs ranging from xylotriose (X 3 ) to xylohexaose (X 6 ), and only trace activity (above 0.02 µM min −1 mg −1 protein) was observed on X 2 when incubated for prolonged periods of time (Fig. 5). X 1 and X 2 were the end products of all reactions. Our time-course analysis shows hydrolysis progress curves highly similar to those of BiRex8A from Bacteroides intestinalis 50 , with the substrates being sequentially shortened into intermediate products, themselves acting as new substrates, and with a concomitant accumulation of X 1 and X 2 as end products (Fig. 5). BeRex8A was not active on pNPxylobiose or borohydride-reduced xylotriose, both of which lack a free reducing-end xylose, further supporting the Rex activity.
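The sequential shortening described above can be sketched as a simple bookkeeping model: each catalytic event removes one xylose (X 1 ) from the reducing end, and the shortened chain serves as the next substrate until it is too short to be cleaved. This is an idealized illustration only. The function name and the cut-off of three xylose units (which ignores the trace activity on X 2 ) are assumptions made for the sketch.

```python
from collections import Counter


def rex_end_products(dp, min_substrate_dp=3):
    """Return end products {chain length: count} for one xylooligosaccharide.

    dp: degree of polymerization of the starting oligosaccharide.
    min_substrate_dp: shortest chain still treated as a substrate (assumed 3).
    """
    if dp < 1:
        raise ValueError("dp must be at least 1")
    products = Counter()
    chain = dp
    while chain >= min_substrate_dp:
        products[1] += 1   # one xylose released from the reducing end
        chain -= 1         # the shortened chain is the next substrate
    products[chain] += 1   # the remaining chain is no longer cleaved
    return dict(products)


xylohexaose_products = rex_end_products(6)
```

In this idealized model every substrate from X 3 to X 6 ends as one X 2 plus the corresponding number of X 1 units, in line with X 1 and X 2 being the only end products observed.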
Attempts to crystallize and determine the structure of BeCE15A-Rex8A or its parts were unfortunately not successful. However, modeling of BeRex8A using Phyre2 56 yielded a predicted protein structure with 95% coverage and 100% confidence based on the structurally determined E70A variant of PbRex8A from Paenibacillus barcinonensis (PDB ID: 6TRH; 42% seq. id. to BeRex8A; Fig. 4d) 57 . PbRex8A was previously shown to have minimal activity on xylan, and its inability to efficiently bind polysaccharides and generate products longer than xylose was attributed to a loop composed of Leu320-His321-Pro322 that blocks the active site after the +1 subsite 57 . Attempts to reduce the size of the loop to open up the active site did not, however, improve the activity on xylan 57 . An equivalent to the Leu320-His321-Pro322 loop in PbRex8A is not found in BeRex8A, but the modeled structure shows a similar active site groove, blocked by a single arginine residue (Arg670; Fig. 4d). This arginine residue appeared to form a tunnel allowing access of unsubstituted oligo- or polysaccharides. We hypothesized that this residue may play a role in the specificity for shorter oligosaccharides and the lack of activity on polysaccharide chains, and constructed an R670A variant. However, this variant was inactive on polymeric xylan, similar to the wild-type enzyme (data not shown). Deeper structural investigation would be needed to shed light on possible interchangeability between Rex and endo-xylanase activity.
Boosting of xylanase hydrolysis of corn cob. Enzymes […] truncated single domain versions (BeCE15A and BeRex8A), the enzymes were assayed for their ability to boost the action of a commercially available GH11 xylanase (Xyn11A), which has previously been used successfully in similar experiments 20,31 . Ball-milled corn cob biomass, which has a high content of GAX 15 , was used as substrate.
No release of sugars was observed when no enzyme was added (data not shown), and similarly no released sugars were detected if BeCE15A-Rex8A, BeCE15A or BeRex8A were added without Xyn11A (data not shown). Addition of Xyn11A (control reaction) led to the release of small amounts of XOs ranging from X 1 to X 6 (Fig. 6).
The main products were X 1 and X 2 with concentrations reaching 1.6 mM each after 30 h, and substantially more X 4 and X 6 were released than X 3 and X 5 . Supplementation of Xyn11A with BeCE15A did not alter XO release substantially compared to the control reaction. Supplementation of Xyn11A with BeRex8A, BeCE15A-Rex8A or an equimolar mix of BeCE15A and BeRex8A increased X 1 (twofold), X 2 (1.3-fold), X 3 (5.6-fold) and X 5 concentrations (twofold), while X 4 and X 6 concentrations were reduced to roughly a third compared to the control reaction. The total xylose equivalents from X 1 -X 6 that were released by Xyn11A when supplemented with BeRex8A increased 20-30% compared to the reaction of Xyn11A alone, and do not appear to stem simply from conversion of longer XOs to short ones by the Rex enzyme. Possibly, the apparent improvement of Xyn11A could be a result of reduced product inhibition. The reason for the inability of BeCE15A to boost xylanase activity on corn cob biomass is not clear but echoes the results of the CE15 domain from the Caldicellulosiruptor kristjanssonii encoded CkXyn10C-GE15A, which similarly did not appear to boost xylanase activity directly either with commercial enzymes or the linked CkXyn10C xylanase domain 32 . Possibly, the effect of GEs on xylanases cannot be monitored by sugar release measurements due to the overall complexity of the (un-pretreated) material and reduced access to LCC esters that the enzymes can target. Alternatively, BeCE15A, with its atypical active site residues, may be more specialized to target structures that were not present or accessible in the here utilized corn cob biomass. The low activity of BeCE15A on GE model substrates, similar to that of OtCE15B 37 , suggests that these enzymes might have a different role in biomass turnover than other so far characterized CE15 enzymes. 
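The comparison of total xylose equivalents above can be made concrete with a short sketch: each oligosaccharide X n contributes n xylose units per molecule, so a pure redistribution of longer XOs into shorter ones leaves the total unchanged, whereas additional release from the biomass raises it. The function names and the X 3 to X 6 control concentrations below are hypothetical placeholders; only the 1.6 mM values for X 1 and X 2 and the fold changes are taken from the text.

```python
def xylose_equivalents(xo_mM):
    """Sum xylose units (mM) over a {degree of polymerization: concentration} map."""
    return sum(dp * conc for dp, conc in xo_mM.items())


def percent_change(control, treated):
    """Relative change of treated versus control, in percent."""
    return 100.0 * (treated - control) / control


# Control: X1 and X2 from the text; X3 to X6 are hypothetical values.
control_mM = {1: 1.6, 2: 1.6, 3: 0.1, 4: 0.3, 5: 0.1, 6: 0.3}
# Fold changes reported for supplementation with the Rex domain.
fold_change = {1: 2.0, 2: 1.3, 3: 5.6, 4: 1 / 3, 5: 2.0, 6: 1 / 3}
supplemented_mM = {dp: control_mM[dp] * fold_change[dp] for dp in control_mM}

gain_percent = percent_change(
    xylose_equivalents(control_mM), xylose_equivalents(supplemented_mM)
)
```

A positive value of gain_percent indicates that more xylose units were solubilized in total, which is the criterion used above to distinguish a genuine boost from mere conversion of long XOs into short ones.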
The main activity of characterized CE15 members to this date has been (4-O-methyl)-glucuronoyl esterase activity 13 , but the incorporation of a CE15 enzyme into a PUL suggests that the activity of BeCE15A supports xylan degradation. Deeper investigation of atypical enzymes such as BeCE15A and OtCE15B holds the potential of adding to our knowledge of enzymatic biomass degradation, and such enzymes might be interesting targets for the improvement of industrial enzyme cocktails. Comparing supplementation of Xyn11A with the full-length BeCE15A-Rex8A and with an equimolar mix of its single domains BeCE15A + BeRex8A did not reveal significant differences in the XO production profiles over the whole course of the experiment (Fig. S3). Deducing the preferred substrate of a multicatalytic enzyme can be challenging due to the highly specialized nature of these proteins and the vast diversity among polysaccharides, especially in the context of the complex cell wall polymer network. A lack of intramolecular enzyme synergy has also been observed for other multicatalytic enzymes, such as FjCE6-CE1 from F. johnsoniae 20 , CkXyn10C-GE15A from C. kristjanssonii 32 , and DmCE1B from Dysgonomonas mossii 31 . Given the complexity of the substrates targeted by these enzymes, which are presumed to be part of LCCs, it is currently unclear whether the lack of observed intramolecular enzyme synergy reflects a genuine absence of synergy, a lack of the right substrate, or another unknown reason. Typically, the domains of multicatalytic enzymes are joined by flexible linkers of varying length or by small domains 29,32 . In BeCE15A-Rex8A, a short potential linker is present between Trp402 and Ala423, although exactly how flexible the linker is remains unclear. While no experimental structural data are available, multiple models constructed using the Phyre2 56 and I-TASSER 58 structural modelling servers suggest that the catalytic domains may be in close contact with each other (Fig. 7). Additionally, the domains appear to be oriented with their active sites facing in opposite directions. Whether the active sites are able to act in close proximity, which depends on the length and flexibility of the putative linker, is currently unclear and would need support from structural data.

Analysis using SignalP 5.0 59 identified a 23 amino acid long signal peptide with 95% likelihood as Sec/SPI, indicating that the protein is likely secreted into the periplasm, but whether BeCE15A-Rex8A is further transported outside the cell is not known. The presumed biological role of GEs would indicate that the target substrate(s) of BeCE15A is found in large LCCs that are unlikely to be imported into the periplasm. Conversely, the Rex activity of BeRex8A would be more in keeping with how final degradation of poly- and oligosaccharides in PULs is believed to mainly occur in the periplasm to prevent "leakage" of metabolizable sugars to surrounding cells 8,9,18,19 . The low GE activity of the enzyme and atypical active site setup might indicate that GlcA in xylan can be esterified with as yet unidentified moieties that are hydrolyzed in the periplasm by B. eggerthii. Identification of such motifs would likely require significant efforts, though enzymes such as BeCE15A and OtCE15B could be highly useful tools in such an endeavor.

Conclusion
In this study we biochemically characterized the multicatalytic enzyme BeCE15A-Rex8A. The N-terminal domain was identified as a GE having minimal activity on model substrates and harboring a highly unusual amino acid substitution close to the catalytic serine that might play an important role in substrate turnover or substrate preferences that are yet unidentified in CE15. The C-terminal domain was identified as a Rex, an activity that has been demonstrated in very few enzymes to date. The here described enzyme architecture of BeCE15A-Rex8A was shown to be very rare and confined to a few PULs within the bacterial phylum of Bacteroidetes. This work further highlights the usefulness of mining PULs for the discovery of novel enzyme types and architectures.

Material and methods
Phylogenetic tree. The amino acid sequences of GH8 enzymes listed as characterized were downloaded from CAZy (Nov 2020), trimmed to only contain the catalytic domains, and subsequently aligned using MUSCLE 60 . The phylogenetic tree was built based on the alignment using IQ-TREE 61 , with automatic finding of the best substitution model (LG + F + I + G4) and 1000 ultrafast bootstraps. The maximum-likelihood tree was visualized using iTOL 62 .

Cloning of BeCE15A-Rex8A and protein variants. The putative BeCE15-GH8 was amplified from genomic DNA of B. eggerthii 1_2_48FAA by PCR (primers listed in Table S2) and the products were cloned by ligation-independent cloning (In-Fusion HD kit; Clontech Laboratories) into a modified pET-28a vector containing an N-terminal His 6 tag and a tobacco etch virus protease cleavage site. A signal peptide predicted at the N-terminal end of the gene encoding BeCE15A-Rex8A (residues 1-31) was not included for protein production. Enzyme variants were created by site-specific mutagenesis by the QuikChange method using the primers listed in Table S2.

Figure 7 (caption fragment). In both models, the GE domain is colored red, the Rex domain is purple, the potential linker region is green, and the active-site residues are blue. In both models the active sites of the two domains are positioned facing away from each other and marked by black arrows.

[…] and 180 rpm). The cells were harvested by centrifugation and lysed by sonication. The resulting protein-containing crude lysate was purified using immobilized metal ion affinity chromatography as previously described 37 .
Purified protein was concentrated and buffer exchanged (BeCE15A in 50 mM Tris pH 8.0 + 100 mM NaCl; BeRex8A in sodium phosphate pH 6.5 + 100 mM NaCl; and BeCE15A-Rex8A in 50 mM Tris pH 8.0 + 250 mM NaCl + 5% w/v glycerol) using 10 kDa cut-off centrifugal filter units (Amicon Ultra-15, Merck-Millipore), and imidazole concentrations were reduced to < 1 mM. Sodium dodecyl sulfate polyacrylamide gel electrophoresis using Mini-PROTEAN TGX Stain-Free Gels (Bio-Rad) was used to verify molecular weight and protein purity. Protein concentrations were determined using a Nanodrop 2000 Spectrophotometer (Thermo Fisher Scientific) using extinction coefficients and molecular weights predicted by Benchling.
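A concentration determination of this kind rests on the Beer-Lambert law, A = ε · c · l. The following minimal sketch shows the conversion from an absorbance reading at 280 nm to a mass concentration. The function name, the absorbance reading, and the extinction coefficient are hypothetical; the 50.0 kDa molecular weight is the value given above for BeRex8A, and a path length of 1 cm is assumed.

```python
def protein_conc_mg_per_ml(a280, ext_coeff_per_M_cm, mw_da, path_cm=1.0):
    """Convert absorbance at 280 nm to mg/mL using the Beer-Lambert law.

    a280: absorbance at 280 nm for the given path length.
    ext_coeff_per_M_cm: molar extinction coefficient in M^-1 cm^-1.
    mw_da: molecular weight in Da (g/mol).
    path_cm: optical path length in cm (assumed 1 cm here).
    """
    if ext_coeff_per_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_per_M_cm * path_cm)   # mol/L
    return molar * mw_da                             # g/L, numerically equal to mg/mL


# Hypothetical reading and extinction coefficient; 50,000 Da as stated for BeRex8A.
example_conc = protein_conc_mg_per_ml(a280=1.0, ext_coeff_per_M_cm=100_000, mw_da=50_000)
```

Dividing the result by the molecular weight again gives the molar concentration needed to set up equimolar enzyme mixtures.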
Biochemical characterization of the CE15 domain. pH dependency was established for BeCE15A by comparing the activity on BnzGlcA in a range of buffers and pH values (Fig. S4). A pH dependency profile for BeRex8A could not be established as the enzyme fell out of solution at pH values other than 6.5 ± 0.5. Assays on model GE substrates (BnzGlcA, AllylGlcA, MeGlcA, MeGalA) were performed at pH 7.5 for comparison to other GEs as previously described 32 .

Biochemical characterization on XOs and complex substrates. All substrates described here were purchased from Megazyme unless stated otherwise. Reactions were incubated at 37 °C with mixing at 500 rpm and contained BeRex8A (2 µM; in 50 mM sodium phosphate buffer pH 6.0 + 100 mM NaCl) and the different substrates. Screening of possible polysaccharide hydrolyzing ability of the Rex8A domain was done using 1.25% w/v cellulose, birchwood xylan, beechwood xylan (Apollo Scientific), wheat arabinoxylan, linear ivory nut mannan, mixed linkage β-glucan from barley, or starch, with sugar release monitored using the dinitrosalicylic acid assay. Xylooligosaccharides tested were xylobiose (X 2 ; 3.2 mM), xylotriose (X 3 ; 3.25 mM), xylotetraose (X 4 ; 3.3 mM), xylopentaose (X 5 ; 2.65 mM) and xylohexaose (X 6 ; 3.33 mM). Samples were flash-frozen in liquid nitrogen, diluted with HCl (0.1 M final concentration) to stop the enzymatic reaction and analyzed using HPAEC-PAD (see below). Corn cob biomass for xylanase hydrolysis studies was produced by processing corn cob (excluding corn grains) in a kitchen blender followed by ball-milling into a fine powder, washing with water, and then freeze-drying. The corn cob was used as substrate (0.45% w/v) with BeCE15A-Rex8A, BeCE15A or BeRex8A, incubated at 37 °C and 1000 rpm in 100 mM sodium phosphate pH 6.5 including 0.5 µM of each enzyme, in various combinations with and without addition of the commercially available endo-β-1,4-xylanase Xyn11A from N. patriciarum (E-XYLNP; Megazyme; concentration in assay 11 µM). The samples were flash-frozen in liquid nitrogen and the reactions were stopped by addition of HCl (0.1 M final concentration) before being analyzed using HPAEC-PAD.
High-performance anion-exchange chromatography with pulsed amperometric detection. HPAEC-PAD was performed on a Dionex ICS-5000+ (Thermo Fisher Scientific) equipped with a Dionex CarboPac™ PA200 column (Thermo Fisher Scientific). To achieve sufficient separation of the XOs, a constant flow of 0.5 mL/min and a multistep gradient (Table S3) were applied using deionized water, 300 mM NaOH, and 1 M NaAc. Prior to use, dissolved oxygen was removed from all solutions by sparging with helium gas.
Structural models of BeCE15A-Rex8A. The model for BeRex8A was generated with Phyre2 56 and based on the structurally determined E70A variant of PbRex8A from P. barcinonensis. Models of full-length BeCE15A-Rex8A, with both domains combined, were generated both with the Phyre2 server 56 and with the I-TASSER server 58 . When selecting a model from I-TASSER, the predicted folding of the individual domains was manually inspected and compared with crystal structures of other Rex and GE domains in order to select the most likely model.