Structure, function, and inhibition of drug reactivating human gut microbial β-glucuronidases

Bacterial β-glucuronidase (GUS) enzymes cause drug toxicity by reversing Phase II glucuronidation in the gastrointestinal tract. While many human gut microbial GUS enzymes have been examined with model glucuronide substrates like p-nitrophenol-β-D-glucuronide (pNPG), the GUS orthologs that are most efficient at processing drug-glucuronides remain unclear. Here we present the crystal structures of GUS enzymes from human gut commensals Lactobacillus rhamnosus, Ruminococcus gnavus, and Faecalibacterium prausnitzii that possess an active site loop (Loop 1; L1) analogous to that found in E. coli GUS, which processes drug substrates. We also resolve the structure of the No Loop GUS from Bacteroides dorei. We then compare the pNPG and diclofenac glucuronide processing abilities of a panel of twelve structurally diverse GUS proteins, and find that the new L1 GUS enzymes presented here process small glucuronide substrates inefficiently compared to previously characterized L1 GUS enzymes like E. coli GUS. We further demonstrate that our GUS inhibitors, which are effective against some L1 enzymes, are not potent towards all. Our findings pinpoint active site structural features necessary for the processing of drug-glucuronide substrates and the inhibition of such processing.

Scientific Reports | (2019) 9:825 | https://doi.org/10.1038/s41598-018-36069-w

and toxic metabolite, SN-38 12,13. Prior to excretion, SN-38 is sent to the liver where uridine diphosphate glucuronosyltransferase (UGT) enzymes attach a glucuronide group to the SN-38 scaffold, converting it to the inactive metabolite SN-38-glucuronide (SN-38-G), which is non-toxic. However, upon its delivery to the GI tract, gut microbial GUS enzymes hydrolyse SN-38-G and reactivate it back into its toxic form SN-38, which causes dose-limiting diarrhoea 14,15. In a similar fashion, NSAIDs have also been shown to cause small intestinal ulcers and inflammation, presumably due to the action of GUS enzymes that convert NSAID glucuronides back into their parent forms following Phase II glucuronidation 16. In previous work, we have shown in mice that inhibitors selective for bacterial GUS alleviated SN-38 dose-limiting diarrhoea and reduced the number of NSAID-induced small intestinal ulcers, further suggesting that GUS enzymes give rise to undesired GI side effects by reversing Phase II glucuronidation 17-19. It is apparent that GUS enzymes are capable of hydrolysing a diverse array of glucuronides, but limited information is available on the specific types of GUS enzymes that are most efficient at processing drug glucuronides. In an attempt to gain insight into the structural and functional diversity of GUS enzymes, we recently reported an atlas of 279 unique GUS enzymes identified from the stool sample catalogue in the Human Microbiome Project (HMP) that clustered into six structural groups based on their active site loops, Loop 1 (L1), Mini Loop 1 (mL1), Loop 2 (L2), Mini Loop 2 (mL2), Mini Loop 1,2 (mL1,2), and No Loop (NL) 20 (Fig. 1a-c). We further showed that representative GUS enzymes possessing a Loop 1 were capable of processing the small standard glucuronide substrate p-nitrophenol-β-D-glucuronide (pNPG) faster than non-L1 GUS enzymes 20.
We also found that two selective microbial GUS inhibitors, Inhibitor 1 and UNC10201652, were potent against the L1 E. coli GUS (EcGUS) but did not inhibit the non-L1 GUS enzyme mL1 Bacteroides fragilis GUS (BfGUS) 17,21,22 . From these data, we hypothesized that GUS enzymes possessing a Loop 1 efficiently process drug glucuronide substrates, such as SN-38-G and NSAID glucuronides, and are susceptible to inhibition by our GUS selective inhibitors.
To date, five L1 GUS enzymes, EcGUS, Eubacterium eligens GUS (EeGUS), Streptococcus agalactiae (SaGUS), Clostridium perfringens GUS (CpGUS), and Faecalibacterium prausnitzii (FpGUS) have exhibited activity with pNPG, and all have been structurally characterized, with the exception of FpGUS 17,20,21 . All four L1 GUS structures exhibited similar tertiary, quaternary, and active site architectures 17,21,22 . To determine whether any structural and functional differences exist among L1 GUS enzymes and to test our hypothesis that GUS enzymes containing a Loop 1 are the most efficient within the GUS family at processing drug glucuronides, we cloned, expressed, and purified two additional L1 GUS enzymes, Lactobacillus rhamnosus GUS (LrGUS) and Ruminococcus gnavus GUS (RgGUS). Lactobacillus rhamnosus was found to be adherent to healthy colon tissue in a patient biopsy obtained at UNC Hospitals (T. Keku, personal communication); thus, we chose to study a GUS from this bacterial species. Ruminococcus gnavus GUS was previously identified and examined for general biochemical properties 23 .
Here we present the crystal structures of the L1 GUS enzymes FpGUS, LrGUS, and RgGUS, as well as the NL Bacteroides dorei GUS (BdGUS), and show that LrGUS and RgGUS exhibit unique active site features not previously seen in L1 GUS enzymes. We also determined the kinetic parameters k cat and K m for each L1 GUS enzyme and our panel of non-L1 GUS enzymes with the small standard substrate pNPG. Surprisingly, we found that FpGUS, LrGUS, and RgGUS exhibited catalytic efficiencies 10- to 100-fold lower than those of the L1 GUS enzymes previously characterized. We further demonstrate that while these three L1 GUS enzymes were not inhibited by our selective GUS inhibitors, NL BdGUS was weakly inhibited, despite its lack of a Loop 1. We show that our panel of GUS enzymes differentially processed the NSAID metabolite diclofenac glucuronide (DCF-G) and that the relative cleavage rates were analogous to those observed for pNPG. Finally, we demonstrate that treating mice with diclofenac (DCF) increases GUS activity in faecal samples. These findings advance our understanding of the structure, function and inhibition of GUS enzymes. Furthermore, they suggest that the specific amino acid composition of Loop 1, as well as additional GUS structural features, likely plays a role in the ability of GUS enzymes to cleave drug-glucuronides and in the efficacy of bacterial GUS inhibitors.

Results
Visualization of structure-function relationships across GUS proteins. Human gut microbial GUS enzymes have been previously shown to exhibit unique structures and functions 20. To gain greater insight into the specific sequence-structure-function relationships among GUS proteins, particularly those belonging to the L1 class, we generated a sequence similarity network (SSN), which groups protein sequences into clusters and facilitates the analysis of functional relationships within protein families 24. We utilized the 279 GUS protein sequences identified in the Human Microbiome Project (HMP) database to construct the SSN, as well as the sequences of the L1 GUS enzymes previously characterized (EcGUS, EeGUS, FpGUS, CpGUS, and SaGUS), and the new L1 GUS sequences RgGUS and LrGUS 20,21.
The GUS enzymes examined largely clustered based on their previously defined active site loop architectures: L1, mL1, L2, mL2, mL1,2, and NL (Fig. 1). Of the GUS enzymes that have already been examined both structurally and functionally, and are annotated on the SSN, we found that the non-L1 GUS enzymes, mL1 BfGUS, L2 Bacteroides uniformis (BuGUS), NL BdGUS, mL1,2 Bacteroides ovatus (BoGUS), and mL2 Parabacteroides merdae (PmGUS), clustered with GUS enzymes containing the same loop type (Fig. 1c). Three of the previously characterized L1 GUS enzymes, EcGUS, SaGUS, and CpGUS, were singletons that did not group with any other GUS enzyme, while EeGUS clustered with two other GUS enzymes from bacteria identified as Eubacterium sp. FpGUS clustered with the new L1 RgGUS and three other L1 GUS enzymes in a group we have termed the "L1 Cluster" (Fig. 1c). One of the five GUS enzymes that compose the L1 Cluster was identified as a GUS from Faecalibacterium prausnitzii that shares 79% sequence identity with the previously characterized FpGUS sequence 20. The remaining GUS sequences that associated within the L1 Cluster were determined to belong to bacteria that could not currently be identified. Finally, the new L1 LrGUS was a singleton and did not group with any other GUS enzyme.
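The clustering idea behind an SSN can be illustrated with a minimal sketch: sequences are nodes, an edge joins any pair whose similarity score meets a cut-off, and clusters are the connected components. The function name, the toy sequence labels, the scores, and the threshold below are all hypothetical and chosen only for illustration; they are not the tool, scores, or cut-off used to build the network in Fig. 1.

```python
from itertools import combinations


def cluster_by_similarity(names, score, threshold):
    """Return connected components of a network in which two sequences
    are linked whenever score(a, b) >= threshold (union-find)."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if score(a, b) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    # Largest clusters first; singletons end up at the end of the list.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical pairwise similarity scores (illustration only, not real data).
toy_scores = {
    frozenset(("seqA", "seqB")): 0.79,
    frozenset(("seqB", "seqC")): 0.65,
}


def toy_score(a, b):
    # Pairs without an entry are treated as unrelated.
    return toy_scores.get(frozenset((a, b)), 0.0)


clusters = cluster_by_similarity(["seqA", "seqB", "seqC", "seqD"], toy_score, 0.6)
```

In this toy network, `seqD` has no edge above the cut-off and is therefore a singleton, in the same sense that EcGUS, SaGUS, CpGUS, and LrGUS are singletons in the SSN described above.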
To test our previous hypothesis that, among GUS family members, L1 GUS enzymes most efficiently process small glucuronide substrates, we cloned, expressed and purified two previously uncharacterized L1 GUS enzymes, RgGUS and LrGUS, for subsequent structural and functional studies. These two novel GUS enzymes, along with the ten GUS enzymes previously characterized, compose the panel of GUS enzymes examined below.

Structural analysis. The structures of the L1 GUS enzymes EeGUS (PDB: 6BJW), SaGUS (PDB: 4JKL), CpGUS (PDB: 4JKM), and EcGUS (PDB: 3LPG) have been previously determined. To further examine the structural variability of L1 GUS enzymes, we determined the crystal structures of FpGUS, LrGUS, and RgGUS (Fig. 2 and Table S1). These three new enzymes have highly similar tertiary structures (Fig. 2a) relative to one another, as well as to the previously determined L1 GUS enzymes (Fig. S1) 20,21. FpGUS, LrGUS, and RgGUS share a similar core fold with EcGUS, with 1.7 Å root-mean-square deviation (rmsd), 1.7 Å rmsd, and 2.0 Å rmsd, respectively, over 576 equivalent Cα positions. The three new L1 GUS enzymes also retain the same tetramer organization as previously determined L1 structures, in which GUS monomers are in a "C-term-mediated" tetrameric state rather than a "square" tetrameric state, as exhibited by mL1 BfGUS (PDB: 3CMG) and discussed below (Figs 2b and S2) 17,21.
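For readers unfamiliar with the rmsd values quoted above, the following minimal sketch shows how a root-mean-square deviation is computed over matched Cα positions. It assumes the two coordinate sets have already been superposed and paired residue by residue; the superposition step itself, and whichever program was used to obtain the reported values, are not reproduced here. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. Å) between two
    equally long lists of (x, y, z) positions that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates (not taken from any deposited structure): every atom in the
# second set is displaced by 2.0 Å along x relative to the first set.
toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_b = [(2.0, 0.0, 0.0), (5.8, 0.0, 0.0), (9.6, 0.0, 0.0)]
toy_rmsd = rmsd(toy_a, toy_b)
```

Because the average is taken over squared distances, a few strongly deviating positions (for example, a divergent loop) raise the rmsd more than many small deviations do.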
While LrGUS, RgGUS, and FpGUS also exhibit similar tertiary and quaternary structures when compared to previously determined L1 GUS enzymes, LrGUS and RgGUS display unique features within their active sites. Specifically, LrGUS possesses a patch of negatively charged residues, E450, D451, and D452, which we term the "EDD" motif (Figs 2c and S3a). These positions are generally occupied by polar or hydrophobic residues in our previously characterized L1 GUS enzymes, such as M453, T454, and S455 at the equivalent positions in FpGUS, for example (Figs 2c and S3b). In the RgGUS active site, six amino acids from the Loop 1 region of this enzyme were observed to adopt an alpha helix conformation that directly folds over the catalytic gorge (Fig. 2c). This structure represents the first instance in which Loop 1 adopts a secondary structural motif. Together, these data extend our knowledge regarding the active site structural variability sampled by human gut microbial GUS enzymes.
To further advance our understanding of GUS structure and function, we also determined the crystal structure of NL BdGUS (Fig. 3 and Table S1). Like the L1 GUS structures, BdGUS shares a similar core fold with EcGUS, with 3.1 Å root-mean-square deviation (rmsd) over 528 Cα positions (Figs 3a and S1 25 . The presence of these C-terminal domains likely explains the unique quaternary arrangement of BdGUS compared to that of the L1 GUS enzymes. The tertiary and quaternary structure of BdGUS is highly similar to that exhibited by BuGUS, with a 2.3 Å root-mean-square deviation over 816 Cα positions (Fig. S4). Both contain one DUF and one CBM, and their dimers are arranged in identical configurations 20. While BdGUS does not possess a loop at the active site, as

In FpGUS and LrGUS, the loop is disordered. A magnesium ion was built into the electron density within the FpGUS active site. Given the crystallisation conditions (which contained magnesium formate), the relatively solvent-exposed active site observed in the structure, and the absence of unique coordinating residues, the ion is most likely an artefact of crystallization and is not expected to play a functional role in catalysis. Glycerol molecules are highlighted in yellow. Catalytic glutamates are boxed.

pNPG processing. L1 GUS enzymes have been previously shown to process the small standard substrate pNPG faster than non-L1 GUS enzymes 20. To gain greater insight into the small molecule glucuronide processing capabilities of L1 GUS enzymes, we assessed the ability of LrGUS and RgGUS, which have not been previously characterized, to cleave pNPG by determining their k cat and K m at their optimal pH values. We determined the optimal pH values of LrGUS and RgGUS to be 4.5 and 6.5, respectively (Fig. S5). While pNPG hydrolysis by the remaining ten GUS enzymes in our panel has been previously assessed, these data represent either apparent k cat values or k cat /K m values determined at non-optimal pHs 17,20,21.
Therefore, we used the Michaelis-Menten equation to determine the k cat and K m values at the optimal pH of the remaining GUS enzymes analysed to generate a complete set of kinetic parameters suitable for comparison (Table 1 and Figs S5, S6, and S7). RgGUS and BoGUS exhibited complex substrate inhibition kinetics, and we were not able to fit these to established substrate inhibition models 26,27 (Fig. S8). Consequently, apparent k cat and K m values were estimated at low pNPG concentrations at which inhibition was not observed (Figs S6 and S7). Future studies will explore the mechanistic and structural properties of this inhibition. The L1 GUS enzymes EcGUS, EeGUS, and SaGUS exhibited the highest turnover numbers (~120 s −1 ) and are the most efficient of the twelve GUS enzymes at processing pNPG, with catalytic efficiencies ranging from 540 to 920 s −1 mM −1 . The L1 GUS enzymes CpGUS and FpGUS demonstrated moderate pNPG processing rates, exhibiting k cat values of 57 s −1 and 58 s −1 , respectively. However, FpGUS exhibited the second highest K m (2.2 mM) of all the GUS enzymes tested, resulting in a lower catalytic efficiency compared to that of EcGUS, EeGUS, SaGUS, and CpGUS. In addition, while FpGUS was previously concluded to process pNPG faster than the non-L1 GUS enzymes BfGUS and BuGUS, determination of k cat and K m values reveal that FpGUS demonstrates a lower catalytic efficiency than these two GUS enzymes 20 . Interestingly, the L1 GUS enzymes LrGUS and RgGUS exhibited much lower k cat values (10.0 s −1 and 2.8 s −1 , respectively) than the five other L1 GUS enzymes as well as K m values ~10-fold higher than that of EcGUS. LrGUS and RgGUS also demonstrated lower catalytic efficiencies than those of BfGUS and BuGUS, which are mL1 and L2 GUS enzymes, respectively. 
These results for LrGUS and RgGUS likely reflect their unique active site features, which appear to occlude this small molecule glucuronide substrate electrostatically in LrGUS and sterically in RgGUS.
PmGUS and BoGUS, which are mL2 and mL1,2 GUS enzymes, respectively, are even poorer than LrGUS and RgGUS in utilizing this substrate, exhibiting the lowest pNPG processing efficiencies observed, with k cat /K m values 3-4 orders of magnitude lower than that of EcGUS. The mL2 PmGUS processed pNPG the slowest, with a k cat value of 0.088 s −1 , and exhibited the highest K m value (2.40 mM). Thus, gut microbial GUS enzymes offer a range of active site architectures and activities toward a model small glucuronide substrate, likely reflecting the range of native substrates they evolved to process within the GI tract.
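The kinetic comparison above rests on the Michaelis-Menten relationship, in which the turnover rate per enzyme is k cat·[S]/(K m + [S]) and the catalytic efficiency is k cat/K m. The short sketch below works through that arithmetic using only the k cat and K m values quoted in the text for FpGUS and PmGUS; the function names are illustrative, and the fitting procedure used to obtain the parameters in Table 1 is not reproduced.

```python
def mm_turnover(kcat, km, substrate):
    """Michaelis-Menten turnover rate per enzyme molecule (s^-1).
    km and substrate must share the same concentration unit (here mM)."""
    return kcat * substrate / (km + substrate)


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km (s^-1 mM^-1 when km is in mM)."""
    return kcat / km


# Values quoted in the text: FpGUS (kcat 58 s^-1, Km 2.2 mM) and
# PmGUS (kcat 0.088 s^-1, Km 2.40 mM).
fp_efficiency = catalytic_efficiency(58.0, 2.2)
pm_efficiency = catalytic_efficiency(0.088, 2.40)

# At [S] = Km the turnover rate is half of kcat by definition.
fp_rate_at_km = mm_turnover(58.0, 2.2, 2.2)

# Fold difference between the two enzymes' efficiencies.
fold_difference = fp_efficiency / pm_efficiency
```

With these inputs, FpGUS comes out at roughly 26 s −1 mM −1 , well below the 540 to 920 s −1 mM −1 range reported for EcGUS, EeGUS, and SaGUS, which illustrates how a high K m lowers efficiency even when k cat is moderate.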

GUS inhibition. Structural evidence suggests that the loop present in L1 GUS enzymes makes contacts with
Inhibitor 1 and UNC10201652, stabilizing it within the active site 17,21,22 . We sought to assess the inhibition propensities of these inhibitors against both the L1 GUS and non-L1 GUS enzymes within our panel. To date, our inhibitors have only been tested with pNPG at pH 7.4 17,21 . However, the gut pH increases from approximately 5 in the proximal duodenum to about 7.4 in the colon 28 . Thus, we determined IC 50 values for Inhibitor 1 and UNC10201652 against each GUS at pH 6.5 and 7.5 to sample two pH values in the GI tract (Figs S9 and S10).
We first tested whether we observed inhibition at 100 µM Inhibitor 1 or UNC10201652 at pH 6.5 and 7.5 for each GUS enzyme. However, due to their poor pNPG processing activity, exhibiting k cat values of less than   (Fig. S10a,b), while the remaining L1 GUS enzymes and the non-L1 GUS enzymes did not appear to be inhibited by 80% at 100 µM. Inhibitor 1 was most potent against EcGUS and exhibited an IC 50 value of ~2 µM at both pH 6.5 and 7.5 (Table 2 and Fig. S10a,b). For SaGUS, Inhibitor 1 demonstrated an IC 50 value of >25 µM (Fig. S10a,b). The potency of Inhibitor 1 did not appear to be affected by pH for either EcGUS or SaGUS (Table 2 and Fig. S10a,b).

At 100 µM UNC10201652, the L1 GUS enzymes EcGUS, EeGUS, SaGUS, and CpGUS were inhibited at both pH 6.5 and 7.5 (Fig. S10c,d). In contrast, FpGUS and BdGUS were inhibited at pH 7.5 but did not show an inhibition of 80% or greater at pH 6.5 (Fig. S10c,d). UNC10201652 was most potent against CpGUS, with an IC 50 value of 26 nM at pH 7.5, and demonstrated nanomolar IC 50 values for EcGUS and SaGUS at pH 7.5 (100 nM and 133 nM, respectively) (Table 2). Of all the L1 GUS enzymes tested, UNC10201652 was least potent towards FpGUS, with a predicted IC 50 value of >3 µM at pH 7.5 (Fig. S10c). Unlike Inhibitor 1, the potency of UNC10201652 decreased with decreasing pH for all enzymes tested (Table 2). The decrease in UNC10201652 potency appeared to vary with each enzyme tested. For EcGUS, the potency decreased by 3-fold from pH 7.5 to pH 6.5, whereas the potency decreased by 6-fold for CpGUS as the pH dropped from 7.5 to 6.5. Finally, despite its lack of an active site loop, the NL BdGUS was found to be weakly inhibited at pH 7.5 by UNC10201652, with a predicted IC 50 value of >25 µM (Fig. S10c). Thus, this compound appears to be capable of associating with gut microbial GUS enzymes via contacts not exclusively limited to Loop 1-mediated interactions.
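To relate a single-concentration screen to an IC 50 value, the sketch below uses a simple Hill-type dose-response model with a slope of 1. This model is an assumption made here for illustration: the text does not state which equation was fitted, and UNC10201652 is described elsewhere in the paper as a slow-binding, substrate-dependent inhibitor, for which a simple equilibrium model is only a rough approximation. The function names are hypothetical.

```python
def fractional_activity(inhibitor_conc, ic50, hill=1.0):
    """Remaining enzyme activity (0 to 1) at a given inhibitor concentration
    under a Hill-type dose-response model (assumed, not taken from the paper)."""
    return 1.0 / (1.0 + (inhibitor_conc / ic50) ** hill)


def ic50_from_single_point(inhibitor_conc, fraction_inhibited):
    """IC50 implied by one inhibition measurement, assuming a Hill slope of 1.
    The result has the same concentration unit as inhibitor_conc."""
    if not 0.0 < fraction_inhibited < 1.0:
        raise ValueError("fraction_inhibited must lie strictly between 0 and 1")
    return inhibitor_conc * (1.0 - fraction_inhibited) / fraction_inhibited


# Under this model, exactly 80% inhibition at 100 µM corresponds to this IC50.
threshold_ic50_uM = ic50_from_single_point(100.0, 0.80)

# Sanity check of the model: activity is halved when [I] equals the IC50.
half_activity = fractional_activity(threshold_ic50_uM, threshold_ic50_uM)
```

Under these assumptions, an enzyme inhibited by less than 80% at 100 µM would have an IC 50 above about 25 µM, which is at least consistent with the ">25 µM" values quoted above, although the text does not say that they were derived this way.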
DCF-G processing. While pNPG serves as a useful tool to assess the glucuronide processing capabilities of GUS enzymes, it is not a substrate present in the GI tract. Given our hypothesis that L1 GUS enzymes efficiently process drug glucuronides, we determined using ultra performance liquid chromatography (UPLC) the cleavage rates of the novel L1 GUS enzymes FpGUS, LrGUS, RgGUS, as well as the ten additional enzymes in our GUS panel, with the physiologically relevant glucuronide metabolite of the NSAID diclofenac, DCF-G (Table 3 and Fig. S11).
Analogous to their pNPG processing rates, LrGUS and RgGUS were the slowest L1 GUS enzymes to hydrolyse DCF-G and exhibited apparent k cat values of 10 s −1 and 24.0 s −1 , respectively, at a substrate concentration of 400 µM. In comparison, EeGUS and SaGUS hydrolysed DCF-G the fastest with an apparent k cat of 138 s −1 and 97 s −1 , respectively. EcGUS, one of the three fastest GUS enzymes to process pNPG, cleaved DCF-G faster than LrGUS and RgGUS but slower than the other L1 GUS enzymes, exhibiting a k cat of 30.7 s −1 .
The non-L1 GUS enzymes processed DCF-G slower than all of the L1 GUS enzymes tested, with the exception of BfGUS, which cleaved DCF-G at a rate of 15.1 s −1 , and BuGUS, which cleaved DCF-G at a rate equal to that of LrGUS. PmGUS processed DCF-G the slowest of all of the GUS enzymes tested, exhibiting the apparent k cat value of 0.034 s −1 . Taken together, these data demonstrate that overall trends with small molecule glucuronide substrates are maintained when comparing pNPG and DCF-G and suggest that L1 GUS enzymes catalyse the hydrolysis of small molecule glucuronides more efficiently than other classes of GUS enzymes. Specifically, these data highlight the fact that EeGUS, CpGUS and SaGUS efficiently reactivate the inactive NSAID metabolite diclofenac-glucuronide back to the active and GI-toxic NSAID diclofenac, and are effectively inhibited by UNC10201652. Thus, not all L1 GUS enzymes process this small molecule drug glucuronide well, but those that do are inhibited in vitro by UNC10201652.
Glucuronide processing by faecal extracts. The data presented above show that pNPG and DCF-G are processed by gut microbial GUS enzymes in vitro. To examine whether the GUS enzymes are also able to cleave a small glucuronide substrate in vivo, we incubated pNPG with the faecal extracts collected from four adult mice. pNPG processing was observed in all faecal samples, suggesting that the mouse intestine harbours GUS enzymes  capable of processing drug glucuronide substrates (Fig. S12). To determine how DCF-G treatment alters the glucuronide processing activity of these GUS enzymes, we treated the mice with DCF-G and collected their faecal samples after 12 hours. Interestingly, DCF-G treatment increased pNPG processing activity in faecal extracts, which may be indicative of either an expansion of bacteria encoding DCF-G reactivating GUS enzymes, or an upregulated expression of these enzymes in specific gut microbial species following DCF-G treatment (Fig. S12).
Taken together, these results support the conclusion that faecal samples contain GUS activity, and that treatment with a drug known to be glucuronidated, DCF in this case, increases GUS activity compared to pre-treatment levels.

L1 GUS distribution across the SSN points to distinct substrate specificities.
We previously demonstrated that a small set of L1 GUS enzymes are capable of processing small glucuronide substrates and are susceptible to selective GUS inhibition, while non-L1 GUS enzymes process larger glucuronide-containing substrates and are not effectively inhibited 20,21. Here, we extend these investigations by examining the structure and function of two novel L1 GUS enzymes, as well as a panel of ten additional L1 and non-L1 GUS enzymes. By creating an SSN using the 279 GUS enzymes identified from the HMP, we found that while L1 GUS enzymes characterized in this paper retained highly similar tertiary and quaternary structures to previously characterized GUS enzymes, they do not cluster together (Fig. 1c). Further, EcGUS, SaGUS, CpGUS, and LrGUS did not cluster with any other L1 or non-L1 GUS enzyme. Given the fact that these L1 GUS enzymes displayed a range of DCF-G and pNPG cleavage rates, the separation of L1 GUS enzymes across the SSN may indicate that they possess distinct substrate specificities. These observations demonstrate the utility of SSNs to predict functional characteristics within large enzyme families encoded by the gut microbiome.

Table 2). The length (>15 amino acids) and position of Loop 1 in these enzymes appear to be suitable for stabilizing pNPG in their active sites and provide an explanation as to why these enzymes exhibited low K m values (Table 1 and Fig. 4). By contrast, the non-L1 GUS enzymes possess active site loops distinct in position or length compared to Loop 1. For example, mL1 BfGUS possesses a Mini Loop 1 (Fig. 5a). pNPG docking reveals that the Mini Loop 1 of BfGUS is too short to make productive contacts with pNPG (Fig. 5a). This observation suggests an explanation for why GUS enzymes possessing shorter loops between 10 and 15 residues, such as BfGUS, BoGUS, and PmGUS, poorly process pNPG.
As observed in the crystal structure of mL2 PmGUS, a loop insert 41 residues in length occupies the active site and appears to clash with pNPG (Fig. 5b). As such, this loop insert may also explain why PmGUS displays a high K m value (2.40 ± 0.07 mM) for pNPG (Table 1).
While BuGUS does not possess a shorter active site loop, the structure of this active site loop appears to explain why it does not process pNPG as efficiently as Loop 1 GUS enzymes. As reported in Pollet et al., the Loop 2 in BuGUS adopts an extended alpha helix structure (Fig. 5c) 20 . Therefore, in this state it is not capable of folding over the active site to make contacts with pNPG. In addition, NL BdGUS also lacks an active site loop; in contrast to BuGUS, however, it does possess a loop in its CBM domain that enters the BdGUS active site (Figs 3b and 5d). Despite this extra CBM loop, the absence of a loop ~15 residues in length that resides at the Loop 1 position may explain why BuGUS, BdGUS, and other non-L1 GUS enzymes exhibited higher K m values for pNPG compared to EcGUS, EeGUS, SaGUS, and CpGUS. Importantly, however, not all L1 GUS enzymes exhibited efficient k cat /K m values. The three new L1 GUS enzymes were inefficient at processing pNPG, in part due to their high K m values. For RgGUS, this may be explained by the alpha helix motif adopted by Loop 1 (Fig. 2c). Docking pNPG into the active site reveals that the alpha helix motif would sterically clash with pNPG (Fig. 4c). Therefore, the propensity for the loop in RgGUS to form an alpha helix may explain its low catalytic efficiency. To determine whether other L1 GUS enzymes are capable of forming an alpha helix in the active site, we aligned the Loop 1 sequences (Fig. 4d). Sequence alignment revealed that FpGUS retains five of the six amino acids, termed the "NFXAA motif " where "X" is a hydrophobic residue, that make up the alpha helix conformation in the RgGUS loop, indicating that FpGUS may also be capable of forming this secondary structure motif even though this loop is disordered in the structure of this enzyme presented here. This structural motif may aid in rationalizing the inability of these enzymes to process small glucuronide substrates. 
To determine whether crystal packing causes distinct L1 conformations, we analysed the loop-based symmetry contacts of each L1 shown in Fig. 4. In EcGUS, SaGUS, and RgGUS, crystal contacts were observed, although fewer contacts were observed for RgGUS compared to EcGUS and SaGUS. In contrast, no crystal contacts were observed within 25 Å of the EeGUS L1 structure. These findings suggest that while crystal packing may dictate the L1 conformation in some GUS enzymes, it likely plays a minor role in the formation of the L1 alpha helix motif in RgGUS.
Although LrGUS does not possess the NFXAA motif, we hypothesized that the EDD motif, which places three anionic side chains into the active site, precludes the entry of neutral substrates and may explain the poor pNPG processing activity of LrGUS. The EDD motif is unique to LrGUS and was not present in the six other L1 GUS enzymes characterized in the study (Figs 2c and S3b). Site-directed mutagenesis of the EDD motif to the corresponding neutral polar residues QNN led to a complete loss of GUS activity (Fig. S13b), suggesting that this negatively charged patch is critical for small glucuronide processing by LrGUS.
In general, the overall trends with small molecule glucuronide substrates were maintained when comparing pNPG and DCF-G processing activity. However, EcGUS, the fastest and most efficient GUS at cleaving pNPG, was the slowest at processing DCF-G. Given that the aglycone is expected to primarily interact with the L1 in EcGUS, we expect that the amino acid composition of L1 is responsible for this differential processing. Future studies will be carried out to evaluate how specific L1 residues affect pNPG and DCF-G processing.

Table 2). For example, UNC10201652 was 19 times more potent against EcGUS at pH 7.5 than Inhibitor 1. The mechanism of action demonstrated by UNC10201652 may provide insight into its enhanced inhibitory activity. UNC10201652 displays unique substrate-dependent and slow-binding inhibition, in which it binds to the glucuronic acid (GlcA)-enzyme catalytic intermediate as a piperazine-linked glucuronide 22. The resulting piperazine-linked glucuronide contains a positively charged piperazine amine and is predicted to form an electrostatic interaction with the catalytic acid/base glutamate in the active site (Fig. 6b) 22. The presence of this ionic interaction likely aids in stabilizing the bound UNC10201652-GlcA conjugate in the active site. In contrast, only hydrogen bonds and hydrophobic interactions are present in the Inhibitor 1-bound complex (Fig. 6a). In addition, a conserved tyrosine in GUS enzymes (Y472 in EcGUS and CpGUS) forms pi-pi stacking interactions with the UNC10201652 aromatic scaffold, which may also aid in stabilizing the inhibitor-bound complex (Fig. 6b). This tyrosine is not expected to provide pi-pi stacking interactions with Inhibitor 1 (Fig. 6a). The formation of the UNC10201652-GlcA conjugate within the GUS active site may also explain the pH dependency of UNC10201652, in which the inhibitory activity decreases with decreasing pH (Table 2).
As the pH decreases, the catalytic glutamates are expected to be protonated, which would disrupt the ionic bond between a negatively charged glutamate and the positively charged piperazine amine.
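The protonation argument above can be made semi-quantitative with the Henderson-Hasselbalch relationship, which gives the fraction of an acidic side chain that is deprotonated (negatively charged) at a given pH. The sketch below is illustrative only: the pKa of the catalytic glutamate in any of these enzymes is not given in the text, so the value used is a hypothetical placeholder, and the function name is an assumption.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of an acidic group carrying a negative charge at a given pH
    (Henderson-Hasselbalch relationship)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical pKa chosen purely for illustration; active-site glutamates can
# have pKa values shifted well away from that of free glutamate.
HYPOTHETICAL_PKA = 7.0

charged_at_ph_75 = fraction_deprotonated(7.5, HYPOTHETICAL_PKA)
charged_at_ph_65 = fraction_deprotonated(6.5, HYPOTHETICAL_PKA)
```

Whatever the true pKa, the charged fraction can only fall as the pH drops from 7.5 to 6.5, which is the direction needed to weaken an ionic contact with the positively charged piperazine amine; the size of the effect depends on a pKa that is not reported here.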

Active site loop composition impacts L1 GUS inhibition. By comparing the IC 50 values determined
in this study, we found that the L1 Cluster GUS enzymes were not inhibited, or were only poorly inhibited, by the bacterial GUS inhibitors (Table 2). As stated above, the alpha helix formed in the Loop 1 of RgGUS, and predicted to form in the Loop 1 of FpGUS, may block the active sites of these enzymes and prevent productive substrate binding. This structural feature may also prevent Inhibitor 1 and UNC10201652 from accessing the active site pocket, resulting in the lack of inhibition observed for these L1 Cluster GUS enzymes.
Further, Inhibitor 1 and UNC10201652 displayed varying degrees of potency against the L1 GUS enzymes that were susceptible to inhibition (Table 2). The specific Loop 1 amino acid sequence likely explains the differences in inhibition observed for the L1 GUS enzymes (Fig. 4d). For example, Inhibitor 1 was found to be more potent against EcGUS than CpGUS (Table 2). Upon docking Inhibitor 1 into the active sites of these enzymes based on the previously determined structures of the Inhibitor 2- and Inhibitor 3-bound EcGUS complexes (PDB: 3LPF and 3LPG, respectively) and the UNC10201652-bound CpGUS complex, we observed that F365 present on the Loop 1 of the adjacent EcGUS monomer likely serves as a primary contact with Inhibitor 1, as previously described (Fig. 6c) 20. In CpGUS, a glycine is present at this position and may explain why Inhibitor 1 is  (Figs 4d and 6e). In contrast, UNC10201652 was determined to be more potent towards CpGUS than EcGUS. Again, the specific Loop 1 composition of both enzymes may provide an explanation for this observation. Upon docking UNC10201652 into the EcGUS active site, we found that F365 in the adjacent monomer may clash with the inhibitor scaffold, although the exact positioning of the loop and inhibitor is difficult to predict in the absence of a co-crystal structure (Fig. 6d). To test the role F365 plays in Inhibitor 1 and UNC10201652 potency, we determined the IC 50 values for each inhibitor using an EcGUS F365 variant protein. Inhibitor 1 and UNC10201652 were less potent toward the EcGUS F365 mutant than the WT (Fig. S14), suggesting that this residue aids in stabilizing both inhibitors and that the L1 in EcGUS adopts a conformation that accommodates UNC10201652. Given these results, the L1 in CpGUS may also adopt a conformation distinct from that displayed in Fig. 6e,f. Thus, the role specific L1 amino acids play in Inhibitor 1 and UNC10201652 binding should be evaluated in future, more expansive mutagenesis studies.
In addition, the L1 EeGUS was also only moderately inhibited by UNC10201652 and not inhibited by Inhibitor 1 ( Table 2). While this may be explained in part by the specific Loop 1 composition of EeGUS, the presence of an additional Loop, named the "Ee Extra Loop", distinct from the Loop 1 or Loop 2 position may also impede inhibitor binding in the active site (Figs 4b and S3a). To determine whether the Ee Extra Loop affects inhibitor potency, we deleted residues M149 to A159, which comprise the loop. Deletion of this region led to a significant loss of pNPG activity, with the mutant exhibiting 9% of the cleavage rate demonstrated by the WT (Fig. S13c). Circular dichroism (CD) analysis revealed that deletion of the Ee Extra Loop did not alter the overall fold of the enzyme (Fig. S15a). Thus, this additional loop appears to be important in small glucuronide processing.

C-terminal domains affect substrate processing and GUS inhibition. The crystal structure of NL BdGUS presented here, as well as the previously determined structures of non-L1 GUS enzymes, revealed the presence of additional C-terminal domains, including carbohydrate binding modules (CBMs) and domains of unknown function (DUFs). These domains may provide further rationale as to why non-L1 GUS enzymes, in general, process pNPG less efficiently than L1 GUS enzymes, which do not possess CBMs or DUFs (Figs 3a, S1, S16 and Table 1). The C-terminal domains cause non-L1 GUS enzymes to adopt more open active sites, as opposed to the "C-term-mediated" tetrameric state exhibited by L1 GUS enzymes, which creates smaller, more intimate active sites (Figs 2b and 3b). Such an open configuration favours larger glucuronide-containing polysaccharide substrates over smaller drug glucuronides. In addition, CBMs, like that found in BdGUS, have been reported to mediate the binding and positioning of large carbohydrate substrates into the active site 29 , and their presence may indicate that larger polysaccharides are the native substrates of these GUS enzymes. As noted above, BdGUS and BuGUS share highly similar tertiary and quaternary structures, but the catalytic efficiency of BdGUS was nearly 10-fold lower than that of BuGUS (Fig. S4 and Table 1). Given that the main structural difference between these two enzymes is the presence of a CBM loop insert in BdGUS, the CBM loop may explain this difference in activity, as it may prevent a substrate like pNPG from being properly positioned in the active site. Such considerations address the binding of substrate to these enzymes (Km). It is also noteworthy that the kcat values of the two enzymes differ by 3-fold, something that we cannot explain using static structures and that may involve distinctions in active site motion, as outlined elegantly for other enzyme systems 30 .
In addition, the presence or absence of C-terminal domains likely affects the binding of our GUS inhibitors. Here, we show that, with the exception of BdGUS, only GUS enzymes that do not contain a C-terminal domain are inhibited. The previously reported inhibitor-bound structures of EcGUS reveal that the selective GUS inhibitors make contacts with both the Loop 1 of the primary monomer and the Loop 1 of the adjacent monomer in its functional "C-term-mediated" tetrameric state (Fig. 7a) 17,21 . However, the presence of extra domains at the C-terminus of the GUS protein causes the enzyme to adopt either a "square" tetrameric state (e.g. BfGUS) or dimeric states (e.g. BuGUS and BdGUS) (Fig. 7b-d). These quaternary assemblies preclude the positioning of an additional loop near the active site of the primary monomer, resulting in fewer putative inhibitor contacts. Therefore, the lack of a "C-term-mediated" tetrameric state due to the presence of one or more C-terminal domains presents a plausible explanation as to why no inhibition was observed in the majority of non-L1 GUS enzymes. BdGUS, the exception to this observation, was slightly inhibited by UNC10201652. In this case, the CBM loop in BdGUS may enter into the BdGUS active site and provide stabilizing contacts, enabling slight inhibition (Fig. 7d). To evaluate the role the CBM loop plays in UNC10201652 inhibition, we deleted residues N737 to Y499 that form the CBM loop in BdGUS. While deletion of this region did not alter the stability of the BdGUS mutant, as assessed by CD-monitored thermal denaturation, it rendered the enzyme inactive with pNPG, preventing further inhibition studies (Figs S13a and S15d).
Concluding remarks. In summary, this work provides insight into the structural and functional diversity among GUS enzymes, specifically focusing on those of the L1 class, and advances our knowledge of the specific GUS enzymes that are capable of reactivating drug glucuronides in the gut. 
We have shown that not all L1 GUS enzymes efficiently process small glucuronide substrates and have provided several structural explanations to support these observations. However, those enzymes that process small substrate glucuronides well are inhibited by UNC10201652.

Methods
SSN construction. The sequence similarity network (SSN) diagram of GUS enzyme sequences was generated using the online Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) 31 . The sequences obtained from the GUS rubric were used in combination with the EFI-EST "fasta" tool to create a network with 282 nodes. Each node represents sequences bearing ≥90% sequence identity to each other. A BLAST E-value of 1 × 10⁻²²⁰ was employed.
Enzyme cloning. The GUS genes from Lactobacillus rhamnosus and Ruminococcus gnavus were purchased from Bio Basic in the pUC57 vector. The genes were amplified by PCR and inserted into the ligation independent cloning (LIC) vector pLIC-His using the primers shown in Table S2. The pLIC-His vector contains an N-terminal 6×-histidine tag. For LrGUS and FpGUS crystallisation, an additional 36 amino acids and 33 amino acids were added to the C-termini of the LrGUS and FpGUS proteins, respectively, using the C-term primers (Table S2).  Table S2. The mutant plasmids were sequenced to confirm the mutations. The mutants were produced and purified using E. coli BL21 (DE3) Gold as described below.
CD analysis of GUS mutants. The protein stabilities of the WT and mutant GUS enzymes described above were determined using circular dichroism (CD) 31 . Enzyme (2.5 µM) in CD buffer containing 10 mM potassium phosphate (pH 7.4) and 100 mM potassium fluoride was loaded into a 1-mm pathlength cuvette. Using a Chirascan-plus instrument (Applied Photophysics Limited), spectra from 185 to 260 nm were recorded at 20 ± 1.0 °C. Measurements were corrected for background signal using a CD buffer sample. The melting profile of the sample (2.5 µM) was monitored at 228, 220, 225, and 218 nm for EcGUS, EeGUS, LrGUS, and BdGUS, respectively, from 20 °C to 94 °C.
Enzyme expression and purification. All GUS enzymes were expressed and purified as previously described 20 . Expression and purification of LrGUS and RgGUS were identical to those previously described for EeGUS 20 . Briefly, LrGUS and RgGUS expression plasmids were transformed into BL21 (DE3) Gold cells. Cells were grown in the presence of ampicillin in LB at 37 °C until the OD600 reached 0.5, at which point the temperature was reduced to 18 °C. At OD600 ~0.8, protein expression was induced by the addition of 0.1 mM isopropyl-1-thio-D-galactopyranoside (IPTG), and cells were incubated overnight. Cells were pelleted by centrifugation at 4500 × g for 20 min at 4 °C. Cell pellets were resuspended in Buffer A (20 mM potassium phosphate pH 7.4, 50 mM imidazole, 500 mM NaCl) with DNase, lysozyme, and a Roche complete EDTA-free protease inhibitor tablet. Proteins contained 6× histidine tags and were purified by Ni-column affinity chromatography and eluted with Buffer B (20 mM potassium phosphate pH 7.4, 500 mM imidazole, 500 mM NaCl). After Ni-column chromatography, the proteins were subjected to size-exclusion chromatography using Buffer C (20 mM HEPES pH 7.4, 50 mM NaCl). Protein eluents were then concentrated, flash frozen in liquid nitrogen, and stored at −80 °C.
In vitro pNPG processing assay. The standard substrate p-nitrophenyl-β-D-glucuronide (pNPG) was purchased as a solid (Sigma Aldrich) and resuspended in water to a concentration of 100 mM. In vitro assays were conducted in 96-well, clear bottom assay plates (Costar, Tewksbury MA) at 37 °C in a 50 µL total volume. Reactions consisted of 10 µL assay buffer (50 mM HEPES, 50 mM NaCl, various pH), 10 µL enzyme (various concentrations), and 30 µL pNPG (various concentrations) diluted in assay buffer. Product formation was measured at 410 nm using a PHERAstar Plus Microplate reader (BMG Labtech). To determine the optimal pH for SaGUS, CpGUS, LrGUS, and RgGUS, the above assay was conducted at various enzyme concentrations and 800 µM pNPG in the appropriate assay buffer where the pH ranged from 4.0 to 7.4. For assays at pH 6.0 or lower, reactions were quenched with 100 µL of 0.2 M sodium carbonate, and product formation was measured over time via absorbance at 410 nm. For reactions at pH 6.5 or above, reactions were not quenched. For CpGUS and RgGUS, pH 6.5 was selected as the optimal pH due to the ability to monitor product formation continuously at this pH. Upon determining the optimal pH for each enzyme, velocities were determined for multiple substrate and enzyme concentrations at each enzyme's optimal pH, and the Michaelis-Menten kinetics module in SigmaPlot was used to calculate Km, kcat, and catalytic efficiency. RgGUS and BoGUS displayed complex substrate inhibition kinetics, and we were not able to fit these to established substrate inhibition models, including the uncompetitive substrate inhibition model, as determined in Equation
In vitro DCF-G processing assay. Diclofenac acyl β-D-glucuronide (DCF-G) was purchased as a solid (LC Scientific Inc., Concord ON) and resuspended in DMSO to a concentration of 25 mM. In vitro assays were conducted at 37 °C in a 50 µL total volume. 
Reactions consisted of 10 µL assay buffer (50 mM HEPES, 50 mM NaCl, various pH), 10 µL enzyme (various concentrations), and 30 µL DCF-G (400 µM final) diluted in assay buffer. The pH of each reaction was chosen based on the optimal pH determined for each GUS with pNPG. Reactions were quenched at 0, 1, 2, 3, 4, and 5 min with 50 µL of 25% trichloroacetic acid (TCA). For PmGUS, reactions were quenched at 0, 10, 20, 30, 40, and 50 min. After centrifugation at 13,000 × g for 10 min, the resultant supernatant was subjected to UPLC analysis. The concentration of DCF-G remaining at each time point was quantified on a Waters Acquity H-class liquid chromatograph system. Samples were separated on a Waters Acquity UPLC BEH C18 column (2.1 × 50 mm, 1.7 µm particle size) at 40 °C. The flow rate was 0.6 mL/min, and the injection volume was 3 µL. LC conditions were set at 100% water with 0.1% formic acid (A), ramped linearly over 9.8 min to 95% acetonitrile with 0.1% formic acid (B), and held until 10.2 min. At 10.21 min the gradient was switched back to 100% A and allowed to re-equilibrate until 11.25 min. DCF-G was monitored at 280 nm. The concentration of DCF-G was determined from a standard curve (0-500 µM DCF-G in assay buffer). Control reactions were performed in which enzyme was substituted with buffer. Background hydrolysis was not observed at any pH tested. Reactions were performed in triplicate for each enzyme.
pNPG processing assay in faecal extracts. All animal studies were approved by the University of North Carolina Institutional Animal Care and Use Committee (IACUC), in accordance with the Care and Use of Laboratory Animals guidelines set by the National Institutes of Health. Twelve-week-old female C57BL/6J mice were individually housed in specific pathogen-free conditions with sterile ventilated cages containing corn bedding, with ad libitum access to chow and water. 
Prior to treatment, faecal pellets were collected from each mouse by gentle abdominal palpation and snap frozen in sterile microfuge tubes. Animals received a single ulcerogenic dose of DCF (60 mg/kg) by intraperitoneal injection, as previously described 18 . Twelve hours following DCF exposure, another set of faecal pellets was collected and stored as described above. To perform the assay, frozen faecal samples were rehydrated in 15× assay buffer (weight/volume; 20 mM HEPES, 50 mM NaCl, pH 7.4, 1× Complete® Protease inhibitor cocktail (Roche)). Bacterial cells were lysed using a TissueLyser II (Qiagen) for 2 min at 30 Hz. Homogenate was sonicated for 4 min, and then clarified by centrifugation for 5 min at 13,000 × g. All experimental manipulation until this point occurred at 4 °C. A 5 μL aliquot of faecal slurry supernatant was used to initiate the hydrolysis reaction of 1 mM pNPG resuspended in the same buffer. Parallel reactions containing only pNPG or only buffer/faecal slurry were used as negative controls; an aliquot of each sample was heat inactivated at 95 °C and used in the assay for further background establishment. Each sample was assayed using three technical replicates. The initial velocities of the resultant progress curves were calculated in MATLAB by linear regression, and then normalized to the total faecal protein content determined using a standard Bradford assay.
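The velocity calculation described above can be illustrated with a short sketch. The original analysis was performed in MATLAB; the Python version below is only an illustration under stated assumptions: the function names, the number of early points treated as the linear phase (`n_points`), the optional background slope, and the units (absorbance per second per mg protein) are all hypothetical choices, not details taken from the study.

```python
def initial_velocity(time_s, absorbance, n_points=6):
    """Least-squares slope over the first n_points of a progress curve.

    n_points is an assumed cutoff for the linear phase; in practice it
    would be chosen by inspecting each curve.
    """
    t = list(time_s)[:n_points]
    a = list(absorbance)[:n_points]
    mean_t = sum(t) / len(t)
    mean_a = sum(a) / len(a)
    # Ordinary least-squares slope: cov(t, a) / var(t)
    sxy = sum((ti - mean_t) * (ai - mean_a) for ti, ai in zip(t, a))
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    return sxy / sxx


def normalized_velocity(replicate_curves, time_s, protein_mg,
                        background_slope=0.0, n_points=6):
    """Mean initial velocity of technical replicates per mg of protein.

    background_slope is a hypothetical correction term, e.g. the slope
    measured for a heat-inactivated control.
    """
    slopes = [initial_velocity(time_s, curve, n_points)
              for curve in replicate_curves]
    mean_slope = sum(slopes) / len(slopes)
    return (mean_slope - background_slope) / protein_mg
```

For three technical replicates of one sample, `normalized_velocity([rep1, rep2, rep3], time_s, protein_mg)` returns a single protein-normalized rate, where `protein_mg` would come from the Bradford measurement for that sample.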
In vitro inhibition assay. In vitro inhibition of bacterial GUS enzymes by UNC10201652 was assessed as previously described 22 . Reactions consisted of 5 µL of GUS (15 nM final for EcGUS, EeGUS, SaGUS, and CpGUS; 150 nM final for RgGUS, LrGUS, and BdGUS), 5 µL of inhibitor (various concentrations), 30 µL of pNPG (900 µM final), and 10 µL of assay buffer (25 mM NaCl, 25 mM HEPES, pH 6.5 or pH 7.5 final). Reactions were initiated by addition of pNPG and then incubated for 1 hour, after which the end point absorbance was determined. Due to the slow-binding nature of UNC10201652, the IC50 was determined as the inhibitor concentration that yielded a 50% reduction in the maximum absorbance of the uninhibited reaction, where percent inhibition was calculated as:
$$\%\ \text{inhibition} = \left[1 - \frac{A_{\text{exp}} - A_{\text{bg}}}{A_{\text{max}} - A_{\text{bg}}}\right] \times 100$$
where $A_{\text{exp}}$ is the end point absorbance at a particular inhibitor concentration, $A_{\text{max}}$ is the absorbance of the uninhibited reaction, and $A_{\text{bg}}$ is the background absorbance of the assay. Percent inhibition values were subsequently plotted against the log of inhibitor concentration and fit with a four-parameter logistic function in SigmaPlot 13.0 to determine the IC50.
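As an illustration of this end-point analysis, the sketch below computes percent inhibition and fits a four-parameter logistic curve against log inhibitor concentration. It is not the SigmaPlot procedure used in the study: it assumes percent inhibition is the background-corrected fractional loss of end-point absorbance, uses `scipy.optimize.curve_fit` as a stand-in fitting routine, and the function names and starting guesses are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def percent_inhibition(a_exp, a_max, a_bg):
    """Assumed definition: background-corrected loss of end-point signal (%)."""
    return (1.0 - (a_exp - a_bg) / (a_max - a_bg)) * 100.0


def four_pl(log_conc, bottom, top, log_ic50, hill):
    """Four-parameter logistic on a log10 concentration axis."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ic50 - log_conc) * hill))


def fit_ic50(conc, inhibition):
    """Fit the 4PL curve and return (IC50, fitted parameters).

    conc and the returned IC50 share the same units.
    """
    log_c = np.log10(np.asarray(conc, dtype=float))
    y = np.asarray(inhibition, dtype=float)
    # Starting guesses: observed response range, mid-range concentration, unit slope
    p0 = [y.min(), y.max(), float(np.median(log_c)), 1.0]
    params, _cov = curve_fit(four_pl, log_c, y, p0=p0, maxfev=10000)
    return 10.0 ** params[2], params
```

The fitted midpoint corresponds to 50% inhibition only when the curve spans roughly 0 to 100%; if the fitted `bottom` and `top` deviate from those limits, the concentration giving 50% inhibition would need to be solved from the fitted curve instead.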
In vitro inhibition of bacterial GUS enzymes by Inhibitor 1 was determined using the reaction conditions described above, but the IC50 was determined as the inhibitor concentration that yielded a 50% reduction in the maximum initial velocity of the uninhibited reaction.