The abundance of recorded protein sequence data stands in contrast to the small number of experimentally verified functional annotations. Here we screened a million-membered metagenomic library at ultrahigh throughput in microfluidic droplets for β-glucuronidase activity. We identified SN243, a genuine β-glucuronidase with little homology to previously studied enzymes of this type, as a glycoside hydrolase 3 family member. This glycoside hydrolase family contains only one recently added β-glucuronidase, showing that a functional metagenomic approach can shed light on assignments that are currently ‘unpredictable’ by bioinformatics. Kinetic analyses of SN243 characterized it as a promiscuous catalyst, and structural analysis suggests regions of divergence from homologous glycoside hydrolase 3 members that create a wide-open active site. With a screening throughput of >10⁷ library members per day, picolitre-volume microfluidic droplets enable functional assignments that complement current enzyme database dictionaries and provide bridgeheads for the annotation of unexplored sequence space.
Gene sequences of SN243 and SN268 have been uploaded to GenBank (entries ON385387 and ON385388, respectively) and are provided in the Supporting Information. The apo-crystal structures of SN243 wild-type (7QE1) and mutant D415N (7QG4), as well as structures of SN243 wild-type co-crystallized with GlcA (7QE2), mutant D415A co-crystallized with FD-β-GlcA (7QEA) and pNP-β-GlcA (7QEF), and mutant D415N with pNP-β-GlcA (7QEE) have been uploaded to the PDB. Accession numbers for hits from homology searches and PDB codes for structures used for comparative analyses are provided in the text.
The following publicly available datasets were used for analysis: CAZy database (http://www.cazy.org; release date, 22/01/21), PDB (7CBO, 3NVD, 5M6G, 6JZ5), GenBank (ACZ66247.2), NCBI non-redundant protein sequence database (WP_110852811). Source data are provided with this paper.
The authors thank P. Mair for help with FADS; D. Janssen (Groningen University) for making metagenomic DNA libraries available for this work; K. Stott for help with circular dichroism spectra; and G. Aleku, S. Ladevèze and other members of the Hollfelder group for comments on the manuscript. We are grateful for access to and support of the Department of Biochemistry X-ray crystallographic facility. Data collected at Diamond Light Source on beamlines IO3, IO4 and I23 contributed to the data in this manuscript (proposal numbers 18548 and 25402). This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 685474 (Metafluidics), the BBSRC (BB/W000504/1, BB/T003545/1) and the EPSRC (EP/H046593/1). S.N. received an AstraZeneca studentship. F.H. is an ERC Advanced Investigator (695669).
The authors declare no competing interests.
Peer review information
Nature Chemical Biology thanks Gabrielle Potocki-Veronese and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Extended Data Fig. 1 Structural comparison of SN268 (based on an AlphaFold2 prediction) with its closest homolog in the PDB.
(a) Overall alignment of SN268 (green) and the β-N-acetylhexosaminidase from Akkermansia muciniphila co-crystallized with GlcNAc (PDB: 7CBO, purple). The two sequences share 40.5% identity and a high degree of structural alignment (RMSD: 0.768 Å, over 2239 atoms). (b) A close-up of the active site shows an almost perfect alignment of the binding sites. Given this near-perfect active-site similarity, a reasonable conjecture is that both enzymes turn over the same main substrate but that weaker promiscuous activities may exist, one of which we detected in our droplet screens.
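The RMSD quoted in this caption is a root-mean-square deviation over matched atom pairs. The following is a minimal sketch of that arithmetic only, assuming the two coordinate sets are already superposed and paired one-to-one; the function name `rmsd` and the toy coordinates are illustrative and are not taken from the study, which does not state here which software produced the alignment.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and that atom i in
    coords_a is paired with atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example (hypothetical coordinates in Å): every atom displaced by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(reference, shifted))
```

A real comparison would first need an optimal superposition and an atom-pairing step, both of which are outside this sketch.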
(a) SN243 was most active on pNP-β-GlcA between pH 7.5 and 9.5. The buffer pH was varied between pH 4 and 11 using 100 mM citrate (pH 4-6), sodium phosphate (NaP, pH 6-8), Tris (pH 7.5-9), glycine (pH 9-10) and CAPS (pH 9.5-11) buffer, respectively, while keeping the salt concentration at 150 mM NaCl. (b) Thermostability of SN243. In a differential scanning fluorimetry experiment, the melting temperature of SN243 was determined to be 62.6 °C. The assay was performed in triplicate. (c) Optimization of the reaction temperature. The temperature was increased in 2 °C increments, and the initial reaction velocity with 10 nM SN243 and 50 μM pNP-β-GlcA was recorded and normalized to the highest observed value (56 °C). (d) Determination of Michaelis-Menten kinetics at the optimal reaction temperature. Using 2 nM SN243, kinetic parameters for the reaction with pNP-β-GlcA were determined at 56 °C. The mean of three independent datasets and the standard errors with a 95% confidence level are shown.
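To illustrate how Michaelis-Menten parameters such as those in panel (d) can be extracted from initial velocities, here is a minimal, dependency-free sketch. It uses a Hanes-Woolf linearization rather than nonlinear regression purely to stay within the standard library; the caption does not state the fitting method used in the study. The values of `TRUE_KCAT` and `TRUE_KM` and the substrate series are assumptions made up for illustration; only the 2 nM enzyme concentration comes from the caption.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity v = Vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)

def hanes_woolf_fit(substrate, velocity):
    """Estimate (Vmax, KM) by linear regression of [S]/v against [S].

    Hanes-Woolf form: [S]/v = [S]/Vmax + KM/Vmax,
    so slope = 1/Vmax and intercept = KM/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km

# Hypothetical, noise-free data generated from assumed parameters.
ENZYME_CONC = 2e-9   # M; 2 nM enzyme, as stated for panel (d)
TRUE_KCAT = 50.0     # s^-1, assumed for illustration only
TRUE_KM = 40e-6      # M, assumed for illustration only
substrate = [5e-6, 10e-6, 25e-6, 50e-6, 100e-6, 250e-6]
velocity = [michaelis_menten(s, TRUE_KCAT * ENZYME_CONC, TRUE_KM) for s in substrate]

vmax_fit, km_fit = hanes_woolf_fit(substrate, velocity)
kcat_fit = vmax_fit / ENZYME_CONC   # kcat = Vmax / [E]

# Scaling velocities to the highest observed value, as done for panel (c).
normalized = [v / max(velocity) for v in velocity]
```

With real, noisy data a nonlinear least-squares fit of the untransformed equation is generally preferred, because linearizations distort the error structure.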
(a) There is substantial evidence that GHs can have side activities in addition to the one considered their main, biologically relevant activity. Catalytic promiscuity is difficult or impossible to predict, so collecting experimental evidence for promiscuity is crucial to gain a more systematic understanding of its occurrence. Some of the promiscuous activities of SN243 had been observed for β-glucuronidases before, but there is no previous report of an enzyme active on all six of these glycosides. The thicker arrow highlights the new combination of a β-glucuronidase with promiscuous α-arabinofuranosidase activity added with the discovery of SN243. The figure summarizes the literature evidence with arrows starting at the main activity of an enzyme and pointing to the observed promiscuous activity. (Supplementary Fig. 17 is a replica of this plot with the literature references indicated for each arrow in square brackets.) (b) Catalytic efficiencies (kcat/KM, left y-axis) for SN243 main and promiscuous activities vary between 10⁶ M⁻¹ s⁻¹ (pNP-β-GlcA) and 10⁻¹ M⁻¹ s⁻¹ (pNP-α-L-Araf), which correspond to second-order rate enhancements (right y-axis) of 10¹⁴ and 10⁷, respectively. Detailed kinetic information can be found in Table 1 and Supplementary Table 6.
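The relationship between the two y-axes in panel (b) can be made explicit with a small order-of-magnitude calculation. This sketch assumes the usual definition of a second-order rate enhancement, (kcat/KM) divided by the second-order rate constant of the uncatalyzed reaction; the function name is hypothetical, and the implied uncatalyzed constants are an inference from the rounded caption values, not numbers reported in the source.

```python
def implied_log10_uncatalyzed(log10_kcat_km, log10_enhancement):
    """log10 of the uncatalyzed second-order rate constant implied by
    enhancement = (kcat/KM) / k_uncat, working in log10 units."""
    return log10_kcat_km - log10_enhancement

# Order-of-magnitude values quoted in the caption (as log10 exponents).
activities = {
    "pNP-β-GlcA": {"log10_kcat_km": 6, "log10_enhancement": 14},
    "pNP-α-L-Araf": {"log10_kcat_km": -1, "log10_enhancement": 7},
}

for name, values in activities.items():
    log_k = implied_log10_uncatalyzed(
        values["log10_kcat_km"], values["log10_enhancement"]
    )
    print(f"{name}: implied k_uncat ~ 10^{log_k} M^-1 s^-1")

# Orders of magnitude separating the main and the weakest promiscuous activity.
efficiency_span = (
    activities["pNP-β-GlcA"]["log10_kcat_km"]
    - activities["pNP-α-L-Araf"]["log10_kcat_km"]
)
```

Working with exponents rather than floating-point values keeps the arithmetic exact and makes clear that only orders of magnitude are being compared.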
Extended Data Fig. 4 Multiple sequence alignment of SN243 with characterized bacterial GH3 family members.
For this alignment the closest 33 sequences (accession codes indicated) from the phylogenetic tree generated with ClustalW were selected and aligned, and displayed with ESPript (http://espript.ibcp.fr/ESPript/cgi-bin/ESPript.cgi). Conservation is highlighted in black. A consensus marked by arrows identifies the catalytic nucleophile Asp415 (a) and the acid-base dyad Asp331/His333 (b).
Extended Data Fig. 5 Active site of SN243 variants co-crystallized with reaction substrates and product.
The structures of SN243 wt co-crystallized with GlcA (PDB: 7QE2) (a) and of the inactive mutant SN243D415A with FD-β-GlcA (PDB: 7QEA) (b) and pNP-β-GlcA (PDB: 7QEF) (c) bound in the active site were obtained at 2.15 Å, 2.28 Å and 2.41 Å, respectively. The gray mesh shows the Fo-Fc map for GlcA (a), FD-β-GlcA (b) and pNP-β-GlcA (c) contoured at 2 σ. As no electron density was observed for the second GlcA moiety of the FD-β-GlcA ligand in (b), the monoglycosylated substrate was modeled.
Extended Data Fig. 6 Comparison of the active site access between SN243 and the structure of a homologous GH3 member.
(a) Cartoon representation of the structures of SN243 (crystal structure with density for loop His190-Ala202, PDB: 7QG4, green) and Saccharopolyspora erythraea glucan β-1,4-glucosidase (PDB: 5M6G, blue). (b) A close-up of the active site with the reaction product GlcA (yellow) shows loops folding over the binding pocket in the glucan β-1,4-glucosidase (blue), making the active site much smaller than in SN243 (green). (c) Surface display of SN243 (green) and the glucan β-1,4-glucosidase (blue). Active site binding pockets were calculated with CASTp 3.0 and are highlighted. At 1,293 Å³, the active site volume of SN243 is about twenty times that of the homologous GH3 structure (65 Å³).
Extended Data Fig. 7 Surface display for the comparison of the size of the active site in the experimentally derived model (a) and the AlphaFold2 predictions (b).
The apo-structure of SN243 mutant D415N (a; PDB: 7QG4, corresponding to Fig. 4) shows crystallographic density for the loop ranging from His190 to Ala202. This loop is tied back to expose a large active site opening. By contrast, in the AlphaFold2 prediction in (b) the same loop appears to fold over into the active site, effectively reducing the volume of the binding pocket and hindering access.
Extended Data Fig. 8 Sequence-similarity network of sequences related to SN243 (red), starting with a list of significant hits from a MGnify search.
Shown in purple and yellow are the only other known GH3 enzymes with β-glucuronidase activity (no edge with SN243) and the 427 nodes directly connected to SN243 with an E-value < 1e-40, respectively. The closest MGnify hit (MGYP000481601007) aligned to SN243 with an E-value of 1.1e-229 (corresponding to 55% identity and 87% query coverage). Homology analysis of MGYP000481601007 against characterized sequences (UniProt/SwissProt) returns β-glucosidase/β-xylosidase hits with much lower homology to the query (E-value of the closest hit: 3e-28; Q46684.1). The discovery and functional characterization of SN243 (and closely related sequences in its perimeter) add a bridgehead in sequence space that can help with the annotation of related sequences. Even though sequence similarity does not strictly predict functional similarity, the characterization of SN243 illuminates the functional potential in this sequence neighborhood.
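The network logic described in this caption, in which two sequences are joined by an edge only when their pairwise E-value falls below a cutoff, can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function `build_network` and all sequence names prefixed `hypothetical_` are invented, and only the cutoff and the SN243/MGYP000481601007 E-value are taken from the caption.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-40  # edge threshold quoted in the caption

def build_network(pairwise_hits, cutoff=E_VALUE_CUTOFF):
    """Return an undirected adjacency map keeping only pairs with E-value below cutoff.

    pairwise_hits: iterable of (sequence_a, sequence_b, e_value) tuples.
    """
    graph = defaultdict(set)
    for a, b, e_value in pairwise_hits:
        if a != b and e_value < cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Hypothetical pairwise hits; only the first E-value is taken from the caption.
hits = [
    ("SN243", "MGYP000481601007", 1.1e-229),
    ("SN243", "hypothetical_seq_A", 3e-52),
    ("SN243", "hypothetical_GH3_glucuronidase", 2e-12),  # too weak: no edge
    ("hypothetical_seq_A", "MGYP000481601007", 5e-60),
]

network = build_network(hits)
neighbors = sorted(network["SN243"])  # nodes directly connected to SN243
```

In this toy input, the weakly related sequence stays disconnected from SN243, mirroring how the other GH3 β-glucuronidases share no edge with SN243 in the figure.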
Cite this article
Neun, S., Brear, P., Campbell, E. et al. Functional metagenomic screening identifies an unexpected β-glucuronidase. Nat Chem Biol 18, 1096–1103 (2022). https://doi.org/10.1038/s41589-022-01071-x