
Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase


Multidomain enzymes orchestrate two or more catalytic activities to carry out metabolic transformations with increased control and speed. Here, we report the design and development of a genome-mining approach for targeted discovery of biochemical transformations through the analysis of co-occurring enzyme domains (CO-ED) in a single protein. CO-ED was designed to identify unannotated multifunctional enzymes for functional characterization and discovery based on the premise that linked enzyme domains have evolved to function collaboratively. Guided by CO-ED, we targeted an unannotated predicted ThiF–nitroreductase di-domain enzyme found in more than 50 proteobacteria. Through heterologous expression and biochemical reconstitution, we discovered a series of natural products containing the rare oxazolone heterocycle and characterized their biosynthesis. Notably, we identified the di-domain enzyme as an oxazolone synthetase, validating CO-ED-guided genome mining as a methodology with potential broad utility for both the discovery of unusual enzymatic transformations and the functional annotation of multidomain enzymes.
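The core idea of CO-ED, counting which enzyme domains occur together within a single protein, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy input and the notion of an "annotated pairs" list are assumptions made purely for illustration, and the protein identifiers are made up.

```python
from collections import Counter
from itertools import combinations


def domain_cooccurrence(protein_domains):
    """Count, for each unordered domain pair, how many proteins contain both.

    protein_domains maps a protein ID to the list of enzyme domains
    predicted in that protein (hypothetical input format).
    """
    edges = Counter()
    for domains in protein_domains.values():
        # Deduplicate so repeated domains in one protein count once,
        # and sort so each pair has a single canonical orientation.
        for a, b in combinations(sorted(set(domains)), 2):
            edges[(a, b)] += 1
    return edges


def unannotated_pairs(edges, annotated_pairs):
    """Keep only domain pairs that are absent from a list of known pairs."""
    known = {tuple(sorted(pair)) for pair in annotated_pairs}
    return {pair: n for pair, n in edges.items() if pair not in known}


# Toy input with made-up protein IDs; domain names are illustrative labels.
toy = {
    "protA": ["ThiF", "Nitroreductase"],
    "protB": ["ThiF", "Nitroreductase"],
    "protC": ["ThiF"],
    "protD": ["PKS_KS", "PKS_AT", "PKS_KS"],
}

edges = domain_cooccurrence(toy)
candidates = unannotated_pairs(edges, annotated_pairs=[("PKS_KS", "PKS_AT")])
print(candidates)
```

In this sketch each domain pair is an edge in a co-occurrence network, weighted by the number of proteins carrying both domains. Pairs that survive the filter are candidates for functional characterization.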


Fig. 1: Outline of the CO-ED workflow.
Fig. 2: CO-ED networks can help guide enzyme discovery.
Fig. 3: Heterologous expression of oxzAB results in the production of a series of oxazolones.
Fig. 4: In vitro characterization of oxazolone biosynthesis by P. rubra OxzA and OxzB.

Data availability

All data generated during this study are included in this article and its Supplementary Information. CO-ED networks used to generate Fig. 2 and several Supplementary Figs. showing CO-ED networks are available in Supplementary Data 1. The curated set of enzyme domains for CO-ED can be found as the node table of the ‘all_of_uniprot’ network in Supplementary Data 1, as well as online. This same set of domains is also the default setting for the web tool. NMR spectra of newly reported structures are available in the Supplementary Note. Tandem MS spectra of newly reported structures are available in the Supplementary Note and were also deposited to the GNPS spectral library at the URLs shown in the Supplementary Note. The following bioinformatic databases were employed in this study: BRENDA, MIBiG, UniProt and Pfam-A. Source data are provided with this paper.

Code availability

Jupyter notebooks containing Python code for CO-ED analysis and for generating the statistics shown in Supplementary Figs. 1 and 18 are publicly available at


  1. Gu, C., Kim, G. B., Kim, W. J., Kim, H. U. & Lee, S. Y. Current status and applications of genome-scale metabolic models. Genome Biol. 20, 121 (2019).

  2. Bornscheuer, U. T. The fourth wave of biocatalysis is approaching. Philos. Trans. A Math. Phys. Eng. Sci. 376, 20170063 (2018).

  3. Gerlt, J. A. et al. The enzyme function initiative. Biochemistry 50, 9950–9962 (2011).

  4. Hanson, A. D., Pribat, A., Waller, J. C. & de Crécy-Lagard, V. Unknown proteins and orphan enzymes: the missing half of the engineering parts list—and how to find it. Biochem. J. 425, 1–11 (2009).

  5. Scott, T. A. & Piel, J. The hidden enzymology of bacterial natural product biosynthesis. Nat. Rev. Chem. 3, 404–425 (2019).

  6. Ellens, K. W. et al. Confronting the catalytic dark matter encoded by sequenced genomes. Nucleic Acids Res. 45, 11495–11514 (2017).

  7. Medema, M. H., de Rond, T. & Moore, B. S. Mining genomes to illuminate the specialized chemistry of life. Nat. Rev. Genet. (2021).

  8. Michael, A. J. Evolution of biosynthetic diversity. Biochem. J. 474, 2277–2299 (2017).

  9. Hagel, J. M. & Facchini, P. J. Tying the knot: occurrence and possible significance of gene fusions in plant metabolism and beyond. J. Exp. Bot. 68, 4029–4043 (2017).

  10. Bashton, M. & Chothia, C. The generation of new protein functions by the combination of domains. Structure 15, 85–99 (2007).

  11. Weissman, K. J. The structural biology of biosynthetic megaenzymes. Nat. Chem. Biol. 11, 660–670 (2015).

  12. Winzer, T. et al. Plant science. Morphinan biosynthesis in opium poppy requires a P450–oxidoreductase fusion protein. Science 349, 309–312 (2015).

  13. Ng, T. L., Rohac, R., Mitchell, A. J., Boal, A. K. & Balskus, E. P. An N-nitrosating metalloenzyme constructs the pharmacophore of streptozotocin. Nature 566, 94–99 (2019).

  14. El-Gebali, S. et al. The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432 (2019).

  15. Jeske, L., Placzek, S., Schomburg, I., Chang, A. & Schomburg, D. BRENDA in 2019: a European ELIXIR core data resource. Nucleic Acids Res. 47, D542–D549 (2019).

  16. Kautsar, S. A. et al. MIBiG 2.0: a repository for biosynthetic gene clusters of known function. Nucleic Acids Res. 48, D454–D458 (2020).

  17. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 46, 2699 (2018).

  18. Ishino, F., Mitsui, K., Tamaki, S. & Matsuhashi, M. Dual enzyme activities of cell wall peptidoglycan synthesis, peptidoglycan transglycosylase and penicillin-sensitive transpeptidase, in purified preparations of Escherichia coli penicillin-binding protein 1A. Biochem. Biophys. Res. Commun. 97, 287–293 (1980).

  19. Goodlove, P. E., Cunningham, P. R., Parker, J. & Clark, D. P. Cloning and sequence analysis of the fermentative alcohol-dehydrogenase-encoding gene of Escherichia coli. Gene 85, 209–214 (1989).

  20. Williams, G. J., Breazeale, S. D., Raetz, C. R. H. & Naismith, J. H. Structure and function of both domains of ArnA, a dual function decarboxylase and a formyltransferase, involved in 4-amino-4-deoxy-l-arabinose biosynthesis. J. Biol. Chem. 280, 23000–23008 (2005).

  21. Rusnak, F., Sakaitani, M., Drueckhammer, D., Reichert, J. & Walsh, C. T. Biosynthesis of the Escherichia coli siderophore enterobactin: sequence of the entF gene, expression and purification of EntF, and analysis of covalent phosphopantetheine. Biochemistry 30, 2916–2927 (1991).

  22. De Rond, T. et al. Oxidative cyclization of prodigiosin by an alkylglycerol monooxygenase-like enzyme. Nat. Chem. Biol. 13, 1155–1157 (2017).

  23. Agarwal, V. et al. Biosynthesis of polybrominated aromatic organic compounds by marine bacteria. Nat. Chem. Biol. 10, 640–647 (2014).

  24. Ross, A. C., Gulland, L. E. S., Dorrestein, P. C. & Moore, B. S. Targeted capture and heterologous expression of the Pseudoalteromonas alterochromide gene cluster in Escherichia coli represents a promising natural product exploratory platform. ACS Synth. Biol. 4, 414–420 (2015).

  25. Herz, S., Eberhardt, S. & Bacher, A. Biosynthesis of riboflavin in plants. The ribA gene of Arabidopsis thaliana specifies a bifunctional GTP cyclohydrolase II/3,4-dihydroxy-2-butanone 4-phosphate synthase. Phytochemistry 53, 723–731 (2000).

  26. Schulman, B. A. & Harper, J. W. Ubiquitin-like protein activation by E1 enzymes: the apex for downstream signalling pathways. Nat. Rev. Mol. Cell Biol. 10, 319–331 (2009).

  27. Miyauchi, K., Kimura, S. & Suzuki, T. A cyclic form of N6-threonylcarbamoyladenosine as a widely distributed tRNA hypermodification. Nat. Chem. Biol. 9, 105–111 (2013).

  28. Regni, C. A. et al. How the MccB bacterial ancestor of ubiquitin E1 initiates biosynthesis of the microcin C7 antibiotic. EMBO J. 28, 1953–1964 (2009).

  29. Xi, J., Ge, Y., Kinsland, C., McLafferty, F. W. & Begley, T. P. Biosynthesis of the thiazole moiety of thiamin in Escherichia coli: identification of an acyldisulfide-linked protein–protein conjugate that is functionally analogous to the ubiquitin/E1 complex. Proc. Natl Acad. Sci. USA 98, 8513–8518 (2001).

  30. Godert, A. M., Jin, M., McLafferty, F. W. & Begley, T. P. Biosynthesis of the thioquinolobactin siderophore: an interesting variation on sulfur transfer. J. Bacteriol. 189, 2941–2944 (2007).

  31. Akiva, E., Copp, J. N., Tokuriki, N. & Babbitt, P. C. Evolutionary and molecular foundations of multiple contemporary functions of the nitroreductase superfamily. Proc. Natl Acad. Sci. USA 114, E9549–E9558 (2017).

  32. Mondal, S., Raja, K., Schweizer, U. & Mugesh, G. Chemistry and biology in the biosynthesis and action of thyroid hormones. Angew. Chem. Int. Ed. Engl. 55, 7606–7630 (2016).

  33. Schneider, T. L., Shen, B. & Walsh, C. T. Oxidase domains in epothilone and bleomycin biosynthesis: thiazoline to thiazole oxidation during chain elongation. Biochemistry 42, 9722–9730 (2003).

  34. Taga, M. E., Larsen, N. A., Howard-Jones, A. R., Walsh, C. T. & Walker, G. C. BluB cannibalizes flavin to form the lower ligand of vitamin B12. Nature 446, 449–453 (2007).

  35. Bashiri, G. et al. A revised biosynthetic pathway for the cofactor F420 in prokaryotes. Nat. Commun. 10, 1558 (2019).

  36. Gondry, M. et al. Cyclic dipeptide oxidase from Streptomyces noursei. Isolation, purification and partial characterization of a novel, amino acyl α,β-dehydrogenase. Eur. J. Biochem. 268, 1712–1721 (2001).

  37. Zenno, S., Saigo, K., Kanoh, H. & Inouye, S. Identification of the gene encoding the major NAD(P)H-flavin oxidoreductase of the bioluminescent bacterium Vibrio fischeri ATCC 7744. J. Bacteriol. 176, 3536–3543 (1994).

  38. Guella, G., N’Diaye, I., Fofana, M. & Mancini, I. Isolation, synthesis and photochemical properties of almazolone, a new indole alkaloid from a red alga of Senegal. Tetrahedron 62, 1165–1170 (2006).

  39. Seyedsayamdost, M. R. High-throughput platform for the discovery of elicitors of silent bacterial gene clusters. Proc. Natl Acad. Sci. USA 111, 7266–7271 (2014).

  40. Walsh, C. & Wencewicz, T. Antibiotics: Challenges, Mechanisms, Opportunities (American Society for Microbiology, 2016).

  41. Hon, J. et al. EnzymeMiner: automated mining of soluble enzymes with diverse structures, catalytic properties and stabilities. Nucleic Acids Res. 48, W104–W109 (2020).

  42. Gerlt, J. A. Genomic enzymology: web tools for leveraging protein family sequence–function space and genome context to discover novel functions. Biochemistry 56, 4293–4308 (2017).

  43. Wuchty, S. & Almaas, E. Evolutionary cores of domain co-occurrence networks. BMC Evol. Biol. 5, 24 (2005).

  44. Barrera, A., Alastruey-Izquierdo, A., Martín, M. J., Cuesta, I. & Vizcaíno, J. A. Analysis of the protein domain and domain architecture content in fungi and its application in the search of new antifungal targets. PLoS Comput. Biol. 10, e1003733 (2014).

  45. Suhre, K. Inference of gene function based on gene fusion events: the Rosetta-stone method. In Methods in Molecular Biology 396, 31–41 (2007).

  46. Promponas, V. J., Ouzounis, C. A. & Iliopoulos, I. Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey. Brief. Bioinform. 15, 443–454 (2014).

  47. Alborzi, S. Z., Devignes, M.-D. & Ritchie, D. W. ECDomainMiner: discovering hidden associations between enzyme commission numbers and Pfam domains. BMC Bioinformatics 18, 107 (2017).

  48. De Castro, P. P., Carpanez, A. G. & Amarante, G. W. Azlactone reaction developments. Chem. Eur. J. 22, 10294–10318 (2016).

  49. Kenney, G. E. et al. The biosynthesis of methanobactin. Science 359, 1411–1416 (2018).

  50. Haft, D. H., Paulsen, I. T., Ward, N. & Selengut, J. D. Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic. BMC Biol. 4, 29 (2006).

  51. Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 20, 1160–1166 (2019).

  52. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  53. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462 (1997).

  54. Xie, B.-B. et al. Genome sequence of the cycloprodigiosin-producing bacterial strain Pseudoalteromonas rubra ATCC 29570(T). J. Bacteriol. 194, 1637–1638 (2012).

  55. Bentley, S. D. et al. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature 417, 141–147 (2002).

  56. Udwary, D. W. et al. Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica. Proc. Natl Acad. Sci. USA 104, 10376–10381 (2007).

  57. Paulsen, I. T. et al. Complete genome sequence of the plant commensal Pseudomonas fluorescens Pf-5. Nat. Biotechnol. 23, 873–878 (2005).

  58. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).



We thank our UCSD colleagues J. Li, V. Shende, T. Fallon and B. Duggan for helpful discussions. This work was supported by National Institutes of Health awards F32GM129960 to T.d.R. and R01GM085770 to B.S.M., as well as an American Society for Pharmacognosy Undergraduate Research Award and a UC San Diego ‘Eureka’ Undergraduate Research Scholarship to J.E.A.

Author information




T.d.R. and B.S.M. designed research; T.d.R. and J.E.A. performed research; T.d.R. analyzed data; T.d.R. and B.S.M. wrote the paper.

Corresponding authors

Correspondence to Tristan de Rond or Bradley S. Moore.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Chemical Biology thanks A. James Link, Maude Pupin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Survey of transformations known to be catalyzed by the ThiF and nitroreductase domains, and properties of organisms harboring a ThiF-nitroreductase di-domain enzyme identified by CO-ED.

a, OxzB exhibits homology to both the ThiF and nitroreductase enzyme families. Shown is a selection of characterized members of these enzyme families with the transformations they catalyze, together with midpoint-rooted gene trees showing their relationships to each other and to OxzB. ThiF family enzymes are known to catalyze ATP-dependent carboxylate-activating reactions, while nitroreductase family enzymes catalyze a variety of redox reactions. Inferred phylogenies were generated from protein sequences using neighbor joining on the MAFFT web server (ref. 51) and midpoint-rooted. Scale bars designate 1 substitution per site. b,c, Phylogenetic distribution (b) and habitat (c) of the host organisms harboring genes encoding OxzB proteins represented in the UniProt database. Organisms with unknown habitats are not included. d, Genomic context of oxzB genes as determined using the Enzyme Function Initiative Genome Neighborhood Tool (ref. 42). Most oxzB homologs are accompanied by oxzA, which codes for an N-acyltransferase. The arrow labeled ‘?’ represents genes unrelated to oxzA or oxzB. Organisms with unknown oxzB genomic context (for example, at the edge of a contig) are not included.


Extended Data Fig. 2 Induction of oxazolone production in P. rubra and C. chukchiensis by various antibiotics.

Bacteria were grown as a lawn on Marine Agar 2216 with a drop of antibiotic. A consistent amount of biomass adjacent to the zone of inhibition was harvested, extracted and analyzed by HPLC. Dots indicate summed peak areas between wavelengths of 300 nm and 400 nm. Three biological replicates were analyzed for each condition.


Supplementary information

Supplementary Information

Supplementary Figs. 1–19, Tables 1 and 2 and Note

Reporting Summary

Supplementary Data 1

Source data

Source Data Extended Data Fig. 1

Statistical source data for Extended Data Fig. 1b–d.

Source Data Extended Data Fig. 2

Statistical source data.


Cite this article

de Rond, T., Asay, J.E. & Moore, B.S. Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase. Nat Chem Biol 17, 794–799 (2021).
