Abstract
Natural products research increasingly applies -omics technologies to guide molecular discovery. While the combined analysis of genomic and metabolomic datasets has proved valuable for identifying natural products and their biosynthetic gene clusters (BGCs) in bacteria, this integrated approach lacks application to fungi. Because fungi are hyper-diverse and underexplored for new chemistry and bioactivities, we created a linked genomics–metabolomics dataset for 110 Ascomycetes, and optimized both gene cluster family (GCF) networking parameters and correlation-based scoring for pairing fungal natural products with their BGCs. Using a network of 3,007 GCFs (organized from 7,020 BGCs), we examined 25 known natural products originating from 16 known BGCs and observed statistically significant associations between 21 of these compounds and their validated BGCs. Furthermore, the scalable platform identified the BGC for the pestalamides, demystifying their biogenesis, and revealed more than 200 high-scoring natural product–GCF linkages to direct future discovery.
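To make the idea of correlation-based scoring concrete, the following is a minimal sketch of how a single metabolite–GCF pair could be scored from presence/absence patterns across strains. It is an illustration only: the abstract does not specify the scoring statistic, so the use of a one-sided Fisher's exact test here is an assumption, and the function names (fisher_right_tail, score_pair) and the example strain patterns are hypothetical.

```python
from math import comb


def fisher_right_tail(both, gcf_only, ion_only, neither):
    """One-sided Fisher's exact test p-value for co-occurrence.

    Returns the probability of observing at least `both` strains that carry
    the GCF and also produce the metabolite ion, given fixed marginal totals.
    (Assumed statistic; not necessarily the one used in the paper.)
    """
    n = both + gcf_only + ion_only + neither  # total number of strains
    gcf_total = both + gcf_only               # strains carrying the GCF
    ion_total = both + ion_only               # strains producing the ion
    denom = comb(n, ion_total)
    upper = min(gcf_total, ion_total)
    p = 0.0
    # Sum hypergeometric probabilities for overlaps >= the observed overlap.
    for k in range(both, upper + 1):
        p += comb(gcf_total, k) * comb(n - gcf_total, ion_total - k) / denom
    return p


def score_pair(gcf_presence, ion_presence):
    """Build the 2x2 contingency table from per-strain booleans and test it."""
    if len(gcf_presence) != len(ion_presence):
        raise ValueError("both lists must cover the same strains")
    pairs = list(zip(gcf_presence, ion_presence))
    both = sum(1 for g, m in pairs if g and m)
    gcf_only = sum(1 for g, m in pairs if g and not m)
    ion_only = sum(1 for g, m in pairs if m and not g)
    neither = sum(1 for g, m in pairs if not g and not m)
    return fisher_right_tail(both, gcf_only, ion_only, neither)


if __name__ == "__main__":
    # Hypothetical pattern: 110 strains, 8 carry the GCF, and the same 8
    # produce the ion. A small p-value suggests a candidate linkage.
    gcf = [True] * 8 + [False] * 102
    ion = [True] * 8 + [False] * 102
    print(score_pair(gcf, ion))
```

In a dataset with thousands of GCFs and many detected ions, every pair would be tested, so some form of multiple-testing correction would be needed before calling a linkage significant; that step is omitted from this sketch.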

Data availability
All genomes that were sequenced for this work are available via NCBI under BioProject PRJNA852164. The metabolomics data (as .mzXML files) for the 110-strain dataset are available via the MassIVE repository under accession no. MSV000089848. Additionally, we have included Supplementary Data 1, which contains .html files for all MIBiG-anchored GCFs with detected metabolites, as well as for the pestalamide GCF discovered in this work. The processed MZmine peak list that we used for correlations (generated using the publicly available .mzXML files) is provided as Supplementary Data 2.
Acknowledgements
Genome sequencing for this project was conducted at the Roy J. Carver Biotechnology Center at the University of Illinois Urbana-Champaign. This research was supported in part by the National Institutes of Health grant nos. F32 GM132679 to L.K.C., R01 GM112739-05A1 to N.P.K., T32 GM135066 to G.N., R44 AI140943-03 to J.W.B., P01 CA125066 to N.H.O. and 2R01 AT009143 to N.L.K. This work also made use of the IMSERC NMR facility at Northwestern University, which has received support from the Soft and Hybrid Nanotechnology Experimental Resource (grant no. NSF ECCS-2025633). Figures 1, 4 and 5 were created using BioRender.com.
Author information
Contributions
L.K.C. led the project, organized data collection and analyzed data for both large-scale correlations and targeted biosynthetic studies. F.A.B. was responsible for fungal culture and DNA extraction and helped prepare and run MS samples. M.T.R. assembled genomes and conducted bioinformatic analysis for both GCF networking and metabologenomics correlations. N.J.A. grew fungi, extracted metabolomes and ran MS samples. R.G. and D.D. prepared and ran MS samples and assisted with metabolomics analysis. J.W.B. completed fungal transformations for heterologous expression and knockout studies. G.N. compiled correlation plots and assisted with GCF optimization. R.J.S., D.J. and D.M. designed, cloned and validated plasmids for heterologous expression. K.B.C., C.E.E. and N.H.O. assisted with metabolite dereplication and NMR analysis. H.A.R. provided expertise for fungal growth and extraction and taxonomic identification of fungal strains. N.P.K. and N.L.K. supervised the project after its initiation by N.L.K. The manuscript was written by L.K.C. and N.L.K., with all authors providing substantial edits and commentary throughout.
Ethics declarations
Competing interests
The authors declare financial conflicts of interest with MicroMGx (N.L.K.), Varigen Biosciences (D.M.) and Terra Bioforge (N.L.K., D.M., M.T.R. and N.P.K.). Further, N.L.K. is a consultant for Thermo Fisher Scientific focusing on the use of Fourier-transform mass spectrometry in multi-omics research. Finally, N.H.O. and H.A.R. are on the Scientific Advisory Board of Clue Genetics, and N.H.O. is on the Scientific Advisory Board of Mycosynthetix. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Chemical Biology thanks Hosein Mohimani and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Supplementary information
Supplementary Information
Supplementary Tables 1–15, Figs. 1–34 and uncropped gel images associated with Supplementary Figs. 31 and 32.
Supplementary Data 1
Folder containing .html files for 29 GCFs discussed in this paper. Each .html file includes gene cluster arrow diagrams for all BGCs belonging to a specific GCF and individual arrows are linked to their genetic sequences.
Supplementary Data 2
Filtered MZmine peak list used for metabologenomics analysis.
About this article
Cite this article
Caesar, L.K., Butun, F.A., Robey, M.T. et al. Correlative metabologenomics of 110 fungi reveals metabolite–gene cluster pairs. Nat Chem Biol (2023). https://doi.org/10.1038/s41589-023-01276-8
DOI: https://doi.org/10.1038/s41589-023-01276-8