Introduction

Over the past few decades, natural enzymes that form the core structure of natural products1, 2, 3 have been studied extensively for their reaction mechanisms, and the study of auxiliary enzymes4, 5, 6, 7, 8 that modify the core structure has been expanding our understanding of how nature generates complex chemical structures. Accumulation of such knowledge would permit quick identification of biosynthetic gene clusters, which are practically blueprints of the microbial assembly machineries that generate natural products. Identification of biosynthetic gene clusters will facilitate our engineering efforts toward inducing native hosts to produce more of the desired compounds or activating biosynthesis of silenced secondary metabolites. Isolation of relevant genes will also allow programming of convenient host organisms for heterologous production of desired biosynthetic enzymes and natural products or their analogs.

Many bioactive molecules have been isolated from different organisms, and extremely effective pharmaceutical products have been developed from them and brought to market. In recent years, drug resistance among microbes has proliferated with the improper use of antibiotics, and bacteria have acquired resistance not only against classic drugs such as penicillin, but also against the so-called drugs of last resort, such as vancomycin.9 Under these circumstances, it has become imperative to discover new compounds to combat drug-resistant microbes and to control communicable diseases, such as AIDS, malaria and tuberculosis. However, isolating new and useful natural products from different organisms is a resource-intensive and time-consuming task with a poor success rate. Thus, we attempted to develop a simple method of finding novel natural products, and the genes responsible for their biosynthesis, from underutilized source organisms by taking advantage of the wealth of knowledge of natural products and their biosynthetic mechanisms that has been accumulated to date. In this method, we would first conduct chemical screening to identify new compounds by searching natural product databases using the molecular formula deduced from the exact mass, together with the UV spectrum, of compounds detected in the culture of a source organism by ultra HPLC–high-resolution MS (UHPLC–HRMS). Then, bioinformatic analyses would be applied to predict the biosynthetic genes necessary to form the newly identified compounds and to locate such genes in the genome of the source organism. Lastly, those predicted genes would be expressed heterologously in a convenient host to confirm their functions. We expected that this approach would enable us to identify new compounds from a culture without having to perform large-scale fermentation or laborious bioactivity assays, thereby reducing the time and cost of natural product discovery.

Results

Isolation of the natural products

To put our method into practice, we decided to focus on an entomopathogenic filamentous fungus called Metarhizium robertsii ARSEF 23 (previously named M. anisopliae ARSEF 23), because it had not been routinely included in the screening efforts toward isolation of new natural products thus far. This fungus was used as a biopesticide to control insects such as locusts, and recent studies identified a class of compounds called destruxins as the toxin responsible for the pathogenicity of the fungus.10 The genome of the fungus was sequenced,11 and the gene cluster responsible for the biosynthesis of destruxin was identified.12

For secondary metabolite isolation, M. robertsii ARSEF 23 was cultured for 8 days in either MYG (malt extract, yeast extract, glucose) or YPD (yeast extract, peptone, dextrose) liquid medium, and the culture was analyzed for the presence of metabolites by UHPLC–HRMS. Analysis of extracts from the MYG medium culture identified two compounds, destruxin A and B (Figure 1). The peak at 4.8 min was assigned as destruxin A by HRMS data analysis (m/z 578.3547 [M+H]+, molecular formula C29H47O7N5). Similarly, the peak near 5.5 min was assigned as destruxin B (m/z 594.3859 [M+H]+, molecular formula C30H51O7N5) (Supplementary Figure 1b). On the other hand, analysis of extracts from the YPD liquid medium culture identified three compounds: compound 1 having an m/z of 443.2791 [M+H]+ (calcd for C27H39O5, 443.2792, Δ=0.1 mmu); compound 2 having an m/z of 445.2948 [M+H]+ (calcd for C27H41O5, 445.2949, Δ=0.1 mmu); and compound 3 having an m/z of 427.2841 [M+H]+ (calcd for C27H39O4, 427.2843, Δ=0.2 mmu) (Supplementary Figure 1a). A search of the natural compound identifier database AntiBase2013 (ref. 13) with these predicted molecular formulae, in conjunction with the ultraviolet–visible (UV–Vis) spectra showing a λmax at 296 nm (Supplementary Figure 1c), indicated that they might be unknown compounds. Therefore, we proceeded to purify 1–3 for further characterization.
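
The calculated m/z values and mass errors quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the software used in the study: the function names are hypothetical, the isotope masses are standard reference values rounded to eight decimals, and the neutral formulae for 1, 2 and 3 are obtained by removing one hydrogen from the protonated-ion formulae listed in the text.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858  # electron rest mass in u


def parse_formula(formula):
    """Turn a simple formula such as 'C27H38O5' into {'C': 27, 'H': 38, 'O': 5}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz_mh(neutral_formula):
    """m/z of the singly charged [M+H]+ ion: add one hydrogen atom, remove one electron."""
    return monoisotopic_mass(neutral_formula) + MONOISOTOPIC["H"] - ELECTRON


def error_mmu(observed_mz, neutral_formula):
    """Observed minus calculated m/z, in milli mass units."""
    return (observed_mz - mz_mh(neutral_formula)) * 1000.0


if __name__ == "__main__":
    # Observed values are those reported in the text; neutral formulae are one H
    # short of the "calcd for" ion formulae (an inference made for this sketch).
    for label, formula, observed in [
        ("1", "C27H38O5", 443.2791),
        ("2", "C27H40O5", 445.2948),
        ("3", "C27H38O4", 427.2841),
        ("destruxin A", "C29H47O7N5", 578.3547),
    ]:
        print(label, f"{mz_mh(formula):.4f}", f"{error_mmu(observed, formula):+.2f} mmu")
```

The sign of the error is kept here, whereas the Δ values in the text are reported as magnitudes calculated from the rounded m/z values.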

Figure 1

Chemical structures of newly isolated natural products subglutinol C (1) and D (2), previously known subglutinol A (3) and B (4), and destruxin A and B from M. robertsii ARSEF 23, as well as intermediates 5 and 6 biosynthesized heterologously in A. nidulans A1145.

Characterization of the isolated natural products

Compounds 1, 2 and 3 were purified by liquid chromatography from a 10-liter culture of M. robertsii ARSEF 23 grown in YPD liquid medium for 8 days (see the Materials and methods section for details). NMR data, including 1H and 13C spectra, as well as two-dimensional NMR analysis (Supplementary Table 2), identified 3 as subglutinol A,14 which was originally isolated from an endophytic fungus Fusarium subglutinans along with its epimer subglutinol B (4) as immunosuppressive agents.14 The absolute configurations shown in Figure 1 are drawn based on those established for the related compounds viridoxin A and B, and other insecticidal metabolites from the entomopathogenic fungus M. flavoviride.15 The planar structure of 1 (Supplementary Table 3 and Supplementary Figure 3) was elucidated with one- and two-dimensional NMR analyses, and it was determined to be a 15-hydroxylated derivative of 3. The relative stereochemistry of 1 was assigned by the similarity of the nuclear Overhauser effect spectroscopy (NOESY) spectrum to that of 3. Furthermore, the absolute configuration of 1 was determined by comparing its optical rotation with that of 3. The absolute configuration of 2 (Supplementary Table 4 and Supplementary Figure 9) was also determined by essentially the same process used for 1, except for the stereochemistry at C14. We named 1 and 2 subglutinol C and D, respectively.

Identification of the biosynthetic gene cluster responsible for the formation of subglutinols

Subglutinols are categorized as meroterpenes consisting of an α-pyrone (4-hydroxy-5,6-dimethyl-2-pyrone) moiety attached to a decalin core fused to a five-membered cyclic ether carrying a prenyl side chain.14 Many fungal meroterpenes whose scaffold is derived from a polyketide–terpene hybrid backbone have been isolated. They include andrastin A from Penicillium sp. FO-3929,16 terretonin from A. terreus,17 austinol from A. nidulans18, 19 and anditomin from A. variecolor,20 for which the biosynthetic pathways and the genes encoding the responsible enzymes have been described. Therefore, we expected that subglutinols were also derived from a polyketide and a terpenoid unit. Furthermore, the elucidated chemical structure of 1 revealed its similarity to pyripyropene A isolated from A. fumigatus FO-1289, which has an α-pyrone and a decalin moiety.21 However, subglutinols are unique in that the α-pyrone is fused to a geranylgeranyl (GG) moiety, not the farnesyl group found in all the meroterpenes listed above. Therefore, the current study would be the first report of a biosynthetic gene cluster that produces α-pyrone–GG hybrids. Nevertheless, the similarity in the chemical structure suggested that the gene cluster responsible for the biosynthesis of subglutinols would have a composition similar to that of the more recently identified biosynthetic gene cluster of pyripyropene A.22 The α-pyrone part of subglutinols was predicted to be biosynthesized by a nonreducing polyketide synthase (NR-PKS) containing a ketosynthase, an acyltransferase, a methyltransferase and an acyl carrier protein (ACP) domain. Then, a prenyltransferase could catalyze the condensation of the α-pyrone and a diterpene chain. Subsequently, a terpene cyclase, which is often annotated as an integral membrane protein, could catalyze the decalin ring formation, followed by the five-membered ring formation to yield the common subglutinol framework.
Therefore, to identify the subglutinol biosynthetic gene cluster, we searched the annotated M. robertsii ARSEF 23 genome for a clustering of genes that were predicted by Basic Local Alignment Search Tool for Protein (BLASTP)23 to encode an NR-PKS, a prenyltransferase, a terpene cyclase (integral membrane protein) and redox enzymes. The search identified a biosynthetic gene cluster spanning from MAA_07492 to MAA_07500 that fulfilled all the criteria. We named this gene cluster sub (Figure 2).
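
The search criterion described above, a run of neighboring genes that together encode every required enzyme class, can be expressed as a simple window scan over an ordered annotation table. The sketch below is illustrative only: the gene identifiers and annotations are invented toy data (they are not the MAA_ loci or their BLASTP results), and the function name and the `max_genes` cutoff are assumptions.

```python
# Enzyme classes that a candidate cluster must contain (taken from the text).
REQUIRED = {"NR-PKS", "prenyltransferase", "terpene cyclase", "redox enzyme"}


def find_candidate_clusters(genes, required=REQUIRED, max_genes=10):
    """Scan genes, a list of (locus_tag, predicted_function) in chromosomal order.

    For each gene that belongs to a required class, extend a window of at most
    max_genes consecutive genes and report (first_tag, last_tag) as soon as the
    window contains every required class.
    """
    hits = []
    for start in range(len(genes)):
        if genes[start][1] not in required:
            continue  # a candidate cluster should begin with a relevant gene
        seen = set()
        for end in range(start, min(start + max_genes, len(genes))):
            seen.add(genes[end][1])
            if required <= seen:
                hits.append((genes[start][0], genes[end][0]))
                break
    return hits


if __name__ == "__main__":
    # Hypothetical annotation table, for illustration only.
    toy_genome = [
        ("g01", "transporter"),
        ("g02", "NR-PKS"),
        ("g03", "terpene cyclase"),
        ("g04", "prenyltransferase"),
        ("g05", "GGPP synthase"),
        ("g06", "redox enzyme"),
        ("g07", "hypothetical protein"),
        ("g08", "kinase"),
    ]
    print(find_candidate_clusters(toy_genome))
```

In practice the predicted functions would come from BLASTP annotation of each open reading frame, and a hit from such a scan is only a candidate that still requires experimental confirmation.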

Figure 2

Organization of the sub gene cluster responsible for the biosynthesis of subglutinols. Footnotes: (a) deduced functions of the open reading frames identified within the biosynthetic gene cluster were based on the percentage sequence identity/similarity to known proteins as determined by protein BLAST search against the NCBI non-redundant database;23 (b) the coding region has not been determined experimentally. GGPP, geranylgeranyl pyrophosphate; PKS, polyketide synthase; FAD, flavin adenine dinucleotide.

Engineered biosynthesis of subglutinol intermediates in A. nidulans A1145

To confirm that the sub gene cluster was responsible for the biosynthesis of subglutinols, we chose to express the sub genes heterologously in A. nidulans A1145 (ref. 24) as the host for the production of the expected subglutinol biosynthetic intermediates. The standard gene knockout experiment was not successful with M. robertsii ARSEF 23 due to the difficulty of transforming this fungus with exogenous DNA. The subglutinol biosynthetic genes subA, subC and subD were introduced into A. nidulans A1145 on plasmids (see the Materials and methods section for details). A. nidulans A1145 transformed with the relevant plasmids was grown in CD–ST (Czapek Dox–starch, tryptone) medium for 3 days before metabolites were extracted. The strain carrying the subA gene coding for the NR-PKS produced the α-pyrone portion of subglutinols (5) (Figure 3). The identification of 5 was achieved with the use of an authentic standard.25 On the other hand, the strain carrying all of subA, subC and subD was able to form the linear GG α-pyrone (6) (Figure 4). The chemical structure of 6 was elucidated based on the NMR data (Supplementary Table 5 and Supplementary Figure 15). Combined, these results clearly established that the sub gene cluster indeed encoded enzymes responsible for the biosynthesis of subglutinols.

Figure 3

Engineered biosynthesis of 5 in A. nidulans A1145. LC traces from the UHPLC–HRMS analysis of (a) the authentic reference of 5 (ref. 25) and (b) the metabolic extracts from A. nidulans A1145 transformed with pKW16020 carrying subA. All of the UHPLC traces were monitored at 280 nm. UHPLC–HRMS, ultra HPLC–high-resolution MS.

Figure 4

Engineered biosynthesis of 6 in A. nidulans A1145. LC traces from the UHPLC–HRMS analysis of the metabolic extracts from (a) A. nidulans A1145 and (b) A. nidulans A1145 transformed with three plasmids, pKW16020, pKW16025 and pKW16026, carrying subA, subC and subD, respectively. All of the UHPLC traces were monitored at 280 nm. UHPLC–HRMS, ultra HPLC–high-resolution MS.

Discussion

In this study, we successfully isolated two new meroterpenes, subglutinol C and D, in addition to the known immunosuppressive subglutinol A, from the metabolically under-characterized M. robertsii ARSEF 23 through natural product database search using HRMS and spectrophotometric data collected on the culture extract. The chemical structures of the isolated compounds were determined by NMR analyses. Elucidation of the structure of the compounds led to the identification of the gene cluster predicted to be responsible for the biosynthesis of subglutinols. Lastly, heterologous expression of three key genes in A. nidulans A1145 allowed us to confirm the gene cluster to be responsible for the biosynthesis of subglutinols. In the proposed subglutinol biosynthetic pathway, the α-pyrone 5 produced by SubA is combined with GG pyrophosphate formed by SubD through the action of SubC to yield 6. The expression of the NR-PKS gene subA in A. nidulans A1145 confirmed that it was responsible for the formation of 5. Furthermore, the formation of the linear α-pyrone diterpenoid 6 upon introduction of subA, the prenyltransferase-encoding subC and the GG pyrophosphate synthase-encoding subD into A. nidulans A1145 provided experimental support for the first half of the proposed biosynthetic pathway leading to the formation of subglutinols 1, 2 and 3 (Figure 5). Subsequent steps in the subglutinol biosynthetic pathway involve the decalin core formation, which is thought to be initiated by the epoxidation of the C10–C11 olefin by the FAD-dependent oxidoreductase SubE. The following cyclization cascade would be catalyzed by the terpene cyclase SubB, which is frequently annotated as an integral membrane protein. Lastly, the FAD-dependent dehydrogenase SubF can perform the five-membered cyclic ether formation to complete the formation of 3. Subsequent redox reactions appear to give rise to the two new subglutinol analogs isolated in this study. However, it remains unclear which enzymes are responsible for these transformations. Elucidation of the complete subglutinol biosynthetic pathway is currently underway.
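
The logic of the proposed pathway, and of the heterologous expression experiments that test its first half, can be summarized as a small dependency model. This is a deliberately simplified sketch under stated assumptions: the step table follows the proposal in the text, the steps after 6 are hypotheses that have not been verified, the intermediate names are placeholders, and the model ignores any contribution from the host's own enzymes.

```python
# (enzyme, substrates that must be present, product formed).
# Steps up to 6 are supported by the expression data; later steps are proposed only.
STEPS = [
    ("SubA", set(), "5"),                      # NR-PKS: alpha-pyrone
    ("SubD", set(), "GGPP"),                   # GG pyrophosphate synthase
    ("SubC", {"5", "GGPP"}, "6"),              # prenyltransferase: linear GG alpha-pyrone
    ("SubE", {"6"}, "epoxide of 6"),           # proposed C10-C11 epoxidation
    ("SubB", {"epoxide of 6"}, "decalin intermediate"),  # proposed cyclization
    ("SubF", {"decalin intermediate"}, "3"),   # proposed cyclic ether formation
]


def predicted_metabolites(expressed):
    """Return every pathway metabolite reachable with the given set of enzymes."""
    pool = set()
    changed = True
    while changed:  # repeat until no further step can fire
        changed = False
        for enzyme, substrates, product in STEPS:
            if enzyme in expressed and substrates <= pool and product not in pool:
                pool.add(product)
                changed = True
    return pool


if __name__ == "__main__":
    print(sorted(predicted_metabolites({"SubA"})))
    print(sorted(predicted_metabolites({"SubA", "SubC", "SubD"})))
```

Under this model, a strain expressing only SubA is expected to accumulate 5, and a strain expressing SubA, SubC and SubD is expected to reach 6, which matches the two experiments reported here.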

Figure 5

Proposed subglutinol biosynthetic pathway. The absolute configuration of 2 was predicted based on the known absolute configuration of 3 and 4 (ref. 32) along with the currently available experimental data, except for C14. Predicted domain organization of the iterative PKS SubA is shown. ACP, acyl carrier protein; AT, acyltransferase; FPP, farnesyl pyrophosphate; GGPP, geranylgeranyl pyrophosphate; IPP, isopentenyl pyrophosphate; OPP, diphosphate; KS, ketosynthase; MT, methyltransferase; PKS, polyketide synthase; SAT, starter unit: ACP transacylase.

Our A. nidulans A1145-based heterologous expression system8, 26 proved to be very useful, because it allowed engineered biosynthesis of secondary metabolites even though the original producing fungus was not a genetically well-established strain. Use of the system not only enabled us to identify the biosynthetic genes directly, but also provided a convenient means to achieve a higher titer of desired intermediates for detailed physicochemical characterization. As demonstrated in this study, with the steadily growing chemical and biosynthetic knowledge of secondary metabolites, it is expected that chemical screening carried out by searching natural product databases with LC and HRMS data obtained from microbial cultures would allow us to readily identify new natural products produced by a microorganism. In addition, the ease and lowered cost of sequencing microbial genomes and performing bioinformatic analyses greatly facilitate the identification of biosynthetic genes responsible for the formation of new compounds in a shorter time frame. Lastly, convenient heterologous expression of the identified genes in an easy-to-handle engineered microbe such as A. nidulans A1145 simplifies the confirmation of the function of the genes. By combining these techniques and streamlining the process of identifying novel compounds and their associated biosynthetic genes, we should be able to practice innovative and sustainable drug discovery that broadens the repertoire of available therapeutic agents and enables the biosynthesis of various natural product analogs.

Materials and methods

Reagents, strains and general techniques for DNA manipulation

All of the chemicals were purchased from Tokyo Chemical Industry Co. Ltd, Sigma-Aldrich and Wako Pure Chemical Industries Ltd unless otherwise specified. Purchased chemicals were of reagent grade and used without further purification. M. robertsii ARSEF 23 was obtained from the United States Department of Agriculture-Agricultural Research Service. A. nidulans A1145 was obtained from the Fungal Genetics Stock Center in the USA. Escherichia coli XL1-Blue (Agilent Technologies, Santa Clara, CA, USA) and DNA restriction enzymes (Thermo Fisher Scientific Inc., Waltham, MA, USA) were used as recommended by the manufacturers. Polymerase chain reaction was carried out using PrimeSTAR GXL DNA polymerase (TAKARA Bio Inc., Kusatsu City, Japan) as recommended by the manufacturer. Sequences of polymerase chain reaction products were confirmed through DNA sequencing (Macrogen Japan Corporation, Kyoto City, Japan). Saccharomyces cerevisiae BY4741 (ref. 27), used for homologous recombination-based molecular cloning of genes and plasmid assembly, was obtained from the Yeast Genetic Resource Center in Japan.

Purification of the natural products from M. robertsii ARSEF 23

To purify 1, M. robertsii ARSEF 23 was cultured in 10 l of YPD liquid medium at 30 °C for 8 days with shaking at 200 r.p.m. The culture was filtered to separate the liquid medium and the cells. The cells were extracted with acetone (2 × 500 ml). The acetone extracts were combined and concentrated in vacuo to give an oily residue, which was then fractionated by silica gel flash column chromatography with CHCl3/CH3OH (1:0→0:1). The fraction eluted with CHCl3/CH3OH (10:1) was further purified by reversed-phase HPLC using COSMOSIL 5C18 MS-II, 20 × 250 mm (Nacalai Tesque Inc.) with isocratic elution at 55% CH3OH (v/v) in H2O at a flow rate of 8.0 ml min–1 to afford 1 (90.4 mg). Compounds 2 (8.6 mg) and 3 (2.2 mg) were purified following essentially the same experimental procedure as described above.

Culture media and transformation of A. nidulans A1145

For isolation of natural products, M. robertsii ARSEF 23 was cultured using YPD medium (20 g l–1 peptone, 20 g l–1 glucose and 10 g l–1 yeast extract) and MYG medium (10 g l–1 malt extract, 4 g l–1 yeast extract and 4 g l–1 glucose). A. nidulans A1145 was initially grown on CD agar plates containing 10 mM uridine, 5 mM uracil, 0.5 μg ml–1 pyridoxine HCl and 2.5 μg ml–1 riboflavin at 30 °C for 5 days.8 Approximately 1 × 10^8 to 1 × 10^10 conidia collected from a single plate were used to inoculate 200 ml of CD medium containing 10 mM uridine, 5 mM uracil, 0.5 μg ml–1 pyridoxine HCl and 2.5 μg ml–1 riboflavin. This culture was shaken at 30 °C for 20 h. Grown cells were collected by filtration and washed with 0.8 M sodium chloride. The cells were incubated with 10 ml of 10 mM sodium phosphate buffer (pH 6.0) containing 0.8 M sodium chloride, 20 mg ml–1 lysing enzyme (Sigma-Aldrich, St Louis, MO, USA) and 1500 units of β-glucuronidase at 30 °C for 24 h. The resulting protoplasts were filtered and subsequently centrifuged at 2500 × g for 5 min at room temperature. The collected protoplasts were washed with 0.8 M sodium chloride and centrifuged to remove the wash solution. The protoplasts were suspended in 200 μl of STC buffer at pH 8.0 (0.8 M sodium chloride, 10 mM calcium chloride and 10 mM Tris-HCl). Then, 40 μl of PEG solution at pH 8.0 (400 mg ml–1 polyethylene glycol 4000, 50 mM calcium chloride and 50 mM Tris-HCl) was added to the protoplast suspension. The mixture was subsequently combined with 5 μg of the plasmid with which the cells were to be transformed and incubated on ice for 20 min to allow the transformation to proceed. After incubation on ice, 1 ml of the PEG solution was added to the reaction mixture, and the mixture was incubated at room temperature for an additional 5 min. The cells were plated on selective CD agar plates supplemented with 0.8 M sodium chloride as an osmotic stabilizer to select for transformants harboring the desired plasmid(s). Then, the plates were overlaid with CD medium containing 0.5% agar and incubated at 30 °C for 3 days.
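
For planning a culture, the per-liter medium compositions given at the start of this section scale linearly with volume. The short sketch below uses only the YPD and MYG recipes stated in the text; the dictionary and function names are hypothetical.

```python
# Medium compositions in grams per liter, as stated in the text.
RECIPES_G_PER_L = {
    "YPD": {"peptone": 20, "glucose": 20, "yeast extract": 10},
    "MYG": {"malt extract": 10, "yeast extract": 4, "glucose": 4},
}


def grams_needed(medium, volume_l):
    """Mass of each component (g) required to prepare volume_l liters of medium."""
    return {name: g_per_l * volume_l for name, g_per_l in RECIPES_G_PER_L[medium].items()}


if __name__ == "__main__":
    # The 10 l YPD culture used for purification of the subglutinols.
    print(grams_needed("YPD", 10))
```

For the 10 l YPD culture used for purification, this corresponds to 200 g of peptone, 200 g of glucose and 100 g of yeast extract.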

Cloning of the subglutinol biosynthetic genes

The open reading frame of each of the subglutinol biosynthetic genes was predicted based on the M. robertsii ARSEF 23 genome sequence information available from the National Center for Biotechnology Information (NCBI) database, and their predicted function was determined by comparison with known proteins using the BLAST protein sequence database search program.23 To construct vectors for expression of the genes in A. nidulans A1145, we isolated the genomic DNA from the mycelium of M. robertsii ARSEF 23 that was grown on an MYG agar plate at 30 °C for 5 days. Mycelia (100 mg) were flash-frozen in liquid nitrogen and ground with a refrigerated mortar and pestle for 2 min. The resulting cell powder was allowed to thaw into a lysate mixture. The genomic DNA was isolated from the lysate using the cetyl trimethylammonium bromide (CTAB) DNA extraction method.28 The full-length gene-coding locus was amplified by polymerase chain reaction for subsequent cloning for heterologous expression in A. nidulans A1145 (see the Supplementary Information for details).

Engineered biosynthesis of subglutinol intermediates in A. nidulans A1145

First, subA coding for an NR-PKS was inserted into pKW20093 (ref. 29), which carries a copy of the A. fumigatus orotidine-5′-phosphate decarboxylase gene AfpyrG as a selective marker, to yield pKW16020 (see the Supplementary Information and Supplementary Figure 2). The resultant plasmid was introduced into A. nidulans A1145, and the transformants were initially grown on selective CD agar30 plates containing 0.8 M sodium chloride, 0.5 μg ml–1 pyridoxine HCl and 2.5 μg ml–1 riboflavin at 30 °C for 3 days.8 The grown transformants were checked by polymerase chain reaction to confirm the presence of pKW16020 carrying subA. The spores from transformants carrying pKW16020 were used to inoculate 30 ml of fresh CD-ST medium (CD liquid medium containing 20 g l–1 starch and 20 g l–1 tryptone without glucose) containing 0.5 μg ml–1 pyridoxine HCl and 2.5 μg ml–1 riboflavin. The culture was incubated at 30 °C for 3 days and subsequently extracted with ethyl acetate to give an oily residue. The dried material was dissolved in 100 μl of N,N-dimethylformamide and subjected to UHPLC–HRMS analysis (see the UHPLC–HRMS and NMR analyses section), where a newly formed peak was identified (Figure 3). The constituent of the peak having an m/z of 141.0544 [M+H]+ was confirmed to be the α-pyrone portion of subglutinols (5) with the use of an authentic standard.25 Similarly, we constructed two additional expression vectors, pKW16025 (selective marker: A. fumigatus pyridoxal 5′-phosphate synthase, AfpyroA) carrying subC coding for a prenyltransferase and pKW16026 (selective marker: A. fumigatus GTP cyclohydrolase, AfriboB) carrying subD coding for a GG pyrophosphate synthase (see the Supplementary Information and Supplementary Figure 2). Subsequently, the three plasmids pKW16020, pKW16025 and pKW16026 were introduced into A. nidulans A1145. The grown transformants carrying all three plasmids were cultured for 3 days in CD-ST medium. We were able to detect the presence of a product having an m/z of 413.3046 [M+H]+, the expected m/z of the linear GG α-pyrone (6), in the culture by the UHPLC–HRMS analysis. For detailed characterization of the compound, we prepared a large culture of the strain. Five liters of fresh CD-ST was inoculated with 150 ml of the culture and incubated at 30 °C for 6 days using the BioFlo/CelliGen 115 fermentor system (Eppendorf Inc., Hamburg, Germany). The resultant culture was processed to obtain an oily residue following the same extraction procedure described above. This residue was fractionated by flash column chromatography, and we isolated 6 (0.1 mg) (Figure 4).
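
The choice of supplements in the selective medium follows from which auxotrophic markers the introduced plasmids carry. The sketch below illustrates that bookkeeping. The plasmid-to-marker assignments are those given above, whereas the pairing of each marker with the supplement it makes dispensable (AfpyrG with uridine and uracil, AfpyroA with pyridoxine, AfriboB with riboflavin) is an assumption inferred from the enzyme functions and from the media compositions described in this section; all names in the code are hypothetical.

```python
# Marker carried by each expression plasmid (from the text).
PLASMID_MARKER = {
    "pKW16020": "AfpyrG",   # carries subA
    "pKW16025": "AfpyroA",  # carries subC
    "pKW16026": "AfriboB",  # carries subD
}

# Supplements that each marker is assumed to make unnecessary (inferred, not stated).
MARKER_RESCUES = {
    "AfpyrG": {"uridine", "uracil"},
    "AfpyroA": {"pyridoxine HCl"},
    "AfriboB": {"riboflavin"},
}

# Supplements present in the non-selective CD medium used for untransformed A1145.
ALL_SUPPLEMENTS = {"uridine", "uracil", "pyridoxine HCl", "riboflavin"}


def supplements_for_selection(plasmids):
    """Supplements that must still be added when selecting for the given plasmids."""
    rescued = set()
    for plasmid in plasmids:
        rescued |= MARKER_RESCUES[PLASMID_MARKER[plasmid]]
    return ALL_SUPPLEMENTS - rescued


if __name__ == "__main__":
    print(sorted(supplements_for_selection(["pKW16020"])))
    print(sorted(supplements_for_selection(["pKW16020", "pKW16025", "pKW16026"])))
```

For pKW16020 alone, the function returns pyridoxine HCl and riboflavin, consistent with the selective plates described above; with all three plasmids, no supplement remains, so only triple transformants are expected to grow.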

UHPLC–HRMS and NMR analyses

UHPLC–HRMS analysis was performed with a Thermo Scientific Accela Exactive liquid chromatography mass spectrometer using both positive and negative ESI. Samples were analyzed using an ACQUITY UPLC 1.8 μm, 2.1 × 50 mm C18 reversed-phase column (Waters, Milford, MA, USA), and separated on a linear gradient of 5–100% (v/v) CH3CN in H2O supplemented with 0.05% (v/v) formic acid at a flow rate of 500 μl min−1. The results of the analysis by Thermo Scientific Xcalibur software are given in Figures 3 and 4 in the main text and Supplementary Figure 1 in the Supplementary Information. NMR spectra were obtained with a JEOL (Musashino City, Japan) JNM-ECA 500 MHz spectrometer (1H 500 MHz, 13C 125 MHz). 1H NMR chemical shifts are reported in p.p.m. using the proton resonance of residual solvent as reference: CDCl3 δ 7.26 and CD3OD δ 3.31. 13C NMR chemical shifts are reported relative to CDCl3 δ 77.16 and CD3OD δ 49.0 (ref. 31).