Abstract
The use of compounds produced by hosts or symbionts for defence against antagonists has been identified in many organisms, including in fungus-farming termites (Macrotermitinae). The obligate mutualistic fungus Termitomyces plays a pivotal role in plant biomass decomposition and as the primary food source for these termites. Despite the isolation of various specialized metabolites from different Termitomyces species, our grasp of their natural product repertoire remains incomplete. To address this knowledge gap, we conducted a comprehensive analysis of 39 Termitomyces genomes, representing 21 species associated with members of five termite host genera. We identified 754 biosynthetic gene clusters (BGCs) coding for specialized metabolites and categorized 660 BGCs into 61 biosynthetic gene cluster families (GCFs) spanning five compound classes. Seven GCFs were shared by all 21 Termitomyces species and 21 GCFs were present in all genomes of subsets of species. Evolutionary constraint analyses on the 25 most abundant GCFs revealed distinctive evolutionary histories, signifying that millions of years of termite-fungus symbiosis have influenced diverse biosynthetic pathways. This study unveils a wealth of non-random and largely undiscovered chemical potential within Termitomyces and contributes to our understanding of the intricate evolutionary trajectories of biosynthetic gene clusters in the context of long-standing symbiosis.
Introduction
Symbiotic interactions are often mediated by specialised (secondary) metabolites of symbiont origin that serve in communication, signalling, and defence1,2. The genetic machinery encoding the production of these metabolites is organised in biosynthetic gene clusters (BGCs) that usually encode all genes required for their biosynthesis3. The repertoire of compounds produced by bacterial and fungal symbionts is vast and rarely comprehensively characterised in a symbiotic or coevolutionary context. Characterising such landscapes can provide insights into the natural product repertoire of symbionts and the genetic underpinnings of selection regimes that have characterised the evolutionary histories of metabolite gene clusters of importance in symbioses4,5.
Fungus-farming termites in the subfamily Macrotermitinae (Blattodea, Termitidae) have cultivated basidiomycete fungi in the genus Termitomyces (Agaricomycetes, Lyophyllaceae) for 30 million years (Fig. 1A)6. Termite colonies maintain monoculture fungus gardens (Fig. 1B) in optimal growth conditions on a constant supply of plant substrate for fungal growth and, in return, benefit from a dependable nutritionally enriched fungal food source7,8. Over the course of 30 million years, the symbiosis has diversified to include more than 300 described termite and 52 described Termitomyces species9, displaying some degree of association specificity indicative of host-symbiont adaptation and coevolution6,10.
A A fungus-farming termite colony of Macrotermes natalensis in South Africa (Photo credit: Michael Poulsen), within which termite hosts cultivate Termitomyces in fungus combs (Photo credit: Saria Otani) (B). C Phylogeny based on 100 single-copy orthologous genes allowed for solid placement of 39 genomes of Termitomyces, for which individual strain labels are coloured by termite host genus (green: Odontotermes, blue: Macrotermes, yellow: Microtermes, pink: Ancistrotermes, brown: Pseudacanthotermes and black: unknown). Support values are ultrafast bootstrap support (left) and SH-aLRT (right). Based on species-delimitation results using the Poisson tree processes (PTP)64, genomes were assigned to 21 species, indicated by letters A-U (Supplementary Table 2). The genomes ranged in completeness based on the BUSCO Agaricales gene set (48–98%). D The BUSCO measure of overall genome completeness was only weakly associated with the number of BGCs identified per genome (10–35). The left portion of the panel provides a heatmap of predicted BGCs for each genome (Supplementary Table 3), amounting to a total of 754 BGCs (Supplementary Data 3). E Violin plot of the percentage of complete BGCs by class across all Termitomyces species.
Specialised metabolites produced by Termitomyces have been hypothesized to play roles in signalling and defence, and Termitomyces species produce a range of metabolites, including terpenes, fatty acids, indoles, polyketides (PKSs), non-ribosomal peptides (NRPs), and phenylpropanoids11,12. These classes of specialised metabolites cover a broad range of functions in fungi, from the frequent antimicrobial properties of NRPs13, the iron capture of siderophores14, intra- and inter-kingdom signalling roles of terpenes and terpenoids15, to roles of indoles in growth and development16. More than 257 bioactive specialised metabolites have been identified over the past decades, almost exclusively from chemical analyses of mushrooms17,18, fungal gardens19, and cultured isolates under different growth conditions20,21, but only a fraction of these have been tied to BGCs22. This is in part a consequence of the length of BGCs, which often precludes their identification in low-quality genomes23. With only 11 of 52 Termitomyces species having been explored, and with varying analytical approaches, it has remained unclear how consistent the biosynthetic potential is across the genus and how the evolutionary histories of BGCs have been shaped by millions of years of coevolution with termite hosts.
To obtain a robust understanding of the biosynthetic potential of Termitomyces, we improved the annotation of 22 existing genomes and generated 17 new genomes, which allowed for comparative genomics analyses of the evolutionary histories of gene clusters coding for the production of specialised metabolites across 21 Termitomyces species. We combined in silico tools with manual curation to identify BGCs that putatively synthesize compounds from five major specialised metabolite compound groups. We then established their consistency across species and evaluated their evolutionary trajectories and signatures of positive selection that could indicate an ongoing arms race between Termitomyces chemistry and antagonists. We show that a small number of Termitomyces BGCs represent a core set of specialised metabolites for the genus that has likely remained important over the course of millions of years of coevolution with termites. The variation we observe in BGC profiles between species is characteristic of gains or losses over time that could indicate ecological importance to specific termite-Termitomyces pairs, and signatures of positive selection in a subset of BGCs imply the potential for defensive functions subject to selection pressures imposed by target antagonists.
Results
High-resolution consensus phylogeny and extensive presence of BGCs across the genus
Capitalising on 22 existing and 17 newly sequenced genomes of Termitomyces and extensive optimisation of assemblies and annotation procedures, we generated a consensus phylogeny revealing that the 39 genomes spanned 21 different species – a substantial fraction of the known Termitomyces species diversity6,24 (Fig. 1C). Robust confidence scores indicated that most branch splits were well supported. This high-resolution phylogeny represents one of the most robust phylogenetic analyses of the genus to date, based on the best-quality Termitomyces genome set available, and is congruent with previously established termite-fungus co-phylogenies6,10.
The optimised annotations allowed identification of 754 BGCs (Supplementary Table 3 and Supplementary Data 3) across all strains, ranging from 10 to 35 BGCs per genome. BGCs encoding terpene biosynthesis were most abundant, with 2–19 per genome, followed by 1–11 non-ribosomal peptide synthetases (NRPS-like), 1–6 ribosomally-synthesized and fungal post-translationally modified peptides (fungal-RiPP-like), one indole-like (missing in two and fragmented in three genomes), one NRPS-independent (NI)-siderophore (missing in two and fragmented in three genomes), and one NRPS-siderophore (missing in eight genomes) (Fig. 1D). Throughout the manuscript, we refer to the compound class whose biosynthesis a BGC encodes as its “class”, e.g., BGCs encoding terpene biosynthesis are referred to as terpene BGCs25,26. Predictably, there was greater variation in the percentage of complete BGCs in each class for terpene and NRPS-like BGCs, as they were more abundant. However, the majority of BGCs were complete for each Termitomyces species within a class (Fig. 1E and Supplementary Data 3). Notably, BGCs classified as fungal-RiPP, based on the DUF3328 UstY-like HMM profile hit, should be interpreted with some caution. To establish that these BGCs are indeed fungal-RiPPs, the precursor peptide would need to be annotated25,26. We therefore tentatively consider these BGCs as “fungal-RiPP-like”. Similarly, indole BGCs were classified as such when a BGC encodes a dimethylallyltryptophan synthase (DMATS)-type prenyltransferase; however, DMATS-type prenyltransferases can also prenylate non-indole substrates27. To remain cautious about the final product of these BGCs, we refer to them as indole-like BGCs.
Grouping the diversity of detected BGCs into GCFs based on their shared domains and encoded backbone enzyme identities allowed us to distribute 660 of the 754 BGCs into 61 distinct GCFs (Fig. 2B–D, Supplementary Table 4). BGCs not assigned to a GCF, e.g., singletons or very short BGCs, were excluded from subsequent analyses. The retained GCFs represented putative BGCs encoding 32 terpene cyclases, 19 NRPS-like, seven fungal-RiPP-like, one potential indole, one NRPS-siderophore, and one NI-siderophore. The NRPS-siderophore (GCF6) had a conserved core enzyme domain architecture (adenylation(A)1-thiolation(T)1-condensation(C)1-T2-C2-T3-C3) (Fig. 3B) and subsequent sequence analysis revealed close similarity to a type VI basidiomycete siderophore synthetase. This synthetase is believed to produce the trimeric siderophore basidioferrin14,28, so that this BGC encodes an NRPS-based siderophore (NRPS-siderophore). The BGCs for the NI-siderophore (GCF2a+b) and fungal-RiPP-like (GCF5a+b) GCFs each encoded two core enzymes that were situated either on a single contig or split across two contigs (Supplementary Data 3 and 4). This separation appeared to be due to fragmented genome assemblies, as subsequent alignment and phylogenetic analyses of core and tailoring genes revealed that the members of each pair belong to the same GCF (Supplementary Data 4 and Figs. 1 and 2). The two core enzymes in the NI-siderophore both contained an N-terminal IucA/IucC family domain and a C-terminal conserved ferric iron reductase FhuF-like transporter domain, whereas the fungal-RiPP-like BGCs encode two enzymes: one with a hit to the adenylate-forming (AMP-binding) profile and one with a hit to the DUF3328 profile (Fig. 3 and Supplementary Data 4).
A Comparison of the Termitomyces species tree (left, derived from Fig. 1C) with cluster analyses of the similarity in GCF composition (Supplementary Table 4) (based on complete linkage) revealed some degree of matching. Colours behind the species tree tip label are by termite host genus: green: Odontotermes, blue: Macrotermes, yellow: Microtermes, pink: Ancistrotermes, brown: Pseudacanthotermes and black: unknown. Presence (blue) / absence (white) heatmap of GCFs (class labels at the bottom), organised by their profile similarity based on the BiG-SCAPE distance matrix67 across Termitomyces species. This revealed a set of universal GCFs across species, a set of common but not universal GCFs, sets of species-specific GCFs and a set of inconsistently distributed GCFs, including singletons. B–D Network representation of the 61 GCFs (BGCs) visualised using Cytoscape 3.9.1, where each node in a network represents a BGC in a species (labelled A-U), edges represent their relations, and colours represent BGC class (Supplementary Table 5 and Supplementary Data 4). B Networks of the seven GCFs whose member BGCs were identified in all 21 Termitomyces species. Networks 2a + 2b (the siderophore) and 5a + 5b appeared to be distinct; however, further analysis showed each pair to respectively be a single GCF (see text for details). C Networks of 21 terpene, NRPS-like, and fungal-RIPP-like GCFs for which member BGCs were present in only a subset of Termitomyces species but where all 1–4 genomes of a given species had the BGC. D Networks of the remaining 33 GCFs which contained BGCs that were only present in a subset of genomes within a species.
A Box plot showing the percentage of genes with positive selection across BGC classes (left) and 25 GCFs ordered from highest to lowest mean of genes with positive selection (right). Whiskers extend to 1.5*Interquartile Range (IQR). NRPS and NRPS-like BGCs generally contain genes experiencing more positive selection than other BGC classes (Supplementary Table 6 and Supplementary Data 6). B Gene cluster structures of the nine GCFs with most genes with positive selection, of which two (GCFs 6 and 7) displayed consistent positive selection in the same gene in every BGC assigned to the GCF. GCFs are displayed as consensus representations of their contained BGCs. The number of species that contain a gene is indicated in the box denoting the gene in the cluster. The type of gene as assigned by fungiSMASH is indicated by colour. Genes with positive selection are indicated with bar plots of the percentage of genes displaying signatures of positive selection. If one or two species have positive selection for a given gene, the species identities are indicated with icons (A–U) (Supplementary Data 4). This is only noted if positive selection was present in 50% or more of the genomes for a given species. In each representation of a BGC we show the identified protein families obtained from the Pfam database; a = more than one known domain was present within the gene. All Pfam annotations are given in Supplementary Data 4. DUF = Domain of unknown function. Significance levels for positive selection analyses are: * = p < 0.05, ** = p < 0.01, and *** = p < 0.001. The remaining GCFs that were analysed for positive selection, but had no or fewer branches under positive selection, are available in Fig. S4.
Consistent GCFs across the Termitomyces phylogeny included one indole-like, one NI-siderophore, one NRPS-like, one fungal-RiPP-like, and three terpene-encoding GCFs that were identified in all 21 Termitomyces species (Fig. 2A + B, Supplementary Tables 4 and 5), implying that identical or very similar specialised metabolites are universally present across the genus. Twenty-one GCFs were only present in a subset of Termitomyces species (Fig. 2C), but they were represented in all genomes sequenced for each of these species (excluding genomes with BUSCO score <60%). These GCFs thus likely exist only in these species and their absence in other species is unlikely to be a product of poor annotation or assembly (Fig. 2, Supplementary Table 5 and Supplementary Data 3). Of these, GCFs 33, 34, 44, 60, and 65 were only present in Termitomyces species cultivated by Macrotermes spp. The remaining 33 GCFs were inconsistently detected across the diversity of Termitomyces (Fig. 2D). Given that the genomes were incomplete, we cannot rule out that some GCFs exist in more genomes than our analyses indicate. Although GCF profiles were also significantly affected by genome completeness (BUSCO score) (PERMANCOVA: F1,11 = 2.427, R2 = 0.0313, p = 0.0242), termite host (F4,11 = 6.535, R2 = 0.3369, p < 0.001) and Termitomyces species (F13,11 = 2.860, R2 = 0.4859, p < 0.001) more strongly impacted profiles.
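The three distribution categories described above (GCFs found in all species, GCFs found in only some species but in every genome of those species, and inconsistently detected GCFs) can be illustrated with a minimal Python sketch. The function name, the input layout and the category labels are hypothetical and are not part of the published pipeline; the exclusion of genomes with a BUSCO score below 60% is omitted for brevity.

```python
from collections import defaultdict

def categorise_gcfs(presence, genome_species):
    """Assign each GCF to a distribution category (illustrative only).

    presence: dict mapping GCF name -> set of genome ids carrying a member BGC.
    genome_species: dict mapping genome id -> species label (e.g. "A".."U").
    """
    # Collect the genomes that belong to each species
    species_genomes = defaultdict(set)
    for genome, species in genome_species.items():
        species_genomes[species].add(genome)

    categories = {}
    for gcf, genomes in presence.items():
        # Species in which at least one genome carries the GCF
        hit_species = {genome_species[g] for g in genomes}
        # True if every genome of each of those species carries the GCF
        complete_within = all(species_genomes[s] <= genomes for s in hit_species)
        if hit_species == set(species_genomes):
            categories[gcf] = "universal"
        elif complete_within:
            categories[gcf] = "species-consistent"
        else:
            categories[gcf] = "inconsistent"
    return categories
```

Here "universal" requires only that each species has at least one genome with a member BGC, which is one possible reading of "identified in all 21 Termitomyces species"; a stricter per-genome criterion would be an equally reasonable choice.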
BGCs with signatures of positive selection
Significant signatures of positive selection were absent for the vast majority of BGC genes belonging to the 25 GCFs we tested (143 of 216 orthologs; aBSREL, p > 0.05) (Supplementary Table 6 and Supplementary Data 5). The remaining 72 orthologs had significant (p < 0.05) dN/dS ratios (ω > 1) on specific gene branches, suggesting positive selection (Supplementary Data 6). Among the 25 tested GCFs, genes originating from the single indole-like and 14 terpene biosynthesis-encoding BGCs generally experienced less positive selection than genes from the single NI-siderophore, fungal-RiPP-like, NRPS-siderophore and seven NRPS-like (Fig. 3A) BGCs, as evident from lower omega values (Supplementary Data 6). This suggests that the indole-like and terpene BGCs are more conserved than, e.g., those of the NRPS and NRPS-like GCFs highlighted in Fig. 3B. We observed evidence for gene-wide positive selection in a core gene of all BGCs in GCF6 and an unknown gene in GCF7 (Fig. 3B). Evidence for positive selection was, at times, limited to genes from specific species, such as gene 18 in the NRPS-siderophore GCF6, where positive selection was detected in Termitomyces sp. L (Fig. 3B, Supplementary Data 4). Similar patterns were identified in GCF1 (spp. K and J), GCF5 (spp. F, and H and J), GCF9 (spp. D and E), GCF11 (spp. K and J), GCF29 (spp. L, M and N), GCF30 (sp. E) and GCF44 (sp. L) (Fig. 3B and Supplementary Fig. 3).
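The decision rule stated above (a branch is taken to show positive selection when its test is significant at p < 0.05 and ω = dN/dS exceeds 1) can be sketched as follows. This is a minimal illustration, not a parser for aBSREL output: the function name and the input layout of per-branch (ω, p) pairs are assumptions made for the example.

```python
def summarise_selection(branch_results, alpha=0.05):
    """Flag orthologs with at least one branch under positive selection.

    branch_results: dict mapping ortholog id -> list of (omega, p_value)
    tuples, one per tested branch (hypothetical layout).
    Returns (flags, fraction), where flags maps ortholog id -> bool.
    """
    flags = {}
    for ortholog, branches in branch_results.items():
        # A branch counts only if it is significant AND omega exceeds 1
        flags[ortholog] = any(p < alpha and omega > 1.0 for omega, p in branches)

    # Fraction of orthologs with at least one selected branch
    fraction = sum(flags.values()) / len(flags) if flags else 0.0
    return flags, fraction
```

Applied per GCF, the returned fraction corresponds to the kind of "percentage of genes with positive selection" summarised in Fig. 3A, under the assumption that each gene is represented by one ortholog entry.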
Similarity to biochemically characterised BGCs
Using the MIBiG database, we were able to identify two GCFs for which the constituent BGCs had high similarity matches. First, the NRPS-siderophore (GCF6) showed high similarity to a synthetase forming the siderophore basidioferrin, which is widely distributed in basidiomycete fungi14. Second, GCF27 had high similarity to a BGC for (+)-δ-cadinol, which has also previously been identified in a basidiomycete29 (Supplementary Table 4) and shows cytotoxic activity30. Both BGCs were also identified in the closest known relative of Termitomyces (A. matolae; GCA_018855395). Many of the biochemically characterised terpene cyclases identified in Termitomyces have yet to be added to the MIBiG database22,28. Our manual similarity assessment demonstrated matches to four terpene BGCs (Supplementary Table 4), including GCF29, which includes BGCs encoding the biosynthesis of (+)-germacrene D-4-ol and is present in all Termitomyces species. (+)-Germacrene D-4-ol is typically produced by plants, where it has been linked to antimicrobial properties31,32, and a study has linked it to roles as an insect pheromone33. Our analysis also uncovered BGCs in GCF37 in five species of Termitomyces that encode (-)-δ-cadinene28,34. GCF34, which was only identified in Macrotermes spp.-associated Termitomyces, contained BGCs previously shown to have bifunctional activity, capable of transforming geranyl pyrophosphate and farnesyl pyrophosphate into numerous terpenes, most of which have been identified within the fungus comb volatilome, including camphene and d-limonene28.
Discussion
Improved genomes allowed robust phylogenomics and elucidation of a core set of Termitomyces BGCs
By significantly enhancing the annotation of publicly available genomes and generating new high-quality annotated Termitomyces genomes, we established the most robust phylogenomic analysis of the genus to date. This allowed solid species assignment of 39 genomes to 21 species and the identification of 754 BGCs distributed across 61 GCFs that we could analyse in an evolutionary context. These efforts increased the available genomes for the genus from 28 to 45, spanning the five termite host genera Macrotermes, Microtermes, Ancistrotermes, Odontotermes, and Pseudacanthotermes, and will serve as an important genomics resource to address new questions in the termite-fungus symbiosis. Despite previous reports of polyketides in Termitomyces12, our analysis found no BGCs encoding polyketide synthases (PKSs). This implies that PKS BGCs were either not identified with the pipeline we employed or that they were recognised as other types of BGCs. Future work is needed to improve the bioinformatic identification and classification of PKS BGCs, and to develop more comprehensive databases. Experimental validation of predicted BGCs and their associated compounds will also be crucial to understand the chemical identities of the products of the BGCs uncovered. This further underlines that although our approach provides a comprehensive overview, it does not allow a complete account of the chemical diversity encoded in the genus.
We identified a core set of seven GCFs that were consistently present across the genus, and future improvements of genomes from sequenced species, as well as the sequencing of new species, will likely expand this set. As a case in point, GCF31 (encoding a terpene synthase), GCF6 (NRPS-siderophore), and GCF7 (NRPS-like) were consistently present across the Termitomyces phylogeny, yet missing in, respectively, one, two and seven fungal species, suggesting that improved genomes will fill these gaps. We therefore cautiously conclude that at least 10 GCFs are consistently present across the Termitomyces genus, encoding an indole-like, a NI-siderophore, an NRPS (presumably basidioferrin), a fungal-RiPP-like, two NRPS-like compounds, and four terpenes (two of which were putatively identified to synthesise (+)-δ-cadinol and (+)-germacrene D-4-ol). Of these BGCs, only two have previously been identified in other basidiomycete fungi (GCF6 and GCF27). The maintenance of a core set of BGCs across more than 30 million years of coevolution with termite hosts suggests important functions, and future work should prioritise deciphering roles of their products in symbiotic interactions with hosts or in defence.
BGC compositions are non-random across the phylogenetic history of the genus
Beyond the core set of GCFs, we identified a further 21 that were only present in subsets of Termitomyces species, indicating that some gene sequences linked to specialised metabolite syntheses likely serve specific roles in some species and in interactions with hosts. These were dominated by BGCs encoding terpenes (13) but also five NRPS-like and three fungal-RiPP-like BGCs. The species specificity in these BGCs is intriguing and may have arisen from species-specific losses of ancestrally universal BGCs or from gains over evolutionary time through horizontal gene transfers from other microorganisms. However, given that the assignments of several of the GCFs are based on only few genomes, it will be important in future work to verify this species specificity, in concert with determining the ecological factors that have led to their origins and persistence.
In line with species specificities and events of BGC gains and losses, we found that GCF profiles were to some extent congruent with the Termitomyces phylogeny, although with several instances of incongruence, suggesting either recent divergence or incomplete predictions. Despite these discrepancies, the patterns of broad-scale association specificity between termite host genera and Termitomyces imply some degree of host-specificity in BGC segregation and biosynthetic potential of Termitomyces. The most notable examples of this include five terpene clusters (GCFs 33, 34, 44, 45, and 60) and a fungal-RiPP-like cluster (GCF65) that were exclusive to Termitomyces cultivated by Macrotermes spp. Similarly, an NRPS-like compound (GCF24) and five terpenes (GCFs 45, 46, 47, 51, and 57) were only present in the Termitomyces sister species J and K that are cultivated by Ancistrotermes and Microtermes6,35. As alluded to above, such patterns may arise from losses in all other Termitomyces species, should they have been present in the most recent common ancestor of the genus, or via gains over evolutionary time. Irrespective of their origin, the patterns imply that there is not a universal recipe for the biosynthetic capacities associated with being a termite cultivar. Elucidating the idiosyncrasies that surround uniqueness in biosynthetic potential will be important to improve our understanding of the fundamental biology of host-symbiont interactions and cultivar roles.
Potential for distinct evolutionary trajectories of different BGC classes
Although some BGCs contained genes subject to positive selection based on dN/dS ratios, the selection analyses indicated that most did not, and no GCF contained BGCs that experienced positive selection on all genes. Our analyses did, however, reveal 18 GCFs with significant positive selection on at least one site or branch. This was primarily within the NRPS-like clusters, which may reflect that their products often are antimicrobial and therefore may engage in arms-race dynamics with pathogens, where positive selection could generate novel chemistry for antimicrobial defence. Furthermore, gene-specific positive selection was inferred to varying degrees. For example, the finding that GCFs 6 and 7 experienced gene-wide positive selection acting consistently on specific genes could indicate that particular gene functions within clusters play adaptive roles.
The majority of the 25 GCFs subjected to positive selection analyses experienced multiple episodic positive selection events within their representative BGCs. In some instances, episodic positive selection events appeared to be associated with termite host species. For example, positive selection confined to a single gene in the NRPS-like GCF11 was inferred only in Termitomyces species associated with Ancistrotermes. Similarly, BGCs in Macrotermes-associated Termitomyces that encode the biosynthesis of GCF4 (NRPS-like) and GCF29 (terpene), which are both present in all species of Termitomyces, exhibited episodic positive selection events on non-core genes. This suggests termite host-dependent effects on specialised metabolites through specific ecological pressures that are then experienced by given fungal species. BGCs with evidence for positive selection may be of particular interest for further characterisation of the produced compound for the discovery of chemical novelties and compounds with antimicrobial properties.
GCFs with no or very few genes under positive selection could either reflect conserved functions or alternative avenues to chemical novelty. The latter may be the case for several terpene-encoding BGCs that overall displayed lower rates of inferred positive selection, but for which it has been established that a single BGC can give rise to extraordinary chemical diversity28. For example, a single terpene cyclase in Termitomyces species L, M, and N (based on our species assignment) allows for the production of more than 20 different compounds28. Further understanding of the evolution of these gene clusters, the functions of their genes, and the compounds they encode will be needed to determine the relative roles of positive selection and alternative avenues for chemical novelty in fungus-farming termite cultivars.
Conclusions and perspectives
Through significant improvement of genomes across the phylogenetic history of the genus Termitomyces, our comparative genomics unravelled a rich set of biosynthetic gene clusters encoded by the fungal cultivar of farming termites. BGC profiling allowed us to establish a core set of BGCs that likely has been present since the origin of fungiculture in termites 30 million years ago, as well as indications of BGC gains and losses over evolutionary time. Based on our positive selection analyses, our findings suggest that different compound classes may be subject to distinct evolutionary trajectories. Specifically, our findings suggest that NRPS and NRPS-like gene clusters are subject to more frequent consistent or episodic positive selection than e.g., terpenes. Chemical novelties may nevertheless occur in the latter, where substantial chemical diversity may arise from just a single conserved gene cluster28. Collectively, this indicates that millions of years of termite-fungus symbiosis have led to rich biosynthetic potential with distinct evolutionary trajectories of biosynthetic gene clusters and ample chemical novelties.
The vast non-random and largely undescribed chemical potential in Termitomyces implies a rich potential for future discoveries of specialised metabolites. Only five of the GCFs we describe encode metabolites or analogues of metabolites that have been characterized. The ecological function of these compounds for the fungus and potential roles in interactions with termite hosts remain largely unknown. Our evolutionary-guided approach serves as an important framework to further our understanding of both universally-present and unique chemistry in the fungal genus, and broadly for detailing identities and ecological roles of specialised metabolites in the ecology of fungal species and host-specific contexts. The vast roles and activities these natural products likely represent provide a rich potential to elucidate the chemical ecology of farming-termite symbiosis and the roles of natural products in interactions and defence in ecological and evolutionary contexts.
Materials and methods
Termitomyces genomes
To secure a comprehensive set of genomes that span the diversity of Termitomyces and termite hosts, we compiled genomes from 22 strains of published work (Supplementary Data 1) and sequenced 17 additional genomes of Termitomyces obtained from termite colonies collected in the Comoé and Lamto field stations in Cote d’Ivoire in 2018 and 2019 (Supplementary Data 1). New isolates were obtained by placing nodules (asexual structures produced within fungus combs) of Termitomyces on Potato Dextrose Agar (PDA; 39 g/l) and subcultured until pure.
The resulting 39 Termitomyces genomes spanned at least ten species (five genera) of termite hosts, but conceivably more since host origin was unknown for eight strains (Supplementary Data 1). Isolates from known hosts included one strain from Pseudacanthotermes sp., one from Ancistrotermes sp., one from Ancistrotermes guinensis, three from Ancistrotermes cavithorax, two from Odontotermes transvaalensis, two from Odontotermes cf. badius, two from unknown Odontotermes species, six from Macrotermes bellicosus, three from Macrotermes subhyalinus, five from Macrotermes natalensis, one from Macrotermes gilvus, and four from Microtermes spp. (Supplementary Data 1). Termite species were confirmed with barcoding as described by Zaman et al. (Supplementary Data 1)36.
Strain genotyping
New isolates were verified as Termitomyces by barcoding of the Internal Transcribed Spacer (ITS) region37. DNA was extracted using the Chelex protocol as described by Conlon (2022)38. PCR was performed using basidiomycete-specific primers ITS1F and ITS4B39 following the protocol described by Schmidt et al.40. Purified PCR products were sent to Eurofins MWG Operon (Ebersberg, Germany) for sequencing. Forward and reverse sequences were aligned in Geneious prime version 2019.1.1 (Biomatters Ltd., New Zealand) using Geneious’ own algorithm after primers and low-quality ends were trimmed. Sequences were then blasted against the NCBI database to confirm their identity.
Genome sequencing, assembly, and annotation
We extracted DNA from the 17 Termitomyces isolates using a CTAB extraction optimised for high yield and fragment length, with an initial fast-freeze step to increase yield38. Whole-genome sequencing was performed using a combination of 100 bp/150 bp paired-end shotgun (BGISEQ/DNBSEQ) and long-read (PacBio Sequel) sequencing by BGI. Short reads were filtered to a phred score of 30 using bbduk.sh v38.89 from the BBtools package (BBMap). PacBio long reads and BGISEQ/DNBSEQ short reads were hybrid assembled with SPAdes (v.3.13.0)41,42 using the isolate mode. The resulting contigs were improved and re-scaffolded via RagTag (v. 2.1.0)43 using the low error rate short read sequences and the high-quality Termitomyces draft assembly “Termitomyces_v3.0”44. Assembly quality was assessed by quantifying genome completeness based on the expected gene content of the Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.0.045 against the database for Agaricales-odb10 genomes. Genome assembly statistics are available in Supplementary Data 2. In addition, we obtained 22 assembled genomes from GenBank. We used these 39 genomes to build a transposable element library and model with RepeatModeler v2.0.146, which was subsequently used by RepeatMasker v4.1.147 to soft mask all genomes with the rmblastn engine on sensitive mode, skipping bacterial insertions. We annotated the 39 soft masked genomes with Braker2 v2.1.648 using OrthoDB's odb10_fungi database49 with fungi mode and skipping the fixing of broken genes48,49,50,51,52,53. We retained the longest isoforms of the braker.gtf output, filtering out all other isoforms with ProtHint's print_longest_isoform.py54. An EMBL annotation file ready for the fungal version of antiSMASH (fungiSMASH) v7.055 was generated by converting the gtf output to gff3 and then to EMBL.
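The isoform filtering step above amounts to keeping, for each gene, the single transcript with the greatest length. The following minimal sketch shows that selection logic on already-parsed records; it is not the ProtHint script itself, and the function name and the (gene, transcript, length) input layout are assumptions made for illustration.

```python
def longest_isoforms(transcripts):
    """Return the longest transcript per gene (illustrative only).

    transcripts: iterable of (gene_id, transcript_id, cds_length) tuples.
    Ties are resolved in favour of the transcript seen first.
    """
    best = {}
    for gene_id, transcript_id, length in transcripts:
        current = best.get(gene_id)
        # Replace the stored isoform only if this one is strictly longer
        if current is None or length > current[1]:
            best[gene_id] = (transcript_id, length)
    return {gene: transcript for gene, (transcript, _) in best.items()}
```

Retaining one isoform per gene keeps downstream ortholog and BGC counts from being inflated by alternative transcripts of the same locus.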
Consensus phylogeny and species assignment of genomes
A total of 6807 orthologous groups (Supplementary Data 2) were inferred with OrthoFinder v2.5.456 across the 39 genomes and Arthromyces matolae (GCA_018855395)10, a species from the sister genus of Termitomyces. Single-copy orthogroups present in at least 90% of the species were aligned with MAFFT v. 7.45357 in auto mode, followed by gene tree generation with IQ-TREE v2.1.358 using ModelFinder (-m MFP)59. These gene trees were used to infer a species tree with ASTRAL-Pro from ASTRAL v5.7.160. The species tree and all gene trees served as input to GenesortR61 to evaluate the phylogenetic information of the orthologs and subsample 100 genes (Supplementary Data 2), thereby minimizing potential sources of systematic bias62,63. A matrix of these 100 genes was subsequently used by IQ-TREE v2.1.3 to reconstruct a final phylogeny using ModelFinder (-m MFP) and 1000 ultrafast bootstrap replicates (-B 1000)1 to estimate the support of each node. We grouped genomes according to putative species boundaries using the PTP (Poisson Tree Processes) model via the PTP web server64 (Supplementary Table 2). This allowed unambiguous assignment of genomes into 21 species for subsequent BGC comparisons (Fig. 1C).
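As a rough illustration of the orthogroup filter, the sketch below selects groups that are single copy and present in at least 90% of the species. The input layout (a nested dictionary of copy counts) and the function name single_copy_orthogroups are hypothetical, and the reading of the criterion (no species may carry more than one copy, while up to 10% of species may lack the gene) is an assumption rather than a description of the authors' script.

```python
from typing import Dict, List

# Hypothetical input: orthogroup -> {species: number of gene copies}.
CopyTable = Dict[str, Dict[str, int]]


def single_copy_orthogroups(
    counts: CopyTable, species: List[str], min_fraction: float = 0.9
) -> List[str]:
    """Select orthogroups with no duplicated species and sufficient coverage."""
    selected = []
    for orthogroup, per_species in counts.items():
        copies = [per_species.get(sp, 0) for sp in species]
        if any(c > 1 for c in copies):
            continue  # a paralogue in any species disqualifies the group
        present = sum(1 for c in copies if c == 1)
        if present / len(species) >= min_fraction:
            selected.append(orthogroup)
    return selected


if __name__ == "__main__":
    # Toy example with ten made-up species labels.
    species = [f"sp{i}" for i in range(10)]
    counts = {
        "OG_all": {sp: 1 for sp in species},
        "OG_missing_two": {sp: 1 for sp in species[:8]},
        "OG_duplicated": {**{sp: 1 for sp in species}, "sp0": 2},
    }
    print(single_copy_orthogroups(counts, species))
```

Allowing a small fraction of missing species keeps more loci available for the gene-tree step when some assemblies are fragmented.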
In silico analysis of Biosynthetic Gene Clusters (BGCs) and Gene Cluster Families (GCFs)
To identify putative BGCs, we uploaded genomes to antiSMASH fungal version (v7.0)55 with default parameters (Supplementary Table 3 and Supplementary Data 3). BGCs were deemed incomplete if they were positioned close to the edge of a contig65,66. BGCs positioned in the middle of a sequence thus represent BGCs with presumed known boundaries and an accompanying complete set of tailoring enzymes (Supplementary Data 3). To reduce the complexity of the BGC dataset, GenBank files from the fungiSMASH analysis were assigned to biosynthetic gene cluster families (GCFs; i.e., families of BGCs that share a similar organization and likely code for the biosynthesis of the same or similar compounds) through a pairwise distance analysis using the BiG-SCAPE network prediction software67 with default settings (Supplementary Tables 4 and 5). GCFs were illustrated through sequence similarity networks using Cytoscape 3.8.0 (Shannon et al., 2003). All homologous enzymes within each GCF were aligned using MUSCLE in Geneious Prime version 2019.1.1. Phylogenetic trees for each GCF were built using RAxML with the rapid bootstrapping algorithm and a search for the best-scoring maximum likelihood tree.
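The contig-edge criterion for flagging potentially incomplete BGCs can be sketched as a simple coordinate check. The text does not state how close to a contig end a BGC had to be, so the margin of 1000 bp, the coordinate convention, and the function name near_contig_edge below are all assumptions for illustration only.

```python
def near_contig_edge(
    start: int, end: int, contig_length: int, margin: int = 1000
) -> bool:
    """Return True if a BGC region lies within `margin` bp of a contig end.

    Coordinates are assumed to be 0-based and half-open: [start, end).
    The default margin is a hypothetical value, not one taken from the study.
    """
    if not (0 <= start < end <= contig_length):
        raise ValueError("region must lie within the contig")
    # Distance to the left end is `start`; to the right end, `contig_length - end`.
    return start < margin or (contig_length - end) < margin


if __name__ == "__main__":
    # Toy coordinates on a made-up 100 kb contig.
    print(near_contig_edge(500, 20_000, 100_000))     # region starts near the left end
    print(near_contig_edge(30_000, 60_000, 100_000))  # region in the middle of the contig
```

A BGC flagged in this way may be truncated by the assembly, so missing tailoring genes in such a cluster should not be read as genuine gene loss.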
To improve BGC annotations, we manually curated outputs from fungiSMASH and BiG-SCAPE by alignment of all genes within each putative BGC from each network together with all BGCs without an assigned GCF, i.e., singletons. This was done to 1) ensure that all BGCs were assigned to a GCF correctly, 2) organise all genes within a GCF to correct for rearrangements and enable comparison of gene similarity and BGC completeness, and 3) identify putative matches to protein domain families using the Pfam database68. To allow cautious interpretation of the final products of these BGCs, Supplementary Data 3 provides an overview of all information regarding the BGC, GCF number, Termitomyces species, BGC class, BGC length, number of genes within a BGC, and whether a BGC is close to a contig edge. The resulting GCF profiles (i.e., the collective of BGCs in a genome belonging to different GCFs) are available in Supplementary Data 4. Comparative analysis of GCF profiles was undertaken by modelling the effect of termite host genus (excluding genomes with unknown hosts) and Termitomyces species on GCF presence with a PERMANOVA on pairwise Bray-Curtis distances between GCF profiles using adonis2 from the vegan package v2.6-269.
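The Bray-Curtis distance underlying the PERMANOVA can be written out in a few lines. This is a minimal sketch and not the vegan implementation: the function names, the toy profiles, and the convention of returning 0 for two empty profiles are assumptions, and the permutation test performed by adonis2 is not reproduced here.

```python
from typing import Dict, Sequence, Tuple


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # convention chosen here for two empty profiles
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def pairwise_distances(
    profiles: Dict[str, Sequence[float]]
) -> Dict[Tuple[str, str], float]:
    """Distances for every unordered pair of genomes, keyed by sorted name."""
    names = sorted(profiles)
    return {
        (p, q): bray_curtis(profiles[p], profiles[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy GCF count profiles (columns = GCFs); genome names are made up.
    toy = {"genome_A": [1, 0, 2], "genome_B": [1, 1, 0], "genome_C": [1, 0, 2]}
    for pair, distance in pairwise_distances(toy).items():
        print(pair, round(distance, 3))
```

With presence/absence data the same formula reduces to the Sørensen dissimilarity, so identical GCF profiles give 0 and profiles with no shared GCFs give 1.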
We further compared BGCs to chemically characterised BGCs from the Minimum Information about a Biosynthetic Gene Cluster (MIBiG v3.0) repository34, following a published pipeline28,70. We also compared terpene core enzymes to previously identified terpenes28 to see if we could link GCFs to previously characterised BGCs. Consensus graphical representations of a GCF as a single BGC (Fig. 3B and Supplementary Figs. 1 and 2) were created using Adobe Illustrator based on the BGC structure shown in the fungiSMASH output and the manual MUSCLE alignments of individual genes (using Geneious Prime version 2019.1.1) (Fig. 3). In each representation of a BGC, we show the protein families identified from the Pfam database68 to highlight which domain(s) each gene carries, i.e., its domain architecture (Fig. 3 and Supplementary Data 4).
Assessment of BGC genes under positive selection
To find signatures of selection on genes within BGCs of the 25 most abundant GCFs, we first assessed the orthology relationship between the genes from all BGCs comprising a GCF using OrthoFinder v2.5.456. Then, the nucleotide coding sequences from the genes in orthogroups present in at least five genomes were extracted with exonerate v.2.4.071 and aligned independently for each orthogroup using the PRANK v.170427 codon model72. Considering the divergence and some degree of fragmentation in the genomes, alignment quality was assessed using Zorro73, and unreliable positions were filtered. We then ran hmmcleaner v0.180750 to identify alignment errors and subsequently mask them74. For each orthogroup, we inferred a gene tree from the codon alignment in IQ-TREE v2.1.358, with ModelFinder to determine the best-fit model (“-m MFP”) and 1000 ultrafast bootstrap replicates (“-B 1000”). To test for positive selection occurring across lineages, we used the adaptive Branch-Site Random Effects Likelihood (aBSREL) model implemented in the HyPhy package using the codon alignment and gene tree as input75,76. This model fits an optimal number of ω (ratio of nonsynonymous (dN) to synonymous (dS) substitution rates) rate classes for each branch and allows inference of positive selection in specific lineages when ω > 1. In addition, we tested for relaxation or intensification of the strength of positive selection using the RELAX method in HyPhy77. The p-values from the selection analysis were adjusted for the false discovery rate (FDR; Benjamini and Hochberg 1995) (Supplementary Data 5 and 6). Individual gene trees for the orthogroups with evidence for positive selection are available in Fig. 3 and Supplementary Fig. 3.
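The Benjamini-Hochberg adjustment applied to the selection-test p-values can be sketched in plain Python as follows. The function name and the example p-values are hypothetical, and this is a generic implementation of the procedure rather than the code used in the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that each adjusted
    # value is capped by those of all larger ranks (monotonicity) and by 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values, e.g. one per tested branch or orthogroup.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  adjusted = {q:.3f}")
```

A test is then called significant when its adjusted value falls below the chosen threshold, which controls the expected proportion of false positives among the reported branches.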
Statistics and reproducibility
This study presents the first extensive analysis of biosynthetic potential within the Termitomyces phylogeny, using a sample of 39 genomes sourced from both field collections and existing records in GenBank. Statistical analyses were performed using methods that uphold the standards of validity and reproducibility, with significance thresholds set at *P < 0.05, **P < 0.01, and ***P < 0.001. To support the reproducibility of our findings, we documented data sources, processing steps, and analytical pipelines. Additionally, all code and scripts used in this analysis are publicly accessible online.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Fungal genomes, termite COII and fungal ITS Sanger sequences have been deposited in GenBank under BioProject PRJNA1111472 with the accession numbers SAMN41392512-SAMN41392528 (genome assemblies), PP239693-PP239709 (ITS), and PP833543-PP833559 (COII). Genome annotations have been deposited on Zenodo (https://doi.org/10.5281/zenodo.13268707).
Code availability
Code used for analyses is available on GitHub: https://github.com/Rob-murphys/Termitomyces_BGC_potential. Requests for additional information can be directed to the corresponding author, Suzanne Schmidt. Code as it existed at the time of publication is available at https://doi.org/10.5281/zenodo.13735717 (ref. 78).
References
Appalasamy, S., Diyana, M. H. A., Arumugam, N. & Boon, J. G. Evaluation of the chemical defense fluids of Macrotermes carbonarius and Globitermes sulphureus as possible household repellents and insecticides. Sci. Rep.-Uk 11, 153 (2021).
Kuswanto, E., Ahmad, I., Putra, R. E. & Harahap, I. S. Two Novel Volatile Compounds as the Key for Intraspecific Colony Recognition in Macrotermes gilvus (Isoptera: Termitidae). J. Entomol. 12, 87–94 (2015).
Tran, P. N., Yen, M. R., Chiang, C. Y., Lin, H. C. & Chen, P. Y. Detecting and prioritizing biosynthetic gene clusters for bioactive compounds in bacteria and fungi. Appl Microbiol Biot. 103, 3277–3287 (2019).
Kaltenpoth, M., Yildirim, E., Gurbuz, M. F., Herzner, G. & Strohm, E. Refining the Roots of the Beewolf-Streptomyces Symbiosis: Antennal Symbionts in the Rare Genus Philanthinus (Hymenoptera, Crabronidae). Appl Environ. Micro. 78, 822–827 (2012).
Engl, T. et al. Evolutionary stability of antibiotic protection in a defensive symbiosis. P Natl Acad. Sci. USA 115, E2020–E2029 (2018).
Aanen, D. K. et al. The evolution of fungus-growing termites and their mutualistic fungal symbionts. P Natl Acad. Sci. USA 99, 14887–14892 (2002).
Murphy, R. et al. in Assessing the Microbiological Health of Ecosystems (ed Christon J. Hurst) Ch. 8, 185–203 (John Wiley & Sons Inc., 2023).
Poulsen, M. et al. Complementary symbiont contributions to plant decomposition in a fungus-farming termite. P Natl Acad. Sci. USA 111, 14500–14505 (2014).
Roskov, Y. et al. Species 2000 & ITIS Catalogue of Life, 2019 Annual Checklist. Digital resource (Species 2000: Naturalis, Leiden, the Netherlands, 2019).
van de Peppel, L. J. J. et al. Ancestral predisposition toward a domesticated lifestyle in the termite-cultivated fungus Termitomyces. Curr. Biol. 31, 4413–4421 (2021).
Hsieh, H. M. & Ju, Y. M. Medicinal components in Termitomyces mushrooms. Appl. Microbiol. Biot. 102, 4987–4994 (2018).
Schmidt, S., Kildgaard, S., Guo, H., Beemelmanns, C. & Poulsen, M. The chemical ecology of the fungus-farming termite symbiosis. Nat. Prod. Rep. 39, 231–248 (2022).
Oide, S. & Turgeon, B. G. Natural roles of nonribosomal peptide metabolites in fungi. Mycoscience 61, 101–110 (2020).
Brandenburger, E. et al. A Highly Conserved Basidiomycete Peptide Synthetase Produces a Trimeric Hydroxamate Siderophore. Appl. Environ. Micro. 83, e01478–17 (2017).
Schmidt, R. et al. Fungal volatile compounds induce production of the secondary metabolite Sodorifen in PRI-2C. Sci. Rep.-Uk 7, 862 (2017).
Fu, S. F. et al. Indole-3-acetic acid: A widespread physiological code in interactions of fungi with other organisms. Plant Signal Behav. 10, e1048052 (2015).
Borokini, F. et al. Chemical profile and antimicrobial activities of two edible mushrooms (Termitomyces robustus and Lentinus squarrosulus). J. Micro. Biotec. Food 5, 416–423 (2016).
Mahamat, O., André-Ledoux, N., Chrisopher, T., Mbifu, A. A. & Albert, K. Assessment of antimicrobial and immunomodulatory activities of termite associated fungi, Termitomyces clypeatus R. Heim (Lyophyllaceae, Basidiomycota). Clin. Phytosci. 4, 28 (2018).
Otani, S. et al. Disease-free monoculture farming by fungus-growing termites. Sci. Rep. 9, 8819 (2019).
Kreuzenbeck, N. B. et al. Isolation, (bio)synthetic studies and evaluation of antimicrobial properties of drimenol-type sesquiterpenes of fungi. Commun. Chem. 6, 79 (2023).
Yang, G. et al. Termitomyces heimii Associated with Fungus-Growing Termite Produces Volatile Organic Compounds (VOCs) and Lignocellulose-Degrading Enzymes. Appl. Biochem. Biotechnol. 192, 1270–1283 (2020).
Burkhardt, I., Kreuzenbeck, N. B., Beemelmanns, C. & Dickschat, J. S. Mechanistic characterization of three sesquiterpene synthases from the termite-associated fungus Termitomyces. Org. Biomol. Chem. 17, 3348–3355 (2019).
Blin, K., Kim, H. U., Medema, M. H. & Weber, T. Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Brief. Bioinform. 20, 1103–1113 (2019).
Ainsworth, G. C. Ainsworth & Bisby’s Dictionary of the Fungi. IMA Fungus 4 (2013).
Vogt, E., Sonderegger, L., Chen, Y. Y., Segessemann, T. & Künzler, M. Structural and Functional Analysis of Peptides Derived from KEX2-Processed Repeat Proteins in Agaricomycetes Using Reverse Genetics and Peptidomics. Microbiol Spectr. 10, e0202122 (2022).
Umemura, M. Peptides derived from Kex2-processed repeat proteins are widely distributed and highly diverse in the Fungi kingdom. Fungal Biol. Biotechnol. 7, 11 (2020).
Wunsch, C., Zou, H. X., Linne, U. & Li, S. M. C7-prenylation of tryptophanyl and O-prenylation of tyrosyl residues in dipeptides by an Aspergillus terreus prenyltransferase. Appl Microbiol. Biot. 99, 1719–1730 (2015).
Kreuzenbeck, N. B. et al. Comparative Genomic and Metabolomic Analysis of Termitomyces Species Provides Insights into the Terpenome of the Fungal Cultivar and the Characteristic Odor of the Fungus Garden of Macrotermes natalensis Termites. Msystems 7, e0121421 (2022).
Ringel, M. et al. Biotechnological potential and initial characterization of two novel sesquiterpene synthases from Basidiomycota Coniophora puteana for heterologous production of delta-cadinol. Micro. Cell Fact. 21, 64 (2022).
Yap, H. Y. et al. Heterologous expression of cytotoxic sesquiterpenoids from the medicinal mushroom Lignosus rhinocerotis in yeast. Micro. Cell Fact. 16, 103 (2017).
Cabral, C. et al. Composition and anti-fungal activity of the essential oil from Cameroonian Vitex rivularis Gürke. Nat. Prod. Res 23, 1478–1484 (2009).
Zamora, C. M. P., Torres, C. A. & Nunez, M. B. Antimicrobial Activity and Chemical Composition of Essential Oils from Verbenaceae Species Growing in South America. Molecules 23, 544 (2018).
Mozuraitis, R., Stranden, M., Ramirez, M. I., Borg-Karlson, A. K. & Mustaparta, H. (-)-germacrene D increases attraction and oviposition by the tobacco budworm moth Heliothis virescens. Chem. Senses 27, 505–509 (2002).
Terlouw, B. R. et al. MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Res 51, D603–D610 (2023).
van de Peppel, L. J. J. & Aanen, D. K. High diversity and low host-specificity of Termitomyces symbionts cultivated by Microtermes spp. indicate frequent symbiont exchange. Fungal Ecol. 45, 100917 (2020).
Zaman, M., Khan, I. A., Schmidt, S., Murphy, R. & Poulsen, M. Morphometrics, Distribution, and DNA Barcoding: An Integrative Identification Approach to the Genus Odontotermes (Termitidae: Blattodea) of Khyber Pakhtunkhwa, Pakistan. Forests 13, 674 (2022).
Schoch, C. L. et al. Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. P Natl Acad. Sci. USA 109, 6241–6246 (2012).
Conlon, B. H., Schmidt, S., Poulsen, M. & Shik, J. Z. Orthogonal protocols for DNA extraction from filamentous fungi. STAR Protoc. 3, 101126 (2022).
Gardes, M. & Bruns, T. D. Its Primers with Enhanced Specificity for Basidiomycetes - Application to the Identification of Mycorrhizae and Rusts. Mol. Ecol. 2, 113–118 (1993).
Schmidt, S. et al. Make the environment protect you from disease: elevated CO2 inhibits antagonists of the fungus-farming termite symbiosis. Front Ecol. Evol. 11, 1134492 (2023).
Prjibelski, A., Antipov, D., Meleshko, D., Lapidus, A. & Korobeynikov, A. Using SPAdes De Novo Assembler. Curr. Protoc. Bioinforma. 70, e102 (2020).
Antipov, D., Korobeynikov, A., McLean, J. S. & Pevzner, P. A. HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads. Bioinformatics 32, 1009–1015 (2016).
Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 23, 258 (2022).
Vreeburg, S. M. E. et al. A genetic linkage map and improved genome assembly of the termite symbiont Termitomyces cryptogamus. BMC Genom. 24, 123 (2023).
Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: Assessing Genomic Data Quality and Beyond. Curr. Protoc. 1, e323 (2021).
Smit, A. F. A. & Hubley, R. RepeatModeler Open-1.0. 2008-2015 http://www.repeatmasker.org.
Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-4.0. 2013-2015 http://www.repeatmasker.org.
Hoff, K. J., Lange, S., Lomsadze, A., Borodovsky, M. & Stanke, M. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS. Bioinformatics 32, 767–769 (2016).
Kriventseva, E. V. et al. OrthoDB v10: sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs. Nucleic Acids Res 47, D807–D811 (2019).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644 (2008).
Lomsadze, A., Ter-Hovhannisyan, V., Chernoff, Y. O. & Borodovsky, M. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res 33, 6494–6506 (2005).
Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y. O. & Borodovsky, M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res 18, 1979–1990 (2008).
Lomsadze, A., Burns, P. D. & Borodovsky, M. Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res 42, https://doi.org/10.1093/nar/gku557 (2014).
Bruna, T. & Lomsadze, A. ProtHint, https://github.com/gatech-genemark/ProtHint/blob/master/bin/print_longest_isoform.py, (Georgia Institute of Technology, Atlanta, USA).
Blin, K. et al. antiSMASH 7.0: new and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res 51, W46–W50 (2023).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20, https://doi.org/10.1186/s13059-019-1832-y (2019).
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
Minh, B. Q. et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Zhang, C., Rabiee, M., Sayyari, E. & Mirarab, S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinforma. 19, 153 (2018).
Koch, N. M. Phylogenomic Subsampling and the Search for Phylogenetically Reliable Loci. Mol. Biol. Evol. 38, 4025–4038 (2021).
Nesnidal, M. P. et al. New phylogenomic data support the monophyly of Lophophorata and an Ectoproct-Phoronid clade and indicate that Polyzoa and Kryptrochozoa are caused by systematic bias. BMC Evol. Biol. 13, 253 (2013).
Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol. Biol. Evol. 35, 518–522 (2018).
Zhang, J. J., Kapli, P., Pavlidis, P. & Stamatakis, A. A general species delimitation method with applications to phylogenetic placements. Bioinformatics 29, 2869–2876 (2013).
Waschulin, V. et al. Biosynthetic potential of uncultured Antarctic soil bacteria revealed through long-read metagenomic sequencing. ISME J. 16, 101–111 (2022).
Rajwani, R., Ohlemacher, S. I., Zhao, G. X., Liu, H. B. & Bewley, C. A. Genome-Guided Discovery of Natural Products through Multiplexed Low-Coverage Whole-Genome Sequencing of Soil Actinomycetes on Oxford Nanopore Flongle. Msystems 6, e0102021 (2021).
Navarro-Munoz, J. C. et al. A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol. 16, 60–68 (2020).
Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Res 49, D412–D419 (2021).
Oksanen, J. et al. vegan: Community Ecology Package. R package version 2.6-2. https://CRAN.R-project.org/package=vegan (2022).
Robey, M. T., Caesar, L. K., Drott, M. T., Keller, N. P. & Kelleher, N. L. An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes. Proc. Natl Acad. Sci. USA 118, e2020230118 (2021).
Slater, G. S. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinforma. 6, 31 (2005).
Loytynoja, A. Phylogeny-aware alignment with PRANK. Methods Mol. Biol. 1079, 155–170 (2014).
Wu, M. T., Chatterji, S. & Eisen, J. A. Accounting For Alignment Uncertainty in Phylogenomics. Plos One 7, e30288 (2012).
Di Franco, A., Poujol, R., Baurain, D. & Philippe, H. Evaluating the usefulness of alignment filtering methods to reduce the impact of errors on evolutionary inferences. BMC Evol. Biol. 19, 21 (2019).
Pond, S. L. K., Frost, S. D. W. & Muse, S. V. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21, 676–679 (2005).
Smith, M. D. et al. Less Is More: An Adaptive Branch-Site Random Effects Model for Efficient Detection of Episodic Diversifying Selection. Mol. Biol. Evol. 32, 1342–1353 (2015).
Wertheim, J. O., Murrell, B., Smith, M. D., Pond, S. L. K. & Scheffler, K. RELAX: Detecting Relaxed Selection in a Phylogenetic Framework. Mol. Biol. Evol. 32, 820–832 (2015).
Murphy, R. Zenodo DOI for Github repository. Zenodo https://doi.org/10.5281/zenodo.13735717 (2024).
Acknowledgements
We thank members of the Social and Symbiotic Evolution Group for comments on a previous draft of the manuscript, Le Ministere de l’Enseignement Superieur de la Recherche Scientifique, Republique de Côte d’Ivoire, for export permits (189/UNA/CRE/SREC), and Cene Gostinčar for help with uploading genome assemblies to GenBank. This work was supported by a European Research Council Consolidator Grant (ERC-CoG 771349) to M.P. and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Project-ID 239748522 – SFB 1127 to C.B. (A06). T.W. was supported by the Novo Nordisk Foundation (NNF20CC0035580) and the Danish National Research Foundation (CeMiSt, DNRF137).
Author information
Contributions
S.S., R.M. and M.P.: conceptualization and writing (original draft). S.S., B.C., S.K.S., S.K., N.K. and M.P.: data curation. S.S., R.M., B.C., J.V., N.B.K., S.V., L.P., D.K.A., C.B., T.W. and M.P.: methodology. S.S., R.M. and J.V.: formal analysis and visualization. S.S., R.M., B.C., J.V., S.K., N.K., C.B., T.W. and M.P.: writing (review and editing). N.K., C.B. and M.P.: supervision. M.P.: funding acquisition. All authors contributed to the article and approved the submitted version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Yin Wen-Bing and the other anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Koon Ho Wong and David Favero. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Schmidt, S., Murphy, R., Vizueta, J. et al. Comparative genomics unravels a rich set of biosynthetic gene clusters with distinct evolutionary trajectories across fungal species (Termitomyces) farmed by termites. Commun Biol 7, 1269 (2024). https://doi.org/10.1038/s42003-024-06887-y
DOI: https://doi.org/10.1038/s42003-024-06887-y