Significance of the claim for regulated mating genes in AMF

Arbuscular mycorrhizal fungi (AMF) are keystone mutualists in terrestrial ecosystems, as they improve plant yields and protect their hosts against pathogens [2]. These fungi are also genetic oddballs, as they carry thousands of nuclei in a large syncytium at all times [3]. This constant multinucleate state was proposed to have helped AMF evolve for close to a billion years in the absence of sex [4], but this hypothesis is challenged by the discovery of compelling signatures of sexual reproduction in these organisms. Specifically, all AMF carry meiosis and mating-related genes [5] and show genome-based evidence for inter-strain recombination [6]. Furthermore, some cultured strains show a dikaryotic-like nuclear organization where two parental nuclear genotypes co-exist in the mycelium [7, 8]. However, despite this evidence, sexual reproduction, i.e., mating and plasmogamy producing a recombined haploid progeny through meiosis [7, 9], has not been observed in these organisms.

Mateus et al. [1] recently investigated transcriptional responses in AMF during strain co-existence in plants using RNA-seq and concluded that several genes involved in mating were up- or downregulated (see Tables 1 and 2 in Mateus et al. [1]). If true, this finding would represent the first direct evidence for mechanistic processes related to mating in AMF [1].

Table 1 Best reciprocal hit for the R. irregularis genes claimed to be involved in mating from the MycoCosm database.
Table 2 Alternative best hit genes from GBC data set used by Mateus et al. [1] using reciprocal blast.

AMF genomes are very large compared to most fungal relatives [3] and contain highly expanded gene families, including an overrepresentation of genes involved in signaling pathways and protein–protein interactions compared with known fungal gene repertoires [6, 10,11,12]. The use of sequence homology to attribute specific functions to genes that are members of large and functionally diverse families is problematic, especially in AMF where large genomes include many expanded gene families [12, 13]. The difficulty concerns discriminating between orthologues, which are genes in different species that evolved from a common ancestral gene (i.e., are monophyletic), and paralogues, which result from duplications (i.e., can belong to distinct paraphyletic clades). Orthologues generally retain the same function during the course of evolution, while paralogues facilitate functional innovation by removing evolutionary constraints on conserved functions [14, 15]. Therefore, clarification of evolutionary relationships is an essential step for reliable prediction of gene function in silico, and overlooking this step most often leads to spurious gene predictions.

Since orthology analyses were not clearly described by Mateus et al. [1], we first sought to clarify the “homology status” of the genes listed in Tables 1 and 2 from Mateus et al. [1], as these are claimed to be involved in mating and differentially expressed in R. irregularis during strain co-existence [1]. To do so, we used two approaches that provide gene orthology prediction between species [16] (Supplemental Methods). The best candidate orthologues we identified differ from those identified by Mateus et al. [1]. We then re-analyzed the RNA-seq data for evidence of differential expression of both the originally claimed “mating gene homologs” and the “best match orthologs” of validated mating genes from our analysis using stringent statistical thresholds. We conclude that there is no significant support for mating in this data set.

Best reciprocal hits and OrthoMCL analysis of R. irregularis genes claimed to be involved in mating

Mateus et al. attributed functions to differentially expressed genes in the R. irregularis genome based on their similarity to known fungal mating genes, yet these similarities show surprisingly low statistical significance. For example, the R. irregularis gene GBC47251.1, which is claimed by Mateus et al. to represent the key mating gene STE20, shows an e-value of only 5e−12 and an amino-acid sequence identity of only 26.57% against the Saccharomyces cerevisiae STE20 gene (KZV10712.1) used in their comparison. This attribution is not justified, since GBC47251.1 is not the closest match to STE20. When STE20, a validated fungal mating gene, is used as the query sequence, other R. irregularis genes prove to be significantly more similar; e.g., the R. irregularis accession GBC37837.1 has an e-value of 7e−139 and 41% identity. Given this, we were concerned that many of the putative genes may represent distant paralogues of fungal mating genes, and we systematically assessed the support for functional attribution from sequence relationship.
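The logic of this STE20 check can be sketched as a reciprocal-best-hit test. The following is a minimal illustration only, not the pipeline used in our analysis (see Supplemental Methods): the function names and the dictionary layout are hypothetical, hits are ranked by e-value alone, and the forward hit list for GBC47251.1 is assumed to contain only the STE20 match quoted above.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (subject accession, e-value)


def best_hit(hits: List[Hit]) -> str:
    """Return the subject accession with the lowest e-value."""
    return min(hits, key=lambda hit: hit[1])[0]


def is_reciprocal_best_hit(
    gene: str,
    forward: Dict[str, List[Hit]],
    reverse: Dict[str, List[Hit]],
) -> bool:
    """True if gene's best hit in the reference set has gene as its own best hit."""
    if not forward.get(gene):
        return False
    target = best_hit(forward[gene])
    if not reverse.get(target):
        return False
    return best_hit(reverse[target]) == gene


# Forward search: R. irregularis gene against reference mating genes
# (assumed to return only the STE20 match, with the e-value quoted in the text).
forward = {"GBC47251.1": [("KZV10712.1", 5e-12)]}

# Reverse search: STE20 against R. irregularis proteins (e-values from the text).
reverse = {"KZV10712.1": [("GBC37837.1", 7e-139), ("GBC47251.1", 5e-12)]}

claimed_is_rbh = is_reciprocal_best_hit("GBC47251.1", forward, reverse)
```

Under these assumptions `claimed_is_rbh` is `False`, because the reverse search ranks GBC37837.1 ahead of GBC47251.1. A reciprocal best hit is only a first-pass indication of orthology, which is why we complement it with a clustering approach.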

To assess the potential for paralogy to confound the interpretation of mating functions, we used the 18 putative mating genes identified by Mateus et al. as query sequences against the high-quality protein databases from the JGI Mycocosm Rhiir2 [17]. The best hits emerging from our analysis showed that each of the claimed mating genes is a member of a large fungal gene family of broad function (i.e., protein kinases, cytochrome oxidases, etc.), as can be seen from the number of hits in the JGI fungal database recorded in column 2 of Table 1. A similar analysis was then conducted using fungal mating genes that were reported by Mateus et al. as being homologous to the upregulated R. irregularis genes (see Table 2, column 3, in Mateus et al. [1]). In this analysis, none of the supposed mating genes was the best hit to the reference fungal mating gene. For example, using AAD42946.1 as query should retrieve GBC19598.1 as the first hit, but in our analyses it retrieved GBC26474.1 instead (Table 2, column 5). To challenge these findings with a different approach, we used OrthoMCL [18] (Supplemental Methods) to identify functional clusters of orthologs (and recent paralogs) that include both validated fungal mating genes and R. irregularis genes contained in either the GBC database used by Mateus et al. [1] or the JGI Mycocosm Rhiir2 database [17].

The OrthoMCL analysis revealed that only 7 of the 18 genes listed by Mateus et al. [1] share such clusters (Supplemental Table 1). Based on OrthoMCL, in only 2 of these 7 cases is the listed gene the putative R. irregularis orthologue candidate of a fungal mating gene (GBC28192.1 and GBC28793.1; Supplemental Table 1).

In summary, our analysis suggests two significant weaknesses. First, the best matches to validated mating genes in R. irregularis were not identified (i.e., candidate orthologues were not found). Second, for the genes identified by differential expression, the predictions of function are not based on demonstrated orthology and are therefore not supported by available evidence.

Regulation of proposed mating-related AMF genes using conventional RNA-seq analyses and statistical thresholds

The above-mentioned findings do not exclude the possibility that the genes identified by Mateus et al. [1] are involved in mating; rather, they clarify that their supporting arguments are based on spurious evolutionary relationships. To further explore the evidence under the specific conditions of co-existence, we also aimed to validate the differential expression of these genes when the R. irregularis strains DAOM197198 and B1 colonize the same plant host.

Mateus et al. [1] claimed that 18 genes are differentially expressed during strain co-existence. We note, however, that some of the same genes have a different expression response across strains under the same condition, i.e., they are upregulated in the strain DAOM197198 but downregulated in B1 (e.g., GBC53331.1, GBC31594.1, GBC37885.1) in their study with a p-value cut-off of 0.1. In our view, the claim that mating genes are differentially expressed during strain co-existence should be backed by consistent expression in both strains (biological replication). Specifically, if mating processes are really underway in planta, then the same genes should be either consistently upregulated or consistently downregulated across similar conditions; genes should not be subjected to random regulation as suggested, for example, by data available in Supplemental Table 6 from Mateus et al. [1]. Furthermore, we note that Mateus et al. [1] used an adjusted p-value of <0.1 as a threshold to claim that transcript changes are significant. Given that transcript level validation was not performed by Mateus et al. [1], for example, using droplet digital PCR or qPCR, for at least a subset of the 18 genes they highlight, we believe that an adjusted p-value of <0.1 could result in a substantial number of false positives, particularly given the large number of genes analyzed in the R. irregularis genome. As such, given the importance of the claims reported by Mateus et al. [1], a more canonical adjusted p-value of <0.05 is needed to conclusively support evidence of gene regulation using their RNA-seq data set.
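To illustrate how the choice of threshold changes the set of genes called significant, the sketch below adjusts a handful of made-up p-values and applies both cut-offs. We assume the Benjamini-Hochberg procedure here because it is the default adjustment in DESeq2; the p-values and the function name `bh_adjust` are purely illustrative and are not taken from any data set.

```python
from typing import List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical genes.
raw = [0.001, 0.02, 0.06, 0.5]
padj = bh_adjust(raw)

called_at_010 = [i for i, p in enumerate(padj) if p < 0.1]
called_at_005 = [i for i, p in enumerate(padj) if p < 0.05]
```

In this toy case the third gene passes the 0.1 cut-off but not the 0.05 cut-off. Since the adjusted p-value bounds the expected proportion of false discoveries among the genes called, relaxing the threshold from 0.05 to 0.1 doubles the tolerated proportion.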

To test the support for expression changes in the 18 putative mating genes during co-existence, we re-analyzed their RNA-seq data using DESeq2 with the following basic assumptions: to be deemed regulated, a gene should be (i) differentially expressed in both co-inoculation treatments compared to DAOM197198 or B1 alone; (ii) differentially expressed identically in both conditions; and (iii) regulated at an adjusted p-value of <0.05. This approach revealed that only two R. irregularis genes (GBC31744.1, GBC38036.1) out of the 18 proposed by Mateus et al. [1] to be involved in mating show evidence of differential expression in both comparisons, i.e., in both co-inoculation treatments compared to DAOM197198 or B1 alone (Supplemental Table 2). However, neither of these two genes shares a clade with validated fungal mating genes and thus, in our opinion, they should not be considered mating genes.
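The three criteria above can be expressed as a simple filter over the results of the two comparisons. This is a minimal sketch, assuming that a log2 fold change and an adjusted p-value have been exported per gene for each comparison; the type `DEResult`, the function `consistently_regulated`, the gene names, and all numbers are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class DEResult(NamedTuple):
    log2_fold_change: float
    padj: Optional[float]  # None when no adjusted p-value is available


def consistently_regulated(
    vs_daom: Dict[str, DEResult],
    vs_b1: Dict[str, DEResult],
    alpha: float = 0.05,
) -> List[str]:
    """Genes significant in both comparisons with the same direction of change."""
    kept = []
    for gene in sorted(set(vs_daom) & set(vs_b1)):
        a, b = vs_daom[gene], vs_b1[gene]
        if a.padj is None or b.padj is None:
            continue
        # Criteria (i) and (iii): significant in both comparisons at alpha.
        significant = a.padj < alpha and b.padj < alpha
        # Criterion (ii): up in both or down in both.
        same_direction = a.log2_fold_change * b.log2_fold_change > 0
        if significant and same_direction:
            kept.append(gene)
    return kept


# Hypothetical results: co-inoculation vs DAOM197198 alone, and vs B1 alone.
vs_daom = {
    "gene_A": DEResult(1.8, 0.01),
    "gene_B": DEResult(1.2, 0.03),
    "gene_C": DEResult(2.0, 0.08),
    "gene_D": DEResult(0.4, None),
}
vs_b1 = {
    "gene_A": DEResult(1.1, 0.02),
    "gene_B": DEResult(-1.5, 0.01),
    "gene_C": DEResult(1.7, 0.04),
    "gene_D": DEResult(0.6, 0.2),
}

retained = consistently_regulated(vs_daom, vs_b1)
```

Here only `gene_A` is retained: `gene_B` changes direction between comparisons, `gene_C` fails the 0.05 threshold in one comparison, and `gene_D` lacks an adjusted p-value.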

Our analysis also identified other candidate R. irregularis mating genes using OrthoMCL (Supplemental Table 1), so we tested whether these were differentially expressed in both co-inoculation treatments. This analysis revealed changes for two putative orthologues of mating genes (Rhiir2_1|1633285, Rhiir2_1|1616235; Supplemental Tables 1 and 2); however, one was upregulated and the other downregulated. During mating, recombination occurs and triggers the expression of meiosis genes [5, 19]. As such, AMF strain co-existence should lead to the upregulation of known meiosis-specific AMF genes (MSG) if mating is present. We tested this hypothesis by investigating the expression of R. irregularis MSG using the above-mentioned approach. We found that MSG are expressed at very low levels, i.e., most have no mapped reads or are not significantly and conservatively expressed across conditions during strain co-existence (Supplemental Table 3).

In summary, our analysis of available RNA-seq data from Mateus et al. [1] suggests their conclusions are not based on robust evidence. Most R. irregularis genes proposed to be involved in mating by the authors do not show evidence of regulation across conditions and/or at statistically significant thresholds. Most R. irregularis genes expected to be involved in mating are not expressed when strains co-exist. The study by Mateus et al. [1] concludes, for example, that: “AMF genes known to be involved in different stages of mating responses in other fungi are upregulated when two genetically distinct strains co-exist in roots”, or that “the discovery of in planta activation of genes related to different stages of mating in R. irregularis and also provides some clues to understanding the early steps of the evolution of sex-determination of fungal systems”.

Although the hypothesis that AMF could mate in planta is intriguing, it is not supported by our re-analysis of the data set from Mateus et al. [1]. First, the above-mentioned statements are not supported by orthology predictions. These predictions are especially important for inferring gene function when members of very large families (protein kinases, HMG) are studied, as virtually any member of such families shares, by definition, some level of sequence homology. Second, our re-analysis of RNA-seq data from Mateus et al. [1] failed to detect significant regulation of R. irregularis mating genes during strain co-existence. Specifically, although some evidence of regulation can be found for a few genes in one condition at an adjusted p-value of 0.1, the use of conventional statistical standards (adjusted p-value of 0.05) and replication (gene regulation must be shared across biological replicate conditions) revealed regulation of a mere two putative R. irregularis orthologues of known fungal mating genes. These two genes encode one putative RNA helicase (out of 21 identified in the Rhiir2 gene repertoire) and one velvet factor (out of 6). These genes are part of families involved in a myriad of cellular functions that are not linked to mating [20]. Given the strong emphasis of Mateus et al. [1] on the regulation of mating-related genes, it is surprising that the authors did not investigate the expression of AMF MSG, as these are specifically upregulated during fungal mating. Within this context, our re-analysis found no evidence for their regulation, which independently confirms the lack of evidence for sexual reproduction during co-inoculation with strains DAOM197198 and B1.

Overall, the absence of upregulation in mating-related genes in Mateus et al. [1] could easily be explained by an actual incompatibility (as defined, for example, by their divergent MAT-locus sequences [7]) between the strains DAOM197198 and B1 used by the authors. However, assessing the putative compatibility of these two strains is currently unfeasible because the genome of strain B1 has not been sequenced by Mateus et al. [1], despite the fact that the two strains may theoretically differ by up to 50% in gene content [8]. Obtaining the genome of the B1 strain would ensure that standard requisites for conventional in silico gene expression analyses are met. Specifically, it would ensure that mapping of RNA-seq reads is performed on the proper reference genome and that the relative transcriptomic contribution of each strain is clearly identified.