Divergent genomic trajectories predate the origin of animals and fungi

Animals and fungi have radically distinct morphologies, yet both evolved within the same eukaryotic supergroup: Opisthokonta 1,2. Here we reconstructed the trajectory of genetic changes that accompanied the origin of Metazoa and Fungi since the divergence of Opisthokonta with a dataset that includes four novel genomes from crucial positions in the Opisthokonta phylogeny. We show that animals arose only after the accumulation of genes functionally important for their multicellularity, a tendency that began in the pre-metazoan ancestors and later accelerated in the metazoan root. By contrast, the pre-fungal ancestors experienced net losses of most functional categories, including those gained in the path to Metazoa. On a broad-scale functional level, fungal genomes contain a higher proportion of metabolic genes and diverged less from the last common ancestor of Opisthokonta than did the gene repertoires of Metazoa. Metazoa and Fungi also show differences regarding gene gain mechanisms. Gene fusions are more prevalent in Metazoa, whereas a larger fraction of gene gains were detected as horizontal gene transfers in Fungi and protists, in agreement with the long-standing idea that transfers would be less relevant in Metazoa due to germline isolation 3–5. Together, our results indicate that animals and fungi evolved under two contrasting trajectories of genetic change that predated the origin of both groups. The gradual establishment of two clearly differentiated genomic contexts thus set the stage for the emergence of Metazoa and Fungi.

multicellular phenotype of these groups as complex multicellularity (CM), which differs from the simpler forms of multicellularity that are more prevalent in eukaryotes. Another common feature of complex multicellularity is a three-dimensional structure in which only a fraction of the cells are exposed to the extracellular environment 2 .

Multicellularity in Opisthokonta
Animals, land plants and multicellular fungi (mushrooms) are the best-studied groups among those that present CM. Opisthokonta thus represents an excellent phylogenetic framework in which to study the origin of CM in eukaryotes, as it includes animals, fungi, and the protist relatives of both groups 3 . Given the widespread distribution of multicellularity in Opisthokonta (see below), it is tempting to suggest that the common ancestor of this eukaryotic supergroup had at least a clear potential to evolve multicellularity. However, there is no evidence that this ancestor presented CM. First, CM is completely different in Metazoa and Fungi from a cell biology perspective. Second, CM is observed neither in the protist lineages that branch as sister-group to Metazoa in the Holozoa division of Opisthokonta, nor in the protist relatives of Fungi in the Holomycota division of Opisthokonta (see below).
The main difference between animals and the rest of Holozoa is that all metazoans are multicellular. Animal multicellularity is characterized by a complex, genetically regulated program of spatial and temporal cell differentiation. CM was a remarkable innovation in the path to Metazoa 4 , and indeed, most of the gene functional categories that expanded in the metazoan root are tightly linked to multicellularity (see main text). However, besides Metazoa, simpler versions of multicellularity have been observed in some representatives of the other holozoan groups. Clonal multicellularity has been characterized in the choanoflagellate Salpingoeca rosetta 5 , and a choanoflagellate species that presents an aggregative multicellular behavior has recently been described 6 . Aggregative multicellularity has also been observed in the four known species of the group Filasterea. These include the amoeboid species Ministeria vibrans and the flagellated species Pigoraptor vietnamica and Pigoraptor chileana 7 (for which we produced genomic data for this manuscript), although this behavior has only been studied in the filasterean Capsaspora owczarzaki 8,9 . Finally, coenocytes are characteristic of some multinucleated stages in some members of the group Teretosporea 10,11 .
Holomycota, the other division of Opisthokonta, also includes different forms of multicellularity. On the one hand, there is Fonticula alba, a poorly characterized species that branches as sister-group to the nucleariids (a group of amoeboid organisms that includes Parvularia atlantis, another species that was sequenced for this study). F. alba presents two distinct multicellular behaviors: a bacteria-feeding sorocarpic amoeba stage 12 , and a collective invasive behavior that has been recently described 13 . On the other hand, the most paradigmatic example of multicellularity in Holomycota is the fungal hypha. However, not all Fungi present hyphae, and the presence of hyphae is not per se a signature of CM (see the following section).

Complex multicellularity in Fungi
In contrast to animal clonal multicellularity, fungal multicellularity is based on chains of interconnected cells organized in branching filaments known as hyphae 14 . Despite hyphae being widespread across Fungi, CM (i.e., tissues with three-dimensional organization, with molecular mechanisms for cell communication and spatial cell differentiation regulated by a developmental program) is restricted to only a few phylogenetically distant lineages 15 . The convoluted distribution of CM in Fungi complicates the inference of its origins, which could range from a single origin followed by multiple losses to multiple independent origins 15 . Previous studies have reported mostly differences but also similarities in the molecular toolkits involved in CM across Fungi 16,17 . While the genetic similarities may favor the single-origin hypothesis, they may also be explained by convergence at the molecular level (e.g., co-option and/or expansion of the same gene families in distinct CM lineages). Knowing whether these similarities were already present in the common ancestor of CM fungi would provide insights into this still unresolved evolutionary history. We thus searched for signatures at the gene content level differentiating CM fungi from other fungi using a manual and an unsupervised approach, and explored the inferred gene contents of the corresponding ancestors for the hypothetical presence of these signatures (see below).
Second, we searched for OGs also present in other fungi but with more copies in CM taxa (Supplementary Information 4-Fig. 1B). Of these, 13 were found with more copies in at least 3 CM taxa (subset B in Supplementary Table 15). Only 9 and 3 OGs from subsets A and B, respectively, were common to at least 4 of the 5 CM taxa, and none was common to all of them. While these values are concordant with the overall poor homology between the CM molecular toolkits of the distinct CM lineages 15 , our stringent conditions may have limited the detection of further OGs potentially related to CM origin/s in Fungi. For example, while a given ancestral innovation related to CM may have been secondarily lost in multiple yeast lineages due to a reversion to unicellularity, some lineages could have retained and co-opted it for other purposes. Also, some of the OGs that experienced gene gains in CM fungi could also have expanded in non-CM lineages. We thus used a less stringent and unsupervised approach based on the Random Forest (RF) classifier, which not only classifies data into categories (CM taxa vs other fungi) but also reports which features (in this case, which OGs) were the most informative for the classification. In particular, RF decides on a consensus of multiple suboptimal decision trees, each of which has access to only a subset of the available features to prevent overfitting. We performed 5 independent runs of RF, as each run is blind to a distinct subset of the data. There were 14 OGs used by at least 3 of the 5 RF runs. Of these, 4 and 1 coincide with subsets A and B, respectively, and 13 have higher copy numbers in CM fungi than in non-CM fungi. These 13 OGs (see subset C in Supplementary Table 15) are thus good candidates for use as markers to explore the evolutionary history of CM in Fungi.
Supplementary Information 4-Fig. 1
or because a given hypothetical ancestral toolkit for CM would have been replaced, at least to a great extent, by group-specific genetic innovations. If the second scenario were true, we would expect to find that the 13 OGs from subset C, which are potential components of an ancestral toolkit for CM (provided that it ever existed), would not have expanded independently in the distinct CM groups but rather along the ancestral path that is common to all of them.
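The selection of subset C from the RF output can be illustrated with a minimal Python sketch: it keeps the OGs reported as informative by at least 3 of 5 runs and then retains those with a higher mean copy number in CM taxa than in the remaining fungi. The function names, OG identifiers, taxon labels and counts are hypothetical and are not taken from the study, and comparing mean copy numbers is an assumption about how "higher copy numbers" might be evaluated.

```python
from collections import Counter
from statistics import mean


def consensus_ogs(runs, min_runs=3):
    """Return the OGs listed as informative in at least `min_runs` runs."""
    # set(run) ensures an OG is counted at most once per run.
    counts = Counter(og for run in runs for og in set(run))
    return sorted(og for og, n in counts.items() if n >= min_runs)


def enriched_in_cm(ogs, copies, cm_taxa):
    """Keep OGs whose mean copy number is higher in CM taxa than in other taxa."""
    kept = []
    for og in ogs:
        per_taxon = copies[og]
        cm = [n for taxon, n in per_taxon.items() if taxon in cm_taxa]
        other = [n for taxon, n in per_taxon.items() if taxon not in cm_taxa]
        if cm and other and mean(cm) > mean(other):
            kept.append(og)
    return kept


# Hypothetical toy data: informative OGs reported by 5 classifier runs.
runs = [
    ["OG1", "OG2", "OG3"],
    ["OG1", "OG3", "OG4"],
    ["OG1", "OG2", "OG5"],
    ["OG3", "OG2", "OG6"],
    ["OG1", "OG7", "OG4"],
]

# Hypothetical copy numbers per taxon for the OGs that pass the consensus step.
copies = {
    "OG1": {"cm_a": 5, "cm_b": 4, "yeast_a": 0, "yeast_b": 1},
    "OG2": {"cm_a": 1, "cm_b": 0, "yeast_a": 3, "yeast_b": 2},
    "OG3": {"cm_a": 2, "cm_b": 3, "yeast_a": 1, "yeast_b": 1},
}
cm_taxa = {"cm_a", "cm_b"}

shared = consensus_ogs(runs)                          # OGs used by >= 3 runs
candidates = enriched_in_cm(shared, copies, cm_taxa)  # analogue of subset C
print(shared, candidates)
```

In this toy example, three OGs pass the consensus threshold and two of them are also enriched in the CM taxa, mirroring the step from 14 consensus OGs to the 13 OGs of subset C.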
Our results partially agree with each of the two scenarios proposed above. On the one hand, the highest copy number for these 13 OGs is found in Pezizomycotina (Ascomycota) and in Agaricomycotina (Basidiomycota), which in our dataset are represented by N. crassa and T. melanosporum and by C. cinerea and L. bicolor, respectively (Supplementary Information 4-Fig. 2B). Given that both groups present the most complex multicellularity among Fungi, our results confirm the occurrence of genetic changes accompanying this phenotypic complexification. However, we observe that the increase in copy number for these OGs predated the emergence of both groups, and indeed started not in their last common ancestor (Dikarya), but in the preceding ancestral node (Dikarya + Mucoromycotina).
Supplementary Information 4-Fig. 2. Phylogeny of Holomycota, with branches colored according to (A) the probabilities retrieved from the Random Forest classifier trained to detect CM fungi based on the relative OG representation of every genome (excluding species-specific OGs) and (B) mean counts for the 13 OGs used by more than half of the runs of the classifier (see text).
Although the canonical cases of fungal CM are in Dikarya, fruiting bodies (i.e., fungal organs with CM patterns) have also been described in Mucoromycotina 15 . Within Dikarya, these OGs are mostly absent from non-CM species, most of which evolved a yeast phenotype. The absence of these OGs from these species is not surprising given that the classifier was trained to find potential markers of CM in Fungi (i.e., OGs present in CM fungi and absent in non-CM fungi). However, for this same reason, it is remarkable that some of the best markers of CM found by the classifier (the 13 OGs from subset C) are also distributed across Mucoromycotina, even though the species from this group were not considered CM fungi during the training step of the classifier. Based on this finding, we suggest that future studies exploring the hypothesis of a single origin for CM in Fungi should focus not only on the evolutionary changes that occurred at the root of Dikarya, but also on those that occurred at the preceding ancestral node.
Overall, given the ancestral signal detected (Supplementary Information 4-Fig. 2B), it is tempting to hypothesize that CM, or at least a rudimentary version of this phenotype, could have already been present at the root of Dikarya + Mucoromycotina. CM would have later become more complex in some groups of Dikarya. However, an alternative scenario is that fungal CM is the outcome of an outstanding process of convergent evolution that could have been favored by the possibility of co-opting ancestral genes that were appropriate to evolve this phenotype 15 .