Phylogenomic analysis uncovers the evolutionary history of nutrition and infection mode in rice blast fungus and other Magnaporthales

The order Magnaporthales (Ascomycota, Fungi) includes devastating pathogens of cereals, such as the rice blast fungus Pyricularia (Magnaporthe) oryzae, which is a model in host-pathogen interaction studies. Magnaporthales also includes saprotrophic species associated with grass roots and submerged wood. Despite its scientific and economic importance, the phylogenetic position of Magnaporthales within Sordariomycetes and the interrelationships of its constituent taxa, remain controversial. In this study, we generated novel transcriptome data from 21 taxa that represent key Magnaporthales lineages of different infection and nutrition modes and phenotypes. Phylogenomic analysis of >200 conserved genes allowed the reconstruction of a robust Sordariomycetes tree of life that placed the monophyletic group of Magnaporthales sister to Ophiostomatales. Among Magnaporthales, three major clades were recognized: 1) an early diverging clade A comprised of saprotrophs associated with submerged woods; 2) clade B that includes the rice blast fungus and other pathogens that cause blast diseases of monocot plants. These species infect the above-ground tissues of host plants using the penetration structure, appressorium; and 3) clade C comprised primarily of root-associated species that penetrate the root tissue with hyphopodia. The well-supported phylogenies provide a robust framework for elucidating evolution of pathogenesis, nutrition modes, and phenotypic characters in Magnaporthales.

M agnaporthales is an order in Ascomycota within Fungi that contains over 200 species, including pathogens of cereals and grasses as well as saprobes on submerged wood [1][2][3] . The best-studied species in Magnaporthales is the rice blast fungus Pyricularia oryzae (5Magnaporthe oryzae), which is one of the most devastating threats to food security worldwide 4,5 . This pathogen is better known for its Pyricularia asexual state ( fig. 1), which has been used for decades as a paradigm for fungal leaf infection studies 6,7 . The rice blast fungus also has the recently discovered root-infecting capability 8 . Gaeumannomyces graminis (take-all pathogen) and Magnaporthiopsis poae (summer patch pathogen), two other model species in Magnaporthales, are root-infecting pathogens of cereals and grasses. In addition to the plant pathogens, several saprotrophic woodinhabiting genera (e.g., Pseudohalonectria and Ophioceras) are also placed in Magnaporthales 9 .
The studies of Magnaporthales date back to the late 19th century when the rice stem rot fungus Nakataea oryzae (5Magnaporthe salvinii) and the crabgrass pathogen Pyricularia grisea were first recorded 10,11 . Cannon 12 established the family Magnaporthaceae to accommodate these grass-associated fungi centered on Magnaporthe.
Historically, Magnaporthaceae was placed in various orders in Ascomycota, such as Diaporthales 13 , Phyllachorales, and Xylariales 14 , due to a lack of robust phylogenetic evidence and paucity of convincing phenotypic characters. A new order Magnaporthales was recently proposed for these fungi 2 . Molecular phylogenetic studies indicated that Magnaporthales belongs to the subclass Sordariomycetidae (Sordariomycetes, Ascomycota). However, consensus regarding additional phylogenetic affinities has not yet been reached. For example, in Zhang and Blackwell's 15 18S rDNA tree, Magnaporthales formed a sister clade with Ophiostomatales. Huhndorf et al. 16 generated a 28S rDNA tree that suggested a close relationship of Magnaporthales with Sordariales, Chaetosphaeriales and Boliniales. In the 18S rDNA tree of Thongkantha et al. 2 , Magnaporthales formed a sister clade to Diaporthales and Ophiostomatales. In the same paper, however, the 28S rDNA tree grouped it with Chaetosphaeriales and Sordariaceae. Zhang et al. 17 generated a four-gene phylogeny and found the rice blast fungus to be most closely related to Diaporthales; however, no other Magnaporthales taxa were included in the analysis. Poor taxon sampling due to a lack of genome data from Magnaporthales and phylogenetic heterogeneity among gene loci 18,19 likely explain the conflicting tree topologies in the studies described above. Indeed, the current public availability of genome sequences from Magnaporthales is limited to the three model species: Pyricularia oryzae, Magnaporthiopsis poae, and Gaeumannomyces graminis var. tritici 7 (www.broadinstitute.org).
Therefore, despite their pathogenic diversity and economic importance, several key questions regarding the evolution of Magnaporthales remain unanswered. 1) Does Magnaporthales constitute a monophyletic group? 2) What is the phylogenetic position of Magnaporthales within Sordariomycetes? 3) What are the phylogenetic relationships among key taxa in Magnaporthales? and, 4) How have pathogenicity and phenotypes evolved among Magnaporthales taxa? In order to address these issues, we reconstructed the Magnaporthales phylogeny using a phylogenomic approach. Phylogenomics can often overcome problems associated with single locus or multi-locus phylogenetic analyses. To this end, we generated transcriptome data (,420 M paired-end reads) from 21 species of Magnaporthales, of which five were chosen for draft genome sequencing. Our work resulted in the largest collection of genome data to date for this poorly studied group and offers the opportunity to resolve the Magnaporthales phylogeny and to understand the evolution of its constituent lineages.

Results and Discussion
Transcriptome sequencing from 21 Magnaporthales species. We generated ,420 M paired-end reads (totaling ,65 Gbp) from the 21 targeted Magnaporthales taxa (tables S1 and S2). These data were assembled into a total of 315 K contigs with 13-31 K contigs from each taxon (N50 5 1.3-2.1 Kbp). To aid detection of potential gene duplications among Magnaporthales (see Methods), five of these taxa were subjected to complete genome sequencing, resulting in 35-43 Mbp genome assemblies that encoded ,12 K genes from each lineage (table S3). The mRNA-seq data have been submitted to NCBI (SRP050514, Bioproject number: PRJNA269089), whereas the results of the genome sequences and assembled contigs are presented at the project (dblab.rutgers.edu/FunGI/). In this paper, we focused on reconstruction of the Magnaporthales tree of life based on a set of highly conserved ''core'' genes. To this end, we selected 309 individual CEGMA (Core Eukaryotic Genes Mapping Approach; Parra et al. 38 ) gene markers that are largely free of gene duplication in Magnaporthales and other Ascomycota (see Methods).
Phylogenomics resolves the Magnaporthales tree of life. To address the phylogenetic position of Magnaporthales among Sordariomycetes, a maximum likelihood (ML) tree was generated using 83,616 aligned amino acids from 226 single-copy genes within six Magnaporthales species and 15 non-Magnaporthales species representing the major lineages of Pezizomycotina with two Saccharomycetes as outgroup ( fig. 1, table S4). The ML tree provided strong bootstrap support (90-100%) for all interior nodes in the tree except for the union of Cryphonectria parasitica, Grosmannia clavigera, and Magnaporthales (41%). Bayesian inference (BI) using the same dataset showed a congruent and highly supported topology (see fig. S1). The topology of a wellsupported coalescent model-based tree using this dataset of 23 species ( fig. S2) was also consistent with the results shown in fig. 1, except for the swapping of branches containing Cryphonectria parasitica and the Neurospora crassa-Chaetomium globosum clade. Therefore, the phylogeny of the major Sordariomycetidae lineages appears to be largely resolved, except for the relationships among Sordariales, Cryphonectria parasitica, and Grosmannia clavigera.
To study in detail the relationships within Magnaporthales, a ML tree was built using a 226-protein alignment that included 82,715 amino acids from 24 Magnaporthales and five outgroup taxa. This ML tree provided robust bootstrap support (98-100%) for the monophyly of the Magnaporthales and most interior nodes therein, except for the branch uniting Gaeumannomyces radicicola and G. graminis species (55%; fig. 2). The BI tree and the coalescent model-based tree using this dataset ( fig. S3 and S4) resulted in very similar topologies. The minor differences between the multi-protein ML tree and the coalescent tree include: 1) The poorly supported paraphyletic relationship between Gaeumannomyces graminis var. graminis and G. radicicola in the ML tree ( fig. 2), that was replaced with a monophyletic relationship in the coalescent tree (fig. S4); and 2) The positions of Cryphonectria parasitica and Neurospora crassa were swapped in the two trees.
In summary, we inferred well-supported phylogenies of Sordariomycetes and Magnaporthales using .200 conserved single-copy gene markers. Notably, the consistent topologies generated using multi-protein concatenation and the concatenation-free coalescent model suggests that our results are not biased by phylogenetic heterogeneity across genetic loci that may mislead multiprotein phylogeny construction (e.g., Song et al. 20 ).
The phylogenetic position of Magnaporthales within Sordariomycetes. Ascomycota, the largest group within Fungi includes three subphyla, Taphrinomycotina, Saccharomycotina, and Pezizomycotina, which are differentiated by the ascus dehiscence mechanism and fruiting body structure 21,22 . The order Magnaporthales, together with other filamentous ascomycetes, comprise the class Sordariomycetes that belongs to subphylum Pezizomycotina. Our phylogenomic analysis resolved relationships among the classes in Pezizomycotina included in the study ( fig. 1). The Orbiliomycetes-Pezizomycetes clade (Monacrosporium haptotylum and Tuber melanosporum) represents an early diverging lineage in Pezizomycotina. The affinity between these two classes received stronger support from our analyses when compared to a previous study 21 . This result is consistent with shared Pezizomycotina characters such as the basic ascus type, the simple spore dehiscence mechanism, and the exposed hymenium 21,23 . The classes Eurotiomycetes (e.g., Aspergillus niger), Lecanoromycetes (Cladonia grayi), and Dothideomycetes (Phaeosphaeria nodorum) form a strongly supported clade. Their monophyletic relationship was recovered in earlier phylogenetic studies, however with lower bootstrap support than reported here 21,22,24,25 . Sordariomycetes and Leotiomycetes (e.g., Blumeria graminis) are the later diverging lineages and their monophyletic relationship is consistent with previous studies 18,19,21,24 . The shared characters between these two lineages include unitunicate and inoperculate asci. The latter is regarded as the advanced spore dispersal mechanism in Pezizomycotina 22,24 . Given these results, we surmise that single-copy CEGMA genes are useful phylogenetic markers that appear to recapitulate and consolidate previously suggested relationships amongst Pezizomycotina.
Although Magnaporthales formed a strongly supported monophyletic clade in Sordariomycetes ( fig. 1), the phylogenetic position of this lineage relative to Ophiostomatales and Diaporthales is less well resolved using different phylogenetic methods (i.e., compare figs. 1, S1, S2). This was also the case in previous analyses of morphological and molecular data 2,16,21 . Nonetheless, our analyses support a sister group relationship between Magnaporthales and Ophiostomatales; the latter is represented here by the mountain pine beetle-associated blue stain fungus Grosmannia clavigera. This result is consistent with the shared characteristics between the two orders of non-stromatic perithecia and a hyphomycetous asexual state. In contrast, Diaporthales typically have stromatic perithecia and a coelomycetous asexual state ( fig. 1). The grouping of Magnaporthales and Ophiostomatales (Gromannia clavigera or Ophiostoma piliferum) was also reported in two previous phylogenomic analyses that included fewer taxa 19,26 .
Phylogenetic relationships among Magnaporthales taxa. In our phylogenomic analyses, three well-supported major clades are recognized within Magnaporthales (fig. 2). These three clades correspond to Magnaporthaceae, Pyriculariaceae and Ophioceraceae, the recently classified families in Magnaporthales that were identified using a 2locus phylogeny 27 . Most species in clade C are associated with grass roots, use a hyphopodium as the penetration structure, and have a phialophora-like asexual state. The exceptions are Buergenerula spartinae, Bussabanomyces longisporus, Nakataea oryzae, and Omnidemptus affinis which are associated with the above-ground parts of host plants. The clade B species that cause blast diseases in Poaceae and other monocot plants use an appressorium as the host penetration structure and have a Pyricularia asexual state. The saprotrophic Ophioceras and Pseudohalonectria species form the basal clade A, which are associated with submerged wood tissue. Currently the asexual state of the majority of these species has not yet been observed 3 .
Recent phylogenetic studies support the idea that terrestrial habits and a saprotrophic lifestyle are plesiomorphies of the Sordariomycetes fungi 17,28,29 . Our results indicate that the clade A in Magnaporthales shifted from a terrestrial to an aquatic environment but maintained the ancestral saprotrophic lifestyle. In addition to the Wood clade in Magnaporthales, aquatic lineages occur in several other orders in Sordariomycetes, including Halosphaeriales, Sordariales, Diaporthales, and Xylariales 17 . Therefore, such a habit shift has apparently occurred several times in Sordariomycetes 17,28 .
Aerial and root infections are two strategies employed by plant pathogens to obtain host nutrients 30 . Species in clade B in Magnaporthales have adapted to aerial infection. The reproductive hyphae of the Pyricularia asexual state are long and erect, which facilitates conidium dispersal. After conidia attach to the host, a specialized structure known as an appressorium develops from conidia and penetrates the leaf cuticle using high turgor pressure, which is essential for fungal invasion of the host aerial parts ( fig. 2) 30 . However, most clade C members appear to lack aerial infection ability and have adapted to root infection. Their asexual states are phialophora-like, which is characterized by simple and short reproductive hyphae 12,31 . These taxa form lobed and swollen hyphal structures named hyphopodia to penetrate the host root tissue and obtain nutrients ( fig. 2) 30 .
Traditionally, ascospore morphology is the basis for genus and species delimitation in Magnaporthales and in many other ascomycetes. For example, in Magnaporthales, species that produce spindleshaped ascospores are placed in Magnaporthe, whereas those that produce filiform ascospores are in Gaeumannomyces. The results of our study indicate that morphology-based generic concepts are unreliable as systematic markers; e.g., Magnaporthe and Gaeumannomyces are polyphyletic, whereas ecological and pathogenicity features correspond better to the phylogenetic history of Magnaporthales. The clear separation of the Blast clade and the Magnaporthe type species, M. salvinii, indicates that the rice blast fungus does not belong to the genus Magnaporthe, which was mistakenly classified in the 1970s www.nature.com/scientificreports SCIENTIFIC REPORTS | 5 : 9448 | DOI: 10.1038/srep09448 based on ascospore morphology 14 . Pyricularia, the older asexual generic name for the rice blast fungus is informative and legitimate 32 , and therefore is suggested as the correct scientific name for this model taxon 27,33,34 . Taxonomic revision of Magnaporthe, Gaeumannomyces, and other Magnaporthales is underway or already published in other papers 35,36 .
In conclusion, our results demonstrate that Magnaporthales is a robustly supported monophyletic order within Sordariomycetes with its sister group provisionally being Ophiostomatales. The early diverging lineage of Magnaporthales (clade A) is mostly aquatic and saprotrophic on wood substrates. The pathogenicity of the remaining lineages is likely a derived feature with the B clade evolved to infect host aerial parts using appressoria and clade C adapted to plant root infection with hyphopodia. Our results significantly enrich sequence data from the order Magnaporthales and, more importantly, establish a framework for future comparative genomic and functional studies to address the evolution and ecology of this important lineage.

Methods
Fungal strains. Based on previously generated six-locus phylogenies 3,35,36 , we targeted 21 Magnaporthales species (table S1) for transcriptome sequencing, of which five (Magnaporthe salvinii, Magnaporthiopsis incrustans, Magnaporthiopsis rhizophila, Ophioceras dolichostomum, and Pseudohalonectria lignicola) were also used for whole-genome sequencing. The sequencing, annotation, and analyses of the genome data will be presented elsewhere (see table S3  Transcriptome assembly and protein prediction. Transcriptome data from the five Magnaporthales taxa for which we generated whole-genome data (table S3) were trimmed to remove adaptors and assembled using Trinity with guidance of the corresponding in-house genome assemblies 37 . The mRNA-seq data from each of the 16 remaining Magnaporthales taxa (table S1 and S2) were cleaned and then assembled using CLC workbench 7 (http://www.clcbio.com/) under the default settings. Briefly, reads were cleaned to remove low-quality regions followed by adaptor sequence trimming. The resulting reads were used for de novo assembly. The reads were mapped to resulting contigs and the consensus contig sequences were optimized during the mapping procedure. The contig sequences were split at regions with ,2X coverage and the consensus sequences (length $ 200 bp) were exported for further analyses.
Protein sequences from the five Magnaporthales species with whole-genome data were predicted using a combination of three gene prediction programs (i.e., SNAP, Augustus, and the self-training GeneMark-ES). The predicted protein sequences were used to build a local database that included complete proteomes from 20 other Ascomycota that are available in the public domain (table S4). The assembled contig sequences for the remaining 16 taxa were used as a query to search the local database using BLASTx (e-value # 1e-10). The translated protein sequence from each query contig was derived from the BLASTx output.
Selection of gene markers. We chose the CEGMA (Core Eukaryotic Genes Mapping Approach) gene set 38 as markers to reconstruct the phylogeny. CEGMA proteins represent 458 KOGs (eukaryotic orthologous groups) that are conserved among eukaryotes 39 . To avoid misleading effects caused by gene duplication (e.g., Qiu et al. 40 ), we limited the analyses to single-copy CEGMA genes. To this end, we downloaded the seven whole-proteome datasets that are annotated in the KOG database (http://www.ncbi.nlm.nih.gov/COG/), including those from nematode (Caenorhabditis elegans), common fruit fly (Drosophila melanogaster), human (Homo sapiens), mouse-ear cress (Arabidopsis thaliana), baker's yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and microsporidian parasite (Encephalitozoon cuniculi). We mapped a total of 25 complete proteomes, including those from the five in-house Magnaporthales taxa (table S3) and 20 other Ascomycota taxa (table S4), to these seven KOG-annotated proteomes based on the top hit from BLASTp (e-value # 1e-5) searches. KOGs with more than one gene in two or more mapped Ascomycota taxa were removed, leaving 309 single-copy CEGMA KOG genes that were used for downstream analyses.
Construction of single-protein alignments. We identified the genes corresponding to the 309 single-copy CEGMA genes in each of the tested taxa using a reciprocalbest-hit strategy. Ascomycota proteomes or proteins translated from EST contigs were used as query to search a local database comprising all CEGMA proteins using BLASTp (e-value # 1e-5) and vice versa. For each CEGMA gene, sequences from different taxa were collected and aligned using MAFFT version 6.923b (--auto) 41 . The alignments were trimmed using Gblocks (-b4 5 5, -b5 5 h) 42 . The resulting alignments were then processed with T-COFFEE version 9.03 43 to remove poorly aligned residues (conservation score # 5) within conserved sequence blocks. Columns with missing data (.80%) were removed and short alignments (,100 amino acids) were discarded. The remaining alignments were used for phylogenetic tree reconstruction. We generated a Pezizomycotina dataset (226 genes) including six Magnaporthales species and 15 other species representing major lineages in Pezizomycotina with two Saccharomycetes as outgroup ( fig. 1), and a Magnaporthales-focused dataset (another 226 genes) including 24 Magnaporthales taxa and five closely related Sordariomycetes species (fig. 2).
Phylogenetic reconstruction using concatenated multi-protein alignments. For each multi-protein alignment, we first inferred the best protein evolutionary model using Prottest version 3.2 44 . The LG 1 C 1 F model was identified as the top model for both multi-protein alignments (i.e., those used to infer the trees shown in figs. 1 and 2). Maximum likelihood trees were built using RAxML version 7.2.8 45 under the selected model with branch support inferred using 1000 bootstrap replicates. Bayesian trees were built using MrBayes version 3.2.2 46 . Because MrBayes does not include LG model 47 , we modified the source code 'model.c' to incorporate it and then compiled the MPI-version executable. For each alignment, the analyses were performed with two independent runs each with one million generations. Trees were sampled every 100 generations. Consensus trees were built and the posterior probability values for all branches were calculated after removing 25% of the trees as burn-in (figs. S1 and S3).
Tree reconstruction using the coalescent model. For the Pezizomycotina dataset, we reconstructed a maximum pseudo-likelihood tree ( fig. S2) using a coalescent model that does not require the concatenation of multiple single-gene alignments 48 . Statistical support was estimated using 100 multilocus replicates generated using Seo's method 49 . For each replicate, we randomly sampled 226 genes with replacement. For each sampled gene, a pseudo-alignment was generated by random sampling of amino acid site from the corresponding original alignment with replacement. A ML tree was built for each pseudo-alignment using IQtree 50 under the best amino acid evolutionary model selected on the fly. The ML trees were rooted using Yarrowia lipolytica or Monacrosporium haptotylum (in case of missing data in Y. lipolytica) as outgroup. The rooted trees were then used for maximum pseudo-likelihood tree construction using MP-EST 48 under the default settings. The resulting 100 maximum pseudo-likelihood replicate trees were summarized following a majority rule using 'Consense' function in the Phylip package (http://evolution.genetics.washington.edu/ phylip.html). The coalescent model-based tree for the Magnaporthales-focused dataset was generated following a similar procedure (fig. S4). The single gene trees were rooted using Verticillium dahliae or Botrytis cinerea (in case of missing data in V. dahliae).