Genome wide analysis of the transition to pathogenic lifestyles in Magnaporthales fungi

The rice blast fungus Pyricularia oryzae (syn. Magnaporthe oryzae, Magnaporthe grisea), a member of the order Magnaporthales in the class Sordariomycetes, is an important plant pathogen and a model species for studying pathogen infection and plant-fungal interactions. In this study, we generated genome sequence data from five additional Magnaporthales fungi, including non-pathogenic species, and performed a comparative genome analysis of 13 fungal species in the class Sordariomycetes to understand the evolutionary history of the Magnaporthales and of fungal pathogenesis. Our results suggest that the Magnaporthales diverged from other Sordariomycetes ca. 31 million years ago, with the phytopathogenic blast clade diverging ca. 21 million years ago. Little evidence of inter-phylum horizontal gene transfer (HGT) was detected in Magnaporthales. In contrast, many genes underwent positive selection in this order, and the majority of these sequences are clade-specific. The blast clade genomes contain more secretome and avirulence effector genes, which likely play key roles in the interaction between Pyricularia species and their plant hosts. Finally, analysis of transposable elements (TEs) showed differing proportions of TE classes among Magnaporthales genomes, suggesting that species-specific patterns may hold clues to the history of host/environmental adaptation in these fungi.


Supplementary information
Supplementary Information accompanies this paper online.

Supplementary text
Analysis of TEs in Magnaporthales. (A) Comparison of the total length of transposable element (TE) hits predicted by the REPET v2.5 pipeline, as a percentage of genome size, across ten fungi in the Magnaporthaceae. The ten fungi are ordered by phylogenetic relatedness, with the most basal species on the right and the most derived species on the left. TEs predicted as Class I are shown in red, Class II in green, and those of unknown class in blue. (B) Comparison of the lengths of predicted TEs, grouped by order, as a percentage of genome size for each of the ten Magnaporthaceae genomes. Shown are the Class I orders LTR, LINE, and DIRS, as well as the Class II orders TIR and Helitron. The "OTHER" category encompasses all other TEs predicted by REPET for a given genome, including those of unknown class.

Supplementary figure S3.
Comparison of di-nucleotide RIP indices produced by RIPCAL for six TE superfamilies across

Supplementary table S1.
Raw sequence reads of five species in Magnaporthales.

Supplementary table S2.
Genome assembly and annotation statistics of five species in Magnaporthales.

Supplementary table S3.
CEGMA analysis to identify 248 conserved core eukaryotic genes in assembled genomes of five species in Magnaporthales.

Supplementary table S4.
The 321 cases of HGT that are either supportive or inconclusive (regardless of branch support) that need further investigation.

Supplementary table S5.
Ortholog groups (OGs) that show evidence of positive selection (FDR ≤ 0.01) in the wood-, blast-, and root-infecting fungal clades.

Supplementary table S6.
Over-represented GO terms shared between the root and blast pathogenic fungal clades that may comprise common gene families that elucidate pathogen adaptation.

Supplementary table S8.
List of identifiers of the secretome and small secreted proteins (SSPs), and species-specific SSPs in Magnaporthales species.

List of identifiers of the clade-specific secretome and small secreted proteins (SSPs) in Magnaporthales species.

Supplementary Text
Transposon Analyses
Analysis of de novo TEs: TEs predicted de novo by the REPET TEdenovo pipeline were clustered using the CD-HIT server to check for similarities across the ten genomes. In total, 580 initial sequences were grouped into 340 clusters of at most 18 sequences each, with ≥ 80% shared sequence identity. The majority of TE clusters (337) contain TEs from only a single Magnaporthales taxon, suggesting that most of these elements are adapted to their genome of origin and/or originated after diversification and speciation. Eight TE clusters contained TEs from two taxa, while one cluster each contained TEs from three and four taxa (Clusters 35 and 21, respectively). BLAST2GO was run against the NCBI nt database on the representative sequence identified by CD-HIT for each cluster. Cluster #21, with TEs from four taxa (G. graminis, M. incrustans, M. poae, and M. rhizophila), was checked using a large retrotransposon derivative (LARD) representative sequence from M. incrustans and produced a consensus hit of "proline permease mRNA 13508" (NCBI: XM_009228526.1) from the G. graminis tritici R3-111a-1 genome. Cluster #35, with TEs from three taxa (F. oryzae, M. poae, and M. rhizophila), was most similar to an unclassified F. oryzae TIR and returned a "telomere partial sequence" result from the M. oryzae 70-15 genome. Across all clusters, avirulence (AVR) genes from M. oryzae were the most common identification (33 hits), including Pita-1 (19 hits), Pita-2 (2 hits), and Pia (12 hits). This is unsurprising given that AVR genes have been shown in