Co-evolution of large inverted repeats and G-quadruplex DNA in fungal mitochondria may facilitate mitogenome stability: the case of Malassezia

Mitogenomes are essential due to their contribution to cell respiration. Recently they have also been implicated in fungal pathogenicity mechanisms. Members of the basidiomycetous yeast genus Malassezia are an important fungal component of the human skin microbiome, linked to various skin diseases, bloodstream infections, and they are increasingly implicated in gut diseases and certain cancers. In this study, the comparative analysis of Malassezia mitogenomes contributed to phylogenetic tree construction for all species. The mitogenomes presented significant size and gene order diversity which correlates to their phylogeny. Most importantly, they showed the inclusion of large inverted repeats (LIRs) and G-quadruplex (G4) DNA elements, rendering Malassezia mitogenomes a valuable test case for elucidating the evolutionary mechanisms responsible for this genome diversity. Both LIRs and G4s coexist and convergently evolved to provide genome stability through recombination. This mechanism is common in chloroplasts but, hitherto, rarely found in mitogenomes.

www.nature.com/scientificreports/and pityriasis versicolor 14,15 .In recent years, Malassezia has increasingly been implicated in health and disease beyond the skin: as an underestimated cause of Malassezia bloodstream infections (BSIs) in immunocompromised patients and neonates 16 , as resident of the healthy human gut mycobiota 17 , but it has also been associated with Crohn's disease 18 , colorectal cancer 19,20 , promoting pancreatic oncogenesis 21 , and is implicated in various respiratory diseases 22 .Several Malassezia species may also cause cutaneous pathologies in a variety of animals 12 .Direct sequencing approaches investigating various environmental sources, such as insects, nematodes, sponges, corals, soils, deep-sea vents and subglacial habitats, showed that this yeast may be ecologically more diverse than believed until now 12,23 .
Malassezia is lipid dependent and many species have slow and fastidious growth 24 .Due to culturing limitations, there are challenges to obtain living cells from complex clinical and environmental sources to accurately represent the role of Malassezia species and to better understand their evolution and the underlying mechanisms in various diseases and ecologies.With recent advancements in genomics, nuclear genomes are now available for all currently described species [24][25][26][27] providing valuable resources for examining functional aspects and the evolutionary trajectory of this genus.However, only a limited number of studies have covered mitochondrial aspects and their contribution the evolution of the genus.Recently, Malassezia sympodialis and Malassezia furfur mitogenomes revealed LIRs 3,26,28 .
In this study, the mitogenomes of 11 Malassezia sp. were de novo sequenced and annotated in order to expand the analysis to all species of this genus.In detail, the first comprehensive analysis of the mitochondrial genomes of all 18 currently described Malassezia species plus two putative new ones provided insights into Malassezia mt genome organization and evolution.Furthermore, the potential roles of LIRs, and G4s are discussed since these molecular elements seem to coexist and possibly could act as a unit in genome recombination and stabilization.This work raised the question whether these two features are functionally correlated and evolved convergently.

Results
General mitogenomic features reveal significant size variation and diversity.To date the mitochondrial genomes of nine Malassezia species are available in GenBank but only the ones of M. furfur and M. sympodialis have been described in detail 26,28 .In this study, newly acquired genomes of the remaining 11 species, including a hitherto undescribed species from bat, where sequenced, and their mitogenomes were assembled and annotated.In total, 28 mitogenomes were analyzed, representing all known Malassezia species including multiple strains, when available, for an in-depth pan-Malassezia comparative mitogenomic approach (Table 1).
Mitogenomes varied in size between 28,009 and 63,712 bp, with the one of Malassezia japonica being the smallest and the genome of Malassezia cuniculi the largest.This size variation mostly resulted from differences in number and size of introns, the presence of diverse LIRs and the size of intergenic regions (Suppl.Table S1).The number of introns varied between zero for M. japonica and two strains of M. globosa, and ten for hybrid M. furfur strain CBS 7019 with a total intron size ranging from zero to 9172 bp (for M. obtusa).All introns were of group I type, located in the rnl, cox1 and cob genes.Several introns hosted putative LAGLIDADG homing endonuclease genes (HEGs), with the exception of the first intron (ID subtype) in the cob gene of M. pachydermatis strain CBS 1879, which contained a GIY-YIG homing endonuclease gene (Suppl.Table S2).
The total number of nucleotides allocated to intergenic regions varied between 13,612 bp for M. obtusa and 47,344 bp for M. cuniculi.The GC-content was low as expected for mitochondrial genomes, approximately 30%.The mitogenomes for all analyzed strains contained the expected standard 14 protein-coding genes, two rRNA genes, between 22 and 32 tRNAs, and the protein coding rps3 gene, with M. slooffiae and M. cuniculi containing two copies of this gene.M. slooffiae uniquely possessed the gene for the RNA subunit of mitochondrial RNase (rnpB).Most Malassezia mitogenomes contained a LIR with a variable number of duplicated genes (Suppl.Table S1).
For some species, mitogenomes for multiple strains were available, reanalyzed, and assessed, showing a variable level of intraspecies mitogenomic differences mostly resulting from the diversity of their intergenic regions and LIRs, and their intron abundance.For example, of the three M. globosa strains, the mitogenomes of the CBS 7966 and CBS 7874 do not possess any introns, but CBS 7991 contains three introns.
Up to now an intraspecies mitogenome variability was previously analyzed for 20 M. furfur strains 3 .The authors reported a mitogenome size variability ranging from 45,715 to 49,317 bp, mostly resulting from variation in intron abundance and variable intergenic regions 3 .In the present study the mitogenome of CBS 9595 was selected and analyzed, as this turned out to be an important parental reference strain in a study focusing on M. furfur hybridization events 28 .

Mitochondrial phylogeny.
Phylogenetic relationships were determined considering the 15 mt protein coding genes, with Ustilago maydis used as outgroup (Fig. 1).Three main phylogenetic clades could be determined: clade I, containing M. arunalokei, M. restricta, and M, globosa; clade II, with M. caprae, M. dermatis, M. sympodialis, M. equina, M. nana, and M. pachydermatis; clade III, containing M. brasiliensis, M. furfur, M. obtusa, M. psittaci, M. yamatoensis, M. japonica, and M. vespertilionis; and finally, the basally positioned species M. slooffiae and M. cuniculi, together with strain CBS17886 which represents a putative new species.In current mitogenomic phylogeny, M. furfur CBS 9595 was positioned slightly deviating from the other four M. furfur strains with separate positioning, and in a cluster also containing M. brasiliensis.Based on the concatenated 15 protein coding genes (PCGs), sequence similarity between CBS 9595 and the other four M. furfur strains was 98.3-98.4% compared to 99.7-99.9%among the four closer related strains, corroborating previous indications that M. furfur may comprise two closely related species 28 (Suppl.Table S3).
www.nature.com/scientificreports/Mitochondrial synteny analyses suggest evolution towards fixated genome organization.Mitogenomic gene shuffling between species was observed, and only two groups of species presented a similar syntenic configuration (Fig. 1).Species of phylogenetic clades I and II, except for M. nana, showed identical gene order indicating late evolutionary acquired genome conservation.Some LIR variation existed for M. nana and M. restricta strain KCTC 27527 (see next section).The other group with identical synteny consisted of the closely related species M. brasiliensis, M. furfur, and M. obtusa from phylogenetic group III.One deviation in genome organization could, however, be observed in M. obtusa, for which the trnG-nad2-nad3 gene cluster was positioned in an inverted orientation compared to M. brasiliensis and M. furfur.The remaining seven species retained unique genome organizations with only partially shared synteny.Among all Malassezia species, six syntenic blocks could be distinguished: trnM-trnE-trnT, nad2-nad3, cox3-nad6, trnV-cox2-nad1, trnD-nad4L-nad5, and trnA-trnF-trnY-trnS (Fig. 2, i-vi).If M. cuniculi is excluded, three more syntenic blocks could be identified (i.e., rnl-trnP-trnN, trnI-cob and trnS-cox1-atp8) (Fig. 2, vii-ix).Syntenic diversity was illustrated by the transposition and inversion of gene blocks and single genes among species.These events most probably occurred among the earlier divergent species, with the M. cuniculi mt genome containing the most differing genome organization.As phylogenetically basal species possessed a less organized gene order among them compared to the more evolved clusters, an evolutionary drive towards fixated genome organization seems to exist.Table 1.Malassezia species and strains used for the mitogenomic comparative analysis.Their origin, source is provided along with the NCBI Accession number and the size (in bp) of their mitogenome.The mitochondrial phylogenetic clade in which each species belong is also indicated.Bold lettering indicates the newly acquired mt genomes, sequenced and annotated in this work.NA non-available, PV pityriasis versicolor, SD seborrheic dermatitis and TV tinea versicolor.japonica and 10,698 bp for M. brasiliensis (Suppl.Fig. S1).Variation in LIR assembly resulted from the presence or absence of trn genes in the repeat unit and in the intra-IR region, the orientation or combinations of genes, and finally from variation in the intergenic regions (Fig. 3).The early diverged mt genomes of M. cuniculi and M. slooffiae evolved independently, resulting in unique synteny and LIR composition.In M. cuniculi a duplicated region composed by trnP-trnN-trnS-rps3-trnG-atp9-trnL-trnH-rns-trnK was present with the second copy positioned in an inverted orientation and separated by a long genomic stretch containing rnl-trnR-nad3-nad2-trnG-trnL-trnH (Fig. 3).Thus, it should probably be disqualified as a LIR but attributed to a simple inverted duplication.This duplication may have been the result of a recombination event and it facilitated the existence of duplicated rps3 and rns gene copies.In M. slooffiae the repeat unit contained the gene rps3 with genes trnH, rnpB, trnM located in the intra-IR region.The protein coding gene rnpB is only present in the mitogenome of M. slooffiae and it seems that this gene was introduced with the insertion of the LIR.
In CBS17886, representing a yet undescribed Malassezia species, a 3.7 kb fragment including the genes trnM and trnH was observed comprising the first evolutionary step for the formation of laterally evolved LIRs in all Figure 1.Phylogeny and synteny of the mitogenome of 28 Malassezia strains representing all currently known species.The phylogenetic tree was produced using Bayesian Inference based on the concatenated amino acid matrix of the 15 mitochondrial PCGs.Numbers at the nodes of the tree indicate posterior probability (PP) values.The species Ustilago maydis strain 521 has been used as an outgroup.Shading of phylogenetic clusters is shown as green, yellow, purple, and blue for clades I, II, III and the basal species, respectively.Blue text indicates the taxa whose mt genome is presented for the first time in this study.Additionally, synteny is provided in all cases starting with rnl gene.The genes which are located on the complementary reverse strand are marked with red color.Amino acid symbols represent the corresponding tRNA gene (e.g., P for trnP).Genes referring to subunits of NADH complex, apocytochrome b, cytochrome c oxidase complex and ATP synthase complex are shown in green, orange, blue, light green respectively.in a more recent evolutionary event, as both species clustered together in a separate clade (Fig. 3).In M. furfur and M. sympodialis, intraspecies LIR variation was also detected and mainly resulting from different orientations of the intra-IR fragment.Interestingly, one of the M. restricta strains lacked the intra-IR region.

G-quadruplex (G4) motifs in Malassezia mitogenomes are predominantly distributed in the
LIRs.Potential G4 formation was identified in all examined mitogenomes when the threshold of 1.2 and the window size of 25 (default parameters) were used.The number of found G4s varied from one in M. yamatoensis to 37 in M. obtusa, without significant correlation with the mitogenome size and the GC content (Suppl.Table S4, Suppl.File S1).G4s are mainly distributed within the LIRs, with few exceptions.M. globosa was the only species without any G4s in the LIR (Fig. 4).In the M. yamatoensis and M. psittaci mitogenomes, which do not possess LIRs, G4s were located within the intergenic regions.Interestingly, only phylogenetic clades I and II incorporate G4DNA motifs in trnP, and their distribution is conserved within each of these clades.Potential G4s were also observed (a) in cox1 third intron (group-IB) of M. sympodialis and U. maydis, (b) in atp9 of the early diverging species M. slooffiae which is not part of its unique LIR, and (c) in the trnK of M. caprae.G4 analysis showed that some predicted G4s have only three runs of Gs instead of the four needed, with at least two consecutive Gs.This could be explained by (a) the potential presence of bulges found in adjacent single Gs and (b) the presence of G-runs upstream and downstream of the predicted G4s.For this reason, the analysis was repeated with the same threshold (1.2), but with larger window size (30).Interestingly, the predicted G4s for this latter analysis were only located in LIRs (94.5%) and intergenic regions (5.5%) (Fig. 4, Suppl.Table S4).With these stricter parameters, the species M. globosa, M equina and M. yamatoensis did not possess potential G4s.When comparing the results of G4 analyses with different window sizes, it became evident that G4s with scores more than 1.2 could result in both 25 and 30 window size, depending on the species and its mitogenome sequence www.nature.com/scientificreports/(Table 2).Among all predicted G4s with both parameters, the loop length varied from one to 18 bp, and G4 sequence conservation was only detected at the intra-species level (Suppl.Table S4).

Discussion
The mitochondrion is a vital organelle for cell growth and survival of eukaryotic cells including fungi, but recent research has also implicated mitochondrial functions in the ability for pathogenic fungi to cause disease 29,30 .Mitochondrial genomes may also provide useful insights into fungal evolution as previously shown 4,5 .In general, whole genome sequencing projects focus on the nuclear genomes and thus far, only 192 out of 1,189 available fungal mt genomes belong to Basidiomycetes (https:// www.ncbi.nlm.nih.gov/ genome/ browse# !/ organ elles/).However, species of Basidiomycetes present a complex mitogenomic gene organization with both ancestral and newly acquired properties, rendering them excellent model organisms for studying fungal evolution.Additionally, the medically relevant genus Malassezia embodies scarcely observed genetic elements, like LIRs, in their mitogenomes 3,28 .In this study, the mitogenomes of 11 Malassezia species along with publicly available mitogenome data (totaling 28 mt genomes, representing 18 described species and two putative new species) were utilized in a comparative genome analysis of all species of the genus Malassezia, in order to unravel the evolution of its mitogenome.
As expected, all Malassezia mt genomes contain 14 protein-coding genes that are present in nearly all fungi and metazoa, plus the rps3 gene which is present only in ca.65% of Basidiomycetes 31 .Malassezia mitogenomes demonstrate substantial variation among species, mostly resulting from variation from intron size and frequency, intergenic regions, and LIR diversification, leading to mt genome sizes ranging from 28 to 64 kb.
A recent study, presenting a whole nuclear genome based phylogenetic tree, identified a similar main clustering as the mitochondrial phylogeny of this work, with some deviations and in some cases slightly different species positions withing the main clusters.Most noteworthy, in the nuclear phylogeny, M. cuniculi and M. slooffiae are also basally positioned but seem to have a sister relationship, whereas in the mitochondrial phylogeny, M. cuniculi is ancestral to M. slooffiae.Such discrepancies between mitochondrial and nuclear phylogenies have been described before and may result from different selection pressures on both genomes 32 and also showcase the added value of concatenated mitochondrial PCG matrices in fungal phylogenetics, as was recently also shown for the subphylum Saccharomycotina of Ascomycota 5 .
In fungi, variation in gene order is immense [33][34][35] .Even when only considering the genus Malassezia, a diversity in mt genome organization was observed (Fig. 1).Only six shared syntenic blocks exist (Fig. 2), of which nad2-nad3 seems to form a syntenic block in most fungal species 36 .The only other two genes that generally exist next to each other in fungi are nad4L-nad5 and it has been suggested that the variations in gene order is most likely www.nature.com/scientificreports/ the result of recombination 5,34 .Gene organization seems to be more discrepant between basal species and has a tendency to stabilize in the lately evolved phylogenetic clades.All but two Malassezia species contain a LIR of variable size and content (Fig. 3).Even the mitogenome of the earlier diverging species M. cuniculi, contains a unique inverted duplication (Fig. 3).Another, evolutionarily independent species-specific duplication happened in M. slooffiae, which is the only species that additionally contains the rnpB gene which is uncommon among Basidiomycetes.This gene encodes the RNA subunit of mitochondrial RNaseP, an enzyme necessary for the maturation of the tRNAs 34 .It has been shown that rnpB is a functional gene in Ascomycetes and the protists Reclinomonas americana and Nephroselmis olivacea, and it is considered an ancestral element which was probably lost in most fungal species during evolution 37 .However, the existence of this gene in the intra-IR region of M. slooffiae may suggest a evolutionarily more recent addition to the mitogenome due to the LIR mobility.The LIRs of the remaining species seem to be the result of a continuous evolutionary process of three steps, starting with the duplication and inversion of the ancestral syntenic unit trnH-trnM in the mitogenome of Malassezia sp.CBS17886 (Fig. 3, step 1).As a second step of the process, the duplication of ancestral syntenic unit trnR-trnL-atp9, also found in U. maydis (NCBI Acc No. NC_008368.1)and Pseudozyma sp.(NCBI Acc No. MK714018.1),and the loss of the one trnH copy led to the formation of the LIRs in phylogenetic clades I, II and in M. vespertilionis.Subsequently, in the other species of clade III an inversion of the previous syntenic unit is observed along with the deletion of one of the two trnMs, resulting in an intra-IR region containing only trnM-trnH.In agreement with previous studies, the mechanism might have been recombination 38 which in this case resulted in a cross-over fixation, leading tothe loss of trnM.Interestingly, between closely related species with comparable LIR composition in clades I, II and III, the intra-IR is detected in normal or inverted orientation.Studies focusing on intra-species variation confirmed the presence of multiple Table 2. Predicted G4s with the highest score in G4hunter tool for each Malassezia species.G4 length, sequence, score and location on the mitogenome is provided.G4 residues are colored based on their contribution to the final score, i.e., dATPs and dTTPs are neutral (black color), dGTPs are positive (red color) and dCTPs negative (blue color).The window size, in which the best G4 was predicted, is also indicated.M. globosa and M. equina are not included since the score of the predicted G4s did not exceed 1.2.www.nature.com/scientificreports/mt genome rearrangements in various strains of one species, resulting from recombination between LIRs 3,26 .This could possibly apply for all Malassezia species but requires further exploration.The LIR variability in Malassezia is most probably the main course for the diverse synteny seen within the genus.This hypothesis is in agreement with the work of Liu and colleagues (2020) who demonstrated the presence of two different mtDNA genotypes in the same cell of Agrocybe aegerita, as a result of intramolecular recombination 39 .Up to date, LIRs have only been described for a few fungal genera, including several Candida species 28,38,40 , Saccharomyces cerevisiae 41 , Hanseniaspora sp.(i.e., Kloeckera africana) 42 and several species of Agaricales 43 .Mito-LIRs can be found in a certain array of "lower" eukaryotes, like the oomycetes Achlya ambisexualis 44 and the protozoan Tetrahymena pyriformis in which the mitogenome is linear 45 .Previous studies showed that LIRs are recombination hotspots leading to the initiation of replication in Candida albicans 38 .A comparative study exploring the mt genomes of eight Candida species determined that LIRs may be involved in the so-called flip-flop recombination resulting in genome isomers and it was demonstrated that LIRs are involved in conversions between circular and linear mitochondrial topologies and in extent they may play a role in replication 40 .A possible mode of LIR introduction into mitochondria is plasmid integration 46 , although this study does not confirm this mechanism of origin for Malassezia, since no DNA polymerase gene or other plasmid related elements were found in all species examined.The lack of LIRs in two Malassezia species indicates that replication is not performed due to the LIRs.Finally, the LIR variation with its expansion and/or contraction within the genus Malassezia suggests that it is an ongoing process with back-and-forth results as observed in the different species.

Species
In general, fungal mitogenome analyses present considerable genome size variation, which until now has been contributed to their sequence alterations resulting from mutation and recombination 5,47 .On the other hand, in plants and more specifically in chloroplast genomes, size diversity may be explained by sequence complexity as well as by the amount of repeated DNA 48 .Chloroplast genomes (cpDNA) usually present a commonly found structure of two single regions divided by LIRs 49 allowing cpDNA to evolve under strong constraint-either mechanistic or selective-in order to maintain a compact, largely genic genome 48 .In fungi, the lack of LIRs in the majority of the species indicates a free evolution of the mitogenomes with respect to their size, with other proposed underlying mechanisms 34,50 .
Additionally, the existence of inverted repeats is strongly related to the high number of G4s in the S. cerevisiae mitogenome 8,11 .In the current study, a similar observation is made for the mitogenomes of the genus Malassezia (Fig. 4).In species which do not possess LIRs, G4 formation can occur in intergenic regions, and in some cases in trn genes or within introns (Fig. 4).Interestingly, setting the window size to 25, potential G4 structures were overlapping with the trnP gene in Malassezia's later diverging phylogenetic clades I and II.This phenomenon may be additional evidence to the hypothesis of trn genes acting as recombinational hotspots for mitogenome gene shuffling 5,51 .Moreover, this distribution contradicts with the respective G4 analysis in S. cerevisiae, where G4s did not overlap with trns 11 .G4 DNA structures have already been related to DSBs 9,52 .Their location within the LIRs, which have been implicated in recombination events whenever DSBs exist, offer a repair mechanism and contribute to genome stability, despite the G4 DNA implication to deleterious effects on mtDNA 52 .
The following hypothesis concerning fungal mitogenome evolution is presented: In contrast to the cpDNAs, mitogenomes rarely contain cpDNA-like structures, and whenever found in fungi they are tandemly scattered, i.e., in different orders/subphyla which are also phylogenetically distant.Thus, their acquisition must be a recent event which happened several times during fungal evolution, as phylogenetically scattered species carrying LIRs in their mt genome suggest.The G4 DNA structures seem to have coevolved with the LIRs, whenever both exist.In other words, they coexisted to act as a unit performing recombination and repair to stabilize the mitogenome.
In conclusion, Malassezia's mitogenome analyses showed the existence of newly acquired elements like LIRs and G4 DNAs, which converged and provided a reliable recombination-based mechanism to facilitate genome stability.This work adds to our understanding of the mechanisms that drive mitogenomic evolution.

Materials and methods
Malassezia species and strains.In this study 28 Malassezia strains, corresponding to all known species of the genus, including a putative new species from bat, were used in the in-silico comparative mt genome analysis.In detail, 11 mt genomes were retrieved from in house generated whole genome sequencing data (WGS) (Table 1).
Culture and DNA extraction.Malassezia strains were grown on modified Dixon agar 53 for 48-96 h at 30 °C (M.cuniculi on Leeming and Notman agar for 7 d at 36 °C; CBS 17886 for 7 d at 24 °C), and cells were harvested into 50-mL tubes.Yeast cells were lysed, following the Qiagen genomic DNA purification procedure for yeast samples (Qiagen, Hilden, Germany), with some modifications.Lyticase incubation was performed for 2 h at 30 °C, and RNase/Proteinase incubation was performed for 2 h at 55 °C.Genomic DNA was purified using Genomic-tip 100/G prep columns, according to the manufacturer's instructions.
Whole genome sequencing.Samples were sequenced on the Oxford nanopore PromethIon platform, and libraries were prepared using the standard Native Barcoding Genomic DNA Ligation Protocol.Base calling was performed on the PromethIon (Release 21.02.7) with high accuracy setting.
Long read assembly.De novo assembly was carried out by Flye version 2.9 54 .The "-nano-raw" flag was used; genome size was set to 8 Mb and polishing iterations to 2. www.nature.com/scientificreports/Mitochondrial DNA assembly.For most strains, mitochondrial DNA was found in multiple fragments in the assembled genomes.To retrieve it in one piece, the following process was followed: (1) raw reads were mapped against the assembled genome, using minimap2 version 2.17-r941, (2) reads that mapped against chromosomal fragments were discarded; (3) remaining reads were reassembled using both Flye 2.9 54 , setting "-asmcoverage" to 50, and canu 2.2 55 .Since reads were not 100% mitochondrial and the exact size of the remaining chromosomal fragments was not known, multiple assemblies took place for every strain.Genome size was set to 50 kb, 500 kb or 1000 kb, until the mtDNA was retrieved in one scaffold.Where Illumina data were available, mtDNAs were assembled using GetOrganelle v1.7.5 using recommended parameters 56 .The obtained mt scaffolds were additional validated by aligning them against known Malassezia mt genomes using tBLASTn/ BLASTx/BLASTn 57 and were further analyzed as described below.
Mitochondrial genome annotation and comparative analyses.The mt scaffolds were annotated as follows: the protein coding and the ribosomal (rRNA) genes were identified using BLASTx and BLASTn, respectively.The tRNA genes were detected using the web-based tRNAScan-SE platform 58 .Intron characterization was performed by RNAweasel online tool 59 and the intron borders were also verified manually.The open reading frames (ORFs) were identified using ORF finder (https:// www.ncbi.nlm.nih.gov/ orffi nder/) setting "ORF start codon to use'' parameter as "ATG and alternative initiation codons", and the genetic code employed was "The Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code" (NCBI transl_table=4).The 11 newly annotated mt genomes of this study were deposited to GenBank (acc Nos: ON585844-ON585854).The sequence of the nine conserved mt gene blocks identified in synteny analysis, were collected, and aligned by multiple sequence alignment program MAFFT using the E-INS-i method 60,61 in order to identify the mt sequence diversity in Malassezia spp.
Phylogenetic analyses.Amino acid sequences as produced by the 15 mt protein coding genes (PCGs) (i.e., atp6, atp8-9, cob, nad1-6, nad4L, cox1-3 and rps3) were collected from the 28 Malassezia mt genomes for phylogenetic purposes (Suppl.File S2).Ustilago maydis strain 521 was used as an outgroup (GenBank Acc. No. NC_008368) in all analyses besides the phylogenetic one.The sequences for each protein were aligned by Lasergene's MegAlign v.11 program (10.1385/1-59259-192-2:71) using the ClustalW method with default settings.The concatenated dataset was created for phylogenetic purposes.The phylogenetic tree construction was performed using Neighbor Joining (NJ) and Bayesian Inference (BI) methods through PAUP4 62 and MrBayes 63 software, respectively.For NJ analyses, reliability of nodes was evaluated using 10.000 bootstrap iterations for all concatenated and individual datasets.For BI analyses, the determination of the evolutionary model, which was best suitable for each dataset, was performed using the program ProtTest (ver.3) 64 .The BIC Information Criterion was applied, and the best nucleotide substitution model was found to be LG + I + G + F. Four independent MCMCMC analyses were performed, using 5 million generations and sampling set adjustment for every 100,000 generations.The remaining parameters were set to default.
LIRs and G-quadruplex prediction.The LIR of each genome was identified in sequence level using Blastn against itself and the identical inverted repeats were collected.For the prediction of G-quadruplexes in Malassezia's mitogenomes, G4Hunter web application, a rewrite of "G4 hunter Python implementation by Jean-Louis Mergny and Amina Bedrat" algorithm, were used setting the window (a) in 25 bp and (b) in 30 bp, and the threshold/G4Hunter score to 1,2 65,66 in both cases.
www.nature.com/scientificreports/Long inverted repeats (LIRs) contribute to the conserved synteny of the more recently evolved phylogenetic clades.Nearly all Malassezia species contained a LIR varying in size between 2098 bp for M.

Figure 2 . 4 other
Figure 2. The main syntenic units of Malassezia mitogenomes.The first six units can be found in all Malassezia spp.(i.e., i-vi) while the last three in all except M. cuniculi (i.e., vii-ix).Arrows indicate gene transcription direction.

Figure 3 .
Figure 3. Schematic representation of Malassezia LIR evolution in relation to phylogeny.LIRs synteny is presented next to the respective phylogenetic clade.The intra-IR region can be seen in between black vertical lines.The genes located on the complementary reverse strand are indicated in red coloring and amino acid symbols represent the corresponding tRNA gene (e.g., H for trnH).The main evolutionary steps are shown as different colored symbols of the nodes and clades of the tree.Latin numbers in circles indicate the three different phylogenetic clades.The ancestral gene block of Ustilago maydis is provided.