Genome sequence and annotation of Periconia digitata, a promising biocontrol agent of phytopathogenic oomycetes

The fungal genus Periconia belongs to the phylum Ascomycota, order Pleosporales, family Periconiaceae. Periconia species are found in many habitats, but little is known about their ecology. Several species of this genus produce bioactive molecules. Periconia digitata extracts were shown to be lethal to the pine wilt nematode. Furthermore, P. digitata was shown to inhibit the plant pathogenic oomycete Phytophthora parasitica. Because P. digitata has great potential as a biocontrol agent and high-quality genomic resources are still lacking in the Periconiaceae family, we generated long-read genomic data for P. digitata. Using PacBio HiFi sequencing technology, we obtained a highly contiguous genome assembled in 13 chromosomes and totaling ca. 39 Mb. In addition, we produced a reference transcriptome, based on 12 different culture conditions, and proteomic data to support the genome annotation. Besides representing a new reference genome within the Periconiaceae, this work will contribute to a better understanding of the Eukaryotic tree of life and opens new possibilities in terms of biotechnological applications.

digitata (strain Y3), previously misidentified as Phoma sp. CNCM I-4278 17,18. P. digitata CNCM I-4278 was able to inhibit the growth and cyst germination of the plant pathogenic oomycete Phytophthora parasitica both in vitro and in planta, without phytotoxicity 17,18. In addition to anti-oomycete activity, the water filtrate and/or the crude extract of P. digitata CNCM I-4278 also inhibited the growth of several phytopathogenic fungi 17. In another screening of culture filtrates from fungi isolated from freshwater submerged wood, a P. digitata strain was highly active against the fungivorous and phytophagous nematode Bursaphelenchus xylophilus, which is responsible for dramatic losses in pine forests. Indeed, 70% to 80% of nematodes were killed 48 h after treatment with P. digitata extracts 19.
Multi-omic resources. Owing to its high potential as a biocontrol agent for plant protection against several classes of problematic plant pathogens, we present the chromosome-scale genome assembly and annotation of P. digitata CNCM I-4278. To date, only two other genomes were available in the genus Periconia and the family Periconiaceae: those of Periconia macrospinosa 20 and Periconia sp. R9002. However, these genome assemblies, made of hundreds of contigs/scaffolds, are far more fragmented than the genomes that can now be obtained using long-read technologies. Therefore, we used PacBio HiFi sequencing technology to produce highly accurate long reads and assembled the P. digitata genome in 13 haploid chromosomes with a total length of ca. 39 Mb. The genome annotation was guided by the assembly of a reference transcriptome combining the transcriptomes of 12 different culture conditions, including stresses, that led to very different mycelium phenotypes. The annotation revealed 15,520 protein-coding genes, and conserved InterPro domains could be identified in 60% of the proteins. In addition, we carried out Nano-HPLC-HRMS analyses to characterize the proteome of P. digitata and strengthen our functional annotation. The proteomic analysis retrieved and confirmed more than one-third of the predicted proteins regardless of the chosen parameter (1 or 2 unique peptides).
Overall, this work generated a high-quality genome complemented with rich transcriptomic and proteomic data that will be useful to future research (Fig. 1). This study will constitute an important resource for our further understanding of the Eukaryotic tree of life and for future comparative genomics 21. This will help to better delineate evolution within fungi, a kingdom whose systematics is constantly revisited on the basis of molecular markers rather than life history traits. This work will also contribute new insights into the Pleosporales, potentially the largest order of Dothideomycetes, which accounts for more than 300 genera and 4,700 species 22 but for which only 109 sequenced genomes were listed in the MycoCosm portal (doe.gov) in 2022.

Methods
Strain identification. The strain Phoma sp. CNCM I-4278 was previously isolated from the rhizosphere of Nicotiana tabacum (cv Xanthi, Solanaceae) grown under controlled conditions 17. The fungus was identified according to the closest similarity of its 18S rRNA sequence (HM161743 23 ) with those present in GenBank at that time, and it was deposited in the National Collection of Institut Pasteur (CNCM I-4278) 17,18. This first identification was uncertain since the alignment showed more than 30 mismatches with a putative Phoma sp. However, the increasing availability of fungal sequences in the database allowed a taxonomic reevaluation of the strain.
The fungus was cultivated for 2 weeks on Petri dishes containing Potato Dextrose Agar (PDA: 20 g glucose, 4 g potato extract, 15 g agar, up to 1 L Milli-Q ® water) in order to obtain sufficient biomass for DNA extraction. The DNA extraction protocol was optimized in our laboratory starting from previous works 24-26. Briefly, about 100 mg of mycelium was placed in 2 mL Eppendorf tubes with two steel beads and disrupted in an MM301 tissue lyser (Retsch GmbH, Haan, Germany). Then, 1 mL of CTAB lysis buffer (28 mM NaCl, 2 mM Tris-base, 0.4 mM Na2EDTA, pH 8) plus 2% polyvinylpyrrolidone (PVP) was added to the samples. The tubes were incubated at 65 °C for 2 h. Then, 400 μL of chloroform:isoamyl alcohol (24:1 v/v) was added to the samples, which were vortexed and centrifuged for 5 min at 13,000 rpm. The supernatant (600 μL) was transferred to a new Eppendorf tube and 100 μL of 10 M ammonium acetate was added; the solution was gently mixed and the samples were incubated at 4 °C for 20 min. The mixture was vortexed and centrifuged for 10 min at 13,000 rpm, the supernatant (650 μL) was transferred to a new Eppendorf tube and one volume of isopropanol (kept at −20 °C) was added. The sample was gently mixed and incubated overnight at −20 °C in order to precipitate the DNA. The final pellet was collected by centrifugation for 5 min at 13,000 rpm at 4 °C. The supernatant was discarded, and the pellet was washed with 500 μL of 75% aqueous ethanol (kept at −20 °C) and recovered by centrifugation for 2 min at 13,000 rpm. The supernatant was discarded, and the pellet was dried under the airflow of a chemical hood. Then, the DNA was resuspended in 30 μL of sterile Milli-Q ® water.
The DNA quality and quantity were evaluated using a NanoDrop 2000 (Thermo Scientific, Wilmington, USA). The DNA was stored at −20 °C.
The obtained DNA was used to amplify partial sequences of two genetic markers. The primer pairs ITS1/ITS4 27 and LR0R/LR5 28 were used to amplify the internal transcribed spacers and the 28S large ribosomal subunit (nrLSU) region, respectively. The PCR reaction was performed in 25 μL final volumes and consisted of 12.5 μL GoTaq ® G2 Hot Start Colorless Master Mix (2X, Promega), 1 μL of each primer (10 μM), 5 μL genomic DNA extract (10 ng/μL) and 5.5 μL nuclease-free water. PCR products were separated by electrophoresis on a 2% agarose gel in 0.5X Tris-acetate-EDTA buffer. The gel was stained with ethidium bromide and the PCR products were visualized under UV light. The PCR products were sequenced (accession numbers OP329216 29 for ITS and OP329219 30 for 28S) and used to build a Maximum Likelihood (ML) phylogenetic tree with MEGA X 31. The dataset used to build the tree is shown in Table 1. Among the 47 Periconia species with sequences deposited in GenBank, only 27 had both ITS and 28S sequences of sufficient length (Table 1) to build trees from trimmed and concatenated ITS-28S sequences. For sequence selection, priority was given to strains cultured from the holotype or considered "reference strains" in culture collections. The phylogenetic tree was inferred by the ML method with the Tamura-Nei model of evolution 32. The tree with the highest log likelihood (−6834.70) is shown (Fig. 2). The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood approach and then selecting the topology with the superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 0.3430). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 29.28% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.
The resulting tree allowed the unambiguous identification of the strain Phoma sp. CNCM I-4278 as Periconia digitata, the name used throughout the rest of this work.
Periconia digitata culture conditions. Unless otherwise stated, P. digitata was grown in 6-well plates containing 5 mL of sterile media (Table 2) that were inoculated with a spore suspension (1.5 × 10⁴ spores/mL final concentration). For the high molecular weight DNA extraction, the fungus was grown in Potato Dextrose Broth (PDB: potato extract 4 g/L, dextrose 20 g/L) and incubated at 24 °C, in the dark, for 7 days, prior to DNA extraction.
In order to generate a reference transcriptome of P. digitata, different culture conditions were selected and several stress factors were applied to capture a variety of transcripts and generate a transcriptome as comprehensive as possible. We applied (or not) light, salt, heat, cold, oxidative and heavy metal stresses. We also introduced complex media of plant origin that induced very different phenotypes in terms of mycelium organization. The liquid culture conditions were prepared in 6-well plates with two wells for each condition in the dark. A solid culture was also prepared in Petri dishes (9 cm Ø); the plates were inoculated in three spots with 10 µL of the same spore suspension used for the liquid culture.
The 12 culture conditions, the stress factors and when they were applied are detailed in Table 2. The plates were incubated for 8 days prior to RNA extractions.
High molecular weight DNA extraction. High molecular weight DNA extraction was performed using the MasterPure ™ Complete DNA and RNA Purification Kit from Epicentre. Seven-day-old mycelia of P. digitata were retrieved from a 6-well plate on PDB as described above and directly ground in liquid nitrogen with sterilized pestles and mortars. The resulting powder was transferred into six microtubes, each containing 1 μl of proteinase K and 300 μl of tissue and cell lysis solution, and homogenized. All homogenisation steps were performed gently to avoid DNA fragmentation. Tubes were incubated at 65 °C for 15 minutes, then cooled down to 37 °C before adding 1 μl of 5 μg/μl RNase A and finally incubated for 30 minutes at 37 °C. Samples were left on ice for 5 minutes and 175 μl of MPC protein precipitation reagent was added to each sample and mixed. The debris were pelleted by centrifugation at 4 °C for 10 minutes at ≥10,000 × g. The supernatant was transferred to a clean microcentrifuge tube. Then, 500 μl of isopropanol was added and tubes were inverted several times before centrifugation at 4 °C for 10 minutes. Isopropanol was removed and pellets were rinsed 2 times with 70% ethanol and left to dry before solubilization with 35 μL of EB buffer (Qiagen). The six samples were pooled together and purified.
DNA was analyzed for quality and quantity controls using nanodrop, Qubit and fragment analyser.The final sample used for library preparation contained 19.2 µg of gDNA with an average fragment size of 30,145 bp.
RNA extraction for transcriptome sequencing. Mycelia from the 12 different culture/stress conditions (Table 2) were retrieved separately and directly ground in liquid nitrogen with sterilized pestles and mortars. The resulting powder for each condition was transferred into 2 to 3 microtubes, depending on the sample quantity. Then, 800 µl of extraction buffer (CTAB 2.5%, PVPP 2%, Tris-HCl 100 mM, EDTA 25 mM, NaCl 2 M, β-mercaptoethanol 2%) was added to each tube. After 30 minutes of incubation at 65 °C, 800 µl of chloroform/isoamyl alcohol (CI; v/v; 24/1) was added, and after homogenization, the tubes were centrifuged (16,000 g, 8 min, 4 °C). The supernatant was retrieved and the same volume (ca. 700 µl) of water-saturated phenol (pH 4.5-5)/chloroform/isoamyl alcohol (PCI; v/v; 25/24/1) was added and centrifuged. A second step with CI was carried out and the supernatant was retrieved, mixed with 500 µl of 5 M NaCl and 500 µl of isopropanol, and stored overnight at −20 °C. After centrifugation (16,000 g, 20 min, 4 °C), two cleaning steps were carried out with 70% ethanol. The pellet was dried and resuspended in 30 µl of buffer EB (Qiagen). Genomic DNA was removed from samples using the TURBO DNA-free kit (Ambion) following the supplier's instructions. Sample purity and quality were assessed with a NanoDrop and a Bioanalyzer. The samples with the best qualitative parameters (1.98 < OD 260/280 < 2.07; 1.72 < OD 260/230 < 2.19; 2.5 < RIN < 5; 3.6 µg < RNA quantity < 15.8 µg) in each condition were kept for library preparation (Table 3).

DNA sequencing. Library preparation and sequencing were performed at the GeT-PlaGe core facility, INRAE Toulouse, according to the manufacturer's instructions "Procedure & Checklist Preparing HiFi SMRTbell Libraries using SMRTbell Express Template Prep Kit 2.0". At each step, DNA was quantified using the Qubit dsDNA HS Assay Kit (Life Technologies). DNA purity was tested using a NanoDrop (Thermofisher) and size distribution and degradation were assessed using the Femto Pulse Genomic DNA 165 kb Kit (Agilent). Purification steps were performed using AMPure PB beads (PacBio). 15 µg of DNA was purified then sheared at 20 kb using the Megaruptor1 system (Diagenode). Using the SMRTbell Express Template Prep Kit 2.0, single-strand overhang removal and DNA and end damage repair steps were performed on 5 µg of sample. Then blunt hairpin adapters were ligated to the library. The library was treated with an exonuclease cocktail to digest unligated DNA fragments. A size selection step using a 9 kb cutoff was performed on the BluePippin Size Selection system (Sage Science) with the "0.75% DF Marker S1 High Pass 15-20 kb" protocol. Using Binding kit 2.0 and Sequencing kit 2.0, the primer V2-annealed and polymerase 2.0-bound library was sequenced by diffusion loading onto 1 SMRTcell on a Sequel2 instrument at 95 pM with a 2-hour pre-extension and a 30-hour movie.

RNA sequencing.
RNA-seq was performed at the GeT-PlaGe core facility, INRAE Toulouse. RNA-seq libraries were prepared according to Illumina's protocols using the Illumina TruSeq Stranded mRNA sample prep kit to analyze mRNA. Briefly, mRNAs were selected using poly-T beads. Then, RNAs were fragmented to generate double-stranded cDNA and adaptors were ligated for sequencing. Eleven cycles of PCR were applied to amplify libraries. Library quality was assessed using a Fragment Analyzer and libraries were quantified by qPCR using the Kapa Library Quantification Kit. RNA-seq experiments were performed on an Illumina NovaSeq 6000 using a paired-end read length of 2 × 150 bp with the Illumina NovaSeq 6000 sequencing kits.
Genome assembly, QC, Contamination. PacBio HiFi reads with a median accuracy of at least 99.9% (Q30) were used as input for the HiCanu assembler 35. Post-assembly quality control and taxonomic partitioning were assessed with BlobTools 36,37. Previously quality-filtered PacBio HiFi reads were mapped back to the assembly with minimap2 38 to estimate contig coverage. Each contig was assigned a taxonomic affiliation based on BLAST 39,40 results against the NCBI nt database.
Telomere detection. Terminal telomeric repeats were searched using tidk software v0.1.5 (https://github.com/tolkit/telomeric-identifier). The tidk explore module was used to search the genome for repeats of length 5 to 10. Positions of repeats are only reported if they occur sequentially in a higher number than the threshold of 5. The most represented repeat unit was AACCCT, with a maximum frequency of 209. Then, this putative telomeric repeat was scanned on the contigs with the tidk search module using a window size of 150 to calculate repeat counts. This information was then used as input for the tidk plot module to visualize the positions of the putative telomeric repeats along each contig sequence.
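The windowed counting idea behind this step can be illustrated with a short Python sketch. It is not the tidk implementation: the function names, the choice to inspect only the first and last window of a contig, and the `min_count` cutoff of 5 copies per terminal window are assumptions made for illustration (the threshold of 5 quoted above applies to the tidk explore step, not to this check). Only the repeat unit AACCCT and the window size of 150 come from the text.

```python
# Minimal sketch of a terminal telomeric-repeat check (illustrative only,
# not the tidk code). Function names and min_count are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_unit(chunk: str, unit: str = "AACCCT") -> int:
    """Count non-overlapping copies of `unit` on either strand of `chunk`."""
    chunk, unit = chunk.upper(), unit.upper()
    rc = reverse_complement(unit)
    total = chunk.count(unit)
    if rc != unit:  # avoid double counting palindromic units
        total += chunk.count(rc)
    return total


def telomeric_ends(seq: str, unit: str = "AACCCT",
                   window: int = 150, min_count: int = 5) -> tuple:
    """Report whether the first and last `window` bases carry the repeat."""
    start_hit = count_unit(seq[:window], unit) >= min_count
    end_hit = count_unit(seq[-window:], unit) >= min_count
    return start_hit, end_hit


if __name__ == "__main__":
    # Toy contig: repeat array, unrelated filler, repeat array on the other strand.
    toy = "AACCCT" * 20 + "ACGT" * 100 + "AGGGTT" * 20
    print(telomeric_ends(toy))
```

A contig returning `(True, True)` in such a check would be a candidate for a chromosome assembled from telomere to telomere, which is the kind of evidence summarized in Table 5.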
Gene prediction.Gene models prediction was done with the fully automated pipeline EuGene-EP version 1.6.5 41 .EuGene has been configured to integrate similarities with known proteins of "ascomycota" section of UniProtKB/Swiss-Prot library (UniProt Consortium 2018 42 ), with the prior exclusion of proteins that were similar to those present in RepBase 43 .
The dataset of Periconia digitata transcribed sequences generated in this study was aligned on the genome and used by EuGene as transcription evidence. For this, we first assembled de novo, using Trinity 44, the transcriptomes of P. digitata obtained from the twelve above-described conditions, and for a given Trinity locus we only retained the transcript returning the longest ORF. Finally, only de novo assembled transcripts that aligned on the genome over at least 30% of their length with at least 97% identity were retained.
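The two-step selection described above (longest ORF per Trinity locus, then an alignment filter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Transcript` record, its field names and the `select_evidence` function are hypothetical, and the alignment statistics are assumed to have been computed beforehand. Only the two cutoffs (30% of transcript length, 97% identity) are taken from the text.

```python
# Illustrative sketch of the transcript selection rules; the record layout
# and function name are hypothetical, the thresholds come from the text.
from dataclasses import dataclass


@dataclass
class Transcript:
    tid: str             # transcript identifier
    locus: str           # Trinity locus the transcript belongs to
    length: int          # transcript length (nt)
    orf_length: int      # length of its longest ORF (nt)
    aligned_length: int  # transcript bases aligned on the genome
    identity: float      # alignment identity, in percent


def select_evidence(transcripts, min_cov=0.30, min_identity=97.0):
    """Keep one transcript per locus (longest ORF), then filter on alignment."""
    best = {}
    for t in transcripts:
        current = best.get(t.locus)
        if current is None or t.orf_length > current.orf_length:
            best[t.locus] = t
    return [
        t for t in best.values()
        if t.aligned_length / t.length >= min_cov and t.identity >= min_identity
    ]


if __name__ == "__main__":
    toy = [
        Transcript("t1", "locusA", 1000, 600, 900, 99.0),
        Transcript("t2", "locusA", 1200, 900, 1100, 98.5),  # longest ORF in locusA
        Transcript("t3", "locusB", 800, 500, 200, 99.5),    # only 25% aligned
        Transcript("t4", "locusC", 500, 300, 400, 95.0),    # identity too low
    ]
    print([t.tid for t in select_evidence(toy)])
```

Applying the locus-level choice before the alignment filter mirrors the order given in the text; reversing the two steps could retain a different transcript for loci whose longest-ORF isoform aligns poorly.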
The EuGene default configuration was edited to set the "preserve" parameter to 1 for all datasets, the "gmap_intron_filter" parameter to 1 and the minimum intron length to 35 bp. Finally, the Fungi-specific Weight Array Method matrices were used to score the splice sites (available at this URL: http://eugene.toulouse.inra.fr/Downloads/WAM_fungi_20180126.tar.gz).

Genome and protein set completeness assessment.
We used BUSCO 45 version 5.2.2 in protein and genome modes with the eukaryota odb10 dataset of 255 BUSCO groups and the fungi odb10 dataset of 758 BUSCO groups to assess the completeness of the predicted protein set as well as the genome assembly.We compared BUSCO scores to those obtained for the Periconia macrospinosa genome and predicted proteins 20 .

Functional annotation.
All predicted proteins were scanned for the presence of conserved protein domains and motifs using InterProScan v.5.51-85.0 46 with the options -iprlookup, -goterms and -pa to assign Gene Ontology (GO) terms, MetaCyc and Reactome biochemical pathways based on detection of InterPro domains.

Gene prediction and functional annotation of mitochondrion.
The annotation was performed using MITOS2 47 including the ncRNA (tRNA and rRNA) and the protein coding sequences with the codon usage number 4. The gene predictions were refined using the assembled Trinity transcripts aligned on the mitogenome by direct translation in ORFfinder as well as annotation with smartBlast (https://www.ncbi.nlm.nih.gov/orffinder/), and intron reconstruction with BioEdit v 7.0.5.3 48. We also compared our results to the following Pleosporales available mitogenomes: NC_058694 (Edenia gomezpompae, 37

Protein extraction and sample preparation for proteomics analysis. P. digitata was cultivated, in 500 mL plastic Roux bottles, in 50 mL of five different sterile liquid media (RMI free of asparagine and vitamins, RMI supplemented with B vitamins, RMI plus wheat peptone, RMI plus Citrus pectin, RMI plus Guar gum, RMI plus malt) over 7 days in the dark at 24 °C. The media were chosen for their ability to change the strain phenotype (see RNA extraction above); two biological replicates were prepared. The mycelium was recovered by filtration on a GF/C Whatman glass filter and rinsed with water. It was ground using liquid nitrogen then sequentially extracted using successive buffers, starting with a Tris buffer 20 mM pH 8, 10 mM DTT (dithiothreitol), then the same buffer supplemented with 200 mM NaCl, the same buffer supplemented with 8 M urea and finally the same buffer with 6 M guanidine hydrochloride (1 mL per 300 mg fresh material weight). Centrifugations (13,200

The timsTOF Pro was equipped with the CaptiveSpray nano-electrospray ion source. MS and MSMS data were acquired in positive mode, in PASEF (Parallel Accumulation - Serial Fragmentation) data-dependent acquisition (DDA), TIMS ON mode, from 100 to 1700 m/z mass range (TimsControl version 2.0.53.0). Ion mobility resolution (1/K0) was set to 0.70-1.10 V·s/cm² over a ramp time of 180 ms. To exclude low m/z, singly charged ions from PASEF precursor selection, a polygon filter was applied in the m/z and ion mobility space.
Analyses were processed by Peaks Studio X Pro (version 10.6, Bioinformatics Solutions Inc.). MSMS raw data were processed using Peaks solution applying 3 levels of identification. MSMS spectra were matched against the P. digitata Y3 predicted proteome (merged core and mitochondrial genomes), including the MaxQuant contaminant database (contaminants.fasta, MaxQuant 2.1). Parameters were set as follows: protein FDR <1%, decoy fusion method, cysteine carbamidomethylation as a fixed modification, 2 miscleavages, 3 to 5 post-translational modifications (PTM) per peptide. The MSMS spectra that did not match with these parameters were processed again, looking for other possible PTM and amino acid mutations to enrich the list of identified proteins. The obtained results were merged in a list of significant proteins 51. The mass spectrometry proteomic raw data are available in the ProteomeXchange Consortium via the PRIDE [1] partner repository with the dataset identifiers PXD038112 53 and PXD038175 54.

All the analyzed data are publicly available at https://entrepot.recherche.data.gouv.fr/dataverse/pdig. The detailed list of analyzed files is presented in Table 3.
The complete mitochondrial genome assembly of P. digitata can be retrieved at the NCBI through the accession number OP787475 49 .

Technical Validation
Contamination assessment. BlobTools analysis showed that all the contigs formed a dense blob at a homogeneous coverage (550X) and GC content (49%), indicating no evidence for contamination (i.e., no contig deviates from this distribution) (Fig. 3). Moreover, the taxonomic affiliation analysis based on homology, using BLAST against the NCBI nt library, showed that all contigs are of Ascomycota origin, which is consistent with the absence of evident contamination (Fig. 4).
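The reasoning behind this screen (contaminant contigs tend to stand apart in GC content and read coverage) can be illustrated with a small Python sketch. It is not the BlobTools implementation: the function names, the median-based rule and the tolerances `gc_tol` and `cov_fold` are arbitrary assumptions chosen for illustration, and the contig values in the example are invented.

```python
# Illustrative GC/coverage outlier screen; thresholds and data are hypothetical.
from statistics import median


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def flag_outliers(contigs: dict, gc_tol: float = 0.10, cov_fold: float = 2.0) -> list:
    """Return names of contigs far from the median GC fraction or coverage.

    `contigs` maps a contig name to a (gc_fraction, coverage) tuple.
    """
    med_gc = median(gc for gc, _ in contigs.values())
    med_cov = median(cov for _, cov in contigs.values())
    flagged = []
    for name, (gc, cov) in contigs.items():
        off_gc = abs(gc - med_gc) > gc_tol
        off_cov = cov > med_cov * cov_fold or cov < med_cov / cov_fold
        if off_gc or off_cov:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Invented values: two contigs near 49% GC and ~550X, one clear outlier.
    toy = {"c1": (0.49, 550.0), "c2": (0.50, 560.0), "c3": (0.35, 40.0)}
    print(flag_outliers(toy))
```

An empty list from such a screen would correspond to the single dense blob described above. Note that repeat-rich contigs (for example rDNA arrays) can legitimately deviate in coverage, so flagged contigs call for inspection rather than automatic removal.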

Genome size estimation and de novo assembly.
Based on k-mer multiplicity distribution using GenomeScope2, both 1n and 2n models converged in showing one single peak at a very high coverage of ca. 560X. The GenomeScope model fit values were slightly higher for the haploid (1n) model (92.59%-94.61%) than for the diploid (2n) model (92.59%-94.31%). Collectively, these results strongly suggest a haploid genome sequenced at a very high coverage. The haploid 1n model returned an estimated genome size of ca. 36 Mb with an error rate of ca. 0.43% (Fig. 5).
The HiCanu assembler yielded a genome assembly that was ca. 39 Mb long, consistent with the haploid genome size estimated with k-mers. The genome was assembled in 15 contigs with an N50 value of 3 Mb and an L50 of 5 (i.e. half of the genome is present in the 5 biggest contigs) (Table 4). Among the 15 contigs, HiCanu generated two outliers of 27-28 kb (Pdig14 and Pdig15, Table 5). Both contained only rDNA repeats that can be partially aligned at the end of Pdig08, which exclusively exhibits rDNA repeats.
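For readers unfamiliar with these contiguity metrics, the following Python sketch shows how N50 and L50 are derived from a list of contig lengths. The function name is hypothetical and the lengths in the example are toy values, not those of Table 5.

```python
# N50: length of the contig at which the running sum of lengths, taken from
# the largest contig downwards, first reaches half of the assembly size.
# L50: number of contigs needed to reach that point.
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, rank


if __name__ == "__main__":
    # Toy lengths only (not the P. digitata contigs).
    print(n50_l50([5, 4, 3, 2, 1]))
```

With this definition, an L50 of 5 means that the five largest contigs together hold at least half of the assembled bases, as stated above.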
The repeat sequence (AACCCT)n we have identified at the terminal regions of the contigs corresponds to the reverse complement of the (TTAGGG)n telomeric repeat widely conserved in vertebrates, many other animals, plants as well as several different eukaryotes, including fungal species.
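The correspondence between the two notations can be checked explicitly. Strictly speaking, the reverse complement of the TTAGGG unit is CCCTAA; the detected unit AACCCT is the same tandem array read in a different phase (a rotation of CCCTAA). The short Python sketch below verifies both points; the function names are hypothetical.

```python
# Check that (AACCCT)n and the reverse complement of (TTAGGG)n describe the
# same tandem repeat. Helper names are illustrative.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def same_tandem_repeat(unit_a: str, unit_b: str) -> bool:
    """True if unit_b is a rotation of unit_a, i.e. (a)n and (b)n coincide."""
    return len(unit_a) == len(unit_b) and unit_b in unit_a + unit_a


if __name__ == "__main__":
    rc = reverse_complement("TTAGGG")
    print(rc, same_tandem_repeat(rc, "AACCCT"))
```

This is why repeat finders may report the same telomeric motif under several equivalent names depending on the strand and the starting position within the unit.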
It is worth noting that telomeric repeats were detected at both ends of 11 out of 13 contigs. Telomeric repeats were detected at only one end of the contigs Pdig06 and Pdig08 (Table 5). Overall, we obtained a highly contiguous genome assembly for Periconia digitata that is structured in 13 chromosomes in its haploid mycelium with a total length of 38,967,494 bp.
Genome completeness assessment. BUSCO (v5.2.2) analysis at the genome level indicated that 99.6% and 99.1% of nearly universal single-copy genes from the eukaryota and fungi datasets, respectively, were retrieved in full length. Only 0.4% and 0.7% of eukaryotic and fungal BUSCO genes were identified as duplicated, consistent with a genome assembled in a haploid state with no evidence for substantial gene duplications (Table 6). Although the genome of P. macrospinosa was considerably more fragmented, its gene content completeness was comparable to that of P. digitata according to BUSCO metrics.
Functional annotation. Conserved InterPro domains and motifs were identified in 61.1% of the 15,520 predicted protein sequences. The 9,479 annotated proteins returned 7,713 different InterPro domains 56.
The top 15 InterPro homologous superfamilies and domains contained some large gene families found in most organisms together with domains restricted to fungi. Among the detected domains, we noticed the presence of the HET (Heterokaryon incompatibility) domain (Fig. 6), which is specific to Ascomycota 57, and of domains associated with key enzymes for the biosynthesis of many secondary metabolites in fungi, such as the beta-ketoacyl synthase or the polyketide synthase, enoylreductase domains 58. Using SignalP (v6.0) 59, we assessed the number of predicted proteins that contained a putative signal peptide for secretion. Among the 15,520 proteins, 1,597 (10.3%) were predicted to display a signal peptide and thus to be potentially addressed to the extracellular space or the membrane. This value is in the upper part of the range observed in fungal proteomes, e.g. 8.5% in Trichoderma asperellum 60, 1.1% to 12% in 132 Zygomycota proteomes 61, and 3% to 10% in 49 fungal proteomes 62.
Proteomics support of predicted proteins. Of the 15,551 predicted proteins, 6,598 (42.4%) returned matches with at least 2 unique peptides (−10lgP > 50) at a maximum FDR of 1%. The identification reached 46.9% after inclusion of proteins identified with a single peptide (698, −10lgP > 50). This value is in line with the numbers observed among the top 15 InterPro homologous superfamilies and domains identified (Fig. 7). Although 6,041 proteins (38.9% of the predicted ones) returned no InterPro annotation, 883 of them (14.6%) were identified by proteomics, suggesting they are actual proteins with no known conserved domain to date. In addition, many post-translational modifications (PTM, A score >20) were detected 51 (Table 8). Phosphorylation mainly concerned proteins predicted to be involved in transport (13), ubiquitin trafficking (13), cytoskeleton (12) and transcription (8). Acetylation and methylation mainly concerned proteins predicted to interact with nucleic acids (from histones to ribosomal proteins) or to have diverse enzymatic activities.
Mitochondrial genome assembly and annotation. The assembled mitochondrial genome 49 was 76,558 bp long with 27.56% of GC (Fig. 8). As for other Pleosporales mitogenomes, we retrieved a complete set of tRNA, among which some were in multiple copies (Met, Ser, Leu), as well as rnl and rns. Out of the 31 predicted proteins encoded in the mitochondrion, 14 were identified by proteomics, including typical enzymes, one ribosomal protein and 3 hypothetical proteins (Table 9).
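The proteomics support percentages reported in this section can be re-derived from the counts given in the text, as the short Python check below shows. All counts are taken from the text; the variable and function names are illustrative. One reading, stated here as an assumption, is that the denominator of 15,551 corresponds to the 15,520 nuclear predictions plus the 31 mitochondrial ones, which would be consistent with the merged search database described in the Methods.

```python
# Re-derivation of the reported percentages from the counts in the text.
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


NUCLEAR_PROTEINS = 15_520      # predicted nuclear protein-coding genes
MITO_PROTEINS = 31             # predicted mitochondrial proteins
SEARCHED = 15_551              # proteins in the proteomics search space
TWO_PEPTIDES = 6_598           # identified with at least 2 unique peptides
ONE_PEPTIDE = 698              # additionally identified with 1 unique peptide
NO_INTERPRO = 6_041            # proteins without InterPro annotation
NO_INTERPRO_CONFIRMED = 883    # of those, identified by proteomics

if __name__ == "__main__":
    print(NUCLEAR_PROTEINS + MITO_PROTEINS == SEARCHED)
    print(pct(TWO_PEPTIDES, SEARCHED))
    print(pct(TWO_PEPTIDES + ONE_PEPTIDE, SEARCHED))
    print(pct(NO_INTERPRO, NUCLEAR_PROTEINS))
    print(pct(NO_INTERPRO_CONFIRMED, NO_INTERPRO))
```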

Fig. 1 Schematic overview of the study design.

Fig. 2 Phylogenetic inference based on a combined ITS and 28S dataset. The tree is rooted to Trematosphaeria pertusa and Penicillium roqueforti. The blue branch highlights the cluster of P. digitata, where our strain P. digitata CNCM I-4278 is positioned. The numbers indicate the percentage of trees in which the associated taxa clustered together (1000 bootstrap). Bar = expected changes per site (0.10).

Fig. 3 BlobPlot of the genome assembly. Each circle is a contig proportionally scaled by contig length and coloured by taxonomic annotation based on BLAST similarity search results. Contigs are positioned based on the GC content (X-axis) and the coverage of PacBio reads (Y-axis).

Fig. 4 ReadCovPlot. Mapped reads are shown by the taxonomic group at the rank of 'phylum'.

Fig. 6 InterPro functional annotation of the P. digitata predicted proteome. The top 15 homologous superfamilies (a) and domains (b) are indicated.

Fig. 7 Top 15 homologous superfamilies (a) and domains (b) obtained from Interpro predictions (blue) together with the corresponding homologous superfamilies and domains in proteins identified by proteomics (orange).

Table 1 .
List of strains with their corresponding sequence accession number used to build the phylogenetic tree of Periconia species. RS: reference strain (holotype not available), CH: culture from holotype, G: available genome, N: neotype.

Table 2 .
Culture conditions used to obtain a reference transcriptome for P. digitata.

Pithomyces chartarum, 69 kb, 37 ORF) in addition to that of Phaeosphaeria nodorum (see mitochondrion assembly). The concatenated file obtained from MITOS2 and protein-coding sequence coordinates was used for GenBank submission (OP787475 49 ) and mitogenome drawing with OGDRAW version 1.3.1 50.
kb, 14 ORF), NC_040008 (Coniothyrium glycines, 98 kb, 35 ORF), NC_026869 (Shiraia bambusicola, 39 kb, 17 ORF) and NC_035636

rpm, 5 min, 4 °C) were achieved after each extraction and the 4 successive supernatants were recovered. The first two buffers (alone and saline) were immediately adjusted to 8 M urea. All these fractions were incubated at 37 °C for 15 min then alkylated with iodoacetamide (41.6 mM) for 15 minutes at room temperature. After two buffer exchanges with trypsin buffer on Vivaspin 15 R (5 kDa) columns (Sartorius), the samples were digested by trypsin with a 1/80 (w/w) trypsin/total protein ratio (Sequencing Grade Modified Trypsin, Promega) according to the manufacturer's recommendations. The residual pellets after successive extractions were suspended in 1 mL of Tris 20 mM pH 8, 10 mM DTT, 8 M urea then incubated at 37 °C for 15 min and alkylated with 41.6 mM of iodoacetamide for 15 min at room temperature with gentle mixing. After centrifugation (see above), the pellets were rinsed and centrifuged twice in the trypsin buffer. The final pellet suspension was directly digested by trypsin (1 µg/pellet) overnight under gentle agitation on a rotator mixer. After centrifugation, the supernatants were recovered. The samples were

Proteomics analysis. The samples were analyzed by nanoUHPLC-HRMS (nanoElute - timsTOF Pro, Bruker Daltonics). 5 µL of sample were injected on an Aurora column (75 µm id × 250 mm, C18, 1.6 µm, ionOpticks) with a flow rate of 200 nL/min at 50 °C. The mobile phase was a gradient of CH₃CN-0.1% FA (B) in 0.1% FA-H₂O (A) as follows: 5% B for 1 min, 5% to 13% of B for 18 min, 13% to 19% of B for 7 min, 19% to 22% of B for 4 min, 22% to 95% of B for 3 min.

Table 5 .
Metrics of Periconia digitata's contigs and telomere detection.

Table 6 .
BUSCO scores for the genome of P. digitata and P. macrospinosa using the eukaryota odb10 and the fungi odb10 datasets.

Table 7 .
BUSCO scores for the proteins of P. digitata and P. macrospinosa using the eukaryota odb10 and the fungi odb10 datasets.

Table 9 .
List of the 14 mitochondrial proteins identified by proteomics.