Background & Summary

Cucurbit powdery mildew (CPM) is a common and severe disease of cucurbit crops. It produces a characteristic white powdery fungal growth on leaves, stems, petioles and, rarely, fruits, which can cover the entire host surface and cause heavy losses in crop yield and quality in most areas of the world1,2,3. Disease management is difficult and often requires numerous applications of plant protection products, including synthetic fungicides, natural substances and microbial antagonists. Several fungi are reported as causal agents of CPM4. In Southern Italy, as in other geographic areas, the main pathogen responsible for the disease is the ascomycete Podosphaera xanthii, an ectophytic and biotrophic pathogen. The fungus has the potential to evolve new, better-adapted genotypes that can overcome the genetic resistance of crop varieties and the efficacy of new fungicides. Sexual reproduction is an important source of genetic variation in pathogen populations. P. xanthii has a heterothallic bipolar mating system, and both mating types (MAT1-1 and MAT1-2) have been detected in fungal populations on both field and greenhouse crops in Southern Italy5.

The genetic structure of P. xanthii populations has been investigated using molecular markers and by analysing specific genes or functional gene categories, such as those responsible for pathogenicity or fungicide resistance6,7. Transcriptomic sequences from a single isolate of P. xanthii, obtained during its ectophytic growth on the host leaf surface, have been used to identify secreted proteins with a putative role in pathogenesis8 and to characterize some of them through host-induced gene silencing (HIGS) mediated by Agrobacterium tumefaciens9.

Additional sources of variation in fungi are transposable elements (TEs) and cytoplasmic genetic elements, including mycoviruses, i.e. viruses infecting fungi. TEs are ‘jumping’ DNA sequences that move from one location in the genome to another and are recognised as extraordinary contributors to genomic variation and evolution in most eukaryotes and prokaryotes10. They also play a role in host-pathogen interactions, since effector genes are located within or close to TE-rich genomic regions of pathogens11. The role of TEs in P. xanthii remains to be investigated. Most mycoviruses have no visible effects on their hosts, but some cause debilitation or reduced virulence and have the potential to be developed as innovative biocontrol agents12. An increasing number of viral genomes from different fungal pathogens have recently been sequenced and deposited in public databases, but no mycoviruses have so far been recorded from P. xanthii.

We report here the Illumina sequencing and de novo assembly of the transcriptome of two P. xanthii isolates of opposite mating type and of their sexual cross, with the aim of obtaining a more comprehensive transcriptome and improving the resources for investigating interactions among P. xanthii, its host plants and the associated microbiome.

Methods

The MAT1-1 reference strain G24 was kindly supplied by Prof. M.T. McGrath (Cornell University, USA), while the MAT1-2 strain 7A was isolated from Cucurbita pepo cv. Roberta in the Apulia region, Southern Italy, in 2014. Both strains are maintained in the fungal collection of the Plant Pathology Section of the Department of Soil, Plant and Food Sciences of the University of Bari and are freely available upon request, without any restriction. Growing conditions on zucchini cotyledons were as described by Miazzi et al.3. For mating, the two strains were paired on single cotyledons (5 mm apart) and grown for 15 days (Fig. 1). Mycelium and conidia of each strain and of their pairing were scraped from the surface of infected cotyledons, and total RNA was extracted using TRI Reagent (Sigma-Aldrich, Milan, Italy) according to the manufacturer’s protocol. cDNA libraries with an average fragment size of 400 bp were prepared using the TruSeq RNA Sample Preparation Kit v2 (Illumina, Inc., San Diego, CA, USA) and sequenced (Illumina Sequencing Technology; HiScanSQ platform; SELGE Network Sequencing Service) to obtain a total of 5.5 Gb, corresponding to 59.53 M reads (92-bp paired-end reads; QS ≥ 30)13 (Table 1). Reads were analysed for quality statistics, nucleotide distribution and redundancy using FastQC14 and trimmed with Trimmomatic15 to discard low-quality reads (less than 2%). Raw reads were aligned against the Cucurbita pepo reference genome v.3.2 using the CLC Genomics Workbench (CLC bio, Aarhus, Denmark) to filter out contaminant sequences from the zucchini cotyledons on which the P. xanthii strains were grown (1.8 to 6.6% of reads in the three libraries). Trinity was used for de novo assembly of the transcriptome from the sequencing data of the three libraries16. To reduce redundancy, the assembled sequences were then merged and reassembled with CAP3, using a minimum overlap length of 50 and at least 95% identity17.
After CAP3 clustering, the resulting contigs (71,095 in total), corresponding to putative transcripts and including isoforms and unigenes (54,561)18, were functionally annotated using local BLAST+19 and Blast2GO PRO to predict Gene Ontology (GO) terms, to assign the assembled sequences to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and to analyse protein domains with the InterProScan tool20.
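
As a quick consistency check on the sequencing yield reported above, the total number of bases can be approximated as the read count multiplied by the read length. The sketch below is illustrative only: it assumes that the 59.53 M figure counts individual reads of a uniform 92 bp, and the function name `total_bases` is hypothetical rather than part of any pipeline used in this study.

```python
def total_bases(n_reads: float, read_length_bp: int) -> float:
    """Approximate sequencing yield in bases (assumes uniform read length)."""
    return n_reads * read_length_bp


# Values taken from the Methods text; uniform 92-bp reads are an assumption.
reads = 59.53e6
read_len = 92

yield_gb = total_bases(reads, read_len) / 1e9
print(f"{yield_gb:.2f} Gb")  # prints 5.48 Gb, in line with the reported 5.5 Gb
```

Under these assumptions the product agrees with the stated 5.5 Gb to one decimal place.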

Fig. 1

Pairing between the two strains of opposite mating type, G24 (MAT1-1) and 7A (MAT1-2), after 15 days of growing on zucchini cotyledons.

Table 1 List of raw reads.

Data Records

Data generated in this study are publicly available from the NCBI/GenBank database under BioProject ID PRJNA505479. All raw sequence data have been deposited in the Sequence Read Archive under accession number SRP169883 (ref. 13; Table 1). The Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under accession GHEF00000000 (ref. 18; Table 2). The annotation dataset of the total Trinity assembly, as well as the annotation of putative P. xanthii mycoviral sequences classified according to their sequence homology with known mycoviruses, has been uploaded to figshare21.

Table 2 Assembly statistics.

Technical Validation

Our annotated transcriptome draft improves on the publicly available P. xanthii transcriptome8 in completeness and contig size, reducing the proportions of fragmented and missing transcripts (Tables 2 and 3 and Fig. 2). The raw reads were re-aligned to the de novo assembled transcriptome using the CLC Genomics Workbench. Quality control of the alignment data was performed with Qualimap 2 to obtain read alignment and coverage statistics22 (Fig. 3). Of the total reads, 81.7% mapped in pairs and 14.5% mapped in broken pairs to the assembled transcriptome. A total of 19,324 putative Open Reading Frames (ORFs) were predicted within transcript sequences by TransDecoder, and 79.1% of them were complete. BUSCO23 was used to evaluate transcriptome completeness against a set of 1,438 conserved fungal orthologs: 86% of the orthologs were recovered as complete, with an estimated duplication level of 20%, and few were fragmented (6%) or missing (7%) (Table 3). The assembled contigs (71,095) were grouped into sequence sets according to their BLASTx annotation21. More than 55% of them (40,221 contigs), to which more than 70% of the raw reads mapped, had a significant hit in the BLASTx search (Table 4). They included a large fraction (95.1%) of sequences showing homology to proteins of Fungi and Oomycetes, with the highest similarities to Erysiphe necator and Blumeria graminis f.sp. hordei. Sequences with no significant hits were mostly short fragments or non-coding RNA sequences. A total of 5,013 Gene Ontology (GO) terms, covering the three main categories of biological process (3,171), molecular function (1,195) and cellular component (647), were assigned to 24,048 unigenes. WEGO was used for the functional classification of Trinity unigenes based on the GO annotation24 (Fig. 4). Among the putative P. xanthii transcripts, at least 195 sequences were related to sexual compatibility and reproduction of the fungus21 (Fig. 5).
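
The percentages quoted above can be re-derived from the reported counts. The following minimal sketch is a reader-side arithmetic check, not part of the original workflow; the function and variable names are hypothetical, and the BUSCO percentages are taken as the rounded values given in the text, so they need not sum to exactly 100.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Fraction of assembled contigs with a significant BLASTx hit.
total_contigs = 71_095
blastx_hits = 40_221
annotated_pct = percent(blastx_hits, total_contigs)

# Overall mapping rate: reads mapped in pairs plus reads mapped in broken pairs.
mapped_total = round(81.7 + 14.5, 1)

# Rounded BUSCO categories as reported in the text.
busco = {"complete": 86, "fragmented": 6, "missing": 7}
busco_sum = sum(busco.values())

# GO terms per main category should add up to the reported total of 5,013.
go_terms = {
    "biological process": 3171,
    "molecular function": 1195,
    "cellular component": 647,
}
go_total = sum(go_terms.values())

print(f"annotated contigs: {annotated_pct:.1f}%")
print(f"mapped reads: {mapped_total}%")
print(f"BUSCO categories sum to {busco_sum}%")
print(f"GO terms: {go_total}")
```

The annotated fraction comes out above the "more than 55%" stated in the text, and the three GO categories add up to the reported total.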
Three hundred and sixty contigs showed homology with sequences of viral origin, including several known mycoviruses with double-stranded (ds)RNA genomes [i.e. Totiviridae (308), Partitiviridae (7) or unclassified dsRNA (3)] or positive-sense single-stranded (+ss)RNA genomes [i.e. Narnaviridae (18), Ourmiavirus (10)], and unclassified virus-like sequences from fungi (14)21 (Fig. 5). They represent novel putative mycoviruses infecting P. xanthii that should be further characterized to explore their potential effects on the virulence of the host strains. Putative transposable elements (TEs) in the assembled transcriptome were identified and classified by similarity search against Repbase, the reference database of eukaryotic repetitive DNA25, using the CENSOR software tool with default parameters26. Overall, 14,793 contigs were homologous to fungal TEs and 1,475 contigs were homologous to TEs identified in other eukaryotes (Fig. 5). NonLTR/Tad1 (44.0%), followed by LTR/Gypsy (28.4%), LTR/Copia (17.2%) and DNA/Mariner (8.9%), were the most represented classes among the fungal TEs, while LTR/Copia (79.1%), followed by LTR/Gypsy (12.8%), were the most represented among the TEs of other eukaryotes.
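
The tallies in this paragraph lend themselves to a simple cross-check. The sketch below, given purely as an illustration, sums the per-group viral contig counts and converts the reported fungal TE class percentages into approximate contig numbers. The dictionary and variable names are hypothetical, and the per-class counts are estimates derived from rounded percentages, not values reported in this study.

```python
# Viral contig counts per group, as listed in the text.
viral_contigs = {
    "Totiviridae": 308,
    "Partitiviridae": 7,
    "unclassified dsRNA": 3,
    "Narnaviridae": 18,
    "Ourmiavirus": 10,
    "unclassified virus-like": 14,
}
viral_total = sum(viral_contigs.values())

# TE-related contigs: homologous to fungal TEs vs. TEs of other eukaryotes.
fungal_te_contigs = 14_793
other_te_contigs = 1_475
te_total = fungal_te_contigs + other_te_contigs

# Reported shares (%) of the main classes among fungal TEs.
fungal_te_shares = {
    "NonLTR/Tad1": 44.0,
    "LTR/Gypsy": 28.4,
    "LTR/Copia": 17.2,
    "DNA/Mariner": 8.9,
}

# Approximate contig counts per class (estimates from rounded percentages).
approx_counts = {
    name: round(share / 100 * fungal_te_contigs)
    for name, share in fungal_te_shares.items()
}

print(f"viral contigs: {viral_total}")
print(f"TE-related contigs: {te_total}")
for name, count in approx_counts.items():
    print(f"{name}: ~{count} contigs")
```

The group counts add up to the 360 viral contigs stated above; the four listed fungal TE classes account for 98.5% of the fungal TE contigs, so the estimated counts sum to slightly less than 14,793.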

Fig. 2

Contig length distribution in the previously available transcriptome of P. xanthii including 37,241 transcripts (accession GEUO00000000) as compared to the total assembled contigs (71,095) and the identified transcripts (23,065) of the fungus obtained in this study.

Fig. 3

Average per-base coverage distribution calculated by alignment of raw reads to the de novo assembled transcriptome of P. xanthii.

Table 3 BUSCO analysis of assembly completeness.

Table 4 Annotation statistics.

Fig. 4

Frequency distribution of Gene Ontology (GO) terms grouped into the main functional categories of cellular component, molecular function and biological process. The right y-axis indicates the number of unigenes per category. The left y-axis indicates the percentage of a specific category of unigenes in the main category.

Fig. 5

Sets of Trinity-assembled transcripts and related unigenes according to their BLAST annotations.