Integrative transcriptomic and gene co-expression network analysis of host responses upon Verticillium dahliae infection in Gossypium hirsutum

Worldwide, Verticillium wilt is among the major harmful diseases in cotton production, causing substantial reduction in yields. While this disease has been extensively researched at the molecular level of the pathogen, the molecular basis of V. dahliae host response association is yet to be thoroughly investigated. In this study, RNA-seq analysis was carried out on V. dahliae infected two Gossypium hirsutum L. cultivars, Xinluzao-36 (susceptible) and Zhongzhimian-2 (disease resistant) for 0 h, 24 h, 72 h and 120 h time intervals. Statistical analysis revealed that V. dahliae infection elicited differentially expressed gene responses in the two cotton varieties, but more intensely in the susceptible cultivar than in the resistant cultivars. Data analysis revealed 4241 differentially expressed genes (DEGs) in the LT variety across the three treatment timepoints whereas 7657 in differentially expressed genes (DEGs) in the Vd592 variety across the three treatment timepoints. Six genes were randomly selected for qPCR validation of the RNA-Seq data. Numerous genes encompassed in disease resistance and defense mechanisms were identified. Further, RNA-Seq dataset was utilized in construction of the weighted gene co-expression network and 11 hub genes were identified, that encode for different proteins associated with lignin and immune response, Auxin response factor, cell wall and vascular development, microtubule, Ascorbate transporter, Serine/threonine kinase and Immunity and drought were identified. This significant research will aid in advancing crucial knowledge on virus-host interactions and identify key genes intricate in G. hirsutum L. resistance to V. dahliae infection.

www.nature.com/scientificreports/ In the recent past, significant advancement has been achieved in the use of genomics, transcriptomics and proteomics with aim of gaining insight on the genetic mechanism behind cotton resistance to V. dahliae, and a variety of V. dahliae resistance related genes have been revealed 3,[9][10][11][12][13] . However, given the common evolving interaction between cotton and V. dahliae, it is critical to investigate the molecular pathways that determine V. dahliae pathogenicity. Due to the completion of genome sequencing and the advancement of bioinformatics methods, genomic and transcriptomic sequence knowledge of V. dahliae offers us with an ability to better understand its pathogenicity. The transcriptomic analysis of V. dahliae during microsclerotia establishment and initial infection phase have provided a glimpse of the genes essential for V. dahliae growth, microsclerotia formation, and infection [14][15][16] . For example, VdPKAC1, VMK1, VdMsb, VdGARP1, VDH1, Vayg1 and VGB were discovered to be intricate in the development of microsclerotia and the pathogenic phase of V. dahliae 17,18 ; VdNEP, VdpevD1, VdNLP1 and VdNLP2 encode for effector proteins are linked in pathogenic responses [19][20][21][22] ; VdFTF1, Vta2 and VdSge1 encode for transcriptional factors that regulate pathogenic genes [23][24][25] . Nonetheless, owing to the intricacy of V. dahliae pathogenic associated molecular mechanisms, the functions of these genes in the interaction between V. dahliae and cotton are yet to be fully understood. In the present study, we utilized V. dahliae fungus in infecting two Gossypium hirsutum L. cultivars, Xinluzao-36 (susceptible) and Zhongzhimian-2 (disease resistant) for 0 h, 24 h, 72 h and 120 h time intervals, so as to determine the molecular basis of V. dahliae host response mechanisms.

Methodology
V. dahliae culture. A defoliating V. dahliae Vd592 strain isolate which was acquired from Shihezi University, was cultured on potato dextrose agar (PDA) for a period of 7 days at a temperature of 25 °C. The cultured isolates were incubated in Czapek liquid medium for a period of 5 days at a temperature of 25 °C, so as to get the conidia. Using distilled water, the obtained spores were diluted to almost 1 × 10 6 spores per milliliter prior to inoculation. Data and differentially expressed gene analysis. Quality assurance checks were performed on the sequenced raw reads. Poor quality reads were filtered out and adapter sequences together with reads containing ambiguous bases 'N' and ones that had more than 20% Q < 30 bases were discarded. Following Meyer et al., all sequences of less than 60 bp were also discarded 26 . Using TopHat version 2.01 27 , the obtained clean reads were mapped to the G. hirsutum (AD1) reference genome 28 . The Cufflinks software (version 2.2.1) was utilized in measuring the abundance transcripts and differential gene expression 27 . Comparing gene expression levels amongst the two libraries was achieved through the relative transcript level of individually expressed gene was achieved through the calculation and normalization to the reads per kilobase of the exon model per million mapped reads (RPKM) 29 . Detection of the considerable variations and differences in gene expression patterns was done through the Chi-squared test that was incorporated in the IDEG6 software 30 . P-value threshold was defined through false discovery rate (FDR) so as to account for the multiple significance tests. In the present study, an FDR threshold ≤ 0.01 and Fold change ≥ 2 were accepted in determining the significantly expressed gene value differences.

Plant material and inoculation procedure.
In performing functional annotation of the obtained DEGs, Basic Local Alignment Search Tool (BLAST) alignment was executed via search in the Clusters of Orthologous Groups (COG) 31 , SwissProt, Kyoto Encyclopedia of Genes and Genomes (KEGG) 32,33 and Non-redundant (Nr) protein databases with the E-value set at ≤ 1e−5. Significant matches were carefully chosen in annotation of the DEGs. Blast2go software version 5.2.5 was utilized in annotating the DEGs' major Gene Ontology (GO) groups, including molecular functions, biological processes, and cellular components and the E-value ≤ 1e−5 34 .
Real-time quantitative PCR (RT-qPCR) analysis of the DEGs. Sequencing libraries were prepared from the total RNA, as described above. For RT-qPCR analysis, 6 randomly selected DEGs and cotton gene as the control were chosen, and primer design was done using DNAMAN 6.0 software package. RT www.nature.com/scientificreports/ for 5 s, and 60 °C for 30 s. The 2 − ΔΔCT procedure was used to normalize and calibrate the relative expression levels 35 . For each of the chosen genes, tests were performed in triplicates for all the biological and technical groups.
Statistical analysis and graphic presentation. R statistical package version 3.6.2 was utilized in experimental data analysis 36 . Prcomp software package was employed in Principal components analysis (PCA) analysis as well as graphic demonstration was executed via the scatterplot3d package. The boxplot package was used in the graphic demonstration of expression level distribution. The hierarchical clustering package was utilized in clustal analysis whereas the cluster tree graphic presentation and heatmap were achieved by means of the ggplots package. The WGCNA package version 1.70-3 was applied in preparing the graphical outlook of the network/ module construction 37 .

Statement.
The two cotton cultivars that were used in this study were collected from Shihezi University.
Its growth, harvest and use in this manuscript complies with relevant institutional, national, and international guidelines and legislation.

Results
The two cotton genotypes morphologically are identical, However, Xinluzhao-36 cultivar exhibited significant variations when exposed to V. dahliae fungus infection. This was exhibited by the leaves yellowing and starting to dry up from the edges. Further, it was observed that the Zhongzhimian-2 variety had a large biomass and yellowing of leaves compared to the Xinluzhao-36 cultivar (Fig. 1).
Sequencing and transcript identification. In this study, two cotton cultivars; Xinluzao-36 (susceptible) and Zhongzhimian-2 (disease resistant) exposed to V. dahliae fungus infection were utilized in RNA-Sequencing (Table 1). These two V. dahliae strains differ in their pathogenicity, in that the LT strain is a weak pathogenic defoliating isolate, whereas the Vd592 strain is a strong pathogenic defoliating isolate. The root tissues of the two accessions were considered at three different treatment periods; the samples being labeled as S2-24 h, S2-72 h and S2-120 h and S3-24 h, S3-72 h, S3-120 h for the Xinluzao-36 cotton variant whereas T2-24 h, T2-72 h, T2-120 h and T3-24 h, T3-72 h, T3-120 h for the Zhongzhimian 2 cotton variant (Supp File 1). All the samples were sequenced in triplicates. The two upland cotton accessions were categorized as either V. dahliae susceptible variety (S2 and S3) or V. dahliae resistant variety (T2 and T3). CK group represented the control plants that grew in normal optimum conditions. To examine the transcriptomic contrasts during www.nature.com/scientificreports/ V. dahliae fungus infection in the treated groups, total RNA was extracted from the root of the two varieties in three different treatment periods. A total of 1106 million raw reads were obtained from 32 cDNA libraries, and 1097 million (99.38%) clean reads were collected following data quality trimming processes (Supp File 1). For each library the proportion of the mapped data was between 82.97% and 90.01%. In addition, the clean reads were mapped to G. hirsutum (AD1) reference genome in which 84.36 to 89.83 percent of the 32 samples obtained, producing 77.4 to 82.3% reads that uniquely mapped to the reference genome.
In the Xinluzao-36 and Zhongzhimian-2 response to V. dahliae fungus infection, 13,541 genes were upregulated in at least one time-point when the treatment group was compared to the control group (Fig. 3A). In the S2_120h vs SCK, 5808 genes were differentially expressed while 4,936 were differentially expressed in the T2_120h vs TCK group. Equally, 4873 were differentially expressed in S3_120h vs SCK, and 6711 were differentially expressed in the T3_120h vs TCK group. Comparison between the treatment groups, 5308 were www.nature.com/scientificreports/ differentially expressed in the S2_120h vs T2_120h group while 3100 were differentially expressed in the S3_120h vs T3_120h group. Equally, 9071 were differentially expressed in S2_120h vs T3_120h, and 2613 were differentially expressed in the S3_120h vs T2_120h group (Fig. 3B).

Functional classification and enrichment analysis of DEGs. In comparisons of libraries made for
the various time points, GO enrichment approaches were used to identify the putative roles of DEGs (Fig. 4). GO enriched categories of the DEGs in biological process groups were primarily linked in glucan metabolism process, cellular carbohydrate biosynthesis process, plant-type cell wall Biogenesis and polysaccharide biosynthesis. The GO enriched categories of the DEGs in cellular component were mostly linked to the anchored component of membrane, anchored component of plasma membrane, microtubule, microtubule associated complex and kinesin complex. GO enriched categories of the DEGs in molecular function were mostly linked to glucosyltransferase activity, copper iron binding, cellulose synthase activity and ATP-dependent microtubule motor activity (Fig. 4).
We also conducted enrichment analysis by mapping sequences to KEGG database groups in order to examine the roles of the differentially expressed transcripts. KEGG annotated DEG's were allocated to various classes, mainly related to biosynthesis of other secondary metabolites, signal transduction and carbohydrate metabolism. KEGG enrichment analyses as well showed DEGs were considerably enriched in pathways such as in ubiquitin biosynthesis, pyruvate metabolism, gluconeogenesis, glutathione metabolism, fructose and mannose metabolism, Flavonoid biosynthesis, fatty acid metabolism, cysteine and methionine metabolism (Fig. 5). www.nature.com/scientificreports/

Weighted gene expression clustering analysis.
A popular approach to comparative genomics is to identify and annotate DEG sets for the purpose of mining the key genes. Therefore, WGCNA was conducted to further examine the basic molecular pathways and key genes affecting V. dahliae growth within the two cotton varieties. According to similar expression and similar functions, we finally selected all the differential genes (13,541) for 120 h for the WGCNA analysis. The soft power was set at 15 since this was the lowest power required to attain nearly scale-free topology (R2 = 0.865), and the number of co-expression modules was 16 modules in total (Fig. 6). As per the correlation results between the modules and the two sample treatments in each phase, the pink module and S sample that are significantly positively correlated with T sample at 120 h and negatively correlated with S sample at120h were selected. The 120 h significant positive correlation of the T sample and the large negative correlation of the T sample 120 h brown module are used to construct the interactive network and hub screening of genes.
From the module trait correlation analysis, 3 modules had an analogous down-regulation in four time points that include T24 h, T72 h, and T120 h, in the MECyan, MEpurple, MEyellow and MEpink modules. Whereas MEbrown and MEred were upregulated at S120 hours post inoculation (hpi), and Mgrey encompassed upregulated genes that were at both S0 hpi and S24 hpi. Equally, MEyellow contained genes that were both upregulated in the T0hpi up to the T120hpi. Thus, each of these modules recognizes a gene set of G. hirsutum in response to V. dahliae fungus infection at specific time points (Fig. 7).

Networks displaying relationships among genes within co-expressed modules.
With the goal of identifying main hub genes, we built a network of the identified co-expressed modules. In this study, we identified 11 hub genes that encode for different proteins associated with lignin and immune response, Auxin response factor, cell wall and vascular development, microtubule, Ascorbate transporter, Serine/threonine kinase and Immunity and drought (Fig. 8, Table 2). qPCR validation. A set of differentially genes were randomly selected for detection and subsequent validation via qPCR, Fig. 9.

Discussion
Generally, Verticillium wilt restrains water and nutrient movement within the cotton vascular system, and considerably limits the crop yields. Equally, the photosynthetic rate and other related physiological traits in cotton genotypes are severely influenced by the Verticillium wilt 38 . Cotton genotypes display diverse reactions and resistance mechanisms to Verticillium wilt infection 39 . Mechanisms for V. dahliae Resistance may involve www.nature.com/scientificreports/ modifications to the transcription factors, extracellular enzymes, cell wall, ethylene-associated signal transduction pathways, jasmonic acid pattern recognition receptors or salicylic acid 40 .
To aid infection and efficient colonization, pathogenic fungi in plants can develop a variety of cell walldegrading enzymes 41,42 , including pectinase, hemicellulase, cellulase, etc. Hydrolytic enzymes, especially cellulases and pectinases, have long been thought to play a key role in the expression of V. dahliae disease symptoms and pathogenesis 43,44 . The cell wall-degrading enzymes are virulence factors, for instance xyloglucan-specific endoglucanase 45 , fungal endopolygalacturonases 46 , as well operate as pathogen-associated molecular patterns (PAMPs). In particular, enzymes linked in cell wall degradation comprise non-catalytic domains, which are usually related to the fungal carbohydrate hydrolases, considered to be activators of the PAMP-triggered immunity (PTI) response in the oomycetes 47,48 . In V. dahliae, two Glycoside hydrolase 12 (GH12) proteins; VdEG1 and VdEG3 acted as PAMPs that triggered cell death and PTI independent of their enzyme action in Nicotiana benthamiana. Despite the fact that cell wall-degrading enzymes have also been linked to fungus pathogenicity, direct molecular evidence was insufficient. Herein, some of the enriched GO classifications are significantly related to the composition of the cell wall turn off. They include cellular carbohydrate biosynthesis process, plant-type cell wall Biogenesis and polysaccharide biosynthesis. Considering the main component of the cell wall is sugar, this observed pattern of enriched pathways may form part of the cell wall defense mechanism against pathogens at the initial stage of V. dahliae infection.
Phenylpropanoids, which naturally comprise of flavonoids and lignin, typically perform vital functions that are associated with biotic stress response in plant by both availing material units for physical barriers as well as synthesizing various antibiotic compounds 49,50 . Further, lignin has been identified as a contributing factor in the resistance mechanisms of cotton to diseases 51 . Specifically, phenylpropane synthesis is associated with defense mechanisms in cotton 52 , whereas flavonoids are linked to cushion considerable stress-induced changes in the  www.nature.com/scientificreports/ reactive oxygen species (ROS) homeostasis and modulation of the ROS-signaling pathway 53 . ROS are vital in defense signaling mechanisms 54 as well as controlling the programmed cell death by means of formation of the HR 55 . In this study, flavonoid biosynthesis was highly upregulated in the V. dahliae infected cotton varieties. In line with previously conducted studies that have described the significance of lignin pathway in lignification and reinforcement of cell walls as a vital processes in plant responses to fungal infection 51,56 , we identified significant lignin pathway enrichment that was more established in the disease resistant variety more than in the disease   57 , and increased sulfur availability aids in boosting plant disease resistance to pathogens [58][59][60][61] . In accordance to previously conducted studies, SDC aggregation is quickly increased in pathogen-resistant cultivars compared to the susceptible cultivars. Similarly, other numerous studies have shown the importance of sulfur in disease resistance in plants [62][63][64] . In tomatoes, expression levels of genes linked to sulfate uptake that has been observed as well as the integration and the SDCs formation were all up-regulated, thus indicating that sulfur-enhanced defense may be playing a significant part in improved resistance of tomatoes to V. dahliae, as depicted by Rausch and Wachter 57,65 . Further, the increased GSH content together with other up-regulated genes expression associated with sulfur metabolism, like those that encode for glutathione-S-transferase, cysteine synthase and peptide methionine sulfoxide reductase, possibly play a part in enhancing the resistance against V. dahliae in tomatoes 65 . Intriguingly, Glutathione metabolism was upregulated in the KEGG enrichment pathway. Thus, it may be implicated in the resistance mechanisms of V. dahliae infection in cotton varieties. WGCNA split the core DEG's into two modules, each providing a distinct metabolism pathway in relation to V. dahliae infection. From the recognized hub genes, co-expression network of upregulated hub genes comprised of Ascorbate transporter, Serine/threonine kinase, Auxin response factor, and Immunity and drought related genes. In accordance with GO term analysis, these hub genes were involved in the transport and in the immune response mechanisms. Plants utilize a wide variety of chemicals to defend them from pathogens. These chemical compounds may build up in the affected tissue at high levels 66 . Proteins encompassed in long distance transportation are important for the transfer of these protective compounds to other tissues and for the survival of plants under a viral attack 67 . The differentially expressed transporters recognized herein may have a part in transportation of secondary metabolites as well as the translocation of defense associated compounds that stimulate other mechanisms intricate in defense responses of plants when under a viral attack. Serine/threonine protein kinase (STK) performs a significant part in plant stress-signaling transduction pathway through phosphorylation 68 . Serine/threonin protein kinase (STK) is a major defensive signal transduction protein. The primary function of STK in plant-microbe systems is to sense and relay environmental signals from a pathogen 69 . Equally, phytohormone auxin also plays the key role in controlling many different characteristics of plant development, including lateral root initiation, elongation of the shoot, embryogenesis, vascular growth, tropical development and architecture of flowers and tissues 70,71 . Auxin response factors (ARFs), which are essential apparatuses of the auxin signal pathways, control the expression of auxin-responsive genes by binding directly in their promoter regions to the auxin-responsive variable 72 .