Genetic characteristics of Bursaphelenchus xylophilus third-stage dispersal juveniles

The third-stage dispersal juvenile (DJ3) of pinewood nematode (PWN) is highly associated with low-temperature survival and spread of the nematode. Oil-Red-O staining showed that its lipid content was significantly higher compared with other PWN stages. Weighted gene coexpression network analysis identified that genes in the pink module were highly related to DJ3 induced in the laboratory (DJ3-lab). These genes were arranged according to their gene significance (GS) to DJ3-lab. Of the top 30 genes with the highest GS, seven were found to be highly homologous to the cysteine protease family cathepsin 1 (CATH1). The top 30 genes with the highest weight value to each of the seven genes in the pink module were selected, and finally 35 genes were obtained. Between these seven CATH1 homologous genes and their 35 highly related genes, 15 were related to fat metabolism or autophagy. These autophagy-related genes were also found to be highly correlated with other genes in the pink module, suggesting that autophagy might be involved in the mechanism of longevity in DJ3 and the formation of DJ3 by regulating genes related to fat metabolism.


Scientific Reports
| (2021) 11:3908 | https://doi.org/10.1038/s41598-021-82343-9 www.nature.com/scientificreports/ of DJ3 is slightly larger and accumulates many lipid droplets 13,16,26 . In previous studies, the percentage of DJ3 in standing trees in the field increased with time, reaching 60-100% by winter 15,29 , which might be related to their survival in a low-temperature environment, indicating a greater risk of its spread to low temperature areas. Therefore, additional studies are urgently needed to illustrate the mechanism of DJ3 formation and PWN survival under low temperatures. In this study, DJ3 induced in the laboratory (DJ3-lab) was compared with DJ3 collected in the field (DJ3-field) and other PWN stages, and the morphological characteristics of DJ3 were identified. Weighted gene coexpression network analysis (WGCNA) was used to explore the complex relationships between genes and phenotypes, which helped in determining the main functions of genes in the modules related to DJ3 30 .

Results
Dividing nematode stages by length and lipid content. The morphometric values of J2; J3; fourthstage propagative juvenile (J4); DJ3-lab; DJ3-field; female adult (Female); and male adult (Male) are summarized in Table 1. DJ3-lab and DJ3-field were similar to each other (Table 1, Fig. 1A-Q). Stored neutral lipids were   www.nature.com/scientificreports/ assayed by the standard assay for adipocyte fat storage using Oil-Red-O staining ( Fig. 1H-N). Student's t-test was used to determine the significance between DJ3-lab and other stages. The results indicated that except for DJ3-field (p-value > 0.1), the lipid contents of other stages were significantly different from those of DJ3-lab (p-value < 0.0001). Oil-Red-O staining of the DJ3-lab and DJ3-field PWNs showed obviously more fat masses than other stages ( Fig. 1O-Q), this characteristic can be used for the morphological identification of DJ3 (Fig. 1R).

Digital gene expression (DGE) sequencing.
To characterize the gene transcript patterns, 27 libraries (three replicates for each stage: J2, J3, J4, Female, Male, the second-stage propagative juvenile prior to DJ3-lab (J2-2), DJ3-lab, DJ3-field, and DJ4 of PWNs were constructed and sequenced. After removing the low-quality reads, an average of 23.42 Mb clean reads were obtained for each sample (Table S1). The dataset was deposited in the Sequence Read Archive (SRA; accession: SRR9990585; BioProject ID: PRJNA560635). A total of 17,811 assembled unigenes were generated from the 27 libraries.
Weighted gene coexpression analysis (WGCNA). To eliminate noise from genes that were not expressed, we filtered probes with median FPKM levels that did not exceed 1.0 and obtained 15,863 genes. The expression values of these 15,863 genes in 27 transcriptomes of PWNs were used to construct the coexpression module with WGCNA package tools. Cluster analysis was performed on these samples to determine whether there were any obvious outlier (Fig. S1). We choose power 6, which was the lowest power at which the scale-free topology fit index curve flattened out upon reaching a high value (in this case, approximately 0.85, Fig. S2), to www.nature.com/scientificreports/ construct coexpression modules. Fourteen distinct gene coexpression modules were identified ( Fig. 2A). These coexpression modules were constructed and shown in different colors, and the number of genes in each module was shown in Table S2. The gene network was visualized by plotting a heatmap (Fig. S3). Each row and column of the heatmap corresponds to a single gene. The heatmap depicts adjacencies or topological overlaps, with light colors denoting low adjacency (overlap) and dark colors denoting high adjacency (overlap). The module eigengene (ME) is defined as the first principal component of a given module. It can be considered a representative of the gene expression profiles in a module (Fig. S4). Modules with common expression pattern interactions in the coexpression modules that were associated with particular traits (Fig. S1) were identified based on the correlation between the ME and trait (Fig. 2B). The analysis identified two modules (pink and magenta) associated with lipid content. Of these two modules, the pink module was significantly associated with DJ3-lab (correlation value, cor = 0.94, p-value = 2 × 10 -17 , Fig. 2B,C), and the magenta module was significantly associated with DJ3-field (cor = 0.90, p-value = 1 × 10 -13 , Fig. 2B and Fig. S5).
We concentrated on DJ3-lab as the trait of interest. To study the relationships among the identified modules, the MEs were used as representative profiles, and ME correlation was used to quantify module similarity. The trait that determined whether the PWNs were DJ3-lab was added to the MEs to see how this trait fit into the ME network. The dendrogram indicated that the pink module had the highest association with DJ3-lab and that the pink and magenta modules were highly related (Fig. 2D).
Functional and pathway enrichment analyses of genes in interesting modules. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment were performed on the genes in the pink and magenta modules. There was no significant difference in the GO terms and pathways in which the two modules were enriched ( Fig. S6-S9). Genes in these two modules were mainly enriched in GO: 0044699 (single-organism process), GO: 0008152 (metabolic process), GO: 0003824 (catalytic activity), GO: 0005488 (binding), GO: 0005623 (cell), and GO: 0044464 (cell part). They were mainly enriched in global and overview maps (ko01100 and ko01200), lipid metabolism (ko00590, ko00561 and ko00071), transport and catabolism (ko04142, ko04146 and ko04140), carbohydrate metabolism (ko00040, ko00010, ko00053 and ko00620), and xenobiotics biodegradation and metabolism pathways (ko00982, ko00980 and ko00983). The above results indicated that although there were differences in gene expression between DJ3-field and DJ3-lab as DJ3-field was induced by complex environmental conditions, the genes highly related to the two have similar functions and played a role in similar pathways.

Module visualization and candidate genes selection.
We concentrated on the genetic characteristics of DJ3-lab as this stage was induced in the laboratory under certain conditions. An expanded view of the expression of all genes in the pink module was compared with the ME expression of the pink module across all samples (Fig. 3A,B). The ME took on low values in arrays where many module genes were underexpressed (green color in the heatmap). The ME took on high values in arrays where many module genes were overexpressed (red in the heatmap). The results indicated that in addition to the high expression in DJ3-lab, the genes in the pink module also had higher expression in samples J2-2 and DJ3-field than in other samples (Fig. 3A,B). This suggested that the genes in the pink module could also be highly correlated with the formation of DJ3.
To quantify the similarity of genes in the pink module, the associations of individual genes with the DJ3-lab trait were quantified by defining gene significance (GS) as the correlation between the gene and the trait, and a quantitative measure of module membership (MM) was also defined as the correlation of the ME and the gene expression profile for the module. A scatterplot of GS vs. MM in the pink module was plotted, and GS and MM were highly correlated, illustrating that genes highly significantly associated with a trait were often also the most important (central) elements of modules associated with a trait (Fig. 3C).
The intramodular connectivity (IC) value was defined only for the genes inside a given module. IC measures how connected, or co-expressed, a given gene is with respect to the genes of a particular module. IC may be interpreted as a measure of MM. The IC for each gene in the pink module was calculated. Genes with the top 30 highest IC values were considered intramodular hub genes in this study (Table S3, Fig. S10). The amino acid sequences of 20 of these genes could be aligned to highly homologous sequences in the nr library. Six of them had high homology with the cysteine protease family cathepsin 1 (CATH1) from B. mucronatus (Table 2, GenBank: AID50178.1). These 6 hub genes in the pink module are represented in red in Fig. S10. After raising the MM to a power of 6, it was highly correlated with the IC (Fig. 3D). Highly connected intramodular hub genes tend to have high MM values to the respective module. The GS vs. IC was also plotted, and we observed that genes with high IC tended to have high GS (Fig. 3E).
Of the top 30 genes with the highest GS to the DJ3-lab in the pink module, the amino acid sequences of 17 of them could be aligned to homologous sequences in the nr library. Seven of them had very high homology with CATH1 from B. mucronatus (Table 2). These seven genes basically overlapped with the six hub genes mentioned above, except that two of them (cath1-7 and cath1-2) were not hub genes, and one hub gene (cath1-8) was not among them. The top 30 genes with the highest weight value to each of the 7 genes in the pink module were selected, and finally 35 genes were obtained. The network data were exported to Cytoscape (Fig. 3F) by Prefuse Force Directed Layout based on the weight value between two genes. The whole network contained 210 regulatory relationships of 42 genes (detailed in Table S4 and Table S5).
Candidate gene RT-PCR and their relationships with each other. The 15 genes mentioned above that were associated with the lipid metabolism-related pathway or autophagy-related pathway were selected for further study. Their expression levels between different stages of PWNs were further verified by RT-PCR (Fig. 4B,C). Gene expression levels of these genes from J2 were used for data normalization. Student's t-test was used to detect significant differences between the two stages. The results indicated that the expression levels of genes from DJ3-lab were most significantly different from those from DJ3-field and DJ4 (p-value < 0.0001); were significantly different from those from Female and Male (0.0001 < p-value < 0.01); and were different from those from J3, J4 and J2-2 (0.01 < p-value < 0.05). The expression levels of genes from J3, J4, Female, Male and  Table S4 and Table S5). Prefuse Force Directed Layout was applied based on the weight value between two genes. www.nature.com/scientificreports/ J2-2 were not obviously different (p-value > 0.05) from each other, but they were all different from those from DJ3-field and DJ4 (p-value < 0.05), except for genes from Male, which were not obviously different from those from DJ3-field (p-value > 0.5). The expression levels of genes from DJ3-field were different from those from DJ4 (0.01 < p-value < 0.05).
The clustering analysis of the 15 genes and different PWN stages indicated that J2-2 and DJ3-lab had a similar expression pattern, while DJ3-field and DJ4 were quite different from other stages (Fig. 4D,E). Additionally, the expression patterns of genes with similar functions were relatively uniform. Moreover, we noticed that although the expression levels of cath1-2, cath1-3 and cath1-4 were relatively high in the other PWN stages, they had an unmatched high expression level in the DJ3-lab stage (p-value < 0.0001, Fig. 4D,E). However, the expression levels of these 3 genes were relatively low in DJ3-field and DJ4. These results indicated that these 3 genes might have a more important role in laboratory-cultured nematodes.
For the 15 selected genes, the differences between DJ3-lab and the other PWN stages in the expression of genes related to autophagy were relatively larger than those for genes related to lipid metabolism (p-value < 0.0001, Fig. 5A). For each of the 15 genes, its weight value with the other 14 genes was selected, and their network data were exported to Cytoscape by Prefuse Force Directed Layout based on the weight values between two genes. The whole network contained 105 regulatory relationships of 15 genes (Fig. 5B, details can be found in Table S4 and Table S5). The results indicated that the genes related to autophagy might be related to more genes in the pink module, as they had relatively higher IC values and weight values with other genes. Those genes also had large differences in gene expression between DJ3-lab and the other PWN stages, and they had larger GS values to DJ3-lab than those of the genes related to lipid metabolism. These results illustrated that these autophagy-related genes might play an important role in DJ3-lab and they were closely related to lipid metabolism-related genes.

Discussion
Recently, PWNs have been found in northeastern areas of China, where the lowest temperatures in winter are below − 20 °C, indicating that the large areas of pine forest in the north might be at risk of infection. Therefore, it is necessary to study the ability of this nematode to resist to diverse climatic conditions. The DJ3 stage, which is also a long-lived stress-resistant stage, might play an important role in the process of spreading northward, as DJ3s can survive under unfavorable conditions, such as starvation, low temperature and dehydration, inside the dead wood of host trees from autumn to the following spring 4,15 . Additionally, DJ3 is the prerequisite of DJ4, which can be transmitted with Monochamus beetles. Therefore, blocking DJ3 should prevent the spread of the nematode.
Unlike DJ4, which have unique morphological features, DJ3 are not morphologically different from J3 (Table 1, Fig. 1A-Q), except that the body of DJ3 is slightly larger and accumulates many lipid droplets 13,16 . In this study, Oil-Red-O staining was performed to highlight the differences in the lipid content among all PWN stages. DJ3-lab were compared to DJ3-field and the other PWN stages cultured in the lab. The results indicated that DJ3-lab and DJ3-field had obviously more fat masses than other stages. Because sometimes the difference in body length is not obvious between PWNs, the lipid content may be a better choice for distinguishing DJ3 from other stages.
WGCNA was then used to analyze the genetic characteristics of DJ3-lab and explore the complex relationships between gene profiles and phenotypes. The results showed that the genes highly related to DJ3-lab (pink module) and DJ3-field (magenta module) had similar functions and played a role in similar pathways. In addition, the expression of genes highly expressed in DJ3-lab (pink module) was also higher in J2-2 and DJ3-field than in other stages of PWNs (Fig. 3A,B). These results indicated that although there were differences in gene expression between DJ3-field and DJ3-lab since DJ3-field was induced by complex environmental conditions, DJ3 might be characterized by a high expression of genes with the same type of function, and those genes might increase in expression before DJ3 is formed. Therefore, these genes may be highly related to DJ3 and may be target genes for preventing the formation of DJ3.
We then evaluated these genes in the pink module and screened these genes based on IC and GS because IC represents the connection strength within the module, and GS represents the correlation between expression profiles and traits. In the top 30 genes based on IC or GS, there were multiple genes with high homology to the CATH1. We selected 7 genes with high homology to the CATH1 from the top 30 highest GS value genes and www.nature.com/scientificreports/ then screened the top 30 highest weight value genes for each of these 7 genes in the pink module. In total, 42 genes with 210 regulations were obtained. Of these 42 genes, 15 were related to autophagy or lipid metabolism. These included 6 of the 7 genes that were highly homologous to the CATH1, and these 6 genes were all highly related to autophagy. Although the expression patterns of the functionally similar genes of the 15 genes were highly similar, there was a high correlation between the 15 genes. There was an increase in hub genes involved in autophagy-related genes, and these genes had a relatively higher correlation with other genes, suggesting that autophagy might play an important role in DJ3 to promote the formation of DJ3 and maintain its longevity by regulating genes related to lipid metabolism. Under normal conditions, autophagy occurs at the basal level. However, it is accelerated by a variety of stresses, such as starvation 31 , the accumulation of abnormal proteins and organelle damage 32 . Thus far, accumulating evidence shows that the activation of autophagy seems essential for longevity 31,[33][34][35] . Autophagic activity is commonly elevated in many long-lived animals and is essential for their longevity, suggesting that autophagy www.nature.com/scientificreports/ is one of the convergent downstream mechanisms of all these longevity paradigms 32 . It has been reported that lipid turnover by autophagy is essential for the longevity and clearance of lipids (lipophagy) and mitochondria (mitophagy) and are related to aging in C. elegans [36][37][38] . Therefore, autophagy might also be the longevity mechanism of DJ3. However, the process by which autophagy in PWNs is activated and is involved with lipid metabolism to induce the nematode to accumulate lipids and prolong life requires further investigation.

Materials and methods
Nematodes. The  Female; Male; DJ3-lab; DJ3-field and DJ4) were all divided into two parts, which were used for photographing and lipid content determination. Images of dyed PWNs were captured with a microscope Bx51 (Olympus, Japan). The quantification of Oil-Red-O staining was performed by image analysis. ImageJ was used to separate each color image into its RGB channel components. The green channel was used for further analysis 41,42 , and the total dyed areas of the PWNs were calculated. A minimum of 9 PWNs were measured for each stage, and 3 separate biological replicates were performed. Significance was determined by Student's t-test.
The quantification of the lipid content was also assessed by extracting the dye from stained PWNs, and 1 ml of isopropyl alcohol was added to each centrifuge tube with stained nematodes. The extracted dye was removed and then measured with a GeneQuant 1300 ultraviolet spectrophotometer (Biochrom Ltd., UK), and its absorbance was monitored spectrophotometrically at 510 nm 41 . A total of 1000 PWNs were measured for each stage each time, and three separate biological replicates were performed. Significance was determined by Student's t-test. DGE sequencing. Total RNA was extracted separately from the powders of different stages of PWNs using TRIzol (Invitrogen, USA, cat. no. 15596-026) 43 . The RNA Nano 6000 Assay Kit for the Agilent 2100 Bioanalyzer system (Agilent, USA) was used to examine the concentration, RNA integrity number (RIN), 28S/18S and fragment size of the total RNA. The purity of RNA was determined by a NanoDrop spectrophotometer (Thermo Scientific, USA). Genomic DNA was removed using DNase I, Amplification Grade (Invitrogen, USA, cat. no. 18068-015). www.nature.com/scientificreports/ Each RNA sample was sheared and reverse transcribed using random primers to obtain cDNA for library construction. The construction of the libraries and sequencing were all performed on a BGISEQ-500 RNA-seq platform (BGI, Shenzhen, China), and 50-bp single-end (SE) reads were generated. SOAPnuke (v1.5.2, https :// githu b.com/BGI-flexl ab/SOAPn uke) was used to filter the generated raw sequencing reads. Clean reads were mapped to the reference genome of the PWN (BioSample: SAMEA2272519), which is available at NCBI (http:// www.ncbi.nlm.nih.gov/assem bly/31067 8), using HISAT2 (v2.0.4, http://www.ccb.jhu.edu/softw are/hisat ) 44 . Clean reads were then mapped to reference sequences using Bowtie2 (v2.2.5, http://bowti e-bio.sourc eforg e.net/Bowti e/index .shtml ) 45 .. The matched reads were calculated and normalized to fragments per kilobase per million mapped fragments (FPKM) using RSEM (v1.2.12, http://dewey lab.biost at.wisc.edu/RSEM) 46 .
WGCNA. WGCNA was used to explore the complex relationships between genes and phenotypes 30 . An extensive overview of WGCNA, including numerous tutorials, can be found at http://www.genet ics.ucla.edu/ labs/horvath/CoexpressionNetwork/. In this study, we aimed to construct coexpression modules using the expression data of genes from different stages of PWNs. The lipid content of PWNs and the stage to which they belonged were selected as traits. The appropriate power value was determined when the degree of independence was over 0.8. The minimum number of genes was set as 30 for highly reliable results. WGCNA was implemented in the R software package (http://www.r-proje ct.org/), and the heatmap tool package was implemented to analyze the strength of the interactions. Module-trait associations were estimated using the correlation between the ME and the trait. The IC was calculated for each gene by summing the connection strengths with other module genes and dividing this number by the maximum IC. For each expression profile, GS was calculated as the absolute value of the Pearson correlation between the expression profile and each trait. MM was defined as the Pearson correlation of the expression profile and each ME. Network depictions of interesting modules were constructed with Cytoscape software 47 . GO and KEGG enrichment analyses of selected modules. GO 48 and KEGG 49 enrichment analyses were performed on genes in selected modules. The results of the analysis were extracted, and a p-value ≤ 0.05 after the correction was used as the threshold. The top 10 records were extracted if there were more than 10 records.
RT-qPCR. The transcript levels of differentially expressed genes for different stages of PWNs were measured by RT-qPCR using the GoTaq 2-Step RT-qPCR System Kit (Promega, USA, cat. no. A6010) and the Stratagene Mx3000P qPCR system (Agilent, USA). The primers used in this study are listed in Table S6. The RT-qPCR results were normalized as log 2 (JX/J2)-fold changes with a constitutively expressed gene, mec-12, as an internal control, which has no significant difference in expression level at each stage. The normalization of data was performed according to the instructions of the GoTaq 2-Step RT-qPCR System Kit, and the 2 − C T method was used to analyze the data 50 . Three separate biological replicates were performed, and each replicate was assessed three times. Significance was determined by Student's t-test. www.nature.com/scientificreports/