Understanding the mechanisms of dormancy in an invasive alien Sycamore lace bug, Corythucha ciliata through transcript and metabolite profiling

The sycamore lace bug, Corythucha ciliata, is a pest of sycamore trees. In China, it is found in the most northern border where it has been known to become dormant during harsh winters. But the molecular and metabolic basis for dormancy in this insect is still unknown. In this study, we analyzed the transcript and metabolite profiles of this bug to identify key genes and metabolites that are significantly regulated during dormancy in adult females and males. In total, 149 differentially expressed genes (DEGs) were significantly up-regulated and 337 DEGs were significantly down-regulated in dormant adults (both females and males). We found major differences in heat shock protein (HSPs), immunity-responsive genes, NAD-dependent deacetylase sirtuin-1 (SIRT1) and genes involved in the spliceosome pathway that is known to regulate stress. Among the 62 metabolites identified by GC-MS, 12 metabolites including glycerol, trehalose, and alanine were significantly increased during C. ciliata dormancy. By integrating the transcriptome and metabolite datasets, we found that the metabolites in glycolysis/gluconeogenesis and citrate cycle (TCA) were significantly reduced. This study is the first to report both transcript and metabolite profiles of the overwintering responses of C. ciliata to cold stress at the molecular level.

and 86 ± 11%, respectively on November 17 and November 26, 2015 and 50 ± 6% on January 21, 2016. When these inactive adults were maintained at room temperature they recovered their activity level. This finding suggested that C. ciliata adults were dormant and developed tolerance against severe cold stress in winter conditions. A comparative analysis between the transcript and metabolite profiles of the dormant adults in cold environments with the active adults at regular weather was conducted in this study. For this analysis, C. ciliata adults were collected on September 17 (active adults) and November 26 (dormant adults), 2015. As shown in Fig. 1, the maximum and minimum temperatures on September 17 and November 26 were 27/18 °C and 0/−8 °C, respectively. In addition, the samples collected on January 21, 2016 were also used to further verify the transcript by Quantitative Real-Time PCR (qRT-PCR). The maximum and minimum temperatures of that day was −26/−15 °C.

RNA-seq of C. ciliata adults in dormant and non-dormant status.
To reveal the cold tolerance-associated transcriptional responses in C. ciliata, RNA was extracted from four samples including active C. ciliata females and males (in temperatures from 18 °C to 27 °C) and dormant females and males (at winter temperatures from −8 °C to 0 °C) and were subjected to Illumina sequencing. The sequencing data from this study has been deposited in the NCBI Sequence Read Archive database under Accession numbers SRR3170922, SRR3170923, SRR3883369 and SRR3883370. High throughput sequencing generated 65,508,518-79,748,574 pairs of raw reads per sample and the Q20 scores (the average quality value) were more than 95% (Table 1).
After filtering for quality, Trinity was used to assemble the 273,862,866 clean reads. The assembled reads from samples representing the different physiological stages (active and dormant stages) generated a total length of 61,678,837 bp and 92,218 unigenes. The mean length and N50 of these unigenes were 669 bp and 1,221 bp, respectively ( Table 2). Analysis of the size distributions revealed that 13,342 unigenes were over 1,000 bp in length. All 92,218 C. ciliata unigenes were compared with the genes and proteins in seven public databases (Table 3) and the following results were obtained: BlastX yielded 22.4% (20,658 unigenes) that matched the NCBI-Nr database, BlastN resulted in 8.04% (7,421) of the unigenes that specifically matched the nucleotide sequences in the NCBI-Nt database, 18.04% (16,639) were similar to proteins in the Swiss-Prot database using BlastX, and finally    (27,785) were annotated in at least one database (Table 3). After CDS analysis by Blast and ESTscans 7 , 41,786 coding sequences were identified. To analyze the transcript abundance between the dormant and non-dormant samples, the RNA-seq reads from all four samples were mapped onto the assembled unigenes to determine DEGs (q-value < 0.005 and |log2 (foldchange)| >1). DEGs were defined as the transcripts that were significantly enriched or depleted in one sample relative to the other. As shown in Fig. 2, 623 and 1,791 DEGs were up-and down-regulated in a comparison between the dormant and non-dormant males, respectively. Moreover, 368 and 592 DEGs significantly differed between non-dormant and dormant females. More importantly, among the 486 DEGs that were differentially expressed in both adults (female and male), 149 were significantly up-regulated in both dormant adults (female and male) while 337 were significantly down-regulated in both dormant females and males (Fig. 2).
Comparison of the metabolome of C. ciliata adults in dormant status. Comparative metabolomics analysis of dormant and non-dormant C. ciliata was conducted by GC-MS. A total of 62 metabolites were present in the samples, including 23 amino acids, 13 organic acids, 2 free fatty acids, 1 fatty acid methyl ester, 1 intermediate acidic metabolite, 5 sugars, 2 polyols, 3 phosphates, 2 ureides and 10 other metabolites. Interestingly, the metabolites quinic acid, L-cysteine, sucrose, fructose, glucose-1-phosphate, uridine, cytidine, cytidine monophosphate, tryptophan and arginine were not detected in the dormant adults but were identified in the non-dormant adults.
To analyze changes among the metabolites during C. ciliata dormancy, we determined the metabolic profile of C. ciliata before and after the onset of dormancy. Among the 62 metabolites identified in this study, 12 significantly increased, and 15 metabolites significantly decreased in response to dormancy status of both adults (female and male) (Fig. 3). The 12 increased metabolites included 2 free fatty acids (palmitic acid and stearic acid), 5 amino acids (alanine, histidine, oxoproline, serine and threonine), 1 sugar (trehalose), 1 polyol (glycerol),  Table 3. Functional annotations of C. ciliata unigenes. 1 organic acid (glutamic acid) and 2 other metabolites (uracil and dihydrouracil). Among them, trehalose was in a dominant proportion in the dormant samples, followed by alanine.

Number of Unigenes Percentage (%)
Stress tolerance-related DEGs and SIRT1 were significantly up-regulated during dormancy.
Spliceosome related genes were significantly up-regulated during dormancy. In the GO terms enrichment analysis, 24 GO terms were enriched and at least 14 of them were related to RNA splicing (Supplementary Table S2).
In the KEGG enrichment analysis, the up-regulated DEGs were enriched only in the spliceosome pathway during dormancy (Supplementary Table S2). As show in Fig. 4, PUS60, Snu13 and others associated with U2, U4/ U6, Prp19 as well as other common components were up-regulated during dormancy. In addition, HSP73 in the Prp19 complex, pre-mRNA-splicing factor SYF (SYF) in Prp19-related complex, and heterogeneous nuclear ribonucleoprotein (hnRNP) in common components were also up-regulated during dormancy (Fig. 4). This result indicated the importance of spliceosome in the establishment of cold tolerance in dormant C. ciliata female and male adults.
DEGs down-regulated by dormancy included 337 genes that were involved in the TCA cycle, carbon metabolism, valine, leucine and isoleucine degradation, fatty acid metabolism, pyruvate metabolism, and metabolic pathways (Supplementary Table S2). Since these pathways are related to metabolism, this is evidence for the well-known repression of basal metabolism during insect dormancy.  Table S2). In contrast, up-regulated DEGs were not significantly enriched in any metabolism-related pathways during dormancy. This suggested a drastic repression of the secondary metabolism during dormancy than during the non-dormant status. Interestingly, 2 DEGs (c36229_g1 and c62170_g1) related to cytochrome P450 were up-regulated during the dormant status (Supplementary Table S1). Based on phylogenetic analysis (Fig. S7), c36229_g1 was the ortholog of Apolygus lucorum CYP6HM1, and c62170_g1 was the ortholog of Nilaparvata lugens CYP4CE1. This finding indicated that cytochrome P450 may play important role during the dormant status of C. ciliata adults.
We then combined the RNA-seq and metabolomic datasets to construct the biological pathway diagrams for dormant C. ciliata (Fig. 5). Metabolites in glycolysis/gluconeogenesis and TCA cycle were repressed significantly Scientific RepoRts | 7: 2631 | DOI:10.1038/s41598-017-02876-w (Fig. 5). This result indicated that in dormant C. ciliata adults substantial numbers of metabolites are suppressed than in non-dormant adults.
DEGs associated with cold tolerance included two enzymes, aldehyde reductase and aldehyde dehydrogenase, that are key enzymes promoted the glycerol production. These DEGs were also significantly down-regulated during dormancy compared with the non-dormant adults (Supplementary Table S1 and Fig. 2). In addition, 2 genes encoding trehalose-phosphate synthase were detected (c61898_g1 and c46175_g1). But, the expression of these genes did not change significantly between the dormant and non-dormant adults.

Discussion
Dormancy is a common adaptation strategy in invertebrates to avoid extreme temperatures in the environment. Insects enter into dormancy to avoid the stresses of unfavorable seasons 9 . Overwinter of C. ciliata is achieved by dormancy that results in higher survival rate under harsh cold weather conditions in winter. This phenomenon may explain the rapid outbreaks of C. ciliata adults in the active season 10 .
This study reported the transcript and metabolite profiles of C. ciliata dormancy for the first time. We have identified novel genes and metabolites associated with cold tolerance in C. ciliata. We detected 149 significantly up-regulated DEGs in a comparison between non-dormant and dormant adults in both females and males. The variation in gene expression is not extensive, indicating that only a few key genes change their transcriptional activity during dormancy. Some of the identified genes related to overwintering in C. ciliata were similar to those previously reported for other insect species, such as HSPs and immune response genes 11 . Among them, HSP70 was the most up-regulated and in total 6 HSPs (3 HSP70, 1 HSP90, HSP21.4 and 1 HSC) were found to be uniquely expressed in dormant adults. This finding suggested the importance of HSPs during the dormancy of C. ciliata. It is well known that HSPs play a vital role in protein stabilization and/or cold tolerance in insect species as reported previously in Drosophila melanogaster 12 , leafminer species 13 and the mountain pine beetle, Dendroctonus ponderosae 14 . In addition, several immune-response genes were also significantly up-regulated during dormancy, including defensin 4 and leucine repeat-rich protein. In Drosophila, HSP70 was also found to be related to immune response 15 . These DEGs may act to protect C. ciliata adults against pathogens when they undergo their dormant state under the loose barks of infested sycamore trees. Also, this finding indicated the crosstalk between cold tolerance and immune-defense responses.
Moreover, SIRT1 interacts with FOXO genes in the FOXO signaling pathway, which was reported to regulate overwintering diapause in the mosquito, Culex pipiens 16 . However, further studies are needed to confirm the functions for these candidate DEGs in cold and immune tolerance pathways.
Interestingly, up-regulated DEGs were significantly enriched in the spliceosome pathway. Alternative splicing including exon skipping, mutually exclusive exons, alternative donor site, alternative acceptor site and intron retention events may result in one gene coding for multiple proteins. Previous studies indicated that alternative splicing is vital for the adaptative responses to multiple stress conditions [17][18][19][20][21] . Cold-induced genes have also been shown to play important roles in the spliceosome pathways in zebrafish 22 . To our knowledge, this is first report of the genes in the spliceosome pathway correlating to cold tolerance in insects except for Drosophila 21 .
Furthermore, many DEGs that were significantly up-regulated during dormancy did not have functional annotations available in our databases. These new DEGs in C. ciliata adults provide a unique resource to understand insect dormancy. The characteristics of these genes will require functional studies in future.
Some known insect cryoprotectants including trehalose, alaine and glycerol have been reported to be up-regulated during dormancy 23 consistent with our results in C. ciliata adults. Trehalose is vital for the stability of proteins and membranes during overwintering 24 and is dominantly expressed in dormant samples suggesting its importance in C. ciliata overwintering. Glycerol is known to supercool the bodily fluids of insect, thus protecting them from lower temperatures 25 . In addition, the detected reduction in β-alanine possibly results from the conversion of L-alanine from β-alanine 26 . The down-regulation of aldehyde reductase, aldehyde dehydrogenase and trehalose-phosphate synthase could be attributed to their high concentrations and their constitutive expression during dormancy in C. ciliata adults 25,26 .
Our transcript and metabolite datasets revealed the C. ciliata genes and metabolites that respond to cold stress environments in the northern most border of China. To fully understand the response of this invasive insect to different temperature stress, additional comparisons of transcript and metabolite levels should be made in future. In addition, to promote our understanding of the adaptation mechanism of Tingidae insect species to environmental stress, the transcript and metabolite information detected in this study can be compared to other Tingidae family species that experience environmental stress conditions in future. . All RNA-seq libraries were prepared in accordance with the manufacturer's protocol (Illumina Inc., San Diego, CA, USA) and 1.5 μg RNA each was used for library preparation, which was performed as described previously 27 . The libraries were then sequenced on an Illumina Hiseq 2500 platform (Illumina, San Diego, CA) to generate 150 bp paired-end reads.
Bioinformatics analysis. From the raw sequenced reads the adapter sequences, reads containing poly-N and low quality reads were removed using in-house perl scripts. In addition, Q20, Q30, GC-content and sequence duplication levels in the clean data were also calculated. All downstream analyses were performed on the resulting clean data with high quality. Files on the left (read1 files) from all libraries/samples were pooled into one big left.fq file, and files on the right (read2 files) into one big right.fq file. These files were used for the transcriptome assembly using Trinity 28 with min_kmer_cov set to 2 and all other parameters set to default. Unigenes were annotated based on the following seven databases: NCBI-Nr, NCBI-Nt, Pfam 29 , KOG/COG, Swiss-Prot, KO and Gene Ontology (GO) 30 using BLAST with an E value threshold of 1e −5 . For interesting genes associated with dormancy, the protein coding region of these genes were searched by ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/) and confirmed by PCR and sequence analysis. Phylogeny trees of were produced using neighbour-joining method with MEGA6.06 31 and 1,000 replicates of bootstrap analyses. Differential expression analyses. Gene expression levels were estimated by RSEM 32 . Differential expression of genes between the active and dormant periods in the life cycle of C. ciliata was quantified using the DEGseq package 33 . P value was adjusted using q-value 34 and unigenes that were expressed at significant levels (q-value < 0.005) and with more than 2-fold change (|log2FoldChange| >1) in expression level were considered to be differentially expressed. The GOseqR package was used to perform GO enrichment analysis of the DEGs, and the KOBAS software 35   were homogenized in 750 µl of cold (−20 °C) methanol/chloroform (2:1) using a glass homogenizer. Then, 500 µl ice-cold water was added to each sample and centrifuged at 2200 g for 15 min. About 150 µl of the resulting supernatant was transferred to a 1.5 ml tube and were dried using a vacuum concentrator. Then, we used a double derivatization method by first suspending the dried sample in 40 µl metoxyaminhydro chloride (Sigma-Aldrich, St. Louis, MO) and shaking for 2 h at 37 °C, followed by the addition of 70 μL N-methyl-N-(trimethylsilyl) trifluoroacetamide (Sigma-Aldrich) and another shaking for 30 min at 37 °C. Finally, we transferred an aliquot of each sample to a GC/MS glass vial for analysis by GC-MS (4D GC × GC-TOF-MS) (LECO, San Jose, USA).
The GC-MS system was equipped with a DB-5MS capillary column coated with 5% diphenyl cross-linked with 95% dimethylpolysiloxane (30 m × 250 μm inner diameter and 0.25 μm film thickness; J&W Scientific, Folsom, CA, USA). Then, 1 ml of each sample was injected at 230 °C in splitless mode with helium carrier gas flow set to 2 mL min −1 . The initial temperature was maintained at 80 °C for 2 min, followed by a 15 °C per min ramp to 330 °C, and holding at this temperature for 6 min. Transfer line and ion source temperatures were 250 and 250 °C, respectively. Energy was set at −70 eV in an electron impact mode. Mass spectrometry data were acquired in a full-scan mode with an m/z range of 70 to 600 at a rate of 20 spectra per second after a solvent delay of 492 s. Chroma TOF 4.3X software (LECO Corporation) was used for data processing. Compounds were identified by searching against the LECO/Fiehn metabolite Mass spectral library and an in-house reference compound library. Concentrations were also corrected relative to an internal standard, arabinose, to adjust for any sample loss during extraction or injection. Metabolite contents were log2 transformed before analysis. Metabolite quantities were compared across the different treatments using ANOVA and Student's t-test in IBM SPSS Statistics software 20. The metabolic and transcriptomic data were integrated by mapping the Drosophila melanogaster and Acyrthosiphon pisum biochemical pathways based on KEGG.
qRT-PCR. cDNA was synthesized from all four samples using the PrimeScript RT Master Mix (Perfect Real Time) (TaKaRa) and used as template in qRT-PCR. The primers used in qRT-PCR are listed in Supplementary   Table S3. The reactions were performed using the 2 × GoTaq ® qPCR Master Mix (Promega, Madison, WI, USA) and all reactions were conducted using the ABI Prism ® 7500 (Applied Biosystems, Carlsbad, CA, USA) with the following PCR-cycling conditions: 95 °C for 2 min; followed by 40 cycles of 95 °C for 30 s, 60 °C for 1 min; then a final melting cycle at 95 °C for 15 s, 60 °C for 15 s, 95 °C for 15 s. Relative RNA expression was normalized to the level of alpha-tubulin. All samples had three biological replicates with three technical replicates. Relative expression levels of genes were quantified using the comparative 2 −ΔΔCt method 39 . The C. ciliata samples that were collected on September 17 (18 °C to 27 °C) and November 26, 2015 (−8 °C to 0 °C) were used for qRT-PCR. In addition, the samples collected on January 21, 2016 (−15 °C to −26 °C) were used to further verify the transcript by qRT-PCR.