Introduction

Cucumber (Cucumis sativus L.) belongs to the family of Cucurbitaceae1,2. The cultivation area of the cucumber in China was 1.111 million hectares and the annual cucumber output was 47.361 million tons in 2011 (http://www.fao.org). The yield and quality of cucumbers are often influenced by different biological factors. The aphid is one of the most serious pests in cucumber production and often causes severe damage to cucumbers.

Aphids damage the host plant in three main ways. First, the aphid pierces the phloem with its mouthparts and drains the nutrients essential for plant growth and reproduction, causing direct damage3. Second, it injects saliva into the plant cells, which produces phytotoxicity during feeding4 and allows viruses to be transferred from the aphid to the plant3,4,5. Third, sooty moulds frequently grow on the honeydew excreted by aphids and hinder photosynthetic activity3,4. Aphid infestation of the cucumber plant often results in yield loss and reduced quality. Understanding the molecular mechanism of aphid resistance is the most effective way to decrease aphid damage and produce higher-quality cucumbers. However, the molecular mechanisms that result in aphid resistance in cucumbers remain unclear.

Previous studies indicated that events such as protein phosphorylation, calcium flux, reactive oxygen species (ROS) generation and phytohormone changes occur in plants infested by aphids, leading to relevant transcriptional regulation in the early response to phloem-feeding insects6,7,8,9. The downstream defence compounds, including nutrient compounds, glutathione S transferases (GSTs), peroxidases and secondary metabolites, change after the perception of the piercing-sucking of insects8,10,11,12,13,14,15. It is known that downstream signals are activated by the induction of ROS16.

A single R gene most likely controls the defence of the plant against aphid infection. One plant R gene, Mi, a root-knot nematode (Meloidogyne spp.) resistance gene, was isolated from the tomato17. Mi was also found to confer resistance to the potato aphid (Macrosiphum euphorbiae) and whitefly (Bemisia tabaci)18,19,20,21. Gb3, which putatively encodes an NB-ARC-LRR type R protein, is known to confer resistance to the greenbug aphid in the wheat plant22. An additional R gene, Vat, which was found in the melon, encodes a cytoplasmic protein with a nucleotide binding site and leucine-rich repeat (NBS-LRR) structure and defends against Aphis gossypii Glover and against virus transmission by this aphid3,23. It has been reported that the NBS-LRR structure is essential for resistance to insects and plays an important role in the ability of the plant to resist insect infection8,21. Lectins were also found to confer resistance to aphids24,25. The Galanthus nivalis agglutinin (GNA) gene encoding a monocot mannose-binding condensate agglutinin has been widely and clearly documented to confer resistance to Myzus persicae26. The Allium sativum leaf agglutinin (ASAL) gene has been found to confer resistance to Aphis craccivora25 and Myzus nicotianae27. In addition, the Amaranthus caudatus agglutinin (ACA) gene confers resistance to Aphis gossypii Glover28 and Myzus persicae29.

Deep-sequencing technology has become a powerful tool enabling the concomitant sequencing of millions of signatures of the genome and identification of specific genes in a sample tissue30. This technique provides a qualitative and quantitative description of gene expression31,32. In the present study, we used one aphid-resistant cucumber cultivar, ‘EP6392’, which on average has fewer aphids on individual plants and a lower leaf curling degree and chlorophyll loss ratio than susceptible cultivars, to monitor responses to aphid infection at the RNA level. Digital gene expression (DGE) based on the Illumina Genome Analyzer platform was applied to analyse gene expression profiles in the whole genome with the aim of uncovering changes in gene expression after aphid infection and screening candidate genes that may increase resistance to aphid infection in cucumber.

Results

Data generation and filtering

For Illumina sequencing, the four-base recognition enzyme NlaIII was used to recognise and cut the CATG sites of double-stranded cDNA, and Illumina adaptor 1 was ligated at the 5' end. MmeI was then used to digest the cDNA 17 bp downstream of the CATG site, and Illumina adaptor 2 was ligated at the 3' end, yielding 21 bp tags (CATG+17 bp) with different adaptors at the two ends. The tags were cleaned and directly sequenced using massively parallel sequencing on the Illumina Genome Analyzer (see Materials and Methods; see Supplementary Fig. S1).

Approximately 5.91 million total raw sequence tags were obtained, with approximately 0.29 million total distinct tags per library, and approximately 5.75 million total clean sequence tags were obtained, with approximately 0.14 million distinct clean tags per library. The number of clean tags was approximately 97.28% of the number of raw tags (Table 1 and Table 2). The distribution of the total and distinct clean tag copy numbers showed a highly similar tendency for the nine libraries (see Supplementary Fig. S2). A reference gene database that included 30,364 cucumber sequences was pre-processed for tag mapping. Among the sequences, the genes with a CATG site accounted for 95.05%. To obtain the reference tags, all of the CATG+17 tags in the genes were used as gene reference tags. Finally, 130,941 total reference tag sequences with 92,326 unambiguous reference tags were obtained. In total, 20.05%–34.83% of the distinct clean tags were mapped unambiguously to the UniGene database, 31.20%–62.47% of the distinct clean tags were mapped to the cucumber genome database and 10.55%–13.93% of the distinct clean tags could not be mapped to the UniGene virtual tag database (Table 1 and Table 2). For more information on the characteristics of the DGE libraries and on tag mapping, see Supplementary Table S1 and Supplementary Table S2. The analysis of sequencing saturation in the nine libraries was performed to estimate whether or not the sequencing depth was sufficient to cover the whole transcriptome. The number of genes mapped by all clean tags and by unambiguous clean tags increased as the total number of tags increased. We found that the number of detected genes was saturated after the sequencing counts reached 2 million tags or higher (see Supplementary Fig. S3). We also found that 8.94%–20.58% of the unambiguous distinct clean tags mapped to sense genes and 7.34%–14.88% of the unambiguous distinct clean tags mapped to antisense genes in the nine libraries. Because the sequencing tags are strand-specific, Illumina sequencing distinguished transcripts that originate from the two DNA strands (see Supplementary Table S3).
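The construction of the reference tag set described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: the function names `catg_tags` and `unambiguous_reference_tags` are hypothetical, only the sense strand is scanned, and a tag is treated as unambiguous when it occurs in exactly one gene.

```python
def catg_tags(sequence, tag_length=17, site="CATG"):
    """Return every CATG+17 substring (21 nt by default) found in a sequence."""
    seq = sequence.upper()
    tags = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site) + tag_length
        if end <= len(seq):  # skip sites too close to the 3' end for a full tag
            tags.append(seq[start:end])
        start = seq.find(site, start + 1)
    return tags


def unambiguous_reference_tags(genes):
    """Map each tag that occurs in exactly one gene to that gene's identifier.

    `genes` is a dict of {gene_id: nucleotide sequence} (hypothetical input).
    """
    owners = {}
    for gene_id, sequence in genes.items():
        for tag in set(catg_tags(sequence)):
            owners.setdefault(tag, set()).add(gene_id)
    return {tag: next(iter(ids)) for tag, ids in owners.items() if len(ids) == 1}
```

With toy sequences in which two genes share the same tag, only the tag that is unique to a single gene is retained, which mirrors the distinction between total and unambiguous reference tags.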

Table 1 Categorisation and abundance of tags. Clean tags are tags that remained after filtering out low quality tags from the raw data. Distinct tags are different types of tags. Unambiguous tags are clean tags that remained after the removal of tags mapped to the reference sequences of multiple genes
Table 2 Categorisation and abundance of tags 2

Analysis of differentially expressed genes after aphid infestation

To determine changes in gene expression at the transcriptional level in cucumber leaves infested by aphids, a rigorous algorithm was applied to the normalised DGE data to identify differentially expressed genes by comparing (T2/0) vs. (C2/0), (T4/T2) vs. (C4/C2), (T6/T4) vs. (C6/C4) and (T8/T6) vs. (C8/C6) (Fig. 1). T4/T2 denotes the genes differentially expressed between 4 d and 2 d in plants infested by aphids (treatments, see Materials and methods section), C4/C2 denotes the genes differentially expressed between 4 d and 2 d because of the growth and development of the plant itself (controls, see Materials and methods section) and (T4/T2) vs. (C4/C2) denotes the genes differentially expressed between 4 d and 2 d because of aphid infestation, after eliminating those produced by the growth and development of the plant itself. A false discovery rate (FDR) ≤ 0.001 and an absolute log2 ratio (|log2Ratio|) ≥ 1 were used as thresholds to determine whether changes in gene expression were significant. The results showed that 964 genes, including 657 (68.15%) up-regulated genes and 307 (31.85%) down-regulated genes, were differentially expressed in (T2/0) compared with (C2/0) (Fig. 2). By comparing (T4/T2) with (C4/C2), we found that the expression of 1,146 genes was altered, including 471 (41.10%) up-regulated genes and 675 (58.90%) down-regulated genes. The expression of 1,029 genes was altered when (T6/T4) was compared with (C6/C4), including 690 (67.06%) up-regulated and 339 (32.94%) down-regulated genes. Additionally, 1,265 genes were differentially expressed in (T8/T6) when compared with (C8/C6), 494 (39.05%) of which were up-regulated and 771 (60.95%) of which were down-regulated (Fig. 2).
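The thresholding step can be illustrated with a short Python sketch. It assumes that the fold-change statistic is the log2 ratio of normalised tag counts (TPM) between the experimental and the control library and that an FDR value is already available for each gene; the function names and the `floor` value used to avoid division by zero are hypothetical and are not taken from the study.

```python
import math


def log2_ratio(tpm_experimental, tpm_control, floor=0.01):
    """log2(experimental / control); zero counts are raised to `floor` (assumed)."""
    return math.log2(max(tpm_experimental, floor) / max(tpm_control, floor))


def classify(tpm_experimental, tpm_control, fdr, fdr_cutoff=0.001, min_abs_log2=1.0):
    """Label a gene 'up', 'down' or 'unchanged' using the two thresholds."""
    ratio = log2_ratio(tpm_experimental, tpm_control)
    if fdr > fdr_cutoff or abs(ratio) < min_abs_log2:
        return "unchanged"
    return "up" if ratio > 0 else "down"
```

For example, a gene with 40 TPM in the experimental library, 10 TPM in the control library and an FDR of 0.0001 has a log2 ratio of 2 and would be labelled as up-regulated, whereas the same counts with an FDR of 0.01 would not pass the filter.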

Figure 1

The experimental design.

First, we compared the DGE profiles of the aphid-infested libraries (T2 vs. 0, T4 vs. T2, T6 vs. T4, T8 vs. T6) and of the control libraries (C2 vs. 0, C4 vs. C2, C6 vs. C4, C8 vs. C6) separately. We then compared (T2/0) vs. (C2/0), (T4/T2) vs. (C4/C2), (T6/T4) vs. (C6/C4) and (T8/T6) vs. (C8/C6) to obtain the genes differentially expressed because of aphid infestation, after eliminating those derived from the growth and development of the plant itself. T4/T2 denotes the genes differentially expressed between 4 d and 2 d in plants infested by aphids, C4/C2 denotes the genes differentially expressed between 4 d and 2 d because of the growth and development of the plant itself and (T4/T2) vs. (C4/C2) denotes the genes differentially expressed between 4 d and 2 d because of aphid infestation, after eliminating those caused by plant growth and development.

Figure 2

Number of differentially expressed genes in each comparison.

The numbers of up-regulated and down-regulated genes are presented. ‘B’ was the control group and ‘A’ was the experimental group in ‘A/B’; (A/B) was the control group and (C/D) was the experimental group in (C/D) vs. (A/B).

To determine the genes associated with cucumber aphid resistance, we first used a Q-value ≤ 0.05 as a threshold to screen for pathways that were significantly enriched in at least one of the comparisons and then selected the candidate pathways related to aphid resistance in cucumber based on previously published results6,22. Finally, 49 genes that may be associated with cucumber aphid resistance were chosen based on their functional annotation. The results showed that several processes such as signal transduction, plant-pathogen interaction, flavonoid metabolism, amino acid metabolism and sugar metabolism may be associated with cucumber aphid resistance (Table 3 and Table 4).

Table 3 Selected genes with altered expression in leaves of control cucumber plants
Table 4 Selected genes with altered expression in leaves of aphid-infested cucumber plants

Novel changes were observed in the expression levels of genes involved in signal transduction. Peroxidase 2, peroxidase 4, lignin-forming anionic peroxidase, L-ascorbate oxidase, respiratory burst oxidase homolog protein C, calcium-dependent protein kinase 7, calcium-binding protein CML19, calmodulin-like protein 1, WRKY protein, WRKY transcription factor 30, WRKY transcription factor 42 and WRKY transcription factor 51 were up-regulated in aphid-infested leaves (Table 3, Table 4). Among these genes, peroxidase 2, peroxidase 4, lignin-forming anionic peroxidase, L-ascorbate oxidase, calcium-dependent protein kinase 7, WRKY protein and WRKY transcription factor 51 were down-regulated at 8 d after aphid infestation, whereas the calmodulin-like protein 1 expression level decreased at 4 d after aphid infestation. Thus, signal transduction was activated by aphid infestation.

The expression levels of genes involved in plant-pathogen interactions, including cysteine-rich receptor-like protein kinase 3, pathogenesis-related protein 1, L-type lectin-domain containing receptor kinase IX.1, leucine-rich repeat receptor-like protein kinase At1g68400, lectin-domain containing receptor kinase VI.4, L-type lectin-domain containing receptor kinase S.4, L-type lectin-domain containing receptor kinase VI.1, receptor-like protein kinase At5g20050, LRR receptor-like serine/threonine-protein kinase At1g53440, TIR-NBS-LRR-AAA+ATPase class resistance protein, LRR receptor-like serine/threonine-protein kinase At5g59680, L-ascorbate oxidase, gibberellin 2-beta-dioxygenase 8, proline-rich receptor and acidic endochitinase, were higher than those of the control (Table 3, Table 4). Leucine-rich repeat receptor-like protein kinase At1g68400 was up-regulated immediately after aphid infestation but down-regulated 4 d after aphid infestation; LRR receptor-like serine/threonine-protein kinase At1g53440, TIR-NBS-LRR-AAA+ATPase class resistance protein and L-type lectin-domain containing receptor kinase S.4 were down-regulated during the first 4 d and then up-regulated. L-type lectin-domain containing receptor kinase IX.1, lectin-domain containing receptor kinase VI.4, LRR receptor-like serine/threonine-protein kinase At5g59680 and L-type lectin-domain containing receptor kinase VI.1 were down-regulated immediately after aphid infestation, up-regulated from 4 d to 6 d after aphid infestation and finally down-regulated. R genes and lectin genes have been shown to protect against aphid infection in plants such as wheat, tomato and melon. Therefore, our findings indicate that leucine-rich repeat receptor-like protein kinase At1g68400, LRR receptor-like serine/threonine-protein kinase At1g53440, TIR-NBS-LRR-AAA+ATPase class resistance protein, LRR receptor-like serine/threonine-protein kinase At5g59680, L-type lectin-domain containing receptor kinase IX.1, L-type lectin-domain containing receptor kinase S.4, L-type lectin-domain containing receptor kinase VI.1 and lectin-domain containing receptor kinase VI.4 are closely associated with aphid resistance in cucumber.

Flavonoids are important secondary metabolites that usually play a decisive role against insect infestation. In this study, changes in the expression of several genes responsible for flavonoid metabolism were observed: the genes encoding 1-aminocyclopropane-1-carboxylate oxidase homolog 6, flavonoid 3',5'-hydroxylase, chalcone-flavonone isomerase, naringenin,2-oxoglutarate 3-dioxygenase and chalcone synthase 2 were up-regulated after aphid infestation. However, the expression of some of these genes, namely chalcone-flavonone isomerase, naringenin,2-oxoglutarate 3-dioxygenase and chalcone synthase 2, decreased 2 d after aphid infestation (Table 3, Table 4).

Many genes involved in amino acid metabolism and sugar metabolism showed changes in mRNA level. Genes including the dihydrolipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase, serine-glyoxylate aminotransferase, polygalacturonase At1g48100, pectinesterase 53 and beta-galactosidase 3 were up-regulated and then down-regulated 4 d after aphid infestation (Table 3, Table 4).

These metabolic pathways associated with signal transduction, plant-pathogen interaction, flavonoid metabolism, amino acid metabolism and sugar metabolism were involved in the response to the stress of aphid infestation. The genes in these pathways, especially L-type lectin-domain containing receptor kinase IX.1, lectin-domain containing receptor kinase VI.4, L-type lectin-domain containing receptor kinase S.4, L-type lectin-domain containing receptor kinase VI.1, leucine-rich repeat receptor-like protein kinase At1g68400, LRR receptor-like serine/threonine-protein kinase At1g53440, TIR-NBS-LRR-AAA+ATPase class resistance protein and LRR receptor-like serine/threonine-protein kinase At5g59680 are likely to be associated with cucumber aphid resistance.

Tag-mapped genes confirmed by qRT-PCR

To confirm the tag-mapped genes in cucumber leaves infected by the aphid, nine genes were selected randomly for qRT-PCR analysis over time. These genes were involved in signal transduction, plant-pathogen interaction, flavonoid metabolism, amino acid metabolism and sugar metabolism. Except for 1-aminocyclopropane-1-carboxylate oxidase homolog 6, the expression of the chosen genes was in agreement with the results of the tag-sequencing analysis patterns (Fig. 3).

Figure 3

Quantitative RT-PCR validation of tag-mapped genes from cucumber leaves.

TPM, number of transcripts per million clean tags.

Discussion

High-throughput tag-sequencing has already been applied to study plant growth and development at the molecular level32,33,34. The tag-mapped technique is known to fully cover the whole plant genome, although many genes have not been annotated34. Using the tag-sequencing technique to analyse gene expression at the whole transcriptional level can increase the understanding of regulatory mechanisms and the identification of differentially expressed genes that render cucumber cultivars resistant to aphids. In this study, gene expression profiling was performed using the tag-sequencing technique after aphid infection in the cucumber leaf. Approximately 5.91 million total raw tags were sequenced per library and approximately 5.75 million total clean tags were obtained per library and more than 85% of the unique tags were matched with the cucumber unigenes or genomic sequence (Table 1, Table 2).

Generation of ROS is a common phenomenon in plant responses to both abiotic and biotic stresses35. ROS such as superoxide (O2•−), hydrogen peroxide (H2O2) and hydroxyl radicals (HO•) are directly derived from oxidative stress. ROS can induce an array of cellular protection mechanisms including gene expression related to a defensive response36,37. The rapid increase in ROS concentration observed after both biotic and abiotic injuries is called an “oxidative burst”38. Respiratory burst oxidase homolog (RBOH) plays an important role in ROS-mediated signaling33. In the present study, higher expression of RBOH was found in leaves from aphid-infested plants than in leaves from control plants (Table 3, Table 4). In addition, the expression pattern of some genes involved in the ROS scavenging system changed significantly after aphid infection. For example, genes encoding peroxidase (POD) and phenylalanine ammonia-lyase (PAL) were up-regulated in the infected plant, suggesting that the acclimation of POD and PAL expression may mediate aphid resistance in cucumber.

Calcium ions (Ca2+) serve as secondary messengers mediating developmental responses, stress signalling and the response to herbivore attack in plants7. After sensing aphid feeding, Ca2+ sensors activate downstream defence signaling cascades by increasing the expression of calmodulin, calmodulin binding proteins and calcium-dependent protein kinases (CDPKs)39. The results of this study support these mechanisms. Calcium-binding protein CML19, calcium-dependent protein kinase 7 and calmodulin-like protein 1 were up-regulated in cucumber leaves infected by aphids.

Secondary metabolites in the induced defence pathways play a decisive role in the resistance to pathogens and herbivore infestation40,41. Flavonoids probably serve as chemical deterrents to defend against pest attacks13,42. Isoflavonoids are used to resist pests and diseases either as protectant phytoanticipins or directly as therapeutic phytoalexins against invading pests43,44. In the treated cucumber plants of the present study, many genes with potential roles in flavonoid metabolism were found to have altered expression in response to aphid infection (Table 3, Table 4). Naringenin,2-oxoglutarate 3-dioxygenase, flavonoid 3',5'-hydroxylase, chalcone-flavonone isomerase, chalcone synthase 2, 1-aminocyclopropane-1-carboxylate oxidase homolog 6 and isoflavone 2'-hydroxylase were found to have increased mRNA levels within 2 d of infection that then declined, suggesting that flavonoid metabolism is rapidly activated in response to stress.

The content of amino acids in plants is closely related to aphid resistance14,45,46. As observed in some lucerne cultivars, the balance of amino acids contributes to aphid resistance15. In this study, genes encoding the dihydrolipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase and serine-glyoxylate aminotransferase were up-regulated after aphid infestation and their expression then declined. However, threonine synthase, chloroplastic and glutamate decarboxylase 4 were down-regulated at 2 d after treatment and then up-regulated.

Sugar plays an important role against attacks by insects. Genes encoding pectin esterase, cellulose synthase and xyloglucan endotrans-glycosylase/hydrolase were previously found to be up-regulated after aphid infestation of several different plant species such as Apium graveolens, Arabidopsis thaliana and Nicotiana attenuata47,48,49,50,51,52. In this study, genes related to sugar metabolism including 6-phosphogluconate dehydrogenase, decarboxylating, beta-galactosidase 3, polygalacturonase At1g48100, pectinesterase 53 and pectinesterase/pectinesterase inhibitor U1 were up-regulated after aphid infection.

R genes mainly regulate the resistance of plants to pathogens and insects. Many R genes encoding nucleotide-binding leucine-rich repeat (NB-LRR) proteins have been isolated. It has been shown that the R proteins involved in resistance to pest infection have a common NBS-LRR structure motif and these proteins are catalogued into one cluster53,54,55,56. Mi, a plant R gene, has been found to confer resistance to the potato aphid, whitefly and root-knot nematode17,18,19,20,21. Vat, a melon R gene with a structure similar to that of Mi, encodes a cytoplasmic protein and is known to defend against Aphis gossypii Glover and against virus transmission by this aphid3,23. Gb3 encodes an NB-ARC-LRR type R protein that is believed to mediate resistance to greenbug aphid infection22. In the present study, the leucine-rich repeat receptor-like protein kinase At1g68400 (Cucsa.283380.1), LRR receptor-like serine/threonine-protein kinase At1g53440 (Cucsa.057870.1), TIR-NBS-LRR-AAA+ATPase class resistance protein (Cucsa.091710.1) and LRR receptor-like serine/threonine-protein kinase At5g59680 (Cucsa.101540.1) genes (all including an LRR structure) were up-regulated in cucumber leaves after aphid infestation (Table 4). These data suggest that these genes may play important roles in the response of the cucumber to aphid infection and may belong to the family of plant R genes against Aphis gossypii Glover in cucumber.

It has been shown that the PTA (Pinellia ternata agglutinin) gene is a type of lectin gene involved in resistance to Myzus persicae24. The GNA, ASAL and ACA genes that are involved in lectin synthesis also confer resistance to Myzus persicae26,27,29. In addition, the ASAL gene is associated with high resistance to Aphis craccivora25 and the ACA gene enhances resistance to Aphis gossypii Glover in plants28. In this study, the lectin-related genes L-type lectin-domain containing receptor kinase IX.1 (Cucsa.218550.1), lectin-domain containing receptor kinase VI.4 (Cucsa.176670.1), L-type lectin-domain containing receptor kinase S.4 (Cucsa.175650.1) and L-type lectin-domain containing receptor kinase VI.1 (Cucsa.176660.1) were up-regulated after aphid infestation in the cucumber leaf (Table 4), suggesting that these genes may be very important contributors to aphid resistance.

In conclusion, this study showed that the expression of genes associated with many functional aspects was altered after aphid infestation. The qRT-PCR results agreed well with the tag-sequencing analysis patterns. The plant-pathogen interaction, flavonoid biosynthesis, amino acid metabolism, sugar metabolism and signal transduction were changed, as determined by gene expression profiling. Genes encoding lectins (L-type lectin-domain containing receptor kinase IX.1, lectin-domain containing receptor kinase VI.4, L-type lectin-domain containing receptor kinase S.4 and L-type lectin-domain containing receptor kinase VI.1) and LRR proteins (leucine-rich repeat receptor-like protein kinase At1g68400, LRR receptor-like serine/threonine-protein kinase At1g53440, TIR-NBS-LRR-AAA+ATPase class resistance protein and LRR receptor-like serine/threonine-protein kinase At5g59680) were identified as important target defence genes for further study in aphid resistance in cucumber.

Methods

Plant materials

The aphid-resistant cultivar ‘EP6392’ of C. sativus L. was selected for evaluation of changes in gene expression profiles after aphid infection. On 23 March 2013, 54 seeds were sown in trays filled with potting substrate (nutrients available: 40–60 g/kg total NPK nutrients, ≥350 g/kg total humus content, pH 6.5–7.5) in chambers at 25°C (18 h)/18°C (6 h) day/night with a relative humidity of 50% to 60%.

Aphid culture and infection

One aphid (Aphis gossypii Glover) was collected from experimental cucumber fields of the Department of Horticulture at Yangzhou University in the autumn of 2012 and reared and reproduced on the susceptible cucumber cultivar ‘XiaFengin’ at 25°C (18 h)/18°C (6 h) day/night with a relative humidity of 50% to 60%. Its offspring were used to infest the cucumber plants. Ten days after sowing, half of the seedlings (27) were infested with these aphids as the treatment and the other half were left uninfested as the control. The back of the first true leaf of each treated seedling was infested with five apterous adult aphids. The aphids were allowed to breed and their offspring to reproduce freely on the seedlings. The first true leaves of three plants each from the treatment and control groups were used for gene expression analysis.

Gene expression library construction

Total RNA isolated from the treated and control leaves of the cucumber at 0, 2, 4, 6 and 8 d after aphid infection was used to construct gene expression libraries. The libraries were named 0, T2, T4, T6, T8, C2, C4, C6 and C8, respectively. The total RNA was checked for quality and quantity using a Biophotometer Plus (Eppendorf, Germany). Oligo (dT) magnetic bead adsorption was used to purify mRNA and then cDNA was synthesised using Oligo (dT) as primers. The 5' ends of the tags were generated by one of two types of endonuclease: NlaIII or DpnII. Typically, the bead-bound cDNA was subsequently digested with the restriction enzyme NlaIII, which recognises and cuts CATG sites. The 3' cDNA fragments bound to oligo (dT) beads were washed to remove the unbound fragments and the Illumina adaptor 1 was ligated to the sticky 5' end of the digested bead-bound cDNA fragments. The junction of the Illumina adaptor 1 and the CATG site was recognised by MmeI, a type of endonuclease with separate recognition and digestion sites that cuts 17 bp downstream of the CATG site, producing tags with adaptor 1. After removal of the 3' fragments by magnetic bead precipitation, the Illumina adaptor 2 was ligated to the 3' ends of the tags, yielding tags with different adaptors at both ends to form a tag library. After 15 cycles of linear PCR amplification, 85-base fragments were purified by PAGE gel electrophoresis. During the quality control steps, an Agilent 2100 Bioanalyzer and ABI StepOnePlus Real-Time PCR System were used for quantification and qualification of the sample library. Finally, the library was sequenced using an Illumina HiSeq 2000 device; the sequencing reads were 49 nt long (see Supplementary Fig. S1). Sequencing of the transcripts in the form of special constructs was completed by the Beijing Genomics Institute (BGI).

Data analysis

Raw sequences contained 3' adaptor fragments, a few low-quality sequences and several types of impurities. The raw sequences were transformed into clean tags after the following data-processing steps: 3' adaptor sequence removal; empty read removal (reads with only 3' adaptor sequences and no tags); low-quality tag removal (tags with unknown sequences 'N'); removal of tags that were too long or too short, leaving tags of 21 nt; removal of tags with a copy number of 1 (probably caused by sequencing error). The types of clean tags were represented as distinct clean tags. Subsequently, the clean tags and distinct clean tags were classified according to their copy number in the library and their percentage among the total clean and distinct tags was defined. Saturation analysis was performed to determine whether the number of detected genes increased along with increases in the sequence amount (total tag number). The virtual library contained all of the possible sequences of CATG+17 bases among the reference gene sequences. All clean tags were mapped to the reference sequences and only a 1 bp mismatch was tolerated. Clean tags mapped to the reference sequences from multiple genes were filtered out. The remaining clean tags were designated as unambiguous clean tags. The number of unambiguous clean tags for each gene was calculated and then normalised to TPM (number of transcripts per million clean tags).
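The tag-filtering and normalisation steps can be sketched as follows. This is a simplified illustration rather than the software used in the study: it assumes that the 3' adaptor has already been trimmed from every read, and the function names `clean_tag_counts` and `tpm` are hypothetical.

```python
from collections import Counter


def clean_tag_counts(tags, tag_length=21):
    """Count tags and keep only clean ones.

    A tag is dropped if it contains 'N', is not exactly `tag_length` nt long,
    or has a copy number of 1.
    """
    counts = Counter(tag.upper() for tag in tags)
    return {
        tag: n
        for tag, n in counts.items()
        if len(tag) == tag_length and "N" not in tag and n > 1
    }


def tpm(gene_tag_counts, total_clean_tags):
    """Normalise per-gene unambiguous tag counts to transcripts per million clean tags."""
    return {
        gene: count * 1_000_000 / total_clean_tags
        for gene, count in gene_tag_counts.items()
    }
```

In this sketch the keys of the returned dictionary correspond to the distinct clean tags and the sum of its values to the total number of clean tags, which is the denominator used for the TPM normalisation.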

Identification of differentially expressed genes

Referring to the significance of digital gene expression profiles57, FDR ≤ 0.001 and |log2Ratio| ≥ 1 were used as thresholds to evaluate the significance of expression differences of unigenes in the sequence counts across libraries. Genes with similar expression patterns are usually functionally correlated. In this study, cluster analysis of gene expression patterns was performed with Cluster58 software and Java Treeview59 software. The GO has three ontologies: molecular function, cellular component and biological process. The GO enrichment analysis of functional significance applied a hypergeometric test to map all differentially expressed genes (DEGs) to terms in the GO database and to identify the GO terms that were significantly enriched in the DEGs compared to the genome background. Pathway-based analysis helps to further clarify the biological functions of genes. KEGG is the major public pathway-related database60. In the present study, pathway enrichment analysis identified significantly enriched metabolic pathways or signal transduction pathways in DEGs compared with the whole genome background.
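As an illustration of the hypergeometric test mentioned above, the following Python sketch computes the probability of observing at least a given number of annotated DEGs by chance. The function and parameter names are hypothetical, and the multiple-testing correction that produces Q-values is not shown.

```python
from math import comb


def hypergeom_enrichment_p(background_genes, term_genes, deg_total, deg_in_term):
    """Upper-tail hypergeometric probability P(X >= deg_in_term).

    background_genes: number of annotated genes in the genome background
    term_genes:       number of background genes annotated to the term
    deg_total:        number of differentially expressed genes (DEGs)
    deg_in_term:      number of DEGs annotated to the term
    """
    total = comb(background_genes, deg_total)
    upper = min(deg_total, term_genes)
    favourable = sum(
        comb(term_genes, i) * comb(background_genes - term_genes, deg_total - i)
        for i in range(deg_in_term, upper + 1)
    )
    return favourable / total
```

For a toy background of 10 genes of which 4 belong to a term, drawing 3 DEGs of which at least 2 belong to the term has a probability of 40/120, so a term would need a much stronger over-representation than this to be called enriched.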

qRT-PCR analysis

Quantitative RT-PCR (qRT-PCR) analysis was used to verify the DGE results. The RNA samples used for the qRT-PCR assays of the 9 randomly chosen genes were the same as those used for the DGE experiments. Gene-specific primers, which are listed in Table 5, were designed using Primer Premier 5.0. qRT-PCR was performed according to the manufacturer's specifications (TaKaRa SYBR PrimeScript RT-PCR Kit, Dalian, China). The cucumber actin gene was used as an internal standard and amplified with the following primers: forward: 5'–TCGTGCTGGATTCTGGTG–3' and reverse: 5'–GGAGTGGTGGTGAACAT–3'. The relative expression levels of the genes were determined as 2^(−ΔΔCT). The reactions were incubated in a 96-well plate. The PCR program consisted of 95°C for 30 s and 40 cycles of 95°C for 5 s and 50–60°C for 30 s. qRT-PCR analysis was performed on an iQ 5 multicolour real-time PCR detection system (Bio-Rad, USA).
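The relative expression calculation can be written out as a short Python sketch. The internal standard is the actin gene, as stated above; the choice of calibrator sample (for example the 0 d library) and the function name are assumptions made only for illustration.

```python
def relative_expression(ct_target_sample, ct_actin_sample,
                        ct_target_calibrator, ct_actin_calibrator):
    """Relative expression by the 2^(-ddCT) method.

    dCT  = CT(target gene) - CT(actin), computed per sample
    ddCT = dCT(sample of interest) - dCT(calibrator sample)
    """
    delta_sample = ct_target_sample - ct_actin_sample
    delta_calibrator = ct_target_calibrator - ct_actin_calibrator
    return 2 ** -(delta_sample - delta_calibrator)
```

With a target CT of 22 and an actin CT of 18 in the sample of interest, and a target CT of 25 and an actin CT of 18 in the calibrator, ΔΔCT is −3 and the relative expression is 8-fold.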

Table 5 Detailed information regarding the primers used for qRT-PCR variation