Introduction

A fruit is the product of determinate growth from an angiospermous flower or inflorescence, consisting of the seed and its envelope1. Fruits can be classified into dry fruit and fleshy fruit. For dry fruits, such as maize, rice and nuts, the seed is usually the important product. For fleshy fruits, such as grape, tomato and cucurbits, the fruit flesh developed from diverse tissues in the inflorescence is the edible part. Due to its economic and agricultural importance, intensive studies have been performed to explore the molecular, cellular and physiological events that regulate fruit growth and differentiation2.

Generally, fruit development can be divided into early fruit development and fruit ripening. Early fruit development includes three phases: 1) fruit set, the earliest phase that involves ovary development and fertilization; 2) fruit growth mostly by cell division, which involves seed formation and early embryo development; 3) fruit growth predominantly by cell expansion, during which the fruit reaches its final size and the embryo matures2. Therefore, fruit yield and quality largely depend on cell division and cell expansion during early fruit development.

Cell cycle regulation plays a pivotal role in plant growth and development. Cell division and expansion are tightly controlled by key components of the cell cycle machinery such as the cyclin-dependent protein kinases (CDKs) and the regulatory cyclins3,4. The cell cycle is regulated mainly at two checkpoints, the G1-to-S and G2-to-M transition points. CDKs act as serine/threonine kinases that form protein complexes with cyclins and phosphorylate numerous substrates at the two checkpoints, which initiate DNA replication and mitosis, respectively. Five classes of CDKs (A–E) are found in plants5. CDKA is the most numerous class that is constitutively expressed and translated during the cell cycle, while CDKB particularly regulates the S-to-G2 and G2-to-M transitions5,6. Ten groups of cyclins have been identified in Arabidopsis7. A-type and B-type cyclins accumulate during the G2 and early M phases and they regulate the S-to-M and G2-to-M transitions, respectively, while D-type cyclins mainly regulate the phase transition from G1-to-S5,6,8. Recent studies showed that the expression of cyclins and CDKs increased during early tomato fruit development9,10. Similarly, PaCDKB1 and PaCYCD1 were significantly up-regulated in the early phase of avocado fruit development11 and cell cycle genes and CDKs were highly expressed during the cell division phase of fruit development in cucumber12.

Microtubules are essential components for cell division, cell expansion and cell morphogenesis13. Stable attachment of microtubules to kinetochores (a specialized protein structure in the centromere) is required for proper chromosome segregation and thus cell division14. Disruption of the orientation of cortical microtubules resulted in defective cell expansion15. Moreover, kinesins, a class of microtubule-based motor proteins, function in both mitosis and meiosis in plants16,17. Four types of kinesins have been identified in Arabidopsis. Three kinesins, CsKF1, CsKF2 and CsKF3, have enriched expression in developing fruit and may play important roles during early fruit development in cucumber18.

Expansins are plant cell wall proteins that regulate cell division and plant growth by mediating cell wall extension19,20. Four families of expansins are identified in plants, α-expansins (EXPAs), β-expansins (EXPBs), expansin-like A (EXLA) and expansin-like B (EXLB), among which EXPA is the largest family and EXLB the smallest19,20. Previous studies showed that expansins were highly enriched in growing and ripening fruits, regulating cell enlargement through primary wall synthesis and restructuring21,22.

Cucumber (Cucumis sativus) is a globally cultivated vegetable crop of great economic and nutritional importance23. Cucumber fruit is a fleshy fruit that develops from an enlarged ovary. In contrast to most fruits, which are consumed when ripe, cucumber fruit is harvested and consumed while immature (1–2 weeks after anthesis), either as fresh product or as processed food. Therefore, early fruit development directly determines cucumber yield and quality. Fruit development starts with a short fruit set phase, which is followed immediately by a rapid cell division phase until 5 days after anthesis (DAA) and then by an exponential expansion phase (5–14 DAA)18,24,25. Within the two weeks of development, the fruit weight increases around 200 fold in cucumber12,25. Cucumber fruit length increases concomitantly with cell division and expansion, with the most rapid increase during 4–12 DAA, and reaches its final size around 12–16 DAA26. Although several studies have been performed on transcriptome profiling or characterization of cell cycle-related genes at different developmental phases (cell division phase versus cell expansion phase)12,18,24,26,27, to the best of our knowledge, no such study has been conducted on cucumber fruits at the same developmental stage but in cultivars with different final fruit lengths. The fruit length of cucumber varies substantially among four geographic groups: the East Asian group, the Eurasian group, the Xishuangbanna group and the Indian group. Among them, the East Asian group has the longest fruits (over 30 cm), while the Indian group has the shortest (around 4 cm)28. In China, cucumbers can be divided into the northern China type (dark green, dense spines and warts) and the southern China type (light green, sparse spines and warts) and the fruit length varies from as short as 8 cm to as long as 35 cm (Figure S1).

Although fruit length is a key agricultural trait of considerable economic value, the regulatory mechanism of fruit length remains elusive. In order to understand the genes and gene networks that might play a role in controlling the final fruit length in cucumber, we used high-throughput RNA-Seq data to compare the transcriptomes of early fruits from two near-isogenic lines with different fruit lengths. We obtained 3955 differentially expressed genes between the two lines, of which 2368 genes were significantly up-regulated in the line with long fruit. Moreover, microtubule and cell cycle related genes were dramatically activated in the long fruit and transcription factors were implicated in the fruit length regulation in cucumber. Thus, our data provide valuable information for further dissection of the molecular mechanism regulating fruit length in cucumber.

Results

Comparison of gene expression of early fruits from two near isogenic lines with different fruit lengths

To explore genes and gene networks that control fruit length in cucumber, we performed RNA-Seq analysis of early fruits from two near-isogenic lines 408 and 409. Line 408 is a spontaneous mutant from line 409 and has been stabilized via seven generations of selfing prior to this study. Both 408 and 409 lines belong to the northern China cucumber cultivar, with long dark green fruit covered with dense spines and warts. There is no visible difference between line 408 and 409 in terms of plant morphology and plant height, except that line 408 has long fruits while line 409 has short fruits. At the commercially mature fruit stage, the average fruit length of line 408 is 28.8 cm, which is significantly longer than that (21.8 cm) in line 409 (t-test, p < 0.05) (Figure 1A–1B). The fruit length difference is apparent even as early as at the anthesis stage (Figure 1C). Given that gene expressions usually display significant differences prior to morphologic changes, we chose the very early fruits (around four days before anthesis) for transcriptome analyses (square bracket in Figure 1D).

Figure 1

Morphological characterization of two near-isogenic cucumber lines with different fruit lengths.

(A)–(D) Cucumber line 408 with long fruit (left) and its near-isogenic line 409 with short fruit (right) at commercially mature fruit stage (A)–(B), anthesis stage (C) and 4 days before anthesis (D). The bars in B represent the standard deviation (n = 10). Asterisk indicates that fruit length is significantly different between 408 and 409 (unpaired t test, P < 0.05). Square brackets in C and D show the lengths of early fruits. Samples at the same developmental stage as in D were used for RNA-Seq analyses. Scale bars represent 1 cm.

High-throughput RNA-Seq sequencing generated 14.47 to 16.08 million single-end reads for each sample, and two biological replicates were sequenced for each line (Table 1). After low-quality regions and adapter sequences were removed, 13.4–14.9 million clean reads were mapped to the cucumber genome using TopHat29 (Table 1). We summarized the expression level of each gene with HTSeq30. After lowly expressed genes were removed, we obtained 20000 genes that had at least 1 RPM (reads per million) in at least two samples. We then used the R package edgeR to identify differentially expressed genes (DEGs)31. Using false discovery rate (FDR) < 0.05 and fold change > 2 as the significance cutoffs, we found 3955 DEGs, of which 2368 genes were significantly up-regulated and 1587 genes were significantly down-regulated in the fruits of line 408 as compared to those of line 409 (Supplemental Table S1).
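The significance cutoffs described above can be illustrated with a minimal Python sketch. It is not the edgeR analysis itself: it only shows how genes could be split into up- and down-regulated sets once a log2 fold change and an FDR value are available for each gene. The function name, the tuple layout and the example genes are hypothetical.

```python
import math

def classify_degs(results, fdr_cutoff=0.05, min_fold=2.0):
    """Split genes into up- and down-regulated lists.

    results: iterable of (gene_id, log2_fold_change, fdr) tuples, where
    log2_fold_change is assumed to be log2(line 408 / line 409).
    """
    min_log2fc = math.log2(min_fold)
    up, down = [], []
    for gene_id, log2fc, fdr in results:
        if fdr >= fdr_cutoff:
            continue  # not significant after multiple-testing correction
        if log2fc > min_log2fc:
            up.append(gene_id)
        elif log2fc < -min_log2fc:
            down.append(gene_id)
    return up, down

# Hypothetical values for illustration only (not taken from the study).
example = [
    ("geneA", 3.1, 0.001),   # significant and more than 2-fold up
    ("geneB", -2.4, 0.01),   # significant and more than 2-fold down
    ("geneC", 0.5, 0.001),   # significant but below the fold-change cutoff
    ("geneD", 4.0, 0.20),    # large change but not significant
]
up, down = classify_degs(example)
print(up, down)
```

Both thresholds are applied as strict inequalities here, matching the "FDR < 0.05" and "fold change > 2" wording.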

Table 1 Summary of transcriptome sequencing data

Validation of RNA-Seq data by quantitative real time RT-PCR assays

To verify the differentially expressed genes (DEGs) identified by RNA-Seq, we performed quantitative real time RT-PCR (qRT-PCR) assays using independently collected samples that were at the same developmental stage as those used for the RNA-Seq analysis. Among the 20 randomly selected DEGs, 14 genes showed higher expression and 6 genes displayed lower expression in line 408. As shown in Figure 2, all 20 genes showed the same expression patterns in the qRT-PCR assays as in the RNA-Seq data. The Pearson correlation coefficient between qRT-PCR and RNA-Seq data was 0.945 (p = 9.5E-11), indicating that the RNA-Seq data were highly reliable.
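As a rough illustration of this comparison, the sketch below computes a Pearson correlation coefficient between two lists of expression changes. Whether the original correlation was computed on fold changes or on log2 fold changes is not stated, so log2 values are an assumption here, and the numbers are hypothetical placeholders rather than data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical log2 fold changes (line 408 vs line 409) for five genes.
rnaseq_log2fc = [3.7, 2.9, 1.5, -1.0, -2.3]
qpcr_log2fc = [3.2, 2.5, 1.8, -1.2, -1.9]
r = pearson_r(rnaseq_log2fc, qpcr_log2fc)
print(f"Pearson r = {r:.3f}")
```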

Figure 2

Verification of differentially expressed genes by qRT-PCR.

Fourteen DEGs with higher expression and six DEGs with lower expression in line 408 were chosen for qRT-PCR validation. The relative expression level of each gene was expressed as the fold change between two lines in the RNA-Seq data (white bar) and qRT-PCR data (gray bar). The cucumber UBIQUITIN gene was used as an internal control to normalize the expression data. The bars represent the standard deviation (n = 3). Asterisks indicate that the gene transcriptions are significantly different between lines 408 and 409 (unpaired t test, P < 0.05).

Genetic variations between lines 408 and 409

To confirm that lines 408 and 409 have nearly identical genetic background and to identify candidate genetic markers for future gene mapping and functional studies, we used the mapped RNA-Seq reads to obtain single nucleotide polymorphisms (SNPs) between these two lines. Using samtools32 and vcftools33, we obtained only 850 SNPs in the genomes of these two lines, confirming that lines 408 and 409 are near isogenic lines with very few genetic variations. We calculated the density of SNPs in non-overlapping 100 kb bins of each chromosome (Figure 3A) and found four significant peaks on chromosome 1 (Chr1: 1.0 Mb), chromosome 5 (Chr5: 1.4 Mb; 2.3–2.5 Mb) and chromosome 6 (Chr6: 18.1–19.0 Mb). The peak on chromosome 6 has the highest density, with more than 50 SNPs in the neighboring 100 kb bins. Chromosomes 4 and 7 have the lowest SNP density, with only one SNP on each chromosome (Figure 3A). We also calculated the density of DEGs in 10 Mb bins in the genome and found that most DEGs were not located in the SNP rich regions (with the exception of the region near the SNP peak on chromosome 6) (Figure 3B), suggesting that the majority of the SNPs may not affect gene expression directly. 145 SNPs were found to be within or near 70 DEGs (Supplemental Table S2) and thus they may serve as molecular markers for map-based cloning and functional characterization of genes that may regulate fruit length in cucumber.
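The binning step can be sketched as follows. This is an illustrative Python version, not the R scripts used in the study; the function name and the SNP positions are hypothetical, and positions are assumed to be 1-based as in VCF files.

```python
from collections import Counter

BIN_SIZE = 100_000  # 100 kb, the bin width used for SNP density

def snp_density(snps, bin_size=BIN_SIZE):
    """Count SNPs per non-overlapping bin.

    snps: iterable of (chromosome, position) pairs with 1-based positions.
    Returns a Counter keyed by (chromosome, bin_index), where bin_index 0
    covers positions 1..bin_size.
    """
    counts = Counter()
    for chrom, pos in snps:
        counts[(chrom, (pos - 1) // bin_size)] += 1
    return counts

# Hypothetical SNP positions for illustration only.
snps = [
    ("Chr6", 18_150_000),
    ("Chr6", 18_199_999),
    ("Chr6", 18_200_001),
    ("Chr1", 1_000_000),
]
density = snp_density(snps)
for (chrom, index), n in sorted(density.items()):
    start = index * BIN_SIZE + 1
    print(f"{chrom}:{start}-{start + BIN_SIZE - 1}\t{n}")
```

The same function could be reused for DEG density by passing gene start positions and a larger `bin_size`.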

Figure 3

Genome-wide distributions of SNPs (A) and DEGs (B) in the cucumber genome.

Numbers of SNPs and DEGs were calculated over non-overlapping 100 kb and 10 Mb bins, respectively.

Microtubule and cell cycle related genes are involved in fruit length regulation in cucumber

To further understand the function of these DEGs, gene ontology (GO) term enrichment analysis (P ≤ 0.05) was performed. For genes that were up-regulated in line 408, the most significantly enriched GO terms were “microtubule-based movement” (p = 1.3E-15) in the biological process (GOBP) group (green in Figure 4), “microtubule binding” (p = 6.1E-18) in the molecular function (GOMF) group (blue in Figure 4) and “kinesin complex” (p = 1.2E-15) in the cellular component (GOCC) group (red in Figure 4). Because kinesin family proteins are a class of microtubule-based motor proteins that function in mitosis as well as meiosis in both plant and animal systems16,34,35, the top enriched GO term in each of the three categories among the up-regulated DEGs in the long-fruited line 408 was microtubule related. Forty microtubule-related genes that had higher expression in line 408 are listed in Table 2. For example, the microtubule associated protein MAP65 (Csa5G169090) showed 13-fold (log2FC = 3.72) and KINESIN 1 (Csa1G065980) showed 7.6-fold (log2FC = 2.92) higher expression in line 408 than in line 409 (Table 2). Further, qRT-PCR from independently generated samples verified that five microtubule related genes had significantly higher expression in the fruit of line 408 (top row in Figure 2). In addition, genes involved in nucleosome assembly, DNA replication and cell cycle regulation were highly enriched in the up-regulated DEGs in line 408 (Figure 4). Further characterization of the DEGs using the MapMan software showed that genes implicated in cell wall (Figure S2A), cell division and cell cycle (Figure S2B) were predominantly up-regulated in the fruit of line 408. In accordance with this finding, 29 cyclin family genes (Table 3) and 15 expansin family genes (Table 4) showed elevated transcription in line 408. 
Moreover, qRT-PCR confirmed that four cell cycle genes (CYCB1;2, CYCD3;1-a, CDKB1;2, CDKB2;2) and two expansin genes (EXPA4-a and EXPA5) were significantly induced in the fruit of line 408 compared with line 409 (Figure 2), suggesting that the activation of microtubule, cell cycle and cell wall related genes is required for fruit elongation in cucumber.
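The fold changes quoted for MAP65 and KINESIN 1 follow directly from the reported log2FC values, since fold change = 2^log2FC. A short Python check (the function name is illustrative):

```python
def log2fc_to_fold(log2fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc

map65_fold = log2fc_to_fold(3.72)     # MAP65 (Csa5G169090)
kinesin1_fold = log2fc_to_fold(2.92)  # KINESIN 1 (Csa1G065980)
print(f"MAP65: {map65_fold:.1f} fold, KINESIN 1: {kinesin1_fold:.1f} fold")
```

The results round to about 13 and 7.6, consistent with the values given in the text.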

Table 2 List of selected microtubule-related genes that were differentially expressed in the 408 and 409 cucumber fruit
Table 3 List of selected cyclin family genes that were differentially expressed in the 408 and 409 cucumber fruit
Table 4 List of selected expansin family genes that were differentially expressed in the 408 and 409 cucumber fruit
Figure 4

Significantly enriched Gene Ontology (GO) terms (P < 0.05) in the up-regulated genes in the fruit of line 408 vs 409.

GO terms belonging to biological processes (GOBP), molecular functions (GOMF) and cellular components (GOCC) are shown in green, blue and red, respectively. GO terms are sorted by p-value.

Transcription factors are implicated in cucumber fruit length control

For the down-regulated genes in line 408, the top three enriched GO terms were “sequence-specific DNA binding transcription factor activity” (p = 2.9E-05), “microbody” (p = 1.69E-06) and “CVT pathway” (p = 9.6E-06) (Figure 5). A total of 130 genes that function in transcriptional regulation were found to be repressed in the fruit of line 408 (Supplemental Table 3) and these transcription factors were distributed in different gene families. As shown in Figure 6, 20, 18, 16 and 16 genes assigned to the Myb, bHLH, NAC and ERF/AP2 families, respectively, displayed reduced expression in the long-fruited line 408, suggesting that these transcription factors may function as negative regulators of fruit elongation in cucumber. The predicted cucumber SPATULA (SPT) (Csa2G356640) showed a 2 fold reduction as detected by RNA-seq and a 2.3 fold decrease as revealed by qRT-PCR (Supplemental Table 3, Figure 2). SPT encodes a basic-helix-loop-helix (bHLH) protein that regulates carpel and fruit development by regulating auxin distribution in Arabidopsis, in coordination with additional bHLH members such as INDEHISCENT (IND) and CRABS CLAW (CRC)36,37,38. Similarly, the putative cucumber FRUITFULL (FUL) (Csa1G039910), a MADS-box gene regulating cell differentiation during fruit development in Arabidopsis and fruit ripening in tomato39,40, displayed 2.3 fold lower expression in the fruit of line 408 (Supplemental Table 3). Moreover, members of other families of transcription factors, such as Csa6G312040 (a zinc finger protein 6), Csa7G413890 (a myb domain protein 2) and Csa5G606310 (a NAC-like protein) showed 96.0, 25.6 and 10.5 fold down-regulation in the fruit of line 408, respectively (Supplemental Table 3), implying that they may play important roles in fruit length regulation in cucumber. 
Functional categorization by the MapMan software indicated that genes involved in transcription regulation, protein modification and protein degradation were highly enriched in the DEGs, including transcription factors with known function in fruit development (Supplementary Figure S3). For example, HANABA TARANU (HAN), which encodes a GATA-3–like transcription factor with a single zinc finger domain, regulates shoot apical meristem, flower and carpel development together with GATA-3 family genes in Arabidopsis41,42. The putative cucumber HAN gene (Csa6G502700) was up-regulated 3.7 fold as detected by RNA-seq and 2.4 fold as measured by qRT-PCR in the long-fruited line 408 (Figure 2, Supplemental Table 1). Similarly, the putative cucumber SUPERMAN (SUP) (Csa3G141870), a C2H2 type zinc finger protein that controls carpel numbers and ovule development in Arabidopsis43,44, showed 5.6 fold higher expression in line 408 than in line 409 (Supplemental Table 1). These data suggest that transcription factors may play a key role in fruit length determination in cucumber.

Figure 5

Significantly enriched Gene Ontology (GO) terms (P < 0.05) in the down-regulated genes in the fruit of line 408 vs 409.

GO terms belonging to biological processes (GOBP), molecular functions (GOMF) and cellular components (GOCC) are shown in green, blue and red, respectively. GO terms are sorted by p-value.

Figure 6

Family assignment of the 130 transcription factors that showed lower expression in the fruit of line 408.

The number of genes assigned to each family is shown after the comma.

Discussion

In this study, we provide a comprehensive analysis of genes involved in fruit length control in cucumber. A total of 3955 differentially expressed genes (DEGs) were found, of which 2368 genes were significantly up-regulated and 1587 genes were significantly down-regulated in the long-fruited line 408 (Figure 1, Supplementary Table S1). qRT-PCR confirmed that our RNA-Seq data were highly reliable (Figure 2). Functional categorization of the DEGs by GO term enrichment analysis and MapMan showed that microtubule and cell cycle related genes were strongly induced in the long fruit (Figure 4 and supplemental Figure S2). Microtubules have been shown to be essential for proper cell division and cell expansion14,15. Genes that encode both basic structural units of microtubules (alpha tubulins TUAs and beta tubulins TUBs) were found to be up-regulated in the long fruit (Table 2). Moreover, the microtubule associated motor proteins (kinesin and dynein), as well as many other microtubule associated proteins (MAPs) were activated in line 408 (Table 2). For example, the expression of the kinesin genes CsKF2 (Csa3G062600), CsKF7 (Csa1G495290) and CsKF1 (Csa7G446860) was 13.9, 5.7 and 2.6 fold higher, respectively, in the long fruit (Table 2). This is consistent with a previous report that the expression of CsKF2 positively correlated with cell division while the transcription of CsKF1 and CsKF7 positively correlated with rapid cell expansion18, suggesting that the long fruit in line 408 may result from both increased cell number and accelerated cell expansion. In support of this notion, 29 cyclin family genes were all up-regulated in line 408, of which 12 were D-type cyclins, 8 were B-type cyclins and 4 were CDKs (Table 3). 
Previous evidence showed that D-type and B-type cyclins regulate the phase transition from G1-to-S and G2-to-M respectively and CDKs regulate cell division through the G1-to-S and G2-to-M checkpoints5,6 and that the expression of cyclins and CDKs positively correlated with rapid cell division during the early fruit development in tomato and cucumber9,10,26. Moreover, 15 expansin family genes were mostly up-regulated in the line 408 with long fruit (Table 4). Previous data also showed that cytoskeleton and cell wall related genes were highly expressed during the peak exponential expansion phase (8 DAA) in cucumber26. Given that the samples we used for RNA-Seq analysis were 4 days before anthesis, a developmental stage with rapid cell division, together with the fact that fruit length but not fruit diameter increased in the line 408 as compared to in the line 409, it is plausible to infer that the long fruit in line 408 may result from both more rapid cell division and accelerated phase progression (from cell division to cell expansion). To test this hypothesis, we compared the cell size at 4 days and 0 day before anthesis between lines 408 and 409 (Figure 7). As expected, there was no noticeable cell size difference between lines 408 and 409 at 4 days before anthesis (Figure 1), suggesting that there were more rapid cell divisions and thus more cells in the line 408 with long fruit (Figure 7A–7C). However, by the day of anthesis, the cell size in line 408 was significantly larger than that in line 409 (Figure 7D–7F), suggesting that line 408 had accelerated phase transition from cell division to cell expansion. In addition, we found that 130 transcription factors were transcriptionally repressed in the fruit of line 408 (Supplemental Table 3) and the majority of them were from MYB, bHLH, NAC and ERF/AP2 gene families (Figure 6). 
Both AtMYB124 and AtMYB88 encode R2R3 MYB transcription factors, which are expressed during ovule development and have been identified to regulate female reproductive development45. SlFSM1 encodes a SANT/MYB-like domain and is an early fruit-specific gene and the SlFSM1/FSB1/MYBI complex controls cell expansion and fruit development in tomato46. Furthermore, most known regulators of fruit development such as SPT, IND, CRC, FUL, SUP and HAN were transcription factors that were differentially expressed between lines 408 and 409 and many of them can mediate cell division and expansion36,37,38,40,42. Therefore, we speculate that transcription factors may be the key players for fruit length control in cucumber, while microtubule, CDKs-cyclins and expansins that mediate cell division and cell expansion may be the downstream effects. Further functional studies using genetic transformation of the differentially expressed genes identified in this study, especially transcription factors and the 70 DEGs with SNPs between lines 408 and 409, will shed more light on the precise mechanism of fruit length regulation in cucumber. As such, our results provide a valuable resource for further functional characterization and lay a foundation for molecular breeding for desired fruit length in crops.

Figure 7

Comparison of cell morphology in lines 408 and 409.

(A)–(C) Microscopic longitudinal sections of the young fruits at 4 days before anthesis (an-4) in line 408 (A) and line 409 (B) and the corresponding quantifications of cell size (C). (D)–(F) Microscopic longitudinal sections of the young fruits at anthesis (an) in line 408 (D) and line 409 (E) and the respective quantifications of cell size (F). (D′)–(E′) are enlarged views of the boxes in D and E, respectively. The bars in C and F represent the standard deviation (n = 3). Asterisk in F indicates that the cell size in line 408 is significantly larger than that in line 409 at anthesis (unpaired t test, P < 0.05). Bar = 50 μm.

Methods

Plant materials

The cucumber hermaphrodite line 408 and its near-isogenic line 409 were grown in the Jinliuhuan Experimental Station of Beijing under standard greenhouse conditions. Pest control and water management were performed according to standard practices. Early fruits of about 2–4 cm in length were collected from 408 or 409 lines at the same time on the same day. Three-to-five fruits from different plants were pooled together as one biological sample for each cucumber line. Samples were immediately frozen in liquid nitrogen and stored at −80°C until further use.

RNA extraction and quality test

Frozen fruit samples were ground in a mortar with liquid nitrogen and total RNA was isolated using the RNA extraction kit (Huayueyang, China). RNA was checked on 1% agarose gels for possible degradation and contamination and was then examined with a NanoPhotometer spectrophotometer (IMPLEN, CA, USA) for RNA purity. The Qubit RNA Assay Kit in a Qubit 2.0 Fluorometer (Life Technologies, CA, USA) was used to measure RNA concentration and the RNA Nano 6000 Assay Kit of the Bioanalyzer 2100 system (Agilent Technologies, CA, USA) was used to evaluate RNA integrity. Only RNA samples that passed the quality tests were chosen for RNA-Seq analyses.

RNA-Seq library construction and sequencing

RNA-Seq library construction was performed using the NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, Ipswich, USA) following the manufacturer's instructions and four index codes were added to attribute sequences to different samples47. Briefly, mRNAs were enriched from 3 μg total RNA using magnetic beads with Oligo (dT) (Life Technologies, CA, USA) and then fragmented using divalent cations under elevated temperature in the NEB proprietary fragmentation buffer. Double-stranded cDNAs were synthesized using random hexamers and M-MuLV Reverse Transcriptase (RNase H-), followed by DNA Polymerase I and RNase H. After adenylation of the 3′ ends of cDNA fragments, NEBNext adapter oligonucleotides were ligated to cDNA fragments and the AMPure XP beads system (Beckman Coulter, Beverly, USA) was used to select cDNA fragments of approximately 200 bp in length. cDNA fragments with ligated adapter molecules on both ends were selectively enriched using the NEB Universal PCR Primer and Index primer in a 10-cycle PCR reaction. Products were purified with the AMPure XP beads system and quantified using the Agilent Bioanalyzer 2100 system. The clustering of the index-coded samples was performed on a cBot Cluster Generation System using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions. RNA-seq libraries were sequenced on an Illumina HiSeq 2000 platform to generate 100 bp single-end reads.

Bioinformatics analysis of RNA-Seq data

Raw reads were pre-processed to remove low-quality regions and adapter sequences. Clean reads were mapped to the cucumber genome sequence (http://cucumber.genomics.org.cn, v2i) using TopHat23,29. Read counts of each gene were summarized by HTSeq-count30. The R package edgeR was used to identify the differentially expressed genes31. The expression of each gene was normalized to reads per million (RPM) to allow comparison among samples. Lowly expressed genes were removed and only genes with an expression level of at least 1 RPM in at least two samples were kept for further analysis. Genes with at least a two-fold change in expression between lines 408 and 409 and with a False Discovery Rate (FDR) of less than 0.05 were considered to be differentially expressed.
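The normalization and low-expression filter can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the pipeline used in the study: RPM is computed against the total of counted reads per sample (the exact denominator is not specified), and the function names, sample labels and counts are hypothetical.

```python
def reads_per_million(counts_by_sample):
    """Normalize raw counts to RPM.

    counts_by_sample: dict mapping sample name -> dict of gene -> raw count.
    The per-sample total of counted reads is used as the library size.
    """
    rpm = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no counted reads")
        rpm[sample] = {gene: c * 1e6 / total for gene, c in counts.items()}
    return rpm

def expressed_genes(rpm, min_rpm=1.0, min_samples=2):
    """Keep genes with at least min_rpm in at least min_samples samples."""
    genes = set().union(*(sample.keys() for sample in rpm.values()))
    kept = set()
    for gene in genes:
        n_pass = sum(1 for s in rpm.values() if s.get(gene, 0.0) >= min_rpm)
        if n_pass >= min_samples:
            kept.add(gene)
    return kept

# Hypothetical counts; each sample totals one million reads for easy reading.
counts = {
    "408_rep1": {"geneA": 50, "geneB": 0, "geneC": 999_950},
    "408_rep2": {"geneA": 40, "geneB": 0, "geneC": 999_960},
    "409_rep1": {"geneA": 0, "geneB": 2, "geneC": 999_998},
    "409_rep2": {"geneA": 0, "geneB": 0, "geneC": 1_000_000},
}
kept = expressed_genes(reads_per_million(counts))
print(sorted(kept))
```

In this toy example geneB is dropped because it reaches 1 RPM in only one of the four samples.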

Sequencing data were deposited in the Gene Expression Omnibus (GEO) database at the National Center for Biotechnology Information (NCBI) under accession number GSE60346.

SNP calling using RNA-Seq reads

After clean reads were mapped to the cucumber genome using TopHat, we used SAMtools32 (version 0.1.18) and vcftools as described in Martin et al.33 to detect SNPs between lines 408 and 409. We used “samtools mpileup” with the parameter “-D” to record per-sample read depths and filtered VCF files using an AWK script48. SNPs with a quality value greater than 40 and a read coverage of at least 2 were considered confident SNPs. Genome-wide distributions of SNPs and DEGs were visualized using R scripts47. Functional annotations of SNP sites were performed using snpEff49.
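The quality and coverage thresholds can be illustrated with a small Python filter over VCF records. This is a stand-in for the AWK script mentioned above, whose contents are not given; it assumes read depth is available as a `DP=` entry in the INFO column and that only single-base substitutions are of interest. The function names and example records are hypothetical.

```python
def is_confident_snp(vcf_line, min_qual=40.0, min_depth=2):
    """Return True if a VCF data line passes the QUAL and depth thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # not a complete VCF record
    ref, alt, qual, info = fields[3], fields[4], fields[5], fields[7]
    # Keep single-base substitutions only (skips indels and multi-allelic sites).
    if len(ref) != 1 or len(alt) != 1 or alt == ".":
        return False
    if qual == "." or float(qual) <= min_qual:
        return False  # quality must be strictly greater than the cutoff
    depth = 0
    for entry in info.split(";"):
        if entry.startswith("DP="):
            depth = int(entry[3:])
            break
    return depth >= min_depth

def filter_vcf(lines):
    """Drop header lines and keep only confident SNP records."""
    return [ln for ln in lines if not ln.startswith("#") and is_confident_snp(ln)]

# Hypothetical records for illustration only.
records = [
    "##fileformat=VCFv4.1",
    "Chr6\t18150000\t.\tA\tG\t60.0\t.\tDP=12;MQ=50",  # passes both filters
    "Chr6\t18150100\t.\tC\tT\t20.0\t.\tDP=30",        # quality too low
    "Chr1\t1000000\t.\tG\tA\t99.0\t.\tDP=1",          # depth too low
]
confident = filter_vcf(records)
print(len(confident))
```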

Gene Ontology (GO) term enrichment analysis

Because GO terms are not well annotated for cucumber genes, we used Blast2GO50 and InterProScan51 to assign GO terms to cucumber genes. The GO term enrichment analysis was conducted separately for up-regulated and down-regulated genes in line 408 using the R package topGO52. Adrian Alexa's improved weighted scoring algorithm and Fisher's exact test were used to determine the significance of GO term enrichment.
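For readers unfamiliar with the test, the core of a GO enrichment calculation is a one-sided Fisher's exact test, which is equivalent to a hypergeometric tail probability. The Python sketch below shows only that step; it does not reproduce the weighting algorithm of topGO, which additionally accounts for the GO graph structure. The function and argument names are hypothetical.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    N: number of genes in the background
    K: background genes annotated with the GO term
    n: number of genes in the test set (e.g. up-regulated DEGs)
    k: test-set genes annotated with the GO term
    Returns P(X >= k) under random sampling without replacement.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy example: 2 of 2 selected genes carry a term held by 2 of 4 genes.
p = enrichment_p_value(k=2, n=2, K=2, N=4)
print(p)
```

The sketch relies on `math.comb`, which requires Python 3.8 or later.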

MapMan analysis of DEGs

In order to obtain detailed functional categorization of differentially expressed genes between line 408 and line 409, we used the online web tool Mercator53 to obtain the cucumber protein annotation mapping file for MapMan with default parameters and then used the Java software MapMan to assign functional categorizations to DEGs54.

Quantitative real-time RT-PCR

qRT-PCR analyses were performed with independently generated samples from lines 408 and 409 at the same developmental stage. Primers for qRT-PCR were designed using the Primer 5 software and synthesized by Sangon Biotech. cDNAs were reverse transcribed from 3 μg of total RNA using the PrimeScript RT reagent Kit (Takara, Dalian, China) and qRT-PCR analyses were performed on an ABI PRISM 7500 Real-Time PCR System (Applied Biosystems, USA). The cucumber UBI gene was used as an internal control to normalize the expression data55. Each qRT-PCR experiment was repeated three times. The relative expression of genes was calculated using the 2^−ΔΔCt method56 and standard deviation was calculated among three biological replicates. The gene-specific primers are listed in Supplemental Table S4.
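The relative expression calculation can be written out explicitly. The following Python sketch implements the standard 2^−ΔΔCt formula, with the reference gene (UBI here) used for normalization and line 409 treated as the calibrator; the choice of calibrator, the function name and the Ct values are assumptions made for illustration.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed per sample
    ddCt = dCt(test sample) - dCt(calibrator sample)
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)

# Hypothetical Ct values: target gene and UBI in line 408 (test)
# and in line 409 (calibrator).
fold = relative_expression(
    ct_target_test=22.0, ct_ref_test=18.0,
    ct_target_ctrl=24.0, ct_ref_ctrl=18.0,
)
print(fold)
```

With these example values the target gene reaches threshold two cycles earlier in line 408 after normalization, which corresponds to a 4-fold higher relative expression.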

Light microscopy of cell morphology

Young fruits at 4 days and 0 day before anthesis were collected from line 408 and line 409, respectively. Fruit samples were fixed in the FAA (50% ethanol, 5% glacial acetic acid and 3.7% formaldehyde) buffer overnight and then dehydrated through an ethanol series and embedded in paraplast as described41.