Distinct transcriptome profiles reveal gene expression patterns during fruit development and maturation in five main cultivated species of pear (Pyrus L.)

The transcriptomes of five pear cultivars, ‘Hosui’ (P. pyrifolia), ‘Yali’ (P. bretschneideri), ‘Kuerlexiangli’ (P. sinkiangensis), ‘Nanguoli’ (P. ussuriensis), and ‘Starkrimson’ (P. communis) were sequenced at seven key fruit developmental stages, from fruit setting to maturation and fruit senescence after harvesting. In total, 33,136 genes that could be mapped by reads, were analyzed. Most gene expression cluster models showed a steadily decreasing trend. Gene expression patterns had obvious differences according to maturity type, that is, post-ripening cultivars were still vigorous at maturity, and showed a higher proportion of up-regulated genes; non post-ripening cultivars had a gradually decreasing tendency during fruit maturation. Meanwhile, differentially expressed genes related to fruit quality and development, such as stone cells, sugar, acid and hormones, were identified. Co-expression analysis revealed that several ethylene synthesis genes and polyphenoloxidase-related genes interacted with each other directly, and an indirect relationship was reflected between ethylene synthesis genes and ethylene response genes. In addition, the highly diverse SNPs represented the great differences between oriental and occidental pears. Understanding how RNA-seq based gene-expression patterns and differential gene expression contribute to fruit quality allows us to build models for gene-expression for fruit development of Pyrus species.

Scientific RepoRts | 6:28130 | DOI: 10.1038/srep28130 volatile compounds 6 and concentrated fruit flavor. On the contrary, the other oriental pears have crisp flesh, low aroma and flavor, and some of them have high sugar and low acid content 5 .
Until now, some physiological and molecular mechanism studies have been carried out to investigate different fruit development characteristics and fruit quality. For the fruit maturation process, fruit texture is affected by the ACO (1-aminocyclopropane-1-carboxylate oxidase) gene, and then XTH (Xyloglucan endotransglucosylase/ hydrolase)-related genes can lead to cell wall disassembly and loosening 7 . 'Bartlett' occidental pears are prevented from ripening on the tree, but ethylene was able to stimulate fruit softening on the tree 8 and endo-PG genes have an impact on many different maturation characteristics 8,9 . Microarrays were developed for pear to illustrate the absence of cupin family protein-related genes and two un-annotated genes in Japanese pear (Pyrus pyrifolia), which may induce post-ripening in pears 10 . Sugar is one of the most important factors for fruit quality, and provides the main energy for fruit metabolism. In Pyrus species, sucrose and fructose are the major soluble sugars, and overall sugar content varies greatly between different cultivars 5 . It was found that sucrose synthase and sucrose-phosphate synthase were highly correlated with sucrose levels in Japanese pear 11 . Soluble acid invertase accumulates hexoses and regulates sucrose-to-hexose ratio, and two invertase genes, PsS-AIV1 and PsS-AIV2, have been cloned in pears, but are expressed differently during fruit development 12 . Lignified stone cells are a special characteristic of pears, which severely affect fruit quality. The stone cells have a mosaic pattern in the flesh of pear fruits, with a higher concentration of larger stone cells near the core, and smaller ones around the pericarp 13 . The ratio of G-type and S-type lignin and polymerization degree have been evaluated among various pear species 14 . Highly expressed gene families of HCT, C3′ H, and CCOMT contribute to high accumulation of both G-lignin and S-lignin in 'Dansuansuli' fruit. However, there are few reports on all five cultivated pear species toward understanding the different molecular regulation mechanisms of fruit development and quality.
Next-generation RNA sequencing can sensitively and rapidly reflect gene expression with an enormous amount of data, and is an efficient method to explore fruit development characteristics. Besides pear, many Rosaceae family fruits are popular, such as apple, peach, berry, strawberry, plum, loquat, and apricot. The whole genomes of apple 15 , strawberry 16 , plum 17 , pear 18 and peach 19 have been sequenced, making possible differentially expressed gene analysis based on the reference genome to reveal different genotypes. In a previous study, transcriptome sequencing of apple helped to find the Gypsy-like retrotransposon in the bud mutant, affecting internode length 20 . Peach 21 and berry 22 have also been analyzed by high-throughout sequencing technology to reveal internal mechanisms of fruit development and quality. The whole genome sequence of pear has also been published with high quality assembly. Therefore, a good platform is available to study the genetic characteristics of fruit development and quality conformation in pear. The transcriptome of Japanese pears has been analyzed to reveal the flower bud transitioning through endodormancy 23 , and occidental pear 24 was analyzed to study fruit size. However, many fruit developmental characters and transcriptome-based differences in all five pear species are still unclear. RNA-seq technology can be an effective method to reveal gene expression patterns and difference during fruit development and maturation for distinguished cultivated species of pear.
In this research, we performed large-scale RNA sequencing from seven fruit developmental stages in the five cultivated species of pear to effectively reveal the main differences between fruit-related metabolism, especially fruit post-or non post-ripening. We used the 'Dangshansuli' (Pyrus bretschneideri Rehd.) genome as a reference to analyze various fruit developmental patterns and differentially expressed genes, which mainly regulate the pear growth and fruit quality. Results contribute to a better understanding of the dynamic metabolism and molecular mechanism of different pear species, which provides a significant basis to improve fruit quality of pear, and even other species in Rosaceae.

Results and Discussion
RNA sequencing of different developmental stages of pear fruit in five cultivars. In this study, the transcriptomes of five representative cultivars: 'Hosui' (P. pyrifolia), 'Yali' (P. bretschneideri), 'Kuerlexiangli' (P. sinkiangensis), 'Nanguoli' (P. ussuriensis) and 'Starkrimson' (P. communis), which are the main cultivated species worldwide, were selected to reveal gene expression patterns and explore relationships between different fruit traits. A total of seven periods (date details in materials), including six key fruit developmental stages (periods 1-6) and one fruit senescence stage after harvesting (period 7) were analyzed. Fruit developmental stages were: fruit setting (period 1), physiological fruit dropping (period 2), fruit rapid enlargement (period 3), a month after fruit enlargement (period 4), pre-mature (period 5) and mature (period 6). Each was sequenced by the Illumina2000 platform. There were more than 10 M raw reads for each sample analysis ( Table 1). The original data are available on NCBI: project number PRJNA309745, SRA trace number SRP070620. The mean quality score was above 30, and after adapter removing and ribosomal RNA filtering, a total of 33,136 mapping genes were aligned to the reference pear (Pyrus bretschneideri Rehd.) genome 18 , and analyzed based on the 49 bp RNA-seq single end data. The mapping gene numbers were 28,621 (86.4%,'Hosui'), 29,041 (87.6%, 'Kuerlexiangli'), 28,925 (87.3%, 'Nanguoli'), 29,294 (88.4%, 'Starkrimson') and 28,905 (87.2%, 'Yali'). Transcript sets of five cultivars had no bias in mapping gene number, which reflected good sequencing distribution. Little difference was found in the percentage of aligned gene numbers accounting for clean reads during fruit developmental stages, however, it was relatively low for all cultivars at maturation or senescence (Table 1). One major reason might be that many genes have no expression or greatly decreased expression during fruit maturation. Another natural reason could be low sequencing depth, and the limited sequencing data, 10 M reads with 49 bp length for each sample. In tomato, as the cell wall decomposed during maturation, the expression of many genes involved in cell wall carbohydrates also decreased 25 . The same phenomenon was also present in watermelon, with CmXTH expression down during mature-fruit abscission 26  2.0 27 used clustering correlation matrixes to generate a tree. As Fig. 1 shows, there were two main groups in the tree, group I and group II. Most fruit setting stages (period 1) and fruit physiological dropping stages (period 2) were gathered together in group I by the Spearman correlation method. The occidental pear 'Starkrimson' made up a separate subgroup B, and other oriental pears made subgroup A. Group II also divided into two subgroups, C and D, with subgroup C clustering the stages of full development to mature stage of all oriental cultivars, while subgroup D included post-harvest stages of all five cultivars and three developmental stages of occidental pear 'Starkrimson' . It was interesting to find that the most closely related stages of the same cultivar usually first clustered together, and then clustered with stages of other cultivars, indicating different gene expression patterns between the five cultivars, which represent the five cultivated species to some extent. In addition, occidental pear 'Starkrimson' showed a relatively distant relationship with other oriental cultivars, in both groups I and II.
Models for genes with the same expression pattern were defined by STEM analysis 28 , and a total of 20 gene cluster modules were generated for each sample. We selected the top eleven with the same expression model, shown in Fig. 2a. In general, the gene expression models cluster 3-steady decrease-and cluster 16-steady increase-represented the most common gene expression models across all seven stages and different cultivars. Cluster 3 included the largest number of genes (more than 3,000 genes for each cultivar), indicating that all five cultivars had most genes expressed with the model of steady decrease. A total of 270 genes were identified in all five cultivars in cluster 3, demonstrated in the Venn diagram (Fig. 2c). Gene expression modules of five pear cultivars were significantly enriched (P-value < 1e-5, except for 'Nanguo' in cluster 16) in two main clusters, and presented different genetic functions by Gene Ontology (GO) (Fig. 3). Continuous increase cluster 16 had a relatively higher percentage of genes responsible for cellular component organization or biogenesis, single-organism process, cell part and transporter activity, compared with the steady decrease cluster 3 (Fig. 3). We used the 270 genes in cluster 3 for KEGG pathway enrichment, and the genes were most enriched in the photosynthesis   pathway (map00195 with p-value of 4.15e-12 and map00196 with p-value of 5.68e-6). In previous gene expression pattern research, sweet orange showed the largest cluster (3,075 genes, accounting for 16.3%) expressed as continuous increase during four fruit development and ripening stages, and the genes in that cluster had functions such as cysteine proteinase sucrose phosphate synthase, and a sucrose transporter 29 . However, the steadily changing gene-expression patterns were not the main models 22 in studies of the berry transcriptome, based on analysis of three stages of fruit development. Specific gene expression patterns showed that occidental pears had a gradually decreasing tendency of gene expression in stage 6 of 'Starkrimson' -a typically European pear species, expressed genes mostly grouped in cluster 3, cluster 16, cluster 0, cluster 7, and cluster 5 (Fig. 2b). All the gene expression tendencies in these clusters were down-regulated at the mature stage, except for the common model of cluster 16. Cluster 2 represented a specific gene expression pattern of 'Kuerlexiangli' , because the gene numbers in that pattern were significantly high (under the significance level of 0.05, Fig. 2b). In addition, 'Nanguoli' showed its own special fruit developmental model with cluster 17, and 'Hosui' had a special gene expression cluster 13. How these specific gene expression patterns regulate the particular fruit development of each cultivar still needs further exploration and study.

Specific gene expression characteristics for post-ripening and non post-ripening cultivars in pear.
We sequenced samples from the stage of fruit maturity to senescence to detect maturation characteristics of pear. Of the five cultivated species, 'Starkrimson' and 'Nanguoli' are post-ripening pears, while other three cultivars, 'Hosui' , 'Yali' and 'Kuerlexiangli' are eatable directly after harvest. Analysis revealed that post-ripening cultivars were still vigorous after harvest, furthermore we found that they had a higher proportion of up-regulated genes. 'Nanguoli' had 43.95% (Fig. 4) up-regulated genes from maturity to fruit senescence. The same tendency was followed by another post-ripening occidental pear, 'Starkrimson' , with 44.63% (Fig. 4). However, up-expressed genes in 'Hosui' , 'Yali' , and 'Kuerlexiangli' only accounted for 21.39%, 24.33%, and 27.04% (Fig. 4), respectively. The ratio  of up-expressed genes between post-ripening and non post-ripening pear cultivars had significant differences by t-test (p-value = 0.005032).
In order to distinguish the two extreme gene expression patterns in post-ripening and non post-ripening pears, we explored gene functions within the two patterns. The number of genes only up-expressed in post-ripening cultivars was 133, while the gene number of only down-expressed in non post-ripening cultivars was 47, as Venn diagrams in Fig. 5 show for the up-regulated and down-regulated genes. Previous studies proved that ethylene plays an important role in catalyzing ripening in European pears 30 . Ethylene catalyzed the maturity process through different methods such as with the involvment of one cupin family protein gene and two unannotated genes in occidental pears 10 . In this study, functional analysis showed that the only up-regulated genes in post-ripening cultivars were responsible for nucleosome assembly and protein-DNA complex assembly based on the gene ontology (GO) analysis. However, the only down-expressed genes in non post-ripening cultivars were in the categories metabolism and transportation.

Gene expression differences reveal zeatin and ethylene regulating development and maturation of pear fruit.
Zeatin is a plant hormone, first isolated from maize. It controls many aspects of plant development, such as cell division 31 and regulation of environmental stress responses 32 . Based on the heat map of gene expression at later stages of fruit development, we found cytokinin oxidize (CKX) to be differentially expressed (with log2 fold change from 4 to 8) in all five species (Fig. S1). CKX is the key gene for zeatin synthesis, and is highly conserved in function and structure. Not only did the gene expression pattern of CKX seem to be the same among different pear cultivars (Fig. S1), but also the structure of CKX was conserved between families 33 .
Ethylene regulates fruit maturity and also responses to stress in plants 34 . We selected rate-limiting genes in ethylene biosynthesis and key genes for ethylene receptors, though maturity characteristics regulated by ethylene varied between different pear cultivars. Genes corresponding to ethylene biosynthesis had two expressed forms, one pattern was steady increase, and the other pattern steady decreased during fruit development. S-adomet synthetase (SAMS), 1-aminocyclo-propane-1-carboxylic acid synthase (ACS) and 1-aminocyclo-propane -1-carboxylic acid oxidase (ACO) were rate-limiting for ethylene synthesis. Expression of Pbr018549.1, Pbr037756.1, Pbr0001686.1, and Pbr008602.1 (Fig. S2) was continually high for SAMS, but low for Pbr020754.1, Pbr026061.1, Pbr006707.1 and Pbr018549.1 in five cultivars. The expression of two ACS related genes (Pbr030234.1 and Pbr029891.1) was especially high at the stage of fruit maturation. The oriental post-ripening cultivar 'Nanguoli' also showed a high expression of ACS at the mature stage (period 6) and fruit senescence stage (period 7), and the same result was verified in previous studies 35 . In Pyrus pyrifolia, ethylene receptors and ACS have also been identified as regulating pear maturation in the late developmental stages 36 . We found that most ethylene receptors had steady expression. Ethylene response (ETR), constitutive triple response (CTR) and ethylene insensitive (EIL) are the main ethylene receptors; a heat map of these gene expression was shown in Fig. S2. In addition, RAN (response to antagonist) is a copper transporter necessary for ethylene signaling 37 . In our study, RAN (Pbr002476.1) was shown to express especially highly at the early stage of all five pear cultivars (Fig. S2).
Gene expression reveals genetic differences related to fruit quality in pear. Stone cell. The stone cell is a special characteristic of pear fruit that can considerably influence fruit quality. Lignin has been found to be the main component of stone cells 38 . So, in our study, lignin synthesis-related genes were selected for gene expression analysis (Fig. S3). The results showed that several genes such as 4CL (4-coumarate CoA ligase), C3H (p-coumarate 3-hydroxylase), CA5H (coniferylaldehyde 5-hydroxylas), and CAD (cinnamyl alcohol dehydrogenase) had relatively high expression at the early fruit development for all five cultivars (Fig. S3), which was consistent with previous studies showing that stone cells could be detected from sixty days after full blooming 39 . Genes regulating hydroxycinnamoyl transferases (HCT), which reduce the H-lignin content 40 , were identified as expressed at the early stage of fruit development (Fig. S3). Evaluation of stone cell content in 304 pear cultivars from five species revealed that the mean stone cell content in P. ussuriensis was the highest 4 . In this study, it was found that Caffeoyl-CoA o-methyltransferase (CCOMT) related gene (Pbr034039.1) expressed at an especially low level during 'Nanguoli' development, with higher expression in the other four cultivars. Based on the lignin pathway analysis from previous study, CCOMT uses the lignin precursor to synthesize G-lignin and S-lignin 41 . Therefore, we speculated that 'Nanguoli' might have relatively high H-type lignin.
Sugar and acid biosynthesis. Sugar and acid are among the most important fruit quality indicators, and the ratio of sugar/acid is a large component of fruit flavor. Sugars are not only energy for metabolism, but also signaling materials for plants. Fructose accounts for the greatest sugar content at maturity for the five pear cultivars, from components of sucrose, glucose, fructose, and sorbitol (Fig. S4). Among these cultivars, 'Starkrimson' had high fructose content, exceeding others at the mature stage (Fig. S4). Meanwhile, we found that expression of one hexokinase (HK) related gene in 'Starkrimson' was not as high as in the other four species (Fig. S5, Pbr005841.1). Since hexokinase can transfer fructose into fructose-6-phosphate, we speculated that low HK expression in 'Starkrimson' induced higher fructose content. Similarly, HK was also able to induce fructose repression in yeast 42 .
Organic acid is an important factor that can affect fruit flavor. Malate and citrate are the main acids in pear fruit. From malate, citrate, quinate and shikimic acid content at the mature stage, citrate in 'Starkrimson' was the highest among all the acids of five species 43 . It was found that citrate synthase related genes (CS, Pbr004985.1) always had superior expression in 'Starkrimson' during fruit development, and could be an important candidate gene for controlling acid content. In a previous report, two Chinese pear accessions were analyzed to find that CS also responds for citrate accumulation 44 .

Gene networks reflect ethylene synthesis and ethylene receptor interaction.
In order to understand which gene networks are active in pear fruit development, we explored the ethylene synthesis and ethylene receptor interaction with the R package. Based on the results presented in Fig. 6, we found that several ethylene synthesis genes and polyphenoloxidase-related genes interacted with each other directly, and indirect relation was reflected between some ethylene synthesis genes and ethylene response genes. Gene interaction analysis illustrated that SAMS (S-adomet synthetase) regulated ethylene synthesis together with LAC (laccase), which had a tight relationship during pear fruit development. In addition, genes controlling cell division, cellular components, gene silencing, and regulation functioned in a network with indirect complex interaction (Fig. 6). It was reported that S-adomet synthetase (SAMS) can synthesize S-adenosyl-methionine (S-AdoMet) from methionine with ATP 45 . The laccase-related gene, as an ethylene receptor, regulates the downstream ethylene signal transduction pathway 46 . Researchers found that a cDNA induced by ethylene was homologous to a LAC gene in Rosaceae species of miniature roses 47 , which was also the verification of the close connection of ethylene and LAC.  18 . SNP calling based on the transcriptome is meaningful, because some SNPs can affect protein function. Oriental pears 'Hosui' , 'Yali' , 'Kuerlexiangli' , and 'Nanguoli' had 83543, 62522, 74814 and 75463 filtered SNPs, respectively. We found the most polymorphic cultivar was 'Starkrimson' , with 154,404 SNPs after filtering (Fig. 7). The nearly doubly high diversity rate represented the many differences between oriental pear 'Dangshansuli' and occidental pear of 'Starkrimson' on the transcriptome level.
Effectiveness of non-synonymous SNPs. In previous reports, SNPs induced mutants for gene function, for example, peach genotype of flesh melting was affected by SNPs at the melting flesh (M) locus 48 , and SNP241 in watermelon resulted in a resistant gene mutation on exon 49 . Arrays of apple 50 and peach 51 have also been developed based on SNPs, and could be used for genetic mapping, genetic relatedness and marker assisted selection in breeding programs 52 . In our research, SNPs were classified by the location and effectiveness in genes. Effectiveness included synonymous mutations, non-synonymous mutations, start codon mutations and stop codon mutations, among others, the proportion varied in species (in Fig. 7). Synonymous mutations were overwhelmingly prevalent over other types in all five cultivars. Non-synonymous SNPs in the initiation codon (type a, eg: HS accounted for 0.015% in 79,724 SNPs), SNPs in the termination codon (type b, eg: HS: 0.039%), and SNPs changing protein function by early termination (type d, eg: HS: 0.05%) accounted for only a small portion in the whole SNPs (in Table 2 and Fig. 7), however, these SNPs sometimes can completely change gene function and effect the biology trait significantly.
Based on the gene ontology of the predicted SNP-induced varied genes in Table 2, we found that genes responsible for transport in 'Nanguo' had the largest number of mutations in the start codon regions, and GO slim term of response to stress was prone to mutate in the stop codons comparing with other GO terms. We demonstrated all the SNPs, which induced mutations in start codon, stop codon and premature stop codon in Table S1. Conserved function domains (https://www.ncbi.nlm.nih.gov/cdd) were analyzed based on these genes (Table S2). Many SNPs are related to fruit development. We found a SNP mutation in the start codon of Pbr038562.1, which had the functional domain of "ethylene responsive element binding protein (EREBP)" in cultivar of 'Starkrimson' . In Arabidopsis, EREBPs can interact with the 11 bp GCC box in the promoter of target genes and activate downstream ethylene responses 53 . Furthermore, in the stop codon region, the SNP from 'Kuerlexiangli' , 'Nanguo' and 'Starkrimson' might influence Pbr026622.1 with domains from the DUF647 superfamily that has functions in auxin transport. A study found that a member of the conserved DUF647 protein family plays an essential role in the regulation of polar auxin transport by maintaining the proper level of auxin transporters on the plasma membrane 54 . The predicted non-synonymous SNPs provided valuable information for candidate genes associated with fruit development, even though more gene function experiment should be performed to verify the SNP effectiveness derived gene mutations.  Fig. 8. Correlations of the RPKM of four genes and the relative qPCR expression were 0.983 (Pbr025576.1), 0.780 (Pbr002476.1), 0.773 (Pbr009725.1), and 0.710 (Pbr002025.1), throughout fruit development, respectively. The ethylene receptor showed fluctuating expression. We found that the ABA synthesis related-gene had high expression at the early stage of fruit development, when xanthoxin is being converted to ABA 55 . The zeatin-related gene was found to have a peak in the middle stage of fruit metabolism in 'Starkrimson' , but was highly expressed in the early stage of fruit development in blueberry 56 .
After files were aligned for all 35 samples, bam format files were merged together from seven stages of the same cultivar by samtools 61 . SNP calling was performed with the module 'mpileup' in samtools. Effectiveness of the SNPs was analyzed with an in-house perl script (https://github.com/Sunhh/NGS_data_processing).

Sugar content measurements.
Pear fruit sugars including sucrose, glucose, fructose and sorbitol were measured using High Performance Liquid Chromatography (HPLC). Sugars were extracted from pear flesh by grounding, and then were dissolved and filtered through a SEP-C18 cartridge (Waters, WAT021515) and Sep-Pak filter. Waters 1525 system (Waters, Shanghai, China) was used to measure sugars. The column was 6.5 mm × 300 mm, Inner Diameter, 10 um particle size (Waters), with a guard column cartridge of Sugar-pak 1 Guard-Pak Holder and Insert (Waters). Column temperature was 85 °C, and the reference cell temperature was 35 °C.
RT-PCR validation. Several expressed genes were selected for verification by quantitative RT-PCR (qRT-PCR) on 'Starkrimson' . Coding sequences of the genes were acquired from the pear genome project (http:// peargenome.njau.edu.cn). Primers for the candidate genes were designed (Table S1) with the online software Primer 3 (http://simgene.com/Primer3). The LightCycler 480 SYBR GREEN Master (Roche, USA) system was used, with GAPDH as the house-keeping gene. Reactions of target genes and reference genes were performed in triplicate in a total volume of 20 μ l. PCR reactions were carried out via an initial incubation at 95 °C for 10 min and at 95 °C for 15 s, and then cycled at 60 °C for 15 s, and 72 °C for 20 s for 40 cycles, and final extension at 72 °C for 30 min.