Determination of reliable reference genes for gene expression studies in Chinese chive (Allium tuberosum) based on the transcriptome profiling

Chinese chive (Allium tuberosum) is widely cultivated around the world for its unique flavor, nutrient, and medicinal values, yet its molecular mechanism on flavor formation and other metabolic pathways remains intangible. The elucidation of these complex processes begins with investigating the expression of the genes of interest, however the appropriate reference genes (RGs) for normalizing the gene expression are still unavailable in A. tuberosum. To fill this lacuna, transcriptome-wide screening was undertaken to identify the most stable genes according to the analysis of their FPKM values. The expression stability of the RGs was further evaluated using geNorm, NormFinder, BestKeeper, and RefFinder algorithms. The comprehensive analysis showed that GLY1 and SKP1, instead of two traditionally used RGs (eIF1α and ACT2), were the most stable genes across diverse A. tuberosum tissues, indicating the necessity to carefully validate the stability of RGs prior to their use for normalizations. As indicated by geNorm, the normalizations with at least two RGs could give more accurate results. qRT-PCR experiments were conducted with randomly selected genes, demonstrating that normalization with a combination of GLY1 and SKP1 resulted in reliable normalization results. Our finding represents the first attempt toward establishing a standardized qRT-PCR analysis in this economically important vegetable.


Results
In silico screening of candidate reference genes from transcriptome data. In a previous study, 18 RNA-Seq libraries representing leaf, root, rhizome, mature flower, inflorescence stalk with its associated flower buds, and seed tissues of A. tuberosum were sequenced with Illumina Hiseq4000 platform. The raw data of all RNA-Seq samples obtained in this study were deposited in the NCBI Sequence Read Archive under the project with identification number PRJNA67394520 7 . After removal of low-quality reads, ambiguous reads, and adaptor sequences, de novo assembled transcriptome was annotated using Nr (NCBI non-redundant protein) database, which enables the identification of 223,529 tentative transcripts in A. tuberosum 7 .
The entire workflow of the screening process is illustrated in Fig. 1. As the housekeeping genes, commonly used as RGs, are expressed at relatively high levels, the first step was attempted to identify the genes with high expressions in all A. tuberosum tissues. In the annotated dataset, we noted that the FPKM values of housekeeping genes such as ACTIN, eIFs, and UBQ, were high-expressing genes (average FPKM ≥ 100) in tissue samples (Supplementary Table 1). Thus, the cut-off value of median FPKM was set to 100. Accordingly, low-expressing tentative transcripts (median FPKM < 100) were eliminated, and the remaining 197 genes showed comparable expression levels as housekeeping gens (Supplementary Table 2).
The coefficient of variations (CV), the normalized median absolute deviations (NMADs), and the p-values measured by the Shapiro-Wilks were determined on the basis of the FPKM values of 197 genes. Then these genes were further ranked according to the Euclidean distance (d) calculated with the three parameters aforementioned (Supplementary Table 2), and primers were designed based on the sequences of the top 12 genes with the least values of Euclidean distance. DN47561_c0_g1 and DN5072_c0_g1 could not be amplified by PCR, while others were confirmed by the sequencing of their PCR products. Finally, the 10 stable genes with abundant expressions were further evaluated by using comprehensive bioinformatics analyses (Fig. 2). Additionally, based on the homologous blast search against A. tuberosum transcriptome, two traditional RGs (ACT2 and eIF1α) which were frequently used as internal controls in Arabidopsis 21,22 , were also included in the evaluation analysis.
Evaluation of primers' specificity and amplification efficiency. Before experiments, gene-specific primers designed based on the assembled transcriptomes were firstly evaluated using the regular PCR method, and the resulting amplicons which contained unique bands with predicted sizes were confirmed by sequencing ( Supplementary Fig. 1). Standard curves of ten candidate RGs and two controls were generated from a tenfold series dilution of at least five cDNA template concentrations and non-template control. As the R 2 value of the standard curve represents how well the experimental data fit the regression line, the R 2 value > 0.980 is recom-  Supplementary Fig. 2). Ideally, if the amount of PCR product perfectly doubled during each cycle, the theoretical amplification efficiency (E) should equal 95.4%, which indicates all templates were amplified during PCR reaction. However, in practice, an amplification efficiency of 90-105% is preferred, because qPCR was prone to be affected by suboptimal reaction conditions. As shown in Table 1 Table 2). Despite the algorithm difference, the two aforementioned programs predicted that GLY1 and SKP1 were the most stable genes while DN2022_c2_g1 was the least stable gene.

BestKeeper analysis.
BestKeeper assesses the stability of gene expression by comparing the standard deviation (SD) and the coefficient of variation (CV) of Ct values. Thereby, the most stable RG should have the lowest SD and CV according to the BestKeeper algorithm. The ranking result suggested that the top three stable genes were ACT2, GLY1, and SKP1 ( Table 2), indicating that they were the best RGs for normalization.

RefFinder analysis. Depending on the input of raw Ct values, RefFinder is a web-based computational pro-
gram that integrates geNorm, Normfinder, BestKeeper as well as the comparative Delta-Ct method to evaluate the RGs. Based on the rankings from four programs, it generates the geomean of the ranking values of the tested RGs. The rankings by RefFinder suggested that GLY1, SKP1, and SERBP1 were the most stably expressed genes (Table 2), which was largely in agreement with the other three algorithms used.
Analysis of the optimal number of reference genes for normalization. In addition to the calculation of gene stability, geNorm also calculates the pairwise variation between two sequential RGs used for normalization via geNorm V analysis. As previously proposed, the cut-off value of V n/(n+1) is above 0.15, the inclusion of an additional RG is required. As illustrated in Fig. 4b, the value of V 2/3 was 0.132, indicating that two references were sufficient for accurate qRT-PCR normalization. Considering the expression stabilities of candidate RGs, we thus concluded that the most stable combination was GLY1 and SKP1, which could be used as reliable normalization factors for investigating the gene expression across different tissues in A. tuberosum.

Validation of the reference genes.
To validate the applicability of the selected RGs, we conducted further qRT-PCR to analyze the expression profiles of several randomly selected genes from the tissue-specific transcriptome in A. tuberosum. Their expressions of these genes were normalized against either the most stable or unstable RGs. In general, qRT-PCR results are largely consistent with the expression patterns revealed by transcriptome analysis, when GLY1 and SKP1, alone or in combination, were used as internal control genes ( Fig. 5ad). For instance, DN4756_c1_g1 shared 94% sequence similarity with the Arabidopsis LHCB1 (Light-harvesting Chlorophyll A/B-protein 1) protein. As revealed by transcriptome data, in A. tuberosum, DN4756_c1_g1 was expressed in photosynthetic tissues such as the leaf, flower (sepal included), and inflorescence (with stalk). In the qRT-PCR experiments normalized against the stable RGs, we confirmed that DN4756_c1_g1 exhibited the highest expression in leaf tissues, with weak expression in flower, inflorescence, and seed tissues. Likewise, seedabundant gene DN7391_c1_g1 encoded a putative vicilin-like seed storage protein homologous to At3g22640. Accordingly, we observed that DN7391_c1_g1 was highly expressed in seeds when using GLY1 and SKP1 as internal controls. Conversely, when the least stable gene, Cyclase_X1, was used as an internal control, DN4756_ c1_g1 and DN7391_c1_g1 displayed expression patterns that differ from that of transcriptome analysis (Fig. 5e). A similar phenomenon was observed for the other three tested genes in A. tuberosum. Hence, the results demonstrated that, to a large extent, the RG determined the accuracy of qRT-PCR normalization. Furthermore, the results also confirmed that GLY1 and SKP1 could be used as appropriate RGs in analyzing gene expression across various A. tuberosum tissues.

Discussion
As one of Allium crops with great economic importance, Chinese chive is well-known for its unique flavor which combines garlicky and sweet odors, and for the ability to enhance the flavor of other food 2,27,28 . Several genes encoding the key enzymes in flavor production pathways have been isolated in A. tuberosum, but the molecular  29 . Expression profiles of several genes encoding alliinase, cysteine synthase, and serine acetyltransferase were characterized by using the Northern blot method whereby ribosomal RNA was used as loading controls 8,9 . However, studies qRT-PCR to explore expression patterns of genes in Expression level of selected genes in different tissues was presented according to the A. tuberosum transcriptome data (d). All these bars was represented by mean ± SD from three biological replicates. www.nature.com/scientificreports/ Chinese chive are still rare, and to date, no prior report on the reliable RGs has been published in A. tuberosum yet.
Since the relative quantification method has been widely used in the qRT-PCR analysis, reliable results are largely dependent on the selection of the appropriate RGs, expression of which is supposed to be stable in all samples, irrespective of all experimental conditions. Therefore, it is imperative to identify the stable genes that can be used as internal controls for qRT-PCR experiments. In earlier studies, the frequently used RGs in Alliums were usually derived from the identification of housekeeping genes that were associated with the basic cellular metabolism. For instance, ACTIN was used as RGs for normalizing the expression of waxy cuticle-related genes in welsh onion 30 . Similarly, an actin gene, AsACT , was served as internal control genes in the qRT-PCR analysis of GA-treated garlic plants 31 . However, no evidence about their constitutive expression patterns under the experimental conditions was available. In fact, the expression of the ACT gene could be influenced by pathogen attacks in Arabidopsis 32 . As reliable quantification of gene expression mainly depends on an accurate normalization, the selection of RGs is one of the crucial factors in investigating gene expressions under given experimental conditions.
High-throughput RNA-Seq technologies facilitate gene expression analysis especially for those species whose genomes are still unavailable. Apart from the sequence information of all the transcribed genes, the assembled transcriptome provides the landscape of gene expressions at the genome-scale, which permits the identification of the genes with stable expressions across different experimental conditions or samples. Depending on the calculation methods employed in the transcriptome analysis, gene expressions can be presented as FPKM (Fragments per kilobase per million mapped reads) value or others. Therefore, according to the FPKM values, researchers can examine the expression stability using the mean expression value, standard deviation, coefficient of variation (CV), and normalized median absolute deviation (NMAD). The CV reflects the extent of variability in relation to the mean of the expression levels, and it was successfully used to identify the best RGs from the transcriptome dataset [33][34][35] . However, the CV is simply calculated based on the means, which is less susceptible to the deviations caused by the outliers. Thus, the NMAD can be included to evaluate the spread of the distribution according to the medians of the expression levels, which provides a robust alternative criterion.
It was not surprising that different screening criteria might lead to contrasting results. As observed in our analysis, the rankings according to the CV values suggested that DN47561_c0_g1, DN12657_c0_g1, DN6928_c0_ g1, DN253_c0_g1/SKP1, and DN18197_c0_g1 were the top 5 stable genes; however, DN5298_c0_g2, DN47561_ c0_g1, DN6928_c0_g1, DN253_c0_g1/SKP1, and DN2292_c0_g2 were ranked as the top 5 stable genes when these RG candidates were scored by NMAD values. Thus, DN47561_c0_g1, DN6928_c0_g1, DN253_c0_g1/ SKP1 seemed to be the promising RG candidates because they could meet the two screening criteria. However, other candidates only met one of the criteria mentioned above, which made it difficult to pick up the ideal RG for further analysis. To solve the problem, CV and NMAD, together with the normality test of gene expressions (p), should be taken into account. Assuming that CV, NMAD, and p had equal contributions to the evaluation of RG's expression stability, the Euclidean distance (d) between the three parameters could be used as an effective measure in the RG selections from the huge RNA-Seq dataset.
A Venn diagram was generated with the top 20 stable RG candidates ranked by their CV, NMAD, or d values (Supplementary Fig. 3). As revealed by the Venn analysis, 12 RG candidates were shared by three rankings lists, suggesting that they might be the ideal RG candidates which met all three criteria. 19 genes were shared between d and CV ranking lists, while only one gene was unique for the CV ranking list. Likewise, 13 genes appeared on both NMAD and d ranking lists, whereas 7 genes formed a unique group that was specific to the NMAD ranking list. In our analysis with top 20 RG candidates, the outcomes with the CV and d criteria were similar to a large extent, though the ranking orders of some RGs were different; on the contrary, the NMAD might serve as a more restrictive criterion, and nearly half of genes appeared in the overlapping between both d and NMAD ranking list, implying that NMAD criterion alone is not sufficient to evaluate the stability of RG expressions. As the Euclidean distance can balance the effects from three parameters, we preferred to use d values instead of CV or NMAD to evaluate the first-round screenings, and in the subsequent bioinformatics analyses, the selected 10 RG candidates, which were further confirmed by PCR experiments, exhibited better expression stability compared to that of two traditional RGs.
By combining some bioinformatics tools, NormFinder, geNorm, BestKeeper, and RefFinder, a series of RG candidates were identified in non-model species, such as cotton, onion, soybean, and garlic [36][37][38][39] . In addition to traditionally used RGs-eIF1α and ACT2, the top 10 potential RGs identified from transcriptome analysis were quantitatively evaluated with the programs aforementioned. Surprisingly, the majority of RG candidates were associated with either ubiquitin or ribosomal pathways except that SEM1 might encode one of the components in the 26s proteasome 40 . Ubiquitin genes were often used to normalize the quantification of gene expression due to their abundance and universal expressions in diverse samples. Among the top 200 RGs, we found at least 13 genes that were associated with ubiquitin pathways, but the ranking list suggested that they did not appear to be the best RGs as some expression variations of these genes were detected (Supplementary Table 1). It has been generally accepted that, despite housekeeping roles in the intracellular protein degradation, the ubiquitin-proteasome system was involved in selective proteolysis, and therefore the process was tightly controlled and dynamically regulated 41 . Similarly, it becomes controversial whether the 18s or 28s ribosomal genes are perfect RGs. On one hand, like ubiquitin genes, transcription of some ribosomal genes was also affected by biological factors 26 ; On the other hand, the ribosomal RNAs are not polyadenylated, which makes it difficult to investigate their expression when dealing with cDNA from total RNA with oligo-dT primers. It was reported that the use of ribosomal genes in expression studies could give erroneous normalizations compared with the use of other RGs 42,43 . Consistent with previous findings, in this study, the top 10 RG candidates could thus be used to normalize expression levels of genes of interest despite that most of the candidates might be not related to ubiquitin or ribosomal pathways. www.nature.com/scientificreports/ The comprehensive rankings suggested that, of the 12 genes, GLY1/DN374_c0_g1 was the most stable gene, followed by SKP1/DN253_c0_g1; on the contrary, Cyclase_X1 was thought to be the least stable one. Interestingly, we found that the housekeeping genes, eIF1α and ACT2, were not constantly expressed despite that their Arabidopsis homologs were widely used as internal controls in molecular biological experiments 43,44 . As revealed herein, the most stable genes identified in our analysis also included genes involved in signaling pathways. For example, DN253_c0_g1 encoded a putative S-phase kinase SKP1, which might mediate the ubiquitination of proteins involved in cell cycle progression 45 . DN8079_c1_g3 was ranked as the fourth stable gene by Ref-Finder, though it might encode a putative a high mobility group protein (HMG) which could regulate chromatin remodeling and ultimately gene transcription 46 . Therefore, the results reminded us that the empirically selected 'housekeeping' genes could be unsuitable as the internal controls, and their stabilities should be carefully evaluated before experiments.
Compared to the other algorithms, only geNorm provides a method to define the optimal number of RGs for accurate normalization of gene expression via the pairwise variation V analysis. The cut-off Vn/n + 1 value of 0.15 is recommended, indicating that the preferred primer number is n; otherwise, an additional RG should be included. In our experiments, the V2/3 was below the threshold value of 0.15, suggesting that two RGs were sufficient for data normalization. When using two RGs alone and in combination, the qRT-PCR analysis of randomly selected genes showed similar expression patterns in our experiments. However, it is still recommended to always use at least two RGs to avoid substantial errors, because we could not deny the possibility that these stable genes might participate in other biochemical pathways other than plant development.
We analyzed the expression data of five randomly selected genes in A. tuberosum normalized against either the most or the least stable genes according to the comprehensive rankings given by the RefFinder algorithms, demonstrating that transcriptional patterns analyzed with the stable RG were almost the same as observed in transcriptome analysis. On the contrary, when the raw data normalized with the least stable gene, the qRT-PCR analysis displayed distinct expression patterns of target genes, which could lead to biased interpretations of gene functions. Thus, our results suggested that the importance of appropriate RGs, which will enable more accurate and reliable normalizations in gene expression analysis.
In conclusion, we finally demonstrated the fidelity of GLY1 and SKP1 as the optimal candidate RGs in qRT-PCR experiments with diverse A. tuberosum tissues. This is the first report on the selection of RGs in A. tuberosum, which will benefit the study of gene expressions and other related subjects in Chinese chive. Nevertheless, we should be cautious when including the two RGs into qRT-PCR experiments in A. tuberosum, because their expression stability was only verified at the developmental aspect and could be affected by other experimental conditions. The seeds of Chinese chive were germinated on a 32-cell tray with granulated rockwool, and seedlings were irrigated with ¼ strength of Hoagland's medium once a day. After 2-month growth, seedlings were transferred to a hydroponic cultivation system in a controlled greenhouse located in the BVRC. The plants were supplied with the nutrient solution, and the composition formula as previously described 7 . Tissue samples of 3-year-old seedlings were harvested from the mature leaves, roots, rhizome, mature flowers, inflorescences, and seeds. All RNA-Seq and transcriptome de novo assembly. The 18 RNA-Seq libraries representing the six tissues were prepared were sequenced at the Illumina NovaSeq 6000 platform to an average depth of 50 million reads per sample. Sequence reads were filtered using SeqPrep (https:// github. com/ jstjo hn/ SeqPr ep) and Sickle (https:// github. com/ najos hi/ sickle) to remove the low-quality and adaptor sequences. Clean reads were assembled via the Trinity de novo assembly program (https:// github. com/ trini tyrna seq/ trini tyrna seq) 47 and TransRate (http:// hibbe rdlab. com/ tran-srate/) 48 . For annotation purposes, sequences were handled with CD-HIT (http:// weizh ongli-lab. org/ cd-hit/) program to reduce the transcript redundancy 49 , and finally, the assembly quality was evaluated using BUSCO (Benchmarking Universal Single-Copy Orthologs, http:// busco. ezlab. org) program with default configurations 50 . All raw data were deposited NCBI Sequence Read Archive under the project with identification number PRJNA67394520, and the assemble transcriptome is accessible at a public server (https:// doi. org/ 10. 6084/ m9. figsh are. 14820 201).

Materials and methods
Screening of candidate reference genes from transcriptome data. Candidate RGs were screened according to the previously described method with modifications 37 . Briefly, based on the transcriptome data from the six tissues, four parameters were adopted to evaluate the stability of gene expressions: (1) The coefficient of variation (CV), which detects the extent of variability in relation to the mean of the expression levels; (2) The normalized median absolute deviation (NMAD). In statistics, median absolute deviation (MAD) is commonly used for assessing the spread of the distribution according to the medians and is less susceptible to the deviations by outliers. The MAD was further normalized against the median, which could reflect the variability of gene expressions based on the medians. Compared to the CV that is often affected by outliers, and the NMAD pro- (3) Normality test. The p-value measured by the Shapiro-Wilks hypothesis indicates whether the expression data does fit the normal distribution, and in the test, p-value < 0.05 means a significant departure from normality. Most importantly, FPKM values were also taken into account. Because genes with low expression abundance are unsuitable as RGs, we used FPKM ≥ 100 as the cut-off value for candidate selection. The ideal RGs should have lower or similar statistical variations across samples, which was indicated by low CV and NMAD values. Moreover, their expression data also should pass the normality test, which allows that with 95% confidence they fit the normal distribution. To meet the criteria proposed herein, RGs were finally ranked according to the calculation of Euclidean distance (d), and the formula was as follows: After comparisons, the top 10 genes with the least values of Euclidean distance were selected for further analysis.
RNA extraction, the cDNA synthesis. Total RNA isolation was performed as described elsewhere 51 .
Briefly, total RNA was extracted with Transzol (Transgen Biotech, China) according to the manufacturer's instructions. Briefly, tissues of A. tuberosum were homogenized in liquid nitrogen and the frozen powder was lysed using Transzol extraction buffer. 1/5 volume of chloroform was added before 5-min centrifugation at 12,000g. The aqueous containing nucleic acid was precipitated with an equal volume of isopropanol by centrifuging at 12,000g for 10 min. The resultant pellets were washed in 75% ethanol and resuspended in RNase-free water. After DNase I digestion, the clean total RNAs were stored at − 80 °C freezer. RNA quantity and quality were examined using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA) to ensure structural integrity for further experiments.
First-strand cDNA synthesis was conducted by using the EasyScript First-Strand cDNA Synthesis SuperMix kit (Transgen Biotech, China) according to the user's manual. In summary, 1 µg of total RNA was mixed with 1 µl oligo (dT) 18 primers (10 mM), and RNase-free water. The mixture was incubated for 5 min at 65 °C before being chilled on ice for 3 min. Subsequently, 4 µl of the first-strand buffer, 1 µl of gDNA remover, and 1 µl of reverse transcriptase were added to finalize the reaction. After 1-h incubation at 42 ℃, the reaction was inactivated for 10 s at 70 °C. The cDNA diluted by fivefold with PCR-grade water was ready for use.
Quantitative real-time PCR. The amounts of individual genes were estimated with gene-specific primers by quantitative real-time PCR analysis with a real-time PCR instrument Roche LightCycler 480 and SYBR Green mixture (Toyobo, Japan) as previously described 52 . Briefly, primer pairs showing a single amplified product with the predicted sizes were chosen for further qRT-PCR experiments. Each PCR reaction mix consisted of 10 µl of SYBR Green Supermix, 0.5 µl of forward and reverse primers (10 mM), and 2.5 µl fivefold diluted template cDNA. Finally, the resulting reaction volume was made up to 20 µl by adding 7 µl PCR-grade water. The PCR cycling was performed as follows: 5 min at 94 ℃ followed by 40 cycles of 15 s at 94 ℃, 10 s at 58 ℃, 30 s at 72 ℃, and finally, 1 round of 60 s at 60 ℃, and the melting curve cycling consisted of 15 s at 95 ℃, 1 min at 60 ℃, 30 s at 95 ℃, and 15 s at 60 ℃. For qPCR efficiency, Based on the standard curve generated with serial dilutions of the pooled cDNA template, the amplification efficiency (E) of each primer pair was calculated using the equation: E% = (10 -1/slope − 1) × 100. For the validation experiment, the relative expression level was calculated by the 2 −∆ct method with selected RGs, where ΔCt is the difference in the threshold cycles between the target and the RG. All qRT-PCR reactions were performed in technical triplicate. The specific primers used in this experiment are shown in Supplementary Table 3.
Expression stability analysis of potential reference genes. The expression stability of the candidate RGs was evaluated by using geNorm (https:// genorm. cmgg. be/), NormFinder (https:// moma. dk/ normfi ndersoftw are/), BestKeeper (https:// www. gene-quant ifica tion. de/ bestk eeper. html), and RefFinder (https:// heart cure. com. au/) statistical algorithms [23][24][25][26]53 . The qBase+ software with geNorm analysis was employed 26 , while Nor-mFinder and BestKeeper analyses were performed with free add-ins for Microsoft excel 23,24 . RefFinder integrated the above three algorithms and the Delta Ct method is a free, web-based service for comprehensive ranking of RGs 25 . Based on the sequential pair-wise comparisons, geNorm calculates the stability value M and the variation value V, by which the stability of each gene and the number of optimal RGs is determined. NormFinder ranks individual candidates according to their stability value. BestKeeper determines the optimal references by employing pair-wise correlation analysis of all pairs of candidate genes. Using the LightCycler 480 Relative Quantification Software module (Roche, USA), quantification cycle (Cq) values were obtained and converted into relative quantities by using standard curves, then applied for evaluation. The relative expression levels were imported into geNorm and NormFinder, whereas BestKeeper and RefFinder analyses were conducted with the raw non-transformed Cq values.