Comprehensive evaluation of candidate reference genes for quantitative real-time PCR-based analysis in Caucasian clover

The forage species Caucasian clover (Trifolium ambiguum M. Bieb.), a groundcover plant, is resistant to both cold and drought. However, reference genes for qRT-PCR-based analysis of Caucasian clover are lacking. In this study, 12 reference genes were selected on the basis of transcriptomic data. These genes were used to determine the most stably expressed genes in various organs of Caucasian clover under cold, salt and drought stress for qRT-PCR-based analysis. Reference gene stability was analyzed by geNorm, NormFinder, BestKeeper, the ∆Ct method and RefFinder. Under salt stress, RCD1 and PPIL3 were the most stable reference genes in the leaves, and NLI1 and RCD1 were the most stable references genes in the roots. Under low-temperature stress, APA and EFTu-GTP were the most stable reference genes in the leaves, and the RCD1 and NLI2 genes were highly stable in the roots. Under 10% PEG-6000 stress, NLI1 and NLI2 were highly stable in the leaves, and RCD1 and PPIL3 were the most stable in the roots. Overall, RCD1 and NLI2 were the most stable reference genes in organs under normal conditions and across all samples. The most and least stable reference genes were validated by assessing their appropriateness for normalization via WRKY genes.

were used in this study. Original sources of the plant materials were obtained from the Inner Mongolia Grass Variety Engineering Technology Research Center of Inner Mongolia Agricultural University. Staff at the center formally identified the samples, provided details of specimen deposition and provided seeds (International Plant Name Index (IPNI) Life Sciences Identifier (LSID) urn:ipni.org:names:522843-1). After they were sterilized with 75% ethanol for 30 s, followed by NaClO 4 for 10 min, the seeds were sown in vermiculite in plastic pots (10 cm diameter, 9 cm depth) in a greenhouse at Northeast Agricultural University on 9 September 2018. The greenhouse had a day/night average temperature of 24/18 °C, a photoperiod of 16/8 h (light/dark) and a relative humidity of 70-80%, and the vermiculite was kept moist with 1/2-strength Hoagland nutrient solution. The plants were treated when they were 5 weeks old. Samples from different organs (roots, stems and leaves) were collected from the plants for 5 weeks. The experimental design included three biological replicates.
Five treatments were applied in the present study: (1) a low-temperature treatment involving subjecting the plants to 4 °C for different durations (0, 2, 6, 12, and 24 h) under irrigation with Hoagland solution, after which leaf and root tissues were collected; (2) a salt treatment involving the application of 200 mL of NaCl at different concentrations (0, 12.5, 25, 50 and 100 mmol/L) to simulate high-salt conditions for 4 h, after which leaf and root tissues were collected; (3) a drought stress treatment involving subjecting plants to 10% polyethylene glycol (PEG)-6000 solution (w/v; Sangon, China) for different durations (0, 2, 6, 12, and 24 h), after which leaf and root tissues were collected; (4) a control in which stems, leaves and roots were collected under normal conditions; and (5) a treatment in which all samples from the four treatments were considered together for all samples. All the samples were collected in triplicate, frozen in liquid nitrogen, and stored at − 80 °C.
Total RNA isolation and first-strand cDNA synthesis. Total RNA was extracted from each sample by TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions. Genomic DNA (gDNA) was removed by RNase-free DNase I digestion during the isolation procedure. The quantity and purity of RNA were determined via 1% agarose gel electrophoresis. RNA whose A260/280 ratio ranged from 1.8 to 2.2 and whose A260/230 ratio was greater than 2.0 was used for further synthesis. All the RNA samples were stored at − 80 °C. For qPCR, first-strand cDNA was synthesized from 1 μg of total RNA via HiScript II Q RT SuperMix for qPCR (+ gDNA wiper) according to the manufacturer's protocol 7 . Before the next step, the cDNA samples were stored at − 20 °C.
Selection of candidate reference genes, mining of target transcripts, designing of primers and verification of selected gene amplicons. The RNA transcriptome sequence of Caucasian clover is available from a public database (accession number SRP159097 in the NCBI database). Twelve candidate reference genes among 27,004 genes (with fragments per kilobase of transcript per million mapped reads (FPKM) values ≥ 5 for at least three replicates per treatment and differences from each other not surpassing 5) 22 identified via RNA sequencing analysis were selected from the RNA transcriptome sequence of Caucasian clover. In addition, data mining for target transcripts within the transcriptome was also performed.
The primer pairs of candidate reference genes and target transcripts were designed via the NCBI website (https ://www.ncbi.nlm.nih.gov/) and Oligo 6 software, with the following parameters: a primer length of 20-22 bp (optimal length of 100 bp), an amplicon length of 80-150 bp, an annealing temperature (Tm) within the range of 57-63 °C (optimal temperature of 60 °C), and a temperature difference of each primer of less than 1 °C. All the primers used were synthesized by a commercial supplier (RiboBio, Harbin, China). All the information about the primer design in this study is listed in Table 1. Primer specificity was determined via 1.0% agarose gel electrophoresis ( Fig. 1 and Fig. S1).
qRT-PCR and gene-specific PCR amplification efficiency. qRT-PCR was performed in 96-well plates on a Quantagene q225 Real-Time PCR System (Novogene). The total volume of the reaction solution was 10 μL, which consisted of 0.5 μL of cDNA, 0.2 μL of forward primer, 0.2 μL of reverse primer, 4.1 μL of ddH 2 O and 5 μL of 2× ChamQ Universal SYBR qPCR Master Mix. The PCR procedure was as follows: the thermal profile of the reaction included initial denaturation at 95 °C for 2 min, followed by 40  Ranking the stability of candidate reference genes. geNorm 23 , NormFinder 24 , BestKeeper 25 , the ΔCt method and RefFinder 26 (https ://www.heart cure.com.au/reffi nder/?type=refer ence) were used to analyze the stability of the candidate reference genes under different conditions. All procedures were performed in accordance with the program instructions.
The geNorm algorithm, described by Vandesompele et al. 27 , calculates the gene expression stability, M, for a reference gene as the average pairwise variation, V, for that gene with all other tested reference genes. Stepwise exclusion of the gene with the greatest M value allows ranking of the tested genes according to their expression stability. Stable reference genes with M values lower than 1.5 can be used with geNorm to determine the optimal  28,29 .
NormFinder software was used to identify the stable reference genes, and the principle underlying the calculations was similar to that used by geNorm. First, we obtained the expression stability, M, and then selected the most stable reference gene according to M. The standard was the most stable reference gene with the smallest M value. However, this program selects only the most suitable reference gene.
The BestKeeper algorithm compares the expression levels of only ten reference genes and ten target genes in 100 samples. The correlation coefficient (r), standard deviation (SD) and coefficient of variation (CV) between each gene were obtained. The magnitudes of the respective values were then compared, after which the reference gene with the best stability was ultimately determined.
With the ΔCt method, ranks were determined according to pairwise comparisons of gene sets. The reference gene with the lowest SD had the most stable expression. RefFinder was then used to combine all four statistical methods (geNorm, NormFinder, BestKeeper and the ΔCt method) to calculate the comprehensive ranks.
Validation of reference gene stability. In the present study, the WRKY gene was verified to be involved in various processes, including responses to biotic and abiotic stresses. The relative expression data were calculated according to the 2 −ΔΔCt method 30 and presented as relative expression levels. The sequence of WRKY was obtained from the RNA sequence of Caucasian clover deposited in the NCBI database. In this study, we used WRKY as a target gene to validate the stability of the reference genes. The relative expression levels of WRKY in the roots of plants under low-temperature treatment and in the roots of plants under salt stress were determined and normalized by the use of the most and least stable reference genes.

Results
Primer verification and expression levels of candidate reference genes. We determined the cycle threshold (Ct) values for 12 candidate reference genes (three biological replicates and three technical replicates). The Ct values of all the candidate reference genes are shown in Supplementary Table S1  Expression stability of candidate reference genes. To further evaluate the stability of the candidate reference genes, we used five methods (geNorm, NormFinder, BestKeeper, the ∆Ct method and RefFinder) to determine the individual expression stability of the genes. geNorm analysis. In the geNorm analysis, the M values were determined for root and leaf samples of plants subjected to three different treatments (Fig. 3). The rank of the candidate reference genes differed on the basis of the different conditions. In the roots of plants under low temperature (Fig. 3A), PP2A and PPIL3 had the lowest M values, which indicated that they were most stable under the low-temperature treatment. NLI1 had the greatest M value (1.51), and its expression was the most unstable. In the leaves of plants under low temperature (Fig. 3B), all of the M values were below the threshold of 1.5, and the expression of PTPMT1 and MAP was the most stable. Overall, M is suggested to be the criterion for appropriate reference gene selection (Fig. 3). In the  We also calculated the optimal number of reference genes according to the pairwise variation results (V n/n+1 values). The results showed that, with the exception of those of the roots of plants under salt stress (V 4/5 = 0.13), all of the V 2/3 pairwise variations calculated by geNorm were less than 0.15, and V 4/5 was determined to be 0.13 for the roots of plants subjected to salt stress. These results suggested that the two most stable reference genes were adequate for qRT-PCR-based normalization across different experimental conditions and that an additional reference gene was not needed (Fig. 4). (Table 2), relatively low stability values indicate relatively stable reference genes. The results of the NormFinder analysis are listed in Table 2. The most stable reference genes differed among organs and stresses. The most stable reference genes included RCD1 under salt stress, PPIL3 under 10% PEG-6000 stress and PTPMT1 under low-temperature stress across all the leaf samples. The most stable reference gene under salt and low-temperature stresses was PP2A, and that under 10% PEG-6000 stress was PTPMT1 across all the root samples. The most stable reference genes across all the organs and samples were EFTu-GTP and RBD, respectively. The least stable reference genes across all the samples and conditions were MAP and TMP, respectively. ∆Ct analysis. Relatively small ΔCt values of candidate reference genes indicate relatively high gene expression. As shown in Table 3, the ΔCt values differed among the stresses. In the organs, the smallest ΔCt values of the reference genes were 0.56, 0.56 and 0.58 for EFTu-GTP, RBD and NLI2, respectively, and the most unstable reference gene was TMP. However, the most stable reference genes were EFTu-GTP and NLI1 across all the samples, and PPIL3 was the most unstable reference gene overall. Among all the leaf samples, the most stable reference genes were PTPMT1 under low-temperature stress and PPIL3 under both salt stress and 10% PEG-6000 stress. Among all the root samples, the most stable reference genes were EFTu-GTP under low-temperature stress, PP2A under salt stress and MAP under 10% PEG-6000 stress. (Table 4), the ranking of candidate reference genes is evaluated according to CV and SD values, which are determined by Ct values. Relatively high stability of reference genes is represented by relatively low SD and CV values, and candidate reference genes are unsuitable for normalization and unstably expressed when the SD is > 1.

BestKeeper analysis. In BestKeeper analysis
As shown in Table 4, among all the leaf samples, the most stable reference genes were RCD1 under salt stress, PTPMT1 under 10% PEG-6000 stress and EFTu-GTP under low-temperature stress. In the leaf samples of plants treated with 10% PEG-6000, the SD values of NLI1 and BZIP were greater than 1, which indicated that the reference genes could not be used for normalization. Among all the root samples of the plants, the most stable reference genes were NLI2 under salt stress, PPIL3 under 10% PEG-6000 stress and PP2A under low-temperature stress. Among the different organs, the most stable reference gene was EFTu-GTP, and PTPMT1 was the most stable reference gene across all the samples. Candidate reference genes with an SD > 1 were identified in the roots of plants treated with salt, 10% PEG-6000 and low temperature; these genes could not be used for normalization.
RefFinder analysis. The online tool RefFinder was used to determine the comprehensive ranking of the candidate genes analyzed by the other four methods (geNorm, NormFinder, BestKeeper and the ΔCt method). The results of RefFinder also showed that the comprehensive rankings of the reference genes differed under the various stresses and among the different organs (Supplementary Table S2 and Table 5). The most stable reference     www.nature.com/scientificreports/ the analysis of geNorm, NormFinder, BestKeeper and the ∆Ct method and obtained a comprehensive ranking ( Supplementary Fig. S3).  35 .

Reference gene validation.
On the basis of the comprehensive analysis of the results from geNorm, NormFinder, BestKeeper, and the ΔCt method by RefFinder, the two most stable reference genes and the least stable reference genes under different treatment conditions were selected. In root samples from plants subjected to salt and low-temperature treatments, the expression of WRKY at 0 h was assumed to be 1 and was used to compare the relative expression of genes from the samples at other time points.
As shown in Fig. 5, in the roots of plants under low-temperature stress, the two most stable reference genes, RCD1 and NLI2 (including RCD1 + NLI2), and the least stable reference gene, EFTu-GTP, according to the results of the comprehensive evaluation were used to calibrate the relative expression of the WRKY, RCD1, and NLI2 target genes, respectively. In addition to the expression pattern of the RCD1 + NLI2 combination, which was used as a reference gene, the expression pattern of WRKY was consistent. When the least stable reference gene (EFTu-GTP) was used to correct the target gene, the expression pattern greatly differed from the above expression pattern. In the roots of plants under salt stress, the relative expression of the WRKY target gene was validated using the two best reference genes, NLI1 and RCD1; the NLI1 + RCD1 combination; and the least  www.nature.com/scientificreports/ stable reference gene, NLI2. When NLI1, RCD1, and the NLI1 + RCD1 combination were used as references, the expression pattern of WRKY was consistent. When the least stable reference gene (NLI2) was used to validate the target gene, the expression pattern was very different from the above expression pattern.
In the roots of plants under low-temperature treatment, the least stable reference gene was EFTu-GTP, and the most stable reference genes were RCD1 and NLI2. In the roots of plants under salt stress, the least stable reference gene was NLI2, and the most stable reference genes were NLI1 and RCD1. We also used the combination of the most stable reference genes for validation.

Discussion
As an advanced, accurate and commonly used research tool, qRT-PCR is pivotal for quantifying the relative expression levels of target genes 36 . qRT-PCR has been used to quantify relative levels of gene expression on the basis of normalization to the expression of stable reference genes 37 . Unstable reference genes can substantially affect the results of such analyses and can even lead to erroneous conclusions 38 . Primer amplification efficiency represents the amplicon doubling rate during the PCR process, which also affects the accuracy of qRT-PCR 16 . A good primer has an E value ranging from 90 to 105% 23 . In this study, the E values of all the primers used were in this range, ensuring the accuracy of the qRT-PCR data in this study.
After consulting previous studies of other species and searching our transcriptomic database, we selected 12 candidate reference genes for evaluation. No reference genes suitable for different tissues or organs of Caucasian clover related to growth and development and stress conditions were identified. In general, the most stable genes in all the samples were RCD1, PPIL3, NLI1 and NLI2, whereas bZIP and EFTu-GTP were the most unstable reference genes. bZIP and EFTu-GTP encode a transcription factor and an elongation factor Tu GTP-binding domain-containing protein, respectively. It is very easy to draw erroneous conclusions due to random effects caused by heterogeneity of the samples. Therefore, it is not recommended to use bZIP or EFTu-GTP as a reference gene in such experiments. RCD1 is needed to maintain cells in a division-competent state and to regulate division plane placement 39 . Reports on PPIL3, NLI1 and NLI2 in plants are scarce. PP2A was evaluated in four strains of Auricularia cornea 40 at different developmental stages, and its comprehensive ranking was 3. The rankings generated from the four algorithms were incomplete, and the results were confirmed in Rhododendron 20 , Taihangia rupestris 41 and Baphicacanthus cusia 42 . In the leaves and fruits of Lagenaria siceraria 43 , the reference gene CYP was the most unstable according to geNorm.
Nearly all of the relevant studies have revealed that the use of more than one reference gene for normalization yields more accurate results 44,45 . In most cases, the gene stability results calculated by geNorm and NormFinder were similar (Fig. 3 and Table 2). The best number of reference genes used for normalization was calculated by geNorm, and the results showed that, with the exception of the value from root samples of plants under salt stress, all the V 2/3 values were below the threshold (0.15). A V 2/3 value of < 0.15 indicated that the optimal number of reference genes for normalization was two, and a V 4/5 value of < 0.15 in the roots of plants under salt stress indicated that the best number of reference genes for normalization was four, which was not ideal. However, the M values calculated by geNorm can also be used to evaluate the stability of reference genes; the results showed that only three of the studied reference genes were unstable, while the M values of the other reference genes were below 1.50. Thus, V scores cannot be used as the only index. On the basis of the results of this study, we suggested that using the most stable reference gene for normalization may be a better choice for the roots of plants under salt stress conditions. However, the gold standard of qRT-PCR is the use of at least four reference genes to determine the deviation of a single reference gene 38 . Using four algorithms (geNorm, NormFinder, BestKeeper and the ΔCt method), we found that the rankings of all the reference genes differed among the results of the different algorithms. This variation was expected, however, because the four approaches involve the use of different calculation algorithms, each of which has been verified in many previous studies [39][40][41] . www.nature.com/scientificreports/ In this study, for further verification of the best reference genes, we compared the CVs of the FPKMs of 12 candidate reference genes between the control and different organs in our RNA sequencing (RNA-seq) data (Fig. S2). At the same time, the CVs of the 12 candidates were also compared with our results obtained by geNorm, NormFinder and BestKeeper (Fig. S2), and the overall results were generally consistent. However, this did not undermine the final results of the experiments, because the sequencing data were used only for verification in subsequent validation experiments instead of being directly used to draw conclusions. For this reason, the data could be considered only preliminary, which indicated the trends the FPKM CV values of 12 candidate reference genes. We validated the selected reference genes according to the relative expression of WRKY in the roots of plants under low-temperature stress and that of WRKY in the roots of plants under salt stress. The results shown in Fig. 3 confirm that the candidate reference genes were applicable and stably expressed in Caucasian clover. The results also indicated that the most stable reference genes differed among organs and among treatments, even within the same organ.

Conclusion
Twelve candidate reference genes were selected for qRT-PCR standardization evaluation, and five different statistical methods were used. The results showed that RCD1, PPIL3, NLI1 and NLI2 were the most stable reference genes, while bZIP and EFTu-GTP were the most unstable reference genes. The stable reference genes identified in this report will enhance the accuracy of qRT-qPCR-based analysis of target gene expression and can be used to study related functional genes in Caucasian clover.