Selection and Validation of Reference Genes for Gene Expression Studies by RT-PCR in Dalbergia odorifera

Perennial tree Dalbergia odorifera T. Chen could form the precious heartwood used to produce chinese traditional medicine, rosewood furniture and fragrances. However the formation of heartwood is time-consuming and low efficient, leading to the severe destruction of its wild resources. Thus, it is urgent to study the molecular mechanism of heartwood formation in D. odorifera. But till now, there is no report about the reference gene selection in this species. In this study, the expression stability of nine candidate reference genes were evaluated across different tissues and stems treated by wound and chemical stimulators. Four algorithms were applied to obtain the robust genes. The results support HIS2, GAPDH, and CYP to be the most stable reference genes in samples under different wound treatments while DNAj was the least stable. In different tissues, HIS2, UBQ, and RPL were the most stable reference genes while DNAj was the least stable. The selected reference genes were validated through the normalization of the qRT-PCR data of six heartwood related genes in terpene biosynthesis pathway and ethylene signal pathway. The results showed that their expression levels were accurate when they were normalized by the most stable reference gene HIS2, or by the combination of the two or three most stable reference genes. These results demonstrated that these selected reference genes are reliable.

the trunk begins to form heartwood after growing for 8-10 years. And it takes at least 30 years before it can be used as timber. In the natural environment, heartwood is formed mainly in the center of the trunk and root. In addition, when affected by certain external factors, such as mechanical damage, insect attack, microbial invasion, phytohormones, or drought, the trees would also produce heartwood around the wound or rotting parts of the trunk or root. Recent studies have shown that D. odorifera can be induced to produce heartwood by ethylene, 6-benzylaminopurine (6-BA), drought, microorganism and mechanical damage [15][16][17][18] . Thus, to improve heartwood production, it is urgent to study the exact mechanisms underlying its formation.
Gene expression analysis is a universal and fundamental method to explore the molecular mechanisms of various biological processes. Even though several large-scale gene expression tools have been developed, the quantitative real-time PCR (qRT-PCR) assay with its advantages on the high sensitivity, accuracy, specificity, and good reproducibility is normally used as the small-scale tool in gene expression analysis 19,20 . To avoid the variations caused in each step of the entire experimental procedure 21 , the internal reference genes are crucial in normalization of the qRT-PCR data. The ideal reference gene should express at stable level in different cell types and under different experimental conditions 22 . However, the expression levels of some widely used reference genes have also been proved to be varied across different samples or under different treatments [23][24][25][26] and no universal reference genes have been identified until now. Meanwhile, more and more studies have focused on the evaluation of the expression stability of potential reference genes in special species or under specific conditions. Several algorithms have been developed to validate the performances of candidate reference genes. The geNorm software conducts pair-wise comparison and calculates the M value of all candidate genes 27 . Low M value represents a slight variant. Furthermore, the pair-wise variation (V n /V n+1 ) between the sequential normalization factors (NF) (NF n and NF n+1 ) were calculated to determine the optimal number of reference genes required for accurate normalization. NormFinder is a model based program designed to calculate stability value according to the intra-group and inter-group variation 28 . The candidates are ranked according to their expression stability. The lower the stability value, the more stable the reference genes. BestKeeper determines the "optimal" reference genes using pair-wise correlation analysis of all pairs of candidate genes 29 . The deltaCt method compares relative expression of pairs of reference genes within each sample 30 .
However, no report is available concerning reference gene selection in D. odorifera, especially for its heartwood formation by different wound treatments. In this study, the expression stability of nine candidate reference genes were assessed in D. odorifera samples exposed to different wound treatments or in different tissues using geNorm, Normfinder, Bestkeeper, and deltaCt algorithms. The combined results were calculated using refFinder. The relative expression of six genes including genes in the terpene biosynthesis pathway and ethylene signal pathway were used to validate the reliability of the selected reference genes and to compare the effect of different wound treatments.

Results
Candidate reference gene selection and primer design. Candidate reference genes were selected according to the previous reports in other plants or the expression pattern in D. odorifera transcriptome data [31][32][33][34] . Ortholog sequences of these genes in D. odorifera were obtained in the 454 high-throughput sequencing library and identified through BLASTX against GenBank. According to the identity of the annotation of each gene with its ortholog, we developed a list of nine candidate genes including those encoding actin (ACT), polyubiquitin (UBQ), ribosomal protein (RPL), glyceraldehydes-3-phoshpate (GAPDH), chaperone protein dnaJ (DNAj), histone H2B (HIS2), elongation factor 1-alpha (EF1), peptidyl-prolyl cis-trans isomerase (CYP) and 18S ribosomal RNA (18S) ( Table 1). Agarose gel electrophoresis was performed and the amplification specificity of each primer pair was confirmed by a single band with the desired size of each amplicon. The melting curve analysis of qRT-PCR results also proved the primer specificity with a single peak for each product. PCR products were further confirmed by sequencing and BLAST against GenBank and the EST library.

Expression variation of candidate reference genes.
To obtain the most reliable results, the quality of isolated total RNAs was assessed by strict quality control method. OD 260/280 value examined by Nano drop ND2000 ranged between 2.0 and 2.2. Electrophoresis showed sharp and intense 18S and 28S ribosomal RNA bands with a practical absence of smears. The primary cDNA template amount used in all real-time PCR reactions was determined by the standard curves of all candidate reference genes to make sure the threshold cycle (Ct) for each one kept in an appropriated range . Finally, the average Ct value of all genes ranged from 15 to 34 ( Figure. 1). Amongst all the genes, GAPDH, UBQ, ACT, and 18S were the most abundant with the average Ct value close to 20. For all the tested genes, their expression levels in two subgroups were similar, however, their expression variation in different tissues were higher than that in different wound treatments, and the overall variation was much higher than that of each subgroup. In different tissue subgroup, HIS2 and CYP showed a comparatively narrower range of expression variation, while in different wound treatment subgroup, more genes showed little expression variation, indicating a possibility that different reliable reference genes might be selected for two subgroups.
Expression stability of the candidate reference genes in different tissues. In order to minimize the bias generated by the assumptions underlying each evaluation method, the expression data were analyzed by four different statistical approaches to rank the stability of candidate housekeeping genes.
The results of geNorm analysis showed that, for different tissues, all candidate genes reached high expression stability with relatively low M values far below the default limit of 1.5 (Table 2). Among them, HIS2 and RPL were the most stable genes with least M values of 0.541 and followed by CYP and UBQ, with the M value of 0.589 and 0.841 respectively.
The NormFinder algorithm demonstrated a different reference gene of choice. According to the stability value calculated by this software, UBQ with the least stability value of 0.388 ranked top as the best reference gene, followed by GAPDH (Table 2).
Because the BestKeeper program is designed to assess the reliability of the reference genes and determine a reliable normalization factor, but not to rank their reliability, we listed the candidate genes according to their standard deviation (SD) value. The BestKeeper program calculated HIS2, 18S, and CYP as the most suitable genes  www.nature.com/scientificreports www.nature.com/scientificreports/ in all tissue samples. The results of deltCt method showed that in different tissues, UBQ, HIS2, and GAPDH were the most stable genes with least SD value, while DNAj and 18S were the least stable genes.
With the purpose to identify the robust reference genes of all tested samples in D. odorifera, we applied RefFinder algorithm (http://150.216.56.64/referencegene.php) to aggregate all lists generated by each of the four approaches. RefFinder bases on the ranking from each program. It assigns an appropriate weight to an individual gene and calculated the geometric mean of their weights for the overall final ranking. The aggregation list obtained showed that HIS2, UBQ and RPL were the top three reliable reference genes. In contrast, DNAj, ACT, and 18S were the lowest ranked and should be avoided as the normalization gene (Table 2).
Expression stability of the candidate reference genes after different treatments. The geNorm program calculation showed that for different treatments, all genes are qualified too, and the most stable genes are GAPDH and CYP with the least M values of 0.354, and followed by HIS2 and RPL, with M values of 0.485 and 0.559 respectively ( Table 3).
The result of NormFinder analysis showed that HIS2 and GAPDH had the least variation value of 0.318 and 0.381 respectively in the subset of different wound treatments.
The BestKeeper program calculated UBQ, 18S and HIS2 as the most suitable genes in samples with different treatments, whereas DNAj and 18S were the least reliable ( Table 3). The results of deltCt method showed that HIS2, GAPDH and CYP were the most stable genes, while DNAj and ACT were the least stable genes.
The final list obtained by refFinder showed that HIS2, GAPDH and CYP were the top three reliable reference genes. In contrast, DNAj, ACT, and EF1 were the lowest ranked and should be avoided as the normalization gene (Table 3).
geNorm was also adopted to determine the optimal number of reference genes required for accurate normalization. For all tissue samples, the V 5/6 was 0.147 ( Figure. 2), indicating that including the five most stable reference genes might be best for the accurate analysis. For different wound treatments, the V 3/4 was 0.142 which is lower than the threshold of 0.15, demonstrating that including the three most stable reference genes, HIS2, GAPDH and CYP, might be sufficient for the accurate analysis. However, taking all samples together, none of the V n/n+1 value could reach the standard <0.15, demonstrating that it's hard to identify reference genes for both sample subgroups.
Validity of selected reference genes. To assess the validity of the selected reference genes, expression of potential heartwood related genes in terpene biosynthesis pathways and ethylene response pathways  www.nature.com/scientificreports www.nature.com/scientificreports/ were normalized using the most stable gene or least stable gene. These genes have been identified as wound up-regulated genes by the transcriptome data (SP Table 1).
The up-regulation of all these tested genes by different wound treatments could be confirmed, irrespective of the reference genes used (Figures 3-5). However, the detailed expression level was obviously different. For example, when 18S was used to normalize the expression data, almost all genes exhibited the highest expression level in the CK sample with ddH 2 O treated on the wound ( Figure. 3). This is inconsistent with our heartwood production experiment. Another unusual phenomenon was observed when the least stable gene DNAj was used as reference gene. The expression levels of many genes in the samples with H 2 O 2 treated on the wound were less than those in the CK samples ( Figure. 4). Furthermore, when the top ranked two or three reference genes were adopted, the expression patterns of these tested genes are highly similar (Figure. 5, SP Figs 1 and 2). These results demonstrated that the reference gene selected could influence the accuracy of gene expression level in the qRT-PCR data analysis and confirmed that the top ranked HIS2 was more reliable than the least ranked genes as the reference gene. The result reinforces the significance of validating reference genes prior to experimental applications.

Discussion
The precious heartwood of D. odorifera is formed extremely slowly and mainly in the centre of trunks and roots naturally. However, the demand in the market is growing. For the purpose of obtaining the heartwood, natural resource of D. odorifera is being destroyed. To solve this problem, researchers have paid great attention to the efficient production of heartwood. In recent years, researchers have noticed that different wound and biotic stresses could induce the heartwood formation in tissues outside of the wood centre. To reveal the secret of heartwood formation, it is urgent to conduct researches on the molecular mechanisms of the induced heartwood formation and identify important genes in this process. Thus, identification of suitable reference genes in D. odorifera, especially in samples under different treatments, is necessary and critical for accurate gene expression assessment.
Several algorithms have been developed for reference gene selection. Because each algorithm has its own bias, we applied the four algorithms to evaluate the candidate reference genes, and selected the most reliable ones in D. odorifera, especially for studying the mechanism of heartwood formation or the biosynthesis of major components. The analysis by the four programs generated slightly different ranking. Since these approaches are intrinsically biased, this result is not beyond our expectation. In order to counteract this bias influence, we applied the refFinder to aggregate the four gene lists. Our result demonstrated that for different wound treatments, HIS2, GAPDH, and CYP are the most reliable reference genes and for different tissues, HIS2, UBQ, and RPL are the three most reliable reference genes. Histone encoding genes are commonly used reference genes in many species and tissues. Even though it has been reported to be unstable in several studies 35,36 , it has also been proved to be www.nature.com/scientificreports www.nature.com/scientificreports/ the choice in some other reports. For instance, study in a dominant species of desert ecosystems Reaumuria soongorica showed that histone H2A was one of the most stable reference genes under 4 kinds of abiotic stresses (drought, salt, dark, and heat) 37 . Another histone gene HIS3 and eIF4A were identified as the most suitable for analyses of abiotic stress in tissues of annual ryegrass 38 . In green alga cultured in various environments, histone H4 was one of the most suitable 39 . These results implied an advantage of histone genes as reference gene in plants response to environmental factors and stresses. Besides, our result here also demonstrated that histone genes might be used in different tissues. The other stable gene in different treatment subgroup, GAPDH, is one of the most commonly used reference genes in many species and tissues, although it has been reported to be unstable in several studies 26,40,41 . Our previous study in Aquilaria sinensis has showed that it was one of the most stable genes under different treatments 23 . It was one of the top-ranked reference genes in different tissues or for cold or sucrose stresses in Chinese tallow 42 and was identified as the most stably expressed gene in flax 43 . In addition, it was selected as one of the stable reference genes for "hormone stimuli" group in carrot leaves 44 , and was a suitable reference gene under salinity/drought-treatment in sugarcane 45 . These results indicated the potential of GAPDH for being employed as the reference gene for a wide range of purposes.
Another stable reference gene UBQ was found to be one of the most stable genes for the different cultivars and different tissues examined, and for fruit developmental stages in the food seasoning Chinese prickly ash 46 . In a wide variety of stages of longan cultured under different temperatures, it was identified as one of the internal controls for gene expression analysis 35 . And it was one of the top-ranked reference genes as well as GAPDH in all samples in Chinese tallow 42 .
We demonstrated the fidelity of the selected reference genes HIS2, by normalizing the relative expression of several heartwood related genes in D. odorifera. These genes were identified from our 454 transcriptome sequencing data by BLASTX, and all of them were proved to be up-regulated by wound treatment according to the transcriptome study. 3-hydroxy-3-methylglutaryl-CoA reductase gene (HMGR), (E)-4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HDR), terpene synthase genes (TPS) were genes encoding enzymes of plant terpene biosynthesis pathway 47,48 . In the validation of HIS2 as reference gene, it was confirmed that they could be induced by wound, especially for HMGR, TPS1, and TPS2. The induction of HDR was not as obvious as them, this is because that HDR expressed at a moderate level in untreated stems, whereas the other three almost had no expression at all. Since HMGR and HDR are genes from MVA and MEP pathways respectively. This demonstrated www.nature.com/scientificreports www.nature.com/scientificreports/ that the terpenes in heartwood might be synthesized by the MVA pathway. This result is consistent with some other researches which also showed that wound could induce the terpene biosynthesis through MVA pathway 49 . Moreover, we noticed that the expression of two TPS genes had different expression pattern. After 6-BA treatment, TPS1 could be obviously induced, whereas TPS2 was totally inhibited. Although the deep reason under this need to be further studied, this demonstrated that different treatments might influence the components of the heartwood because different TPSs always produce different terpenes.
ERS1 and ETR2 are two ethylene receptors in Arabidopsis, and play very important role in ethylene signal transduction. Our results were consistent with researches in Arabidopsis showed that they could be induced by ethylene 50,51 . The wound and ethylene induction of their expression proved by this study demonstrated that these two genes might function as ethylene receptors in D. odorifera too and take part in the heartwood formation. Furthermore, when 18S was used as reference gene, the plant hormone treatments were all shown to have negative effect on the expression of these genes, compared with CK which only treated by wound and ddH 2 O. This is not consistent with our and other heartwood induction experiments. As many other studies, our study proved that if the unsuitable reference gene was used, the data analysis would result in biased result.

Methods
Plant materials and treatments. All samples were collected from seven-year-old D. odorifera T. Chen trees in the planting base of Hainan Branch, the Institute of Medicinal Plant Development of China. For different tissues, roots, stems, leaves, flowers, and fruits were cut off by scalpels or gardening scissors. For different wound treatments, stems were incised by single side razor blades to produce 4 × 2 × 0.5 cm wound. The wounds were then treated with 1% H 2 O 2 , ethephon, 6-BA, or abscisic acid (ABA) for 24 hours. The mock-treated stems with only ddH 2 O supplemented on the wound for the same time interval served as the control. All samples were collected in biological triplicate. All samples were ground in liquid nitrogen for RNA isolation or stored at −80 °C until use.
Total RNA isolation and cDNA synthesis. Total RNA was extracted from all samples using the RN53 EASYspin Plus Total RNA Extraction Kit (AIDLAB) according to the manufacturer's instruction and then was treated with RNase-free DNase I (TaKaRa, Japan). The RNA integrity was checked by 1% agarose www.nature.com/scientificreports www.nature.com/scientificreports/ gel electrophoresis. The quantity and quality of each RNA sample were examined by a NanoDrop ND-2000 Spectrophotometer (Thermo Fisher, USA) to confirm the OD260/OD280 value between 2.0 and 2.2. 1 μg RNA for each sample was used in the 25 μl reverse transcription reaction system with RevertAid ™ first strand cDNA Synthesis kit K1622 (Thermo Scientific) according to the manufacturer's instruction. PCR primer design. A total of nine commonly used housekeeping genes were selected and ortholog sequences from the EST library of D. odorifera were used for primer design. Detailed information of these genes is described in Table 1. Primers were designed with the EST sequences by Oligo 6 with PCR amplicon length of 100-300 base pairs (Table 1).
Quantitative real-time PCR and amplification efficiency test. The qRT-PCR reaction was performed using ABI SYBR Selected Master Mix kit (4472908) and run on 96-wells plates with the Applied Biosystem StepOnePlusTM Real-Time System. 2 μl template cDNA was added to the 12.5 μl PCR reaction mixture containing 0.2 μM of each primer. ddH 2 O was used to compensate the reaction system to a final volume of 25 μl. PCR was initiated with 1 minute incubation at 95 °C and followed by 40 cycles of 95 °C for 15 seconds and 60 °C for one minute. All qRT-PCR reactions were carried out in biological and technical triplicates. Each PCR was repeated three times. The specificity of the qRT-PCR reactions was determined by melt curve analysis of the amplified products with a heating procedure from 72 °C to 95 °C in 0.5 °C steps. Negative controls with water instead of cDNA were included for each gene target under assay.
Primer efficiencies and standard deviations were calculated based on a standard curve generated using a tenfold dilution of template cDNA over at least four dilution points that were measured in triplicate. A variation of 10% was allowed. Amplicon specificity was confirmed by sequencing.
The amplification products were sequenced and BLAST against GenBank and our EST library of D. odorifera to confirm whether they were the target gene sequences. Data analysis. For geNorm and NormFinder analysis, expression level was calculated relative to the sample with the highest expression. For BestKeeper analysis, the Ct values were input directly to the Excel table.