Selection of suitable reference genes for normalization of quantitative RT-PCR in peripheral blood samples of bottlenose dolphins (Tursiops truncatus)

Quantitative RT-PCR is often used as a research tool directed at gene transcription. Selection of optimal housekeeping genes (HKGs) as reference genes is critical to establishing sensitive and reproducible qRT-PCR-based assays. The current study was designed to identify the appropriate reference genes in blood leukocytes of bottlenose dolphins (Tursiops truncatus) for gene transcription research. Seventy-five blood samples collected from 7 bottlenose dolphins were used to analyze 15 candidate HKGs (ACTB, B2M, GAPDH, HPRT1, LDHB, PGK1, RPL4, RPL8, RPL18, RPS9, RPS18, TFRC, YWHAZ, LDHA, SDHA). HKG stability in qRT-PCR was determined using geNorm, NormFinder, BestKeeper and comparative delta Ct algorithms. Utilization of RefFinder, which combined all 4 algorithms, suggested that PGK1, HPRT1 and RPL4 were the most stable HKGs in bottlenose dolphin blood. Gene transcription perturbations in blood can serve as an indication of health status in cetaceans as it occurs prior to alterations in hematology and chemistry. This study identified HKGs that could be used in gene transcript studies, which may contribute to further mRNA relative quantification research in the peripheral blood leukocytes in captive cetaceans.


Function
Gene Name  Table 2); therefore all 15 HKGs were included in the analysis of reference gene suitability. Figure 1 illustrated the variable transcript levels in the 15 HKGs with the lowest mean Cq values (21.06) in ACTB, and the highest (32.14) in SDHA. Transcript levels were used to establish two arbitrary categories: those that were highly transcribed (mean Cq values < 25 cycles) including ACTB, B2M, GAPDH, RPL4, RPL8, RPL18, RPS18, and RPS9, and those with lower transcript levels (mean Cq values > 25 cycles) including HPRT1, LDHA, LDHB, PGK1, SDHA, TFRC and YWHAZ. All HKGs showed a small difference (< 5 cycles) between the maximum and minimum Cq values.
Gene transcript stability in 75-sample group. The geNorm algorithm demonstrated good stability of all tested HKGs (Table 3). M values, average pair-wise variation of a given gene with all other control genes, were lower (0.178-0.682) than the program's default limit of 1.5. RPL18 and RPS18 were the most stable genes followed by RPL8, RPL4, and RPS9. The optimal number of reference genes for normalization was established using pairwise variation values (V) between two sequential normalization factors while employing an increasing number of genes. The optimal number was determined to be less than 3 based upon V 2/3 value of 0.084 (< 0.15 is the default cut-off) ( Fig. 2A), indicating that the addition of one more HKG would not significantly improve reliability. NormFinder analysis identified HPRT1  as having the best stability (0.357) followed by RPL4 (0.390) and PGK1 (0.408) ( Table 3). BestKeeper analysis determined the SD Cq value of all HKGs (0.540-0.783) were < 1 indicating these genes were basically stably expressed. The three most stable genes, according to their SD Cq value ,were RPL4, TFRC and RPS9 ( Table 3). The best choice in comparative Δ Ct method was PGK1, HPRT1 and RPL4. RefFinder identified RPL4, HPRT1 and PGK1 as being superior (Table 3, Fig. 3A).
Gene transcript stability in 35-and 55-sample groups. In both 35-and 55-sample groups the M values in geNorm of all studied HKGs were lower than the program's default limit (M = 1.5), and the SD Cq value in BestKeeper were < 1. V 2/3 values in both groups were below to 0.15 (Fig. 2B,C). In contrast to the result in 75-sample group, geNorm algorithm identified PGK1, ACTB and LDHA to be the most stable HKGs in both groups. BestKeeper showed similar results in high-ranking HKGs (RPL4, TFRC, RPS9, B2M and HPRT1) in three groups. NormFinder and comparative Δ Ct method identified HPRT1, PGK1 and RPL4 to be the best-three HKGs in all three groups. The RefFinder comprehensive rankings placed PGK1, HPRT1 and RPL4 as being highly ranked HKGs in all three groups (Tables 3-5, Fig. 3).

Discussion
RNA transcript stability plays a crucial role in qRT-PCR-based studies due to pre-analytical variations, especially degradation by endogenous RNases and unintentional transcription of individual genes after blood drawing. Without proper preservation, copy number of individual mRNA transcripts in blood can change more than 1000-fold during storage and transport 20 . Two common methods for stabilizing blood  leukocyte RNA are to use PAXgene Blood RNA vacutainer tube and RNAlater. Previous study compared RNA transcript stability in blood leukocytes collected directly into PAXgene tubes or transferred immediately in RNAlater, and showed both methods were clearly appropriate for RNA stabilization in blood based on good quantity, integrity and purity of isolated RNA 21 . PAXgene tube has been successfully used in blood collection of sea otters (Enhydra lutris) to measure differential transcript levels of select immune function genes 20 . PAXgene tubes have the advantage of minimal sample manipulation and immediate exposure of the cells to RNA stabilizing agents, but require collection of 2.5 mL of blood. The present study used RNAlater. Smaller blood volume (0.5 mL) needed for RNAlater would facilitate sample collection and transport. Besides, a small difference between the maximum and minimum Cq values of each HKG indicates good quality of RNA stabilization and the consequent procedures in the current study. Amplification efficiency is an important factor in gene transcript studies using qPCR. When efficiency is not close to 100% (doubling of PCR products per cycle), the calculation the gene quantification requires corrected 22 . Δ Δ Ct and Δ Δ Ct corrected may yield similar results only in the case where high quality data are available and the traditional Δ Δ Ct method would not overestimate the error 23 . The efficiency values of 15 candidate genes in the present study were within the optimal range of 95-105%. Regardless, HKG rankings were susceptible to modest variation if raw Cq values were used for stability analysis rather than corrected Cq values were used (data not shown). This suggests that proper efficiency adjustment can improve qPCR data analysis with greater accuracy.
The HKG stability orders proposed by the four different algorithms used in the current study were not identical, which has been described before 24 . BestKeeper uses raw Cq data as compared to relative transcript levels used in geNorm and NormFinder that may lead to the different outputs 24 . Comparative Δ Ct and geNorm, which use a pairwise comparison approach, are prone to select co-regulated genes and this can also influence the ranking results 25 . While NormFinder uses a model-based approach that considers systematic differences and is less likely to be impacted by co-regulated HKGs, it is sensitive to sampling errors and outliers 26 . Since different algorithms can show various HKG rankings, it has been suggested that more than one type of algorithm should be used for reference gene selection 27 . RefFinder was used in the current study to combine all four algorithms to comprehensively evaluate and rank HKGs. This approach assigns an appropriate score to each individual HKG and then calculates their geometric means to produce a final ranking. The three most stable HKGs (PGK1, RPL4, HPRT1) identified using RefFinder were also in high-ranking orders in NormFinder and comparative Δ Ct. In contrast, the top 5 reference genes identified by geNorm were all coding for ribosomal proteins that are likely to be co-regulated. It has been demonstrated that the sensitivity to co-regulation is a major weakness of the pairwise comparison approach while the co-regulation of candidate HKGs does not significantly affect the model-based approach (NormFinder) 26 . Sole utilization of ribosomal protein genes as reference genes has the potential to decrease the sensitivity of identifying changes in transcript levels of GOI in an experiment 6 . Therefore, utilization of HKGs whose encoded proteins belong to different functional classes would reduce the co-regulation effect 26 . The three most stable HKGs in the present study are responsible for different functions. PGK1, encoding for a key enzyme in glycolysis and gluconeogenesis, has previously been identified as a stable reference gene for use with human whole blood RNA and RNA derived from PBMC 28 . RPL4 encodes a protein that is a component of the 60S ribosome subunit. It has been identified as a suitable reference gene on the PBMCs with unknown pathogenic condition in pigs 29 . RPL4 and PGK1 have previously been recommended as reference gene for exfoliated cervical cells 30 . HPRT1, plays a central role in the generation of purine nucleotides through the purine salvage pathway, belonged to one of the most stable reference genes for qRT-PCR studies in human neutrophils 31 and exercise induced stress in horse PBMCs 32 .
Increasing the number of stably transcribed HKGs included in calculation will increase the efficacy of the normalization factor 3 . Previous studies have suggested there is no single reference gene that can be used for different experiments but rather a group of putative reference genes should be considered for certain specific experimental setups 27 . While inclusion of more HKGs further decreased the V values in the present study, the V2/3 value showed two genes were sufficient for data normalization. Previous study has suggested the transcript levels of a reference gene should not to be very low (Cq > 30) or very high (Cq < 15) 33 . However, appropriate reference genes were suggested to have the same transcript levels as the target gene in an experimental application in order to enhance the uniformity of the analysis 5 . According to mean Cq values, PGK1 and HPRT1 were classified in the low transcript-level group (mean Cq > 25) and RPL4 in the high transcript-level group (mean Cq < 25). Based upon these concepts, the low-level transcripts encoding PGK1 and HPRT1 would be logical reference genes for studying immune-inducible genes with typical low transcript level, and the combination of RPL4 and PGK1 would be more appropriate for higher transcript-level studies. Investigators must recognize that the proposed reference genes in this study would be suitable only when RNA is extracted from RNAlater-preserved whole blood samples of bottlenose dolphins. It has been shown that some HKGs found to be invariant in proliferating PBMC cultures were unsuitable when studied in whole blood 28,34 . For blood samples from other cetacean species, more studies are needed to identify if PGK1, HPRT1 and RPL4 are superior reference genes as well.
Immunologic studies of cetaceans with cytokine gene transcripts have been conducted on blood samples using several different HKGs as reference genes. GAPDH and YWHAZ were used in harbor porpoise studies [10][11][12] , GAPDH in bottlenose dolphins 13 , and RPS9 in bottlenose dolphins, beluga whales, and Pacific white-sided dolphins 8,14 . RPS9 could potentially be a better reference gene than GAPDH and YWHAZ in studies using bottlenose blood samples since it was ranked much higher than the other two genes in the current study.
The reliability of reference gene selection could be affected by sample size. The general recommendation for selecting reference genes using NormFinder is a minimum of 8 samples and 5-10 genes 26 , and GeNorm proposed the use of 8 reference genes and 10 samples 3 . Previous studies directed at reference gene selection in cetaceans have included 30 skin biopsy samples in striped dolphins 5 , and 20 skin biopsy samples from 7 blue whales (Balaenoptera musculus), 7 fin whales (Balaenoptera physalus) and 6 sperm whales (Physeter macrocephalus) 17 . Bovine 35 and sheep 36 studies have employed 22 and 16 neutrophil samples, respectively, for use as reference gene selection. The current study employed 75 blood samples from 7 bottlenose dolphins including clinically healthy controls and individuals with a variety of different body conditions, which has been suggested for facilitating optimal reference gene selection for a wide-range of whole blood transcript studies 37 . In addition, the analyses were conducted using randomly selected subsets of 35 and 55 samples for comparative purposes. Analyses of the 35-and 55-sample subsets using RefFinder also identified HPRT1, PGK1 and RPL4 as being the high-ranking genes, only differing in the ranking order. It indicated that a 35-bottlenose dolphin blood sample set with various body conditions could establish reliable HKGs as reference genes. This is the first comparison of sample size effect on reference gene selection to our knowledge. It should be noted that the reference gene identified here is for use in clinical bottlenose dolphin testing. For non-clinical dolphin research, the potential reference gene should be verified first for each experimental condition. Employing a similar approach in other cetacean species in the future would be both time and cost saving.  Table 2). The majority of sequences were obtained from bottlenose dolphin and striped dolphin, while a few were based upon beluga whale, killer whale and fin whale (Balaenoptera physalus). Primers and corresponding UPL probes were designed using Roche UPL design software Scientific RepoRts | 5:15425 | DOi: 10.1038/srep15425 (ProbeFinder, v.2.49) ( Table 2). Primer specificity of the 15 candidate genes was confirmed by PCR using Fast-Run Hotstart PCR kit (Protech) and electrophoresis.

Methods
Quantitative PCR. Quantitative PCR was conducted in 48-well reaction plates using the Eco Real-Time PCR System (Illumina, San Diego, CA, USA). Reactions were prepared in a total volume of 10 μ L containing 3 μ L of 12-fold-diluted cDNA, 0.4 μ L of each 10 μ M primer, 0.2 μ L of UPL probe (Roche, Pleasanton, CA, USA), 5 μ L FastStart Essential DNA Probes Master (Roche) and 1 μ L of RNase/ DNase-free sterile water (Protech). The thermocycle conditions were set as follows: polymerase activation at 95 °C for 10 min, followed by 45 cycles of denaturation at 95 °C for 10 s and combined primer annealing/elongation at 60 °C for 30 s. All reactions including no template controls (NTC) and plate controls were conducted in triplicate. Plate controls contain the same reaction components on every plate. Cq data was analyzed with EcoStudy software (Illumina). A consistent Cq value across plates was obtained allowing the data consolidation from multiple plates into a single study data set. Baseline values were automatically determined for all plates using Eco Software V4.0. Thresholds for each HKG were determined manually ( Table 2). Triplicate Cq values with standard deviation (SD) < 0.5 were averaged as raw Cq values. PCR amplification efficiency (E) and R 2 for each probe and primer pair were calculated from the slope of a standard curve using the following equation: E = (10 (−1/slope) − 1) × 100%. The average of at least three E values for each HKG was used as a gene-specific E for following relative quantity transformation. This study was conducted according to MIQE (Minimum information for publication of quantitative real-time PCR experiments) guidelines 38 .

Data analysis.
Corrected Cq values (Cq corr) were transformed from raw Cq values using Δ Cq formula, Cq corr = Cq min − log 2 E −ΔCq , modified from Fu et al. 39 , where Δ Cq is the Cq value of a certain sample minus the Cq value of the sample with the highest transcript level (lowest Cq, Cq min ) of each HKG. Stability of all HKGs were evaluated and ranked using algorithms geNorm 3 , NormFinder 26 , comparative Δ Ct method 40 and Bestkeeper 41 using the web-based analysis tool RefFinder (http://www. leonxie.com/referencegene.php) 42 . Algorithm geNorm calculates the expression stability value for each gene and them performs a pair-wise comparison of this gene with the others. NormFinder ranks the set of candidate reference genes according to the least of their estimated variations. Comparative Δ Ct method compares relative transcription of pairs of genes and the stability of candidate reference genes is ranked according to repeatability among all samples. BestKeeper determines the standard deviation and the genes are rated based upon variability. RefFinder calculated the geometric mean based upon rankings obtained from each algorithm and provides the final comprehensive ranking. Thirty-five and 55 samples were randomly selected from the original 75 samples, and the HKG ranking results were compared among 35-, 55-and 75-sample groups.