Validation of reference genes for use in untreated bovine fibroblasts

Proper normalization of RT-qPCR data is pivotal to the interpretation of results and accuracy of scientific conclusions. Though different approaches may be taken, normalization against multiple reference genes is now standard practice. Genes traditionally used and deemed constitutively expressed have demonstrated variability in expression under different experimental conditions, necessitating the proper validation of reference genes prior to utilization. Considering the wide use of fibroblasts in research and scientific applications, it is imperative that suitable reference genes for fibroblasts of different animal origins and conditions be elucidated. Previous studies on bovine fibroblasts have tested limited genes and/or samples. Herein, we present an extensive study investigating the expression stability of 16 candidate reference genes across 7 untreated bovine fibroblast cell lines subjected to controlled conditions. Data were analysed using various statistical tools and algorithms, including geNorm, NormFinder, BestKeeper, and RefFinder. A combined use of GUSB and RPL13A was determined to be the best approach for data normalization in untreated bovine fibroblasts.

and Waldenström, ACTB and GAPDH continue to be the most widely used RGs, with 62% of studies using one of these genes as the single normalizing RG, of which only 8% did the due diligence of evaluating stability prior to use 21 . This is despite evidence showing that the expression stability of both ACTB [22][23][24][25][26] and GAPDH 24,[27][28][29] are not impervious to experimental conditions. A further examination of RG stability validation studies showed a significant (P < 0.001) and inverse relationship between the number of RGs screened and the probability that ACTB, GAPDH, or 18S rRNA were selected for normalization; where the likelihood of these three genes being selected is significantly decreased when more RGs are tested 21 . The importance of proper RG selection has been demonstrated by numerous studies reporting significant differences in results owing to normalization against RGs of varying stability 23,30,31 . It is unlikely that any universal RGs exist, and it is therefore important for the suitability of RGs to be experimentally validated for a given set of samples and conditions prior to use.
Optimal RGs have been reported for RT-qPCR use in various bovine cells 25,26,29,[32][33][34] , including fibroblasts 31,35 . Interestingly, Zhou et al. reported larger variation when only one RG (ACTB) was used to calculate target gene expression compared to two RGs 31 . However, limitations in these studies, including small sample size (n = 1) 35 , number of RGs tested (5) 31 , and the use of at most two RG determination algorithms 31 , may have impacted the identification of optimal RGs for bovine fibroblast cells. The current study addresses a gap in knowledge concerning suitable RGs for use in RT-qPCR studies investigating untreated bovine fibroblast cell lines. We evaluated the expression stability of 16 candidate RGs across 7 untreated bovine fibroblast cell lines grown under controlled conditions and standardization by morphology and growth kinetics. Candidate genes were selected based on an extensive review of the literature (Supplementary Table S1) and included RGs previously described in fibroblast and/or bovine studies, as well as the classical RGs (Table 1). Special consideration was given to select genes from various pathways and functional classes to avoid co-regulation. Data were analysed using the most common RG determination methods: geNorm 36 , NormFinder 37 , BestKeeper 38 , and RefFinder 39 , which integrates the algorithms of the first three methods, as well as delta Ct 40 . The algorithms for each differ and, as such, their combined application should improve confidence in the selection of ideal candidate RGs when there is congruence. To our knowledge, this is the most extensive validation of RGs for use in untreated bovine fibroblasts undertaken to date.

Range of fluctuation of RGs determined by box-and-whiskers plot.
An extensive evaluation of the literature describing the use of RGs, including those specific to fibroblast and/or bovine cells, culminated in the selection of 16 candidate RGs for inclusion in this study. Following RT-qPCR, gene expression variability (Ct) was assessed for each of the candidate genes ( Fig. 1). As a measure of stability, the range of fluctuation of each gene was determined by finding the difference between 25 and 75th percentiles, the interquartile range. YWHAZ, PPIA and HSP90AB1 displayed the greatest variability, with ranges of fluctuation of 0.5798, 0.4909, and 0.4500 respectively, while the most stable genes were RPL13A, GUSB and TBP with ranges of 0.2368, 0.2442, and 0.2691, respectively. geNorm analysis of RG stability. The stability of the candidate RGs was analysed using four statistical methods. The algorithm, geNorm, is a comprehensive tool, performing an initial measure of expression stability (M) value assessment for all genes, followed by a stepwise exclusion analysis, wherein the least stable gene is eliminated, and the remaining genes are reassessed for M values. Although M values for all genes fell within range for inclusion (M < 0.5), B2M, HPRT1, and RAD50 had the lowest stability while GUSB, RPL13A, and www.nature.com/scientificreports/ SDHA had the highest stability (Fig. 2a). Pairwise variation (V) values were also calculated to determine the suitable number of RGs required for untreated bovine fibroblast gene expression studies using RT-qPCR (Fig. 2b). Low pairwise variation (V) value for V2/3 = 0.0382 indicated that the two most stable RGs (RPL13A and GUSB) were suitable for data normalization and that adding a third reference gene would not provide additional value 41 . Though geNorm shows bias in selecting co-regulated genes, RPL13A and GUSB are not co-regulated (see Supplementary Tables S2 and S3), further corroborating their suitability as reference genes.
NormFinder analysis of RG stability. Gene stability rankings were compared between the Microsoft Excel plug-in and R version of NormFinder (see Supplementary Table S4). While the former generated stability values, the latter computed GroupSD values for individual genes and combinations of two genes to only two decimal places, precluding the determination of gene rankings with certainty. However, rankings of both versions generally coincided: based on the output, B2M, HPRT1, and RAD50 showed the lowest stability, while HMBS, GUSB, and TBP had the highest stability. The most stable combination of genes was GAPDH and YWHAZ (0.03),  www.nature.com/scientificreports/ followed closely by ACTB and SF3A1, GAPDH and SF3A1, GUSB and PPIA, and HMBS and RPL13A (0.04) ( Table 2).

BestKeeper analysis of RG stability.
BestKeeper generated a set of descriptive statistics (Table 3), where genes falling short of certain cut-offs (standard deviation [± Ct] > 1 and/or standard deviation [± x-fold] > 2) were omitted from the subsequent BestKeeper index calculation that determines coefficients or correlation [r] and p values ( Table 4). All genes satisfied the initial criterion and were kept in the subsequent analysis. Based on both standard deviation values (± Ct/ ± x-fold), YWHAZ, PPIA and SF3A1 displayed the lowest stability, while TBP, SDHA and RPL13A had the highest stability. Using the coefficient of correlation [r] value, GAPDH, SF3A1 and RPL13A displayed the lowest stability while ACTB, YWHAZ and PPIA showed the highest stability.

RefFinder analysis of RG stability.
RefFinder assesses gene stability based on the algorithms of geNorm, NormFinder, BestKeeper, and delta Ct, and generates a comprehensive ranking based on the geometric mean of ranking values derived from the four other algorithms. As RefFinder does not specify input requirements, results derived from the use of median and mean Ct values were compared. Remarkably, for all four imbedded algorithms and the comprehensive ranking, the top four most stable genes were the same between the median Ct and mean Ct datasets, albeit in slightly different orders for geNorm, NormFinder and the comprehensive rankings.   Overview of rankings from algorithms. The results generated by the box-and-whiskers analysis, geNorm, NormFinder, RefFinder, and BestKeeper have been summarized in Table 5.

Discussion
The selection of suitable RGs for RT-qPCR data normalization is pivotal to the integrity and reliability of results 18 . The present study aimed to identify stably expressed RGs for use in RT-qPCR normalization of untreated bovine fibroblasts by assessing a panel of 16 candidate RGs using numerous algorithms and statistical tools. Several considerations were made in selecting the most optimal RG in this instance. Overall, rankings by different algorithms were generally in agreement, except for BestKeeper, which consistently generated different outcomes. The differences in BestKeeper outcomes, as compared to other RG determination algorithms, may be explained, at least partly, by the use of raw Ct values as input (as compared to the RQ values required by geNorm and NormFinder). Rydbirk and colleagues offer an alternate explanation, referring to the BestKeeper Index, a tool unique to the eponymous software, as the source of this discrepancy in outcome 42 . The BestKeeper Index is calculated from the geometric mean of all RGs satisfying the initial exclusion criterion (standard deviation [± Ct] > 1) and is subsequently used to generate Pearson correlation (r) values (coefficient of correlation) for each RG in comparison to this index, thereby determining expression stability. Setting aside BestKeeper results, some genes displayed consistently poor rankings (B2M, HPRT1, HSP90AB1, RAD50, RSP18, SF3A1, and UBC), while others consistently showed higher expression stability and the tightest range of Ct values (GUSB, HMBS, TBP, and RPL13A). A summary of rankings can be found in Table 5. Regarding similarities between the same algorithms used in different software, both versions of NormFinder were largely in agreement. RefFinder outputs and those of their original counterparts were also highly similar (other than BestKeeper, since RefFinder ranks RGs according to SG, and not the BestKeeper Index). Results from the mean and median were comparable, with small differences observed mostly between genes that do not show much difference in stability ranking. www.nature.com/scientificreports/ Considering each algorithm uses a different approach and evaluates different parameters when determining RG expression stability, it is expected that variations in the results will be observed, though many studies have reported general consensus between algorithms 43 . As such, difficulties arise when trying to reconcile different rankings and several considerations must be made in selecting optimal RG(s), including the (i) overlap of rankings by various algorithms, (ii) strengths and weaknesses of each algorithm, as applicable to the experimental design of each study, and (iii) use of different complementary tools for merging results into a comprehensive ranking. GeNorm's pairwise correlation is known to be a robust algorithm for studies with small sample sizes but shows bias towards selecting RGs that are co-regulated. NormFinder's model-based approach confers the advantage of differentiating intragroup variation from intergroup variation, and as such, is a suitable tool for the assessment of RGs in experiments with different sample groups, but requires larger sample sizes (> 8) in comparison to geNorm 37,43,44 . Different tools have been reported by investigators to merge results of different algorithms into a comprehensive ranking, the most well-known of which is RefFinder 39 . Concerns have been raised about the validity of this software mainly based on the fact that RefFinder does not integrate primer efficiencies in calculations, and that its BestKeeper output is solely based on the initial standard deviation [± Ct] values and not the BestKeeper Index. Another limitation of RefFinder is that results of the four methods are not weighed given the unavailability of their cut-offs and applicable weights 45 . Lastly, cycle threshold coefficient of variation (Ct CV%) 46 can also be used to determine comprehensive ranking whereby a lower Ct CV% is indicative of higher stability and is one of the parameters found in the initial descriptive statistics generated by BestKeeper. Though this presents a simple approach in assessing RG stability, Ct CV% should be considered carefully, as it is not always inversely correlated to the degree of stability of a given gene 47 . Selection of the most suitable algorithm for a given experimental design must take into account the different strengths and weaknesses as described above. In our case, given our small sample size, geNorm was the most suitable algorithm, but the results from all algorithms were taken into consideration before selecting the optimal RGs.
The panel of genes evaluated showed good stability (e.g. all genes satisfied geNorm's M value criterion of 0.5, NormFinder's standard deviation [± Ct] of 1, and geNorm's V value criterion of 0.15), compared to the range of stabilities seen for candidate RGs assessed in other studies 26,32 . This is to be expected due to the set of standardized untreated fibroblasts used, with average M values between 0.2-0.5 typically being observed when assessing candidate RGs across a homogeneous sample set 41 . GUSB showed the most consistently stable expression amongst all RGs, ranking first or second in 12 of the 16 rankings. The use of a single RG, however, is discouraged and can create variability in the expression of the target genes independent of experimental treatments 31 , including more than threefold deviation from true values in 25% of cases, and at least sixfold in 10% of cases 36 . The appropriate number of RGs required must be experimentally validated. GeNorm determines the number and combinations of genes showing stability while NormFinder evaluates all possible pairs of genes. In our study, according to geNorm, the best approach was a combined use of GUSB and RPL13A, which was corroborated by both versions of NormFinder, where this combination, though tied for 9th in rankings, showed greater stability than the number 1 ranked individual genes HMBS and GUSB (Group SD of 0.06 vs 0.08, respectively). Further investigation of GUSB and RPL13A revealed no-coregulation between these two genes, an important consideration when selecting RGs 37,43,44 . Individually, GUSB and RPL13A show high and medium/medium-high expression stabilities, respectively. It is their combined expression (geometric mean of expression) that shows high stability Table 5. Compilation of rankings of candidate RGs by all statistical tools and algorithms. *The six genes most frequently ranked with the lowest stability were omitted from analysis using BestKeeper, which is limited to evaluating only 10 genes. www.nature.com/scientificreports/ and was therefore determined as the optimal approach for data normalization in untreated bovine fibroblasts. These two genes also display the smallest range of Ct values (Fig. 1). Previous RG validation in bovine fibroblasts concluded the combined use of ACTB and YWHAZ as the most suitable data normalization approach (using geNorm and NormFinder), and showed that data normalization against one RG (ACTB; which showed high expression stability within the genes assessed) can yield results deviating from true values 31 . However, this study only assessed five candidate genes using RNA from treated (serum-starved cells) and co-regulation between ACTB and YWHAZ did not appear to be considered 31 . The information provided here addresses a gap in knowledge concerning suitable reference genes for use in RT-qPCR studies of untreated bovine fibroblast cell lines, a commonly studied species. This type of validation, though essential, is costly and time consuming. To this end, the current study was designed as the most extensive RG validation in untreated fibroblast cell lines to date. Our results provide targeted information for future RT-qPCR studies, alleviating the researcher of the burden of repeating the validation. For example, the results presented herein were used to normalize gene expression when comparing inherent expression between various bovine fibroblast cells lines 48 . The limitation of this paper, however, is that any deviations applied in future studies (i.e. different experimental treatments, cell culture conditions), will warrant novel validation of suitable reference genes. However, investigators can use the reference genes found suitable herein as a starting point, or broaden their evaluation to include any of the 16 candidate genes carefully selected from the literature and further tested for co-regulation. Finally, we also highlight the varying strengths and weaknesses of the most published algorithms, and recommend using a multitude of approaches when evaluating gene stability to ensure accurate RT-qPCR normalization under your given experimental conditions.
We conclude that the combined use of GUSB and RPL13A presents the best approach in normalizing RT-qPCR data in untreated bovine fibroblasts. Though these genes have been less frequently used for normalization, they have previously been shown as suitable RGs in human adipose tissue-and Wharton's Jelly-derived mesenchymal stromal cells (RPL13A) 24 , human nasal epithelium cells (GUSB) 49 , human papillary thyroid carcinoma (GUSB) 50 , human dermal fibroblasts and mammary epithelial cells (GUSB) 51 , rat bone mesenchymal stem cellderived neuronal cells (RPL13A) 52 , and treated human ovarian cancer cells (RPL13A) 53 .

Materials and methods
Tissue processing and fibroblast cell culture. Ears from healthy, age-matched (< 30 mos.) Angus bulls were removed post-mortem at the local abattoir, covered with sterile saline and transported to the lab for processing within 4 h of collection. Tissue samples (n = 7) were processed, and fibroblast cells cultured as previously described 54 with minor modifications. Previous experiments have shown comparable outcome between sample sizes of n = 11 and n = 6 43 . Given the homogeneity of the samples (i.e. untreated bovine fibroblasts), it was deemed that n = 7 was adequate for the determination of suitable reference genes. All materials and reagents were obtained from Thermo Fisher Scientific (Mississauga, ON, Canada) unless otherwise noted. Briefly, tissue samples spanning the entire thickness of the ears were collected using an 8 mm biopsy punch, the skin was discarded, and the remaining cartilage was manually dissected into 1 mm 2 pieces before being collagenase-digested (Sigma-Aldrich, Oakville, ON, Canada) for 5 h at 38.5 °C and 5% CO 2 . Cells and tissue fragments were pelleted by centrifugation at 200 rcf for 5 min, resuspended in Dulbecco's Modified Eagle's Medium (DMEM; Sigma-Aldrich, Oakville, ON, Canada) supplemented with 20% fetal bovine serum (FBS; Sigma-Aldrich, Oakville, ON, Canada) and 1% antibiotic-antimycotic (ABAM), and cultured at 38.5 °C and 5% CO 2 in 25 cm 2 tissue culture flasks. Media were partially and fully replaced on days 4 and 6, respectively, until confluency, at which point cells were passaged at a dilution of 1:4 into DMEM containing 10% FBS and 1% penicillin-streptomycin (P/S). After 2 passages (P3), confluent cell lines were trypsin-harvested (0.25% trypsin-EDTA), diluted 1:4 in freezing medium containing DMEM supplemented with 20% FBS, 10% dimethyl sulfoxide (DMSO; Sigma-Aldrich, Oakville, ON, Canada), and 1% P/S, and cryopreserved. For consistency, only one ear was processed per day and all steps were conducted by the same operator. Cell lines that did not reach confluency within 9-days from the initiation of culture or contained > 10% cells with abnormal morphology (assessed visually) were discarded. All cell lines were authenticated for cell type homogeneity, chromosome normality (2n = 60) and sex, as described by Toorani et al. 48 . Inclusion and exclusion criteria were pre-established.
RNA extraction and cDNA synthesis. Fibroblasts were thawed, cultured to confluency (P4), pelleted, and frozen at − 80 °C for RNA extraction using a Total RNA Purification Kit (Norgen Biotek Corp., Thorold, ON, Canada) following manufacturer's instructions, including an on-column gDNA removal step (RNase-Free DNase I kit, Norgen Biotek Corp., Thorold, ON, Canada). Isolated RNA was quantified using the Qubit RNA HS Assay Kit. RNA integrity was assessed visually (28S and 18S rRNA) following separation on 1% E-GEL, EX-Agarose gels using the Invitrogen E-Gel Power Snap Electrophoresis System.
Reverse transcription (RT) was performed using 1000 ng of RNA, and in accordance with, Norgen Biotek Corp. 's TruScript First Strand cDNA Synthesis Kit (Thorold, ON, Canada). Samples were stored at − 20 °C before proceeding to reverse transcription quantitative real-time PCR (RT-qPCR).
Selection of candidate genes. A panel of 16 candidate RGs was selected based on an extensive review of the literature (see Supplementary Table S1). Classical RGs were chosen, as well as others used or validated for use in bovine or fibroblast cells. Special consideration was taken to select genes from different functional classes and pathways to avoid co-regulation, which may skew the output of some of the algorithms used in RG analysis 37 55 based on data available for humans, mice, and rats from Affymetrix, Illumina, and Agilent (see Supplementary Tables S2 and S3). To correct for interplate variation one sample was repeated on all plates. All Qiagen RT 2 primers have been previously validated and optimized to work at uniform and near perfect efficiency to allow for simultaneous analysis of multiple genes in arrays. Primers are designed using proprietary algorithms and are subjected to rigorous testing for high performance: the average amplification efficiency of a representative set of assays for 4000 genes used in RT 2 Arrays was shown to be 99%, with a 95% confidence interval about the mean between 90 and 110% 56 . Therefore, a base of 2 was used for relative quantity calculations 38,43,57 . Data analysis. Ct values and baseline corrections to account for inter-plate gene expression variation were calculated using Thermo Fisher Connect (Thermo Fisher Scientific, Applied Biosystems, Foster City, CA, USA).

Reverse Transcription Quantitative Real-time PCR (RT-qPCR
Relative quantification values (RQ) were calculated using RQ = E (min Ct − sample Ct) where E, the amplification efficiency, is 2 (assume 100% efficiency for commercially designed primers) 38,43,57 , min Ct is the sample with the lowest Ct value (highest expression) and sample Ct is the Ct of each sample. This will result in a set of relative quantities between 1 and 0, where the sample with the lowest Ct is 1. The calculation (min Ct-sample Ct) is also referred to as ΔCt.
The calculated stability of candidate RGs and their relative ranking was compared across four statistical methods, geNorm, NormFinder, RefFinder, and BestKeeper. Calculated RQ of mean Ct values were input in geNorm (Version 3.5) to determine expression stability (M) for each gene, as well as the optimal number of RGs required by comparing pairwise variation (V) 36 . Accepted cut-off values are M < 0.5 and V < 0.15; when the Vn/ Vn + 1 ratio is less than 0.15, it indicates that the optimum number is n 36 . Two versions of NormFinder, Microsoft Excel add-in (Version 0.953) and NormFinder for R (Version 5, 2015-01-05), were evaluated using RQ values of median Ct as input for both versions to calculate stability for individual genes and gene combinations (https:// moma. dk/ normfi nder-softw are) 37 . RefFinder (https:// www. heart cure. com. au/ reffi nder) 39 evaluates expression stability based on the algorithms of geNorm, NormFinder, BestKeeper, and delta Ct, generating a comprehensive ranking based on the geometric mean of ranking values. Input data format (median or mean Ct values) was not specified and so results from both methods were evaluated. Finally, BestKeeper (Version 1) requires input of raw median Ct values to determine the expression stability by using pairwise correlation analysis of all pairs of candidate genes 38 . Standard deviations, coefficients of correlation [r] and p-values determined for each gene are used as measures of RG stability, with coefficients of correlation being the superior metric. BestKeeper allows for the assessment of up to 10 candidate RGs, therefore, the 6 genes most often ranking poorly according to the box-and-whiskers assessment and algorithms geNorm, NormFinder and RefFinder (UBC, RAD50, HPRT1, B2M, HSP90AB1, and RPS18) were omitted from this analysis.