Endogenous reference RNAs for microRNA quantitation in formalin-fixed, paraffin-embedded lymph node tissue

Lymph node metastasis is one of the most important factors for tumor dissemination. Quantifying microRNA (miRNA) expression using real-time PCR in formalin-fixed, paraffin-embedded (FFPE) lymph node can provide valuable information for research on the biology of cancer metastasis. However, a universal endogenous reference gene has not been identified in FFPE lymph node. This study aimed to identify suitable endogenous reference genes for miRNA expression analysis in FFPE lymph node. FFPE lymph nodes were obtained from 41 metastatic cancer and 16 non-cancerous tissues. We selected 10 miRNAs as endogenous reference gene candidates using the global mean method. The stability of the candidate genes was assessed with four statistical tools: BestKeeper, geNorm, NormFinder, and the comparative ΔCt method. miR-103a was the most stable of the candidate genes. However, the use of miR-103a alone was not recommended because its stability value exceeded the reference value. Thus, we combined stable genes and investigated their stability and the effect of normalization. The combination of miR-24, miR-103a, and let-7a was identified as one of the most stable sets of endogenous reference genes for normalization in FFPE lymph node. This study may provide a basis for miRNA expression analysis in FFPE lymph node tissue.

For this reason, appropriate reference miRNAs should be selected for specific experiments, target cells, or tissues of interest. A stable endogenous reference gene should have a high and constant level of expression in all FFPE tissue samples.
In this study, we investigated candidate miRNA reference genes in lymph nodes. Lymph node biopsy is generally performed to diagnose lymph node swelling, particularly when malignancy is suspected, and this procedure is also employed to determine the tissue type of metastatic carcinoma. To date, a miRNA expression profile can be used to identify the tissue of origin of carcinoma of unknown primary thanks to its tissue specificity 8,9 .
The present study aimed to find specific miRNAs that could serve as reliable and reproducible endogenous references for FFPE lymph node tissue and to promote the further application of miRNA analysis. First, 10 miRNAs were selected as endogenous reference gene candidates from 71 small RNAs (Table 1). These consisted of 8 genes (miR-16, miR-24, miR-103a, miR-191, let-7a, U6 snRNA, SNORD44, and SNORD48), which are commonly used as references in cancer studies 3,7,10-12 ; 48 cancer-associated genes, which have been used to identify cancer tissue origin 8 ; and 15 genes of interest, which may be involved in cancer development. Then, the expression stability of the candidate genes was assessed using four statistical tools: BestKeeper 13 , geNorm 14 , NormFinder 15 , and the comparative ΔCt method 16 . Finally, stable gene sets were evaluated by comparing them with other normalization factors such as small nucleolar RNAs.

Results
Candidate genes and stability analysis. Fifty-one of the 71 genes, those expressed in more than 90% of the samples, were used to calculate the internal reference value for the comparative ΔCt method with the global mean normalization strategy (ΔCt Glo ) (Table 1). The top 10 endogenous reference gene candidates, selected for the low variability of their ΔCt Glo values, were miR-16, miR-21, miR-24, miR-34a, miR-92a, miR-103a, miR-148b, miR-152, miR-191, and let-7a. The distribution of raw Ct values of the 10 candidate genes in the control and cancer groups is shown in Fig. 1. After subdivision by the location of the primary tumor, no specific trend was observed in the raw Ct values.
The expression levels of these candidate genes were evaluated using BestKeeper analysis in all samples (Table 2). Among all candidate genes, miR-103a exhibited the best correlation between the BestKeeper Index and the candidate gene (r = 0.963, p ≤ 0.001). This BestKeeper Index was calculated using only seven genes, excluding miR-16, miR-21, and miR-148b owing to their unacceptable variability (SD > 1.05). BestKeeper analysis determined that miR-103a was the most stable gene by combining a low standard deviation and a high correlation.
Using geNorm analysis, all 10 candidate genes had an M value below the recommended threshold of 1.5 (Fig. 2a). This analysis indicated that miR-24 and miR-103a were the most stable genes among the 10 candidate genes (M = 0.699). However, a highly stable gene should have an M value below the reference value of 0.5. geNorm also identified the optimal number of reference genes in terms of the pairwise variation (V) between normalization factors. As shown in Fig. 2a, the use of the five most stable genes was recommended for optimal performance (V value < 0.15) in our study. NormFinder was used to calculate stability values derived from the intra- and inter-group variability. This analysis revealed that among the 10 candidate genes, miR-103a was the most stable gene (Stability value = 0.114) in sample groupings from metastatic cancer and non-cancerous tissues (Fig. 2b). The best combination recommended by NormFinder was a set of miR-148b and miR-152 (Stability value = 0.087).
The comparative ΔCt method was used to identify the variability of all possible gene combinations within each sample. The stability of a gene was assessed by the mean of the standard deviations of the ΔCt values (ΔCt Pair ) over the pairs including the gene. As shown in Fig. 2c, miR-103a was identified as the most stable gene (Mean SD = 0.931) in all the samples, followed by miR-24 (Mean SD = 0.966) and miR-148b (Mean SD = 1.000).
The ranking assessment of expression stability by the four analyses is reported in Table 3. Combining the four analyses by the geometric mean of their rankings suggested that miR-103a was the most stable gene, followed by miR-24 and let-7a.
Stability analysis of the combined genes. NormFinder selected miR-148b and miR-152 as the best combination. In addition, based on the stability rankings shown in Table 3, we chose the combination of two (miR-24/miR-103a) or three (miR-24/miR-103a/let-7a) genes. The stability analysis of these gene sets was performed with the following normalization factors: global mean, miR-16, miR-191, miR-16/miR-345, miR-16/let-7a, and U6/SNORD44/SNORD48. Because U6, SNORD44, and SNORD48 exhibited very low stability compared with the five normalization factors, they were combined to avoid bias in the results. Briefly, U6, SNORD44, and SNORD48 were consistently ranked (8th, 7th, and 6th, respectively) as the least stable genes by the four stability analyses; e.g., BestKeeper indicated a high standard deviation for U6 (1.837) and SNORD44 (1.315) and a low correlation for SNORD48 (r = 0.538).
The overall results of the stability analysis obtained using the statistical tools showed that miR-24/miR-103a/let-7a ranked as the most stable gene combination, whereas U6/SNORD44/SNORD48 was the least stable combination (Table 5).
Effects of normalization on relative quantification. We investigated the impact of normalization on the expression of miR-29c in a metastatic lymph node with colon cancer in comparison with that in a lymph node without cancer. Figure 4 shows the expression of miR-29c normalized to each of the five normalization factors: miR-24, miR-103a, miR-24/miR-103a, miR-24/miR-103a/let-7a and U6/SNORD44/SNORD48.
The expression of miR-29c normalized to miR-24/miR-103a/let-7a significantly decreased in a metastatic lymph node with colon cancer in comparison with that in a lymph node without cancer (p < 0.05). However, this difference was not detected when the expression of miR-29c was normalized to the other four factors.

Discussion
In this study, we presented the use of the mean Ct value of miR-24, miR-103a, and let-7a as a suitable normalization factor for miRNA expression in FFPE lymph node tissue using qRT-PCR.
qRT-PCR is the gold standard method for miRNA quantitation because it is the most sensitive and reproducible method. However, the accuracy of results depends on the selected normalization factor. To obtain accurate data using qRT-PCR, many studies have identified stable reference genes in tissues other than lymph node, such as SNORD48 for atrial or organ tissue samples 17,18 , miR-16/let-7a for breast tissue samples 10 , and miR-16/miR-345 for colorectal tissue samples 19 . Moreover, the use of a single reference gene, unless fully validated, is insufficient to obtain reliable miRNA expression 20,21 .
We had originally planned to identify the origin of the primary tumor from miRNA expression patterns in metastatic lymph node cells using qRT-PCR. However, we noticed that an endogenous reference gene for miRNA expression analysis had not been identified in FFPE lymph node with metastatic cancer. Prior to miRNA expression analysis, a suitable endogenous reference gene should be evaluated to avoid misinterpretation of data and to identify true changes in miRNA expression levels 7 . Therefore, we validated that the combination of three miRNAs could be used as a suitable endogenous reference in a lymph node with metastatic cancer for qRT-PCR. The choice of this miRNA combination is not surprising because miR-24, miR-103a, and let-7a have been reported to exhibit high expression stability in tumor tissue 3,10-12 . This set of endogenous reference genes was determined based on the combination of four statistical approaches: BestKeeper, geNorm, NormFinder, and the comparative ΔCt method. These statistical analyses have been developed to identify optimal endogenous reference genes in a given set of samples. For the selection of candidate reference genes, we used the global mean normalization across 51 genes (the comparative ΔCt Glo method). The global mean normalization can be used when a large number of miRNAs (typically more than 50) are analyzed in a sample. In addition, this normalization assumes that the mean […] 13 , whereas geNorm uses a pairwise comparison approach and assumes that the expression ratio of two ideal endogenous reference genes is identical in all samples, regardless of experimental conditions 14 . geNorm also calculates the optimal number of reference genes. It recommended the use of five reference genes for optimal normalization performance.
However, we restricted our analysis to combinations of up to three genes, because we considered that the use of a larger number of reference genes may be impractical in further studies, such as the identification of tissue origin in cancers of unknown primary. NormFinder can account for heterogeneity among sample groups as it estimates intra- and inter-group variability 15 . NormFinder also allows the calculation of stability values for a set of genes. It is important to note the imbalance in the number of samples between the control and the cancer groups. This imbalance may have affected the NormFinder analysis, for which balanced populations are generally required. The comparative ΔCt Pair method compares the relative expression of 'pairs of genes' within each sample for all candidate gene combinations 16 .
The stability of the selected gene sets was evaluated alongside other normalization factors (i.e., global mean, miR-16, miR-191, miR-16/miR-345, miR-16/let-7a, U6/SNORD44/SNORD48). These factors are commonly used as references for miRNA quantitation in cancer tissue 7,10,12,19 . Based on the initial analysis, we combined U6, SNORD44, and SNORD48. These three genes are well-known reference genes in miRNA analysis; however, their use has been reported to involve high variability in some experimental conditions 12 .
Consequently, these multiple analyses revealed that the combination of miR-24/miR-103a/let-7a was the most stable endogenous normalization factor in our experimental conditions. miRNA expression stability must be carefully assessed in each specific experimental setting, because these miRNAs might themselves be associated with lymph node metastasis. In fact, the aberrant expression of miR-24 (higher expression) and let-7a (lower expression) in breast cancer has been shown to play a key role in tumor invasion and metastasis 22,23 . Therefore, using a combination of three miRNAs is important for avoiding erroneous results.
We also quantified the expression levels of miR-29c in a metastatic lymph node with colon cancer to evaluate the normalization efficiency by four different endogenous factors. A significant downregulation of miR-29c expression was observed in the metastatic lymph node when data were normalized to the combination of miR-24/miR-103a/let-7a. The downregulation of miR-29c expression has been reported in various cancers including colon cancer 24,25 . Although the function of miR-29c expression in colon cancer has not been clarified, miR-29c exhibits a tumor suppressor function in several types of cancer by targeting TNFAIP3 and cyclin E, thereby inhibiting tumorigenesis and metastasis 25-27 . Thus, the use of the other normalization factors, which did not reveal the significant downregulation of miR-29c, should be discouraged for FFPE lymph node studies because it may lead to incorrect results.
To our knowledge, this is the first study to determine reference genes in FFPE lymph node tissues from patients with different kinds of cancer. Our result suggests the importance of assessing the gene stability for each experimental condition.
We acknowledge some limitations of the present study: the relatively small sample size; the qRT-PCR assay based on SYBR Green, which is not as specific as the TaqMan method; the absence of replicates in the determination of Ct values; and the fact that the analysis of combined genes involved repeated elements (e.g., miR-24/miR-103a and miR-24/miR-103a/let-7a) with potential correlations among sets, which may have affected the analyses. Although this study assessed the stability of 71 genes including the most commonly used reference genes, more suitable reference gene combinations may be identified in the future. Our results apply to FFPE lymph node tissue in humans; they do not preclude the use of common reference genes in other experimental conditions.

Conclusion
We identified a suitable endogenous reference that can be used to study miRNA expression in FFPE lymph node tissue from patients with metastatic cancer. This result will provide valuable information for future miRNA expression studies in FFPE lymph node tissue samples with metastatic cancer.

Methods
Tissue samples and ethical statement. FFPE lymph node tissue samples were obtained from 41 metastatic cancer and 16 non-cancerous tissues at Ibaraki Prefectural Central Hospital, Japan. The characteristics of the samples are presented in Table 6.
This study was approved by the institutional review board of Ibaraki Prefectural Central Hospital, Japan. Informed consent was obtained for the use of archived clinical specimens, and all experimental methods were performed in accordance with the relevant guidelines and regulations.
Total RNA isolation from FFPE lymph node tissue. Total […]
Quantitative real-time PCR. Total RNA (10 ng) isolated from FFPE tissue samples was reverse transcribed using miRCURY LNA Universal RT cDNA synthesis kit II with RNA spike-in (Exiqon) in a 10 µL reaction volume for 60 min at 42 °C and 5 min at 95 °C, according to the manufacturer's instructions. The reverse transcription product (cDNA) was diluted 100-fold and quantified by qRT-PCR using an ExiLENT SYBR Green master mix (Exiqon). qRT-PCR was performed in custom-made 96-well Pick & Mix microRNA PCR panel plates (Exiqon) on a 7500 Fast Dx Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). These PCR panel plates consisted of the 71 candidate genes (a single well for each gene) and the following quality controls: UniSp3 (triplicate), UniSp6 (single), and cel-miR-39 (single). The PCR protocol was as follows: incubation for 10 min at 95 °C, followed by 40 cycles of 10 s at 95 °C and 1 min at 60 °C, with a final melting curve analysis. The Ct values for qRT-PCR were determined using the SDS v1.4 software (Applied Biosystems) and the single-threshold method. Ct values that displayed unusual amplification curves (e.g., low amplification efficiency) were excluded from further analysis.
Selection of candidate genes and stability analysis. Ten candidate genes were selected based on the comparative ΔCt method with the global mean as the endogenous control. The global mean value was calculated as the mean Ct value of the genes detected in a sample. Only highly expressed genes (expressed in more than 90% of the samples) were selected for the global mean method. For these genes, the normality of the Ct values was tested with the Shapiro-Wilk test. Global mean normalization has been demonstrated to be one of the most accurate approaches for miRNA expression analysis 20 . The ΔCt Glo value was calculated by the following equation: ΔCt Glo = Target gene Ct − Global mean Ct. In this analysis, gene stability was assessed by the standard deviation of the ΔCt Glo values of each gene across all samples.
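The global mean step described above can be illustrated with a minimal Python sketch. The Ct values, the function names `delta_ct_glo` and `stability_by_sd`, and the data layout are hypothetical and chosen only for illustration; the sketch also omits the filter for genes expressed in more than 90% of the samples and is not the software used in this study.

```python
from statistics import mean, stdev

# Hypothetical Ct values: one dict per sample, gene -> Ct (None = not detected).
samples = [
    {"miR-103a": 27.1, "miR-24": 25.0, "miR-21": 22.3},
    {"miR-103a": 28.0, "miR-24": 26.1, "miR-21": 24.9},
    {"miR-103a": 26.5, "miR-24": 24.6, "miR-21": 21.0},
]

def delta_ct_glo(sample):
    """Return gene -> (target gene Ct - global mean Ct) for one sample."""
    detected = {g: ct for g, ct in sample.items() if ct is not None}
    global_mean = mean(detected.values())
    return {g: ct - global_mean for g, ct in detected.items()}

def stability_by_sd(samples):
    """Per-gene standard deviation of dCt Glo across samples (lower = more stable)."""
    per_sample = [delta_ct_glo(s) for s in samples]
    # Only genes detected in every sample are compared in this sketch.
    genes = set.intersection(*(set(d) for d in per_sample))
    return {g: stdev([d[g] for d in per_sample]) for g in genes}

sd = stability_by_sd(samples)
ranking = sorted(sd, key=sd.get)  # most stable gene first
```

With these invented values the gene with the widest spread of ΔCt Glo ends up last in `ranking`; real data would of course use many more genes and samples.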
BestKeeper 13 analysis determines the most stably expressed gene based on the Pearson correlation coefficient (r) of the BestKeeper Index, which is the geometric mean of Ct values of candidate genes. Based on the raw Ct values of each sample, standard deviation was calculated, and high standard deviations (>1) were considered inadequate. geNorm 14 analysis evaluates the stability of candidate reference genes based on the average pairwise variation of a gene compared with that of all other genes. Next, it identifies the optimal number of reference genes required by analyzing the pairwise variation (Vn/n+1) among candidate genes. The gene stability value (M value) and the pairwise variation value (V value) were calculated with the geNorm software using 2^(−ΔCt Min) values. The ΔCt Min value was calculated by the following equation: ΔCt Min = Target gene Ct − Min Ct, where Min Ct is the lowest Ct value of a candidate gene. The lowest M value indicates the most stable expression, and values under 0.5 indicate an acceptably stable expression. The number of reference genes was considered optimal when the V value was below 0.15.
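As a rough illustration of the M value described above, the following Python sketch computes, for each gene, the average standard deviation of its log2 expression ratios against every other gene, starting from 2^(−ΔCt Min) quantities. The Ct values and the function names `relative_quantities` and `genorm_m` are hypothetical; the sketch does not reproduce the stepwise gene exclusion or the V value calculation of the geNorm software.

```python
from math import log2
from statistics import mean, stdev

# Hypothetical Ct values: gene -> list of Ct values over the same samples.
ct = {
    "miR-103a": [27.1, 28.0, 26.5, 27.4],
    "miR-24":   [25.0, 26.1, 24.6, 25.2],
    "let-7a":   [23.9, 24.6, 23.1, 24.5],
    "U6":       [20.0, 23.5, 19.2, 22.8],
}

def relative_quantities(ct_values):
    """2^-(Ct - Min Ct): the sample with the lowest Ct gets quantity 1."""
    lowest = min(ct_values)
    return [2 ** -(c - lowest) for c in ct_values]

def genorm_m(ct):
    """Average pairwise variation (M) of each gene against all other genes."""
    rq = {g: relative_quantities(v) for g, v in ct.items()}
    m = {}
    for j in rq:
        sds = []
        for k in rq:
            if k == j:
                continue
            log_ratios = [log2(a / b) for a, b in zip(rq[j], rq[k])]
            sds.append(stdev(log_ratios))
        m[j] = mean(sds)
    return m

m_values = genorm_m(ct)
least_stable = max(m_values, key=m_values.get)
```

A lower M indicates more stable expression; in this invented data set the gene whose Ct values fluctuate independently of the others receives the highest M.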
NormFinder 15 analysis is based on an ANOVA model that considers intra- and inter-group variability to evaluate the expression stability of a candidate gene. In this study, we focused on expression variation between the metastatic cancer and non-cancerous tissue. Exponentially transformed data (2^(−Ct) values) were used as input data in the NormFinder software. The lowest stability value indicates the most stably expressed gene.
In the comparative ΔCt method 16 , we calculated the ΔCt Pair values for each pair of reference gene candidates, and assessed the stability of each gene using the mean of the standard deviations obtained from all the pairwise comparisons including the gene.
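The pairwise calculation described above can be sketched in a few lines of Python. The gene names, Ct values, and the function name `mean_pairwise_sd` are hypothetical and serve only to show the arithmetic: for every pair of genes the standard deviation of the within-sample Ct difference is computed, and each gene is scored by the mean of the standard deviations of the pairs that include it.

```python
from itertools import combinations
from statistics import mean, stdev

def mean_pairwise_sd(ct):
    """Gene -> mean SD of dCt Pair over all pairs that include the gene."""
    sds = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        # dCt Pair within each sample, then its SD across samples.
        pair_sd = stdev([x - y for x, y in zip(ct[a], ct[b])])
        sds[a].append(pair_sd)
        sds[b].append(pair_sd)
    return {gene: mean(values) for gene, values in sds.items()}

# Hypothetical Ct values: gene -> list of Ct values over the same samples.
example = {
    "gene_a": [20, 21, 22],
    "gene_b": [22, 23, 24],  # constant offset from gene_a
    "gene_c": [20, 22, 21],  # varies independently
}
scores = mean_pairwise_sd(example)
```

In this toy example gene_a and gene_b move together, so their pair contributes a standard deviation of zero and both score lower (more stable) than gene_c.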
The overall stability ranking of candidate genes was determined using the geometric mean of the rankings generated from all four analyses.
Assessment of stability for combined genes. In addition to the best reference gene, NormFinder also identified the best combination of two genes. Furthermore, based on the overall stability ranking, combinations of two or three genes were determined as endogenous reference candidates. To evaluate the stability of combined genes, stability analyses were performed together with other normalization factors (i.e., global mean, miR-16, miR-191, miR-16/miR-345, miR-16/let-7a, U6 snRNA, SNORD44, and SNORD48).
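The overall ranking step (geometric mean of the ranks assigned by the four analyses) can be illustrated as follows. The gene names and rank values are invented for illustration and are not the values reported in Table 3; `statistics.geometric_mean` requires Python 3.8 or later.

```python
from statistics import geometric_mean

# Hypothetical ranks (1 = most stable) in the order:
# BestKeeper, geNorm, NormFinder, comparative dCt.
ranks = {
    "gene_A": [1, 1, 1, 1],
    "gene_B": [2, 1, 3, 2],
    "gene_C": [3, 4, 2, 3],
}

# Geometric mean of the four ranks per gene; lower = more stable overall.
overall = {gene: geometric_mean(r) for gene, r in ranks.items()}
order = sorted(overall, key=overall.get)
```

The geometric mean dampens the influence of a single poor rank compared with the arithmetic mean, which is one plausible reason for preferring it when combining rankings from different tools.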

Effect of normalization. To evaluate the effectiveness of the endogenous reference chosen in this study, the expression level of miR-29c was measured using various reference genes. Relative expression levels were reported as 2^(−ΔCt). The Mann-Whitney U test was used to determine statistically significant differences in expression levels between the metastatic cancer and non-cancerous tissue. Statistical analysis was performed with GraphPad Prism 6.03 (GraphPad Software, San Diego, CA, USA). P-values of <0.05 were considered statistically significant.
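As a minimal sketch of the 2^(−ΔCt) calculation, the following Python example normalizes a target Ct to the mean Ct of a set of reference genes, as done in this study for miR-24/miR-103a/let-7a. The Ct values and the function name `relative_expression` are hypothetical; the group comparison by the Mann-Whitney U test is not reproduced here.

```python
from statistics import mean

REFERENCE_GENES = ("miR-24", "miR-103a", "let-7a")

def relative_expression(sample_ct, target, references=REFERENCE_GENES):
    """Return 2^-(target Ct - mean Ct of the reference genes) for one sample."""
    normalization_factor = mean(sample_ct[g] for g in references)
    delta_ct = sample_ct[target] - normalization_factor
    return 2 ** -delta_ct

# Hypothetical Ct values for a single sample.
sample = {"miR-29c": 28.0, "miR-24": 25.0, "miR-103a": 27.0, "let-7a": 26.0}
value = relative_expression(sample, "miR-29c")
```

Here the reference mean is 26.0, so ΔCt = 2.0 and the relative expression is 0.25. Passing a single gene in `references` gives the single-reference normalization used for comparison.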
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.