Selection of Endogenous Control Reference Genes for Studies on Type 1 or Type 2 Endometrial Cancer

A panel of 32 candidate reference genes was used to identify the most stable genes for gene normalisation in quantitative RT-PCR studies using endometrial biopsies obtained from women with endometrial cancer (type 1 or type 2) and without cancer (controls). RNA from the biopsies was isolated, examined for purity and quality, and then reverse transcribed into cDNA before being subjected to real-time qRT-PCR analysis in triplicate using the TaqMan Gene Expression Assay kit. The most ‘stable’ endogenous control genes were then identified using the geNorm qbase + 2 and NormFinder software packages. PSMC4, PUM1 and IPO8 were identified as the best reference gene combination for type 1 endometrial cancer (grades 1, 2 and 3), whereas for type 2 endometrial cancer (serous and carcinosarcoma), UBC, MRPL19, PGK1 and PPIA were the best reference gene combination. We conclude that the use of these normaliser combinations should provide accurate interpretation of gene expression at the transcript level in endometrial cancer studies, especially for types 1 and 2 cancers.

In earlier qRT-PCR studies of endometrial cancer, normalisation was done using reference genes, such as β-actin (ACTB) 21,22 , glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 23,24 , or 18S RNA ribosomal unit 1 (18S rRNA) 25 , chosen without rigorous testing. Previously, we demonstrated that all of the aforementioned genes are poor reference genes for the study of gene expression changes that occur in the two main types of endometrial cancer 5 . In our previous studies, we showed that a combination of 3 genes (IPO8, MRPL19 and PPIA) provided the best combination of normalisation factors in qRT-PCR studies of endometrial cancer and that geNorm qbase+2 26 is more robust than the other software packages in its statistical corrections for all the possible sources of experimental error listed above. Although this conclusion remains applicable and supportable when comparing mRNA levels in type 1 endometrial cancer with type 2 endometrial cancer in the same gene expression study, recent studies have identified additional targeted biomarkers for individualised endometrial cancer. In those studies, the authors advocate the linking of transcript studies directly to personalised treatments, without indicating what the correcting reference genes for such studies should be 13 . This is especially important because endometrial cancer is becoming more prevalent in women of reproductive age 10 , a point that has been missed in those recommendations 1,12 .
The aim of this study was therefore to identify the best reference gene combinations for normalisation when qRT-PCR is used to quantify the number of gene transcripts in endometrial tissues from women without cancer and from women with either type 1 or type 2 endometrial cancer. By doing so, we hope to provide investigators with the tools to effectively investigate their chosen RNA biomarkers in future studies of women with type 1 or type 2 endometrial cancer so as to provide meaningful clinical data, especially when only one of these patient groups is under consideration. If both groups are under consideration, then our previously published data remain the correct choice 4 .

Results
GeNorm analyses for type 1 EC and for type 2 EC. Using geNorm, the reference genes evaluated for type 1 EC were ranked from least stable to most stable (Fig. 1); gene names, identities, accession numbers and amplicon sizes of the PCR products can be found in Supplemental Table 1. Analysis of this order indicated that PUM1 and IPO8 were the two most stable genes (defined as M ≤ 1.0) in type 1 EC samples, with the two least stable genes being RPL30 and MT-ATP6. The commonly used reference genes β-actin (ACTB), GAPDH and 18S were outside the least stable (M ≥ 1.0) category with M-values of 1.355, 0.950 and 1.300, respectively.
Gene stability analysis using geNorm (Fig. 2) indicated that the optimal number of reference gene targets was 5 in the EC1 analyses (geNorm V < 0.15 when comparing a normalisation factor based on the 5 or 6 most stable targets). Thus, geNorm qbase + 2 predicts that the optimal normalisation factor would be the geometric mean of the reference targets PSMC4, PUM1 and IPO8 (based on 3 most stable genes), ELF1, PSMC4, PUM1 and IPO8 (based on 4 most stable genes), or EIF2B1, ELF1, PSMC4, PUM1 and IPO8 (based on 5 most stable genes). Similar analyses of samples from patients with type 2 EC ranked the genes from least stable to most stable (Fig. 1). In studies using normal and type 2 cancer endometria, PGK1 and PPIA would be the two most stable genes with 18S and RPL30 being the two least stable. The commonly used reference genes (β-actin (ACTB), GAPDH and 18S) were also outside the least stable category with M-values of 1.450, 1.205 and 1.600, respectively.
The optimal number of reference targets for type 2 EC was also 5 genes (Fig. 2). Thus, geNorm PLUS with qbase+2 predicts that the optimal normalisation factor would be achieved by using the geometric mean of the reference targets MRPL19, PGK1 and PPIA (based on 3 most stable genes), UBC, MRPL19, PGK1 and PPIA (based on 4 most stable genes), or YWAZ, UBC, MRPL19, PGK1 and PPIA (based on 5 most stable genes).
NormFinder analyses for type 1 EC and for type 2 EC. NormFinder software, which uses a mathematical model that considers both intergroup and intragroup expression variations (stability), and ranks them in order from the lowest to highest stability value 27 , identified PSMC4 (proteasome (prosome, Macropain) 26S subunit, ATPase, 4) as the single most stable gene (stability value = 0.268). The software specified that the best two gene combination was IPO8 (importin 8) and MRPL19 (mitochondrial ribosomal protein L19) with a stability value of 0.224 for patients with type 1 EC (Fig. 3). By contrast, the single most stable gene for type 2 EC patients was MRPL19 (mitochondrial ribosomal protein L19; stability value of 0.379) and the best combination of two genes was ELF1 (E74-like factor 1) and PUM1 (Pumilio homolog 1) with a stability value of 0.259 (Fig. 3).
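The stability measure M and the pairwise variation V referred to above can be illustrated with a minimal sketch. It assumes the commonly described geNorm definitions: M for a gene is the mean standard deviation of its log2 expression ratios against every other candidate, and V(n/n+1) is the standard deviation of the log2 ratio of two successive normalisation factors. The function names, the gene labels REF_A to REF_C and all input values are hypothetical, and the qbase+ implementation may differ in detail.

```python
import math
from statistics import geometric_mean, mean, stdev


def m_values(rq):
    """Return the geNorm-style stability value M for each gene.

    rq maps a gene name to a list of relative quantities, one per sample.
    A lower M indicates more stable expression.
    """
    result = {}
    for gene_j, values_j in rq.items():
        pairwise_sds = []
        for gene_k, values_k in rq.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(values_j, values_k)]
            pairwise_sds.append(stdev(log_ratios))
        result[gene_j] = mean(pairwise_sds)
    return result


def rank_least_to_most_stable(rq):
    """Repeatedly drop the gene with the highest M until two genes remain."""
    remaining = dict(rq)
    order = []
    while len(remaining) > 2:
        m = m_values(remaining)
        worst = max(m, key=m.get)
        order.append(worst)
        del remaining[worst]
    order.extend(remaining)  # the final pair cannot be ranked further
    return order


def normalisation_factor(rq, genes):
    """Per-sample geometric mean of the chosen reference genes."""
    n_samples = len(rq[genes[0]])
    return [geometric_mean([rq[g][i] for g in genes]) for i in range(n_samples)]


def pairwise_variation(rq, most_stable_first, n):
    """V(n/n+1): effect of adding the (n+1)th gene to the normalisation factor."""
    nf_n = normalisation_factor(rq, most_stable_first[:n])
    nf_n1 = normalisation_factor(rq, most_stable_first[:n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_n1)])


# Hypothetical relative quantities for three candidate genes in four samples.
demo_rq = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],   # constant ratio to REF_A
    "REF_C": [1.0, 4.0, 2.0, 16.0],   # ratio to the others varies
}
demo_m = m_values(demo_rq)
demo_rank = rank_least_to_most_stable(demo_rq)
demo_v = pairwise_variation(demo_rq, ["REF_A", "REF_B", "REF_C"], 2)
```

In this toy data set REF_A and REF_B keep a constant ratio across samples, so they share the lowest M, while REF_C is excluded first. A V value below the 0.15 cut-off quoted in the text would indicate that adding the extra gene changes the normalisation factor only slightly.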

Discussion
In this first systematic study, we show which combination of reference genes from a panel of 32 endogenous control reference genes should be used for normalisation in quantitative real-time PCR (qRT-PCR) studies of types 1 and 2 EC. We have identified 2 sets (one for type 1 and another set for type 2) of five 'housekeeping' genes that provide the best combination. When the TaqMan Gene Expression Assay kit was used, none of the genes in the two identified combinations was common to both types of EC. When the stability list was increased to six normalising genes, one gene (PUM1) was found to be common to both sets of samples and interestingly one gene (IPO8) in the type 1 EC group and two genes (PPIA and MRPL19) in the type 2 EC group were identical to those we identified in our previous report 5 where only 3 normalising genes were used and in other cancers where 4 normalising genes were used 28 . Several genes (B2M, EIF2B1, ELF1 and PSMC4; type 1 EC, and YWAZ, UBC, and PGK1; type 2 EC) were not identified in the combinations of five best genes for the two cancer sub-types (this study) suggesting that the transcriptome of these two EC sub-types may in fact be very different. Indeed, type 1 EC is phenotypically distinct from type 2 EC both in its biology and response to treatment, further supporting our hypothesis that the normalising genes for these two subtypes of EC would also be different; qRT-PCR studies must therefore take this into consideration in order to generate meaningful and clinically relevant data. The data in this new analysis suggest that the best normalisers for qRT-PCR studies that are confined to either type 1 EC or type 2 EC alone are different; hence, combining these together when studying either in isolation is likely to generate erroneous data.
These findings have important implications for future study design, because qRT-PCR studies that only focus on type 1 EC (with a control group) require a different combination of normalising genes to a study that focusses on only type 2 EC (also with a control group). If the study design utilises type 1 and type 2 EC samples (along with a control of normal tissues), then an additional and different set of normalising genes will be required to provide valid and important data, as previously described 5 . For example, in a study of only type 1 EC samples, the normalising genes required for qRT-PCR studies are those listed in the results section of this paper, whilst a study of all types of EC requires the list of genes provided previously 5 , i.e. PPIA, MRPL19 and IPO8. The set of 3 normalising genes we reported previously was not significantly improved upon by increasing the list to 4, 5 or 6 normalising genes 5 and so this extra expense is neither required nor warranted.
Using NormFinder to analyse the best combination of normalising genes and the best gene for each patient group revealed that two genes previously reported to be the best normalisers for studies that included all types of EC (i.e. IPO8 and MRPL19) 5 were also identified in the present study. Furthermore, MRPL19 appears to be the only common gene in the combination of six that is most stable in studies separately investigating type 1 EC and type 2 EC (as in this study), and indeed is one of the genes we have previously recommended when both types of EC were combined 5 .
Previous studies focussing on the best normalising genes for studies of patients with endometrial cancer have a limitation in that they evaluated only ten possible normalising genes 29,30 . In those studies, the best normalising gene lists identified genes that were identified here and also previously (PPIA) in type 2 EC, but not those identified in type 1 EC. An additional issue in the stated studies was the large number of samples being examined that was not balanced by an equivalent number of control samples 30 . Increasing the number of biological sample replicates minimises any internal variations. This could result in erroneous conclusions based on those internal limitations. By limiting our studies to the minimum number of biological replicates, we have maximally increased the biological and experimental variations. This means that the software needs to perform in a robust manner to generate the best normalisers for the study populations. Additionally, by increasing the number of 'housekeeping genes' from ten to 32 genes, we have increased the probability of identifying additional highly stable normalisers for qRT-PCR studies that were not identified previously. Consequently, we also reported the minimum number of genes (n = 3) for each type of endometrial cancer and the maximum number of genes (n = 6) for absolute robustness. The decision on whether to choose 3, 4, 5 or 6 genes in combination as normalisers for qRT-PCR studies is dependent on two factors. These factors are: (1) the magnitude of change in transcript levels (more subtle changes need a larger number of genes to robustly identify them) and (2) cost (the cost of six primer pairs is obviously more than that of 3, 4 or 5). These considerations are important because an unwise choice may invalidate the study undertaken.
The use of the correct endogenous control reference gene(s) for normalisation in qRT-PCR experiments has been the subject of strong debate 31 , with advantages and disadvantages 17 being highlighted, even in studies that follow MIQE guidelines 19 . Nevertheless, studies have shown disparities in the stability of reference genes in many different tissues 32 and in the same tissues under different conditions 33 . It is thus imperative that the appropriate selection of stable reference genes relative to the experiment undertaken is made, as has been discussed previously 5,30 . In this regard, improved normalisation is possible when changing from one to multiple endogenous control reference genes, because a single reference is unlikely to provide an ideal endogenous control, as is stated in the MIQE guidelines. It is for this reason that we chose to undertake this study, especially as advocated personalised medicine studies do not define the subject samples or the experimental conditions when suggesting new biomarkers for EC. We also deliberately chose samples from a cross section of the different groups that truly represent the patient population. By using both pre- and post-menopausal samples and also limiting our study to 3 patient samples from each group to provide nine control samples, nine type 1 EC samples and six type 2 EC samples that were clearly defined by an experienced gynaecology histopathologist, we maximised the variation within and between the sample groups. This manipulation results in the most robust study where variation is being minimised, whereas a much larger number of samples (e.g. in the tens or hundreds) would have hidden the variation that is needed by the software being used.
It is for this reason that we recommended geNorm qbase + 2 software for the analysis of normalisation studies where a given combination of reference genes is used to generate a normalisation factor and that the data are further complemented with analysis of the same data using NormFinder 34,35 . By providing the best possible combination of normaliser genes, a platform whereby the biology of type 1 EC or type 2 EC can effectively be studied at the transcript level is provided. Additionally, in patient-specific prognosis studies where treatment outcomes are assessed, the observations reported herein are instantly applicable, since they provide a good starting point for normalising gene identification when treatments are applied. Thus, in gene expression studies using normal and type 1 malignant endometria, and where limited amounts of material or resources are available, more reliable normalisation is achieved when using the geometric mean of the Ct values obtained from the combination of the three genes PSMC4, PUM1 and IPO8, which is thus recommended. Similarly, for type 2 EC studies the geometric mean of the Ct values derived from the combination of MRPL19, PGK1 and PPIA provides a reliable normalisation factor. For absolute robustness, we recommend the use of geometric means from the 5 genes EIF2B1, ELF1, PSMC4, PUM1 and IPO8 for type 1 EC studies and the use of geometric means from the 5 genes YWAZ, UBC, MRPL19, PGK1 and PPIA for type 2 EC studies, especially if the changes in gene expression when compared to controls are relatively subtle in contrast to other highly expressed gene targets. To compare gene expression patterns between type 1 EC and type 2 EC samples (without any control tissues as a reference), researchers are advised to follow the protocols described in this and our previous publication 5 , to generate the best normalising genes for their own patient cohorts.
In summary, by using a panel of 32 optimised and validated endogenous control reference genes in a TaqMan gene expression assay format, we identified the most robust endogenous control reference genes for the study of either type 1 EC or type 2 EC. In doing so, we can categorically remove traditionally used normalising genes for the study of either type 1 EC or type 2 EC by qRT-PCR from the database, since the array used included representative genes from different gene families and functional classes. By reporting these data, we hope to have provided a valuable tool for use in future studies of RNA biomarkers in the biology of type 1 and type 2 EC.
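As an illustration of how a multi-gene normalisation factor might be applied to a target gene, the following minimal sketch converts Ct values to relative quantities and divides the target by the geometric mean of the reference genes. It assumes that the geometric mean is taken over relative quantities derived from the Ct values (as in the geNorm approach) and that amplification efficiency is 100% unless stated otherwise. The function names and all Ct values are hypothetical; only the gene names PSMC4, PUM1 and IPO8 come from the text.

```python
from statistics import geometric_mean


def relative_quantities(ct_values, efficiency=1.0):
    """Convert Ct values to quantities relative to the lowest-Ct sample.

    efficiency=1.0 corresponds to 100% efficiency (doubling per cycle).
    """
    base = 1.0 + efficiency
    ct_min = min(ct_values)
    return [base ** (ct_min - ct) for ct in ct_values]


def normalise_target(target_ct, reference_ct, efficiency=1.0):
    """Divide target quantities by the per-sample reference geometric mean.

    target_ct is a list of Ct values, one per sample.
    reference_ct maps each reference gene to its list of Ct values.
    """
    target_rq = relative_quantities(target_ct, efficiency)
    reference_rq = [relative_quantities(ct, efficiency)
                    for ct in reference_ct.values()]
    factors = [geometric_mean(per_sample) for per_sample in zip(*reference_rq)]
    return [t / f for t, f in zip(target_rq, factors)]


# Hypothetical Ct values for two samples. Every gene shifts by one cycle
# in the second sample, as would happen with half the input cDNA.
demo_reference_ct = {
    "PSMC4": [20.0, 21.0],
    "PUM1": [22.0, 23.0],
    "IPO8": [24.0, 25.0],
}
demo_target_ct = [25.0, 26.0]
demo_normalised = normalise_target(demo_target_ct, demo_reference_ct)
```

Because the target and all three reference genes shift together in this example, the normalised target level is the same in both samples, which is the behaviour a loading correction is meant to produce.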

Materials and Methods
All volunteers provided signed, written informed consent to partake in the study (see Ethics Statement section). The Leicestershire, Northamptonshire and Rutland Research Ethics Committee (ref 06/Q2501/49) approved the study. Women undergoing hysterectomy for endometrial carcinoma or a benign gynaecological condition at the University Hospitals of Leicester National Health Service Trust were recruited (Table 1).
Histological diagnosis of the cancer was based on the FIGO system 36 and 15 endometrial carcinoma samples were studied: type 1 grade 1 endometrioid adenocarcinoma (n = 3), type 1 grade 2 endometrioid adenocarcinoma (n = 3) and type 1 grade 3 endometrioid adenocarcinoma (n = 3), type 2 serous (n = 3) and carcinosarcoma (n = 3). All cancer tissues were classified as being FIGO Stage 1. Normal endometrial tissue samples were obtained from volunteers who were undergoing a hysterectomy for benign indications (prolapse, dysfunctional uterine bleeding, fibroids) and were classified into pre-menopausal (secretory (n = 3) or proliferative (n = 3) phase) and postmenopausal demonstrating histological atrophic endometria (n = 3) by the criteria of Noyes et al. 37 . Patient characteristics are shown in Table 1.
Preparation of total cellular RNA and cDNA synthesis. From this point forward, the methodologies are essentially the same as described 4,5 . Fresh uteri were transported on ice to the histopathology department and two adjacent tissue biopsy samples were dissected out by an experienced consultant gynaecology histopathologist using a dissecting microscope; one sample was fixed in 10% formal saline, stained with haematoxylin and eosin and used for histological confirmation of the diagnosis. All cancer samples were stage 1 and all control tissues were devoid of myometrial contamination. The second sample used for this normalising gene study was washed with phosphate buffered saline (PBS) to remove excess blood and stored in RNAlater (Life Technologies, Paisley, UK) for 24 hours at room temperature before transfer to −80 °C for storage and further processing.
Frozen endometrial tissues (~100 mg) in lysis/binding buffer (1 ml of lysis/binding buffer solution from the miRNA Isolation Kit per 100 mg of tissue) were homogenised using a TissueRuptor (Qiagen, Crawley, UK) homogeniser at medium speed for 60 seconds on ice until all visible 'clumps' were dispersed. Total RNA was extracted using the mirVana™ miRNA Isolation Kit (Life Technologies, Paisley, UK) according to the manufacturer's protocol.
Total RNA was then quantified and its purity determined using a NanoDrop 2000c spectrophotometer (Thermo Scientific, Detroit, USA). At this point, the RNA concentration was standardised to 10 μg/100 μl, and contaminating genomic DNA digested by treating with a TURBO-DNAse free kit (Life Technologies, Paisley, UK) at 37 °C for 30 minutes. The reaction was inactivated with 10 μl of inactivation buffer and the solution centrifuged for 90 seconds at 10,000 × g. The purity of the extracted total RNA (supernatant), as measured with the NanoDrop spectrophotometer, indicated good quality RNA with an A260/A280 ratio of 2.10 ± 0.31 (OD ratio ± SD) and an A260/A230 ratio of 2.19 ± 0.43. The average nucleic acid yield of RNA after extraction was 1.17 ± 0.61 (μg/μl ± SD). The treated supernatants were subjected to first strand synthesis using the high capacity cDNA MultiScribe™ Reverse […] Table 1). TaqMan Gene Expression Assays were used (Applied Biosystems) and each consisted of a fluorogenic FAM dye-labelled MGB probe (final concentration 250 nM) and two amplification primers (forward and reverse; final concentration 900 nM) provided in a pre-formulated 20X mix. Each assay had an amplification efficiency of 100 ± 10%, calculated by the system software. RT-minus and no template controls (NTC) containing DNAse-free water instead of template mRNA were included in each run and no product was synthesised in the NTC and RT-minus reactions, confirming the absence of contamination with exogenous DNA. The final reaction volume was 20 μl and consisted of 2 μl of cDNA, 8 μl of DNAse-free water and 10 μl of TaqMan universal PCR Master Mix. A StepOne Plus instrument […]

Table 1. Patient characteristics. † P-value compared to atrophic patient group; †† P-value compared to control group; One-way ANOVA with Dunnett's multiple comparisons t-test. Significant differences are in bold font. EC1 = endometrial carcinoma type 1; EC2 = endometrial carcinoma type 2; n.a. = not applicable; n.s. = not significantly different; S.D. = standard deviation.