MiRNA expression profiling and emergence of new prognostic signature for oral squamous cell carcinoma

Oral squamous cell carcinoma (OSCC), the most common type of head and neck cancers, is associated with high recurrence, metastasis, low long-term survival rates and poor treatment outcome. As deregulated miRNA expression plays a crucial role in malignant transformation and cancer progression, the present study is aimed at profiling the miRNA expression pattern in OSCC and developing a new miRNA prognostic signature for oral cancer. MiRNA expression profiling was performed using MiRNA microarray in 30 tumor and 18 normal samples. MiRNA signature obtained was validated with quantitative real time PCR (qRT-PCR) in 144 tumor and 36 normal samples. The potential targets, clinical implications and prognostic value of the miRNA signature were elucidated by various bioinformatics and statistical analyses. Microarray profiling identified a set of 105 miRNAs to be differentially expressed in OSCC, out of which a subset of 19 most dysregulated miRNAs were validated by qRT-PCR. In silico analysis revealed the signature miRNAs to be involved in various cancer associated pathways. Up-regulation of miR-196a, miR-21, miR-1237 and downregulation of miR-204, miR-144 was associated with poor prognosis of OSCC patients. The mir-196a/miR-204 expression ratio emerged as best predictor for disease recurrence and patient survival. Altogether, our study identified a miRNA signature for OSCC with prognostic significance.

Globally, one in every six deaths is caused by cancer, and in 20 years, its incidence and mortality is expected to grow to 27.5 and 16.3 million cases respectively 1 . With more than half a million new cases reported annually, Head and Neck Squamous Cell Carcinoma (HNSCC) stands as the sixth leading cancer by incidence and eighth by death worldwide 2,3 . The 5-year survival rate of patients with HNSCC is 40-50% 3 . Oral Squamous cell Carcinoma (OSCC) accounts for nearly 95% of all the head and neck cancers with an annual incidence of over 300,000 cases 4 , 62% of which arise in developing countries 5 . In the U.S. population, oral cavity cancer represents only about 3% of malignancies, but in India, it accounts for over 30% of all cancers and hence ranks among the top three types of cancers in India 6 . It is also the leading cause of cancer death among men in India 2 . Low socio-economic status, illiteracy, lack of awareness, limited access to specialized treatment and diagnosis results in more than 48% of these cases being diagnosed in later stages, i.e. III and IV 7 . Despite all our advancements, treatment failures remain frequent, calling for better management of disease and prognostic tools.
As a family of super-regulatory small non-coding RNAs, microRNAs (miRNA) have regulatory roles in diseases via regulating expression of protein-coding target genes at both post-transcriptional and translational levels. MiRNAs work by targeting complementary gene transcripts either sequestering them in the cytoplasm or degrading them out-right depending on the degree of complementarity of the target transcript to the miRNA seed region 8 . They have high specificity in expression with regards to tissue type and disease characteristics 9 . It has been demonstrated that miRNA expression profiles have better accuracy in disease classification than mRNA expression profiles 10,11 . A myriad of miRNAs has proven to be differentially expressed across the many studies Results miRNA microarray and the expression pattern in OSCC. Our primary focus was to identify differentially expressed miRNAs (DE miRNAs) in oral carcinoma with respect to normal mucosa, in a set of 48 samples which included 30 tumors and 18 normal controls. Differential expression analysis showed around 105 miRNAs to be significantly dysregulated in oral carcinoma with a p value < 0.05 and fold change > 1.5. Narrowing down for the 50 most significant DE miRNAs, unsupervised hierarchical clustering and PCA broadly clustered tumor from normal. Filtering further for significance (cut-off of p value < 0.01), a 19 miRNA list split the heatmap into tumor and a normal cluster clearly grouping them apart as in the corresponding PCA (Fig. 1A, Table 1). Intraoral tumor site-wise grouping also noticed in this clustering, precise separation of clusters into Tongue   1A, Supplementary Figure 1). In addition, pathway analyses of miRNAs downregulated and upregulated in Tongue and Buccal Mucosa tumors indicated these significantly deregulated miRNAs to be targeting distinct cancer associated pathways (Supplementary Table 2 In-Silico analysis of validated miRNA signature. The signature miRNA list was dichotomized into upregulated (upmiRs) and down-regulated (downmiRs) and subjected to bio-informatic target prediction using DIANA microT. This was followed by enrichment of target genes sets in DAVID to look at the putative target genes for each of the two groups. Pathway prediction for both up (supplementary  Table 3). Together these datasets contained 83 tumors and 31 normal cases. Differential expression analysis showed that the 13 signature miRNAs were significantly dysregulated in these datasets as well.
MiR-196a, miR-196b and miR-424 were consistently upregulated and miR-101 and miR-204 were shown to be consistently down-regulated in all the three datasets, while the rest of the miRNAs showed expressional variation (Fig. 1B).
The clinical implications of the signature miRNAs. Comparing recurrent versus non-recurrent tumors, we noticed a sharp upregulation of miR-196a (p = 0.009), miR-1237 (p = 0.08) and miR-21 (p = 0.04) in recurrent samples and a down regulation of miR-190 (p = 0.07), miR-204 (p = 0.03) and miR-144 (p = 0.05) ( Table 2, Fig. 2A). Surprisingly we found that up-regulation of miR-196a (p = 0.02), and the downregulation of miR-204 (p = 0.06) indeed associated with poor DFS (Table 3, Fig. 2B). To correlate miRNA expression with over-all survival, we split the parent cohort into low overall survivors (≤ 1 year), intermediate (1-3 year) and good survivors (≥ 5 years). We found that miR-196b (p = 0.014) was highly up-regulated in the low survival group, gradually decreasing expression with increasing survival, as was the down-regulation of miR-376a (p = 0.08), gradually making up with increasing overall survival (Table 3 Table 4). miRNA expression ratios have demonstrated a better potential and an increase in predictive power 17,18 . We    The translational potential of miR-196a/miR-204 ratio as a prognostic marker. For all above analysis, normalization was done using global mean normalization method 13,19 as it seems to be superior over internal control U6 normalization. However, it is not practical in measuring one or two microRNAs for routine diagnostic or prognostic purposes. In such cases, internal control normalization is more suitable and practical. Similarly the calculation of fold change is also not possible as it is difficult to have corresponding normal tissue in all cases. The absolute copy number of transcripts is ideal in such situations, but known standards of miRNA of interest is required for absolute quantitation. Hence, more practical and feasible way is to calculate the normalized Ct of miRNAs of interest using an internal control like U6 or any other ubiquitous miRNA. Hence, we also validated our miR-196a/miR-204 ratio in available tumour total RNA samples with U6 normalization. In total, we calculated normalized dCt for miR-196a and miR-204 in 100 oral cancer samples using U6 as internal control. Then the expression ratio was calculated by dividing the normalized dCt value of the up-regulated miR-196a with the down-regulated miR-204. i.e. miR-196a/miR-204. Then the median of the miR-196a/miR-204 ratio was calculated to dichotomize the data to high and low. The median for the miR-196a/miR-204 ratio was 0.74 and for the convenience it was taken as 0.75. The ratio less than 0.75 was considered as low and the ratio equal to or above 0.75 was taken as high ratio. Then the prognostic significance of the miR-196a/miR-204 ratio was evaluated using Kaplan Meier curve. The results showed that the miRNA expression ratio has significant association with the DFS of oral cancer patients. The high miR-196a/miR204 ratio demonstrated better disease free survival than patients with low ratio (Log Rank p = 0.003) (Fig. 3A). However, its relation with the overall survival of patients is marginally significant (Log Rank p = 0.092). ROC curve analysis also demonstrated the excellent predictory power of miR-196a/miR-204 ratio, with AUC = 0.680, p = 0.001, specificity = 69 and sensitivity = 66, to demarcate patients with good and poor disease free survival (Fig. 3B). Along with miR-196a/ miR-204 ratio, significance of other classical clinical prognostic factors with DFS and OS were also evaluated by both univariate (Kaplan Meier/Log Rank) and multivariate (Cox's propotional hazard) methods (Supplementary Table 5). With respect to DFS, miR-196a/miR-204 ratio alone was significant in both univariate and multivariate analysis. However, with respect to OS, the T-status and composite stage were shown to be significant in univariate analysis, but not in multivariate model. The miR-196a/miR-204 ratio was marginally significant with OS both in univariate and multivariate analysis. Thus the independent prognostic influence of the miR-196a/miR-204 ratio on survival of oral cancer patients was shown over other clinical factors.    www.nature.com/scientificreports/ Potential targets for the signature miRNAs. MiRNAs function by regulating the expression of their target genes and therefore the potential targets of the differentially expressed miRNAs were explored. All the miRNAs had experimentally validated miRNA target list in mirtarbase. The miRNA targets with the maximum number of validation methods and reported publications were selected for correlation analysis. The top three genes which showed a strong negative correlation with the miRNA expression were identified (Supplementary Table 6). The top three targets for miR-196a-5p are SPRR2C, ANXA1, S100A9 and miR-204-5p are CDC42, RAB22A, EZR. Furthermore, FAM69C, PACS2, SPOP were identified to have a negatively correlation with miR-1237-3p expression. In addition, DIANA MirPath analysis revealed Hippo signalling pathway to be the major pathway regulated by miR-196a/miR-204 (Supplementary Figure 5).

Discussion
The expression pattern of the 19 microarray DE miRNAs packed enough discriminatory power to distinctly cluster the samples into site specific normals and tumors as evidenced by the hierarchical cluster. Further they separated the tongue and buccal mucosa samples into respective normals and tumors, indicating that these signatures were equally demarcative in both the oral sites. Similarly, clinico-pathological attributes like tumor stage and recurrence were also well represented by up-regulated miR-1237 and down-regulated miR-377 and miR-144. The tumor specific over-expression of miR-1237 is a novel report while other signature miRNAs like miR-21, miR-424, miR-196a, miR-196b and miR-204 have been studied and reported in human cancers including OSCC [20][21][22][23] . The expression profiles obtained through miRNA microarray needs to be validated by an independent profiling method and TaqMan qRT-PCR with its detection sensitivity, sequence specificity and reproducible quantitation stands ahead 19 . But the accuracy of TaqMan chemistry could be seriously misinterpreted by an incorrect and unsuitable normalizing gene 24 . Mean expression value normalization was shown to outperform standard small RNA controls in stability and bring out the true biological changes in both tissues and cell lines 13 and hence employed. The miRNA expression obtained through qRT-PCR, corroborated all the 19 miRNAs assayed i.e. the tester vs array providing a strong validation of our miRNA expression profile. Adding a touch of novelty to the study, the over-expression of miR-1237 was highly indicative of a potential tumor signature and one among the most consistently dysregulated miRNA in our analysis. The up-regulation was much more pronounced in tongue with serious role in tumor depth. In-silico target prediction for miR-1237 revealed putative target sites on multiple genes involved in various cancer associated pathways like; transcriptional mis-regulation in cancer, homologous recombination, endocytosis and PI3K-AKT signalling. Hence there is strong enough reason to consider miR-1237 as an onco-miR especially in the context of oral cancer. miR-1237 has previously been shown to be epigenetically silenced in colo-rectal cancer and HCT-116 cell lines 25 but upregulated in acute lymphoblastic leukaemia cells cultured in human derived osteoblast cell niche 26 .
Meta-analysis using three independent miRNA microarray datasets GSE8100, GSE3177 and GSE34496 helped re-affirm the expression profile of the DE miRNAs in this study. Together, the three datasets include miRNA expression data from 128 OSCC specimen not biased by any conditions or treatments. The heatmap shows the similarity of expression of miRNAs across the datasets compared to our samples. GSE31277 shows the most expression similarity to our set of samples presumably due to the Asian-Hispanic ethnicity of the Brazilian cohort as against the other two American groups. MiR-196a, miR-196b and miR-424 showed steady and consistent over-expression and miR-101 and miR-204 showed down-regulation in all three cases. Hence these miRNAs warrant special attention due to their onco-specific expressional stability even across different platforms and study conditions. MiRNA expression profiles have far reaching implications as biomarkers in disease models especially in cancer. miR-204-5p is reported to inhibit proliferation of gastric and colorectal cancer cells by down regulation of RAB22A 27,28 . miR-204-5p acts through cdc42 to aid the invasion of EBV-associated nasopharyngeal carcinoma cells 29 . miR-196a-5p has shown to inhibit ANXA1 and enhance breast cancer cell growth 30 . miR-196a, miR-21, miR-1237, miR-204 and mR-144 were found to be associated with poor disease prognosis and poor DFS in our samples. Aberrant expression of miR-21 has already been suggested as putative diagnostic biomarker for OSCC 31 and an independent predictor of poor survival for patients with tongue SCC 32 . MiR-144 has also been assigned with tumor suppressive function in lung cancer, and has been reported to inhibit proliferation, induce apoptosis and autophagy upon supplementation in lung cancer cells 33 and down regulated in oral cancer 22 . Both miR-196a and miR-196b have been proposed as biomarkers in oral cancer management and their overexpression was found to promote oral cancer cell migration, invasion and lymph node metastasis 34 . Similarly, miR-204 is a well reported dis-regulated miRNA with tumor-suppressive functions and is consistently found down-regulated in oral cancer and regulates cancer stemness 20,35 . In fact, Kaplan-Meier survival analysis revealed the association of miR-196a and miR-204 individually with overall (OS) and disease-free survival (DFS) in our sample set. Additionally, ROC curves plotted to look into the predictory potential of the 5 miRNAs yielded results very much in favour of miR-196a and miR-204 dysregulations.
Previously many groups have tried the combinatory potential of miRNA expression ratios in disease prediction 17,18,36,37 . Avissar et al. showed how expression ratio of miR-221: miR-375 distinguished oral tumors from normal and their corresponding diagnostic and prognostic potential 17 . Neely et al. showed how miR-21:miR-205 ratio characterized invasive bladder cell lines from normal non-invasive cell lines by a tenfold margin 38 . It has also been reported in veterinary medicine where miR-17-5p/miR-155 was proposed as a new grading tool for canine splenic lymphomas as it correlated with WHO grading 36 . Similarly, in our study we combined the individual strengths of miR-196a and miR-204 by plotting their expression ratio. The disease predictory power of miR-196a/miR-204 ratio was strengthened by its strong association with survival as manifested by the Kaplan-Meier curve. To be noted here, is the fact that computing the miRNA expression ratio, www.nature.com/scientificreports/ slightly enhanced the AUC than the individual profiles of miR-196a and miR-204 positively boosting predictive outcome. Tsai et al. 38 also showed that combined expression signatures of miR-375, miR-204 and miR-196a are promising biomarkers for the diagnosis, prognosis and treatment of OSCC. However, they have not shown its prognostic significance as in the present study with follow-up data. The miRNA expression ratios for the other three miRNAs; miR-1237, miR-21 and miR-144 were not calculated as their association with patient survival was relatively inferior, hence questioning the credibility of their expression ratios if at all. Since miR-196a and miR-204 were both dysregulation markers for poor prognosis, the predictory power of their expression ratio has a potent prognostic effect, where patients expressing a miR-196a/miR-204 ratio < 1 can be predicted to proceed to increased morbidity. Examining their profiles in pre-neoplastic lesions, early onset tumor samples and importantly non-invasive diagnostic specimens like saliva would give better insight into the relevance of miRNA expression ratio as disease predictors. Additional validation in a larger sample cohort would shed more light onto its true strength in oral cancer prognosis.
Defining specific role of miRNAs in cancer is crucial as they control cellular functions by altering target gene expression. Expressional alterations of miRNAs in cancer can hence affect key regulatory pathways. In silico analysis identified SPRR2C, ANXA1, S100A9 as targets of miR-196a-5p and CDC42, RAB22A, EZR as that of miR-204-5p. In Barrett's esophagus (BE), which is a recognized precursor of esophageal adenocarcinoma (EA), miR-196a was recognized as a potential marker of disease progression with SPRR2C and S100A9 as its targets 39 . MiR-196a is also reported to promote proliferation, invasion and metastasis of EA and HNSCC by targeting ANXA1 39,40 . MiR-204, on the other hand plays tumor suppressive role by targeting CDC42 and RAB22A in nasopharyngeal and renal cell carcinoma respectively 29,41 . In oral cancer, knockdown of lncRNA LEF1-AS1 and upregulation of circRNA_0000140 supressed proliferation and metastasis by inhibiting Hippo signaling pathway 42,43 which in our in silico analysis was recognized as the major signaling pathway regulated by miR-196a and miR-204. Further functional validation of these miRNA targets is required to identify their potential functional role in OSCC.
Thus, in this study we have looked at the differential expression of miRNAs in OSCC, arriving at a 19 tumor specific signature miRNA expression with remarkable demarcative potential. Further analysis of the patient cohort narrowed down specific expression markers for tumor stage, depth, peri-neural invasion and patient survival, highlighting the specificity and sensitivity of miRNA expression profiles. Up-regulation of miR-196a, miR-21, miR-1237 and downregulation of miR-204, miR-144 classified a poor prognosis group in our patient cohort, strongly associated with patient survival and capacity for disease prediction. Correlation of miRNA expression with external datasets (GEO) mostly corroborated our signature miRNA expression profiles adding confidence, the differences being attributed to ethnic variation. miR-1237 emerged as a novel report in our study, significantly over-expressed in oral tumor samples especially in the early stages (stage I and II) with low tumor depth suggesting a role in tumor initiation and positively associated with tumor recurrence. Combining their individual strengths, miRNA expression ratio boosted predictory power and miR-196a/miR-204 expression ratio emerged the best predictor for disease recurrence and patient survival. Through this study, we confirm the potential of miRNAs expression profiles as strong molecular tools in oral cancer diagnosis and prognosis.

Selection of patients.
The cases for the present study were selected from patients attending the out-patient Head and Neck Clinic of Regional Cancer Centre, Thiruvananthapuram. Patients with a histopathologically confirmed diagnosis of oral squamous cell carcinoma with no previous history of any type of treatment for cancer and patients free from chronic systemic diseases were included in the study population. Details such as sociodemographic, occupational, habitual, and clinico-pathological features of lesions in each patient were collected in pro-forma.
Tissue sample collection. After obtaining the signed informed consent from the eligible patients, incision biopsy of tumor was taken from each patient either during the investigative biopsy procedure or surgical excision of the lesion. Normal mucosa was also collected from patients undergoing oral and maxillofacial surgery for reasons other than cancer. All the collected tissue bits were immediately snap frozen and stored in liquid Nitrogen.
Patient details. The clinico-pathological features of study subjects are described in Supplementary Table 1. Treatment plan for each patient was arrived at by a joint decision between the radiation oncologist and surgeon at the multidisciplinary clinic or by the institute's tumor board. In general, early stage cancers are mainly subjected to surgery and followed by radiation if the disease is involved into regional nodes. However, for advanced stage diseases (stages III and IV) radiation and/or chemotherapy together with surgery continue to be the standard regime of care. Clinical follow-up of all these patients after treatment were carried out until death or up to a maximum of 60 months. Criteria for histological diagnosis were based upon WHO guidelines for the histological classification of oral lesions.

RNA isolation.
Total RNA was prepared from tissue samples using miRVana miRNA isolation kit (Ambion, TX, USA) as per manufacturer's protocol. The quality and quantity of the isolated RNA was checked using Biospec 1601 UV-Vis Spectrophotometer (Shimadzu, Japan). miRNA microarray. Agilent Human miRNA Microarray v2.0 (G4470B, Agilent Technologies) was used to identify miRNAs differentially expressed in OSCC. MicroRNA processing was carried out according to the manufacturer's instructions. Hybridized microarrays were scanned with a DNA microarray scanner (Agilent MiRNA microarray data analysis. Data pre-processing and differential expression analysis were done in R Studio using the Bioconductor AgiMicroRna package 45 . The Total Gene Signal provided by the AFE image analysis software was used for data analysis. Data were normalized between arrays using the quantile method 46 . Hierarchical clustering was carried out using Euclidian distance as the distance metric using hclust algorithm and heatmaps were plotted using heatmap-2 function with the gplots package in R Studio (R Studio team). Gene Ontology (GO) Enrichment Analysis of the miRNA microarray data was performed using DAVID 47 summarized and visualized using REVIGO 48 . Target prediction was performed with platforms like DIANA-microT-CDS and miRPath v2.0, miRWalk v2.0, MiRTarBase v4 and ComiR for maximum overlap.
cDNA construction and quantitative real time PCR (qRT-PCR). cDNA was synthesized by priming with a pool of gene-specific looped primers including the primers of the miRNAs of interest and RNU6B, as a universally-expressed endogenous control (Applied Biosystems, Foster City, CA, USA) as per manufacturer's specifications. PCR products are amplified from cDNA samples using the TaqMan MicroRNA Assay together with the TaqMan Universal PCR Master Mix (Applied Biosystems, Foster City, CA, USA). Raw CT was normalized to account for inter-sample variation using Global Mean Normalization 13,19 . Relative Quantification of the normalized data was done against control using Livaks and Schmittgen's 2 −ddCT method 49 , to obtain fold change with Microsoft Excel spreadsheet.
Identification of potential targets for the signature miRNAs. The association between miRNAs and their respective gene targets were computed using level 3 data from The Cancer Genome Atlas (TCGA) datasets in HNSC cohort. TCGA data was accessed and analyzed using UCSC Xena Browser (https://xenabrowser.net). In this cohort, there were 528 patients with primary head and neck cancer. miRNA expression was measured using RNA sequencing (Illumina Hiseq). Mirtarbase was used to obtain the experimentally validated microRNA target genes for all the differentially expressed microRNAs. The genes which showed a negative correlation with microRNA expression in the TCGA HNSC cohort were selected as potential targets. The negatively correlated genes were selected by setting Pearson's r value ≤ − 0.1 and p < 0.05.

Statistical analysis.
A two-tailed Student's t test was used to compare between miRNA expression levels obtained by miRNA microarray and Real-Time PCR followed by Spearman's Rho and Pearson correlation to select validated miRNAs. The association of these selected miRNAs with various disease parameters was assessed using Cox regression and log-rank tests and a p value of < 0.05 was considered statistically significant. Significant pathways association of these signature miRNAs to the patient's overall and disease-free survival (DFS) was evaluated by Kaplan-Meier Survival Curve analysis with the Log-rank statistic. Receiver operating characteristics (ROC) analysis determined the ability of the signature miRNAs to discriminate between OSCC and normal samples.
Ethical standards. This study has been cleared by the Institutional Human Ethics Committee of Regional Cancer Centre, Trivandrum, India for collecting the tissue samples from patients (Sanction No. 39/2006). Also the study has been carried out as per the guidelines, "National Ethical Guidelines for Biomedical and Health Research Involving Human Participants-2017" formulated by the ICMR, Government of India.

Data availability statement
The high throughput data generated during and/or analyzed during the current study are available in the GEO repository (GEO Submission No. GSE168227) at https:// www. ncbi. nlm. nih. gov/ geo/ query/ acc. cgi? acc= GSE16 8227.