Introduction

Oral squamous cell carcinoma (OSCC) accounts for over 300,000 new cases each year and currently represents 5% of all cancers worldwide1,2. Despite promising advances in therapeutic approaches, the 5-year survival rate remains around 50%, mainly due to the presence of cervical lymph node metastasis3,4,5,6. The pathogenesis of primary tumours, along with the presence of metastasis, has been linked to the presence and maintenance of cancer stem cells (CSC), which constitute a small subpopulation of tumour cells characterized by slow but unlimited division. These CSC play a significant role in invasion, drug resistance, and surveillance7,8,9,10.

Numerous molecules have initially been utilized in the endeavour to identify and eradicate CSCs implicated in OSCC carcinogenesis, both in comparison to normal mucosae and following radio/chemotherapy11. Nevertheless, the precise combination of markers within primary tumour compartments, specifically before treatment, which can be clinically validated for prognostic purposes, remains to be comprehensively delineated. One of the primary markers utilized for CSC selection is CD44, a membrane protein that promotes cell adhesion, enhances migratory abilities and is associated with the stemness phenotype12. In addition to CD44, the ALDH1 enzyme has also been extensively linked to the CSC phenotype in tumour cells, given its versatility in roles such as detoxification, protection against oxidative stress, retinoid metabolism, and tissue development13. Recently, our research group semi-quantitatively scored the immunoexpression of CSC markers, ALDH1 and CD44, in OSCC samples within the invasive tumour front (ITF) and metastatic lymph nodes as a whole. We found that CD44 immunoexpression was a predictor of lymph node metastasis, whilst ALDH1high immunostaining was associated with angiolymphatic invasion, suggesting that immunoexpression of both markers links the CSC phenotype with invasion and metastasis and, could play important roles in OSCC carcinogenesis14. We then realized that the identification of a set of CSC markers associated with a different biological process marker in a large number of surgical resection specimen tissues, using a semi-automatized analysis, would be crucial to determining the understanding of OSCC carcinogenesis and its prognostic value.

Due to the complex behaviour of CSC, several other markers intimately involved in carcinogenesis have been isolated and identified, especially p75NTR and BMI-1, two important molecules associated with stemness phenotype15. This complexity arises from p75NTR's role as an essential receptor that modulates neurotrophin signalling, impacting neuronal survival, development, plasticity, and axonal growth. Its functions have direct implications for perineural invasion by tumour cells, contributing to poor prognoses in several carcinomas, including OSCC16. Furthermore, an in vitro study using the OSCC cell lineage OM-1 indicated that p75NTR signalling is the primary molecule responsible for forming tumour spheres in a cell suspension culture, suggesting a p75NTR-dependent property in cancer stemness17. On the other hand, BMI-1 appears to be a significant target for cancer biology, regenerative medicine, and ageing-related research due to its involvement in epigenetic regulation, stem cell maintenance, cancer development, ageing, DNA repair, and immune system modulation. Additionally, the overexpression of ALDH1 and BMI-1 can contribute to the poor prognosis of OSCC by promoting treatment resistance and malignant transformation, respectively18,19.

Understanding how CSC maintain their phenotype is essential for the development of targeted therapies that can specifically eliminate these cells. They are often responsible for tumour metastasis from primary tumours, which is associated with poor prognosis in various human carcinomas. The metastatic process in tumour cells is complex and can vary depending on factors such as the cancer type, interactions with the tumour microenvironment, and the therapies employed20. While the factors that regulate these mechanisms are not yet fully understood, there is evidence suggesting a link between CSC and the epithelial-mesenchymal transition (EMT). EMT is a biological process characterized by the loss of cell polarity and cell–cell adhesion, enabling tumour cells to migrate beyond the primary tumour21. Amidst the spectrum of EMT markers, the aberration or loss of E-cadherin, a transmembrane glycoprotein pivotal for mediating cell–cell adhesion in epithelial tissues, stands out as a significant contributor to pathological conditions, particularly in the context of cancer invasion and metastasis22. A recent study by Meng et al.23, utilizing immunohistochemistry (IHQ) on paraffin-embedded breast cancer samples, underscored the independent predictive value of reduced E-cadherin expression at the ITF concerning nodal metastasis. Furthermore, this reduced expression was associated with worse recurrence-free survival and disease-free survival. These findings emphasize the prognostic importance of E-cadherin, prompting the necessity for future research to establish its staining protocol for clinical applications.

Although the precise mechanisms by which CSCs regulate EMT in various malignancies remain unclear, this process typically involves the down-regulation of E-cadherin and the up-regulation of mesenchymal proteins24,25,26,27. It is noteworthy that tumour cells undergoing EMT can also undergo a reversal process known as mesenchymal-to-epithelial transition (MET) during metastasis. This MET enables the formation of a secondary tumour with epithelial characteristics resembling those of the original tumour28,29.

These intricate processes underscore the dynamic nature of cancer progression and metastasis, underscoring the need for further research to enhance our understanding of CSC-related mechanisms of specific markers intimately involved in carcinogenesis. Thus, we conducted quantitative assessments of the immunoexpression levels of CSC markers, including ALDH1, CD44, p75NTR, and BMI-1, along with E-cadherin, in both OSCC primary tumours and metastatic lymph nodes through the utilization of IHC. These assessments were then correlated with clinicopathological data. Our primary objective was to gain insight into the roles of these molecules within different tumour compartments during the progression of OSCC. Furthermore, we sought to establish connections between these markers and patient survival, with the ultimate aim of identifying a set of CSC markers and E-cadherin that are associated with the poor prognosis often observed in OSCC.

Materials and methods

Patients

Ninety-four smoker OSCC patients, aged 38 years or older, with no systemic diseases, and without any previous treatment before surgical resection and neck dissection surgery, were included. Paraffin-embedded samples consisted of 67 primary metastatic and 27 non-metastatic tumours, along with 67 cervical lymph node metastases, 47 of which originated from corresponding primary tumours. These samples correspond to archived material, spanning from January 2003 to December 2011.

Immunohistochemistry technique and analysis

Sections were deparaffinized with xylene and then rehydrated using an ethanol series. Tissue antigens were retrieved as described previously14. Slides were incubated with primary anti-human ALDH1 (1:700, clone 44, cat# 611,195, BD Biosciences, NJ, USA), CD44 (1:700, clone 156-3C11, cat#3570, Cell Signalling, MA, USA) and E-cadherin antibodies (1:200, clone NCH-38, cat#M3612, DAKO, Carpinteria, CA) at room temperature (RT) for one hour. Overnight incubation at 4º C was carried out for the following primary antibodies: anti-human p75NTR (1:500, clone ME20.4, cat# 05–446, EMD Millipore, MA, USA) and BMI-1 antibodies (1:300, clone F6, cat#05–637, Merck, Darmstadt, Germany). Slides were further incubated with a biotin-free peroxidase-based ADVANCE™ kit (#K4068, DAKO) visualization system for one hour at RT, except for the BMI-1 antibody, which was performed using EnVision™ + Dual Link System-HRP (DAKO) for one hour at RT. The immunocomplexes were detected with 3´3-diaminobenzidine (DAB) substrate-chromogen system (DAKO) for one minute and counterstained with Meyer’s hematoxylin. The positive controls for immunostaining standardization were breast cancer cells (ALDH1), lymphocytes from cervical lymph nodes (CD44), testicle (p75NTR), and inflammatory fibrous hyperplasia (BMI-1 and E-cadherin). A negative control was obtained by incubation with IgG serum as a substitution for the primary antibodies14.

Slides were further scanned into high-resolution images using the Aperio Scanscope CS Slide Scanner (Aperio Technologies Inc, Vista, CA, USA). BMI-1 nuclear marker was analyzed using the Nuclear V9 algorithm (Aperio Technologies Inc), while cytoplasmic and membrane expression of ALDH1, CD44, p75NTR, and E-cadherin were analyzed by algorithm PixelCount V9 (Aperio Technologies Inc). For the markers analyzed by the algorithm PixelCount V9, each category received an intensity score: 1 to weak, 2 to moderate, and 3 to strong staining (Supplementary Fig. 1). The final score of each tumour was calculated as the sum of the percentage of each category multiplied by their intensity scores and, results always ranged from 100 to 300 based on the previous set of parameters protocoled30.

For quantification, immunopositivity tumour cells were counted in eighteen randomly distributed microscopic fields from each primary OSCC sample, nine from the centre regions of the tumour, and nine from the ITF (defined as the deepest field with infiltration of tumour cells). At metastatic sites, three microscopic fields randomly distributed in central and extracapsular areas presenting tumour cells were analysed. Then, eighteen of non-metastatic tumours (N0), and twenty-one of locoregional metastatic tumours (N +) standardized microscopic fields (50.000 µm2 rectangles each) were captured per slide.

Validation of CSC markers and E-cadherin using public datasets

In this study, we conducted a comprehensive evaluation of CSC and E-cadherin gene expression patterns of human primary OSCC and metastatic lymph nodes by harnessing publicly accessible datasets. The acquisition of pertinent data was accomplished through a meticulous selection of validated resources. Firstly, we utilized the National Center for Biotechnology Information (NCBI) Database, with a specific focus on the Gene Expression Omnibus (GEO) repository (http://www.ncbi.nlm.nih.gov/geo/). Within the GEO repository, our attention was directed towards the GSE2280 messenger RNA (mRNA) microarray dataset. Furthermore, we expanded our data acquisition strategy to encompass the Xena Functional Genomics Explorer, affiliated with the University of California, Santa Cruz (https://xenabrowser.net/).

This study employed the limma (Linear Models for Microarray Analysis) package within the R statistical environment to identify genes differentially expressed (DE) between two or more experimental groups. This software leverages linear modelling techniques to assess gene expression variability while accounting for experimental design factors, providing robust statistical inference for DE analysis. Microarray data were pre-processed using appropriate normalization techniques to correct for systematic biases and ensure data comparability across samples. Linear models were constructed using limma, incorporating relevant experimental design factors as fixed effects. Moderated t-statistics were employed to estimate gene-wise expression differences between groups, accounting for potential variability within each group. To control for false positives arising from multiple comparisons, Benjamini–Hochberg adjusted p-values were calculated for each gene. Genes were considered differentially expressed if they met both a significance threshold (adjusted p-value < 0.05) and a fold change threshold (absolute fold change > 2). This stringent filtering strategy prioritizes genes exhibiting substantial expression changes likely to hold biological relevance within the context of our study.

Here, we directed our attention towards the Genomic Data Commons (GDC) of The Cancer Genomic Atlas (TCGA) Head and Neck Cancer (HNSC) dataset. As with the GEO dataset, our dataset selection criteria remained consistent, rigorously emphasizing the inclusion of validated studies featuring primary OSCC specimens and metastatic lymph nodes, along with accompanying clinicopathological information. This meticulous selection of publicly available datasets, each aligned with the established criteria, underpinned the robustness and validity of our data acquisition, forming the foundation for our comprehensive investigation of gene expression patterns in the context of OSCC and its metastatic potential for determining prognosis55.

Statistics

Statistical analyses were performed using GraphPad Prism 8 (GraphPad Software, Inc., CA, USA) and JMP v11 (SAS, NC, USA). Protein immunoexpression was quantitatively compared between groups using the One-way ANOVA test and Dunn's multiple comparisons test. To establish dichotomous cut-off values distinguishing high and low immunostaining, we calculated the mean final score for each marker to characterize the evaluated sample regions. Stepwise logistic regression was used to find the best-fit model to predict the probability of lymph node metastasis and survival in OSCC, based on the immunostaining of CSC and E-Cadherin biomarkers both as numerical (absolute) and dichotomized (categoric) data at the centre and ITF of primary OSCC. TNM stage was also included in the analysis to assess the incremental predictive value of biomarkers. Finally, univariate analysis of clinicopathological characteristics and protein immunostaining profiles was conducted using Fisher's exact test. Missing data were imputed using the mean values of immunostaining in both the centres and ITF regions of primary tumours and whole metastatic lymph nodes. Two-tailed p-values < 0.05 were considered for statistical significance. For validating bioinformatics analysis, Benjamini & Hochberg (False discovery rate) was used to establish an adjusted p-value.

Ethics approval and patient consent statement

This retrospective study was performed following the Declaration of Helsinki and approved by the Research Ethics Committee of Bauru School of Dentistry—University of Sao Paulo (Protocol CAAE 50,695,815.2.0000.54.17). The requirement to obtain written informed consent was waived by the Research Ehtics Committee for Project Reviews at the Clinical Hospital of the University of Sao Paulo Medical School from the Department of Head and Neck Surgery.

Results

Sample characterization

Demographic data, tumour location, and stage for the 94 patients enrolled are summarized in Table 1. Most OSCC patients were male (n = 81; 86.17%), aged from 38 to 86 years old, predominantly younger than 60 years old (n = 65; 69.15%) and all patients were smokers and drinkers. Microscopically, OSCC predominantly exhibited perineural invasion (n = 54; 57.45%) but no blood and lymphatic invasion (n = 70; 74.47%; n = 57; 60.64%, respectively). Also, the peritumoral inflammatory infiltrate was mostly moderate (n = 37; 39.36%) followed by a low density of inflammatory cells (n = 34; 36.17%). Following TNM (UICC)31 and WHO32 grading systems for OSCC, the majority of the lesions were T4 (n = 29; 30.85%) and T2 (n = 27; 28.72%), as well as N + (n = 64; 68.09%), exhibiting a moderate (n = 52; 55.32%) and well- (n = 36; 38.30%) patterns of differentiation. According to the 8th Stage edition, most patients were in the IV stage (n = 59; 62.77%), and among all patients who died, most had a survival rate of fewer than 5 years (n = 44; 46.81%).

Table 1 Clinicopathological features of OSCC patients (n = 94 patients).

Immunohistochemical labelling for CSC markers—ALDH1, CD44, p75NTR and BMI-1 and E-cadherin in OSCC and cervical lymph node metastasis

The central regions of primary tumours were primarily characterized by expansive areas populated by tumour cells exhibiting an adherent cell-like structural architecture, which varies according to tumour grade (Fig. 1A). Conversely, the ITF is distinguished by the presence of more invasive cells, often found either individually or in small clusters, some of which invade blood and lymphatic vessels (Fig. 1B), as well as perineural spaces. Remarkably, lymph node metastases exhibit distinct histological features, demonstrating disparities between central regions and extracapsular areas. In lymph nodes, it is possible to observe groups of tumour cells surrounded by lymphoid tissue (Fig. 1C). Notably, primary tumours generally displayed a consistent pattern of CSC panel staining, except for E-cadherin, which exhibited a contrasting profile. Specifically, in terms of the evaluated markers, ALDH1 immunostaining at the tumour centre areas typically manifested as sporadic cytoplasmic staining, predominantly encircling corneal pearls (Fig. 1D), whereas it was more pronounced in the expression observed in the more invasive cells found within lymphatic emboli (Fig. 1E) and metastatic tumour cells (Fig. 1F). Our observations further revealed that the final score (mean rank difference) of ALDH1-positive cells displayed a statistically significant increase in metastatic sites compared to primary tumours, both within the central and ITF regions (p < 0.0001 and p = 0.0156, respectively; Fig. 1G).

Figure 1
figure 1

Immunohistochemistry of CSC markers and E-cadherin in OSCC (Centre and ITF) and cervical lymph node metastasis. (A) Central regions of primary tumours display expansive areas with an adherent cell-like structural architecture. (B) The ITF features more invasive cells, often observed individually or in small clusters, and occasionally associated with lymphatic emboli. (C) Lymph node metastases exhibit distinct histological features compared to central regions, especially within extracapsular areas. (D) ALDH1 immunostaining in central tumour areas typically manifests as sporadic cytoplasmic staining, predominantly encircling corneal pearls. (E) ALDH1-positive tumour cells exhibit increased immunostaining within lymphatic emboli. (F) Metastatic tumour cells display notably high levels of ALDH1 staining. (G) Statistical analysis reveals a significant increase in ALDH1-positive cells within metastatic sites compared to both central and ITF regions of primary tumours. (H) CD44 demonstrates homogeneous immunostaining in central regions. (I) Uniform immunostaining within the ITF of primary tumours for CD44. (J) CD44 displays increased immunostaining at metastatic sites. (K) Notably higher CD44 immunostaining levels are observed at metastatic sites compared to central regions of primary tumours. (L) p75NTR presence is predominantly detected in metastatic primary tumours. (M) Limited and scattered expression of p75NTR in central regions. (N) Strong cytoplasmic staining for p75NTR is evident in metastatic sites. (O) No statistical significance in p75NTR expression is observed among the groups. (P) Focal nuclear immunostaining for BMI-1 is predominantly observed in both central regions. (Q) BMI-1 focal nuclear immunostaining is also observed in the ITF of primary tumours. (R) Positive BMI-1 staining is detected in only a few scattered metastatic tumour cells. (S) The majority of BMI-1 immunopositivity significantly decreases in metastatic sites compared to central areas of primary tumours. (T) E-cadherin exhibits strong and consistent membrane intensity in the central regions. (U) Heterogeneous cytoplasmic positivity for E-cadherin is observed in invasive tumour cells within lymphatic emboli. (V) Lymph node metastasis shows strong positivity for E-cadherin. (W) No significant differences in E-cadherin expression are observed among the groups.

Conversely, CD44 exhibited widespread expression in the membrane and transmembrane regions of tumour cells across all sites (Fig. 1H,I). While CD44 immunostaining was increased at metastatic sites (Fig. 1J), it was notably significantly higher than that observed in central regions of primary tumours (p = 0.0463; Fig. 1K). For p75NTR, its presence was mostly found in metastatic primary tumours, showed limited and scattered expression (Fig. 1L) in centre and ITF (Fig. 1L,M, respectively), but strong cytoplasmatic staining in metastases (Fig. 1N), although without statistical significance (Fig. 1O). Focal nuclear immunostaining for BMI-1 was also observed mostly in both regions of primary tumours (Fig. 1P,Q), with positive tumours detected only in a few and scratched metastatic tumour cells (Fig. 1R). These results demonstrated that the majority of BMI-1 immunopositivity significantly decreased in metastatic sites compared to centre areas of primary tumours (p = 0.0052; Fig. 1S). Furthermore, E-cadherin showed strong consistent membrane intensity in the central regions (Fig. 1T), displayed heterogeneous cytoplasmatic positivity in invasive tumour cells within lymphatic emboli (Fig. 1U), and lymph node metastasis (Fig. 1V) displayed strong positivity for E-cadherin. However, it did not present significant differences among groups (p > 0.05; Fig. 1W).

Immunostaining panel for OSCC poor prognosis

Multivariate analysis was performed to ascertain poor prognostic factors associated with the presence of metastasis and diminished survival rates (specifically, 1-year and 5-year survival rates) in patients with OSCC. E-cadherinlow at the centre of primary OSCC and p75NTRhigh, both categoric and absolute, in ITF of primary OSCC were independent predictors of lymph node metastasis. ALDH1high immunostaining at the ITF was a predictor of a decrease in 1-year overall survival in primary OSCC. Notable, ALDH1, p75NTR, and E-cadherin emerged as independent prognostic indicators in OSCC, exhibiting greater predictive efficacy compared to the conventional gold standard of tumour staging (Tables 2 and 3). Consequently, the potential profile associated with poor prognosis in OSCC was identified as CSChighE-cadherinlow (ALDH1highp75NTRhighE-cadherinlow) (Fig. 2A).

Table 2 Logistic regression for cervical lymph node metastasis and survival in OSCC’s central regions.
Table 3 Logistic regression for cervical lymph node metastasis and survival in OSCC’s ITF.
Figure 2
figure 2

CSC model for OSCC poor prognosis. (A) Computerized staining co-localization for CSChighE-cadherinlow (ALDH1highp75NTRhighE-cadherinlow). (B, C) Tumour cell immune profiles at the centre of primary tumours had no impact on prognosis (p > 0.05). (D) CSCintermediate E-cadherinlow, followed by the CSChigh E-cadherinlow immune profile at the ITF of primary OSCC were associated with nodal metastasis (p = 0.0179). (E) CSChigh E-cadherinlow profile was the strongest predictor of poor 5-year overall survival, compared to CSCintermediate E-cadherinlow (p = 0.0082), Non-CSC E-cadherinlow (p = 0.0002), and Non-CSC E-cadherinhigh (p = 0.0254) profiles. The CSCintermediate E-cadherinlow profile also demonstrated lower 5-year overall survival compared to CSCintermediate E-cadherinhigh (p = 0.0374), and Non-CSC E-cadherinlow (p = 0.0019). (F) shows the immune profile grading according to the presence of metastasis and survival decrease; from left to right is the most to the least related profile associated with OSCC's poor prognosis. * Means significant associations between CSChigh E-cadherinlow profile with others, whilst # means significant associations between CSCintermediate E-cadherinlow profile with others.

To substantiate this hypothesis, we conducted a rigorous validation by cross-referencing our findings with publicly accessible datasets. Subsequently, we classified the outcomes of each biomarker into distinct immunostaining profiles, taking into account their expression in both the central and ITF regions. The number distribution of profiles differed between the central regions of primary tumours and the ITF. In the primary tumour centre, the profiles were CSCintermediateE-cadherinlow (38, 40.4%), CSCintermediateE-cadherinhigh (30, 31.9%), Non-CSC Epithelialhigh (13, 13.8%), CSChighE-cadherinlow (7, 7.4%), and Non-CSC Epitheliallow (6, 6.3%). Controversially at the ITF, the profiles were CSCintermediateE-cadherinlow (36, 38.2%), CSCintermediateE-cadherinhigh (33, 35.1%), CSChighE-cadherinlow (12, 12.7%), Non-CSC Epitheliallow (8, 8.5%), and Non-CSC Epithelialhigh (5, 5.3%) (Table 4; Supplementary Fig. 2).

Table 4 CSC and E-cadherin profiles for OSCC prognosis.

In our comparative analysis, encompassing datasets from GEO and the Xena Browser, we observed consistent trends. Specifically, ALDH1 and p75NTR exhibited upregulation, while E-cadherin displayed a downregulation in primary OSCC cases that exhibited metastatic behaviour. This trend was also noted in metastatic lesions when compared to their respective primary tumours. However, it is noteworthy that these observed differences did not attain statistical significance, as indicated by adjusted p-values exceeding 0.05.

Univariate analyses aimed at predicting nodal metastasis and reduced overall survival did not yield statistically significant associations, as evidenced in Fig. 2B,C, respectively. Conversely, the most salient immunostaining profile for prognosticating a poor outcome emanated from the ITF, displaying variability. Regarding the presence of metastasis, the most significant association was with the CSCintermediate E-cadherinlow immunostaining, followed by CSChigh E-cadherinlow (p = 0.0179; Fig. 2D). Concerning survival, the CSChighE-cadherinlow immunostaining profile was the strongest predictor of poor 5-year overall survival, although the CSCintermediate E-cadherinlow profile also demonstrated lower 5-year overall survival compared to CSCintermediate E-cadherinhigh (p = 0.0374), and Non-CSC E-cadherinlow (p = 0.0019) (Fig. 2E). Figure 2F provides a schematic representation of the immunostaining profiles at the ITF related to metastasis and survival.

Thus, grouping the immunostaining levels of ALDH1, p75NTR, and E-cadherin was consistent with the outcomes from the multivariate analysis, validating that the CSChighE-cadherinlow and CSCintermediate E-cadherinlow profiles can be used as predictors of worse overall survival and metastasis in OSCC. Remarkably, considering the significant decrease in overall survival as the worst clinical outcome, we may state that the CSChighE-cadherinlow is the preferred immunoprofile associated with a poor prognosis in OSCC.

Discussion

The outcomes of our investigation propose a potential immunohistochemistry panel of biomarkers associated with an unfavourable prognosis in OSCC. This proposal is founded on the immunostaining levels of CSC and E-cadherin markers observed across distinct primary tumour compartments, all evaluated before any chemo/radio treatment. Initially, our findings revealed that tumour cells with ALDH1high, p75NTRhigh, and/or E-cadherinlow profile at the ITF were independent predictors of metastasis and lower 1-year survival. Later, we hypothesize that a collective CSChighE-cadherinlow immunostaining profile would correspond to the major subpopulation involved in OSCC poor prognosis. To validate this, we analysed the combination of ALDH, p75NTR, and E-cadherin immunopositivity for predicting poor prognosis in OSCC. We found that the CSChighE-cadherinlow and CSCintermediate E-cadherinlow profiles were most significantly associated with nodal metastasis and overall survival decrease.

In the current investigation, we initially observed a significant elevation of ALDH1+ tumour cells in metastatic sites when compared to both the centre and ITF regions of primary tumours. Importantly, our multivariate analysis unveiled that heightened ALDH1 immunostaining in tumour cells within the ITF served as an independent predictor of diminished survival, collectively underscoring the substantial potential for ALDH1 as an important CSC driver in OSCC carcinogenesis, and progression. These findings align with our prior report, which demonstrated an association between the high presence of ALDH1 in tumour cells with the angiolymphatic invasion in primary OSCC14. Moreover, analogous studies have also reported important roles of ALDH1 in carcinogenesis and aggressive tumour progression, particularly concerning lower tumour grade differentiation, the presence of lymph node metastasis, reduced overall survival, and decreased disease-free survival33,34. These attributes collectively contribute to an unfavourable prognosis35,36,37. The ALDH1 enzyme also assumes responsibility for oxidizing a diverse array of intracellular aldehydes, and within malignant cells, it may be utilized to facilitate several intrinsic surveillance mechanisms aimed at enhancing migratory capabilities, self-renewal and drug resistance properties38.

Furthermore, recent research has shed light on the roles of both ALDH1 and CD44 as indicators of breast cancer malignancy, highlighting their distinct functions in tumorigenesis and progression. Notably, these investigators demonstrated that CD44 promotes self-renewal, proliferation, and tumour growth in xenograft mouse models, while ALDH1 enhances invasion and metastasis in vitro and in vivo, respectively34. CD44, recognized as a multifunctional cell surface adhesion molecule, exhibits diverse roles in carcinogenesis39,40. In our study, initial findings revealed elevated CD44 staining in metastatic sites compared to primary tumours, although multivariate analysis did not reveal a significant association with poor prognosis. It is well-established that subpopulations of CSC coexist and display plasticity, transitioning between preferentially migratory and proliferative phenotypes41. The present study's findings suggest that, while CD44 may not directly correlate with metastatic dissemination, its heightened presence in metastatic cells could be indicative of an increased migratory and proliferative status within the CD44high subpopulation in OSCC. Accordingly, our research group previously demonstrated in a xenograft model that the CD44highESAhigh (Epi-CSC) subpopulation of LUC4 OSCC cells generated tumours more effectively and larger than the CD44highESAlow (EMT-CSC) phenotype. Similarly, Epi-CSC displayed higher colony formation capacity and proliferative activity in vitro than EMT-CSC42. Boxberg et al.43 also documented that the upregulation of CD44 at the ITF in primary OSCC tumours exhibited correlations with poor histopathological differentiation, heightened tumour budding activity, and single-cell invasion. Significantly, the overexpression of CD44 in lymph node metastases emerged as an independent prognostic determinant associated with diminished overall survival, disease-specific survival, and disease-free survival in patients afflicted with advanced OSCC. Collectively, these findings underscore the notion that CD44high tumour cells, while possessing augmented proliferative and tumorigenic potential, necessitate the involvement of EMT molecules to facilitate the migration and invasion of distant anatomical sites.

Numerous studies have also established a close association between p75NTR and the progression of OSCC, underlining its role in enhancing proliferation, invasion, and tumorigenicity across various human cancer types44,45. Notably, the expression of p75NTR, particularly at the ITF of OSCC, has been linked to reduced 5-year disease-free survival rates and an increased likelihood of recurrence, thus contributing significantly to the overall poor prognosis of OSCC46. Our findings are consistent with the existing literature, as heightened immunostaining of p75NTR in tumour cells within the ITF of primary OSCC emerged as a critical and independent predictor of nodal metastasis, as determined by multivariate analysis. This correlation can be rationalized by the observed co-expression of p75NTR with Ki-67, indicative of increased cellular proliferation capacity47,48. Nevertheless, it is well-documented that tumour cells possessing stemness properties within the ITF exhibit a heightened propensity for local metastasis through a myriad of extrinsic and intrinsic mechanisms49. Consequently, our findings substantiate the notion that p75NTR expression constitutes a pivotal step in the facilitation of effective metastasis formation and may serve as a valuable predictor of metastatic events. Still, in the context of CSC markers, we initially found a significant decrease in BMI-1 immunopositivity in nodal metastasis compared to primary tumours. Although the association of BMI-1 levels with OSCC poor outcome was not confirmed by multivariate analysis, it is known that BMI-1 is involved in cell cycle regulation, immortalization, and senescence50. We may suggest that our findings agree with the literature due to the frequent association of BM1-1high expression with early carcinogenesis in OSCC. Also, other studies have supported the correlation between the lack of BMI-1 immunoexpression and with poor prognosis of OSCC patient51,52. It is known that the CSC phenotype is a transient state, that can be regulated by the tumour microenvironment and therapeutic pressures, leading to the hypothesis that BMI-1 may act in primary tumours providing plasticity in oral tumour cells, and inducing cell migration, but not enough to maintain this phenotype at metastatic colonies.

In contrast to the heightened expression of ALDH1 and p75NTR CSC markers, our investigation revealed that low immunostaining of E-cadherin emerged as a significant and independent predictor of metastasis within both the centre and ITF regions of primary tumours. These findings align with the established role of E-cadherin in governing tumour cell migration, as evidenced by the diminished or absent expression of E-cadherin in both individual and collective invasive tumour cells at the ITF. This downregulation contributes to the induction of EMT, ultimately facilitating the metastatic cascade53. Furthermore, prior research has underscored that E-cadherin downregulation, facilitated through HSPD1 activation, promotes the migration and invasion of OSCC tumour cells into adjacent organs, thus fostering metastatic dissemination and correlating with an unfavourable prognosis54. While the precise intricacies of the molecular interplay between CSCs and EMT remain incompletely elucidated, the association between these processes is pivotal for comprehending the behaviour of tumours.

In conclusion, our study has revealed the existence of distinct tumour cell profiles within tissue samples of OSCC, characterized by variations in immunostaining levels of CSC and E-cadherin markers across primary tumours and metastatic sites. This association was in agreement with independent OSCC publically available datasets. Given that reduced survival rates serve as a pivotal indicator of a poor prognosis, the immunohistochemistry profile identified as CSChighE-cadherinlow at the ITF of primary tumours, before treatment, emerges as the preferred prognostic marker associated with unfavourable outcomes in OSCC. It is noteworthy that while CSC and E-cadherin markers have been extensively investigated independently within the context of OSCC, this study, to the best of our knowledge, represents the first substantial association of a distinct combined immunohistochemistry profile encompassing both CSC and E-cadherin markers with adverse prognostic implications in OSCC (Fig. 3).

Figure 3
figure 3

CSC markers and E-cadherin as a putative panel of poor prognosis in Oral Squamous Cell Carcinoma. (A) Tumour compartments, designed as centre and ITF, exhibited tumour cells with different patterns of CSC and E-cadherin immunopositivity (bottom right square). At the ITF, ALDH1high and p75NTRhigh immunostaining were independent predictors of decreased survival and the presence of metastasis, respectively. The most clinically relevant profile was the CSChighE-cadherinlow (ALDH1highp75NTRhighE-cadherinlow) at the ITF, which was a predictor of metastasis and lower overall survival and, could be used as a panel of biomarkers for poor prognosis in OSCC.