Main

Oral tongue squamous cell carcinoma (OTSCC) is the most common malignancy of the oral cavity. OTSCC is increasing in incidence (Patel et al, 2011; Ng et al, 2016), and has an aggressive clinical behaviour with a relatively poor prognosis (Bello et al, 2010a). The 5-year relative survival rate was 63% in a recent report from the Netherlands (van Dijk et al, 2016). In the United States, almost 16 400 new cases of tongue cancer and 2400 deaths due to this cancer are projected to occur during 2017 (Siegel et al, 2017).

Predicting the outcome of OTSCC patients is important when planning treatment. In early stages (cT1–T2) of OTSCC, which are expected to have favourable prognosis, the cancer-related mortality affects about 19% of patients (Almangush et al, 2015). This indicates the need for better prognostic tools. Identification of robust prognostic biomarkers that could accurately predict the behaviour of OTSCC will aid in the selection of appropriate treatment strategies.

Recent research in the field of molecular pathology has introduced thousands of tumour biomarkers, which are associated with the progression and/or prognosis of different cancers. Many of these biomarkers have been evaluated for their prognostic power in OTSCC. However, no molecular biomarker has yet been approved for use in the daily management of OTSCC, although several have been presented as promising prognosticators that could add value to the classical ones such as stage, tumour grade and depth of invasion.

The task of the current systematic review of literature was to retrieve original studies that have examined the prognostic value of immunohistochemical biomarkers of OTSCC and to meta-analyse the studies of the most repeatedly reported biomarkers. We also summarise the current understanding of the topic and highlight the main shortcomings of the published studies to improve future research in this field.

Materials and methods

Search strategy

A search strategy combining the terms (tongue) AND (cancer* OR squamous cell carcinoma* OR neoplas* OR tumo*) AND (prognos* OR predict* OR surviv* OR recur* OR mortal* OR metasta*) AND (immunohisto* OR protei* OR marke* OR biomark*) was developed. The search terms were entered into Scopus, Ovid Medline, Web of Science and Cochrane Library (1985–2015).
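The Boolean structure of the search string can be sketched in a few lines of Python. This is only an illustration of how the four term groups combine (OR within a group, AND between groups); the helper name `build_query` is hypothetical, and each database may require its own field tags and syntax that are not shown here.

```python
# Term groups copied from the search strategy described above.
TERM_GROUPS = [
    ["tongue"],
    ["cancer*", "squamous cell carcinoma*", "neoplas*", "tumo*"],
    ["prognos*", "predict*", "surviv*", "recur*", "mortal*", "metasta*"],
    ["immunohisto*", "protei*", "marke*", "biomark*"],
]


def build_query(groups):
    """Join terms within a group with OR and join the groups with AND.

    Hypothetical helper for illustration; not part of any database API.
    """
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the term groups as data makes it easy to rerun the same strategy with an extended date range or an extra synonym.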

In advanced search, the following search fields were included: title, abstract, subject heading, and keyword. The Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) were utilised (Moher et al, 2009).

Screening

Two independent researchers (AA & IH) examined the retrieved hits, and discarded duplicated ones. We also excluded unrelated studies through careful browsing of the title and/or abstract of each publication. Existing review articles for prognostic biomarkers of tongue cancer (Ferrari et al, 2009; Bello et al, 2010a, 2010b) were screened for papers missed in the search strategy.

Data extraction

For relevant articles, we retrieved information about the name of the first author, country, year of publication, number of patients and the immunohistochemical biomarker/s examined. For those biomarkers which were reported repeatedly, further data including the primary antibody used and its dilution, unadjusted and adjusted analyses, statistical results reported (estimated hazard ratio (HR), 95% confidence interval (CI) and P-value) were retrieved when available.

Exclusion criteria

  1. Studies in languages other than English.

  2. Data based on animal samples.

  3. Studies on cancers other than SCC, or on rare histological variants of SCC.

  4. Studies including samples from other subsites of the oral cavity, and studies that mixed samples from the oral tongue and the base of the tongue.

  5. Studies that did not report the prognostic value of the biomarker (i.e., studies that only reported association between the biomarker and classical parameters, such as stage and grade, but did not provide results for the association between the biomarker and survival outcomes).

Quality assessment

The guidelines from Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) (McShane et al, 2005; Altman et al, 2012b) were used to evaluate the quality of studies that were eligible for the meta-analyses of the five most often reported biomarkers. The selected guidelines taken from the REMARK criteria are summarised in Table 1.

Table 1 Evaluation criteria used to assess the quality of studies included in the meta-analysis of the five most often reported biomarkers (adapted from REMARK guidelines)

Statistical analysis

A meta-analysis on overall survival (OS) was performed for each of the five selected biomarkers, including only those original studies which provided an estimate of the HR and the associated 95% CI for the contrast of interest. As most studies provided only unadjusted or ‘univariate’ estimates of HR, the meta-analysis was primarily based on them. However, where a study reported only an HR adjusted for other prognostic factors using multiple Cox regression, this ‘multivariate’ estimate was entered into the analysis. Depending on the biomarker, the direction of the HR contrast was either positive vs negative (high vs low; applied for p53, Ki-67, VEGF and cyclin D1) or negative vs positive (low vs high; applied for p16). Where an individual study reported the HR estimate in the opposite direction, the inverse HR and CI were calculated to obtain results concordant with other studies. HRs and CIs were transformed into log(HR) and its standard error (s.e.). Pooled estimates of the HR were computed both by a fixed-effect and by a random-effects model based on the generic inverse variance approach (Schwarzer et al, 2015). In the random-effects analysis the between-studies variance was obtained by the Sidik–Jonkman method (Sidik and Jonkman, 2005). The I² statistic and tau-squared, the estimated heterogeneity variance, were used to measure the heterogeneity of HRs between studies. Analyses were performed and forest plots created using the functions ‘metagen’ and ‘forest’ contained in package ‘meta’ (Schwarzer, 2007) of the R environment (R Core Team, 2016).
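The transformation and fixed-effect pooling steps described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the analyses: it assumes that the standard error is recovered from the width of the 95% CI on the log scale, the function names are hypothetical, and the two input studies are invented numbers rather than data from the review.

```python
import math

Z_975 = 1.959964  # standard normal quantile behind a two-sided 95% CI


def log_hr_and_se(hr, lower, upper, invert=False):
    """Return (log(HR), s.e.) from an HR and its 95% CI.

    With invert=True the contrast direction is flipped first, so that
    HR -> 1/HR and the CI limits are inverted and swapped.
    """
    if invert:
        hr, lower, upper = 1.0 / hr, 1.0 / upper, 1.0 / lower
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    return math.log(hr), se


def fixed_effect_pool(estimates):
    """Generic inverse-variance fixed-effect pooling of (log_hr, se) pairs.

    Returns the pooled log(HR), its s.e., Cochran's Q and I-squared (%).
    """
    weights = [1.0 / se**2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical studies (illustrative numbers only).
studies = [
    log_hr_and_se(2.0, 1.0, 4.0),
    log_hr_and_se(0.4, 0.2, 0.8, invert=True),  # reported in the opposite direction
]
pooled, pooled_se, q, i2 = fixed_effect_pool(studies)
print(f"pooled HR = {math.exp(pooled):.2f}, I2 = {i2:.1f}%")
```

Working on the log scale makes the CI symmetric around the estimate, which is why the standard error can be read off the CI width.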

Results

Search results

Our search retrieved 1817 hits, and 771 of these were relevant for our study (Figure 1). A total of 174 studies remained that evaluated a total of 184 biomarkers for OTSCC. Of these, 32 biomarkers (17.4%) were reported at least in three studies (Figure 2). Of the latter, five biomarkers including p53, Ki-67, p16, vascular endothelial growth factors (VEGFs) and cyclin D1 were reported more often than any other (Tables 2, 3, 4, 5, 6; Supplementary Tables 1–5).

Figure 1

PRISMA flowchart: studies included and excluded along the various steps.

Figure 2

Biomarkers evaluated in three studies or more for their prognostic value in OTSCC.

Table 2 Summary of studies assessing the prognostic value of p53 in OTSCC providing unadjusted or adjusted estimates of HR and their 95% CIs for one or more endpoints, the HRs contrasting positive to negative expression
Table 3 Summary of studies assessing the prognostic value of Ki-67 in OTSCC providing unadjusted or adjusted estimates of HR and their 95% CIs for one or more endpoints, the HRs contrasting positive to negative expression
Table 4 Summary of studies assessing the prognostic value of p16 in OTSCC providing unadjusted or adjusted estimates of HR and their 95% CIs for one or more endpoints, the HRs contrasting negative to positive expression
Table 5 Summary of studies assessing the prognostic value of vascular endothelial growth factors (VEGFs) in OTSCC providing unadjusted or adjusted estimates of HR and their 95% CIs for one or more endpoints, the HRs contrasting positive to negative expression
Table 6 Summary of studies assessing the prognostic value of cyclin D1 in OTSCC providing unadjusted or adjusted estimates of HR and their 95% confidence intervals for one or more endpoints, the HRs contrasting positive to negative expression

A clear majority (86%) of the studies reported at least one biomarker as a promising prognosticator for OTSCC based on obtaining a ‘statistically significant’ result for an association of that biomarker with at least one outcome variable considered. Most of the studies on these five biomarkers were based on quite small cohorts (<100 patients), and in the analyses of the outcome they commonly mixed early and late stage cancers (Tables 2, 3, 4, 5, 6; Supplementary Tables 1–5). It was also common to exclude a biomarker from an adjusted analysis using Cox regression, when the biomarker turned out to be statistically ‘non-significant’ in an unadjusted analysis, but biomarkers ‘significant’ in an unadjusted analysis were often included in an adjusted analysis. Many authors, though, presented the results based on unadjusted analysis only.

Most of the publications on these frequently studied biomarkers did not completely fulfil the REMARK guidelines (Altman et al, 2012b). We used the selected criteria listed in Table 1 to evaluate the most often studied biomarkers, that is, those for which several studies reported estimated HRs and their 95% CIs for at least one survival endpoint and more than one study analysed OS, making them eligible for our meta-analyses (Tables 2, 3, 4, 5, 6). We particularly noted that guideline no. 5 (related to statistics) was not fulfilled in several studies (e.g., adjusted analysis using multiple Cox regression was not conducted), and guideline no. 1 (related to the patient series) was sometimes ignored by the authors (e.g., medical treatment applied to the patients was not well explained). In addition, guideline no. 6 (related to classical prognostic factors) was not fulfilled properly in many studies (e.g., the relationship between the biomarker/s and classical prognostic factors was not reported) (Tables 2, 3, 4, 5, 6).

Results of meta-analyses

The results of the meta-analyses on OS for the five most often studied biomarkers, each being evaluated in at least two studies that reported the necessary statistical data, are summarised in Figures 3 and 4. The number of eligible studies was very small, ranging from two for cyclin D1 to six for p16, with the remaining biomarkers having three studies each. Essential heterogeneity in the HRs across the individual studies was observed for Ki-67, VEGFs and p16, whereas the pertinent measures (I² and tau-squared) had very low values for p53 and cyclin D1. The point estimates and the error margins of the pooled HR from the fixed-effect model and from the random-effects model were very similar for p53 and cyclin D1, whereas for Ki-67, VEGFs and p16 the 95% CI of the pooled HR was, as expected, substantially wider when based on the random-effects model. Thus, it is reasonable to focus on the results of the random-effects model. On the basis of the available data there was not sufficient evidence for p53, Ki-67 and p16 to be informative prognostic biomarkers. The two available studies on cyclin D1 suggest that this biomarker could be a useful prognosticator worth further evaluation. For cyclin D1 the pooled HR estimate was 2.86 (95% CI from 1.34 to 6.08). As regards VEGFs, the results were mixed, with a very wide CI for the HR from the random-effects model. However, two of the VEGF studies analysed VEGF-A and one analysed VEGF-C. A positive or high VEGF-C value was shown to be associated with an improved prognosis (Morita et al, 2014), which was in sharp contrast with the two other studies analysing the expression of VEGF-A, both indicating a much worse survival for a positive VEGF-A value. When the VEGF-C study was excluded from the meta-analysis, the pooled estimate for HR was 7.34 (95% CI from 2.32 to 23.22), which provides rather strong evidence for VEGF-A being a useful prognostic biomarker.
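Why heterogeneity widens the random-effects interval can be seen in a small Python sketch. Two caveats apply: the between-studies variance is estimated here with the simpler DerSimonian–Laird moment estimator, not the Sidik–Jonkman method used in the analyses, and the three input studies are hypothetical numbers chosen to disagree strongly, not data from Figures 3 and 4.

```python
import math

Z_975 = 1.959964  # standard normal quantile behind a two-sided 95% CI


def pool_fixed_and_random(log_hrs, ses):
    """Pool log(HR)s under fixed-effect and random-effects models.

    tau2 is the DerSimonian-Laird estimate (a stand-in for Sidik-Jonkman).
    """
    k = len(log_hrs)
    w = [1.0 / s**2 for s in ses]
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    fixed_se = math.sqrt(1.0 / sum(w))
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to every within-study variance.
    w_re = [1.0 / (s**2 + tau2) for s in ses]
    rand = sum(wi * y for wi, y in zip(w_re, log_hrs)) / sum(w_re)
    rand_se = math.sqrt(1.0 / sum(w_re))
    return {"fixed": fixed, "fixed_se": fixed_se,
            "random": rand, "random_se": rand_se, "tau2": tau2}


def ci95(log_hr, se):
    """Back-transform a log(HR) and its s.e. to a 95% CI on the HR scale."""
    return math.exp(log_hr - Z_975 * se), math.exp(log_hr + Z_975 * se)


# Hypothetical, strongly discordant studies (illustrative only).
log_hrs = [math.log(0.5), math.log(2.0), math.log(6.0)]
ses = [0.3, 0.3, 0.3]
res = pool_fixed_and_random(log_hrs, ses)
print("fixed-effect 95% CI:  ", ci95(res["fixed"], res["fixed_se"]))
print("random-effects 95% CI:", ci95(res["random"], res["random_se"]))
```

When tau-squared is close to zero the two sets of weights coincide, which is consistent with the near-identical fixed-effect and random-effects results reported for p53 and cyclin D1.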

Figure 3

Forest plots for the pooled analyses of the biomarkers that have been studied most frequently in OTSCC but did not have prognostic usefulness in OTSCC. (A) p53 studies, (B) Ki-67 studies, and (C) p16 studies.

Figure 4

Forest plots for the pooled analyses of the biomarkers that have been studied most frequently in OTSCC and have shown prognostic usefulness in OTSCC. (A) VEGFs studies and (B) cyclin D1 studies.

Discussion

Molecular biomarkers may highlight biological differences between cancers and help to prognosticate patient outcome. During the last three decades (1985–2015), more than one hundred molecular biomarkers (identified by immunohistochemistry) were introduced as prognosticators for OTSCC. Five biomarkers including p53, Ki-67, p16, VEGFs and cyclin D1 were most often reported, and their biology has been reviewed elsewhere (Oliveira and Ribeiro-Silva, 2011; Wang et al, 2013). According to the findings of the present meta-analysis, the prognostic usefulness of VEGF-A and cyclin D1 is worth further evaluation. On the other hand, for p53, Ki-67 and p16 there appears to be insufficient evidence for any prognostic value for OTSCC.

The evaluation of molecular biomarkers in different subsites of the oral cavity is common in the literature. However, variations in the immunohistochemical staining results reflect variations in proteomic (and genomic) properties of SCC between different oral subsites. For example, various immunohistological biomarkers analysed in OTSCC and buccal carcinoma samples did not associate with survival in OTSCC, whereas some of them were prognostic in buccal carcinoma (Sathyan et al, 2006; Trivedi et al, 2011). The histological structures and the carcinogenesis are different in buccal mucosa and in oral tongue. Similarly, the base portion and the oral (mobile) portion of the tongue have differences in the etiopathogenesis of the cancer. In the base of the tongue, HPV is commonly linked with the cancer, whereas the virus is rarely found in the mobile tongue. However, several studies still combine oral and base of the tongue SCC samples, or they do not specify for which part of the tongue the analyses were done (Supplementary Tables 1, 2, 4 and 5). To avoid the effect of tumour heterogeneity, we included in the present meta-analyses studies in which the authors defined their cohort as OTSCC. Our meta-analysis and conclusions are based on those studies which included previously untreated, surgically resected, primary OTSCC.

Almost all of the most reported biomarkers (except VEGFs, which contribute to tumour angiogenesis) reflect important growth-related properties of the cancer cells. However, non-neoplastic cells of the tumour stroma, including fibroblasts, endothelial cells and inflammatory cells, also seem to have a critical role in cancer progression (Marsh et al, 2011). Accordingly, biomarkers of the stromal microenvironment might have an even greater impact on prognosis than biomarkers related to tumour cells (Marsh et al, 2011). However, biomarkers related to the tumour microenvironment, such as Activin A (Kelner et al, 2015), are not yet widely studied in OTSCC. The only tumour microenvironment biomarker that was reported repeatedly in OTSCC is the cancer-associated fibroblast identified by an α-smooth muscle actin antibody (Supplementary Table 6). Further studies should also focus on the evaluation of promising biomarkers of the tumour microenvironment, such as fibronectin and tenascin-C (Sundquist et al, 2017), in addition to biomarkers related to cancer cells.

Notably, 152 biomarkers (83%) have been studied only once or twice, so it is not possible to reach trustworthy conclusions on the basis of such limited evidence. About 86% of the studies claimed to have found at least one biomarker to have prognostic value. Negative or ‘non-significant’ prognostic finding as the only result was not widely published (about 14%). This might well indicate a fair amount of publication bias as noted by Soland and Brusevold (2013). However, the articles which reported new promising biomarkers have also usually evaluated the prognostic significance of previously known biomarkers (e.g., Ki-67, p53, and p16). This approach helps to validate previously published data and allows the accumulation of evidence for known biomarkers. Relationships between biomarkers and clinicopathologic manifestations, other than survival, have also been studied in several publications (Albert et al, 2012; Wang et al, 2012). However, even though it is important to understand such relationships, clinically the most relevant information is provided by proper survival analysis of the tested biomarker.

REMARK guidelines (Altman et al, 2012b) have suggested items to be reported in prognostic studies of tumour markers. Tables 2, 3, 4, 5, 6 show that many studies on p53, Ki-67, p16, VEGFs and cyclin D1 did not follow REMARK criteria accurately. This indicates shortcomings in the reporting of the biomarkers tested in these studies, and subsequently limits the possibilities to reach definitive conclusions of their usefulness.

To date, none of the most often reported biomarkers can be recommended as prognosticators valid for clinical use. This may be related to the mixing of early and late stages of OTSCC in the same analysis, and to the small cohort sizes or numbers of events seen in many studies. In addition, antibodies supplied by different manufacturers, variable staining conditions, and different cut-off values were also seen. All of these factors might affect the published biomarker results.

Even though it is highly desirable to apply multiple Cox regression or similar predictive modelling (‘multivariate’ analysis) to adjust for important classical prognostic factors (like TNM stage and histologic grade) and relevant patient characteristics (like age), limited cohort size does not allow conducting a regression analysis of good predictive performance, and might affect the validity of the estimation results (Ogundimu et al, 2016). Moreover, numerous studies which included small cohorts (and small numbers of events) have reported regression analyses with many variables, and such studies may run the risk of overfitting. A rule of thumb of 10 outcome events for each prognostic term involved in multiple Cox regression is widely advocated (Peduzzi et al, 1995; Ogundimu et al, 2016), but in many instances at least 20 events per term would be needed for reliable modelling (Ogundimu et al, 2016). It seems that no consideration has been devoted to this issue in most of the studies applying Cox regression. Another common statistical shortcoming was to report the P-value only and to lean on it when making judgments of the prognostic value of a biomarker depending on whether the result was ‘significant’ (when P<0.05) or ‘non-significant’ (when P≥0.05). Such statistical malpractice meant that many studies were not eligible for inclusion in the current meta-analyses and may have caused false inferences in some studies; it should therefore be avoided in future studies. More useful results can be obtained by reporting the estimated HR, preferably an adjusted one, and its CI, as is recommended by REMARK guidelines (Altman et al, 2012a). When predicting mortality, it was quite common to report OS but not cancer-specific mortality. Analysis by cause of death would, however, provide more relevant prognostic information than that of OS only (Läärä et al, 2016). For better statistical analysis, guidelines for improving statistical reporting (Greenland et al, 2016) should be followed, the authors should involve a statistician, and journals should assign an experienced statistician as one of their reviewers.
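The events-per-term rule of thumb discussed above translates into a simple check before fitting a multiple Cox regression. The Python sketch below is illustrative only: the function name and the example of 45 events are hypothetical, and the thresholds of 10 and 20 events per term are the ones cited in the text.

```python
def max_prognostic_terms(n_events, events_per_term=10):
    """Largest number of model terms supported by the observed outcome events.

    Hypothetical helper: it divides the number of events (deaths, not
    patients) by the required events per term and rounds down.
    """
    if n_events < 0 or events_per_term <= 0:
        raise ValueError("n_events must be >= 0 and events_per_term > 0")
    return n_events // events_per_term


# Hypothetical cohort with 45 deaths during follow-up.
n_events = 45
print(max_prognostic_terms(n_events))      # rule of 10 events per term
print(max_prognostic_terms(n_events, 20))  # stricter rule of 20 events per term
```

The count that matters is the number of outcome events, so a cohort of fewer than 100 patients with a favourable prognosis may support only a handful of terms.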

Recently, the digital evaluation of biomarkers has been suggested to facilitate the evaluation and to avoid inter- and intra-observer variability (Bouzin et al, 2016). So far only a few studies have used this technique in the prognostication of OTSCC (Hannen et al, 2001, 2002), and none of the studies in the meta-analysis had applied digital scorings. In the future, image analysis will most likely be more commonly applied also for OTSCC scorings.

Regarding the global distribution of the studies, the Japanese population was widely represented in studies of p53, Ki-67, VEGFs and cyclin D1 (Table 7). However, the studies included in our meta-analyses (Figures 3 and 4) were conducted in a different population (marked with an asterisk in Table 7).

Table 7 Distribution of OTSCC immunohistochemical prognostic biomarkers studies according to countries

The primary treatment was surgical OTSCC resection in all of the publications included in the meta-analyses. However, one of the p16 studies included cases with either surgical excision or external beam radiotherapy for patients unwilling/unfit for surgery (Ramshankar et al, 2014). None of the studies in the meta-analyses mentioned any preoperative therapy, which could have led to molecular changes in the resected tumour tissue. Neck dissection, radiotherapy and/or chemotherapy are used for selected cases of OTSCC, especially those in advanced stages. Such variations in treatment modalities must have an impact on the prognosis, as well as on the prognostic analyses. A limitation of our meta-analyses is therefore that the included cases were not homogeneous with regard to neck dissection or other adjuvant therapies. The results of our meta-analyses must, in any event, be interpreted with due caution because of various shortcomings in the published studies and their reporting. First, for each biomarker very few original studies were available that provided the necessary statistical data. Second, essential heterogeneity was evident within and across individual studies with respect to the spectrum of patients, staining methods, and so on. Third, the numbers of patients and outcome events were mostly small, implying poor statistical precision. Fourth, only crude HR estimates, unadjusted for classical prognostic factors, were typically reported, leaving room for an unknown amount of bias in the assessment of the prognostic power of the pertinent marker due to uncontrolled confounding.

Conclusion

The identification of biomarkers may increase the possibility of detecting cancers with high risk for poor prognosis. Here, for the first time, we systematically reviewed the literature for immunohistochemical prognostic biomarkers for OTSCC. On the basis of our meta-analysis VEGF-A and cyclin D1 may be useful prognostic biomarkers for OTSCC. Therefore, these biomarkers, in addition to the new biomarker candidates, should be further evaluated carefully following the REMARK criteria using large, preferably multicentre, cohorts of OTSCC cases separated into early and late stages.