Main

In the years 2000–2002, the treatment of head and neck cancer evolved with the addition of cisplatin-based concurrent chemotherapy to conventional radiotherapy (RT) based on clinical trials, meta-analyses (Pignon et al, 2000), systematic reviews (Browman et al, 2001) and clinical practice guidelines (Ontario Provincial Disease Site Group, 2002). As a result, concurrent chemoradiotherapy (CRT) rapidly became the standard of care for locally advanced squamous cell carcinomas. A second clinical paradigm commenced in 2000 with the identification of human papillomavirus (HPV) as a causative agent in oropharynx cancer (OPC) (Gillison et al, 2000; Gillison, 2004), followed by the demonstrations of increasing incidence of HPV-associated OPC (Chaturvedi et al, 2008) and of dramatic improvements in outcomes for HPV-associated OPC when treated with CRT (Fakhry et al, 2008) as well as other approaches, including surgery (O’Sullivan et al, 2012; Chen et al, 2013; Iyer et al, 2015). However, the randomised controlled trials (RCT) and meta-analyses that concluded that CRT was superior to RT alone were HPV-status naïve. Since there is no specific clinical trial addressing the efficacy for HPV+ve and HPV–ve cancers, it was assumed that the 6.5% improvement (Pignon et al, 2007, 2009) applied to both patient populations. In the most recent meta-analysis, Blanchard et al (2011) established a 5-year benefit in overall survival (OS) of 8.1% (HR 0.78) specifically for the OPC patient population with CRT over RT alone, based on trials with a heterogeneous mix of radiotherapy protocols and chemotherapy protocols.

This is a retrospective population-based treatment effectiveness study designed to determine the ‘real-world’ effectiveness of CRT over conventional RT alone for HPV+ve and for HPV–ve patients. We identified all patients with OPC from Ontario, Canada in the years 1998, 1999, 2003 and 2004. The charts were reviewed, tissue was tested to determine HPV status and outcomes were compared by treatment era and by treatment, with and without HPV status. The years 1998/1999 were selected as CRT had not yet attained community practice adoption. The years 2003/2004 were selected based on data used in a previous study evaluating practice variation, survival and non-survival outcomes of CRT. The patient population, study method, treatments and results of the 2003/2004 OPC cohort without HPV status have been published (Hall et al, 2015).

Materials and methods

Background

Ontario, Canada has a population of over 12 million and all patients with head and neck cancer are treated at one of nine regional comprehensive multidisciplinary cancer centres located at or in association with the teaching hospitals of Ontario.

The three study populations

Figure 1 outlines the creation of the three study populations, the details of which are found in Supplementary Appendix 1.

Figure 1 The three study populations.

All OPC patient cohort

We identified all cases of squamous cell carcinoma of the oropharynx based on ICD codes in the Ontario Cancer Registry with date of diagnosis between 1 January 1998 and 31 December 1999. The out-patient charts with treatment records were requested from each of the Ontario Cancer Treatment Centers and the Princess Margaret Hospital, Toronto. Trained experienced abstractors at the Queen’s Cancer Research Institute reviewed and abstracted charts for data on the patients, including extent of disease, investigations, treatments (RT and chemotherapy regimens) and outcomes (death, cause of death, recurrence).

The previously reported study (Hall et al, 2015) identified a cohort of 571 out of 869 patients with squamous cell carcinoma of the oropharynx treated in Ontario, Canada in 2003/2004 using the identical strategy; when combined with the 1998/1999 cohort, this created a data set of 1028 patients from two eras.

Treatment study cohort

We created a cohort of patients treated for cure with RT or CRT only, after excluding 163 patients with other treatment strategies (surgery, other chemotherapy regimens, palliative treatment, no treatment) or missing information, leaving 865.

HPV ascertained cohort

After further exclusions, we sent for the tissue blocks on 773 patients. Tissue was obtained, tested and matched to the clinical information of 609 patients, and we also included 1 patient who had a p16 test result on the chart but whose tissue block could not be located (n=610). This was 79% of specimens requested.

Clinical variables

A detailed description of the method, clinical variables and treatments is found in the previous publication (Hall et al, 2015). The independent variables included treatment centre, comorbidity (ACE-27 (Piccirillo et al, 2004)), a smoking surrogate based on the Cumulative Illness Rating Scale (CIRS) (Linn et al, 1968), histology, extent of disease (TNM 6th edition), subsite, initial treatment, vital status and cause of death. The initial treatment variable included four options: radiotherapy (RT), CRT, surgery, or palliative treatment. These initial treatment groups included the treatment of residual disease, as that was assumed to be part of the initial planned treatment. Residual disease and recurrent disease were distinguished based on the presence or absence of a disease-free status statement on the chart. The RT cohort and the CRT cohort included patients with any of pre-treatment planned neck dissection, neck dissection post-treatment for residual disease (as above) or post-treatment resection of the primary for residual disease. The Surgery cohort included patients who had post-operative RT or post-operative CRT. The Palliative cohort was identified by a statement of intent on the chart stating that ‘palliative’ or ‘no treatment’ was the intended strategy, regardless of actual treatment administered, or by receipt of fewer than 5 fractions. The abstracted radiotherapy data included technique, as well as fractionation, overall treatment time and dose, from which a biological equivalent dose (BED) was calculated. The BED was calculated from the formula: BED=(nd)[1+d/(α/β)]−(0.693t/αTpot), where n=the total number of fractions delivered; d=the dose per fraction (Gy); α/β=10 for acute effects and tumour control and 3 for late effects; α=0.3 Gy−1; t=total days in which RT was delivered; and Tpot=potential doubling time (5.6 days) (Han et al, 2015).
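As a worked illustration of the BED formula above, the following minimal Python sketch evaluates it for a single schedule. It is not the authors' code (the analyses were run in SAS); the function name, argument names and the example schedule of 35 fractions of 2 Gy over 47 days are hypothetical, and α is taken as 0.3 per Gy with 0.693 written as ln 2.

```python
import math


def biologically_effective_dose(n_fractions, dose_per_fraction_gy, treatment_days,
                                alpha_beta=10.0, alpha_per_gy=0.3, t_pot_days=5.6):
    """BED = n*d*[1 + d/(alpha/beta)] - ln(2)*t / (alpha * Tpot).

    Defaults follow the values quoted in the text: alpha/beta = 10 for
    tumour control, alpha = 0.3 per Gy (assumed unit), Tpot = 5.6 days.
    """
    if alpha_beta <= 0 or alpha_per_gy <= 0 or t_pot_days <= 0:
        raise ValueError("alpha_beta, alpha_per_gy and t_pot_days must be positive")
    # Dose term: total physical dose scaled by the fraction-size effect.
    dose_term = n_fractions * dose_per_fraction_gy * (1 + dose_per_fraction_gy / alpha_beta)
    # Time term: dose-equivalent lost to tumour repopulation during treatment.
    repopulation_term = math.log(2) * treatment_days / (alpha_per_gy * t_pot_days)
    return dose_term - repopulation_term


# Hypothetical schedule, for illustration only: 35 fractions of 2 Gy over 47 days.
example_bed = biologically_effective_dose(35, 2.0, 47)
print(round(example_bed, 1))
```

Passing `alpha_beta=3.0` gives the late-effects version of the same calculation; a longer overall treatment time lowers the result through the repopulation term.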
Follow-up and cause of death were based on the medical chart or electronic data from either the Ontario Cancer Registry or the Ontario Registrar General.

HPV testing

Formalin-fixed paraffin-embedded blocks

Archival formalin-fixed paraffin-embedded tissue blocks were used. For each block, one section was cut and stained with haematoxylin and eosin for tumour classification. Serial 4-μm sections were cut from each patient’s representative tumour tissue for either immunohistochemistry (IHC) or in situ hybridisation.

p16 immunohistochemistry

The immunoreactivity of p16 was evaluated on all specimens obtained and detected by the Ventana Autostainer (BenchMark XT; Ventana Medical Systems, Tucson, AZ, USA) using 1 : 50 dilutions of the purified mouse anti-human p16 INK4A antibody (BD Pharmingen, Biosciences, San Jose, CA, USA). A p16-positive cervical cancer was used as a positive control. Sections without primary antibodies were used as negative controls. A tumour was considered positive when strong signals were detected in both the tumour nuclei and the cytoplasm (Shi et al, 2009; El-Naggar and Westra, 2012).

HPV16/18 viral DNA detection

In a subgroup of 289 samples with either p16-negative or p16-equivocal scoring, HPV16/18 DNA was detected using a catalysed signal amplification in situ hybridisation method (DakoCytomation, Carpinteria, CA, USA) on 4-μm-thick formalin-fixed paraffin-embedded tissue sections with modifications to the manufacturer’s instructions. Both a HPV16-positive OPC sample and a SiHa tumour served as positive controls. FaDu, a hypopharyngeal tumour cell line (American Type Culture Collection (ATCC), Manassas, VA, USA), served as the negative control. Samples were scored as positive for HPV16/18 if a punctate or diffuse pattern of signal was observed in tumour nuclei. Those with neither punctate nor diffuse signal patterns were designated as HPV-negative. All samples were scored as 0, 1+, 2+ or 3+ as previously described (Shi et al, 2009).
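The two-step testing sequence used in this study (p16 IHC first, then HPV16/18 in situ hybridisation for p16-negative or p16-equivocal samples) can be written as a small decision rule. The Python sketch below is illustrative only: the function name and string labels are hypothetical, and treating a p16-negative or p16-equivocal sample with no in situ hybridisation result as HPV−ve is an assumption made for the sketch, not a rule stated in the text.

```python
def classify_hpv_status(p16_result, ish_positive=None):
    """Combine p16 IHC and HPV16/18 in situ hybridisation (ISH) into one status.

    p16_result:   'positive', 'negative' or 'equivocal'
    ish_positive: True, False, or None when ISH was not performed
    """
    if p16_result == "positive":
        # A p16-positive tumour is classed as HPV+ve without further testing.
        return "HPV+ve"
    if p16_result in ("negative", "equivocal"):
        # ISH decides the status of p16-negative or p16-equivocal tumours.
        if ish_positive is True:
            return "HPV+ve"
        # Assumption: ISH-negative and ISH-not-performed are both HPV-ve.
        return "HPV-ve"
    raise ValueError(f"unexpected p16 result: {p16_result!r}")


# Hypothetical samples, for illustration only.
samples = [("positive", None), ("equivocal", True), ("negative", False)]
statuses = [classify_hpv_status(p16, ish) for p16, ish in samples]
print(statuses)
```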

Analysis

The data collection was performed at the Queen’s Cancer Research Institute, and analyses were completed using SAS version 9.4M1, SAS/STAT 13.1.

Descriptive

The patients, treatments, vital status and cause of death for the All OPC Patient cohort (n=1028) are reported and outcomes compared between the two eras (1998/1999 vs 2003/2004). Differences in clinical variables are reported in the text if statistically significant (P≤0.05). The 865 cases in the Treatment Study cohort are reported similarly, including a description of the radiotherapy given by the nine cancer treatment centres.

Survival

Five-year OS and disease-specific survival (DSS) are reported using Kaplan–Meier survival curves, log rank tests and hazard ratios (HR) with 95% CIs based on Cox Proportional Hazards Regression models (CPHRM) incorporating era, comorbidity, age, gender, smoking, T Category, N Category, subsite, treatment, fractionation (altered or conventional), BED and HPV status.
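For readers unfamiliar with the Kaplan–Meier method named above, the following minimal Python sketch shows the product-limit calculation. It is not the authors' SAS code; the function name is hypothetical and the six follow-up times are invented toy data, not study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    deaths_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        deaths = deaths_at.get(t, 0)
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= leaving_at[t]
    return curve


# Toy data, for illustration only (times in years; 0 = censored).
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
for time, surv in toy_curve:
    print(time, round(surv, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a step in the curve, which is why the estimate differs from a simple proportion alive.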

Survival by treatment era is compared using the All OPC Patient cohort and the HPV-Ascertained cohort. Survival by treatment is compared using the Treatment Study cohort and the HPV-Ascertained cohort.

Results

This section focuses on the HPV-Ascertained Cohort only. A complete description of the patients, details of treatments, vital status, causes of death and the survival analysis for the All OPC Patient cohort and the Treatment Study Cohorts are found in Supplementary Appendices 2 and 3.

HPV testing and results

The p16 test was +ve in 321 cases, −ve in 243 cases and equivocal in 46 cases (Figure 2). In situ hybridisation was performed on 280 of the 289 HPV−ve or equivocal cases and 71 more cases were identified as HPV+ve. Overall, 392 (64.4%) were HPV+ve and 218 (35.6%) were HPV–ve. In the 1998/1999 cohort, the incidence was 53.9% and for the 2003/2004 cohort the incidence had increased to 71.1%.

Figure 2 The results of p16 assay and in situ hybridisation (ISH) on the tissue samples from 609 patients.

Study population

Patients and treatments are described in Table 1. The 392 HPV+ve patients were younger with less comorbidity, less smoking, more tonsillar lesions, smaller T category and higher N category. In all, 199 of the 610 patients were treated with CRT including 25.3% of the HPV–ve patients.

Table 1 The patient, tumour and treatment variables for the 610 patients in the HPV-Ascertained cohort

Outcomes for HPV-ascertained cohort

Two hundred (200) patients died of OPC, including 143 treated with RT and 57 treated with CRT. A total of 108 patients died of other causes, including 89 treated with RT and 19 treated with CRT.

There was a 10% improvement in OS between 1998/1999 and 2003/2004 (log rank P=0.003); however, in the multivariable model incorporating HPV status the HR for era was 1.043 (P=0.72) (Table 2). Once the change in causation (HPV status) was taken into account, there was no improvement in OS over time with the changes in treatment.

Table 2 Hazard ratios for multiple variable regression (Cox) models for overall survival (OS) for the HPV-ascertained cohort, the HPV+ve and HPV−ve cohorts reporting hazard ratio, lower and upper 95% confidence limits and P-value

When comparing treatments (CRT vs RT), there was a similar 13% improvement in OS (log rank, P=0.004); however, in the multivariable model that incorporated HPV status, the HR was 0.983 (P=0.91) (Table 2). Table 3 presents the complete results of this analysis. When we added fractionation (conventional vs altered) or BED (either continuous or categorical (quintiles)) to the model, the comparison between treatments remained non-significant. In summary, there was no improvement in OS or DSS with CRT over RT alone once HPV status was controlled for.

Table 3 Hazard ratios (overall survival) for all variables in the HPV-ascertained cohort of 610 patients

When comparing the OS for the HPV+ve patients to the HPV−ve patients, there was, as expected, a 40% improvement in survival (Figure 3A).

Figure 3 (A) Overall survival for HPV+ve patients vs HPV−ve patients. (B) Overall survival for HPV+ve patients by era. (C) Overall survival for HPV−ve patients by era.

Outcomes for HPV+ve patients

There was no statistically significant difference in OS between the eras of 1998/1999 and 2003/2004 for the HPV+ve patients (log rank P=0.147) (Figure 3B). The use of CRT did not improve outcomes for the HPV+ve patients over time.

Similarly, when comparing treatments for the HPV+ve patients, there was no statistically significant difference in OS (log rank P=0.53) (Figure 4A) or in DSS (log rank P=0.87). In the multiple variable analysis, the HR (OS) comparing treatment was 0.948 (P=0.78) (Table 2).

Figure 4 (A) Overall survival for the HPV+ve patients comparing those treated with radiotherapy to chemoradiotherapy. (B) Disease-specific survival for the HPV−ve patients comparing those treated with radiotherapy to chemoradiotherapy.

Outcomes for HPV−ve patients

There was no statistically significant difference in OS between the eras of 1998/1999 and 2003/2004 for the HPV−ve patients (log rank P=0.362) (Figure 3C). The use of CRT did not improve outcomes for the HPV–ve group.

Figure 4B compares DSS by treatment for the 218 HPV−ve patients. The P-values for the log rank tests for OS and DSS were 0.17 and 0.91, respectively. In the multiple variable analysis, the HR for the HPV–ve group was 1.083 (P=0.73) (Table 2), indicating that the addition of chemotherapy to radiotherapy did not improve outcomes for either patient group.

Discussion

The objective of this study was to determine the effectiveness of cisplatin-based concurrent CRT over conventional RT alone for HPV+ve and for HPV–ve OPC patients by comparing outcomes by era, treatment and HPV status during a time period when oncologists were HPV-status naive. We found that OPC survival improved between 1998/1999 and 2003/2004, but that the improvement was explained by the 17% increase in the proportion of HPV+ve patients in the OPC cohort, who had a 40% better prognosis; that is, the HPV+ve cancers are more sensitive to RT (Lassen, 2010). We found that the improvement in survival seen on the Kaplan–Meier curve with addition of CRT to treatment protocols in 2003/2004 was explained by HPV status, not treatment. We found that HPV+ve cancers are different from HPV−ve cancers in behaviour as well as causation, but neither had improved outcomes with the addition of cisplatin-based concurrent chemotherapy to RT in this patient population. We did not detect an improvement in survival with CRT; however, costs, hospitalisations, gastrostomy tube insertions, acute toxicity, late toxicity and treatment-related mortality were all increased with CRT (Hall et al, 2015). Oncologists, head and neck site groups, institutions and funders of health-care systems might reconsider the evidence on which patients with OPC, whether HPV+ve or HPV−ve, receive CRT and need to weigh the potential of an at-best modest gain of 6.5% (Pignon et al, 2000, 2009) or 8.1% (Blanchard et al, 2011) against the toxicity of treatment and against the much greater established gains seen in other more common cancer sites with the addition of adjuvant chemotherapy (Pritchard et al, 2006; Gill et al, 2014). Researchers and policymakers might reconsider the evidence in this disease, given the toxicity profile, and need to be sure that future trials include treatment arms with radiotherapy only.

The strength of this study is the inclusion of the complete cohort of ‘real-world’ patients from all treatment centres in Ontario, Canada at a time when oncologists were HPV-status naïve. Other strengths include the use of patients from different eras with different rates of HPV+ve disease, the systematic data collection and the quality of the HPV tissue testing. Testing was done in one independent laboratory (the Molecular Oncology Lab, the Princess Margaret Hospital) that was blinded to patient identifiers, treatments and results.

There are potential limitations to this study, including treatment selection bias. Treatment decisions for specific patients in 2003/2004 were not based on HPV status (only three patients had p16 tests undertaken in real time) but were based on the clinical judgement of the oncologists in each centre and their interpretation of the validity of the trials, meta-analyses and guidelines about CRT. Across Ontario in 2003/2004, acceptance of the evidence and treatment varied, with 30–80% of patients with OPC receiving CRT depending on the centre, owing to concerns by oncologists about the modest improvement vs acute toxicity (Hall et al, 2015) and poor reporting of late toxicity (Trotti et al, 2003). By design we have incorporated patients who would have been offered CRT in subsequent years or at different centres, which is a strength of the study. One centre tended to select patients more liberally for altered fractionation without chemotherapy, but after removing all the patients from that centre from the analysis, the result comparing treatment effect remained unchanged.

Another potential limitation is that we could not test or obtain tissue blocks on 198 patients who had curative treatment with either RT or CRT. However, there were no statistically significant differences (chi-squared test) between these patients and the patients in the HPV-tested cohort for age, gender, comorbidity, subsite, N-category, TNM stage or treatment. There were more smokers and more patients with higher T category in the non-tested group. There was no statistically significant difference in OS comparing the 610 patients in the HPV-tested group and the 83 patients in whom no tissue block was available (P=0.4). Furthermore, there was no statistically significant difference in OS comparing the 610 tested, the combined fine-needle aspiration (FNA) and no pathology report group (35), the ‘no block available’ group and the ‘other reasons’ group (P=0.12). However, as a combined group the 198 did have poorer survival (P=0.03) compared to the tested group. This resulted from the marginally poorer OS of both the 35 FNA/no pathology report patients (P=0.11) and the 80 patients in the ‘other reasons’ group (P=0.055). It may be that more advanced disease or other complicating medical problems (increased smoking) led to lack of available tissue in these 115 patients. Finally, there is no reason to suspect a systematic patient selection bias, as we did not obtain 20–25% of specimens requested from each centre aside from one centre. That centre submitted tissue on only 40% of cases, but there was no statistically significant difference in case mix or OS (P=0.2) between the tested and non-tested patients from that centre.

Another potential limitation might relate to the determination of HPV status in this study. p16 IHC was performed first on all 610 patient samples; HPV16/18 DNA in situ hybridisation was subsequently applied to the 289 samples that were either p16-negative or equivocal. Of these 289 cases, 71 were identified as HPV in situ hybridisation-positive. Some groups have described a false-positive rate for p16 IHC ranging from 3.8 to 7.3% (Jordan et al, 2012; Seiwert, 2013). In our hands, p16 IHC performed equally robustly compared to HPV DNA in situ hybridisation (Shi et al, 2009); hence, for pragmatic and fiscal reasons, this was the approach undertaken for HPV determination. The same methodology was applied to both the earlier and later cohorts (1998/1999 and 2003/2004), identifying an increase in incidence of HPV-positive OPC from 53.9% in the earlier 1998/1999 period to 71.1% for the later 2003/2004 cohort. Hence, the overall conclusion of this study would not have been affected by the assays used to determine HPV status for this large group of patients with OPC treated in Ontario over this 6-year period.

Other potential limitations might include the use of treatments such as low-dose daily cisplatin or regimens incorporating 5FU. However, those regimens were part of the heterogeneity in the clinical trials, meta-analyses and guidelines that changed practice 2 years prior to the 2003/2004 patient cohort (Jeremic and Shibamoto, 1997; Pignon et al, 2000; Browman et al, 2001; Denis et al, 2004; Fallai et al, 2006), and there is no RCT evidence comparing low-dose daily cisplatin or regimens incorporating 5FU in this patient population. There may have been misclassified patients in the cancer registry who could have biased the selection of the study population (Hall et al, 2006). To account for this potential, we included patients with disease sites such as posterior oral cavity in the initial chart request where there might have been confusion by coders. Finally, we have used a surrogate indicator for smoking history based on pulmonary disease (Linn et al, 1968) since reliable specific smoking data were not available on all the charts.

We are not suggesting that the available clinical trials data were incorrect since the majority of reported trials did show an improvement in outcomes. However, we are intrigued that the evidence did not translate into improvements in the ‘real-world’ setting of this study at a population-based level. There are at least four possible explanations for this. The first is the fundamental difference between efficacy and effectiveness inherent in the patient selection bias, treatment bias and settings of RCTs that differ from practice in the community. Our results could be due to factors such as less healthy patients, treatment toxicity or the modifications of treatment in less motivated real-world patients. Booth and Tannock (2013, 2014) recently reviewed the ‘pros and cons’ of RCTs and population-based studies and concluded that ‘well-designed population-based outcome studies should be considered a natural step in the evolution of evidence and should be conducted in follow-up of all major randomised controlled trials’. A second reason for a difference could be the inclusion of heterogeneity in meta-analyses. Anglemeyer et al (2014) suggested this as an explanation for outcome differences between RCTs and high-quality observational studies, and in the evolution of the evidence for CRT in head and neck cancer, heterogeneity of radiotherapy and chemotherapy regimens and protocols is a feature of all the meta-analyses. A third potential reason is reporting bias. Ross et al (2015) assessed the publication rates of clinical trials registered at ClinicalTrials.gov after 31 December 1999 that were to be completed by 31 December 2005 and those that were documented as completed by 30 June 2007. They selected a random 10% of those trials, searched for publications, found that less than 46% of the trials were actually published and specifically found that only 40% of the trials funded primarily by industry were published. This was the same time frame as the trials on CRT, although it is not known if or how many trials involving CRT were not reported. If unreported trials or the data from unreported trials (especially negative trials) had been reported, the meta-analysis might have been different and perhaps similar to our findings. A fourth potential reason is confounding of some of the trials by HPV. The incidence of HPV+ve patients in the late 1980s and early 1990s was sporadic and increasing across the world (Chaturvedi et al, 2013) and, as HPV was unknown, the treatment arms of trials during that time may not have been balanced. The only RCT that compared conventional RT to platin-based CRT exclusively in patients with OPC was reported by Denis et al (2004) and by Fallai et al (2006). This was a multicentre Phase III randomised trial involving 226 patients from France diagnosed in the years 1994–1997. They reported improved overall 5-year survival (22% vs 16%, P=0.05) and DSS (27% vs 15%, P=0.01) for CRT over radiotherapy alone. This study, however, was not balanced for histology, as more patients in the CRT group had poorly differentiated or unreported tumour cell differentiation. As poor differentiation is commonly associated with HPV+ve disease, the CRT group may have been destined for a better prognosis; had the arms been balanced, their reported difference of 6% (P=0.05) would likely have been reduced and the result (the evidence that changed practice) might not have been statistically significant. In summary, there are many potential reasons that might explain why the evidence did not translate into effectiveness for our patient population.

The reporting of site of relapse, prognostic factors, the reduction of distant metastases and the impact of smoking (Ang et al, 2010; O’Sullivan et al, 2013) were beyond the scope of this initial report and will be published subsequently.

Conclusion

The effectiveness of CRT over RT alone, both over time and by treatment, in Ontario, Canada in 2003/2004 was confounded by HPV status, and CRT did not improve outcomes for OPC overall, for HPV+ve or for HPV−ve patients. Our current treatments are associated with high rates of acute toxicity, high rates of late toxicity and increased costs. Oncologists, head and neck site groups, institutions, researchers and funders of health-care systems might reconsider the evidence on which cisplatin-based CRT is used, might further question the risks vs the benefits and should be sure future clinical trials include treatment arms that reduce toxicity, such as RT alone.