Main

Lung cancer is the most common cancer worldwide, in terms of both incidence and mortality. In 2008, there were 1.61 million new cases, and 1.38 million deaths due to lung cancer. The highest rates are in Europe and North America (Ferlay et al, 2010). Lung cancer patients experience a variety of distressing symptoms, which are usually present before diagnosis and continue throughout the course of the disease and treatment, adversely affecting functional status and health-related quality of life (HRQoL) (Akin et al, 2010; Tishelman et al, 2010).

With the availability of reliable and valid self-report questionnaires, HRQoL has become recognised as being important for treatment decision making (Basch et al, 2012). The importance of incorporating HRQoL into cancer research and policy formation has been emphasised by major policy making and regulatory entities (Lipscomb et al, 2007) such as the US National Cancer Institute (National Cancer Institute, 2008), the US Food and Drug Administration (Johnson et al, 2003; United States Department of Health and Human Services, Food and Drug Administration (FDA), Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER), Center for Devices and Radiological Health (CDRH), 2007) and the European Medicines Agency (EMEA/CHMP/EWP/139391/2004, 2004). There is also growing agreement amongst healthcare professionals and researchers that treatment efficacy should be judged by effects on both quantity and quality of life (Braun et al, 2011a).

Patients with lung cancer often experience an ongoing deterioration in HRQoL (Gralla, 2004). The majority of lung cancer patients (approximately 90%) are diagnosed with non-small-cell lung cancer (NSCLC) and most patients with NSCLC present with locally advanced or metastatic disease, which is incurable with existing treatment modalities (Kvale et al, 2003).

Many studies and meta-analyses have demonstrated that a patient’s baseline HRQoL can predict overall survival (OS) across different cancer types, independent of socio-demographic and other clinical prognostic factors (Ganz et al, 1991; Herndon et al, 1999; Maione et al, 2005; Gotay et al, 2008; Quinten et al, 2009; Braun et al, 2011b). A recent paper by Sloan et al (2012) demonstrated in NSCLC patients that overall HRQoL measured by a simple, single item at the time of diagnosis is a significant prognostic factor for survival. However, few studies have investigated the added value of change in HRQoL from baseline over time (Eton et al, 2003; Gupta et al, 2012). Potentially, the short-term evolution of HRQoL could improve predictive accuracy, although there is uncertainty about the optimal timeframe. The current study investigated whether changes in HRQoL scores from baseline over time were associated with survival, independent of baseline HRQoL scores, in patients with advanced NSCLC.

Patients and methods

Data collection

EORTC 08975 (NCT00003589) was a prospective, multicentre, randomised, non-blinded, phase III trial involving 29 institutions and 480 enrolled patients with advanced NSCLC. All patients had histologically or cytologically confirmed NSCLC stage IIIB or stage IV disease according to the previous staging system (1997) of the American Joint Committee on Cancer (Mountain, 1997). Patients were randomised to receive paclitaxel 175 mg m−2 followed by cisplatin 80 mg m−2 on day 1 (arm A), gemcitabine 1250 mg m−2 on days 1 and 8 and cisplatin 80 mg m−2 on day 1 (arm B), or paclitaxel 175 mg m−2 on day 1 followed by gemcitabine 1250 mg m−2 on days 1 and 8 (arm C). Treatment cycles were repeated every three weeks. Results showed no difference in OS by arm. These treatments were generally well tolerated with similar outcomes in most HRQoL parameters. Further details of the trial design, conduct and results are reported elsewhere (Smit et al, 2003).

Health-related quality of life was assessed at baseline (i.e., before treatment), at the end of each treatment cycle, then every 6 weeks until progression of the disease (PD), at PD and thereafter every 3 months until death using the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (EORTC QLQ-C30, version 3.0) in conjunction with the EORTC lung cancer module (QLQ-LC13). The EORTC QLQ-C30 contains five functioning scales (physical, role, emotional, cognitive and social), nine symptom scales (fatigue, pain, dyspnoea, appetite loss, sleep disturbance, constipation, diarrhoea, nausea and vomiting, and financial difficulties) and the global health status/QOL scale. The LC13 module is meant for use in lung cancer patients, regardless of disease stage and treatment modality (Bergman et al, 1994). The module contains eight scales assessing lung cancer-associated symptoms: dyspnoea, pain, coughing, sore mouth, dysphagia, peripheral neuropathy, alopecia and haemoptysis. For both instruments, raw scores were linearly transformed to a scale from 0 to 100 according to standard scoring procedures (Aaronson et al, 1993). High scores indicate better HRQoL for the functional scales and the global health status/QOL scale but worse HRQoL for the symptom scales. Both instruments have been extensively tested for reliability and validity (Bergman et al, 1994; Osoba et al, 1994; Groenvold et al, 1997). The minimum clinically important difference on the QLQ-C30 and LC13 scales is 10 points (Osoba et al, 1998).
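The linear transformation to a 0 to 100 scale can be sketched as follows. This is a minimal illustration that assumes the standard EORTC scoring convention (raw score equal to the mean of the answered items, functional scales reversed so that higher is better); the function name and arguments are hypothetical, and the handling of missing items is deliberately left out.

```python
from statistics import mean


def linear_score(items, item_range, kind):
    """Transform raw item responses into a 0-100 scale score.

    items: answered item responses (for example 1-4, or 1-7 for global health).
    item_range: maximum minus minimum possible response (3 for 1-4 items,
        6 for 1-7 items).
    kind: "functional", "symptom" or "global".
    """
    raw = mean(items)                    # raw score is the mean of the items
    fraction = (raw - 1) / item_range    # position within the response range
    if kind == "functional":
        # Low responses mean few limitations, so reverse the direction:
        # a high transformed score indicates better functioning.
        return (1 - fraction) * 100
    if kind in ("symptom", "global"):
        # High transformed score = more symptoms, or better global health.
        return fraction * 100
    raise ValueError(f"unknown scale kind: {kind}")


# A patient answering "not at all" (1) to every physical functioning item.
best_functioning = linear_score([1, 1, 1, 1, 1], item_range=3, kind="functional")
# A patient answering "very much" (4) to both pain items.
worst_pain = linear_score([4, 4], item_range=3, kind="symptom")
```

With this convention, the first example scores at the top of the functional scale and the second at the top of the symptom scale, which matches the direction of interpretation described in the text.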

The total number of scales for the EORTC QLQ-C30 and the LC13 is 23. To reduce the multidimensionality of the data, we excluded a priori, based on published evidence (Efficace et al, 2006), the HRQoL scales that were expected to have no prognostic value and that were highly correlated with other scales. The following scales were retained: global quality of life, emotional, social and physical functioning, nausea and vomiting, pain, appetite loss, dyspnoea (combined scale constructed from the average of the QLQ-C30 and the QLQ-LC13 dyspnoea items), coughing and dysphagia. Additional socio-demographic and biomedical variables considered in this study were age (continuous), gender, stage of disease (IIIB vs IV), histological subtype (squamous vs non-squamous) and WHO performance status (PS; 0–1 vs 2).

Statistical analysis

The analyses were split into two parts. First, the relationship between baseline HRQoL and OS was investigated for patients with a valid completed baseline HRQoL measure. Second, the relationship between change in HRQoL scores from baseline to the end of each cycle of treatment and survival was assessed for the same patient cohort. Three different change scores were calculated by subtracting the baseline score from the scores at the end of the 1st, 2nd and 3rd cycle of treatment (Braun et al, 2011a). The analysis was limited to changes from baseline up to cycle 3, as available patient numbers declined over time through attrition due to death and treatment withdrawal. Spearman rank correlation was used to summarise the relationships between explanatory variables.
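The change-score calculation described above amounts to a subtraction per cycle. The sketch below is illustrative only: the function names, the example scores and the treatment of a missing assessment as `None` are assumptions, and the 10-point threshold is the clinically important difference cited in the methods.

```python
MCID = 10  # clinically important difference in points (Osoba et al, 1998)


def change_scores(baseline, cycle_scores):
    """Return score-minus-baseline for every cycle with an available score.

    cycle_scores maps cycle number -> transformed 0-100 score, or None when
    the questionnaire was not completed (death, withdrawal, missing form).
    """
    changes = {}
    for cycle, score in cycle_scores.items():
        if score is None:
            continue  # no change score can be computed for this cycle
        changes[cycle] = score - baseline
    return changes


def is_meaningful(change, threshold=MCID):
    """True when the absolute change reaches the clinical threshold."""
    return abs(change) >= threshold


# Hypothetical pain scores: baseline 50, then cycles 1 to 3.
pain_changes = change_scores(50.0, {1: 66.0, 2: 50.0, 3: None})
```

Because pain is a symptom scale, a positive change here represents worsening; for functional scales a positive change represents improvement.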

The outcome variable was OS, measured from the date of randomisation until the date of death (due to any cause), and calculated using the Kaplan–Meier method. The prognostic value of individual socio-demographic, clinical and HRQoL variables was evaluated using univariate Cox proportional hazards (Cox, 1972) models (CPHM). Multivariate CPHM were then performed to evaluate the joint prognostic significance of variables that were shown to be univariately prognostic at the 5% level of significance. The randomised protocol treatment was included as a stratification factor.
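For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate and the median survival derived from it can be sketched in a few lines. This is a didactic sketch on made-up data, not the SAS procedure used in the study; the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up from randomisation (for example in months).
    events: 1 if the patient died at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Made-up follow-up data: five patients, one censored at 4 months.
curve = kaplan_meier([2, 4, 4, 6, 8], [1, 1, 0, 1, 1])
median = median_survival(curve)
```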

We used a stepwise method for the model selection. Stepwise procedures can reduce the problem of multicollinearity because two highly correlated predictors will normally not both be entered in the model (Braun et al, 2011a). The assumptions of the CPHM for both the univariate and multivariate analyses were assessed graphically. The prognostic value was assessed via the hazard ratio (HR), its 95% CI and the P-value of the Wald χ2-statistic. A significance level of 5% was used as a threshold for variable selection. The reported HRs of the HRQoL scales were rescaled to represent a clinically important difference of 10 points (Osoba et al, 1998). Potential influence of sample bias and multicollinearity on the results was investigated using the bootstrap resampling technique (Sauerbrei and Schumacher, 1992). This technique generates a number of samples (each the same size as the original data set) by randomly selecting patients and replacing them before selecting the next patient (i.e., bootstrap resampling). A Cox regression model was then fitted to each of these datasets using automatic stepwise selection (entry level of α=0.05). We calculated the model selection probabilities based on how many times a permissible model was selected in the bootstrap samples. These probabilities were then used as weights to obtain weighted averaged parameters. All analyses were carried out with SAS software version 9.4 (SAS Institute Inc., Cary, NC, USA).
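The resampling step can be illustrated with a short sketch that counts how often each variable is selected across bootstrap samples. It covers only the inclusion frequencies, not the weighted averaging of parameters. The `select` argument stands in for the stepwise Cox procedure, and `toy_select` below is a deliberately crude placeholder (a comparison of group means), not a Cox model; all names and data are hypothetical.

```python
import random
from collections import Counter


def inclusion_frequencies(patients, select, n_boot=1000, seed=0):
    """Fraction of bootstrap samples in which each variable is selected.

    patients: list of patient records.
    select: callable taking a resampled list and returning variable names
        (placeholder for an automatic stepwise selection procedure).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Draw patients with replacement; same size as the original data set.
        sample = [rng.choice(patients) for _ in patients]
        counts.update(set(select(sample)))
    return {var: n / n_boot for var, n in counts.items()}


def toy_select(sample):
    """Placeholder selector: keep a scale when mean scores differ by at
    least 10 points between patients who died and those who did not."""
    dead = [p for p in sample if p["died"]]
    alive = [p for p in sample if not p["died"]]
    if not dead or not alive:
        return []
    chosen = []
    for var in ("pain", "cough"):
        gap = (sum(p[var] for p in dead) / len(dead)
               - sum(p[var] for p in alive) / len(alive))
        if abs(gap) >= 10:
            chosen.append(var)
    return chosen


# Made-up records for illustration only.
patients = [
    {"died": True, "pain": 67, "cough": 33},
    {"died": True, "pain": 50, "cough": 67},
    {"died": True, "pain": 83, "cough": 33},
    {"died": False, "pain": 17, "cough": 33},
    {"died": False, "pain": 33, "cough": 67},
    {"died": False, "pain": 0, "cough": 33},
]
frequencies = inclusion_frequencies(patients, toy_select, n_boot=200, seed=1)
```

A variable that is selected in most resamples is less likely to owe its place in the final model to a few influential patients, which is the sense in which inclusion frequencies indicate model stability.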

Results

Between August 1998 and July 2000, 480 patients were randomly assigned: 159 (group A), 160 (group B) and 161 (group C). Of these, 391 (81.5%) had a valid completed baseline HRQoL questionnaire. Only patients with valid baseline scores were considered for the analyses (group A (133), group B (133) and group C (125)).

Patient characteristics at baseline in terms of stage, histology, treatment and performance status were well balanced between patients with and without available HRQoL data (Efficace et al, 2006). There was a significant difference in median survival between patients with and without baseline HRQoL data (data not shown). There were 302 deaths reported. Given the rule of thumb that CPHM should be used with a minimum of 10 outcome events per predictor (Concato et al, 1995; Peduzzi et al, 1995, 1996; Vittinghoff and McCulloch, 2007), this allows for an adequately robust analysis. The absolute value of the correlation coefficient between explanatory variables at baseline ranged between 0.03 and 0.50.
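The events-per-predictor check can be made explicit with a short calculation. The count of 15 candidate predictors is an assumption (the ten retained HRQoL scales plus age, gender, stage, histological subtype and WHO PS), not a figure reported for the trial.

```python
deaths = 302                      # outcome events reported in the text
hrqol_scales = 10                 # retained HRQoL scales (assumed count)
clinical_variables = 5            # age, gender, stage, histology, WHO PS
candidate_predictors = hrqol_scales + clinical_variables

events_per_predictor = deaths / candidate_predictors
meets_rule_of_thumb = events_per_predictor >= 10
```

Under this assumption the analysis has roughly 20 events per candidate predictor, comfortably above the threshold of 10.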

Association between baseline HRQoL and survival

Table 1 describes the results of univariate Cox regression analysis of a patient’s clinical and HRQoL scores at baseline. Gender and WHO PS (0–1 vs 2) were significantly associated with survival, whereas age and histological subtype were not. The clinical stage at diagnosis was borderline significant. The median survival for women and men was 9.6 and 7.2 months, respectively (P=0.03). The median survival for patients with good and poor WHO PS was 8.9 and 3.2 months, respectively (P<0.001). All the selected baseline HRQoL scores measured with the EORTC QLQ-C30 and the LC13 subscales were predictive of survival except for emotional functioning for the QLQ-C30. The HRs of these HRQoL scales at baseline were similar in magnitude, ranging from 0.86 (physical functioning) to 1.16 (dysphagia), after adjusting for age, gender, WHO PS, histology and stage of disease.

Table 1 Univariate Cox regression analysis of survival at baseline

The Cox multivariate regression including WHO PS, gender and the ten HRQoL scores retained WHO PS, gender, physical functioning, pain and dysphagia after application of the selection procedure. To assess how much prognostic value HRQoL can add to a clinical model, the model selection process was repeated with WHO PS, gender, age, clinical stage of disease and histological subtype forced into the model. Table 2 depicts the results of the final model. Every 10-point increase in baseline physical functioning score was associated with a 7% lower risk of death (HR, 0.93; 95% CI, 0.88–0.98), and every 10-point increase in baseline pain was associated with an 11% increased risk of death (HR, 1.11; 95% CI, 1.06–1.15). Similarly, every 10-point increase in the baseline dysphagia score was associated with a 12% increased risk of death (HR, 1.12; 95% CI, 1.04–1.20). We detected no violations of the proportionality assumptions for the variables investigated in our model.
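The link between a hazard ratio per 10 points and the percentage change in risk quoted above can be sketched as follows. In a Cox model the HR for a k-point increase is exp(k × β), where β is the per-point coefficient, so HRs on different scales convert by exponentiation. The function names are hypothetical; the HR values are those quoted in the text.

```python
import math


def percent_change_in_risk(hr):
    """Percentage change in the hazard of death implied by a hazard ratio."""
    return (hr - 1) * 100


def per_point_coefficient(hr_per_10_points, points=10):
    """Cox log-hazard coefficient for a 1-point increase in the scale."""
    return math.log(hr_per_10_points) / points


def rescale_hr(beta_per_point, points=10):
    """Hazard ratio for an increase of `points` on the 0-100 scale."""
    return math.exp(points * beta_per_point)


physical = percent_change_in_risk(0.93)   # physical functioning
pain = percent_change_in_risk(1.11)       # pain
dysphagia = percent_change_in_risk(1.12)  # dysphagia

# Round trip: the per-point coefficient rescaled to 10 points recovers 0.93.
recovered = rescale_hr(per_point_coefficient(0.93))
```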

Table 2 Multivariate Cox regression analyses of survival for socio-demographic and clinical data and for socio-demographic, clinical and HRQoL data at baseline

Association between changes from baseline HRQoL and survival

Univariate analysis of the change scores revealed that there was a correlation (HR=0.97 (pain) to 1.09 (nausea/vomiting)) between the three changes from baseline scores (i.e., baseline to 1st cycle, 2nd cycle and 3rd cycle of treatment) and the corresponding actual baseline value. Therefore, to account for potential confounding, the baseline value was added to the univariate and multivariate models investigating the association between changes from baseline HRQoL and OS.

Table 3 describes the univariate analysis of changes in scores from baseline to each chemotherapy cycle up to cycle 3. Pain and coughing were predictive for survival at cycle 1. At cycle 2, only social functioning was predictive for survival. At cycle 3, nausea/vomiting was predictive for survival. The absolute value of the correlation coefficient between explanatory variables ranged from 0.004 to 0.46, 0.004 to 0.40 and 0.01 to 0.42 at cycles 1, 2 and 3, respectively.

Table 3 Univariate analysis of change in HRQoL scores with associated HRs for death

In Table 4, pain at cycle 1 and social functioning at cycle 2 remained statistically significant in the multivariate analysis. No other HRQoL variables were statistically significant. Every 10-point increase in the pain scale from baseline to cycle 1 was associated with an 8% increased risk of death. Every 10-point increase from baseline to cycle 2 in the social functioning scale was associated with a 9% lower risk of death. There was no evidence of non-proportional hazards in the multivariate models.

Table 4 Multivariate analysis of change in HRQoL scores with associated HRs for death

Bootstrap model resampling

In order to have greater insight into the stability of the final Cox multivariate models for prognostic value of change in HRQoL from baseline, and thus evaluate the importance of a single variable being included as an independent factor, we conducted a bootstrap resampling technique based on 5,000 bootstrap-generated simulation datasets. The highest inclusion frequencies at baseline were pain (97.4%), gender (85.1%), dysphagia (78.3%), performance status (61.1%) and physical functioning (53.3%); at cycle 1, pain (72.5%), gender (65.4%), performance status (45.5%) and coughing (24.7%); at cycle 2, gender (68.1%), age (60.9%), social functioning (53.7%) and stage of disease (46.4%); and at cycle 3, age (68.4%), nausea/vomiting (44.5%) and gender (41.0%). The recorded inclusion frequencies highlight the importance of a single variable being included as an independent factor in the model. This evidence further strengthens the results obtained with the classical Cox regression analysis.

Discussion

The original analysis of this large randomised trial identified (in the multivariate analysis) WHO PS as the only key factor predicting survival in this population (Smit et al, 2003). Smit et al (2003) showed that patients with WHO PS 0–1 had a median survival of 8.5 months, whereas those with WHO PS 2 had a median survival of 3.3 months (P<0.001). In our analysis, and also in that of Efficace et al (2006), on the subgroup of patients with HRQoL data, female gender also predicted higher OS. A previously published analysis of 2531 advanced NSCLC patients also showed that female gender and good WHO PS were favourable independent prognostic factors for survival (Albain et al, 1991).

However, the analysis by Smit et al (2003) did not include HRQoL variables. The present analysis identified patients’ self-reported physical functioning (at baseline), pain (at baseline and change from baseline to cycle 1) and social functioning (change from baseline to cycle 2) as further prognostic factors for survival. We should note that the HRs of the EORTC QLQ-C30 and LC13 scales are smaller in magnitude than those for the (categorised) clinical variables. The finding that physical functioning and pain predict survival in lung cancer has previously been reported (Ganz et al, 1991; Herndon et al, 1999; Movsas et al, 2009).

However, we observed a counter-intuitive finding in the univariate analysis: self-reported improvement over time (for the functional scales and global health status) was associated with a higher risk of death. Braun et al (2011a) also reported that an improvement in social functioning at 3 months was independently associated with worse survival. Change score analyses can be biased by floor and ceiling effects at baseline, as those with extreme scores cannot improve or deteriorate beyond the maximum or minimum scores. In our sample, patients who reported a high level of symptoms at trial entry had a poor prognosis, but tended to report better outcomes over time (either through concomitant care, habituation or statistical regression to the mean). The reverse was seen in patients entering the trial with very good self-reported scores.

Our results have important implications for both clinical and research practices. They reinforce the notion that baseline HRQoL scores are prognostic and suggest that change from baseline over time in HRQoL scores, as measured on subscales of the EORTC QLQ-C30 and LC13, may provide additional prognostic value for survival. Thus, baseline HRQoL, in addition to clinical variables, should be taken into consideration when planning treatment. Also, inclusion of baseline HRQoL as a stratification factor could increase trial efficiency, create more homogeneous treatment groups and improve understanding of trial results. Our work suggests that regular HRQoL assessments during the course of treatment could provide an early signal of patient deterioration, and raises the hypothesis that interventions to improve pain, physical functioning, dysphagia and social functioning could have potential to improve survival outcomes. Appropriate care procedures should be taken when there is an indication of deterioration in a patient’s HRQoL. However, the utility of this approach to patient management should be investigated in prospective studies in patients with NSCLC.

Our study is a secondary analysis with several limitations. The median survival of the patients who provided HRQoL baseline data, in the original trial, was higher than that of patients who did not. Although our observed baseline HRQoL scores were similar to other groups of patients with the same disease (Scott et al, 2008), such missing data, which are common in HRQoL studies, may restrict the generalisability of our findings. It is possible that our study sample might reflect patients with a better baseline health condition who experienced fewer symptoms.

In conclusion, our findings lend further support to the growing body of literature suggesting that baseline HRQoL, as well as change from baseline over time in HRQoL, as measured on subscales of the EORTC QLQ-C30 and LC13, contains added prognostic value for survival in advanced NSCLC.