Quality of life and cost-effectiveness of interferon-alpha in malignant melanoma: results from randomised trial

A definitive conclusion regarding the value of low-dose extended duration adjuvant interferon-alpha therapy in the treatment of malignant melanoma is only possible once data on health-related quality of life (HRQoL) and costs have been considered. This trial randomised 674 patients to interferon alpha-2a (3 megaunits three times per week for 2 years or until recurrence) or placebo. Quality of life (QoL) was to be assessed up to 60 months using the European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30. Data for the economic analysis, including cost information and the EQ-5D, were also collected. Patients in the observation (OBS) group had significantly better mean follow-up quality of life on five dimensions of the EORTC QLQ-C30 functional scales: role functioning (P=0.033), emotional functioning (P=0.003), cognitive functioning (P=0.001), social functioning (P=0.003) and global health status (P=0.001). Patients in the OBS group had significantly better mean follow-up symptom scores on seven dimensions of the EORTC QLQ-C30 V1 symptom scales. Economic data showed that costs were £3066 higher in the interferon group, producing an incremental cost per quality-adjusted life year of £41 432 at 5 years. The results show that interferon has significant effects on QoL and symptomatology and is unlikely to be cost-effective in this patient group in the UK.

The role of interferon-alpha in malignant melanoma has long been debated and researched, with over 6000 patients entered into trials (Molife and Hancock, 2002). One aspect of this has been the effectiveness of low-dose extended duration adjuvant therapy. A recent study, the AIM-High trial, in 674 patients with thick primary cutaneous melanoma showed no significant difference in overall survival or recurrence-free survival up to 5 years (Hancock et al, 2004). This, together with other trial evidence (Ascierto et al, 2006), points to there being no routine role for low-dose therapy within this patient group. A definitive conclusion is not possible, however, until data on health-related quality of life (HRQoL) and costs have been considered alongside the survival data. It is possible that an HRQoL advantage exists, or that the cost differentials are such that the treatment may be considered cost-effective even in the face of nonsignificant clinical findings.
Data from within AIM-High have been reported on toxicity and change in Karnofsky Performance Status (Hancock et al, 2004), but these data offer only a partial view of HRQoL. This paper reports on the HRQoL data from AIM-High plus cost and cost-effectiveness as estimated by an incremental cost per quality-adjusted life year (QALY).

MATERIALS AND METHODS
Patients within the study were randomised to either interferon alpha-2a at 3 megaunits three times per week for 2 years or until recurrence, or placebo. The study protocol was approved by the relevant research ethics committee and all participating patients gave informed written consent.
HRQoL data in the form of the European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 were originally intended to be collected at baseline, 3, 6, 12, 24, 36, 48 and 60 months for a subgroup of patients. However, HRQoL data were actually collected at a variety of time points postrandomisation, from 3 days to 77 months. Data for the economic analysis, including cost information and the EQ-5D, were collected at 3, 6, 12, 24, 36, 48 and 60 months. These economic data were only collected for a subgroup of patients, selected as every fifth patient to enter the study.
The European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 is a 30-item cancer-specific instrument designed to assess the health-related quality of life (QoL) of cancer patients participating in international clinical trials (Aaronson et al, 1993). The QLQ-C30 version 1.0 used in the AIM-High Trial incorporated five functional scales: physical (PF), role (RF), cognitive (CF), emotional (EF) and social (SF); three symptom scales: fatigue (FA), pain (PA), nausea and vomiting (NV); a global health status/QoL scale (QL) and six single items assessing additional symptoms commonly reported by cancer patients: dyspnoea (DY), loss of appetite (AP), insomnia (SL), constipation (CO), diarrhoea (DI) and a single item on the perceived financial impact of the disease (FI). All of the scales and single-item measures range in score from 0 to 100. A high scale score represents a higher response level. Thus a high score for a functional scale represents a high/healthy level of functioning; a high score for the global health status/QoL represents a high QoL; but a high score for a symptom scale/item represents a high level of symptomatology/problems.
Patient utilities were obtained from the EQ-5D questionnaire. The EQ-5D is a five-dimensional health state classification. The five dimensions are mobility, self-care, usual activities, pain/discomfort and anxiety/depression. These five dimensions are each assessed by a single question on a three-point ordinal scale (no problems, some problems, extreme problems). An EQ-5D 'health state' is defined by selecting one level from each dimension. A total of 243 health states are thus defined. Values or preference weights for a sample of these health states were obtained from a general community sample using a Time-Trade-Off (TTO) technique. Estimates for all health states were extrapolated from this sample by statistical regression modelling.
The EQ-5D preference-based measure can be regarded as a continuous outcome scored on a -0.59 to 1.00 scale, with 1.00 indicating 'full health' and 0 representing dead. The negative EQ-5D scores represent certain health states valued as worse than dead.
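The figure of 243 health states follows from five dimensions with three levels each (3^5 = 243). As an illustrative sketch only, the enumeration can be written as below. The function name and the five-digit labels (for example '11111' for no problems on any dimension) are conventions assumed here for illustration, not part of the trial's analysis.

```python
from itertools import product

DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (1, 2, 3)  # 1 = no problems, 2 = some problems, 3 = extreme problems


def enumerate_health_states():
    """Return every EQ-5D health state as a five-digit label, one digit per dimension."""
    return [
        "".join(str(level) for level in state)
        for state in product(LEVELS, repeat=len(DIMENSIONS))
    ]


states = enumerate_health_states()
print(len(states))            # 243
print(states[0], states[-1])  # 11111 33333
```

Each label would then be mapped to a preference weight by a published value set; that mapping is not reproduced here.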
Data on resource use covered all key areas of care: interferon dose, inpatient and outpatient hospital care, and community nurse and general practitioner care. Data on interferon were collected via the study case report form as completed by the study clinician or research nurse. The other economic data, including the EQ-5D, were collected through a patient-completed questionnaire.

Analysis
Data for the EORTC QLQ-C30 were scored using the EORTC Scoring Manual (Fayers et al, 1995). EQ-5D data were scored using UK population values (Dolan, 1997), and combined with mortality data to calculate QALYs (Drummond et al, 1997). As EQ-5D values were missing at baseline, these were imputed from a regression of EORTC QLQ-C30 responses on EQ-5D values from other visits.
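To illustrate how utility values and survival time combine into QALYs, the sketch below takes the area under a patient's utility profile. It assumes linear interpolation between assessments (the trapezium rule) and uses hypothetical values. The paper does not state which interpolation was used, and the function name is illustrative.

```python
def qalys_from_profile(times_years, utilities):
    """Area under the utility curve using the trapezium rule.

    times_years: assessment times in years, ascending, ending at death or censoring.
    utilities:   EQ-5D utility observed (or imputed) at each of those times.
    """
    if len(times_years) != len(utilities):
        raise ValueError("times and utilities must have the same length")
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        mean_utility = (utilities[i] + utilities[i - 1]) / 2
        total += width * mean_utility
    return total


# One year in a health state valued at 0.5 yields 0.5 QALYs.
one_year_half = qalys_from_profile([0, 1], [0.5, 0.5])

# Hypothetical patient assessed at 0, 6, 12 and 24 months.
example = qalys_from_profile([0, 0.5, 1, 2], [0.8, 0.6, 0.7, 0.7])
print(one_year_half, round(example, 3))
```

A patient who dies accrues no further QALYs after death, so in this sketch the profile simply ends at the time of death.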
Differences in patient characteristics between groups were tested for using independent sample t-tests, or χ² tests, as appropriate. The Kaplan-Meier method was used to calculate the time from randomisation to death, and the log rank test to compare the survival times of both groups (Altman, 1991). HRQoL data were collected at a variety of time points postrandomisation from 3 days to 2136 days, mean 403 days. Patients had between 1 and 13 follow-up QoL assessments, with an average of 3.85 assessments postrandomisation. Given this variation in data collection we decided on a relatively straightforward approach to the analysis of the longitudinal data which involved the use of summary measures (Matthews et al, 1990). We summarised follow-up QoL responses for each individual subject by taking the simple average of their follow-up QoL responses over time as our summary measure (Matthews et al, 1990) as we were concerned with differences in overall levels of QoL rather than more subtle effects.
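The summary-measure approach reduces each patient's irregular series of follow-up scores to a single number. A minimal sketch, assuming follow-up records arrive as (patient identifier, score) pairs with baseline assessments already excluded; the identifiers and scores are hypothetical:

```python
from collections import defaultdict
from statistics import mean


def summarise_follow_up(records):
    """Return each patient's simple average of follow-up scores.

    records: iterable of (patient_id, score) pairs, one per follow-up assessment.
    """
    scores = defaultdict(list)
    for patient_id, score in records:
        scores[patient_id].append(score)
    return {patient_id: mean(values) for patient_id, values in scores.items()}


# Hypothetical data: patient A has two follow-up assessments, patient B has one.
follow_up = [("A", 50.0), ("A", 70.0), ("B", 80.0)]
summary = summarise_follow_up(follow_up)
print(summary)  # {'A': 60.0, 'B': 80.0}
```

Because every assessment carries equal weight, patients assessed more often early in follow-up have averages that lean toward that period.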
Differences in mean follow-up HRQoL between the groups were compared using a multiple linear regression model, with mean follow-up HRQoL as the outcome variable and baseline HRQoL, overall survival status (dead or censored) and treatment group as covariates. P-values of less than 0.05 were regarded as being statistically significant.
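The adjusted comparison can be sketched as an ordinary least squares fit of mean follow-up score on baseline score, survival status and treatment group. The example below is an illustration under stated assumptions, not the authors' code: the data are synthetic and noise-free, the group indicator is coded 1 for OBS (so a positive coefficient means OBS scores higher), and P-values are not computed.

```python
def fit_ols(covariate_rows, outcomes):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in covariate_rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcomes))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical patients: (baseline score, died during follow-up, in OBS group).
rows = [(60, 0, 0), (80, 0, 1), (40, 1, 0), (70, 1, 1), (50, 0, 0), (90, 1, 1)]
# Synthetic outcomes built from known coefficients so the fit can be checked.
outcomes = [10 + 0.5 * b - 8 * d + 6 * o for b, d, o in rows]

intercept, b_baseline, b_died, b_obs = fit_ols(rows, outcomes)
print(round(b_obs, 6))  # adjusted OBS minus IFN difference: 6.0
```

In practice a statistics package would be used, since it also supplies standard errors, confidence intervals and P-values.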
The economic analysis followed guidelines set down by the National Institute for Clinical Excellence (2004). Costs were calculated by combining resource use data with unit costs representing national estimates (British Medical Association, 2002; Netten and Curtis, 2003). Costs beyond 1 year were discounted at 3.5% per annum. Prices are at 2003/4 levels with prices adjusted using the Hospital and Community Health Services Pay and Price Index where appropriate (Netten and Curtis, 2003). Cost differences were tested for using independent sample t-tests.
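Discounting at 3.5% per annum can be sketched as follows. The sketch assumes costs are grouped by year of follow-up, with year 1 left undiscounted and year t divided by 1.035^(t-1); the paper does not spell out the exact timing convention, and the cost figures are hypothetical.

```python
def discounted_total(costs_by_year, rate=0.035):
    """Sum yearly costs, discounting every year after the first.

    costs_by_year[0] is year 1 (undiscounted), costs_by_year[1] is year 2, and so on.
    """
    return sum(
        cost / (1 + rate) ** years_elapsed
        for years_elapsed, cost in enumerate(costs_by_year)
    )


# Hypothetical patient costing 1000 in each of three years.
total = discounted_total([1000, 1000, 1000])
print(round(total, 2))  # 2899.69
```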
An incremental cost-effectiveness ratio was calculated using mean costs and QALYs.
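The incremental cost-effectiveness ratio is the difference in mean cost divided by the difference in mean QALYs. As a worked check, the sketch below applies this to the differences reported in the economic evaluation (costs £3066 higher and 0.074 more QALYs in the interferon group); the function name is illustrative.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost per QALY gained: extra cost divided by extra QALYs."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


cost_per_qaly = icer(delta_cost=3066, delta_qaly=0.074)
print(round(cost_per_qaly))  # 41432
```

The result matches the reported £41 432 per QALY. A ratio of two uncertain means is itself highly uncertain when the QALY difference is close to zero.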

Economic data can be severely limited by missing data, as both the costs and the QALYs are cumulative measures (i.e. totals over the entire follow-up period). Consequently, if only one value is missing from the full series of follow-up data, the total cannot be calculated. To avoid this problem, missing data imputation becomes an important part of analysing economic data. Within this study, the last observation carried forward was used to impute missing data in order to calculate total costs and QALYs (Heyting et al, 1992).

RESULTS
Baseline assessments
Figure 1 shows that 444 patients (out of 674) or 66% had a valid baseline QoL assessment: 230/338 (68%) in the IFN group and 214/336 (64%) in the OBS group (P = 0.233). Comparison of the n = 398 patients with a valid baseline QoL assessment and at least one valid follow-up QoL assessment and the n = 276 patients with no baseline or follow-up QoL assessments suggested that the two groups have similar age (P = 0.151), gender (P = 0.349), histology (P = 0.078), and lengths of follow-up (P = 0.528) (Table 1). There was no interaction between treatment group and follow-up QoL assessment status with regard to overall survival (P = 0.251) and no evidence of a difference in overall survival between the no follow-up QoL data and valid follow-up data groups (log rank P = 0.84). Median survival was 4.05 years for patients with no valid follow-up QoL data vs 3.81 years for patients with valid baseline and follow-up QoL data. This implies we can assume that the QoL sample of 398 patients is a randomly selected subsample of the AIM-High trial population.

Health-related QoL
The IFN and OBS groups in the QoL sample had similar age, gender, stage and overall mortality. The IFN and OBS groups in the QoL sample had similar baseline EORTC QLQ-C30 scores, except for the pain dimension, where the IFN group had significantly higher levels of pain, +5.8 (95% CI: +1.2 to +10.4, P = 0.013), see Table 2. There was no evidence of a difference in overall survival between the IFN and OBS groups (log rank P = 0.15) in the QoL sample. Median survival for IFN was 4.29 years vs 3.21 years for the OBS group (see Figure 2).
Patients in the observation (OBS) group had significantly better mean follow-up QoL on five dimensions of the EORTC QLQ-C30 V1 functional scales: RF, EF, CF, SF and QL (see Table 3) after adjustment for baseline QoL and overall survival status (dead or censored). Patients in the OBS group had significantly lower (better) mean follow-up QoL symptom scores on seven dimensions of the EORTC QLQ-C30 V1 symptom scales: FA, NV, DY, AP, CO, DI and FI (see Table 4) after adjustment for baseline QoL and overall survival status (dead or censored).

Economic evaluation
In total, 134 patients were entered into the economic study and data were available for 111 of these patients. Costs were higher for the interferon (IFN) group in the first 2 years, then slightly lower thereafter. Overall, costs were £3066 higher in the IFN group. This is almost entirely due to the cost of therapy (Figure 3), but is not statistically significant (P = 0.396). The IFN group generates 0.074 more QALYs (Table 5), which is equivalent to an extra 27 days in full health, although this is not statistically significant (P = 0.752).
The incremental cost per QALY for interferon therapy is £41 432. There is considerable statistical uncertainty around this estimate, and at a threshold of £30 000 per QALY there is only a 45% chance of interferon being cost-effective.
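Comparing the ratio with a threshold can also be expressed as a net monetary benefit: the QALY gain valued at the threshold, minus the extra cost. The paper does not report this quantity; the sketch below is an equivalent restatement using the reported point estimates, and it also checks the conversion of 0.074 QALYs into days of full health. The 45% probability cannot be reproduced from point estimates alone because it depends on the patient-level data.

```python
def net_monetary_benefit(delta_cost, delta_qaly, threshold):
    """Value of the health gain at the threshold minus the extra cost."""
    return threshold * delta_qaly - delta_cost


nmb = net_monetary_benefit(delta_cost=3066, delta_qaly=0.074, threshold=30_000)
days_full_health = 0.074 * 365
print(round(nmb), round(days_full_health))  # -846 27
```

A negative value at £30 000 per QALY says the same thing as a ratio above the threshold when the QALY gain is positive: at the point estimate, the extra cost exceeds the value placed on the health gained.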

DISCUSSION
These results show that HRQoL is worse in the IFN group in terms of both functioning and symptomatology. As assessed by the EORTC QLQ-C30, statistically significant differences were found in terms of role functioning, emotional functioning, cognitive functioning, social functioning and global health status. Symptom scores in the IFN group were significantly worse for fatigue, nausea/vomiting, dyspnoea, appetite loss, constipation and diarrhoea.
Despite the great interest in interferon therapy for melanoma and its recognised toxicities (Hancock et al, 2000), there are very few large-scale studies that have used validated HRQoL instruments. Paterson looked at 21 patients receiving high-dose interferon alpha-2b using the Functional Assessment of Cancer Therapy-Biological Response Modifier (FACT-BRM) scale, showing decreased QoL (Paterson et al, 2005). In an associated study, Trask looked at 16 patients in a longitudinal analysis which showed reductions in QoL (Trask et al, 2004). Bender assessed QoL as part of a trial with 16 patients, and showed a significant reduction in physical well-being associated with high-dose interferon alpha-2b therapy using the Functional Assessment of Cancer Therapy-General (FACT-G) scale (Bender et al, 2000).
The largest available study that used a validated QoL measure is by Rataj et al (2005), who reported a study of 110 melanoma patients receiving interferon alpha-2b following radical ...
Other work has been undertaken looking at QoL in melanoma patients receiving interferon; however, this has taken a completely different approach. Kilbridge et al (2001), for instance, used the standard gamble technique to value a series of health states describing the QoL associated with interferon toxicity, melanoma recurrence and disease-free health. Their study, based on 107 patient interviews, showed that the side effects from interferon treatment reduced QoL, from 0.96 for the disease-free health state to 0.81 for the health state with severe side effects.
The Kilbridge utility estimates have been combined with mortality data from the ECOG 1684 trial (n = 280) to produce a quality-adjusted survival analysis (Kilbridge et al, 2002) and a cost-utility analysis. Other analyses have used other utility estimates to describe treatment and post-treatment QoL for interferon patients (Cole et al, 1996; Hillner et al, 1997; Lafuma et al, 2001); however, the utility figures were assigned by the researchers rather than derived from patients.
All of these utility-based studies show that a decrease in QoL during interferon treatment is more than offset by improved QoL owing to reduced recurrence and reduced mortality. Consequently, when these utility estimates are combined with the ECOG 1684 data, results tend to show that treatment with high-dose interferon is cost-effective compared to other technologies (Hillner et al, 1997; Lafuma et al, 2001; Crott et al, 2004). These results are in contrast to this study, which shows that while median survival is around 1 year longer, combining QoL with mortality shows the IFN group to be only marginally better (0.074 QALYs, P = 0.752). This produces an incremental cost-effectiveness ratio of £41 432 per QALY. Using a funding threshold of £30 000 per QALY, which is at the higher end of a range used within the United Kingdom (National Institute for Clinical Excellence, 2004), these results show that low-dose extended duration interferon therapy is unlikely to be considered cost-effective.
There are several reasons for these differences. Firstly, AIM-High is a study of low-dose interferon therapy, whereas ECOG 1684 is a study of high-dose therapy. Consequently, QoL, survival and recurrence might be expected to differ. Secondly, the utility figures are derived in completely different ways. Our study used a generic preference-based outcome measure (EQ-5D) to gather data prospectively from within the trial, to which general population utility values were applied using a standard algorithm. Kilbridge et al (2001) generated utility values from melanoma patients by asking them to value health states describing various treatment scenarios. Thirdly, our study estimates cost-effectiveness at 5 years, while the modelling studies look at longer time scales: 35 years in one case. This is an important difference as shorter time frames generate higher incremental cost-effectiveness ratios. While extrapolation of our results is possible, the lower final year QALY estimate (Table 5) implies that even worse cost-effectiveness results may be produced if such an analysis were undertaken.
We should also consider the deficiencies associated with our study. Only 66% of patients in the trial had a baseline EORTC QLQ-C30 assessment. Despite this, there appears to be no systematic difference between patients included in our QoL analysis and those excluded. Another problem was that the number and timing of completed QoL assessments varied. This led us to undertake a simple analysis, using a summary measure of QoL based on the average scores. As assessments were more frequent during interferon treatment in order to capture the impact of side effects, the results will be weighted toward the early months of treatment. However, repeating the analysis with average follow-up over the first 2 years as the outcome, rather than the total follow-up, gave almost identical results to the longer follow-up (data not shown).
Table 3 note: For the EORTC QLQ-C30 v1 function scales a higher score represents a better level of functioning. The treatment group difference in mean follow-up scores is adjusted for baseline score and overall survival status (dead or censored). A positive follow-up difference implies the observation (OBS) group has a better level of functioning at follow-up than the interferon (IFN) group.
We followed the advice of Cox et al (1992) for QoL studies which recommended simplicity of design, analysis and presentation of QoL assessments. Therefore, we decided to use a simple approach and not the simultaneous assessment of QoL and survival. There are several approaches to the simultaneous assessment of QoL and survival including: QALYs (for which we employed the EQ-5D), Q-TWiST (quality-adjusted time without symptoms of disease and toxicity of treatment) and multistate survival analysis (Billingham et al, 1999). The latter two approaches would require the definition of a finite number of health states in terms of the 15 EORTC QLQ-C30 dimension scores. We felt it was very difficult to define a set of finite, mutually exclusive and exhaustive health states that are clinically meaningful and fully describe the experiences of patients with malignant melanoma using the 15 dimensions of the EORTC QLQ-C30.
We assumed that the missing QoL data are missing at random and that dropout was noninformative. We found that the dropout rates and survival experience were similar across the treatment arms and believe that the between-treatment comparisons of QoL remain unbiased. We also included a term for overall survival status in our regression model to adjust for whether the patient was alive or dead during follow-up. This term should take into account that patients who died during follow-up may have a different average QoL at follow-up than patients who were alive or censored.
Further loss of data was present when the economic results are considered, such that data on only 111 patients were available for analysis. Even for these patients, missing data meant that imputation was required to produce a rectangular data set. While differences between this economic subsample and the full sample are not statistically significant, we are limited in our ability to detect differences between the two arms due to the smaller sample size. This problem is perhaps compounded by the possible insensitivity of the EQ-5D seen in several studies (Harper et al, 1997; Nicholl et al, 2001; Patel et al, 2004). Taken together, the lack of a clear pattern in the QALY estimates shown in Table 5 is difficult to interpret.
Table 4 note: For the EORTC QLQ-C30 v1 symptom scales a higher score represents a worse level of symptoms. The treatment group difference in mean follow-up scores is adjusted for baseline score and overall survival status (dead or censored). A negative follow-up difference implies the observation (OBS) group has a lower/better level of symptoms at follow-up than the interferon (IFN) group.
Table 5 note: Quality-adjusted life years are calculated by multiplying quality of life by length of life, such that 1 year in full health is equivalent to one quality-adjusted life year (QALY). When 1 year produces less than one QALY, this reflects less than full health; for example, 0.5 QALYs is 1 year in a health state valued at 0.5, which is deemed to be equivalent to 6 months (0.5 years) in full health.
Looking beyond this study, it is difficult to cast light on other QoL evidence and economic evaluations, as methods are different, as too are the interferon dosing regimens. However, given that such a clear picture of QoL is produced with the EORTC QLQ-C30, we would recommend its use for further studies of interferon treatment. The much-cited ECOG 1684 study did not incorporate prospective QoL assessment, and so subsequent evaluations have had to add on supplementary studies.
While several improvements to future economic evaluations have been suggested (Crott, 2004), basing future evaluations on trial-based QoL and/or utility estimates would appear to be important given the differences identified here.
Few studies have assessed the impact of interferon therapy on health-related QoL using validated instruments. These results show that interferon has significant effects on QoL and symptomatology. Our associated economic analysis also showed that, overall, adjuvant low-dose extended duration interferon therapy does not appear cost-effective in this patient group in the UK context.