Main

Lung cancer is strongly associated with smoking (Doll and Hill, 1956; Korhonen et al, 2008; Papadopoulos et al, 2011). Peto et al (2006) reported that, in Finland in the year 2000, 86% of lung cancer deaths in males and 60% of lung cancer deaths in females were attributed to smoking. They also attributed 12% of cardiovascular deaths in males and 3.6% of cardiovascular deaths in females to smoking, and reported figures for other types of cancer and other causes of death. Smoking not only carries a high risk of developing, and consequently dying from, lung cancer (Doll and Hill, 1956; Papadopoulos et al, 2011); it also increases the chance of dying from many other diseases (Wolf et al, 1988), such as cardiovascular disease (Willett et al, 1987) and other less common forms of cancer (Moore, 1971; Fuchs et al, 1996).

This has led to heavy debate as to whether relative survival should be used to analyse lung cancer data (Dickman and Adami, 2006; Sarfati et al, 2010). Relative survival is a method that compares the survival experience of a group of patients with the survival experience of the general population. The method is particularly advantageous because it does not require accurate cause-of-death information. Mortality estimates for the general population are usually taken from national life tables that are broken down by age, sex and calendar year. One of the key assumptions of relative survival is comparability: if the patients did not have cancer, they are assumed to have the same survival experience as the general population. It is argued that, because most lung cancer patients are smokers and therefore carry a higher risk of many other diseases, they are not comparable to a population in which the majority are likely to be non-smokers (Phillips et al, 2002). Despite these potential problems, relative survival is still the usual method of analysis in population-based cancer studies.

This paper assesses the impact that the non-comparability has on the relative survival estimates through the use of a sensitivity analysis. Similar studies have been carried out previously to assess the impact that specific cancer deaths in the population mortality figures can have on the estimate of relative survival (Hinchliffe et al, 2011; Talbäck and Dickman, 2011).

Methods

Relative survival

Relative survival is a measure that estimates the survival from a particular disease in the absence of other causes of death. It can be written as the ratio of the observed survival in the study population to the expected survival in the general population (Ederer et al, 1961). More formally:

$$R(t) = \frac{S(t)}{S^{*}(t)}$$

where R(t) is the relative survival, S(t) is the observed survival, S*(t) is the expected survival and t is the time from diagnosis (Lambert et al, 2010). When relative survival analysis is applied to a cohort of lung cancer patients, we are making a comparison of survival in lung cancer patients relative to survival in the general population. Because of the higher prevalence of smoking amongst lung cancer patients, the expected survival is likely to be too high. We adjust the expected survival via a sensitivity analysis to assess the impact on estimates of 1- and 5-year relative survival.
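The ratio of observed to expected survival can be illustrated with a minimal Python sketch. It assumes a single homogeneous stratum in which expected survival is simply the product of yearly survival probabilities; estimation in practice averages over individually matched life-table entries, so this is a simplification. The function names and all numeric inputs are hypothetical and are not taken from the Finnish data.

```python
def expected_survival(yearly_death_probs):
    """Cumulative expected survival from successive yearly death probabilities."""
    survival = 1.0
    for p in yearly_death_probs:
        if not 0.0 <= p < 1.0:
            raise ValueError("yearly death probabilities must lie in [0, 1)")
        survival *= 1.0 - p
    return survival


def relative_survival(observed, expected):
    """Ratio of observed to expected survival at the same follow-up time."""
    if not 0.0 <= observed <= 1.0:
        raise ValueError("observed survival must lie in [0, 1]")
    if not 0.0 < expected <= 1.0:
        raise ValueError("expected survival must lie in (0, 1]")
    return observed / expected


# Hypothetical inputs for illustration only.
observed_5yr = 0.10                 # observed 5-year survival in the patient cohort
general_population = [0.020] * 5    # yearly death probabilities, unadjusted life table
smokers_only = [0.035] * 5          # yearly death probabilities, smoking-adjusted table

r_unadjusted = relative_survival(observed_5yr, expected_survival(general_population))
r_adjusted = relative_survival(observed_5yr, expected_survival(smokers_only))
print(f"unadjusted: {r_unadjusted:.4f}, adjusted: {r_adjusted:.4f}")
```

Lowering the expected survival, as the smoking adjustment does, can only raise the ratio, which is why an expected survival that is too high leads to an underestimate of relative survival.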

Sensitivity analysis

In Finland, all physicians, hospitals and other relevant institutions are required to notify the Finnish Cancer Registry of all cancer cases that come to their attention. The Registry therefore has full population coverage for all cancer cases going back to 1953. Lung cancer data (ICD-O-3: C340-C349) were obtained from the Finnish Cancer Registry for patients diagnosed in the years 1995–2007, inclusive. Population mortality data for Finland, broken down by age, sex and calendar year, were obtained from the Human Mortality Database (2008). Patients under the age of 18 and anyone diagnosed through autopsy were excluded from the analyses. All relative survival analyses were carried out separately for the age groups 18–44, 45–59, 60–74, 75–84 and 85+. To obtain up-to-date estimates of relative survival, a period analysis approach was adopted: the relative survival estimates were derived from data on the survival experience of patients in the 2005–2007 period (Brenner and Gefeller, 1996).

An initial relative survival analysis was carried out using the unadjusted population mortality data. The population mortality data were then modified to represent the scenario where 100% of the general population are assumed to be smokers. This creates a group that is more comparable to the cohort of lung cancer patients, in which the vast majority are also smokers. The adjustment was made by considering the following quantities: the odds ratio for dying from any cause for smokers compared with non-smokers, denoted θ; the probability of dying from any cause for a smoker, denoted ps; the probability of dying from any cause for a non-smoker, denoted pn; the total probability of dying from any cause in the general population, denoted pt; and the proportion of daily smokers in the general population, denoted α. The above quantities are connected through the following equation

We developed an adjustment for pn, which included all the terms described above. The formulae for this are given in the Appendix. It should be noted that pt, pn and ps are yearly probabilities that will vary by age, sex and calendar year.
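The connecting equation and the Appendix formulae are not reproduced in this text. As a sketch only, assuming that θ is the conventional odds ratio and that pt is the prevalence-weighted average of ps and pn (the law of total probability), the quantities would be related as follows; the Appendix may present the result in a different but equivalent form.

```latex
\begin{align}
\theta &= \frac{p_s/(1-p_s)}{p_n/(1-p_n)}
  \quad\Longrightarrow\quad
  p_s = \frac{\theta\,p_n}{1+(\theta-1)\,p_n}, \\
p_t &= \alpha\,p_s + (1-\alpha)\,p_n, \\
0 &= (1-\alpha)(\theta-1)\,p_n^{2}
   + \bigl[\,1+(\theta-1)(\alpha-p_t)\,\bigr]\,p_n - p_t .
\end{align}
```

Substituting the first line into the second and clearing the denominator gives the quadratic in the third line. For θ > 1 and α < 1 it has exactly one positive root, which is the value of pn consistent with the observed pt.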

As the population-mortality data file does not record the number of smokers, the prevalence of smoking, α, was assumed to be as shown in Table 1. These estimates were taken from the ‘Health in Finland’ report (Koskinen et al, 2006). The total probabilities of dying from any cause, pt, were taken from the population-mortality data file. The odds ratio, θ, was set to 2, 3, 4 and 5 to demonstrate both plausible and extreme scenarios for the increased risk in overall mortality from smoking. This information was used to determine the probability of dying from any cause for a non-smoker, pn, using the equations given in the Appendix. This value was subsequently used to estimate the probability of dying from any cause for a smoker, ps.
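The two-step calculation described above (pn first, then ps) can be sketched in Python. This is an illustration under stated assumptions, not the Appendix formulae: it assumes θ is the conventional odds ratio and that pt = α·ps + (1 − α)·pn, which leads to a quadratic in pn. The function names are hypothetical, and the values of pt and α in the example are placeholders rather than entries from the life table or Table 1.

```python
import math


def nonsmoker_death_prob(p_t, alpha, theta):
    """Solve p_t = alpha*p_s + (1 - alpha)*p_n for p_n, given odds(p_s) = theta*odds(p_n)."""
    if not (0.0 <= p_t <= 1.0 and 0.0 <= alpha <= 1.0 and theta > 0.0):
        raise ValueError("need 0 <= p_t <= 1, 0 <= alpha <= 1 and theta > 0")
    a = (1.0 - alpha) * (theta - 1.0)
    b = 1.0 + (theta - 1.0) * (alpha - p_t)
    # Root of a*x**2 + b*x - p_t = 0 that lies in [0, 1], written in a form
    # that remains valid when a == 0 (theta == 1 or alpha == 1).
    return 2.0 * p_t / (b + math.sqrt(b * b + 4.0 * a * p_t))


def smoker_death_prob(p_n, theta):
    """Convert the non-smoker probability to the smoker probability via the odds ratio."""
    return theta * p_n / (1.0 + (theta - 1.0) * p_n)


# Placeholder inputs for illustration only.
p_t = 0.02     # yearly all-cause death probability in the general population
alpha = 0.25   # assumed prevalence of daily smokers

for theta in (2.0, 3.0, 4.0, 5.0):
    p_n = nonsmoker_death_prob(p_t, alpha, theta)
    p_s = smoker_death_prob(p_n, theta)
    print(f"theta={theta:.0f}: p_n={p_n:.5f}, p_s={p_s:.5f}")
```

Replacing each pt in the life table with the corresponding ps yields the life table for a population assumed to consist entirely of smokers.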

Table 1 Smoking prevalence in adults by gender (%; Koskinen et al, 2006)

Comparisons were made between the relative survival estimates derived using the total probability of dying, pt, from the original unadjusted population mortality file and the relative survival estimates derived using the adjusted probabilities of dying from all causes for smokers, ps.

A systematic review by Schane et al (2010) reported an odds ratio of 1.6 (95% CI: 1.3 to 2.1) for all-cause mortality in light and intermittent male smokers compared with male non-smokers. To visualise the bias in the relative survival estimates when adjusting for a more realistic odds ratio, this odds ratio of 1.6 was taken as the ‘estimated’ value of θ for both genders and all age groups. This was done in addition to the adjustments made with odds ratios of 2, 3, 4 and 5.

Results

Relative survival curves using odds ratios (θ) of 2, 3, 4 and 5 for increased odds of all-cause mortality for smokers compared with non-smokers are shown in Figures 1, 2, 3 and 4, respectively. Each figure compares the relative survival curve obtained using the unadjusted population mortality file with the relative survival curve adjusted on the assumption that everyone in both the lung cancer cohort and the population mortality file is a smoker. All four figures show that adjusting for a higher probability of death in smokers makes little, if any, difference in the 18–44 and 45–59 age groups, as the probability of death from other causes is low at these ages. There is also very little difference between the curves in the three older age groups until the odds ratio reaches 4 and 5, where the largest differences in the relative survival estimates are between 0.05 and 0.1.

Figure 1

Comparison of relative survival curves with no adjustment made to the external population with relative survival curves, assuming external population consists of 100% smokers and that the odds of all-cause mortality are twice as high for smokers compared with non-smokers.

Figure 2

Comparison of relative survival curves with no adjustment made to the external population with relative survival curves, assuming external population consists of 100% smokers and that the odds of all-cause mortality are three times as high for smokers compared with non-smokers.

Figure 3

Comparison of relative survival curves with no adjustment made to the external population with relative survival curves, assuming external population consists of 100% smokers and that the odds of all-cause mortality are four times as high for smokers compared with non-smokers.

Figure 4

Comparison of relative survival curves with no adjustment made to the external population with relative survival curves, assuming external population consists of 100% smokers and that the odds of all-cause mortality are five times as high for smokers compared with non-smokers.

Table 2 gives the percentage unit differences between the unadjusted 1-year and 5-year relative survival estimates and the 1-year and 5-year relative survival estimates adjusted using odds ratios of θ=2, 3, 4 and 5. It also includes a column showing the percentage unit differences when adjusting for the ‘estimated’ θ. The results show that relative survival estimated using unadjusted life tables is slightly lower than relative survival estimated using life tables adjusted with odds ratios of 2, 3, 4 and 5.

Table 2 Percentage unit difference in 1-year and 5-year relative survival estimates between values with no adjustment and 2, 3, 4, 5, and ‘estimated’ (1.6) adjustments

Discussion

Although the assumption of comparability between the patient cohort and general population may be unreasonable for lung cancer, we have shown that correcting for this does not have a substantial impact on the relative survival estimates. In the younger age groups, the probability of dying from other causes is low; therefore, even a fairly large relative adjustment to this value changes the expected survival only slightly and has little effect on the relative survival estimates.

Furthermore, for all age groups, the prognosis for lung cancer is poor, with the majority of patients dying within the first 2 years. If the majority of lung cancer patients are dying quickly from lung-cancer-related deaths, then the fact that these patients are also at an increased risk of death from other diseases will have little impact on the relative survival estimates. Patients do not have the ‘opportunity’ to die from other causes, because of the lethality associated with a diagnosis of lung cancer.
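The arithmetic behind these two arguments can be sketched as follows. Because relative survival is the ratio of observed to expected survival, the absolute change produced by replacing the expected survival scales with the observed survival, and it shrinks as the expected survival approaches 1. The function name and all numbers below are hypothetical and chosen only to illustrate the direction of the effect.

```python
def relative_survival_shift(observed, expected, expected_adjusted):
    """Absolute change in observed/expected when the expected survival is replaced."""
    if min(expected, expected_adjusted) <= 0.0:
        raise ValueError("expected survival must be positive")
    return observed / expected_adjusted - observed / expected


# Hypothetical values for illustration only.
# Same change in expected survival, different observed survival:
poor_prognosis = relative_survival_shift(0.10, 0.90, 0.85)
better_prognosis = relative_survival_shift(0.70, 0.90, 0.85)

# Younger group: expected survival is close to 1 before and after adjustment.
younger_group = relative_survival_shift(0.10, 0.995, 0.985)

print(f"poor prognosis:   {poor_prognosis:+.4f}")
print(f"better prognosis: {better_prognosis:+.4f}")
print(f"younger group:    {younger_group:+.4f}")
```

Under these assumed inputs the shift for the cancer with poor observed survival is a fraction of the shift for the cancer with better survival, and the shift in the younger group is smaller still.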

The sensitivity analysis adjusted the population mortality data to represent a scenario where 100% of the comparison population were smokers. This was done in an attempt to create a group more comparable to the lung cancer patient population. The true proportion of smokers amongst lung cancer patients will most likely be below 100%, so our adjustment represents an extreme case. Even so, we have shown that the bias is relatively small, and a more realistic proportion will only decrease it.

Although we have only considered lung cancer in this paper, we acknowledge that there are other cancer sites, such as bladder cancer, and cancer of the oral cavity and pharynx, that have also been shown to be smoking-related. To carry out a similar sensitivity analysis for these cancer sites, an estimate of the prevalence of smoking within each cohort of cancer patients would be required. It would be unreasonable to assume that the proportion of smokers is anywhere near 100% in bladder and oral cancer cohorts. As these cancers have a better survival than lung cancer, it is likely that the lack of comparability of the life tables may have a larger impact on the relative survival estimates for these sites.

Unfortunately, information was not available on smoking status within the population mortality file. As a result, external information was used to obtain appropriate estimates for this (Table 1; Koskinen et al, 2006). These estimates were not stratified by age group. Should the proportion of smokers be larger in any of the age groups, then the bias in the relative survival estimates would most likely increase. This is particularly true for the oldest age group.

If smoking status had been available, it would have been preferable to create separate life tables for smokers and non-smokers. However, the difficulty lies in making a strict definition of a ‘smoker’. People’s smoking status varies over time, as does the level of cigarette consumption. Both of these factors are likely to have an impact on general health status and on prognosis from lung cancer, and so would ideally also be incorporated into the life table.

We have focussed on the potential bias in the relative survival estimates, as this is the measure most commonly reported. However, if there was interest in comparing groups in terms of the excess mortality, then there may also be bias in the excess mortality-rate ratio. Had smoking status been available, then a comparison could have been made using both smoking-adjusted and -unadjusted life tables. Using the general population life tables, the expected mortality is too low for smokers and too high for non-smokers. We would therefore expect the excess mortality-rate ratio for smoking status to be upwardly biased, as the excess mortality rate for smokers would be overestimated and the excess mortality rate for non-smokers would be underestimated.

The value of θ that was chosen as the ‘estimated’ odds ratio was taken from a systematic review that was carried out to identify studies on the health outcomes associated with light and intermittent smoking. The value of 1.6 was calculated using data on males only, but we used this value to represent all ages and both genders in our sensitivity analysis. This value may be too high or too low for some subgroups of patients. However, even with an odds ratio of 5, the difference between the curves is still reasonably small. We can therefore conclude that, in practice, there is little need for concern about the level of bias that may be introduced into the relative survival estimates by the assumption addressed in this paper.

The method described in this paper only makes adjustments for the assumption of comparability between the observed and expected populations. Other assumptions, such as independence between the mortality associated with the disease of interest and the mortality associated with other causes, are presumed to be reasonable.