Abstract
Background:
Under certain assumptions, relative survival is a measure of net survival based on estimating the excess mortality in a study population when compared with the general population. Background mortality estimates are usually taken from national life tables that are broken down by age, sex and calendar year. A fundamental assumption of relative survival methods is that if a patient did not have the disease of interest then their probability of survival would be comparable to that of the general population. It is argued, as most lung cancer patients are smokers and therefore carry a higher risk of smoking-related mortality, that they are not comparable to a population where the majority are likely to be nonsmokers.
Methods:
We use data from the Finnish Cancer Registry to assess, through a sensitivity analysis, the impact that noncomparability has on the estimates of relative survival.
Results:
Under realistic estimates of increased all-cause mortality for smokers compared with nonsmokers, the bias in the estimates of relative survival caused by noncomparability is negligible.
Conclusion:
Although the assumption of comparability underlying the relative survival method may not be reasonable, it does not have a concerning impact on the estimates of relative survival, as most lung cancer patients die within the first 2 years following diagnosis. This should serve to reassure critics of the use of relative survival when applied to lung cancer data.
Main
Lung cancer is well known to be strongly associated with smoking (Doll and Hill, 1956; Korhonen et al, 2008; Papadopoulos et al, 2011). A report published by Peto et al (2006) showed that, in Finland in the year 2000, 86% of lung cancer deaths in males and 60% of lung cancer deaths in females were attributed to smoking. In addition, 12% of cardiovascular deaths in males and 3.6% of cardiovascular deaths in females were attributed to smoking. Figures were also reported for other types of cancer and other causes of death. Not only does smoking put you at a high risk of developing, and consequently dying from, lung cancer (Doll and Hill, 1956; Papadopoulos et al, 2011), it also increases your chances of dying from many other diseases (Wolf et al, 1988), such as cardiovascular disease (Willett et al, 1987) and other less common forms of cancer (Moore, 1971; Fuchs et al, 1996).
This has led to heavy debate as to whether relative survival should be used as a method to analyse lung cancer data (Dickman and Adami, 2006; Sarfati et al, 2010). Relative survival is a method that compares the survival experience of a group of patients to the survival experience of the general population. The method is particularly advantageous, as it does not require accurate cause-of-death information. Mortality estimates for the general population are usually taken from national life tables that are broken down by age, sex and calendar year. One of the key assumptions of relative survival is comparability: if the patient did not have cancer, then it is assumed that they would have the same survival experience as the general population. It is argued, as most lung cancer patients are smokers and therefore carry a higher risk of many other diseases, that they are not comparable to a population where the majority are likely to be nonsmokers (Phillips et al, 2002). However, despite these potential problems, relative survival is still the usual method of analysis in population-based cancer studies.
This paper assesses the impact that noncomparability has on the relative survival estimates through the use of a sensitivity analysis. Similar studies have been carried out previously to assess the impact that specific cancer deaths in the population mortality figures can have on the estimate of relative survival (Hinchliffe et al, 2011; Talbäck and Dickman, 2011).
Methods
Relative survival
Relative survival is a measure that estimates the survival from a particular disease in the absence of other causes of death. It can be written as the ratio of the observed survival in the study population to the expected survival in the general population (Ederer et al, 1961). More formally:
R(t) = S(t) / S^{*}(t)
where R(t) is the relative survival, S(t) is the observed survival, S^{*}(t) is the expected survival and t is the time from diagnosis (Lambert et al, 2010). When relative survival analysis is applied to a cohort of lung cancer patients, we are making a comparison of survival in lung cancer patients relative to survival in the general population. Because of the higher prevalence of smoking amongst lung cancer patients, the expected survival is likely to be too high. We adjust the expected survival via a sensitivity analysis to assess the impact on estimates of 1- and 5-year relative survival.
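As a rough illustration of the ratio described above, the following Python sketch builds an expected survival curve from yearly life-table death probabilities and divides the observed survival by it. All numbers and function names are hypothetical, chosen only for illustration; they are not taken from the Finnish data, and practical estimators of expected survival handle age, sex and calendar year matching in more detail than this.

```python
def expected_survival(yearly_death_probs):
    """Cumulative expected survival S*(t) at the end of each year,
    built as the running product of (1 - yearly death probability)."""
    curve = []
    surviving = 1.0
    for p in yearly_death_probs:
        surviving *= 1.0 - p
        curve.append(surviving)
    return curve


def relative_survival(observed, expected):
    """Relative survival R(t) = S(t) / S*(t), year by year."""
    return [s / e for s, e in zip(observed, expected)]


# Hypothetical observed survival of a patient cohort at years 1..5.
observed = [0.40, 0.25, 0.18, 0.14, 0.12]
# Hypothetical yearly all-cause death probabilities from a life table.
life_table_probs = [0.020, 0.022, 0.024, 0.026, 0.028]

expected = expected_survival(life_table_probs)
relative = relative_survival(observed, expected)

for year, (s, e, r) in enumerate(zip(observed, expected, relative), start=1):
    print(f"year {year}: S={s:.3f}  S*={e:.3f}  R={r:.3f}")
```

If the expected survival is set too high, as argued for smokers, each ratio in `relative` comes out too low, which is the direction of bias examined in the sensitivity analysis.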
Sensitivity analysis
In Finland, it is required that all physicians, hospitals and other relevant institutions send notification to the Finnish Cancer Registry of all cancer cases that come to their attention. The Registry, therefore, has full population coverage for all cancer cases going back to 1953. Lung cancer data (ICD-O-3: C34.0–C34.9) were obtained from the Finnish Cancer Registry for patients diagnosed in the years 1995–2007, inclusive. Population mortality data for Finland, broken down by age, sex and calendar year, were obtained from the Human Mortality Database (2008). Patients under the age of 18 and anyone diagnosed through autopsy were excluded from the analyses. All relative survival analyses were carried out by the age groups 18–44, 45–59, 60–74, 75–84 and 85+. To obtain up-to-date estimates of relative survival, a period analysis approach was adopted. The relative survival estimates were derived from data on the survival experience of patients in the 2005–2007 period (Brenner and Gefeller, 1996).
An initial relative survival analysis was carried out using the unadjusted population mortality data. The population mortality data were then modified to represent the scenario where 100% of the general population are assumed to be smokers. This creates a group that is more comparable to the cohort of lung cancer patients, in which the vast majority are also smokers. The adjustment was made by considering the following quantities: the odds ratio for dying from any cause for smokers compared with nonsmokers, denoted as θ; the probability of dying from any cause if you are a smoker, denoted as p_{s}; the probability of dying from any cause if you are a nonsmoker, denoted as p_{n}; the total probability of dying from any cause in the general population, denoted as p_{t}; and the proportion of daily smokers in the general population, denoted as α. These quantities are connected through the definition of the odds ratio and the partition of the total probability between smokers and nonsmokers:
θ = [p_{s} / (1 − p_{s})] / [p_{n} / (1 − p_{n})],   p_{t} = αp_{s} + (1 − α)p_{n}
We developed an adjustment for p_{n}, which included all the terms described above. The formulae for this are given in the Appendix. It should be noted that p_{t}, p_{n} and p_{s} are yearly probabilities that will vary by age, sex and calendar year.
As we do not have information on the exact number of smokers in the population mortality data file, it was assumed that the prevalence of smokers, α, was as shown in Table 1. These estimates were taken from the ‘Health in Finland’ report (Koskinen et al, 2006). The total probabilities of dying from any cause, p_{t}, were taken from the population mortality data file. The odds ratio, θ, was set to 2, 3, 4 and 5 to demonstrate both plausible and extreme scenarios for the increased risk in overall mortality from smoking. This information was used to determine the probability of dying from any cause if you are a nonsmoker, p_{n}, using the equations given in the Appendix. This value was subsequently used to estimate the probability of dying from any cause if you are a smoker, p_{s}.
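The two-step calculation described here can be sketched in Python as follows. Combining the odds-ratio definition with p_t = αp_s + (1 − α)p_n gives a quadratic in p_n, and the sketch takes its positive root before converting it to p_s. The function name `split_mortality` and the example values of p_t and α are assumptions made for illustration; they are not taken from Table 1 or from the Finnish life tables.

```python
from math import sqrt


def split_mortality(p_t, alpha, theta):
    """Split a total yearly death probability p_t into nonsmoker (p_n)
    and smoker (p_s) probabilities, given the smoker proportion alpha
    and the smoker-vs-nonsmoker odds ratio theta (assumed >= 1)."""
    if not 0.0 < p_t < 1.0:
        raise ValueError("p_t must lie strictly between 0 and 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if theta < 1.0:
        raise ValueError("this sketch only covers theta >= 1")

    # Quadratic a*p_n**2 + b*p_n + c = 0 with c = -p_t.
    a = (1.0 - alpha) * (theta - 1.0)
    b = 1.0 + (theta - 1.0) * (alpha - p_t)
    disc = b * b + 4.0 * a * p_t

    # Positive root, written as 2*p_t / (b + sqrt(disc)). This is
    # algebraically equal to (-b + sqrt(disc)) / (2*a) but avoids
    # dividing by zero when a == 0 (theta == 1 or alpha == 1).
    p_n = 2.0 * p_t / (b + sqrt(disc))

    # Odds-ratio definition rearranged for the smoker probability.
    p_s = theta * p_n / (1.0 - p_n + theta * p_n)
    return p_n, p_s


# Hypothetical inputs: 2% yearly death probability, 25% daily smokers.
P_T = 0.02
ALPHA = 0.25

results = {theta: split_mortality(P_T, ALPHA, theta) for theta in (1.6, 2, 3, 4, 5)}
for theta, (p_n, p_s) in results.items():
    print(f"theta={theta}: p_n={p_n:.5f}  p_s={p_s:.5f}")
```

For any θ above 1 the nonsmoker probability falls below p_t and the smoker probability rises above it, and the weighted average of the two returns p_t.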
Comparisons were made between the relative survival estimates derived using the total probability of dying, p_{t}, from the original unadjusted population mortality file and the relative survival estimates derived using the adjusted probabilities of dying from all causes for smokers, p_{s}.
A systematic review by Schane et al (2010) reported an odds ratio of 1.6 (95% CI: 1.3 to 2.1) for the risk of all-cause mortality of light and intermittent male smokers compared with male nonsmokers. To visualise the bias in the relative survival estimates when adjusting for a more realistic odds ratio, this odds ratio of 1.6 was taken as the ‘estimated’ value for θ for both genders and all age groups. This was done in addition to the adjustments made with odds ratios of 2, 3, 4 and 5.
Results
Relative survival curves using odds ratios (θ) of 2, 3, 4 and 5 for increased odds of all-cause mortality for smokers compared with nonsmokers are shown in Figures 1–4, respectively. Each figure compares the relative survival curve obtained using the unadjusted population mortality files to the relative survival curve that has been adjusted assuming that everyone in both the lung cancer cohort and population mortality file is a smoker. All four figures show that adjusting for a higher probability of death in smokers makes little, if any, difference in the 18–44 and 45–59 age groups, as the probability of death from other causes is low at these ages. There is also very little difference between the curves in the three older age groups until the odds ratio reaches 4 and 5, where the largest differences in the relative survival estimates are between 0.05 and 0.1.
Table 2 gives the percentage unit differences between the unadjusted 1-year and 5-year relative survival estimates and the 1-year and 5-year relative survival estimates adjusted using odds ratios of θ=2, 3, 4 and 5. It also includes a column showing the percentage unit differences when adjusting for the ‘estimated’ θ. The results show that by using unadjusted life tables, the relative survival estimates are slightly underestimated when compared with life tables that are adjusted using odds ratios of 2, 3, 4 and 5.
Discussion
Although the assumption of comparability between the patient cohort and general population may be unreasonable for lung cancer, we have shown that correcting for this does not have a concerning impact on the relative survival estimates. In the younger age groups, the probability of dying from other causes is low; therefore, even a fairly large relative adjustment to this value will not have a large impact. It follows that the adjustment will therefore have little effect on the relative survival estimates.
Furthermore, for all age groups, the prognosis for lung cancer is poor, with the majority of patients dying within the first 2 years. If the majority of lung cancer patients are dying quickly from lung-cancer-related deaths, then the fact that these patients are also at an increased risk of death from other diseases will have little impact on the relative survival estimates. Patients do not have the ‘opportunity’ to die from other causes, because of the lethality associated with a diagnosis of lung cancer.
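One way to see this numerically: because relative survival is the observed survival divided by the expected survival, the absolute change produced by swapping life tables is S × (1/S*_adjusted − 1/S*_unadjusted), so it scales with the observed survival S. The Python sketch below uses hypothetical 5-year values (none taken from the paper) to compare a poor-prognosis cohort with a better-prognosis one under the same pair of life tables.

```python
def relative_survival_gap(observed, expected_unadjusted, expected_adjusted):
    """Adjusted minus unadjusted relative survival for one time point."""
    return observed / expected_adjusted - observed / expected_unadjusted


# Hypothetical 5-year expected survival from the two life tables.
EXPECTED_UNADJUSTED = 0.85  # general population life table
EXPECTED_ADJUSTED = 0.78    # life table adjusted to an all-smoker population

# Hypothetical 5-year observed survival for two cohorts.
cohorts = {"poor prognosis": 0.10, "better prognosis": 0.60}

gaps = {
    label: relative_survival_gap(s, EXPECTED_UNADJUSTED, EXPECTED_ADJUSTED)
    for label, s in cohorts.items()
}
for label, gap in gaps.items():
    print(f"{label}: change in relative survival = {gap:.4f}")
```

With identical life tables, the cohort with six times the observed survival shows six times the absolute change, which is consistent with the small differences seen for lung cancer.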
The performed sensitivity analysis made adjustments to the population mortality data to represent a scenario where 100% of the comparison population were smokers. This was done in an attempt to create a more comparable group to the lung cancer patient population. The true smoking figures amongst the lung cancer patient population will most likely not be 100%. Therefore, our adjustment was an extreme case. However, we have shown that the bias is relatively small regardless, and a more realistic proportion will only decrease this bias.
Although we have only considered lung cancer in this paper, we acknowledge that there are other cancer sites, such as bladder cancer, and cancer of the oral cavity and pharynx, that have also been shown to be smoking-related. To carry out a similar sensitivity analysis for these cancer sites, an estimate of the prevalence of smoking within each cohort of cancer patients would be required. It would be unreasonable to assume that the proportion of smokers is anywhere near 100% in bladder and oral cancer cohorts. As these cancers have a better survival than lung cancer, it is likely that the lack of comparability of the life tables may have a larger impact on the relative survival estimates for these sites.
Unfortunately, information was not available on smoking status within the population mortality file. As a result, external information was used to obtain appropriate estimates for this (Table 1; Koskinen et al, 2006). These estimates were not stratified by age group. Should the proportion of smokers be smaller in any of the age groups, then the unadjusted life table would lie further from the all-smoker scenario and the bias in the relative survival estimates would most likely increase. This is particularly true for the oldest age group.
If smoking status had been available, then it would be preferable to create separate life tables for smokers and nonsmokers. However, difficulty lies in making a strict definition of a ‘smoker’. People’s smoking status varies over time, as does the level of cigarette consumption. Both of these factors are likely to have an impact on the general health status and prognosis from lung cancer, and so would also ideally be incorporated into the life table.
We have focussed on the potential bias in the relative survival estimates, as this is the measure most commonly reported. However, if there was interest in comparing groups in terms of the excess mortality, then there may also be bias in the excess mortality-rate ratio. Had smoking status been available, then a comparison could have been made using both smoking-adjusted and unadjusted life tables. Using the general population life tables, we would expect that the excess mortality-rate ratio for smoking status would be upwardly biased: the expected mortality for smokers would be too low, so their excess mortality rate would be overestimated, whereas the excess mortality rate for nonsmokers would be underestimated.
The value of θ that was chosen as the ‘estimated’ odds ratio was taken from a systematic review that was carried out to identify studies on the health outcomes associated with light and intermittent smoking. The value of 1.6 was calculated using data on males only, but we used this value to represent all ages and both genders in our sensitivity analysis. This value may be overestimated or underestimated for some subgroups of patients. However, given that even with an odds ratio of 5 the difference between the curves is still reasonably small, we can conclude that, in practice, we do not have to be too concerned about the level of bias that may be introduced into the relative survival estimates by the assumption addressed in this paper.
The method described in this paper only makes adjustments for the assumption of comparability between the observed and expected populations. Other assumptions, such as independence between the mortality associated with the disease of interest and the mortality associated with other causes, are presumed to be reasonable.
Change history
23 January 2013
This paper was modified 12 months after initial publication to switch to Creative Commons licence terms, as noted at publication
References
Brenner H, Gefeller O (1996) An alternative approach to monitoring cancer patient survival. Cancer 78(9): 2004–2010
Dickman PW, Adami HO (2006) Interpreting trends in cancer patient survival. J Intern Med 260(2): 103–117
Doll R, Hill AB (1956) Lung cancer and other causes of death in relation to smoking: a second report on the mortality of British doctors. Br Med J 2: 1071–1081
Ederer F, Axtell L, Cutler S (1961) The relative survival rate: a statistical methodology. Natl Cancer Inst Monogr 6: 101–121
Fuchs CS, Colditz GA, Stampfer MJ, Giovannucci EL, Hunter DJ, Rimm EB, Willett WC, Speizer FE (1996) A prospective study of cigarette smoking and the risk of pancreatic cancer. Arch Intern Med 156(19): 2255–2260
Hinchliffe SR, Dickman PW, Lambert PC (2011) Adjusting for the proportion of cancer deaths in the general population when using relative survival: a sensitivity analysis. Cancer Epidemiol 36: 148–152
Human Mortality Database (2008) University of California, Berkeley, and Max Planck Institute for Demographic Research, Rostock
Korhonen T, Broms U, Levälahti E, Koskenvuo M, Kaprio J (2008) Characteristics and health consequences of intermittent smoking: Long-term follow-up among Finnish adult twins. Nicotine Tob Res 11(2): 148–155
Koskinen S, Aromaa A, Huttunen J, Teperi J eds (2006) Health in Finland. KTL, Stakes and Ministry of Social Affairs
Lambert PC, Dickman PW, Nelson CP, Royston P (2010) Estimating the crude probability of death due to cancer and other causes using relative survival models. Stat Med 29(7–8): 885–895
Moore C (1971) Cigarette smoking and cancer of the mouth, pharynx, and larynx. J Am Med Assoc 218(4): 553–558
Papadopoulos A, Guida F, Cénée S, Cyr D, Schmaus A, Radoï L, Paget-Bailly S, Carton M, Tarnaud C, Menvielle G, Delafosse P, Molinié F, Luce D, Stücker I (2011) Cigarette smoking and lung cancer in women: Results of the French ICARE case-control study. Lung Cancer 74: 369–377
Peto R, Lopez A, Boreham J, Thun M, Heath JC (2006) Mortality From Smoking in Developed Countries 1950–2000, 2nd edn. Oxford University Press: Oxford
Phillips N, Coldman A, McBride ML (2002) Estimating cancer prevalence using mixture models for cancer survival. Stat Med 21(9): 1257–1270
Sarfati D, Blakely T, Pearce N (2010) Measuring cancer survival in populations: relative survival vs cancer-specific survival. Int J Epidemiol 39(2): 598–610
Schane RE, Ling PM, Glantz SA (2010) Health effects of light and intermittent smoking: a review. Circulation 121(13): 1518–1522
Talbäck M, Dickman PW (2011) Estimating expected survival probabilities for relative survival analysis. Exploring the impact of including cancer patient mortality in the calculations. Eur J Cancer 47(17): 2626–2632
Willett WC, Green A, Stampfer MJ, Speizer FE, Colditz GA, Rosner B, Monson RR, Stason W, Hennekens CH (1987) Relative and absolute excess risks of coronary heart disease among women who smoke cigarettes. N Engl J Med 317(21): 1303–1309
Wolf PA, D’Agostino RB, Kannel WB, Bonita R, Belanger AJ (1988) Cigarette smoking as a risk factor for stroke. J Am Med Assoc 259(7): 1025–1029
Acknowledgements
Mark J Rutherford is funded by a Cancer Research UK Postdoctoral Fellowship (CRUK_A13275). Michael Crowther is funded by a National Institute for Health Research (NIHR) Methods Fellowship (RPPG040710314).
Competing interests
The authors declare no conflict of interest.
Additional information
This work is published under the standard license to publish agreement. After 12 months the work will become freely available and the license terms will switch to a Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License.
Appendix
To carry out the sensitivity analysis, we need to partition the total probability of dying from any cause in the general population into the probabilities for smokers and nonsmokers separately.
Consider the odds ratio, θ, which compares the odds of dying from any cause if you are a smoker to the odds of dying from any cause if you are a nonsmoker, θ = [p_{s} / (1 − p_{s})] / [p_{n} / (1 − p_{n})]. By rearranging the formula for the odds ratio, we can write it in terms of the probability of dying from any cause if you are a smoker (p_{s}):
p_{s} = θp_{n} / (1 − p_{n} + θp_{n})   (3)
We now have the probability of dying from any cause if you are a smoker (p_{s}), as a function of both the odds ratio, θ, and the probability of dying from any cause if you are a nonsmoker (p_{n}).
We also know that the total probability of dying from any cause (p_{t}) can be written as a function of p_{s} and p_{n}, if we can quantify the proportion of smokers in the general population (α):
p_{t} = αp_{s} + (1 − α)p_{n}   (4)
By substituting equation (3) into equation (4), we can write the total probability of dying from any cause, p_{t}, in terms of the odds ratio, θ, the proportion of smokers in the general population, α, and the probability of dying from any cause if you are a nonsmoker, p_{n}, as follows:
p_{t} = αθp_{n} / (1 − p_{n} + θp_{n}) + (1 − α)p_{n}   (5)
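We can rearrange equation (5) as follows. Multiplying both sides by (1 − p_{n} + θp_{n}) gives
p_{t}(1 − p_{n} + θp_{n}) = αθp_{n} + (1 − α)p_{n}(1 − p_{n} + θp_{n})
Expanding the brackets gives
p_{t} + (θ − 1)p_{t}p_{n} = αθp_{n} + (1 − α)p_{n} + (1 − α)(θ − 1)p_{n}^{2}
and collecting terms in p_{n} gives
(1 − α)(θ − 1)p_{n}^{2} + [1 + (θ − 1)(α − p_{t})]p_{n} − p_{t} = 0   (9)
The equation is now in a format to which the quadratic formula can be applied. Writing equation (9) as ap_{n}^{2} + bp_{n} + c = 0, with a = (1 − α)(θ − 1), b = 1 + (θ − 1)(α − p_{t}) and c = −p_{t}, the solution is:
p_{n} = [−b + √(b^{2} − 4ac)] / (2a)
The positive root is taken because, when θ > 1 and α < 1, a is positive and c is negative, so the other root is negative and cannot be a probability.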
Now that we can calculate the probability of dying from any cause if you are a nonsmoker, p_{n}, using equation (3), we can also calculate the probability of dying from any cause if you are a smoker (p_{s}).
The population mortality file can now be adjusted, so that rather than using the total probability of dying from any cause (p_{t}) as we would have done previously, we now use the probability of dying from any cause if you are a smoker (p_{s}). This now assumes that 100% of the population are smokers.
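As a numerical cross-check on this partition, the sketch below solves for p_n by bisection rather than with the quadratic formula, then adjusts a small life table. It uses only the two relationships stated in this Appendix: the smoker probability p_s = θp_n / (1 − p_n + θp_n) and the total probability p_t = αp_s + (1 − α)p_n. The life-table rows, the value of α and the function names are hypothetical and chosen for illustration only.

```python
def smoker_probability(p_n, theta):
    """p_s implied by the nonsmoker probability p_n and odds ratio theta."""
    return theta * p_n / (1.0 - p_n + theta * p_n)


def total_probability(p_n, alpha, theta):
    """p_t as the alpha-weighted mix of smoker and nonsmoker probabilities."""
    return alpha * smoker_probability(p_n, theta) + (1.0 - alpha) * p_n


def solve_p_n(p_t, alpha, theta, tol=1e-12):
    """Find p_n by bisection on [0, p_t]. For theta >= 1 the total
    probability increases with p_n and is at least p_t when p_n == p_t,
    so the root lies in this interval."""
    lo, hi = 0.0, p_t
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_probability(mid, alpha, theta) < p_t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical life-table rows (yearly all-cause death probabilities).
life_table = [
    {"age": 60, "sex": "male", "year": 2005, "p_t": 0.012},
    {"age": 70, "sex": "male", "year": 2005, "p_t": 0.030},
    {"age": 80, "sex": "male", "year": 2005, "p_t": 0.080},
]
ALPHA = 0.25  # assumed proportion of daily smokers
THETA = 2.0   # assumed all-cause odds ratio, smokers vs nonsmokers

adjusted = []
for row in life_table:
    p_n = solve_p_n(row["p_t"], ALPHA, THETA)
    # The adjusted file replaces p_t with p_s (everyone treated as a smoker).
    adjusted.append({**row, "p_n": p_n, "p_s": smoker_probability(p_n, THETA)})

for row in adjusted:
    print(f"age {row['age']}: p_t={row['p_t']:.4f}  p_n={row['p_n']:.4f}  p_s={row['p_s']:.4f}")
```

Bisection is slower than the closed-form root but makes no use of the rearranged quadratic, so agreement between the two gives some reassurance that the algebra has been applied correctly.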
Rights and permissions
From twelve months after its original publication, this work is licensed under the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/
About this article
Cite this article
Hinchliffe, S., Rutherford, M., Crowther, M. et al. Should relative survival be used with lung cancer data? Br J Cancer 106, 1854–1859 (2012). https://doi.org/10.1038/bjc.2012.182
DOI: https://doi.org/10.1038/bjc.2012.182