Standard errors of non-standardised and age-standardised relative survival of cancer patients

Background: Relative survival estimates cancer survival in the absence of other causes of death. Previous work has shown that standard errors of non-standardised relative survival may be substantially overestimated by the conventionally used method. However, evidence was restricted to non-standardised relative survival estimates using Hakulinen's method. Here, we provide a more comprehensive evaluation of the accuracy of standard errors including age-standardised survival and estimation by the Ederer II method. Methods: Five- and ten-year non-standardised and age-standardised relative survival was estimated for patients diagnosed with 25 common forms of cancer in Finland in 1989–1993, using data from the nationwide Finnish Cancer Registry. Standard errors of mutually comparable non-standardised and age-standardised relative survival were computed by the conventionally used method and compared with bootstrap standard errors. Results: When using Hakulinen's method, standard errors of non-standardised relative survival were overestimated by up to 28%. In contrast, standard errors of age-standardised relative survival were accurately estimated. When using the Ederer II method, deviations of the standard errors of non-standardised and age-standardised relative survival were generally small to negligible. Conclusion: In most cases, overestimations of standard errors are effectively overcome by age standardisation and by using Ederer II rather than Hakulinen's method.

Relative survival reflects the survival cancer patients would be expected to have in the absence of competing causes of death and is commonly reported by cancer registries. It is computed as the ratio between absolute survival of cancer patients and expected survival in the absence of cancer. The latter is estimated from life tables of the general population (Ederer et al, 1961), mostly by the so-called Ederer II method (Ederer and Heise, 1959) or Hakulinen's method (Hakulinen, 1982).
The variance of relative survival is a function of the absolute and expected survival, the variances of absolute and expected survival, and the covariance between absolute and expected survival (see Appendix). In the computation of the standard error of relative survival it is commonly assumed that the variance of expected survival is zero (which implies that the covariance with absolute survival is also zero), as the database underlying the population life tables is usually very large. Therefore, the standard error is commonly computed as the ratio between the standard error of absolute survival and the expected survival (Parkin and Hakulinen, 1991; Estéve et al, 1994). If the assumption of zero variance of expected survival does not hold, however (for reasons discussed below), then the variance of relative survival may be over- or underestimated, depending, among other factors, on the relative magnitude of the variance of expected survival and the covariance of absolute and expected survival (see Appendix). Brenner and Hakulinen (2005) empirically assessed the validity of the commonly used standard error definition (ignoring variance of expected survival) using data from the nationwide Finnish Cancer Registry. They computed conventional and bootstrap standard errors for 5- and 10-year non-standardised (crude) absolute, expected and relative survival for patients with 25 common forms of cancer in Finland in 1989. Expected survival was computed according to Hakulinen's method. When applying the conventional standard error definition, standard errors were overestimated by up to 17% for 5-year and 32% for 10-year non-standardised relative survival. This overestimation can be explained by non-zero standard errors of expected survival, which arise because expected survival is still subject to random variation due to random variation of the age distribution of the sample.
Furthermore, absolute and expected survival may often be positively correlated, as survival typically decreases with increasing age in cancer patients as well as in the general population. To overcome the overestimation of standard errors, Brenner and Hakulinen (2005) suggested computing bootstrap standard errors for relative survival and provided extensions of publicly available SAS macros for survival analysis (period and periodh; Brenner et al, 2002) that allow these standard errors to be computed.
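The dependence of the variance on these quantities can be made explicit with a first-order (delta method) approximation for a ratio. The notation below is an assumption introduced here for illustration (S for absolute survival, E for expected survival, R = S/E for relative survival, hats for estimates); the Appendix referred to above may present the relationship in a different form.

```latex
% First-order (delta method) approximation for the variance of a ratio R = S/E
\operatorname{Var}(\hat{R}) \approx R^{2}\left[
    \frac{\operatorname{Var}(\hat{S})}{S^{2}}
  + \frac{\operatorname{Var}(\hat{E})}{E^{2}}
  - \frac{2\operatorname{Cov}(\hat{S},\hat{E})}{S\,E}
\right]

% Conventional definition: Var(E-hat) = 0 and Cov(S-hat, E-hat) = 0
\operatorname{Var}_{\mathrm{conv}}(\hat{R}) = \frac{\operatorname{Var}(\hat{S})}{E^{2}}
\quad\Longrightarrow\quad
\operatorname{SE}_{\mathrm{conv}}(\hat{R}) = \frac{\operatorname{SE}(\hat{S})}{E}

% The conventional variance exceeds the approximation above when
\operatorname{Cov}(\hat{S},\hat{E}) > \frac{R}{2}\,\operatorname{Var}(\hat{E})
```

Under this approximation, a sufficiently large positive covariance between absolute and expected survival makes the conventional standard error too large, whereas a small or negative covariance combined with non-zero variance of expected survival makes it too small.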
In addition to non-standardised relative survival, age-standardised relative survival is commonly reported to compare survival across populations with different age structures. Age-standardised relative survival is commonly computed as the weighted sum of age-specific relative survival, using weights according to standard age distributions of cancer patients such as the International Cancer Survival Standard (ICSS; Verdecchia et al, 1999; Corazziari et al, 2004). Standard errors of these survival estimates are defined as the square root of the sum of the squared weighted age-stratified standard errors of relative survival. Thus, the same assumptions as in the computation of the standard error of non-standardised relative survival apply, but now on the level of the age-stratified survival estimates. As the age variability within these age groups is much smaller, the correlation of absolute and expected survival and the standard error of expected survival may be much lower. Thus, the conventional method to estimate the standard error of age-standardised relative survival might result in more accurate estimates.
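The weighted sum and its standard error can be sketched in a few lines of Python. The function name and the numbers in the usage example are hypothetical and serve only to illustrate the two formulas described above; they are not taken from the registry data.

```python
import math


def age_standardised(rel_surv, std_errs, weights):
    """Weighted sum of age-specific relative survival and its standard error.

    rel_surv -- age-specific relative survival estimates
    std_errs -- their age-specific standard errors
    weights  -- non-negative weights (rescaled here to sum to 1)
    """
    total = sum(weights)
    w = [x / total for x in weights]
    # Point estimate: weighted sum of the age-specific estimates.
    estimate = sum(wi * ri for wi, ri in zip(w, rel_surv))
    # Standard error: square root of the sum of squared weighted standard errors.
    std_err = math.sqrt(sum((wi * si) ** 2 for wi, si in zip(w, std_errs)))
    return estimate, std_err


# Hypothetical example with two age groups of equal weight.
est, se = age_standardised([0.8, 0.6], [0.03, 0.04], [1, 1])
```

The standard error formula treats the age-specific estimates as independent, which is reasonable when the age strata contain disjoint sets of patients.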
To our knowledge, however, no previous study has assessed the validity of the conventional method in the context of age-standardised survival. This is particularly important, as age-standardised estimates are frequently reported. In addition, the validity of the method in the context of the Ederer II method, which is now increasingly applied, has not been investigated yet. Thus, the aim of this paper was to assess the validity of the conventional method to estimate standard errors of age-standardised relative survival and to estimate the accuracy of the conventional method when expected survival is computed according to either the Ederer II method or Hakulinen's method. Bootstrap standard errors of non-standardised and age-standardised relative survival were compared with standard errors computed according to the standard procedure. In addition, the random error of non-standardised expected survival and the correlation between non-standardised expected and absolute survival were estimated to check the assumptions of negligible variation of expected survival and negligible correlation.

MATERIALS AND METHODS
The analysis was based on data from the population-based Finnish Cancer Registry, which has been operating for more than 50 years, covers the whole population of Finland (about 5.4 million people), and is well known for its data quality and completeness (Teppo et al, 1994). Patients aged 15 years or older with a first diagnosis of one of 25 common forms of cancer between 1989 and 1993 were included. These years of diagnosis were selected to ensure comparability with results of our earlier work on non-standardised relative survival, which had pertained to patients diagnosed in 1989, the most recent cohort of patients for whom 10-year survival had been completed at the time of our previous analysis. To avoid problems with sparse data arising from age stratification, the current analysis was extended to 5 years of diagnosis, that is, 1989–1993. Reproducibility of the results was assessed by repeating all analyses separately for patients diagnosed between 1979 and 1983 and between 1984 and 1988. As the patterns regarding the standard errors for non-standardised and age-standardised survival were generally very similar, results for these groups of patients are not shown separately.
For each cancer site, 5- and 10-year non-standardised absolute, expected and relative survival and age-standardised relative survival and the corresponding standard errors in % units were computed. Expected survival was calculated based on age-, sex- and calendar period-specific population life tables of Finland using both Hakulinen's method (Hakulinen, 1982) and the Ederer II method (Ederer and Heise, 1959). Because available life tables of Finland were limited to ages up to 95 years, expected survival for cancer patients older than 95 years was based on the survival estimate of the general population of age 95 years. Age-standardised survival was computed as the weighted average of age-specific survival, defining the age groups (15–44, 45–54, 55–64, 65–74, 75+ years) according to the ICSS (Corazziari et al, 2004) and the weights for each site and gender proportional to the numbers of patients at diagnosis in these age groups. The weights were defined in this way to make the age-standardised and non-standardised survival figures mutually comparable.
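As an illustration of this weighting scheme, the following sketch assigns patients to the five age groups named above and derives weights proportional to the number of patients per group. The function names and the example ages are hypothetical; in the analysis described here, the weights were defined separately for each site and gender.

```python
import bisect
from collections import Counter

# Lower bounds of the age groups 15-44, 45-54, 55-64, 65-74 and 75+ years.
LOWER_BOUNDS = [15, 45, 55, 65, 75]


def age_group(age):
    """Return the index (0..4) of the age group containing `age`."""
    index = bisect.bisect_right(LOWER_BOUNDS, age) - 1
    if index < 0:
        raise ValueError("patients younger than 15 years are not included")
    return index


def weights_from_ages(ages):
    """Weights proportional to the number of patients in each age group."""
    counts = Counter(age_group(a) for a in ages)
    n = len(ages)
    return [counts.get(i, 0) / n for i in range(len(LOWER_BOUNDS))]


# Hypothetical ages at diagnosis of four patients.
weights = weights_from_ages([20, 50, 50, 80])
```

Because these weights reproduce the age distribution of the patients themselves rather than an external standard, the weighted average refers to the same age mix as the non-standardised estimate.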
For the calculation of the standard error of non-standardised and age-standardised relative survival, two methods were employed. The first method followed common practice in the analysis of population-based cancer registry data: standard errors of relative survival were obtained by dividing the standard errors of absolute survival, which were computed according to Greenwood's method (Greenwood, 1926), by the expected survival (Ederer et al, 1961). For age-standardised survival, this ratio was computed between age-stratified estimates, and the standard error of the age-standardised relative survival was estimated by the square root of the sum of the squared weighted age-stratified standard errors of relative survival. In the second method, the standard errors of relative survival were estimated by non-parametric bootstrap analysis as the standard deviation of the respective point estimate in 10 000 bootstrap samples. Each bootstrap sample was obtained by sampling with replacement, from the patient population for a given cancer in the cancer registry, the same number of patients as in the original data set. Conventional and bootstrap standard errors were compared by computing ratios between the estimates.
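The comparison of the two methods can be sketched as follows. This is a deliberately simplified illustration and not the SAS implementation used in the analysis: it assumes complete follow-up without censoring (so that Greenwood's standard error reduces to the binomial standard error) and takes expected survival as the plain average of individual expected survival probabilities, which is neither Hakulinen's nor the Ederer II method. All function names and the simulated data are hypothetical.

```python
import math
import random
import statistics


def survival_estimates(alive, expected):
    """Absolute, expected and relative survival (simplified, no censoring)."""
    absolute = sum(alive) / len(alive)
    exp_surv = sum(expected) / len(expected)  # simple average (assumption)
    return absolute, exp_surv, absolute / exp_surv


def conventional_se(alive, expected):
    """SE of absolute survival divided by expected survival."""
    absolute, exp_surv, _ = survival_estimates(alive, expected)
    # Without censoring, Greenwood's formula equals the binomial SE.
    se_abs = math.sqrt(absolute * (1 - absolute) / len(alive))
    return se_abs / exp_surv


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap(alive, expected, n_boot=1000, seed=1):
    """Non-parametric bootstrap: resample patients with replacement."""
    rng = random.Random(seed)
    n = len(alive)
    abs_s, exp_s, rel_s = [], [], []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)  # same size as the original data set
        a, e, r = survival_estimates([alive[i] for i in idx],
                                     [expected[i] for i in idx])
        abs_s.append(a)
        exp_s.append(e)
        rel_s.append(r)
    return {
        "se_relative": statistics.stdev(rel_s),
        "se_expected": statistics.stdev(exp_s),
        "corr_abs_exp": pearson(abs_s, exp_s),
    }


def simulate(n=400, seed=0):
    """Hypothetical patients whose prognosis declines with expected survival."""
    rng = random.Random(seed)
    expected = [rng.uniform(0.3, 0.95) for _ in range(n)]
    alive = [1 if rng.random() < 0.8 * e else 0 for e in expected]
    return alive, expected


alive, expected = simulate()
boot = bootstrap(alive, expected)
ratio = conventional_se(alive, expected) / boot["se_relative"]
```

A ratio above 1 indicates that the conventional standard error exceeds the bootstrap standard error. The same bootstrap samples also yield the standard error of expected survival and the correlation between absolute and expected survival.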
To investigate the assumption of negligible standard error of expected survival and negligible correlation between absolute and expected survival, standard errors for non-standardised expected survival were estimated by bootstrapping, and Pearson's correlation coefficients of the point estimates of non-standardised absolute and expected survival in the bootstrap samples were calculated.

RESULTS
Non-standardised absolute, expected and relative survival and age-standardised relative survival are shown in Table 2 for 5-year survival, and in Table 3 for 10-year survival. Five- and ten-year non-standardised absolute survival varied strongly by cancer site, ranging from 89.7 and 88.1% for testicular cancer to 1.8 and 1.1% for pancreatic cancer. Standard errors of 5- and 10-year absolute survival ranged from 0.25 and 2.84 to 0.18 and 2.98, respectively.
Non-standardised expected survival computed by Hakulinen's method was highest for testicular cancer, as these patients are on average younger than patients with cancer at other sites, and lowest for prostate cancer because of the relatively high age of these patients at diagnosis. Standard errors of 5- and 10-year non-standardised expected survival ranged from 0.13 and 0.22 for breast cancer to 0.93 and 1.53 for eye cancer. For all cancer sites, they were higher for 10-year than for 5-year survival and, although lower than the standard errors of absolute survival, generally not negligible.
Comparing standard errors of relative survival computed by the conventional and bootstrap method, conventional standard errors of 5-year non-standardised relative survival were overestimated by more than 5% for eight cancer sites and by more than 10% for Hodgkin's lymphoma (11%) and thyroid gland cancer (17%). Median overestimation across the 25 cancer sites was 4%. For 10-year non-standardised relative survival, overestimation was mostly larger (median: 6%). For seven cancer sites, it was larger than 10%. The highest overestimation was observed for thyroid gland cancer (26%). For age-standardised relative survival, deviations between the conventional and bootstrap standard errors were negligible (within ±1% for 16 of 25 cancer sites for 5-year survival and for 15 cancer sites for 10-year survival). Strongest overestimations were found for laryngeal cancer for 5-year (3%) and cancer of the breast, urinary bladder and corpus uteri for 10-year (3%) age-standardised relative survival.
The analysis was repeated with the computation of expected survival according to the Ederer II method (Tables 4 and 5). Non-standardised expected survival computed by the Ederer II method was overall higher than the estimates computed by Hakulinen's method. Standard errors of 5- and 10-year non-standardised expected survival were comparable, ranging from 0.15 and 0.27 for breast cancer to 1.55 and 4.01 for gallbladder cancer, respectively.
Correlations between non-standardised absolute and expected survival based on the Ederer II method were generally lower than the corresponding correlations based on Hakulinen's method (ranges across cancer sites: 0.03–0.32 and 0.03–0.32, respectively).
Conventional standard errors of relative survival were generally smaller when computed by the Ederer II method instead of Hakulinen's method. Deviations between the conventional standard errors and the bootstrap standard errors of non-standardised relative survival were negligible for 5-year survival (median deviation: 2%) and small for 10-year survival (median deviation: 3%). Deviations were largest for cancer of the thyroid gland and corpus uteri for 5-year survival (4%), and for gallbladder and urinary bladder cancer (−6 and 6%) for 10-year survival. The accuracy of the standard errors of age-standardised relative survival was again higher, with a median deviation of 0% for 5-year and 10-year survival, and a maximal deviation of 4 and 6%, respectively.
Table 2 Non-standardised absolute, expected and relative survival and age-standardised relative survival after 5 years of patients aged 15 or more years with a first diagnosis of common forms of cancer, Finland, 1989–1993

DISCUSSION
Our results confirm previous observations that conventional standard errors for non-standardised relative survival based on Hakulinen's method may often be substantially overestimated, reaching percentage deviations of up to 28% in the examples assessed in this article. Overestimations were overall larger in case of long survival times, that is, for 10- than for 5-year survival, and in case of higher correlations between absolute and expected survival. The largest overestimations were observed for thyroid cancer and Hodgkin's lymphoma. These two cancer sites had the highest correlations between absolute and expected survival, which can be explained by the large age gradient in prognosis of patients with these two malignancies. We could show, however, that standard errors for age-standardised relative survival based on Hakulinen's method were accurately estimated by the conventional standard error method. When using the Ederer II method, deviations of the standard errors of non-standardised relative survival were small to negligible, and standard errors of age-standardised survival were accurately estimated. The commonly used definition of the standard error of relative survival is based on the assumption of zero standard error of expected survival. However, non-negligible standard errors for non-standardised expected survival were observed, which can be explained by the random variation of the age distribution of the sample. Furthermore, for many cancer sites, a substantial positive correlation between non-standardised absolute survival and expected survival computed by Hakulinen's method was observed, which can be explained by the common variation of absolute and expected survival, as both estimates decrease with increasing age. Within the age groups used for age-standardisation, random variation of the age distribution and, hence, the variance of expected survival and the covariance of absolute and expected survival are smaller.
Overestimation of the standard error of non-standardised relative survival is a consequence of random error in expected survival, along with positive covariance of absolute and expected survival (Brenner and Hakulinen, 2005). Substantial overestimations were observed when expected survival was computed by Hakulinen's method. The age-standardisation computed as weighted averages of age-specific estimates overcomes the problem of overestimated standard errors of relative survival. When expected survival was computed by the Ederer II method, errors in the estimation of the standard errors of non-standardised relative survival were small to negligible. The difference between these two methods might be explained in part by smaller covariance of absolute and expected survival for the Ederer II than for Hakulinen's method, which is reflected by the smaller correlation coefficients in our bootstrap analysis.
Relative survival, computed as the ratio of absolute and expected survival, is often interpreted as an estimate of net survival, that is, survival if cancer were the only cause of death. However, it has been shown that this interpretation is valid only when survival with respect to cancer and survival with respect to other diseases are independent (Pohar Perme et al, 2011). This condition is only rarely fulfilled, as both survival proportions are usually affected by common covariates such as age. Comparisons of the accuracy of methods such as Ederer I, II and Hakulinen for the estimation of net survival have usually been restricted to investigations of the point estimate (Pokhrel and Hakulinen, 2008; Hakulinen et al, 2011). These investigations have suggested that without applying regression modelling, the gold standard to estimate net survival may be age-standardised relative survival with expected survival computed by the Ederer II method and weights proportional to the number of patients at the beginning of the follow-up (Pokhrel and Hakulinen, 2008). In general, the bias in the estimation of net survival is smaller when using the Ederer II method than when using the Ederer I or Hakulinen's method and increases for the Ederer I and Hakulinen's method over time (Hakulinen et al, 2011). Our results on the standard error of relative survival support the recommendation of the Ederer II method, as standard errors of both non-standardised and age-standardised relative survival estimates were accurately estimated by the conventional method.
Table 5 Non-standardised absolute, expected and relative survival and age-standardised relative survival after 10 years of patients aged 15 or more years with a first diagnosis of common forms of cancer, Finland, 1989–1993
Recently, a new method to estimate net survival was proposed by Pohar Perme et al (2011). This new method provides an unbiased estimate for net survival without any modelling. The corresponding variance estimate has already been tested for accuracy by comparison with bootstrap estimates. Results from the empirical and theoretical investigations of this method are very promising. However, the method requires the absence of censoring and it has to be shown how estimates change in case of censoring.
The accuracy of the standard error of relative survival was estimated by the ratio between the conventional and bootstrap standard error. To test whether the measurement of bias in relative terms depends on the sample size, we re-ran the analysis for the two most common cancer sites (breast and lung cancer) and the two cancer sites with the largest bias (thyroid cancer and Hodgkin's lymphoma) using a variable proportion (1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, …, 90%) of the complete data set. No indication of a dependence of relative bias on the sample size was found (data not shown).
In summary, our results show that overestimations of the standard errors can be substantially reduced by computing age-standardised survival. In age-standardised survival, the correlation between absolute and expected survival is substantially reduced. For the Ederer II method, standard errors of non-standardised and age-standardised relative survival were accurately estimated. Thus, our results support the recommendation to use the Ederer II method to estimate relative survival ratios.