Projections of cancer incidence are important for planning health services and for providing a baseline against which the impact of public health interventions can be assessed.
Rates estimated from smooth function age–period–cohort modelling of cancer incidence data from Great Britain 1975 to 2007 are extrapolated to 2030 and applied to UK population projections. Prostate and breast cancer projections take into account the effect of screening.
Overall rates of cancer are projected to be stable over the next 20 years, but this masks individual changes. In both sexes, age-standardised rates of cancers of the stomach, larynx, bladder and leukaemia are projected to fall by 1% per year, whereas cancers of the lip, mouth and pharynx (ICD-10 C00-C14) and melanoma are projected to increase by 1% per year. The growing and aging populations will have a substantial impact: numbers of cancers in men and women are projected to increase by 55% (from 149 169 to 231 026) and 35% (from 148 716 to 200 929), respectively, between 2007 and 2030. The model used yields similar results to those of Nordpred, but is more flexible.
Without new initiatives for smoking and obesity reduction, the number of cancers in the United Kingdom will increase substantially, reflecting the growing and aging populations.
Quantifying the future burden of cancer incidence and mortality, in terms of expected numbers of cases and deaths, or rates of disease, is important in optimising the allocation of resources for screening, diagnostic, therapeutic and palliative services and to provide a baseline from which the success of future interventions can be judged (Bray and Møller, 2006).
The future number of cases (or deaths) is a consequence of two components of change; the evolution of the size and age structure of the population (demographic component) and changes in the rates of disease. The latter component is generally evaluated by some form of projection – likely future trends are estimated on the basis of those observed in the past. A distinction should be made between what we refer to as ‘projections’ and ‘predictions’. We use the term ‘projections’ for the extrapolation of observed trends in incidence rates that do not explicitly take into account expected changes in factors influencing risk (in a positive or negative way). By contrast, we use ‘predictions’ to be statements about the future that take into account all known trends and likely future changes in risk factors. Predictions necessarily involve some sort of modelling: incorporating data on postulated future change in factors that affect risk, the components of the population likely to be involved, the size of the effect (relative risk) and latency between exposure and outcome.
Several statistical methods have been used to make projections of future cancer burden. While short-term projections may assume continuation of past rates of change – generally within specific age groups – projections over the longer term need to model more complex interactions, incorporating the past components of change due to the effects of age, calendar period and birth cohort, and making explicit assumptions about the extent to which past changes are likely to operate into the future (Møller et al, 2003; Sedjo et al, 2007; Olsen et al, 2008). In the United Kingdom, previous studies have used such models to generate cancer mortality projections up to the year 2025 (Olsen et al, 2008), and cancer incidence projections up to the year 2020 for England (Møller et al, 2007). In this paper, we use data on the incidence of cancer in Great Britain (GB: England, Scotland and Wales) between 1975 and 2007 to make projections of the future rates of 28 types of cancer in the United Kingdom up to 2030, and estimate the corresponding future burden in terms of numbers of cases, using national population projections. Compared with the paper by Møller et al (2007), we use data for GB (England, Scotland and Wales) rather than England alone, include data for 2004–2007, make projections for prostate and bladder cancers (which they avoided) and use a methodology that takes into account changes in breast screening when modelling breast cancer rates.
Materials and Methods
Data on the incidence of 28 cancers (individual sites or groups of sites, defined by ICD-10 codes), in terms of numbers of cases and population at risk, by 5-year age group and sex, in GB (England, Scotland and Wales) for the period 1975–2007 were provided by Cancer Research UK, who in turn received the data from the UK Association of Cancer Registries. National population projections (2008 based) for the United Kingdom (GB plus Northern Ireland) by sex, 5-year age group and year, from 2008 to 2030, were obtained from the population projections of the Office for National Statistics (ONS, 2009).
The cancer incidence rates were modelled using age–period–cohort (APC) models and extrapolated out to 2030. The basic APC model was

$$\lambda(a, p) = g\big(f_A(a) + f_P(p) + f_C(c)\big), \qquad c = p - a,$$
where λ is the incidence rate as a function of age and calendar period, g is a ‘link’ function and fA, fP and fC are functions of age, period (year of incidence) and cohort (year of birth, i.e., cohort=period−age), respectively. Here the link function, g, is either the exponential function or the power 5 function (i.e., g(x)=x^5, as used by Møller et al (2002)), and the functions fA, fP and fC are either step functions on 5-year intervals or natural cubic splines. Whereas step functions are more commonly used, cubic splines provide much greater flexibility and are more realistic if one believes that changes over time occur smoothly rather than in sudden jumps. Flexible additive models were popularised by Hastie and Tibshirani (1990) and cubic splines have been used earlier in APC models by Sasieni and Adams (1999, 2000) and more recently by Carstensen (2007). Additionally, there is a question regarding how one should extrapolate the model beyond the most recent period. Although the issue is most apparent in terms of deciding on what fP should be for a future period, the linear dependence between age, period and cohort means that there is no unique definition of a model in which fP is kept constant for future periods. When modelling using cubic splines, we absorb the linear trends in period and cohort over the timespan with observed data into a drift component and then attenuate the drift into the future. The idea of attenuating or dampening the drift comes from the belief that past trends will not continue forever and was seen to be valuable empirically for making future predictions by Møller et al (2002). We have reduced the drift by 8% each year beyond the period of observations. The factor 0.92 (=1–0.08) was chosen so that the drift after 8 years will be approximately half of what it is during the observation period. (Møller et al (2002) used arithmetic dampening: 25% in years 1–5, 50% in years 6–10, 75% in years 11–15. Our geometric dampening gives corresponding amounts of 21.6%, 48.3% and 65.9%.)
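The dampening figures quoted above can be checked with a short calculation. The sketch below assumes that the "corresponding amounts" are the average reduction in drift over each 5-year block, with the drift multiplied by 0.92 for every year beyond the last observed year; the function names are illustrative and are not taken from the authors' Stata code.

```python
# Geometric drift attenuation: the drift is multiplied by 0.92 for each
# year beyond the last observed year (factor taken from the text).
FACTOR = 0.92


def remaining_drift(years_ahead: int, factor: float = FACTOR) -> float:
    """Fraction of the observed drift still applied `years_ahead` years out."""
    return factor ** years_ahead


def mean_dampening(first_year: int, last_year: int, factor: float = FACTOR) -> float:
    """Average reduction in drift (as a fraction) over an inclusive block of years.

    Averaging over the block is an assumption made here so that the result can
    be compared with the arithmetic 25% / 50% / 75% scheme.
    """
    years = range(first_year, last_year + 1)
    return 1.0 - sum(remaining_drift(k, factor) for k in years) / len(years)


after_eight_years = remaining_drift(8)  # should be close to one half
blocks = [mean_dampening(1, 5), mean_dampening(6, 10), mean_dampening(11, 15)]

print(round(after_eight_years, 3))         # 0.513
print([round(100 * b, 1) for b in blocks])  # [21.6, 48.3, 65.9]
```

Under this reading the calculation gives 21.6%, 48.3% and 65.9%, matching the figures in the text, and confirms that the drift is roughly halved after 8 years.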
For cancers of the prostate and breast, the models were fitted to a modified data set in order to take account of the changes in incidence attributable to the increased use of PSA testing and the introduction of breast screening, respectively. For prostate, data from the period prior to extensive use of PSA testing (1975 to 1991) were used to fit an APC model (with a logarithm link) to predict rates in the absence of extensive PSA testing for 2004–2007. These predicted rates were then used to calculate age-specific observed/predicted ratios. The observed counts for 2004, 2005, 2006 and 2007 were divided by these ratios in order to obtain an estimate of what the counts would have been in the absence of the increased use of PSA testing. Predictions for 2008–2030 were made by fitting APC models to the data for 1975–1991 and the modified data for 2004–2007 and multiplying the model projections by the previously obtained observed/predicted ratios. Our projections are thus based on the assumption that the level of PSA testing in the future will be similar to that in 2004–2007. Our methodology separates the underlying trends in prostate cancer incidence from the effect of PSA testing, and can thus be used to make predictions that include explicit assumptions regarding future rates of PSA testing. Similarly, for female breast cancer we used data from before screening was offered to a particular age group to predict what would have happened to rates once the programme had reached a steady state in that age group, and used the ratio of observed to predicted rates to adjust all subsequent data before making predictions until 2030. Explicitly, incidence rates from the following age groups and periods were assumed to have been affected by the introduction of screening and were not used in the model building: age group 50–64 years during 1989–1996; age group 65–69 years during 1990–1997 and 2003–2007; age group 70–74 years during 2004–2007.
In order to make projections for ‘all cancer’, we fitted models to 17 different cancer sites in males plus one for ‘other cancer in males’ and added the results. Similarly we modelled 19 different sites in females plus ‘other’.
A function was developed in Stata to fit the model to the incidence of cancer in individual years (for the 33-year period from 1975 to 2007), and 5-year age groups (0–4, 5–9, …, 80–84, 85+) by sex. This function was used in Stata version 10. The primary projections presented here are based on the power-5 ‘link’ function and natural cubic splines for the functions of age, period and cohort. Other functions were used to study the sensitivity of the results to these constraints. The European standard population (Waterhouse et al, 1976) was used for age standardisation.
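Direct age standardisation, as used here, weights each age-specific rate by a standard population. The sketch below is in Python rather than the authors' Stata. The weights are the 1976 European standard population as commonly tabulated for the 18 age groups 0–4 to 85+, and should be checked against Waterhouse et al (1976) before use. The function name and the example inputs are illustrative.

```python
# European standard population (1976) as commonly tabulated, per 100 000,
# for age groups 0-4, 5-9, ..., 80-84, 85+ (verify against the cited source).
ESP_1976 = [8000, 7000, 7000, 7000, 7000, 7000, 7000, 7000, 7000,
            7000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 1000]


def age_standardised_rate(cases, population, weights=ESP_1976, per=100_000):
    """Directly standardised rate per `per` persons.

    `cases` and `population` are sequences with one entry per age group,
    in the same order as `weights`.
    """
    if not (len(cases) == len(population) == len(weights)):
        raise ValueError("cases, population and weights must have equal length")
    weighted = sum(w * c / n for c, n, w in zip(cases, population, weights))
    return per * weighted / sum(weights)


# Sanity check with a made-up population in which every age group has the
# same rate (10 per 100 000): the standardised rate must equal that rate.
flat_cases = [10] * 18
flat_population = [100_000] * 18
flat_asr = age_standardised_rate(flat_cases, flat_population)
print(flat_asr)
```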
To estimate the number of cases occurring in the population of the United Kingdom in the years 1984, 2007 and 2030, the number of cases in GB in each 5-year age group was multiplied by the ratio of the age (and sex)-specific populations in the United Kingdom and GB. We did not use UK rates, as incidence rates for Northern Ireland are only available since 1993. Population estimates and projections for the United Kingdom and GB (by sex and 5-year age groups) were obtained from the Office for National Statistics (ONS, 2009). The underlying assumption is that the rates (and past trends) for the United Kingdom were not significantly different from those of GB (the population of which was 97.1% of that of the United Kingdom in 2007).
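The scaling from GB to the United Kingdom amounts to applying GB age-specific rates to the UK population. A minimal sketch, using invented counts and populations for two age groups only:

```python
def scale_gb_to_uk(gb_cases, gb_population, uk_population):
    """Estimate UK cases per age group by applying GB rates to the UK population."""
    return [cases * uk / gb
            for cases, gb, uk in zip(gb_cases, gb_population, uk_population)]


# Hypothetical inputs for two age groups (not data from the paper).
gb_cases = [100, 200]
gb_population = [1_000_000, 500_000]
uk_population = [1_030_000, 515_000]

uk_cases = scale_gb_to_uk(gb_cases, gb_population, uk_population)
print(uk_cases)  # [103.0, 206.0]
```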
Comparisons, for all sites other than breast and prostate, of our cubic spline projections were made with those using step functions in the Nordpred software (Møller et al, 2002), run in the R statistical package. For each site–sex combination and each 5-year age group (age 50–84 years) and 5-year period, we calculated the absolute difference between the two projections, and compared this with the mean of the two projections.
Overall projected rates and numbers
Trends (1984–2007) and projections (2008–2030) of age-standardised cancer incidence rates by sex and age group (10–24, 25–49, 50–64, 65–74, 75+ and all-ages) for all sites combined are shown in Figure 1. The dots represent observed rates. The solid lines are the projections based on the exponential link and the dotted lines are the projections based on the power-5 link function. Note that the jump from the observed rate for males aged 75+ years in 2007 and the projected rate for 2008 is due to the modelling of prostate cancer for which the projections were made from a 2004–2007 base. Table 1 shows age-standardised incidence rates and numbers of cancers for individual cancer sites, and for all cancers combined for 1984, 2007 and 2030. The rates for 1984 and 2007 are the observed GB rates. Those for 2030 are based upon the projections using the power-5 model. For all years, the UK numbers are estimated by applying the UK population to the age-specific GB rates.
The observed age-standardised rates (ASRs) for cancer of all sites by sex in GB have increased since 1984. For both males and females, the ASRs are projected to level off with a 1.0% fall (from 2007 to 2030) in males and 1.9% fall in females.
Figure 2 shows trends and projections for 17 different cancers (separately for males and females). The graphs are arranged so that those with similar scales on the y-axis (incidence rates) are in the same panel. As before, the dots represent observed data and the lines represent the model fits (dotted line for power-5, solid line for exponential). For all cancers apart from breast and prostate, the model fits during the observed period (1975–2007) are also plotted. In most cases the fit to the observed data is excellent, and in all cases it is reasonable. The differences between the exponential and the power-5 model fits (to the observed data) are generally small, but the projections diverge the further one goes into the future. The only site for which the short-term (i.e., 1–9 years) projections by the two models are not extremely similar is prostate cancer (particularly in the age group 65–74 years). For male lung cancer, the attenuation of the decreasing rates is seen to be particularly strong for the power-5 model.
There is a projected decrease in age-standardised incidence rates for several sites between 2007 and 2030, with annual declines of 1% or more for stomach, larynx, bladder, ovary and leukaemia. Conversely, ASRs are projected to increase at rates of 1% or more annually for cancers of the oral cavity, liver (males only), malignant melanoma and kidney (males only).
These overall summary figures disguise some interesting variations in the projected rate of change over time, and between age groups. For instance, the large increases in incidence of cancer of the ovary up to around the year 2000 have been reversed in the last 5–6 years, and this decline is projected to continue into the future (Figure 2E). On the other hand, the current steep increase in incidence of cancer of the corpus uteri (primarily endometrial cancer) is projected to continue, before peaking around the year 2012 for women aged 65–74 years and a few years later for those over 75 years (Figure 2D). It is noted that the fit of the APC model to the observed rates for cancer of the corpus uteri is poor (the lines do not run through the dots in the age groups 50–64 or 75+). For this reason, the projected down-turn in ages 65–74 and 75+ years must be viewed with some caution. For breast cancer, the currently observed decrease in incidence in the age group 50–64 years is projected to continue into the future, while the recent increase at ages 65–74 years is projected to continue for another 7 or so years, before reversing (Figure 2I). The results for prostate cancer vary somewhat depending on whether one uses an exponential or a power-5 ‘link’ in the model, but, in general, rather modest increases are projected in all age groups, subsequent to the very brisk increases observed following the introduction of PSA testing in the early to mid 1990s (Figure 2K).
Effect of an increasing and ageing population
There is projected to be almost no change in the overall incidence rates of cancer (for all cancers combined) in the 23-year period 2007–2030: the age-standardised incidence is projected to decrease by 1.0% (equivalent to an average annual change of −0.04%) in males and by 1.9% (−0.08% per year) in females (Table 1). However, because of the increase in the size of the population and ageing, the number of cases is projected to increase for practically all types of cancer, even for those with decreasing rates. The projected increase in population size is 16% (61.0 million in 2007 and 70.7 million in 2030) for the United Kingdom. The proportion of elderly persons in the population is increasing at a faster rate: from 16% aged 65 years or more in 2007 to 22% in 2030; the proportion of very elderly (85 or older) is expected to increase from 4.5% to 7.6% over the same time period. In 2007, 100 672 (67.4%) cancers in men and 87 214 (58.7%) in women were in those aged 65 plus years; in 2030, we project that there will be 176 224 (76.2%) male and 135 784 (67.5%) female cancer patients over age 65 years. In terms of absolute numbers at all ages, the largest increase is in cancers of the breast (11 684 (26%) more cases in 2030 than in 2007) and prostate (25 006 (69%) more cases in 2030 than in 2007).
Figure 3A shows, for all cancers combined, the percentage change between 2007 and 2030 in terms of ASRs, crude incidence rates and numbers of cases. The change in ASRs approximates to the change in risk, while the change in crude rates includes, in addition, the effects of an ageing population. A further component, as represented by the difference between the increase in crude rates and that in the number of cases, is due to the increase in size of the population.
Figure 3B shows the equivalent changes (in ASR, crude rate and numbers of cases) for cancers of the prostate and breast. The ASR of breast cancer is projected to fall by 8% between 2007 and 2030, while, despite the overall decline in risk, there will be a small increase in the crude rate (6%) because of population ageing. Population ageing makes the largest contribution to the projected increase in the number of prostate cancer cases.
Figure 4 shows the proportionate distribution of different cancers among the totals in 1984, 2007 and 2030. For females, the rank order and percentage distribution change rather little over time; the biggest differences being the increase in melanoma and the decrease in stomach and cervical cancers. For males, however, there has been a marked change. Lung cancer, by far the most important cancer in 1984 (29% of all cancers) is now in third position (12% of male cancers), while prostate cancer now comprises 29% of the total, having comprised only 11% of the total in 1984. Stomach cancer, the fifth most common type of cancer in 1984 no longer figures in the top 10 (it ranks 12th, with <2% of the total), and melanoma (14th with just under 1% of male cancer in 1984) is predicted to become the fourth most common male cancer accounting for 4.6% of cases in 2030.
Comparison with Nordpred predictions
We also used the Nordpred package (with a power-5 link and 5-year step functions for age, period and cohort) to project rates until 2023–2027. Comparisons of 2052 age–sex–site–period-specific projections (for all sites other than breast and prostate, for ages 40+ years for all sites except for testis, and for ages 15–79 years for testis) were made (Table 2). The median of the observed absolute differences increased from 9% for 2008–2012 to 14% for 2023–2027. The 95th percentile (i.e., the 488th of the 513 comparisons in each 5-year calendar period when ranked in ascending order) of the distribution of disagreement was about four times greater than the median (Table 2).
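The disagreement measure and its summary statistics can be sketched as follows. The measure itself (absolute difference divided by the mean of the two projections) follows the Methods. The rank rule for the percentile, the smallest rank covering at least 95% of the comparisons, is an assumption that happens to reproduce the 488th of 513 quoted above.

```python
import math
import statistics


def relative_disagreement(projection_a: float, projection_b: float) -> float:
    """Absolute difference between two projections relative to their mean."""
    return abs(projection_a - projection_b) / ((projection_a + projection_b) / 2.0)


def percentile_by_rank(values, q: float) -> float:
    """Value at 1-based rank ceil(q * n) in ascending order (assumed convention)."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))
    return ordered[rank - 1]


# Rank of the 95th percentile among 513 comparisons under this convention.
rank_95_of_513 = math.ceil(0.95 * 513)

# Hypothetical pairs of projected rates (spline model, step-function model).
pairs = [(110.0, 90.0), (50.0, 50.0), (30.0, 33.0)]
disagreements = [relative_disagreement(a, b) for a, b in pairs]
median_disagreement = statistics.median(disagreements)

print(rank_95_of_513, median_disagreement)
```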
These results can be compared with those obtained by using the data until 2002 in Nordpred to predict results for 2003–2008 and comparing these with the observed data. Here the median absolute disagreement was 7% and the 95th percentile 28%.
The observed data were annual incidence rates, for GB, by cancer site, sex and 5-year age group from 1975 to 2007. The rates were converted to numbers of cases for the United Kingdom by multiplying by the ratio of the respective populations. Compared with the actual incidence (number of recorded cases) in the United Kingdom in 2007, the difference between observed and modelled numbers of cases (using GB rates applied to the UK population) was <1.5% for all sex/site combinations. In Table 1, we present the projected changes in ASRs and in the numbers of cases of cancer between 2007 and 2030. These can be used to split the change in numbers between changes due to changing rates and changes due to changing population. Thus, for instance, the 11% increase in male stomach cancer cases is accompanied by a 33% fall in rates, so that the effect of the population change is to increase numbers by 66% (=[100+11]/[100−33]−1).
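The split between rate change and population change in the stomach cancer example can be written as a one-line function. This is a minimal sketch of the arithmetic in the text; the function name is illustrative.

```python
def population_effect(pct_change_cases: float, pct_change_rate: float) -> float:
    """Percentage change in case numbers attributable to population change.

    Treats the change in cases as the product of a rate factor and a
    population factor, so the population factor is their ratio.
    """
    cases_factor = (100.0 + pct_change_cases) / 100.0
    rate_factor = (100.0 + pct_change_rate) / 100.0
    return 100.0 * (cases_factor / rate_factor - 1.0)


# Male stomach cancer: 11% more cases alongside a 33% fall in rates.
stomach_male = population_effect(11.0, -33.0)
print(round(stomach_male))  # 66
```

The same function can be applied to any row of Table 1, given the percentage changes in numbers of cases and in ASRs.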
We have deliberately shown results on a linear rather than a logarithmic scale because of the greater uncertainties in predicting cancers in the young. For ovarian cancer, for instance, cancers in young women are likely to be germ cell with very different risk factors from epithelial cancer in older women; therefore, there is no reason to suspect that cohort effects seen in those aged 15–29 years will carry forward to ages 35–49 years. Rates of melanoma, ovarian cancer, endometrial cancer (corpus uteri) and testicular cancer (not shown in Figure 2) are all non-negligible in those aged 25–49 years, and our model is seen to fit these observed data well. For other sites, one would need to adjust the model more carefully if the interest were specifically in rates in this age group.
Future predictions depend on multiple assumptions, but the basic premise is that past trends, affecting as they do the risk of cancer in specific generations and/or time periods, will be carried forward into the future. We modified this basic assumption in one important respect – we attenuated the ‘drift’ component of the observed changes in rates by 8% each year; thus by 2017–2018 the annual drift is just 43% of what it was between 1975 and 2007, and by 2030 it is just 15%. The idea of attenuating the drift, so as not to assume that increases or decreases continue forever, was proposed and shown to be empirically useful by Møller et al (2002), although whereas they used arithmetic damping, we chose to use geometric damping.
We compared our results with those based on projections using the Nordpred package and its default power-5 link. The Nordpred package requires input of past data for 5-year time periods, and produces projections for periods of the same duration. The ‘drift’ component in Nordpred is reduced by 25% for each 5-year period after the first. The results are a little different from those based on our modelling using single years of observation, and the magnitude of the deviation is similar to that seen when current observed incidence rates are compared with those projected from past data using Nordpred. We also allow the reader to compare our model projections using power-5 and exponential link functions. The latter is used for comparison because it is the default for Poisson regression and because the age and period effects can be interpreted as relative risks. It is not used for the main projections because it can lead to extreme results, as the model is extrapolated further into the future. It is of course impossible to judge in advance which of the many projections (by sex, age group and cancer site) will prove to be the more accurate.
The uncertainty associated with these predictions does not concern sampling error (which would be small given the relatively large numbers of cases under study), but rather the unquantifiable bias that arises when trends in some cancers behave in a manner that is inconsistent with the assumptions of the statistical analysis of past rates.
The projected numbers of cases are also dependent on reasonable population projections. In this study, we used the 2008-based population projections prepared by the Office for National Statistics (ONS, 2009).
In the 24-year period 1984 to 2007, the overall age-standardised incidence rate for cancer increased in GB in both sexes, although in the most recent 10-year period (1998–2007) rates have remained fairly constant (Figure 1). Male rates increased slightly but have now returned to the level of the late 1990s. Female rates have increased slightly (by around 3%). Future projections suggest that the decline in male rates will continue, whereas rates in women will peak, then start to decrease during the current decade (2010–2019). The reduction in smoking prevalence has been associated with declines in lung cancer rates in males throughout the period, and, from around 2004, in females (Cancer Research UK, http://info.cancerresearchuk.org/cancerstats/projections/lung/females/, accessed December 2010). Our projections implicitly assume that the reduction in smoking will continue, but do not explicitly model smoking rates or take account of the most recent changes in smoking prevalence. Indeed, changing rates of smoking over the last decade will continue to affect lung cancer rates into the next decade due to the cohort effect of smoking cessation – ex-smokers have lower rates of lung cancer compared with current smokers even (particularly) many years after cessation. Nevertheless, whereas the reduction in male smoking in the United Kingdom in the last quarter of the twentieth century was dramatic, the more recent changes have been more modest and while the rates of lung cancer are predicted to continue to fall, the numbers will begin to increase (Table 1) as the population grows and ages. Similarly, we do not consider the likely impact of colorectal screening, HPV vaccination or of accelerated changes in obesity. The predicted impact of fecal occult blood testing on colorectal cancer incidence is small, as this form of screening is intended primarily to diagnose cancer early.
Nevertheless, the lead time from screening will change the age-specific rates (Parkin et al, 2008), and with the introduction of flexible sigmoidoscopy it is likely that the bowel screening programme will result in a noticeable fall in the incidence of distal cancers. The impact of HPV vaccination (introduced in women aged 12–18 years in 2008) on cancer rates by 2030 will be quite small, as the total burden of HPV-related cancers by age 40 years is small relative to the lifetime burden of cancer in women.
After a long period of increasing incidence, age-standardised incidence rates of breast cancer have begun to decline since 2005, and this trend is projected to continue, reflecting in particular declines in the age group 55–64 years. Some (if not all) of the decline in breast cancer in this age group is undoubtedly due to the substantial reduction in the use of HRT in recent years (Parkin, 2009).
Other notable changes in rates include the continued steady fall in stomach cancer in both men and women; the increase in number of rarer cancers including melanoma, non-Hodgkin's lymphoma, kidney, liver and oropharynx in both men and women; the increase in oesophageal cancer particularly in women; and the increase in cancer of the corpus uteri. Although it is not possible in a paper such as this to consider in detail the likely reasons for past changes in rates of particular cancers or the likely impact of changes to the health service over the next decade, the standard approach for obtaining projections at any site can be used as a basis for more detailed epidemiological study of such trends one site at a time.
Møller et al (2007) published projections of cancer incidence for England, for the period 2004–2020, based on observed rates in 1974–2003, using the Nordpred package. Although incidence rates in men were projected to decline by 7% between 2001 and 2020 (driven largely by the declining rates of lung cancer), female rates were projected to increase by the same amount, with significant increases in the risk of breast cancer in particular. Apart from the somewhat different populations studied (our projections were based on rates from GB, not just England (86.2% of the British population in 2007)), the availability of 4 years of additional data (2004–2007) and the slight differences in methodology, there are two important sources of difference. First, Møller et al ‘projected’ the future incidence of cancer of the prostate by assuming that rates would remain at the level observed in 1999–2003. This almost certainly results in an estimate of future burden of cancer of the prostate that is too low, and, as this cancer accounted for almost one-quarter of all new cancer cases in men in 2007, an underestimate too of the overall cancer burden. There is considerable uncertainty in predicting prostate cancer incidence, which is being driven not only by an inherent increase in the risk of the disease, but also by over-diagnosis (and over-treatment) as a consequence of testing with PSA. There is little information available on the extent of PSA testing in the United Kingdom. In Scotland, PSA testing first came into use in 1989, and the rate of testing accelerated rapidly after 1991 (Brewster et al, 2000); the slowly increasing rates of incidence in the United Kingdom greatly accelerated at this time (Figure 2K). We have attempted to capture the underlying increase in incidence (pre-testing) and assumed a testing effect remaining as it was in 2004–2007.
This is of course almost certainly wrong, but, equally certainly, likely to provide a more realistic future estimate than the assumption of no change, as we do take into account the underlying increase in rates and we believe that PSA testing, which is currently much less common than in the USA, will most likely increase. Second, Møller et al (2007) projected forward the trends in breast cancer incidence observed in 1994–2003. The changes in incidence in recent years due to patterns of breast cancer screening by age, and the decrease in use of HRT (Parkin, 2009) mean that their assumption of a continuing increase in incidence rates (∼1% annually) is almost certainly too pessimistic. In our prediction, the age-standardised incidence for breast cancer will decline by 8% between 2007 and 2030.
Projections of cancer incidence, although inherently subjective and unreliable, do provide a necessary baseline for future planning of cancer resources and against which preventive interventions can be judged. The methodology used here minimises the subjectivity and provides a framework for such planning.
We acknowledge Lesley Walker, Nick Ormiston-Smith and Catherine Thomson at Cancer Research UK for provision of data and helpful comments on the project. We also acknowledge the essential work of the cancer registries in the United Kingdom Association of Cancer Registries, without which there would be no incidence data. The work of PS and AA was supported by Cancer Research UK programme grants C8162/A10406 and C569/10404 respectively.
This work is licensed under the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/