Many high-income countries, including England and the United States, face the challenge of providing care to an ever-increasing population of patients with cancer (Sullivan et al, 2011). There are currently 1.8 million people living in England with a cancer diagnosis (Maddams et al, 2012). According to the latest projections, more than one in three people in England will develop cancer in their lifetime, and an estimated 3 million people will be living with cancer in England in 2030 owing to increasing incidence and improving survival (National Audit Office, 2015). These trends are expected to increase pressure on the budget of the National Health Service (NHS).

Evidence on the cost of cancer should be one of the main pillars supporting policymakers in achieving the best value for money and realising an efficient allocation of public resources across different services and pathways of care. However, there is a dearth of evidence due to the lack of large databases collecting information on the cost of care accessed by patients over a sufficiently long period of time (National Audit Office, 2015). In England, current evidence is based on a limited number of patients treated in a restricted number of hospital sites (Hall et al, 2015), or on predicted pathways of care (Incisive Health, 2014). A number of studies used aggregated utilisation and cost data (Martin et al, 2008; Luengo-Fernandez et al, 2013; Incisive Health, 2014), which may affect the accuracy of estimates and limit the scope for analysis. Some authors deem the shortage of health economic studies to be a major contributor to the increasing cancer costs in England and other developed countries (Sullivan et al, 2011).

In the USA, the availability of the SEER-Medicare database has allowed researchers to build a growing body of evidence on the direct and economic costs of cancer over the past 20 years (Brown et al, 1999, 2002; Warren et al, 2008; Yabroff et al, 2008a, b, 2009, 2011; Basu and Manning, 2010). The USA experience highlights the potential of using population-based, patient-level databases in investigating a wide range of topics on the cost of cancer and in producing evidence to inform policymakers and the wider public. Although SEER-Medicare provides granular data on health-care utilisation and the costs of care, data are available only for a proportion of the population in the USA aged 65 and over. A recent study from New Zealand using population-based, patient-level data found higher costs in the age groups under 65 (Blakely et al, 2015).

In this study, we generate a new database for the analysis of the cost of cancer in England, similar to SEER-Medicare in the United States, by matching data on the cost of care to data from all English cancer registries and hospital administrative databases. We use the new database to estimate incidence and prevalence costs of cancer in England and compare patterns of care costs in patients aged 18–64 and ≥65 years. The use of population-based, patient-level data allows us to disaggregate our estimates by phases of care and by stage at diagnosis, and to compare the direct cost of care in patients with and without cancer in the whole population of England. To the best of our knowledge, this is the first time that such an analysis has been carried out using population-based, patient-level data in England.

Methods and materials

Data sources

Our study includes data from three main sources: the National Cancer Data Repository (NCDR); Hospital Episode Statistics (HES); and the National Schedules of Reference Costs (NSRC). The NCDR provides information on the characteristics of patients, including tumour site, age, date of cancer diagnosis and date of death. HES collects information on patients’ utilisation of hospital inpatient and outpatient care for all NHS patients in England; the few non-NHS patients account for <1% of total hospital income. Our data extract includes all episodes of care generated by patients in our sample before and after their cancer diagnosis between 2001 and 2010 (Supplementary Appendix 1). Finally, the NSRC includes information on the cost of all inpatient and outpatient services accessed by NHS patients. All NHS hospitals are mandated to report the cost of every service delivered to their patients at the end of the fiscal year. Cost data are disaggregated at the level of HRG (Healthcare Resource Group) making special adjustments for patients’ type of admission, length of stay and access to special services, such as renal dialysis, chemotherapy, radiotherapy and rehabilitation. A detailed description of the NSRC data is included in Supplementary Appendix 2; the procedure followed to match NSRC with HES data is described in Supplementary Appendix 3.


We considered all individuals with a recorded diagnosis of colorectal cancer (ICD-10 codes: C18, C19 and C20), breast cancer (C50), prostate cancer (C61) or lung cancer (C33 and C34) in the cancer registries of England between 1 January 2001 and 31 December 2010. We excluded individuals aged under 18, individuals with a previous history of cancer, and males with breast cancer. We further excluded patients registered from a death certificate only (DCO), in line with previous work (Coupland et al, 2011). Our final sample included 275 985 colorectal cancer patients, 359 771 breast cancer patients, 286 426 prostate cancer patients and 283 940 lung cancer patients.

Outcome measures

The primary outcome measures of this study were incidence costs, phase of care costs and prevalence costs. All hospital activity is costed using unit costs fixed at their 2010 values.

Incidence costs

Incidence costs are defined as the costs of delivering care to a homogeneous cohort of patients fixed in the year of their diagnosis and followed up for a number of years. In every year following the diagnosis, incidence costs include only patients who survived the previous year. Although we have registry data from 2001 to 2010, we have accurate costing data from 2006 to 2010 only. We extended the time window for cost analysis by including all cohorts diagnosed between 2001 and 2007, using methods similar to previous work (Brown et al, 1999, 2002). Firstly, we defined a starting cohort of patients diagnosed with cancer in 2007 and followed these patients for up to 3 years post diagnosis and 1 year prior. Secondly, we estimated 4–9-year incidence costs using hospital activity generated in 2010 by patients diagnosed between 2001 (9-year incidence cost) and 2006 (4-year incidence cost).
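The cohort design described above can be sketched as a simple lookup from years since diagnosis to the diagnosis cohort and the calendar year of hospital activity. This is only an illustration: the function name and the convention that the offset equals activity year minus diagnosis year are assumptions made here, not the authors' code.

```python
# Illustrative sketch only: names and the "offset" convention
# (activity year minus diagnosis year) are assumptions.
COSTED_YEARS = range(2006, 2011)  # accurate cost data cover 2006-2010


def cohort_and_activity_year(offset):
    """Return (diagnosis cohort, activity year) for a given offset in years."""
    if -1 <= offset <= 3:
        # The 2007 cohort is observed from 1 year before to 3 years after diagnosis.
        cohort, activity = 2007, 2007 + offset
    elif 4 <= offset <= 9:
        # Earlier cohorts (2001-2006) contribute their 2010 activity only.
        cohort, activity = 2010 - offset, 2010
    else:
        raise ValueError("offset outside the -1..9 window described in the text")
    assert activity in COSTED_YEARS
    return cohort, activity


schedule = {k: cohort_and_activity_year(k) for k in range(-1, 10)}
```

Under these assumptions, every activity year used falls within the 2006–2010 window for which cost data are available, which is the reason for switching cohorts after year 3.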

We used inverse probability weights (IPWs) (Hirano and Imbens, 2001; Wooldridge, 2007) to adjust for potential differences between patients in the 2007 cohort and patients in the 2001–2006 cohorts. IPWs give greater weight to the cost estimates of individuals whose characteristics are similar to those of the 2007 cohort. Similarly, we extended our incidence costs up to 3 years before diagnosis using patients diagnosed in 2008 and 2009. IPWs were calculated from the propensity scores of a set of logistic regressions estimated over the differences between the 2007 and other cohorts. The set of examined covariates includes: age; gender; deprivation; strategic health authority of residence; surgery in first 12 months; and number of hospital admissions in first 12 months. Little difference was observed between the 2007 diagnosis cohort and other cohorts.
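As a rough illustration of the weighting step, the sketch below computes a weighted mean cost for a non-2007 cohort from already-estimated propensity scores. The exact form of the weights is not stated in the text; the odds form p/(1 − p), where p is the estimated probability of belonging to the 2007 cohort, is one common choice and is an assumption here, as are the function and variable names.

```python
# Illustrative sketch only. The weight formula p / (1 - p) is an assumption;
# the text does not state which form of inverse probability weight was used.
def ipw_weighted_mean_cost(costs, propensity_scores):
    """Weighted mean cost of a comparison cohort.

    costs: annual cost per patient in the comparison (non-2007) cohort.
    propensity_scores: estimated probability, from a logistic regression,
        that each patient belongs to the 2007 cohort given their covariates.
    """
    if len(costs) != len(propensity_scores):
        raise ValueError("costs and propensity_scores must have equal length")
    # Patients who resemble the 2007 cohort (high p) receive larger weights.
    weights = [p / (1.0 - p) for p in propensity_scores]
    return sum(w * c for w, c in zip(weights, costs)) / sum(weights)


# Made-up numbers, purely to show the mechanics.
example = ipw_weighted_mean_cost([100.0, 200.0], [0.5, 0.75])
```

In this toy example the second patient looks more like a 2007 patient, so the weighted mean is pulled towards that patient's cost.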

Phase of care costs

We identified three distinct phases of care by examining patients with increasing survival times similarly to other studies (Brown et al, 1999; Yabroff et al, 2008a, 2009):

  • The initial phase: the first 6 months immediately following diagnosis.

  • The terminal phase: the final 12 months of life.

  • The continuum phase: the time period between the initial and terminal phase.

In patients surviving no longer than 12 months (e.g., a large share of lung cancer patients), all costs were allocated to the terminal phase. To enter the initial phase, a patient must survive at least 13 months, and to enter the continuum phase, a patient must survive at least 19 months.
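The allocation rules above can be sketched as a small function that splits a deceased patient's survival time into phases, filling the terminal phase first, then the initial phase, with any remainder going to the continuum phase. The function name and the whole-month granularity are assumptions for illustration; patients still alive at the end of follow-up are not covered by this sketch.

```python
# Illustrative sketch only: function name and month granularity are assumptions.
def phase_months(survival_months):
    """Split months survived after diagnosis into phases for a deceased patient."""
    if survival_months < 0:
        raise ValueError("survival_months must be non-negative")
    # Terminal phase takes priority: the final 12 months of life.
    terminal = min(survival_months, 12)
    remaining = survival_months - terminal
    # Initial phase: up to the first 6 months after diagnosis.
    initial = min(remaining, 6)
    # Continuum phase: whatever lies between the initial and terminal phases.
    continuum = remaining - initial
    return {"initial": initial, "continuum": continuum, "terminal": terminal}
```

For example, a patient surviving 13 months contributes 1 month to the initial phase and 12 to the terminal phase, and a patient surviving 19 months is the first to contribute a month to the continuum phase, matching the thresholds stated above.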

Prevalence costs

Prevalence costs provide a snapshot of the total cost of care delivered to all patients in a specific calendar year and include patients at different points after diagnosis. Prevalence costs are useful to monitor resources used by patients with a similar cancer and to plan for appropriate resource allocation in the future. Prevalence costs were estimated for 2010 by including only patients who were diagnosed within the previous 5 years (2006–2010). Costs were estimated for each phase of care (initial, continuum or terminal).

As this population of patients would still consume health care if free from cancer, we also compared cancer prevalence costs with the costs of care in a similar population without cancer. To this end, we used data on ‘all’ inpatient and outpatient admissions in England in 2010, data from Census 2011 and a simple standardisation technique. Firstly, we calculated the total cost of care accessed by ‘all’ patients aged 18 and over in 2010 (excluding patients with cancer and their costs) using the same costing methods applied to cancer patients, as described in the Supplementary Material. Secondly, we calculated the average cost of care by 5-year age groups by dividing the total cost in each age group by the total population in each age group. Finally, we multiplied the average cost in each age group by the total population of cancer patients in that group.
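The three standardisation steps can be sketched as follows. The function name, the dictionary layout and the numbers in the example are all hypothetical and serve only to show the arithmetic; they are not taken from the study data.

```python
# Illustrative sketch only: names, data layout and example figures are hypothetical.
def expected_cost_without_cancer(total_cost_by_age, population_by_age,
                                 cancer_patients_by_age):
    """Expected cost of the cancer population had it been cancer free.

    total_cost_by_age: total hospital cost of non-cancer patients per age group.
    population_by_age: total population per age group (e.g. from a census).
    cancer_patients_by_age: number of cancer patients per age group.
    """
    expected = 0.0
    for age_group, n_patients in cancer_patients_by_age.items():
        # Average cost per head in the comparison population for this age group.
        average_cost = total_cost_by_age[age_group] / population_by_age[age_group]
        # Scale by the number of cancer patients in the same age group.
        expected += average_cost * n_patients
    return expected


# Made-up figures for two 5-year age groups.
comparison_cost = expected_cost_without_cancer(
    {"65-69": 1000.0, "70-74": 3000.0},
    {"65-69": 10, "70-74": 20},
    {"65-69": 2, "70-74": 4},
)
```

Subtracting this expected cost from the observed prevalence cost of the cancer population gives the additional cost attributable to cancer under these assumptions.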


Results

Patient average incidence costs

Table 1 reports the characteristics of patients in our sample separately for patients aged 18–64 and ≥65 years. The latter age group accounts for a substantial share of the population affected by the main four cancers: 73.3% of colorectal; 44.3% of breast; 77.4% of prostate; and 74.7% of lung cancer patients. Patients aged 18–64 have a higher probability of receiving surgery within 12 months of their diagnosis and of surviving the first year after diagnosis. Cancer staging was missing in 24.5% of colorectal and 54.7% of breast cancer patients and was imputed following methods described in Supplementary Appendix 4. Staging was missing for a large majority of patients with prostate and lung cancer; hence, we did not report it.

Table 1 Patients’ characteristics in selected cancer sites, 2001–2010

Table 2 reports average incidence costs per patient for patients aged 18–64 and ≥65, from 3 years before to 9 years after their diagnosis. Costs of care are relatively small 2–3 years pre-diagnosis across all cancers and age groups and range from £162 per year for a prostate cancer patient aged 18–64 to £542 per year for a lung cancer patient aged ≥65. Costs start growing 1 year before diagnosis, ranging between £484 (breast cancer, aged 18–64) and £1979 (lung cancer, aged ≥65), and peak in the year of diagnosis with marked differences between age groups. Costs in the year of diagnosis reach £17 241 per patient aged 18–64 and £14 776 per patient aged ≥65 in colorectal cancer, £11 109 and £7788 in breast cancer, £5171 and £4699 in prostate cancer and £12 083 and £9061 in lung cancer patients, respectively. Costs fall in the years following the diagnosis but remain substantially higher than their pre-diagnosis level, with patients aged 18–64 now experiencing smaller costs than patients aged ≥65.

Table 2 Average incidence costs per patient in selected cancer sites. Incidence costs are defined as the total cost of care delivered to all patients who are alive at the beginning of the considered period

Table 3 reports average incidence costs per patient for patients diagnosed with lower stage cancer (stage 1–2) and patients diagnosed with higher stage cancer (stage 3–4) for colorectal and breast cancer. Costs are calculated separately for patients aged 18–64 and ≥65, and the difference in costs between lower and higher stage diagnoses is also reported. We were able to examine patients with colorectal and breast cancer only, since staging is not reported in a sufficient number of prostate or lung cancer patients. An early diagnosis is associated with lower costs in patients with colorectal and breast cancer in both age groups. However, the potential cost savings associated with an early diagnosis are greater in patients aged 18–64 than in patients aged ≥65. In colorectal cancer, a lower stage diagnosis is associated with a cost difference of −£4276 per patient aged 18–64 in the first year of diagnosis (−22.3% of first year costs) as compared with −£1215 per patient aged ≥65 (−7.9%). The total difference in cost 9 years after diagnosis equals −£12 577 per patient aged 18–64 as compared with −£4294 per patient aged ≥65. In breast cancer, a lower stage diagnosis is associated with a cost difference of −£2569 per patient aged 18–64 in the first year of diagnosis (−19.3% of first year costs) as compared with −£1207 per patient aged ≥65 (−13.7% of first year costs). The total difference in cost 9 years after diagnosis equals −£13 659 per patient aged 18–64 as compared with −£7812 per patient aged ≥65. Differences in the cost of care between lower and higher staging 2–3 years before the diagnosis are small in both colorectal and breast cancer. This suggests that much of the difference in costs emerging after the diagnosis is explained by differences in cancer staging.

Table 3 Average incidence costs per patient by lower and higher stage cancer

Table 4 shows the differences in the type of care accessed by patients with lower and higher stage colorectal and breast cancer. Patients with lower stage colorectal and breast cancer are more likely to receive surgery within 12 months of their diagnosis, which tends to increase costs. However, they experience shorter hospital stays and fewer emergency admissions, day cases and outpatient visits within 12 months of diagnosis. These factors tend to reduce costs and are likely to explain the difference in cost reported in Table 3.

Table 4 Health-care services accessed by patients with lower and higher stage cancer

Phase of care costs

Figure 1 reports average monthly hospital costs in cohorts of patients surviving 12–13 months, 24–25 months, 36–37 months, 48–49 months and 60–61 months from diagnosis. Costs are close to zero before diagnosis with a progressive rise in the three months before and a stark increase in the month of diagnosis. The highest average monthly costs are observed in the months immediately following diagnosis (the ‘initial’ phase) and in the months immediately preceding death (the ‘terminal’ phase).

Figure 1 Patient average monthly costs, partitioned by survival

Prevalence costs

Table 5 reports 5-year cancer prevalence costs in 2010 for patients with a cancer diagnosis occurring up to 5 years before. We calculate costs separately for patients aged 18–64 and ≥65 and partition costs by phases of care (initial, continuum and terminal). We also compare costs in patients with cancer to costs in a similar population without cancer.

Table 5 Five-year prevalence costs in selected cancer sites, 2010

The highest 5-year prevalence costs are generated by colorectal cancer patients aged ≥65 (£459m), followed by breast cancer patients aged 18–64 (£426m), prostate cancer patients aged ≥65 (£290m) and lung cancer patients aged ≥65 (£267m). The comparison groups allow us to estimate the additional health-care cost that is due to the cancer condition, rather than to the other characteristics of patients with cancer, for example, their age. After subtracting the costs in the comparison group, prostate cancer is associated with the lowest prevalence costs both in patients aged 18–64 (£56m) and in patients aged ≥65 (£104m), suggesting that most of the costs are due to the age of these patients rather than to cancer. Colorectal cancer is still the most expensive in patients aged ≥65, although net costs after subtracting comparison group costs are noticeably lower (£329m), followed by lung cancer (£193m) and breast cancer (£134m). Breast cancer is the most expensive in patients aged 18–64 (£371m), followed by colorectal (£195m) and lung (£114m) cancer.

Differences in phase-specific costs are observed across the examined cancers. Initial, continuum and terminal phases cover a similar share of costs for colorectal cancer in patients aged ≥65. Initial phase costs absorb a large share of the total cost of care delivered to patients with colorectal cancer owing to high incidence (new cases diagnosed every year) and the high costs of the surgical intervention that follows the diagnosis, as displayed in Figure 1. Costs in the continuum phase absorb a greater proportion of prevalence costs relative to costs in the initial and terminal phases for prostate and breast cancer, because a larger proportion of these patients survive the initial phase and do not enter the terminal phase. Terminal costs contribute by far the largest share to lung cancer costs owing to poor survival and a large proportion of patients dying in the year of their diagnosis.


Discussion

This study expands the scope of existing population-based, patient-level data to the analysis of the costs of care accessed by patients with cancer in England. We combined the most granular cost information available from the NSRC with the NCDR-HES database, creating a new resource for the analysis of the cost of cancer similar to the well-established SEER-Medicare database in the USA. We processed millions of data records and reconstructed the patient care pathway retrospectively for each cancer patient in our sample. The new database has the potential to support a generation of new research, in a similar vein to SEER-Medicare, producing much needed evidence to achieve the efficient allocation of current and future health resources to the care of patients with cancer.

We used the new NCDR-HES-RC database to estimate incidence costs, phase-specific costs and prevalence costs for the main four cancers in England. We were able to compare costs in patients aged 18–64 and ≥65 years. Because of the lack of appropriate data, there is little evidence on the costs of care in the former age group both nationally and internationally. We examined costs by staging, before and after the cancer diagnosis, and in a comparison population of similar patients without cancer. We find evidence that the increment in the cost of care after a cancer diagnosis is markedly higher in patients aged 18–64 than in patients aged ≥65 across the four cancers examined. This is likely to be explained by the higher probability of receiving surgery for patients in the 18–64 age group. Health-care costs fall dramatically after the first year, and more markedly in patients aged 18–64, who consume fewer resources 3 years after diagnosis than patients aged ≥65. However, costs do not return to pre-diagnosis levels even 9 years after diagnosis in either age group. We also find evidence that a lower stage diagnosis (stages 1–2) is associated with markedly lower costs than a higher stage diagnosis (stages 3–4) in patients with colorectal and breast cancer, for whom sufficient data on staging were available. Although lower staging is associated with a higher prevalence of surgery, which may increase costs, it is also associated with shorter in-hospital stays and fewer emergency admissions and outpatient visits, which are likely to reduce costs. Our findings suggest that an earlier diagnosis can generate substantial savings for the health system, and even larger savings if achieved in patients aged 18–64. Younger patients are more likely to receive surgery and more likely to be offered chemotherapy, which might explain the broader scope for cost savings. Our evidence can be used to support existing health interventions aiming at improving the earlier diagnosis of cancer, such as urgent GP referrals for patients with suspected cancer (National Institute for Health and Care Excellence, 2015) and the colorectal and breast cancer screening programmes.

We identified the costs associated with the initial, the continuum and the terminal phase of the care pathway. We found evidence that the cost curve follows a ‘U’ shape distribution with high cost in the initial phase (first 6 months from diagnosis) and the terminal phase (last 12 months preceding death) and relatively low costs during the continuum phase similar to other studies (Riley et al, 1995; Brown et al, 1999, 2002; Yabroff et al, 2008b, 2011).

Finally, we calculate the additional costs of care due to cancer by comparing costs in the examined cancer cohorts with appropriate comparison groups of individuals without cancer. We separated the resources used by cancer patients because of their health condition from the resources used by the same patients because of their age and gender. This calculation provides a snapshot of the total costs to the health system of the care provided to cancer patients every year, excluding the costs that would be incurred had these people been cancer free. We estimate that hospital care for colorectal cancer cost the health system £542 million in 2010, breast cancer £504 million, lung cancer £307 million and prostate cancer £160 million. The total cost of the main four cancers to the health system amounts to £1.5 billion in 2010, namely 3.0% of the total cost of hospital care in England (£47.3 billion). Most of the existing studies do not separate the cost of cancer from the cost of providing care to the cancer population in the absence of cancer, making it difficult to assess the impact of the disease on the resources of the health system. Our evidence provides additional support to well-established evidence on the health outcomes of the population living with cancer and helps in making informed decisions on the financial scope of health interventions.

Study limitations

Our study presents a number of limitations due to the secondary data sources used in the analysis; most of these limitations are expected to diminish as the quality of the data collected in the HES, NCDR and NSRC improves over time and new data are added to existing sources. Firstly, our analysis does not include the costs of primary care and social care services, since data on utilisation and costs of these services are not available for the whole population of patients examined in this study. Other studies estimate that primary care and social care costs are a small proportion of total care cost in patients with cancer (Luengo-Fernandez et al, 2013; Nuffield Trust, 2014).

Secondly, the NCDR data used in our analysis do not report cancer staging for a large share of patients in our sample, reducing our ability to investigate the impact of staging on costs. We were able to use imputation techniques to estimate staging in colorectal and breast cancer, but we could not replicate this exercise in prostate and lung cancer because too little staging data was recorded. However, Cancer Registries in England are making noticeable progress towards the collection of complete staging information for all cancers, and the new release of NCDR data comes with more complete data on staging.

Finally, the quality of the cost information reported in the NSRC is variable across different hospitals and over time. We mitigate variation in data quality by using costs reported at a fixed point in time (2010), by excluding outliers, and by calculating weighted averages of the costs of similar services reported by different hospitals (details included in Supplementary Appendix 3). Although measurement error is reduced using these techniques, information on cost variation across hospitals and over time is lost. Following recommendations from the Department of Health, an increasing number of hospitals are adopting a more sophisticated system to collect cost information at the patient level; 50% of NHS hospital trusts used the new system at the time of our analysis. The diffusion of the new costing system will improve the quality of the NSRC data, allowing for more granular cost analyses to be performed in the future.

Future research

The NCDR-HES-RC database offers numerous opportunities for future research. Our analysis is limited to data on utilisation of care in 2006–2010, as these were the most recent years of data available at the time of our study. As more data become available, new research could assess the impact on the cost of care of the diffusion of new technologies, such as robotic radical prostatectomy. New studies could examine geographical variation in the cost of care and provide evidence on the impact of variation in medical practice and need of care. Finally, new research could assess the impact on costs of different pathways of care, such as the different routes that lead to a cancer diagnosis.

Improving the quality and the scope of the NCDR-HES-RC database will be crucial in fostering this new research.