Breast cancer is the most common cancer in the UK, with approximately 54 000 new cases diagnosed each year (Cancer Research UK, 2016). International comparisons have shown that breast cancer survival in England is lower than in comparable, industrialised countries (Møller et al, 2010; Coleman et al, 2011). The survival deficit is particularly manifest in the first months after diagnosis, consistent with delayed diagnosis and the presence of a subset of patients with very advanced and rapidly fatal disease (Møller et al, 2010).

Black women with breast cancer have a particularly high risk of death. In California, Black and Hispanic women with breast cancer had a higher risk of death compared with non-Hispanic White women (Boyer-Chammard et al, 1999). A recent study from the USA showed that Black women with breast cancer had a more advanced stage distribution and lower survival than White patients (DeSantis et al, 2016). In South East England, Black Caribbean and Black African patients had higher breast cancer-specific mortality than White patients (Jack et al, 2009). Data from a prognostic study of outcomes in sporadic and hereditary breast cancer in the UK showed that young Black women with breast cancer had poorer outcomes, compared with White and Asian women (Copson et al, 2014). It is not yet clear if these variations are due to socio-economic, biological, or other factors. Some studies have suggested that social, personal and biological factors may each contribute to a part of the excess mortality in Black women with breast cancer (Jack et al, 2013; Iqbal et al, 2015; Warner et al, 2015).

Breast tumours may express receptors, of which the three most important are the oestrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor 2 receptor (HER2). These biological markers are considered when determining the most suitable treatment, alongside patient age, morphology, tumour size, tumour grade, lymph node status and lymphovascular invasion (Lakhani, 2012; Dawson et al, 2013). Survival outcomes were better in ER-positive and PR-positive patients (Fisher et al, 1988). HER2 is part of the epidermal growth factor receptor (EGFR) family and is over-expressed in 18–20% of breast cancers (Onitilo et al, 2009; Parise and Caggiano, 2014). There is a significant association between HER2 over-expression and poor prognosis, with decreased disease-free survival and overall survival in node-positive patients (Piccart et al, 2001). Some studies characterised breast cancers on the basis of combinations of ER, PR and HER2 and showed wide variation in survival, from good prognosis in ER+, PR+ subtypes to low survival in the ER−, PR−, HER2− subtype (Onitilo et al, 2009; Parise and Caggiano, 2014). Breast cancers that are negative for all of ER, PR and HER2 (triple-negative cancers) tend to exhibit a more aggressive pattern of disease, and a proportion show greater resistance to conventional systemic chemotherapy (Dent et al, 2007; Foulkes et al, 2010).

Traditionally, tumour characteristics were poorly recorded in population-based cancer registries. In recent years, a high priority was given to the recording of cancer stage and other clinical data items by the National Cancer Registration and Analysis Service, who register all cancers diagnosed in England (The National Cancer Registration Service, 2016). The present paper describes the availability of data on important biological factors in the recently introduced English cancer registration data set. It provides an analysis of the survival of breast cancer patients from Black, Asian and White ethnic groups, with consideration of underlying differences in socio-economic factors, tumour stage and tumour biology.

Data and methods

All breast cancer registrations for women in England diagnosed during 2012–2013 were extracted from the National Cancer Registration and Analysis Service. There were 87 538 records. We excluded 25 records with unknown vital status and 661 records that were based entirely on information from a death certificate, leaving 86 852 records for the analysis. Of these remaining cases, 106 had a record of diagnosis on the same date as the date of death. In order to retain these records in the survival analysis we added one day to their survival time.
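The record counts and the same-day adjustment described above can be checked with a short sketch. The function name `survival_days` and the date-based representation are illustrative assumptions, not the processing actually used on the registry data.

```python
from datetime import date

# Record counts reported in the text.
EXTRACTED = 87_538
UNKNOWN_VITAL_STATUS = 25
DEATH_CERTIFICATE_ONLY = 661

analysis_records = EXTRACTED - UNKNOWN_VITAL_STATUS - DEATH_CERTIFICATE_ONLY


def survival_days(diagnosis: date, end: date) -> int:
    """Days from diagnosis to death or end of follow-up (hypothetical helper).

    A record with diagnosis and death on the same date would have a
    survival time of zero; it is given one day so that it is retained.
    """
    days = (end - diagnosis).days
    if days < 0:
        raise ValueError("end date precedes diagnosis date")
    return days if days > 0 else 1


print(analysis_records)  # 86852
```

Only the same-day records are shifted in this sketch; all other survival times are left unchanged.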

We assigned women to broad ethnicity groupings: White, Black, Asian, and Other and unknown, the latter including mixed groups. The information on ethnicity is based on the self-reported ethnicity that persons provide when admitted to a hospital. The Black group was subsequently divided into the subgroups Black Caribbean, Black African and Other Black.

ER was assessed according to national guidelines using immunohistochemical assessment with mandatory participation in a national quality assurance scheme. ER is reported semi-quantitatively with recording of both the proportion and intensity of nuclear cell reactivity; most histopathology laboratories categorise ER according to Allred score, with a tumour scoring 3 or more defined as ER positive. PR assessment is not mandatory, but laboratories that do perform PR assays typically report according to the Allred scoring system with the same cut-off to define PR positivity. HER2 is assessed according to national guidelines and algorithm, with immunohistochemistry as the first-line technique and in situ hybridisation as the second-line technique. Scores of 0 or 1+ are considered HER2 negative and scores of 3+ HER2 positive. Cases with borderline membrane reactivity on immunohistochemistry were categorised according to the ratio of the number of copies of the HER2 gene to a chromosome 17 centromeric probe. Cases with a ratio of 2.00 or more were regarded as HER2 positive.
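The classification rules in this paragraph can be written as a minimal sketch. The function names, the use of `None` for unrecorded values, and the label `"2+"` for borderline immunohistochemistry are assumptions made for illustration; the text itself only refers to "borderline membrane reactivity".

```python
from typing import Optional


def allred_positive(allred_score: Optional[int]) -> Optional[bool]:
    """ER or PR status from an Allred score; None when no score is recorded."""
    if allred_score is None:
        return None
    return allred_score >= 3  # cut-off stated in the text


def her2_positive(ihc_score: Optional[str],
                  ish_ratio: Optional[float] = None) -> Optional[bool]:
    """HER2 status from immunohistochemistry, with in situ hybridisation
    as the second-line test for borderline cases."""
    if ihc_score is None:
        return None
    if ihc_score in ("0", "1+"):
        return False
    if ihc_score == "3+":
        return True
    if ihc_score == "2+":  # assumed label for the borderline category
        if ish_ratio is None:
            return None  # borderline and never resolved: status unknown
        return ish_ratio >= 2.00  # HER2 gene copies / chromosome 17 probe
    raise ValueError(f"unrecognised IHC score: {ihc_score!r}")
```

Returning `None` rather than a negative result keeps unresolved cases distinguishable from true negatives.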

Covariates in the survival analysis were age (tabulated in 10-year groups; for age adjustment a second-order polynomial of age was used in the regression models in order to accommodate a non-linear association between age and mortality), socio-economic status (SES) (quintiles based on the income domain of IMD 2010 (Department for Communities and Local Government, 2011)), geographical area of residence (nine Government Office Regions in England), Charlson comorbidity score (based on diagnoses in inpatient and day-case hospital discharge episodes in the 3-year period prior to breast cancer diagnosis, and excluding cancer itself as a comorbidity) (Charlson et al, 1987; Quan et al, 2005), breast cancer morphology (ductal, lobular, other and unspecified), tumour-nodes-metastasis (TNM) stage, grade, and ER, PR and HER2 receptor status of the breast cancer. The information on ER, PR and HER2 receptor status was combined to define a group of triple-negative tumours where all three receptors were known and negative.
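The derived triple-negative indicator can be sketched as follows. The representation of receptor status as `True`, `False` or `None` (not recorded) and the function name are illustrative assumptions.

```python
from typing import Optional


def triple_negative(er: Optional[bool],
                    pr: Optional[bool],
                    her2: Optional[bool]) -> bool:
    """True only when all three receptors are known and negative.

    A missing value (None) for any receptor places the case in the
    comparison group, even if the tumour is in truth triple negative.
    """
    return er is False and pr is False and her2 is False


print(triple_negative(False, False, False))  # True
print(triple_negative(False, None, False))   # False
```

Because a case counts as triple negative only when all three results are recorded, incomplete receptor data can only reduce the size of the derived group.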

The 2-year cumulative risk of death from any cause was estimated with the Kaplan-Meier method, and univariate and multivariate Cox proportional hazards regressions were used to estimate hazard ratios (HR) and their 95% confidence intervals (95% CI). The follow-up ended on 31 December 2014. The total number of deaths was 8761 and breast cancer was recorded as the underlying cause of death in 70% of the cases with an available cause of death.
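For readers unfamiliar with the Kaplan-Meier method, the following is a minimal sketch of the product-limit estimator in plain Python. The function name and the toy data are hypothetical, and the sketch is not the software used for the analysis.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float],
                 events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    events[i] is True for a death and False for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: five patients, three deaths, two censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

The cumulative risk of death at a given time is one minus the survival probability at the latest death time on or before it.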

We initially analysed the entire breast cancer cohort, with sequential adjustment for age, socio-economic status, region, comorbidity, stage, grade, ER, PR and HER2 receptor status and morphology. We evaluated the assumption of proportional hazards using Schoenfeld residuals, and we explored the internal consistency and sensitivity using stratified analyses by period of follow-up, broad age group, subgroups of Black women, broad geography and stage. We finally did a focused analysis of the subpopulation where the mortality HR was particularly high for Black women compared with White women: premenopausal women in London who had stage IV breast cancer.


Results

Table 1 shows the distributions of age, socio-economic status, geography of residence, comorbidity score, morphology, stage, grade and receptor status.

Table 1 Distribution of covariates in survival analysis of 86 852 breast cancer cases in England 2012–2013 with follow-up to 2014

Data on key biological prognostic factors were available in most cases, that is, stage (83%), grade (94%), ER (77%) and HER2 (74%), but availability was lower for PR status (41%). The missing information was not a random subset, and cases with missing data on stage, grade or HER2 had high HRs. Figure 1 demonstrates that cases with missing data on tumour grade had very high mortality in the short term after diagnosis.

Figure 1

Kaplan-Meier failure estimates of death from any cause in breast cancer patients, in relation to tumour grade.

Each of the covariates in Table 1 was associated with mortality among breast cancer patients in univariate analysis. Mutual adjustment attenuated the estimated effects of age, socio-economic status, comorbidity, morphology, stage, grade, and oestrogen and progesterone receptor status. The mutual adjustment changed the association between region of residence and mortality. In the adjusted analysis, London residents had lower mortality than residents in other regions; this was mainly the result of adjustment for socio-economic status and stage. The effect of the derived triple-negative characteristic was robust to statistical adjustment and changed very little in the adjusted model.

The variables in Table 1 were subsequently used in sequential statistical adjustment in the analyses of ethnicity.

Table 2 shows the principal analysis of the broad ethnic groups. In the age-adjusted analysis, Black breast cancer patients had higher mortality than White patients (HR: 1.77; 95% CI: 1.48–2.13). Asian and Other and unknown groups of women had mortality rates very similar to White women (0.98 [0.81–1.18] and 0.99 [0.95–1.04], respectively).

Table 2 Survival analysis of 86 852 breast cancer cases in England 2012–2013 with follow-up to 2014

The excess mortality of Black women was much reduced by adjustment for socio-economic status, geography and comorbidity (1.52 [1.26–1.84]). Adjustment for stage, grade, receptor status and morphology separately showed that stage provided the largest further attenuation of the effect (1.35 [1.12–1.63]), but each of the other tumour characteristics had its own independent effect on the Black vs White difference. In the fully adjusted model the excess mortality of Black women was reduced to 1.24 (1.03–1.50). From these estimates, the excess mortality in Black women was attributable in sequence to recorded social and person-level characteristics (socio-economic status and comorbidity) (32% [23–45%]), then to recorded tumour stage (22% [14–34%]) and then to recorded biological characteristics (grade, morphology, receptors) (14% [8–25%]), leaving 31% (22–43%) unexplained.
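The text does not state the formula behind these percentages. One reading that reproduces the reported point estimates is to express each successive drop in the HR as a share of the age-adjusted excess (HR − 1). The sketch below works through that arithmetic; the function name is hypothetical, and the confidence intervals are not reproduced.

```python
from typing import List


def attribute_excess(hr_steps: List[float]) -> List[float]:
    """Share of the baseline excess hazard (HR - 1) removed at each
    adjustment step, followed by the share that remains unexplained."""
    baseline_excess = hr_steps[0] - 1.0
    shares = [(prev - cur) / baseline_excess
              for prev, cur in zip(hr_steps, hr_steps[1:])]
    shares.append((hr_steps[-1] - 1.0) / baseline_excess)
    return shares


# HRs reported in the text: age-adjusted, then adjusted for social and
# personal factors, then additionally for stage, then fully adjusted.
hrs = [1.77, 1.52, 1.35, 1.24]
labels = ["social/personal", "stage", "biology", "unexplained"]
shares = attribute_excess(hrs)
for label, share in zip(labels, shares):
    print(f"{label}: {share:.0%}")
# social/personal: 32%
# stage: 22%
# biology: 14%
# unexplained: 31%
```

The rounded shares sum to 99% rather than 100% because each one is rounded separately.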

Table 3 shows the HRs of Black women vs White women in different stratified analyses. Stratification of the follow-up period into three 1-year intervals did not materially change the estimate. The excess mortality of Black women was much stronger in young women (0–49 years) than in middle-aged and older women (50+ years), and the effect in the younger group was less sensitive (more resilient) to statistical adjustment for person and tumour characteristics. The effect was different in subgroups of Black women with the highest excess consistently in the Other Black subgroup and the lowest excess in the Black Caribbean subgroup. The excess mortality of Black women was highest and least sensitive to adjustments in the London population. Finally, analysis stratified by stage showed that the excess mortality of Black breast cancer patients was strongest and least sensitive to adjustments among women with stage IV cancer. The age-adjusted HR within stage IV was 1.51 (1.15–1.97) and the fully adjusted HR was 1.47 (1.11–1.93).

Table 3 Survival analysis of Black and White breast cancer cases in England 2012–2013 with follow-up to 2014

The subsequent analysis focused on the population of young breast cancer patients in London who were diagnosed with stage IV breast cancer. This is the subgroup where the difference between Black and White women is largest. Figure 2 shows the Kaplan-Meier analysis of the young, stage IV breast cancer patients in London, comparing White patients (74 patients; 11 deaths) and Black patients (30 patients; 17 deaths). The difference was highly statistically significant (log-rank test, P<0.001). There was no indication of a difference between Black Caribbean, Black African and Other Black groups (P=0.89). Comparing Black and White women, the age-adjusted HR was 5.21 (2.42–11.22) (data not shown). The estimate was not sensitive to further statistical adjustment for socio-economic status, grade, receptor status or morphology, but adjustment for co-morbidity reduced the estimate to 3.70 (1.62–8.44). A higher proportion of the Black patients had missing data for size of the primary tumour (73 vs 62%) and for nodal status (80 vs 58%). Among women with non-missing values, Black women had larger median tumour size (37 mm vs 28 mm) and a larger number of positive nodes (3 vs 2).

Figure 2

Kaplan-Meier failure estimates of death from any cause in Black and White breast cancer patients, 0–49 years of age, resident in London, with stage IV cancer.


Discussion

This is a very large study of current mortality outcomes in unselected breast cancer patients in a large, national setting. The modernisation of cancer registration in England has led to a much improved data set with high completeness of collection of several important prognostic factors. The present analysis confirms known or expected associations of mortality with age at diagnosis, socio-economic status, co-morbidity, morphology, stage, grade and receptor status. The study shows higher mortality in Black breast cancer patients, compared with White patients, and demonstrates independent contributions to this excess from personal and social factors, from tumour stage, and from biological characteristics of the cancer.

The analyses and findings presented here have been possible because data on prognostic factors (especially stage, grade and tumour receptor status) are now available for breast cancer cases in the English national cancer registration data set at a level that permits detailed, adjusted analyses. The known excess mortality of Black women with breast cancer has contributions from an adverse case-mix distribution, including socio-economic factors, stage distribution and tumour biology.

Population-based cancer registration aims to register all cases of cancer in the geographically defined population, regardless of route to diagnosis, basis of diagnosis, stage of disease or receipt of specialised oncology care. Compared with groups of cancer patients accrued from a histopathology case series or oncology case series, the population-based register will inevitably include a sub-set of cancer patients who were diagnosed only shortly before death and patients who for other reasons received no active cancer care. It is, therefore, to be expected that the population-based data set will be only partially complete for data items that require diagnostic procedures and pathology. Registered cases with no information on, for example, stage, grade or hormone receptor status indicate situations where clinical or pathological investigations were not possible or were considered to be of little relevance to the care of the patient. The missing data on these items are, therefore, likely to be selective. This is evident from the Kaplan-Meier analysis for tumour grade (Figure 1) and was also observed for patients with missing data on stage and hormone receptor status. When the missing data are selective we consider it misleading to impute the missing values, and we decided to represent the missing data as a separate category of each variable (Galati and Seaton, 2016).
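Treating missing data as a separate category amounts to giving "missing" its own indicator column when a categorical covariate is coded for regression. The following is a minimal sketch; the function name, the `"missing"` label and the toy grade values are illustrative assumptions.

```python
from typing import Dict, List, Optional

MISSING = "missing"


def indicator_columns(values: List[Optional[str]],
                      reference: str) -> Dict[str, List[int]]:
    """Dummy-code a categorical covariate, keeping missing values as
    their own level instead of imputing or dropping them."""
    recoded = [MISSING if v is None else v for v in values]
    levels = sorted(set(recoded) - {reference})
    return {level: [int(v == level) for v in recoded] for level in levels}


# Toy example: tumour grade with two unrecorded values, grade 1 as reference.
grades = ["1", "2", None, "3", None]
columns = indicator_columns(grades, reference="1")
# columns["missing"] is [0, 0, 1, 0, 1]
```

With this coding, the regression model estimates a separate HR for the missing level relative to the reference category.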

The mutual statistical adjustment in the Cox regression model greatly attenuated the high HRs for missing stage (the HR of 6.89 was reduced to 3.64), missing grade (7.12 to 2.08) and missing HER2 receptor (2.16 to 1.11). This indicates a degree of correlation between these characteristics in a sub-set of advanced and rapidly fatal cases where staging and histopathology analysis were perhaps not relevant to the care of the patient.

Breast cancer cases with missing data on ER and PR status had HRs that were intermediate between the receptor-negative and the receptor-positive subgroups, but we note that the proportion of cases with no information on PR status was high (59%). The importance of ER status as a prognostic factor has been established firmly since the 1970s (Fisher et al, 1988), but the role of PR status has been less certain. The prevailing attitude in the UK has been that PR status was less important than ER and HER2 status, and the 2009 NICE guidelines advised against routine assessment of PR status (National Institute for Health and Care Excellence, 2009). This may be the reason for the current poor recording of PR status. The present results should reinforce the view that PR status may be a relevant indicator of mortality outcomes, independently of other prognostic factors, including ER and HER2 status.

We were particularly interested in the category of ‘triple-negative’ cancers (Dent et al, 2007; Foulkes et al, 2010) and we attempted to derive this entity from the recorded ER, PR and HER2 data. We identified a sub-set of 5% of the total cohort of breast cancer cases who were known to be negative for each of the three receptors, and found that these had an HR of 2.00 compared with the remaining 95% of patients. However, because some receptor data were missing, the 5% is less than the expected value of 11–17% (Dent et al, 2007; Foulkes et al, 2010), and the estimated HR is probably lower than the true value because of the resulting misclassification.

The prognosis of Black women with breast cancer is known to be worse than in White women, most likely for a variety of reasons, such as socio-economic deprivation, lower uptake of mammographic screening, more advanced stage and a more adverse biological case-mix (Jack et al, 2009, 2014). By means of sequential adjusted regression models, we attempted to establish the origins of the difference in prognosis between Black and White patients, and to attribute the excess mortality in Black cancer patients to factors at different levels, ranging from the social and personal, through stage of disease, to the biological characteristics of the cancer. This sequence was chosen because the social and personal characteristics are at least in principle modifiable, and could be subject to intervention such as facilitating the awareness of symptoms of cancer or the uptake of mammographic screening. Stage of the cancer is intermediate because this is a mixture of effects from the social and personal level (e.g., through early diagnosis) and the biological level (more aggressive biology giving rise to a more advanced stage distribution). The factors at the biological level are most likely not amenable to intervention. We found that the excess mortality of Black breast cancer patients had contributions from all three levels (social/personal; stage; biology).

Our final analysis focused on the sub-set of the data where the Black/White difference was strongest (i.e., age group 0–49 years; London residents; stage IV cancer). This reduced the analysis population from 86 852 to 104. We had hoped that analysis of this niche group would reveal something about the underlying cause of the ethnic difference, but we were not able to attribute any of this marked difference to the available characteristics. Most plausibly, a larger sub-set of the Black patients is diagnosed very late with the disease in a potentially untreatable, incurable state.

The factors investigated in this study partially explain the substantial variation in breast cancer survival by ethnicity; however, 31% could not be explained by the available variables. This may in part be due to imperfect classification of the available covariates, or other factors (including treatment) that may influence the variation in breast cancer survival, and which were not investigated. Potentially important is the use of treatment, as it has been shown that ethnicity can be associated with delivery of treatment (Fedewa et al, 2010; Sail et al, 2012; Reeder-Hayes et al, 2013; Silber et al, 2013). One American study found that African American breast cancer patients had a higher risk of both a 60-day and 90-day delay of chemotherapy following surgery (Fedewa et al, 2010), which may influence variation in short-term survival. In addition, a number of other factors, for example genetics (Pal et al, 2015), family history, comorbidities, alcohol, smoking and education (Tannenbaum et al, 2013; Wu et al, 2013; Shariff-Marco et al, 2014) have been found to be associated with breast cancer survival, and it is not yet understood how the factors interact. Previous work in South East England showed that Black African women were less likely to receive surgery, chemotherapy and hormone therapy than White women (Jack et al, 2009). While this may contribute to the differences in survival, it would also point to possible worrying inequity in treatment uptake between ethnic groups. It remains an important limitation of this analysis that detailed treatment information is not yet available in the analysis data set. Work is in progress to include radiotherapy, systemic therapy and surgery information in the data set.

The principal strength and relevance of this analysis is the use of national, population-based data on breast cancer in England. The recency of the present data is both a strength and a limitation. Data and outcomes of patients diagnosed during 2012–2013 are likely to be relevant to the clinical management of breast cancer today, but the duration of follow-up for mortality is short and this analysis addresses short-term survival only. The ultimate outcome in breast cancer care is long-term cure and good quality of life. In defence of our short-term outcomes analysis, we consider, firstly, that short-term survival is a necessary condition for long-term survival and quality of life, and, secondly, that international comparisons have made it clear that short-term survival is a particular concern in breast cancer care in England (Møller et al, 2010).