Main

The intrinsic molecular subtypes have become central to breast cancer research (Perou et al, 2000; Sørlie et al, 2001). However, their successful translation into clinical diagnostic assays has not yet been achieved and remains a priority if patients are to benefit from the knowledge of the molecular heterogeneity of breast cancer. The current assessment of the histological and clinical characteristics of tumours fails to identify patients most appropriate for adjuvant systemic therapy. Although adjuvant therapy significantly improves breast cancer survival (EBCTCG, 1998; Berry et al, 2005), it is generally accepted that a substantial proportion of patients who are at low risk of relapse are nonetheless receiving adjuvant chemotherapy, hence experience the side effects of the treatment without deriving much benefit (EBCTCG, 1998; Berry et al, 2006). The translation of the intrinsic subtypes of breast cancer into clinical assays may enable us to stratify patients by their likelihood of benefiting from adjuvant treatment.

This problem is most serious amongst patients with oestrogen receptor (ER)+ disease because those with ER− disease are known to derive greater absolute benefit from adjuvant chemotherapy (EBCTCG, 1998; Berry et al, 2006). Indeed, ER has been proposed as a determinant of whether patients should receive chemotherapy (Henderson, 2010; Pritchard, 2011; Regierer et al, 2011). However, according to the largest meta-analysis, the proportional risk reduction for mortality is not significantly different by ER status (EBCTCG, 1998). By gene expression profiling, ER+ tumours are classified as luminal A or luminal B (Perou et al, 2000). Luminal B tumours are defined by the expression of higher levels of proliferation-related genes, including MKI67, than luminal A tumours (Perou et al, 2000). Although a proportion of luminal B tumours can be distinguished from luminal A tumours by detecting amplification of human epidermal growth-factor receptor 2 (HER2), the remainder are more difficult to identify. Ki67 expression by immunohistochemistry (IHC) has been used as a means of identifying HER2-negative luminal B tumours, successfully defining a subset of ER+ cases with poor outcome (Cheang et al, 2009). In this case, Ki67 was used as a surrogate tissue-based readout of proliferation in order to recapitulate the classification originally based on clustering of tumour transcriptomes (Perou et al, 2000; Cheang et al, 2009). That proliferation is a powerful prognostic factor in breast cancer is evidenced by its inclusion in the assessment of histological grade as mitotic count, which has recently been shown to be largely responsible for the prognostic value of tumour grade (Abdel-Fatah et al, 2010).
Moreover, the prognostic power of multigene predictors in breast cancer has been shown to be almost exclusively attributable to proliferation and cell-cycle-related genes and limited to ER+ breast cancer, because ER− cases are nearly always deemed high risk (Teschendorff et al, 2006; Desmedt et al, 2008; Wirapati et al, 2008).

Although MKI67 is invariably included amongst the proliferation genes of multigene predictors, there are also other cell-cycle-related genes that have received less attention (Paik et al, 2004; Paik, 2007). Assessment of Ki67 expression by IHC holds promise as a prognostic and predictive biomarker. However, reports have been conflicting and comparison between studies has been made difficult by varying methodologies and cut-points for positivity (Urruticoechea et al, 2005; Yerushalmi et al, 2010); indeed, guidelines have been produced in order to address these limitations (Dowsett et al, 2011). Although Ki67 is not generally used in the routine management of breast cancer, it has recently been recommended by the St Gallen consensus committee for discriminating between luminal A and luminal B tumours (Goldhirsch et al, 2011). Alternative IHC markers of proliferation have been proposed, among them proteins involved in cell-cycle control, such as cyclin E (CCNE1) (Keyomarsi et al, 2002), and those that carry out the function of DNA licensing for replication (Gonzalez et al, 2004, 2005). Both mini-chromosome maintenance protein 2 (MCM2) and geminin (GMNN), which license DNA for replication and inhibit re-replication of DNA, respectively, have been shown to carry prognostic value in breast cancer (Gonzalez et al, 2003, 2004). Assessment of a panel of cell-cycle-related proteins, including MCM2, has been proposed to differentiate actively cycling cells, those in-cycle but with arrested progression and those out of cycle, which may provide prognostic information in breast cancer (Loddo et al, 2009). Thus, in a manner analogous to gene signatures of proliferation, measuring multiple proliferation-related proteins has been hypothesised to carry greater prognostic information than relying on a single marker.

We compared the prognostic value of a panel of proliferation markers measured using IHC in a large cohort of tumours represented in tissue microarrays (TMAs). We selected MCM2, Ki67, aurora kinase A (AURKA), polo-like kinase 1 (PLK1), GMNN and phospho-histone H3 (PHH3) based on their differential expression in the phases of cell cycle (Loddo et al, 2009). Our aims were to establish the marker of greatest prognostic utility and to investigate whether a multi-marker assessment of proliferation offered additional prognostic value compared with a single marker.

Materials and methods

Study population

The prospective population-based study SEARCH (studies of epidemiology and risk factors in cancer heredity) was used for this work. This study primarily includes women <70 years with early breast cancer who are identified through the East Anglia Cancer Registry. Details of this study have been published previously (Lesueur et al, 2005). A total of 3093 patients were included. The characteristics of the study cohort are detailed in Table 1. Available data included breast cancer-specific mortality, clinical and treatment data. Previously generated data on the IHC markers ER, progesterone receptor (PR), HER2, cytokeratin 5/6 (CK5/6) and epidermal growth-factor receptor (EGFR) were also available (Blows et al, 2010). The SEARCH study is approved by the Cambridgeshire 4 Research Ethics Committee; all the study participants provided written informed consent.

Table 1 Characteristics of the SEARCH study cohort

Tissue microarrays, IHC and scoring

Each tumour was represented by a single 0.6-mm tissue core in a TMA constructed from paraffin-embedded tissue blocks guided by haematoxylin and eosin stained slides marked for invasive carcinoma, as previously described (Kononen et al, 1998). Tissue microarray sections of 3–4 μm thickness were dewaxed in xylene and rehydrated through graded alcohols. Immunohistochemistry was conducted using a BondMax Autoimmunostainer (Leica, Bucks, UK). Details of reagents and antigen retrieval conditions are summarised in Supplementary Table S1. Bound primary antibody was detected using a BOND polymer detection kit (Leica) and developed with 3-3′-diaminobenzidine. Stained slides were inspected for uniformity of staining or assay failure and those not considered interpretable were excluded from assessment. The Ariol platform (Genetix Limited, Hampshire, UK) was used to scan slides and the resulting images were used for scoring. Details of scoring systems for all markers are provided in Supplementary Table S1. Proliferation markers were scored according to the proportion of positive cells only, using an Allred proportion score (0=0%, 1=<1%, 2=1–10%, 3=11–33%, 4=34–66% and 5=>66%). For MCM2, Ki67, GMNN and PHH3, a cell was considered positive if there was any nuclear signal above background, whereas for AURKA and PLK1, any cell with nuclear or cytoplasmic signal above background was deemed positive (Figure 1).
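The proportion-score bands described above can be expressed as a simple lookup. The following Python sketch is illustrative only and is not the scoring software used in the study: the function name `allred_proportion_score` is hypothetical, and the treatment of percentages that fall between the published bands (for example 10.5%) is an assumption.

```python
def allred_proportion_score(percent_positive: float) -> int:
    """Map the percentage of positive tumour cells to an Allred proportion score (0-5).

    Bands follow the text: 0 = 0%, 1 = <1%, 2 = 1-10%, 3 = 11-33%,
    4 = 34-66%, 5 = >66%.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must lie between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive < 1:
        return 1
    if percent_positive <= 10:
        return 2
    # Assumption: values between the published bands (e.g. 10.5%) are
    # assigned to the higher band.
    if percent_positive <= 33:
        return 3
    if percent_positive <= 66:
        return 4
    return 5
```

For example, a core in which 40% of tumour cells show nuclear Ki67 signal would receive a score of 4 under this banding.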

Figure 1 Photomicrographs of representative immunostaining for all the proliferation markers.

Definition of molecular subtype

In order to investigate the relationship between proliferation markers and molecular subtype, a surrogate IHC-based classifier was used, as previously described (Blows et al, 2010). Molecular subtypes were defined as: luminal1a (ER+ or PR+, HER2−, CK5/6− and EGFR−), luminal1b (ER+ or PR+, HER2−, CK5/6+ or EGFR+), luminal2 (ER+ or PR+, HER2+), HER2 (ER− and PR− and HER2+), core basal phenotype (ER− and PR−, HER2−, CK5/6+ or EGFR+) and five-marker negative phenotype (ER−, PR−, HER2−, CK5/6− and EGFR−).
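The subtype definitions above amount to a short decision rule over five binary markers. The Python sketch below is one possible encoding of that rule, assuming all five marker results are available for a case; the function name `molecular_subtype` and the boolean inputs are hypothetical, and cases with missing marker data are not handled.

```python
def molecular_subtype(er: bool, pr: bool, her2: bool, ck56: bool, egfr: bool) -> str:
    """Assign the surrogate IHC-based subtype from five marker results.

    Each argument is True when the marker is scored positive.
    """
    hormone_receptor_positive = er or pr
    basal_marker_positive = ck56 or egfr

    if hormone_receptor_positive:
        if her2:
            return "luminal2"
        # HER2-negative luminal cases are split by basal marker expression.
        return "luminal1b" if basal_marker_positive else "luminal1a"

    # ER-negative and PR-negative cases.
    if her2:
        return "HER2"
    return "core basal" if basal_marker_positive else "five-marker negative"
```

Under this rule, a tumour that is ER+, PR−, HER2−, CK5/6− and EGFR− is classed as luminal1a, whereas the same profile with EGFR+ is classed as luminal1b.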

Statistical analyses

All the analyses were stratified according to the ER status in order to account for the fundamental differences between ER+ and ER− tumours (Pharoah and Caldas, 2010). This study complied with REMARK (reporting recommendations for tumour-marker prognostic studies) criteria (McShane et al, 2005). Correlations between ordinal variables were assessed using Spearman’s rank correlation coefficient. A log-rank test was used to compare survival between strata in Kaplan–Meier survival plots. Association with survival was assessed using a Cox proportional-hazards model with 10-year breast cancer-specific survival (BCSS) as outcome, providing a hazard ratio (HR) and 95% confidence interval (CI) for each variable. The date of study entry rather than date of diagnosis was used to determine time under observation (left-truncation) in order to adjust for unobserved events (Azzato et al, 2009). For the analysis of associations with clinical characteristics, all the proliferation markers were modelled as dichotomous and the significance of associations was tested by Pearson’s Chi-square test or Fisher’s exact test as appropriate. The cut-off for dichotomisation was informed by comparing strata against non-expressing cases in a Cox proportional-hazards model. For survival analyses, proliferation markers were modelled as continuous or dichotomous according to the relative fit of multivariate models adjusted for the standard prognostic factors, assessed using likelihood ratios. Standard log–log plots were used to explore compliance with the proportional-hazards assumption. Where markers violated the assumption, the Cox model was extended to include a coefficient for each time-dependent covariate, which varied as a function of log-time, indicating the direction and magnitude of change in relative risk with time. That is, the exponentiated coefficient will be >1 if risk increases with time and <1 if risk decreases with time.
The P-value of the time-varying coefficient was used to determine whether to model a covariate as time-dependent in different subgroups. The prognostic value of proliferation markers was directly compared by including all markers in a Cox model that was modified in a backward stepwise manner to identify the proliferation markers that carried prognostic value independent of each other. These markers were then included in a multivariate model with age (>55 years), lymph node status, grade, tumour size (<2, 2–4.9 and ≥5 cm), endocrine therapy, adjuvant chemotherapy, PR and HER2 status. Grade and tumour size were modelled as continuous variables. This model was modified in a backward stepwise manner until the most parsimonious fit was attained. In order to adjust for the inevitable selection bias associated with missing data in molecular pathology studies (Hoppin et al, 2002), we used multiple imputation (MI) to resolve missing values for all the variables included in multivariate models, including an outcome indicator variable in the model (Moons et al, 2006), generating 50 data sets as previously described (Ali et al, 2011). We have recently validated MI as a method of handling missing data in molecular pathology prognostic marker studies (Ali et al, 2011). Results for survival analyses conducted on imputed data represent a combination of analyses for each of the 50 data sets and are presented alongside the results from analyses excluding cases with missing data (complete case analysis) for comparison. All statistical analyses were conducted using Intercooled Stata version 11.1 (StataCorp., College Station, TX, USA).
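To illustrate how a coefficient that varies with log-time changes the relative risk, the Python sketch below evaluates the hazard ratio implied by a log-hazard of the form β·x + γ·x·ln(t). This parameterisation is an assumption about how the extended Cox model described above would be written; the function name `hazard_ratio_at` and the numerical values used are hypothetical and are not estimates from this study.

```python
import math


def hazard_ratio_at(t_years: float, beta: float, gamma: float) -> float:
    """Hazard ratio per unit of covariate at time t (in years).

    Assumes the log-hazard contribution of covariate x is
    beta * x + gamma * x * ln(t), so that HR(t) = exp(beta + gamma * ln(t)).
    """
    if t_years <= 0:
        raise ValueError("t_years must be positive")
    return math.exp(beta + gamma * math.log(t_years))


# Hypothetical illustration: a baseline hazard ratio of 2 at 1 year
# with a negative time-varying coefficient.
beta_example = math.log(2.0)
gamma_example = -0.5
hr_year_1 = hazard_ratio_at(1.0, beta_example, gamma_example)
hr_year_4 = hazard_ratio_at(4.0, beta_example, gamma_example)
```

With these hypothetical values the hazard ratio is 2.0 at 1 year and 1.0 at 4 years. A negative γ (exponentiated value <1) therefore corresponds to a risk that diminishes with follow-up, and a positive γ to one that grows.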

Results

The characteristics of the study cohort are summarised in Table 1. There were 465 deaths from breast cancer with 416 occurring within 10 years of diagnosis. Excluding cases with missing data, 75% of the cohort was ER+, 72% was PR+ and 12% was HER2+.

Correlations and associations of proliferation markers

All the proliferation markers were significantly correlated with each other and tumour grade in both ER+ and ER− disease (Table 2). In ER+ disease, GMNN was most strongly correlated with grade with a Spearman’s ρ of 0.31 (P<0.0001). In ER− disease, GMNN and Ki67 were most strongly correlated with grade, each with a Spearman’s ρ of 0.39 (P<0.0001). Correlation between proliferation markers was strongest for Ki67 and MCM2 (Spearman’s ρ=0.55; P<0.0001) in ER+ disease, whereas in ER− disease, it was Ki67 and GMNN that showed the strongest correlation (Spearman’s ρ=0.59; P<0.0001). These weak to moderate correlations between proteins, putatively tracking the same biological process, may be explained by the proportion of cell cycle during which each protein is expressed. The number of cases with higher Allred proportion scores was smaller for proteins expressed for a shorter period of cell cycle (Table 3). For example, MCM2, which is expressed for the longest period during cell cycle of any of the proteins (early and late G1, G2, S and M), was expressed by 13% of cases (after excluding those with missing data) in >66% of cells and 36% of cases in >10% of cells. In contrast, PHH3, which is expressed for the shortest period during cell cycle (M phase only), was expressed by 11% of cases in >10% of cells, with no cases expressing PHH3 in >66% of cells.

Table 2 Correlation between proliferation markers and grade in ER-positive and ER-negative disease
Table 3 Distribution of proliferation marker Allred proportion scores

Proliferation markers were associated with adverse clinical characteristics in ER+ disease. Both AURKA and GMNN were significantly associated with positive lymph node status (Table 4). Of the two, AURKA showed the stronger association with 46% of AURKA+ cases being lymph node positive compared with 35% of AURKA− cases (P<0.001). All the proliferation markers, except PLK1 and PHH3, were significantly associated with HER2 positivity. MCM2 showed the strongest association with 19% of MCM2+ cases being HER2+ compared with just 6% of MCM2− cases (P<0.001). In contrast, in ER− disease, the pattern of association was less clear with some indication of an association with favourable clinical characteristics (Supplementary Table S2). For example, only PLK1 was significantly associated with lymph node status in ER− cases. However, this association was with negative lymph node status with 65% of PLK1+ cases being lymph node negative compared with 49% of PLK1− cases (P=0.024). Similarly, both AURKA and PLK1 showed a negative association with HER2 positivity. In all, 81% of AURKA+ cases were HER2− compared with 72% of AURKA− cases (P=0.027) and for PLK1, 89% of positive cases were HER2− compared with 76% of negative cases (P=0.037). These findings lend weight to the idea that the clinical and biological significance of proliferation is different between ER+ and ER− tumours.

Table 4 Associations of proliferation markers with clinical characteristics in ER-positive disease

Proliferation markers predict poor outcome in ER+ disease only

Univariate survival analyses revealed an association between all the proliferation markers and poor outcome in ER+ but not in ER− cases (Table 5 and Supplementary Table S3). For ER+ cases, AURKA, GMNN, PHH3 and MCM2 were best modelled as continuous variables. Both MCM2 and GMNN showed a reduction in hazard with time in both complete and imputed data. Ki67 was the only proliferation marker significantly associated with survival in ER− disease, with an association of nominal significance when imputed data was analysed (HR 1.5; 95% CI 1.0–2.1; P=0.032) and a similar point estimate when cases with missing data were excluded (HR 1.3; 95% CI 0.88–1.8; P=0.195). However, in a model adjusted for tumour grade, Ki67 no longer showed an association with survival in imputed data of ER− cases (HR 1.3; 95% CI 0.88–1.9; P=0.200).

Table 5 Univariate analysis in ER-positive disease

Aurora kinase A and GMNN carried prognostic value independent of each other in ER+ disease. The prognostic value of proliferation markers was compared by multivariate analysis including only the proliferation markers as covariates. Both AURKA and GMNN retained independent prognostic significance in the analyses of complete and imputed data (Table 6, Model 1). This finding supports the hypothesis that different markers of proliferation carry distinct prognostic information by better reflecting the phases of cell cycle (Gonzalez et al, 2004, 2005; Williams and Stoeber, 2007; Loddo et al, 2009). Although MCM2 was also retained in the multivariate model of complete data, this association was not recapitulated when the imputed data was analysed. Ki67 did not provide prognostic information independent of all the other proliferation markers.

Table 6 Multivariate analysis of proliferation markers in ER-positive disease indicating independent prognostic value of AURKA (bold)

Aurora kinase A carried prognostic information independent of major clinical and molecular characteristics on multivariate analysis of ER+ disease (Table 6, Model 2). There were 88 deaths from breast cancer in the multivariate model of complete data. The increase in relative risk of event was 40% and 30% for complete and imputed data, respectively (adjusted 10-year BCSS by AURKA score: 0=93%, 1=90%, 2=88%, 3=58% and 4=insufficient sample size). Because AURKA was best modelled as a continuous variable, these data represent the increase per increment of Allred score (Figure 2). Multivariate analysis of the same model with AURKA replaced by Ki67 showed that Ki67 also retained independent prognostic significance in imputed data (HR 1.4; 95% CI 1.0–1.9; P=0.053) with the same point estimate for complete data (HR 1.4; 95% CI 0.97–2.1; P=0.070). However, for the same model, Ki67 was no longer associated with survival in the presence of AURKA either in complete (HR 1.1; 95% CI 0.66–1.7; P=0.828) or imputed (HR 1.2; 95% CI 0.84–1.6; P=0.346) data. In contrast, in this model including Ki67, AURKA retained independent prognostic value in both complete (HR 1.3; 95% CI 1.0–1.6; P=0.017) and imputed data (HR 1.2; 95% CI 1.0–1.4; P=0.023), confirming that AURKA outperforms Ki67 as a prognostic marker in ER+ breast cancer. Although AURKA expression was correlated with tumour grade, the relationship with luminal molecular subtypes was less pronounced (Figure 3). This implies that, in addition to CK5/6, EGFR and HER2, AURKA could be used to refine the distinction between luminal subtypes.
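Because the hazard ratio is expressed per increment of Allred score, the relative risk between two scores compounds multiplicatively under a log-linear model. The Python sketch below shows this arithmetic using the rounded per-increment increases quoted above (40% and 30%). It is an illustration of how a per-increment hazard ratio is read, not a result reported by the study, and the function name `hazard_ratio_between_scores` is hypothetical.

```python
def hazard_ratio_between_scores(hr_per_increment: float, score_a: int, score_b: int) -> float:
    """Hazard ratio of score_b relative to score_a for a covariate modelled as continuous.

    Under a log-linear Cox model each one-point increment multiplies the
    hazard by hr_per_increment, so the ratio is hr_per_increment ** (b - a).
    """
    return hr_per_increment ** (score_b - score_a)


# Rounded per-increment hazard ratios taken from the text (illustrative use only).
complete_case_3_vs_0 = hazard_ratio_between_scores(1.4, 0, 3)
imputed_3_vs_0 = hazard_ratio_between_scores(1.3, 0, 3)
```

On these rounded figures, a score of 3 relative to a score of 0 would correspond to a hazard ratio of about 2.7 in complete data and about 2.2 in imputed data, assuming the log-linear relationship holds across the whole score range.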

Figure 2 Kaplan–Meier survival plot of AURKA scores in ER+ disease. AURKA expression is shown as Allred proportion scores (0–4, because there were no ER+ cases with a score of 5) (log-rank P<0.0001).

Figure 3 Bar charts illustrating the relationship between AURKA and (A) grade and (B) molecular subtype in ER-positive disease.

Discussion

Proliferation has emerged as a robust prognostic factor in ER+ breast cancer (Desmedt et al, 2008; Stuart-Harris et al, 2008; Wirapati et al, 2008). Although mitotic count contributes to tumour grade, additional measures of proliferation have been shown to add prognostic value independent of grade (Aleskandarany et al, 2011). Of these, Ki67 labelling by IHC has been most widely investigated (Urruticoechea et al, 2005; Cheang et al, 2009; Colozza et al, 2010; Yerushalmi et al, 2010). However, other promising proliferation-related proteins have received less attention as potential prognostic markers (Gonzalez et al, 2003, 2004, 2005; Loddo et al, 2009). We have compared the prognostic utility of a panel of proliferation-related proteins, including Ki67, in a large cohort of primary invasive breast tumours. We confirm that proliferation markers are significantly associated with survival in ER+ disease only and find that AURKA carries the greatest prognostic value outperforming Ki67 and serving as an independent prognostic factor in ER+ breast cancer.

This study has some limitations. First, our conclusions require validation in an independent cohort, even though we have employed a large study cohort (3093 cases), lending statistical robustness to our findings, which are also in keeping with previous reports (Nadler et al, 2008; Loddo et al, 2009). Second, we have used TMAs to represent tumours. Although excellent concordance between TMAs and full-face sections has been reported (Callagy et al, 2003; Ruiz et al, 2006), further evaluation of AURKA as a clinical assay would require use of full-face sections. Finally, we have not assessed the predictive value of AURKA in this observational study, as this would be best addressed in the context of a randomised clinical trial. However, our data support the ability of AURKA to predict absolute benefit of adjuvant systemic therapy, highlighting the potential clinical utility of AURKA.

Prognostic classifiers based on the assessment of tens of genes have followed seminal studies of breast tumour transcriptomes (Perou et al, 2000; Sørlie et al, 2001; Paik et al, 2004; Teschendorff et al, 2006). The prognostic power of these classifiers has been shown to heavily rely on proliferation-related genes (Desmedt et al, 2008; Wirapati et al, 2008). These classifiers utilise several correlated genes to produce a readout of proliferation. Similarly, a panel of IHC proliferation markers has been proposed to show greater prognostic significance than a single marker (Gonzalez et al, 2005; Williams and Stoeber, 2007, 2012; Loddo et al, 2009). The basis of this additional value has been argued to relate to the integration of DNA-licensing markers and markers of actively cycling cells in order to gauge the ‘rate’ of proliferation in a given tumour (Gonzalez et al, 2005; Williams and Stoeber, 2007, 2012). The analysis of a panel of cell-cycle-related proteins can identify distinct cell-cycle phenotypes both at the level of single cells (Endl et al, 2001; Shetty et al, 2005) and cell populations (Loddo et al, 2009). Indeed, DNA-licensing factors, particularly MCMs, have been shown to be powerful predictors of clinical outcome in several solid tumours including prostate, lung, kidney, breast and ovary (Meng et al, 2001; Ramnath et al, 2001; Gonzalez et al, 2003, 2004; Dudderidge et al, 2005; Kulkarni et al, 2007). 
Three cell-cycle phenotypes can be identified by integrating markers of cell-cycle progression. They approximate to (1) an ‘out-of-cycle’ or differentiated state defined by the lack of expression of cell-cycle proteins, including MCMs, and that may express markers of ‘differentiated’ cells, including inhibitors of cyclin-dependent kinases such as p27; (2) a ‘G1-delayed/arrested’ or growth-arrested state defined by the expression of an MCM, hence DNA is ‘licensed’ for replication, but lacking expression of mitotic kinases including PLK1 and AURKA or other markers of actively proliferating cells including Ki67 or GMNN; and (3) an ‘accelerated cell cycle’ or actively proliferating state defined by the expression of both MCMs and proteins expressed after the cell-cycle restriction point including AURKA, GMNN, PLK1, PHH3 and Ki67 (Endl et al, 2001; Dudderidge et al, 2005; Williams and Stoeber, 2007; Loddo et al, 2009). A scheme for determining cell-cycle phenotype in this way holds particular promise as a predictive biomarker by identifying tumours sensitive to cell-cycle phase-specific chemotherapeutic agents (Williams and Stoeber, 2007, 2012; Loddo et al, 2009). We addressed the hypothesis that multi-parameter estimates of cell-cycle phenotype would outperform single-marker assays as predictors of outcome by including markers expressed differentially during cell cycle in a multivariate analysis. We found that GMNN and AURKA indeed provided independent prognostic information. However, subsequent analysis in a model adjusted for the standard clinical variables showed that only AURKA retained independent prognostic value. This may arise as a result of our assessing protein expression as a proportion of a population of cancer cells separately for each cell-cycle marker and subsequently comparing these scores in a multivariate model. This cell-population approach may not identify the proposed cell phenotypes, particularly growth-arrested cells, with adequate sensitivity.
A multiplexed single-cell assay, which determines the proportion of cells in each of the three phenotypes per tumour, may overcome this limitation, especially if combined with the sophisticated methods of automated image analysis (Camp et al, 2002; Williams and Stoeber, 2012).

Aurora kinase A is among the proliferation genes that contribute to the 21-gene recurrence score (Paik et al, 2004). Aurora kinase A is required for proper centrosome function and for mitotic spindle assembly (Lens et al, 2010). As a protein that functions specifically during mitosis, AURKA also represents an attractive drug target and several AURKA inhibitors are under development (Keen and Taylor, 2004; Lens et al, 2010). The basis of the superior prognostic performance of AURKA compared with the other proliferation markers is not clear and is likely to relate to many variables including biological function, assay differences and ease of interpretation. Aurora kinase A was one of the proliferation markers best modelled as a continuous variable. This is consistent with the idea that luminal tumours form a continuum according to the expression levels of proliferation-related genes and that their division into two subgroups is somewhat arbitrary (Desmedt et al, 2008; Wirapati et al, 2008; Colombo et al, 2011). In this respect, AURKA labelling by IHC could be used as a means of better reflecting this diversity in clinical practice. Moreover, the prognostic utility of AURKA may be increased by including it in a combined index with B-cell lymphoma protein 2, just as we have recently shown for Ki67 (Ali et al, 2012). In addition, AURKA gene expression has recently been used as a prototypical proliferation marker in a three-gene classifier for the molecular subtyping of breast cancer shown to be more statistically robust than other methods (Haibe-Kains et al, 2012).

In summary, we have conducted a large head-to-head comparison of the prognostic value of a panel of proliferation markers in primary breast cancer. We have used IHC and a scoring system used routinely in clinical practice to show that the prognostic significance of proliferation is limited to ER+ disease and that AURKA outperforms other proliferation markers including Ki67. Aurora kinase A defines five subgroups in ER+ breast cancer and carries independent prognostic significance in multivariate analysis. Our findings show that Ki67 may not be the optimal IHC marker of proliferation and warrant further studies addressing the predictive value of AURKA.