Main

Breast cancer, which is the most common cancer in women (Parkin et al, 2001; Kamangar et al, 2006), is a complex disease characterised by multiple molecular alterations. The current clinical management of breast cancer relies on the availability of robust clinicopathological and individual molecular prognostic and predictive factors to support decision making. However, the varied behaviour and response to therapy within clinically and morphologically similar classes indicate that the traditional prognostic factors currently available are insufficient to reflect the genetic heterogeneity of breast tumours. Recent advances in high-throughput molecular technologies have further demonstrated this biological heterogeneity of breast cancer.

A seminal study by Perou et al (2000) identified four distinct molecular breast cancer groups based on gene expression profiles: luminal epithelial/oestrogen receptor (ER) positive, c-erb-B2 (HER2) positive, basal-like and normal breast-like. A subsequent study extended this by dividing the luminal/ER-positive group into three subtypes: luminal A, B, and C (Sorlie et al, 2001), but the existence of the luminal C group remains uncertain (Sorlie et al, 2003). Sotiriou et al (2003) demonstrated six similar groups, with two basal-like subgroups and no normal breast-like group. While numerous studies have reported these and other novel molecular subtypes, and assigned a prognostic significance to these identified classes (West et al, 2001; van't Veer et al, 2002; Calza et al, 2006), they remain varied in their detailed classification (Hu et al, 2006). In addition, issues regarding the potential clinical utility of gene expression profiling include sample processing, data interpretation and analysis, reproducibility, validation, feasibility, and cost (Ein-Dor et al, 2006; Pusztai et al, 2006; Simon, 2006). Existing studies have also not addressed the stability of the proposed classifications across different case sets, the biological value of the different genes involved in the cluster designation and the proportion of cases that cannot be classified into any of the core molecular classes. These issues appear to be of critical relevance considering the need to identify the molecular features of individual tumours in routine practice.

An alternative approach is to use established robust laboratory technology, such as immunohistochemistry on formalin-fixed paraffin embedded (FFPE) patient tumour samples utilising a set of proteins with a well-defined biological and clinical relevance in breast cancer. Previously, we have applied a 25-protein biomarker panel with known relevance to breast cancer to large numbers of cases using FFPE tissue microarrays and unsupervised clustering analysis, and have confirmed the existence and clinical significance of distinct breast cancer classes (Abd El-Rehim et al, 2005). In further analysis, we used a consensus methodology between three alternative clustering techniques, Hierarchical (Anderberg, 1973), K-means (Al-Daoud and Roberts, 1996), and Adaptive Resonance Theory (ART) (Carpenter and Grossberg, 1987), followed by the computation of several validity indices to independently verify the number of output clusters and address the issue of clustering stability and classification uncertainty. By examining the concordance between the clustering techniques, we determined a core set of six breast cancer classes (Soria et al, 2010). Concordance between clusters, assessed by conventional statistical techniques (principal component projections and boxplots) and two automated methods (an artificial neural network and a rule-extraction approach), was used to characterise these classes. This served to confirm that key biological classes of breast cancer can be identified using an immunohistochemical panel of biomarkers and demonstrated that luminal and basal classes of breast cancer are heterogeneous and contain distinct subclasses. Of importance was the observation that only 60% of breast cancer cases clearly exhibited core class membership criteria, while the remaining 40% of cases were not assigned to a class. However, clinical adoption of such a classification system using 25 proteins determined by immunohistochemistry would be impractical due to cost and time constraints.

Further analysis was therefore required to minimise the number of markers needed to classify patients into these distinct breast cancer classes but retain the sophistication of the classification, maintain the clinical heterogeneity of classes, and reduce the unclassified tumours to a lower level while retaining usefulness for clinical decision making.

In this study, we have therefore sought to reduce the number of biomarkers necessary to classify breast cancers using immunohistochemistry and to examine the association of these classes with various clinical and pathological factors and patient outcome.

Patients and methods

Patients and laboratory methods

A series of 1073 patients from the Nottingham Tenovus Primary Breast Carcinoma Series presenting with primary operable (stages I, II and III) invasive breast cancer between 1986 and 1998 was used. Immunohistochemical reactivity for 25 proteins with known relevance in breast cancer, including those used in routine clinical practice, was determined using standard immunohistochemical techniques on tumour samples prepared as FFPE tissue microarrays as previously described (Abd El-Rehim et al, 2005). Levels of immunohistochemical reactivity were determined by microscopical analysis using the modified H-score (values between 0 and 300), giving a semi-quantitative assessment of both the intensity of staining and the percentage of positive cells as previously described (Abd El-Rehim et al, 2005). For c-erb-B2, the American Society of Clinical Oncology/College of American Pathologists Guideline Recommendations for Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer were used for assessment (Wolff et al, 2007). Equivocal (2+) cases were confirmed by chromogenic in situ hybridisation as previously described (Garcia-Caballero et al, 2010).
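As an illustration of how an intensity-weighted score in the range 0 to 300 arises, the following minimal sketch computes a conventional H-score, assuming the usual definition (1 x % weakly stained cells + 2 x % moderately stained cells + 3 x % strongly stained cells). The function name and argument layout are hypothetical, and the modified H-score used in this study may differ in detail (see Abd El-Rehim et al, 2005).

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """Conventional H-score (assumed definition, not taken from this study).

    Each argument is the percentage of tumour cells staining at that
    intensity; unstained cells contribute nothing. Result lies in 0-300.
    """
    percentages = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    # Weight each percentage by its staining intensity (1, 2 or 3).
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong
```

For example, a tumour with 10% weak, 20% moderate and 30% strong staining would score 140 under this definition.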

The Nottingham Series is a well-characterised consecutive assembly of patients who were treated according to standard clinical protocols. Of the available cases, 708 (66%) patients were aged 50 years or more. At the time of diagnosis, 160 (14.9%) tumours were histological grade 1, 343 (31.9%) were grade 2, and 572 (53.2%) grade 3 (Table 1). A total of 736 (68.4%) had a tumour size of more than 1.5 cm. A total of 654 (60.8%) patients had lymph node-negative disease and 419 (38.9%) had positive lymph nodes (332 cases with between one and three positive nodes, 87 cases with four or more positive nodes). Frequencies for histological tumour types were: 649 invasive ductal carcinomas of no special type, 171 tubular and tubular mixed carcinomas, 30 medullary carcinomas, 112 lobular carcinomas, 11 mucinous carcinomas, 37 mixed histological type, 3 papillary type carcinomas and 4 miscellaneous tumours. Patient management was based on tumour characteristics using the Nottingham Prognostic Index (NPI) (Galea et al, 1992) and hormone receptor status. Patients with an NPI score of <3.4 received no adjuvant therapy; those with an NPI score >3.4 received hormone therapy if ER positive, or classical cyclophosphamide, methotrexate and 5-fluorouracil if ER negative and fit enough to tolerate chemotherapy. Hormonal therapy was given to 420 patients (39%) and chemotherapy to 264 (24.5%). This study was approved by the Nottingham Research Ethics Committee 2 under the title ‘Development of a molecular genetic classification of breast cancer’. The Reporting Recommendations for Tumour Marker Prognostic Studies (REMARK) criteria, recommended by McShane et al (2005), were followed.
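The management rule described above can be sketched as follows. The NPI formula used here is the standard published one (0.2 x tumour size in cm + lymph node stage 1 to 3 + histological grade 1 to 3) and is not restated in this paper; the function names are hypothetical. Two points are assumptions of the sketch rather than statements from the text: a score of exactly 3.4 is placed in the no-adjuvant-therapy group, and ER-negative patients not fit for chemotherapy are returned as "unspecified" because the text does not say how they were managed.

```python
def npi(size_cm, node_stage, grade):
    """Nottingham Prognostic Index (standard formula, assumed here)."""
    if node_stage not in (1, 2, 3) or grade not in (1, 2, 3):
        raise ValueError("node_stage and grade must each be 1, 2 or 3")
    return 0.2 * size_cm + node_stage + grade


def adjuvant_therapy(npi_score, er_positive, fit_for_chemo=True):
    """Map NPI and ER status to the adjuvant treatment described in the text."""
    if npi_score <= 3.4:  # assumption: exactly 3.4 is treated as low risk
        return "none"
    if er_positive:
        return "hormone therapy"
    # ER negative: classical CMF only if fit enough to tolerate chemotherapy.
    return "CMF chemotherapy" if fit_for_chemo else "unspecified"
```

Under these assumptions, a 2.5 cm, grade 3 tumour with one to three positive nodes (stage 2) has an NPI of 5.5 and would be allocated hormone therapy if ER positive.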

Table 1 Breast cancer biological class and clinicopathological parameters (Note that P-values derived using Cramer’s V (Friendly, 2000).)

Data relating to survival were collated in a prospective manner for those patients presenting after 1989 only, and included survival time, defined as the interval (in months) from the date of the primary treatment to the time of death. Both short-term (5 years) and long-term (up to 20 years) patient outcomes were investigated with respect to the biological classes in all patients for whom outcome data were available. Follow-up data were available for 974 patients, with overall survival ranging from 4 to 224 months (median 123 months, mean 118 months). During this period, a total of 317 (29.5%) patients developed distant metastases while 346 patients died, 263 of them from breast cancer. Patient age ranged from 18 to 72 years (median 54 years).
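Follow-up data of this kind, in which some patients have not died by their last follow-up, are typically summarised with the Kaplan–Meier product-limit estimator. The sketch below is a minimal illustration under stated assumptions, not the software used in this study: the function name and input layout are hypothetical, times are in months, and an event flag of False marks a censored observation. At tied times, censored patients are counted as still at risk when the deaths occur, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  -- follow-up time for each patient (e.g. months)
    events -- True if the patient died at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Calling `kaplan_meier([5, 10, 10, 20], [True, False, True, True])` gives survival estimates of 0.75, 0.5 and 0 at 5, 10 and 20 months, respectively.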

Classifying methods

Using the expression of 25 proteins, determined by immunohistochemistry, breast tumours were classified as previously described (Soria et al, 2010). Using the combination of boxplots for the whole data and for the singular classes and following clinical judgment, the 25 markers on which the breast cancer classes were derived were reduced to 14 by dismissing those biomarkers that had ‘no role’ in the class definition (i.e., had an identical overall distribution even when considered in any specific class). Then, using supervised classification approaches based on the naïve Bayes classification performance (Soria et al, 2008), the number of markers was further reduced by an exhaustive search down to the minimum number of biomarkers required to retain the previous classification. This formed the basis of the development of a fuzzy rule induction algorithm using the methodology previously described in Rasmani et al (2009). The ultimate goal of this process was to create a single algorithmic process to classify a breast tumour into one of the six clinical classes, while reducing the number of unclassified patients to a minimum. Technical class characterisation in terms of marker distribution was performed as previously described (Soria et al, 2010). The rule induction algorithm was subsequently validated on the expression of the 10 biomarkers determined by IHC in an additional 238 unselected cases of primary breast cancer from the Nottingham Tenovus primary breast cancer series.
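The exhaustive search step can be illustrated with the following minimal sketch, which looks for the smallest subset of markers with which a naive Bayes classifier still reproduces a given set of class labels. Everything in it is an assumption made for illustration: the function names are hypothetical, a Gaussian naive Bayes model is used, and accuracy is measured by resubstitution on the training data, whereas the procedure actually used is described in Soria et al (2008) and may differ on all of these points.

```python
from itertools import combinations
from math import log, pi
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels, features):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = []
        for f in features:
            values = [r[f] for r in members]
            # A small floor keeps the variance strictly positive.
            stats.append((mean(values), pvariance(values) + 1e-6))
        model[cls] = (len(members) / len(rows), stats)
    return model


def predict(model, row, features):
    """Return the class with the highest log posterior for one case."""
    def log_posterior(cls):
        prior, stats = model[cls]
        total = log(prior)
        for f, (mu, var) in zip(features, stats):
            total += -0.5 * log(2 * pi * var) - (row[f] - mu) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def smallest_sufficient_subset(rows, labels, n_features, min_accuracy):
    """Try every subset, smallest first; return (features, accuracy) or None."""
    for size in range(1, n_features + 1):
        best = None
        for features in combinations(range(n_features), size):
            model = fit_gaussian_nb(rows, labels, features)
            hits = sum(predict(model, r, features) == y
                       for r, y in zip(rows, labels))
            accuracy = hits / len(rows)
            if best is None or accuracy > best[1]:
                best = (features, accuracy)
        if best[1] >= min_accuracy:
            return best
    return None
```

The number of subsets grows combinatorially with the number of markers, which is one reason why a preliminary reduction (here, from 25 to 14 markers) makes an exhaustive search more tractable.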

Statistical analysis

The association between breast cancer classes and both histopathological and clinical characteristics, which were not involved in the development of the classes, was assessed using Cramer’s V (Friendly, 2000) to produce P-values. Breast cancer-specific survival and disease-free survival (DFS) were compared between classes using Kaplan–Meier curves. The non-parametric Kruskal–Wallis test was used to test for differences in NPI across classes.
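As a minimal sketch of the association measure, the code below computes the chi-squared statistic and Cramer's V for a contingency table of class by clinicopathological category. The function name and the list-of-rows table layout are hypothetical. The sketch assumes that every row and column total is non-zero, and it stops at the statistic: the corresponding P-value would come from the chi-squared distribution with (rows − 1) x (columns − 1) degrees of freedom, which is not computed here.

```python
from math import sqrt


def cramers_v(table):
    """Return (chi-squared statistic, Cramer's V) for a table of counts.

    table -- list of rows, each a list of observed counts
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return chi2, sqrt(chi2 / (n * k))
```

V ranges from 0 (no association) to 1 (perfect association), so it can be compared across tables of different size.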

Results

Breast cancer classification

Using an exhaustive search of the best combination of a reduced set of 14 biomarkers, the minimum number required for classification of breast tumours into the six classes was reduced from 25 biomarkers to 10. These biomarkers were ER, progesterone receptor, cytokeratin (CK) 5/6, cytokeratin 7/8, EGFR, HER2, c-erbB3 (HER3), c-erbB4 (HER4), p53, and Mucin 1. Using a fuzzy rule induction-derived algorithm, a total of 997 out of 1073 (93%) breast tumours were subsequently assigned to one of the six classes and contained 370, 146, 123, 126, 87, and 145 patients, respectively. The remaining patients (n=76, 7%) were not assigned to any class.

Having removed unclassified tumours, we compared the class assignments of the 619 remaining tumours obtained using 25 and 10 biomarkers. There was good agreement between the two classifications (Table 2, kappa index=0.79, Kendall’s tau correlation coefficient=0.92). Shifts in classification primarily occurred within the main (luminal and basal) groups.
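Agreement between two classifications of the same tumours can be quantified as in the following minimal sketch, which assumes that the kappa index reported here is Cohen's kappa (observed agreement corrected for the agreement expected by chance). The function name and the toy labels in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label lists describing the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of cases given the same class by both classifications.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from the marginal class frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

For example, `cohen_kappa(["L", "L", "B", "B"], ["L", "L", "B", "L"])` has an observed agreement of 0.75 and a chance agreement of 0.5, giving a kappa of 0.5.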

Table 2 Comparison of the distribution of breast tumours by classification using 25 and 10 biomarkers

The classification approach was validated using an additional set of 238 patients for which IHC data for the ten biomarkers was available. Results showed that the breast cancer classes were populated with similar distribution to our initial findings (68, 61, 36, 23, 24, and 25 patients, respectively) and presented clear characteristics of the six classes.

Class characterisation

Figure 1 provides a visualisation of the separation of the classes and the association with the biomarkers. Table 3 summarises the performance of the principal component analysis, in which the first two components accounted for 44% of the variance. Figure 1A shows the biplot obtained for all patients, in which those not assigned to any class (unassigned) have been coloured grey. It can be seen that these fall mainly into the centre region of the biplot. Figure 1B shows the biplot obtained for only those patients assigned to a class (i.e., excluding the unassigned cases). The first axis was mainly determined, on the left, by luminal markers including luminal cytokeratin (CK7/8), hormone receptors (ER and PgR), and MUC1 overexpression and, on the right, by basal cytokeratin (CK5/6) and partly by p53 overexpression. The second axis is determined, on the bottom, by HER2, HER3, and HER4 overexpression.

Figure 1

Biplots of classes projected on the first and second principal component axes: (A) for all patients and (B) for only patients assigned to a class.

Table 3 Performance of principal component analysis

The core molecular classes identified in this study included three luminal classes characterised by high luminal CK7/8 and hormone receptor (ER and PgR) expression. Luminal A and luminal B tumours showed high expression of CK7/8, ER, HER3 and HER4 but were separated by relatively lower levels of PgR expression in luminal B compared with luminal A tumours. In contrast, luminal N tumours showed differential expression of HER3 and HER4. There were two basal classes of tumour, characterised by low luminal cytokeratin and high basal cytokeratin (CK5/6) expression along with a triple-negative phenotype (i.e., ER, PgR and HER2 negative). They were, however, separated by p53 protein expression levels, with tumours expressing either high p53 (basal–p53 altered) or low p53 (basal–p53 normal). The remaining class (HER2) was characterised by high luminal cytokeratin and HER2 overexpression but showed heterogeneity in the expression of hormone receptors. We therefore split these tumours into two subclasses: those expressing ER (HER2+/ER+) and those that showed an ER− phenotype (HER2+/ER−). The unassigned tumours showed heterogeneous expression of all ten markers. A summary of the breast cancer classes and relative biomarker expression is shown in Figure 2A and B.

Figure 2

Breast cancer biological classes. (A) Classification and proportions of cases. (B) Representative immunohistochemical profiles of the biological classes of breast cancer.

Clinical characterisation of patients by class

Significant associations were found between the identified breast cancer classes and tumour grade, size, lymph node stage, and histological tumour type, and were in line with expectations (Table 1). The basal and HER2 classes were significantly associated with larger tumour size, higher grade, higher stage, invasive ductal carcinoma of no special type and the poorer NPI prognostic groups. The luminal classes were significantly associated with tubular, lobular, and mixed-type tumours, with luminal N tumours tending to be of lobular and mixed type compared with luminal A and B tumours, which were ductal of no special type.

Short- and long-term patient outcome was compared by tumour class; each class was significantly associated with distinct breast cancer-specific survival (BCSS) rates (Figure 3A and B, Table 4) and DFS (Figure 3C and D, Table 5). The highest frequency of mortality due to breast cancer in the first 5 years was seen in patients whose tumours belonged to the HER2 classes, with the HER2+/ER− class being the worst. A lower, but still high, frequency was seen in patients with tumours from the luminal B and basal classes. Luminal A and N had the lowest frequency of breast cancer-specific death. Over a period of 20 years, luminal B and HER2+/ER+ tumours showed the worst BCSS (Figure 3B). A similar pattern of outcome in the biological classes was observed for DFS as for BCSS, although the basal–p53 normal class had a longer disease-free interval compared with the basal–p53 altered tumours during the first five years.

Figure 3

Breast cancer biological classes in relation to (A) 5-year and (B) 20-year breast cancer-specific survival, and (C) 5-year and (D) 20-year disease-free survival.

Table 4 Kaplan–Meier P-values between breast cancer classes and BCSS
Table 5 Kaplan–Meier P-values between breast cancer classes and DFS

A boxplot of the NPI split by class is shown in Figure 4, illustrating significant differences in NPI score between biological classes (overall Kruskal–Wallis P<0.01). Table 6 summarises P-values for the class-by-class comparisons. It can be seen that the NPI for luminal N tumours is lower than that of the other luminal classes (luminal A and B). The HER2+/ER− class of tumours has a higher NPI score than the other classes. This is an interesting observation for two reasons. First, it confirms that the NPI is providing discriminant information between the biological classes. Second, it suggests that the class divisions are providing additional information to the NPI.
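The overall comparison of NPI across classes rests on the Kruskal–Wallis H statistic, which can be sketched as follows. This is a minimal illustration with a hypothetical function name, not the software used in this study. It uses mid-ranks with the standard correction for ties, assumes that the pooled values are not all identical, and returns only H; the P-value would come from the chi-squared distribution with (number of groups − 1) degrees of freedom.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, tie-corrected, for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1..j and share their mean rank.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))
```

For three fully separated groups such as `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, the rank sums are 6, 15 and 24 and H is 7.2.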

Figure 4

Boxplots of NPI by biological class.

Table 6 P-values of Kruskal–Wallis analysis between breast cancer classes and Nottingham Prognostic Index

Discussion

We have identified core classes of breast cancer using a reduced panel of 10 protein biomarkers determined by immunohistochemistry which, we believe, are clinically meaningful and clinically well-characterised. Three of these classes have not been previously identified and, while their precise prognostic and therapeutic relevance is not yet clear, their elucidation serves as a basis for ongoing investigations to address these important factors. Of course, different clustering techniques can and will result in different clusters. It is for precisely this reason that we have previously used concordance between multiple methods to establish the core classes: the identification of stable clusters through multiple methods forms a basis of methodological validation (Soria et al, 2010). A further study is underway to refine these classes using a more sophisticated fuzzy rule induction algorithm (Soria et al, 2013).

Also of importance is the observation that 93% of breast cancer cases clearly exhibit core class membership criteria, whereas only 7% remain unclassified. Indeed, when core classes were derived in Soria et al (2010) the number of unclassified patients was 413, whereas after applying Rasmani et al’s (2009) approach this number went down to 76. Some of these unclassified cases result from the stringent criteria used to derive class membership but it is important to recognise that other cases do exhibit characteristics of more than one class. Biologically, this observation may be explained by the complex heterogeneity of the molecular portrait of breast cancer. These mixed class cases clearly merit recognition and investigation to determine their optimal treatment strategies. However, the well-defined biological and clinical relevance of the markers used, together with the association with clinicopathological variables and outcome, indicates the biological and clinical relevance of the current molecular classification.

The core molecular classes identified in this study are similar to those determined by gene expression profiling, but we have been able to refine the definition of the luminal and basal tumours into further distinct classes with different clinical outcome. Phenotypic classification into core luminal, basal, and HER2 classes is possible using smaller panels of three to five antibodies (Nielsen et al, 2004; Carey et al, 2006; Cheang et al, 2008; Blows et al, 2010) but such limited panels cannot further subclassify these core groups. Our study clearly demonstrates that, using a larger panel of 10 biomarkers, a higher level of stratification is achieved, which may have direct and important clinical relevance.

The biological characterisation of the luminal A tumours is consistent with our previously identified group 1 (Abd El-Rehim et al, 2005) and Ambrogi's group 1 (Ambrogi et al, 2006), but the class is more distinctly defined. These tumours show high homology to the luminal-A-type tumours, as identified in gene array studies, which are also characterised by high gene expression of luminal differentiation, ER signalling markers and those involved with EGF (epidermal growth factor) signalling (Sorlie et al, 2006). Luminal B tumours were also characterised by higher levels of EGFR, HER3, and HER4 but, in contrast, showed relatively lower levels of PgR (levels of ER were similar). This class was not identified by our previous study (Abd El-Rehim et al, 2005), but is similar to group 2 in the Ferrara series (Ambrogi et al, 2006). This class of tumour shows homology to the luminal B group of tumours, defined by low-to-moderate expression of luminal-specific genes, including the ER cluster (Sorlie et al, 2006).

Whereas gene expression profiling has determined two luminal tumour classes, we have divided those tumours with a luminal phenotype into a further class (luminal N). This novel class of tumours, while having high levels of ER and PgR, has negative/low expression of the HER family, particularly HER3 and HER4. Interestingly, the luminal A and luminal N tumours were similarly associated with good prognostic factors, including smaller tumour size, grade 1 tumours, node-negative disease and tubular mixed carcinomas. In contrast, the third luminal class, luminal B, although phenotypically similar to the luminal A tumours (except for PgR expression), consisted of those tumours with poorer prognostic factors such as larger tumour size, higher stage, and grade. It is also apparent that HER3 and HER4 are important discriminators in our breast cancer classification, although there remains controversy as to their prognostic significance (Witton et al, 2003; Abd El-Rehim et al, 2004).

We previously identified a basal-like subtype using protein expression (group 5 (Abd El-Rehim et al, 2005) and Ambrogi’s group 3 (Ambrogi et al, 2006)). However, consistent with other studies that showed that the basal-like subtype is heterogeneous (Laakso et al, 2006), we have now determined two basal-like classes. These classes were characterised by high expression of basal cytokeratin (CK5/6), low expression of luminal cytokeratin and a triple-negative phenotype. They were, however, separated by p53 protein expression level, which is similar to our observation in triple-negative breast cancer (Biganzoli et al, 2011). High frequency of tumour suppressor p53 mutations and protein expression have previously been detected in the basal-like subtype (Ellis et al, 1999; Sorlie et al, 2001; Pollack et al, 2002). The association between medullary carcinomas, p53 and basal tumours has been previously demonstrated (de Cremoux et al, 1999; Sorlie et al, 2001).

Those tumours with high HER2 expression were clustered into one class, which is homologous to Sorlie's HER2 group (Sorlie et al, 2001) and Ambrogi’s group 4 (Ambrogi et al, 2006). This class has the worst overall survival. However, due to the heterogeneity of HER2 overexpression in both ER-positive and -negative tumours, we manually split this class into two classes: those with (HER2+/ER+) and those without (HER2+/ER−) ER expression.

In conclusion, we have previously applied different clustering techniques with validity indices, and used cluster consensus to derive a classification that is robust across different multivariate procedures. This has highlighted the dangers of relying on a single clustering technique, as has often been the case in other studies, particularly in the interpretation and management of high-throughput assay results. As a consequence, we have now further refined the classification of breast cancer based on a panel of 10 proteins assessed by immunohistochemistry to identify distinct biological classes of breast cancer. Using 10 biomarkers has produced a more realistic distribution of breast cancer patients into the three main classes of luminal, basal and HER2, which were established in Sorlie et al (2001) and confirmed by subsequent papers (Abd El-Rehim et al, 2004). The fuzzy approach used to derive the classification with 10 markers has produced a breast cancer classification consistent with the proportion of cancer subtypes reported in other studies, while the previous classification with 25 markers assigned, for example, only 7% of patients to the HER2+ group. We have shown that, in addition to the luminal A, luminal B and HER2 classes identified using gene expression analysis, there are three further biologically and clinically relevant classes of breast cancer, namely the luminal N, basal–p53 altered, and basal–p53 normal classes. Furthermore, we have confirmed the complex biological heterogeneity of breast cancer through the identification of cases exhibiting characteristics of mixed class. Studies are underway to further validate these classes and to enable the creation of a clinically usable algorithm for prospective classification, taking into account current therapeutic strategies.