Introduction

Genomic instability is thought to be the underlying mechanism by which cells acquire genetic alterations eventually leading to cancer. Especially in breast cancer, the evidence has in recent years been mounting that the acquisition of genetic aberrations detectable by modern cytogenetic techniques does not happen in an anarchic fashion. Rather, multiple, partially parallel pathways seem to exist, each of them associated with a specific cellular morphology and/or tumor architecture. This has led to the proposal of a morphology-based, cytogenetic progression model from normal to in situ and invasive breast cancer (Buerger et al, 1999b; Roylance et al, 1999; Vos et al, 2000). In this model, the loss of 16q seems to be the most important step associated with tumor proliferation rate, independent of the rate of cytogenetic instability (Buerger et al, 2000b, 2001). Nevertheless, the expected differences in protein expression patterns associated with the cytogenetic events in these pathways remained unclear. In addition, studies on carcinogenesis pathways have to take into account features of presumed precursor cells. Our own studies pointed to changing cytokeratin expression patterns within the physiologic maturation and differentiation of the normal female breast (Boecker et al, 2002). The aim of this study was therefore to relate patterns of cytogenetic changes reflected by comparative genomic hybridization (CGH) with protein expression patterns assessed on tissue microarrays (TMAs) in relation to the different precursor cells in breast carcinogenesis.

Results

The average value (duplicate experiments) of the immunohistochemical scoring of the arrays was used for the final analysis. Thirteen carcinomas showed expression of cytokeratin 5/6 (Ck 5/6), two of these with coexpression of Ck 8/18. A total of 31% and 32% were negative for estrogen receptor (ER) and progesterone receptor (PR), respectively. Eight tumors were strongly positive for p53. A total of 13.6% revealed a moderate or strong staining intensity for epidermal growth factor receptor (EGFR). Ten percent were strongly positive for c-erbB2. In all but one of these cases, chromogenic in situ hybridization (CISH) analysis revealed an amplification of c-erbB2 (>10 signals/nucleus).

CGH results of the tumors investigated in this study have been partly published before (Buerger et al, 1999a, 2001). On average, 7.9 alterations per case were detected. In short, the most frequent alterations were chromosomal gains involving 1q (60%), 3q (25%), 5p (19%), 6q (12%), 7p (13%), 8q (49%), 11q (13%), 17q (19%), and 20q (21%) and losses of 3p (10%), 6q (16%), 8p (32%), 9p (13%), 11q (28%), 13q (21%), 14q (11%), 15q (11%), 16q (53%), 17p (31%), and 18q (12%). A significantly higher average number of genetic alterations was detected in tumors with Ck 5/6 expression than in Ck 5/6-negative tumors (14.0 vs 7.3; p < 0.01, Student's t test, two-tailed). The rate of 16q losses was also significantly lower in tumors with a pure Ck 5/6 expression.

The protein expression profiles clearly differed in Ck 5/6+/Ck 8/18−, Ck 5/6+/Ck 8/18+, and Ck 5/6−/Ck 8/18+ tumors concerning ER, PR, Ki-67, p27, p21, bcl-2, p53, cyclin A, c-erbB2, and EGFR (Fig. 1). Tumors with (co)expression of Ck 5/6 demonstrated a clearly higher frequency of cyclin A, Ki-67, p53, and EGFR expression, whereas the loss of Ck 5/6 expression was associated with an increasing frequency of tumors expressing p27, p21, ER, PR, c-erbB2, and bcl-2 (Fig. 2). For cyclin D1 and cyclin E, no difference could be detected between tumors with and without Ck 5/6 expression.

Figure 1

Protein expression patterns of invasive breast cancer cases in relation to a previously postulated model of physiologic breast differentiation (Boecker et al, 2002). According to different subsets of cells within the normal breast lobule, which might function as potential precursor cells of breast cancer (upper row), all tumors were subgrouped by their expression of cytokeratin 5/6 (Ck 5/6) and cytokeratin 8/18 (Ck 8/18). Clear differences could be demonstrated for the differential expression of c-erbB2, epidermal growth factor receptor (EGFR), and cyclin A. The average number of genetic alterations per case is indicated.

Figure 2

The multitude of immunohistochemical markers used revealed a clear differential expression in Ck 5/6+/Ck 8/18−, Ck 5/6+/Ck 8/18+, and Ck 5/6−/Ck 8/18+ breast cancers. The results in Ck 5/6+/Ck 8/18+ tumors are divergent for some markers. This is probably a result of the low number of tumors displaying this feature. a, For ER, PR, p21, bcl-2, and 16q losses, the frequency of “positive findings” is indicated. b, In the case of Ki-67, p27, and p53, the average of all scores is shown.

Biomathematical Cluster Analysis

Based on the immunohistochemical markers, three clusters could be demonstrated. One was characterized by c-erbB2 overexpression and c-erbB2 amplification, whereas the two others were defined by the expression of “basal” Ck 5/6 and low-molecular Ck 8/18, respectively (Fig. 3). This was independent of the algorithm and/or the number of markers used (data not shown).

Figure 3

Cluster tree of all immunohistochemical markers used. Biomathematical cluster analysis demonstrates the existence of Ck 5/6, Ck 8/18, and c-erbB2 clusters (in red frames). SMA = smooth muscle actin.

Combining immunohistochemical staining (IHC) and CGH results showed the formation of multiple clusters (Fig. 4). One cluster contained almost all Ck 5/6 and p53 (3+) overexpressing tumors, whereas another, separate cluster included all c-erbB2 amplified carcinomas. p53 (3+)-overexpressing carcinomas were never associated with a high-level amplification of c-erbB2. In these two clusters, the rate of 16q losses was significantly lower than in the other clusters (p < 0.05, Student's t test, two-tailed).

Figure 4

Cluster tree of all invasive breast cancer cases integrating comparative genomic hybridization (CGH) analysis and immunohistochemical staining findings. Ck 5/6-expressing tumors are indicated in orange, c-erbB2-amplified tumors in green, and p53 strong-positive tumors in blue. Coexpression of Ck 5/6 and p53 (3+) is a rather common finding, whereas strong coexpression of c-erbB2 with either Ck 5/6 or p53 is rare or not detectable. One major cluster arm is predominantly composed of breast cancer cases expressing Ck 5/6. Two Ck 5/6-expressing tumors have been clustered elsewhere. One of these cases (c2528) revealed a coexpression of Ck 8/18.

Discussion

Intense debates about the current concepts of breast carcinogenesis are dominated by two points of view. In contrast with the idea of multiple, genetic, and morphologically parallel pathways (Buerger et al, 1999b; Holland et al, 1994; Mommers et al, 1999; van Diest, 1999), the concept of a stepwise genetic, morphologic and radiologic tumor progression (Tabar et al, 1992; Tirkkonen et al, 1998), a so-called serial tumor progression, has been more widely accepted. Nevertheless, studies to establish a (cyto)genetic progression model of breast cancer reflecting the dogma of a serial tumor progression, such as for colorectal cancer (Fearon and Vogelstein, 1990), have failed so far.

Our results, using 15 antibodies, convincingly indicated the presence of multiple clusters or groups of genes within sporadic breast cancer, comparable to clusters defined in recent studies using cDNA chip–based gene expression analysis of sporadic breast cancer cases (Gruvberger et al, 2001; Perou et al, 2000). Using conventional IHC on whole sections, identical correlations for the coexpression of p53 and EGFR in Ck 5/6-positive breast cancer, as discussed below, have also been drawn (Megha et al, 2002). In analogy to previous studies, clusters with c-erbB2, basal keratin 5/6, and luminal keratins 8/18 could be found (Alizadeh et al, 2001) (Fig. 3). Therefore, chromosomal gains and losses and protein expression patterns seem to be more relevant for our understanding of breast carcinogenesis than the concentration on distinct, single genes (van't Veer et al, 2002). A strong result has been obtained despite the immunohistochemical analysis of small tumor areas by TMA, which may have restrictions with regard to representativeness. Apparently, with the use of TMA-adapted thresholds for interpreting the immunoreactivity, the influence of unavoidable “false-positive or false-negative” TMA staining results is minor (Simon et al, 2001). The use of these thresholds also resulted in an equal frequency of tumors in this series (over)expressing distinct proteins such as Ck 5/6, c-erbB2, or p53 (Jones et al, 2001; van de Vijver, 1993), which is a further argument for the use of TMAs.

Recent gene expression, immunohistochemical, and clinicopathologic studies have been focused on distinctive properties of breast cancer cases expressing so-called “basal,” high-molecular cytokeratins (Ck 5/6, Ck5, Ck 5/14) (Perou et al, 2000; Tsuda et al, 2000; Wetzels et al, 1991). It is widely accepted that cytokeratin expression patterns seem to be highly conserved during carcinogenesis (Moll et al, 1982). In consequence, the existence of Ck 5/6-positive tumors raises fundamental controversies about their formal pathogenesis, with low-nuclear grade tumors normally exhibiting Ck 8/18 expression. Therefore “basal-type” carcinomas seem to represent a unique pathway in breast carcinogenesis with an as yet undefined relation toward previously proposed cytogenetic pathways in ductal invasive breast cancers (Jones et al, 2001).

The introduction of a new biologic cell concept into breast pathology opens another, unifying explanatory model (Boecker et al, 2002). Given that different cell populations with differing cytokeratin expression patterns exist within the female breast, Ck 5/6-positive breast cancer cases might arise out of different precursor cells than Ck 8/18-positive breast cancer cases. By subdividing breast cancer cases according to their cytokeratin expression patterns, in analogy to the cell populations in the normal breast, clear differences could be demonstrated. As in the normal female breast, an increasing degree of “maturation,” indicated by the expression of Ck 8/18, dramatically changed the expression of various growth factor receptors, cell cycle–associated proteins, and hormone receptors. Whereas the percentage of EGFR, cyclin A, Mib-1, and p53 strongly positive tumors clearly decreased with increasing Ck 8/18 expression, the rate of c-erbB2-, ER-, PR-, bcl-2-, p21-, and p27-positive tumors increased with reduced or absent Ck 5/6 expression (Fig. 2). The low number of tumors might explain the inhomogeneous results in the Ck 5/6 and 8/18 coexpressing tumors, but the trend is obvious. Interestingly, the Ck 5/6-positive tumors displayed a significantly lower rate of 16q losses, which are indicators of a lower tumor grade (Ck 5/6+ vs Ck 5/6−; p < 0.01; Student's t test, two-tailed) (Buerger et al, 1999b; Vos et al, 2000).

Largely overlapping gene expression patterns of breast cancer cell lines, breast cancer, and normal breast tissue have been shown recently (Ross et al, 2000; Sorlie et al, 2001). To date, data about differential gene expression of cells with varying cytokeratin expression patterns within the normal female breast are lacking. Nevertheless, previous studies indicate a further diversification of gene expression. DiRenzo and coworkers were able to show that immortalized mammary epithelial cells with a basal phenotype revealed an expression of proteins belonging to the p53 and EGFR pathway (DiRenzo et al, 2002). Comparing these findings with our new data (Fig. 3) gives strong support to the hypothesis that cell type–specific, basic physiologic gene expression patterns are maintained in distinct evolutionary pathways of breast cancer. Against this background it seems rather unlikely that a change of cytokeratin expression patterns in breast carcinogenesis occurs. Members of the erb gene family thereby seem not only to play different roles in subgroups of breast cancer but also to have different impacts in physiologic breast differentiation (Muthuswamy et al, 2001).

The cytogenetic alteration status with a decreasing average number of genetic alterations per case in tumors with Ck 8/18 expression (Ck 5/6+ and Ck 8/18− vs. Ck 5/6− and Ck 8/18+; p < 0.001, Student's t test, two-tailed) (Fig. 2) also points to the existence of a unique subgroup, as described previously for c-erbB2–amplified tumors (Isola et al, 1999).

Because breast cancer is a disease with disturbances of chromosomal integrity causing altered protein expression patterns or vice versa, a better definition of each single breast cancer case requires the knowledge of the protein expression and chromosomal alteration status. Integrating the immunohistochemistry-based TMA data into CGH data by cluster analysis again, irrespective of the algorithm used (data not shown), elaborated one cluster arm almost exclusively containing Ck 5/6-positive, ductal invasive G3 breast cancer cases. It is of interest that Ck 5/6-positive tumors in 5 of 13 cases revealed a strong coexpression of p53, indicative of p53 mutations as has been shown recently (Megha et al, 2002), but rarely revealed an amplification of c-erbB2, associated with a strong c-erbB2 overexpression (one case). Regarding the extreme forms of c-erbB2 and p53 overexpression, respectively, the exclusive use of either of these pathways becomes obvious. In this series a strong concerted expression of p53 and c-erbB2 within one tumor could not be shown, also supporting the idea of early, complex cytogenetic changes before the step of aneuploidization (Rennstam et al, 2001). In contrast, other studies could not find this exclusive strong overexpression of p53 and erbB2. Nevertheless, poorly differentiated or G3 breast cancer has to be interpreted as a spectrum of parallel pathways with different underlying, partially independent, partially overlapping pathogenetic mechanisms, but with similar histologic differentiation and cytogenetic alteration patterns. Furthermore, independent members of this tumor group might be breast cancer cases with a background of germ line mutations within BRCA1 or BRCA2 (Breast Cancer Linkage Consortium, 1997), resulting in the activation of differing intracellular pathways (Hedenfalk et al, 2001), reflected or caused by differing cytogenetic alterations (Tirkkonen et al, 1997, 1999).

In conclusion, it seems legitimate to interpret these results as evidence for an important role of subpopulations within the normal breast functioning as specific precursor cells of breast cancer within specific cytogenetic pathways. Although a carcinogenic hit of “near stem-cell-like-cells or progenitor-cell” automatically seems to be associated with the evolution of a poorly differentiated breast cancer, the tumor-associated progression of a glandular cell seems to be associated with a lower tumor grade. It is important to mention that this tumor group seems to be mainly associated with a cytogenetic pattern dominated by the loss of chromosomal 16q material, in contrast to poorly differentiated breast cancer cases that may have various, but largely exclusive, protein expression patterns and unspecific far-advanced cytogenetic alteration patterns. The knowledge of different precursor cells of breast cancer associated with or maybe resulting in different protein expression patterns will therefore influence the research in unraveling new therapeutic and prophylactic regimens.

Materials and Methods

Materials

Fresh-frozen and paraffin-embedded tissue of 166 breast cancer cases, originating from the files of the Gerhard-Domagk-Institute of Pathology, University of Muenster, were used for CGH analysis and TMA production, respectively. The tumor series represented all subtypes of invasive breast cancer; the tumors were graded according to established protocols (Ellis and Elston, 1998) as G1 (n = 28), G2 (n = 79), or G3 (n = 59).

Methods

CGH Analysis

CGH analysis and the evaluation of genetic alterations were performed as previously described (Buerger et al, 1999b). In short, high-molecular-weight DNA from fresh-frozen breast cancer tissue was used for CGH analysis. Only metaphase spreads showing an even high-intensity hybridization with low granularity were taken into account. Corresponding ratio profiles were evaluated only if the 95% confidence limits did not exceed 0.15. The 50% thresholds (upper threshold 1.25, lower threshold 0.75) were applied to define the chromosomal regions of DNA sequence losses or gains. Independent confirmation of chromosomal aberrations has shown that these thresholds are reliable and eliminate the possibility of false-positive results. The consistency of these aberrations has been confirmed by previous reverse-CGH experiments (tumor DNA labeled with digoxigenin; reference DNA labeled with biotin). Each CGH experiment included a control hybridization of FITC and rhodamine-labeled normal DNA to each other (Buerger et al, 1999b, 2000a; Kallioniemi et al, 1994).
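
The threshold rule described above can be illustrated with a minimal Python sketch. It is not the evaluation software used in the study. The function and parameter names are hypothetical, the strict inequalities at 1.25 and 0.75 are an assumption (the text does not say how ratios exactly on a threshold were handled), and the confidence criterion is read here as a single value that must not exceed 0.15.

```python
from typing import Optional

# Thresholds taken from the text; all names below are hypothetical.
GAIN_THRESHOLD = 1.25   # upper threshold: ratio above this is called a gain
LOSS_THRESHOLD = 0.75   # lower threshold: ratio below this is called a loss
MAX_CI_LIMIT = 0.15     # profiles with a larger 95% confidence limit are skipped


def classify_cgh_ratio(ratio: float, ci_limit: float) -> Optional[str]:
    """Classify one CGH ratio value as 'gain', 'loss', or 'balanced'.

    Returns None when the profile does not meet the confidence criterion
    and therefore would not be evaluated at all.
    """
    if ci_limit > MAX_CI_LIMIT:
        return None
    if ratio > GAIN_THRESHOLD:      # assumption: strict inequality
        return "gain"
    if ratio < LOSS_THRESHOLD:      # assumption: strict inequality
        return "loss"
    return "balanced"
```

For example, a ratio of 1.4 with a confidence limit of 0.10 would be called a gain under these assumptions, whereas the same ratio with a confidence limit of 0.20 would be excluded from evaluation.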

Tissue Arrays

Cancer tissue was fixed in buffered 4% formaldehyde and afterward embedded in paraffin according to standard procedures. Sufficient tumor material of 166 breast cancer cases, characterized by CGH, was available for the production of TMAs. Two duplicate arrays were constructed and each subjected to further analysis. To enable the definition of representative tumor areas, a hematoxylin and eosin-stained section was made from each donor block. The spot diameter was 0.6 mm. The distance between the spots was 1.0 mm. According to standard procedures (Brandt et al, 2002; Kononen et al, 1998; Torhorst et al, 2001), one spot per tumor was punched out of the original tumor block using a specialized tissue array precision instrument (Beecher Instruments, Silver Spring, Maryland). The TMA block was cut in sections of 3-μm thickness.

Immunohistochemistry

Staining procedures were done according to standard protocols. The pretreatment conditions, the source, and the dilution of the commercially available primary antibodies are shown in Table 1.

Table 1 Pretreatment Conditions, Source, and Dilution of Primary Antibodies Used

According to the literature and our own experience, the IHC results were semiquantitatively evaluated as follows (Bankfalvi et al, 1994; Mommers et al, 1998, 2002). Expression was binary graded for ER and PR (positive: at least 2% of cells with nuclear expression), bcl-2 (positive: any cytoplasmic staining), p21 (positive: at least 5% of cells with nuclear expression), cyclin D1 and cyclin E (positive: 5% and 10% of cells with nuclear staining, respectively), and Ck 5/6, Ck 8/18, and smooth muscle actin (any positive cytoplasmic staining). Expression was graded from 0 to 3 for Ki-67 (0 = less than 2% of all cells with nuclear staining, 1 = 2–10%, 2 = 11–25%, 3 = more than 25%), p53, cyclin A, and p27 (0 = no staining, 1 = 1–25%, 2 = 26–50%, 3 = more than 50%), c-erbB2 (Dako Score), and EGFR (Dako Score, modified due to a lower staining intensity).
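
The percentage-based scores listed above can be written down as a short Python sketch. It is only an illustration of the stated cut-offs, not the scoring procedure actually used. All names are hypothetical. Two assumptions are made: fractional percentages that fall between two listed bins (for example 10.5%) are assigned to the lower bin, and the cut-offs for cyclin D1 and cyclin E are read as "at least" 5% and 10%. The Dako scores for c-erbB2 and EGFR are not covered.

```python
def grade_ki67(percent_positive: float) -> int:
    """Ki-67 grade: 0 = <2%, 1 = 2-10%, 2 = 11-25%, 3 = >25% of nuclei stained."""
    if percent_positive < 2:
        return 0
    if percent_positive <= 10:
        return 1
    if percent_positive <= 25:
        return 2
    return 3


def grade_p53_cyclin_a_p27(percent_positive: float) -> int:
    """Grade for p53, cyclin A, p27: 0 = none, 1 = 1-25%, 2 = 26-50%, 3 = >50%."""
    if percent_positive <= 0:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    return 3


# Minimum percentage of cells with nuclear staining for a positive call.
NUCLEAR_CUTOFFS = {"ER": 2, "PR": 2, "p21": 5, "cyclin D1": 5, "cyclin E": 10}


def is_positive(marker: str, percent_positive: float) -> bool:
    """Binary call for the nuclear markers listed in NUCLEAR_CUTOFFS."""
    return percent_positive >= NUCLEAR_CUTOFFS[marker]
```

Markers scored as positive on any cytoplasmic staining (bcl-2, Ck 5/6, Ck 8/18, smooth muscle actin) need no percentage cut-off and are therefore left out of the sketch.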

CISH

CISH analysis was performed on 5-μm–thick array sections as previously described (Tanner et al, 2000). Low-level amplification was defined as 6 to 10 signals per nucleus in >50% of tumor cells. Amplification of c-erbB2 was defined as numerous (>10) separate gene copies or large gene copy clusters in >50% of carcinoma cells.
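
As a hedged illustration, the two CISH definitions above can be expressed as one small Python function. The function name, its parameters, and the label "not amplified" for cases meeting neither definition are assumptions made for this sketch. The inputs are the signal count observed and the fraction of tumor cells showing it.

```python
def classify_cish(signals_per_nucleus: int,
                  fraction_of_cells: float,
                  large_clusters: bool = False) -> str:
    """Classify c-erbB2 CISH findings using the cut-offs given in the text.

    signals_per_nucleus: gene copy signals counted per nucleus
    fraction_of_cells:   fraction (0-1) of tumor cells showing that finding
    large_clusters:      True if large gene copy clusters are seen
    """
    # Both definitions require the finding in more than 50% of tumor cells.
    if fraction_of_cells <= 0.5:
        return "not amplified"
    if signals_per_nucleus > 10 or large_clusters:
        return "amplified"
    if signals_per_nucleus >= 6:
        return "low-level amplification"
    return "not amplified"
```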

Biomathematical Cluster Analysis

Immunohistochemistry data were collected according to established scores as described above. The tabulated data of the CGH ratio profiles were recoded in a different scale to fulfill mathematical evaluation demands. Scores ranging from 0 to 3 or 0 to 1 were transferred to a scale ranging from 1 to 4 and 1 to 2, respectively, to avoid scores of zero, which hamper cluster analysis. Missing IHC data (9% of all IHC data) were replaced by the median of that specific score. This procedure approximated the real values in an affordable way and did not bias the evaluation (data not shown). In the analysis of CGH data combined with IHC data, the scales of both categories were adapted to achieve equally weighted variables. The different lengths of the CGH and IHC data in the feature vector were not changed. To analyze the independent behavior of the CGH and IHC feature vectors, both categories were also analyzed separately. Hierarchical cluster analysis based on a Euclidean distance measure was applied (Alaiya et al, 2002; Harris et al, 2002), with the underlying rationale that our case set did not include grave outliers and distinct large groups with further substructures. Among other hierarchical cluster methods, predominantly Ward and Complete Linkage were applied. The comparative analysis of these further algorithms showed that the differences in the result sets were marginal. Therefore we chose the common method Complete Linkage because of its comparability to other data. Our evaluations were performed with the statistical platform SPlus6-r2 using the functions “hclust” and “agnes,” based on algorithms as previously described (Kaufman et al, 1995; Struyf et al, 2002). The results are visualized in a dendrogram showing in a graphical way the similarity of the cancer cases.
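
The preprocessing and clustering steps described above can be sketched in plain Python. This is a naive illustration under stated assumptions, not the S-Plus “hclust” or “agnes” implementation used in the study. All function names and the toy data are hypothetical, the rescaling that weights CGH and IHC variables equally is omitted, and ties between equally distant clusters are resolved by taking the first pair found.

```python
import math
from statistics import median
from typing import List, Optional, Sequence, Tuple


def recode_and_impute(column: Sequence[Optional[int]]) -> List[float]:
    """Shift scores by +1 (0-3 -> 1-4, 0-1 -> 1-2); fill gaps with the column median."""
    shifted = [None if v is None else v + 1 for v in column]
    fill = median(v for v in shifted if v is not None)
    return [fill if v is None else v for v in shifted]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(
    cases: Sequence[Sequence[float]],
) -> List[Tuple[List[int], List[int], float]]:
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [[i] for i in range(len(cases))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(euclidean(cases[p], cases[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical toy data: rows = cases, columns = (Ck 5/6, Ck 8/18, ER, p53).
# None marks a missing IHC value.
raw = [
    [1, 0, 0, 2],
    [1, 0, 0, None],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]
columns = [recode_and_impute(col) for col in zip(*raw)]
cases = [list(row) for row in zip(*columns)]
merges = complete_linkage(cases)
```

In this toy example the two Ck 5/6-positive cases (indices 0 and 1) and the two Ck 8/18-positive cases (indices 2 and 3) each form a pair before the two pairs are joined in the final merge, which is the kind of structure a dendrogram would display. The merge history could be passed to any plotting routine to draw such a tree.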