Chromosomal instability (CIN) is a hallmark of cancer, and for the common epithelial tumours complex and clonal karyotypic evolution is the predominant driver of clinical disease. Breast carcinomas show gain and/or loss of either whole chromosomes (aneuploidy) or segments of chromosomes (segmental aneuploidy), which frequently results in abnormal DNA content (Fridlyand et al, 2006). Flow cytometry provides a reliable estimate of the DNA content of tumour populations and provides a global assessment of genetic instability. Flow cytometric and cytogenetic evidence suggests that most common epithelial cancers have a chromosomal constitution in the near diploid or sub-tetraploid range (Cornelisse et al, 1987; Olaharski et al, 2006; Weaver et al, 2006). It is hypothesised that tumours proceed from near diploid to high structural and numerical chromosomal aneuploidy by a process of endoreduplication. This is followed by further non-random gain and/or loss of specific chromosomes or chromosomal regions resulting in an intermediate DNA index (DI) with enhanced malignant potential (Fujiwara et al, 2005; Ganem et al, 2007).

In breast cancer, previous flow cytometry studies have correlated DNA content with tumour grade (Hitchcock et al, 1989), p53 mutation status (Eissa et al, 1997; Friedrich et al, 1997), disease recurrence/relapse (Ewers et al, 1984; Stal et al, 1994a), and survival (Shiao et al, 1997; Utada et al, 1998), suggesting a therapeutic and prognostic clinical relevance. Tumour DNA content, in conjunction with S-phase fraction (SPF), provides a reliable measure of breast cancer aggression (Ferno et al, 1992; Bagwell et al, 2001b). Using Cox’s proportional hazard modelling, low DI, node negative, breast cancers were associated with good prognosis (Bagwell et al, 2001a). However, the technical limitations of the single parametric flow cytometry technique, including the inability to differentiate between DNA diploid tumour and normal cell populations (Hedley et al, 1983), do not allow accurate sub-grouping of breast tumours, thus limiting further statistical analysis. Additionally, previous attempts to improve DNA content analysis of formalin-fixed paraffin-embedded (FFPE) samples using the single parameter technique by introducing an external control as a reference were unsuccessful, in particular for the detection of tumours with a DI close to 1.0 (Schutte et al, 1985; Darzynkiewicz, 2010). Further, high DI tumours in the DNA tetraploid range with <25% tumour cells could not be differentiated from low DI tumours with an increased G2/M phase (Bauer, 1993). These limitations of single parametric flow have made DNA content analysis controversial and prevented the technique from achieving clinical utility (Ross et al, 2003).

Dual parameter DNA content flow cytometry of FFPE tissues allows detection of epithelial cells by anti-keratin antibodies (Nylander et al, 1994; Leers et al, 2000). This was further improved by Corver et al (2005), who used anti-keratin and anti-vimentin antibodies to allow specific identification of the tumour epithelial cells along with diploid stromal cells used as an internal reference, thus enabling accurate grouping of tumours based on their DI. This also enabled identification of DNA aneuploid tumour populations with a DI close to 1.0. It has been previously suggested that carcinoma-associated fibroblasts can be genetically abnormal and can have an abnormal DNA content; however, more recent studies have shown that, whatever the gene expression features of carcinoma-associated fibroblasts, the chromosome complement of stromal cells is diploid (Allinen et al, 2004; Hosein et al, 2010; Corver et al, 2011).

Additionally, the multiparametric technique facilitated the identification of high DI tumour (epithelial) cells overlapping the G2/M phase of the diploid stromal cells allowing characterisation of these tumours. These advantages of the multiparametric technique make it a robust platform for DNA content analysis of FFPE tissues and give it the potential to be a clinically useful tool. The aims of this study were to confirm the application of multiparameter flow cytometry on FFPE breast cancer specimens and to examine the clinical and biomarker associations of multiparametric flow analysis in primary breast cancer.

Materials and methods

Patients and tissues

The FFPE tissue blocks from 201 primary, previously untreated operable breast cancers were selected at random from patients diagnosed between 1997 and 2002. The mean age of the patients was 61 years (age range 28–88 years). All patients provided consent; the project received ethical permission from Tayside Tissue Bank under delegated authority from the Local Tayside Research Ethics Committee.

Blocks were reviewed to confirm the presence of >40% tumour cells. Tumours were graded according to the NHS guidelines for pathology reporting of breast disease (Bloom and Richardson, 1957) by a specialist breast pathologist (CAP or LBJ) (Table 1). Human epidermal growth factor receptor 2 (HER2) status was determined using immunohistochemistry (IHC) with confirmation of HER2-positive tumours by fluorescent in situ hybridisation (FISH) (Hsi and Tubbs, 2004; Purdie et al, 2010). The Quickscore (Detre et al, 1995) method was employed to determine oestrogen receptor (ER) and progesterone receptor (PgR) status. P53 mutation status was obtained using the p53 Amplichip test (Roche, Pleasanton, CA, USA) that detects single base pair substitutions and deletions (Baker et al, 2010). Breast cancer-specific overall survival, disease-free survival and recurrence were recorded with a minimum of 5 years follow-up for patients included in the study.

Table 1 Summary of the clinico-histopathological variables for 201 breast cancers studied using flow cytometry for tumour DNA content

Tissue processing and staining for flow cytometry

Tissue sections were processed as described by Corver et al (2005). Briefly, 3 × 60 μm thick sections from each block were de-waxed using xylene and rehydrated with decreasing concentrations of alcohol before antigen retrieval using 10 mM citric acid buffer (pH 6) for 1 h at 80 °C. Tissue was dissociated with a solution of 0.1% collagenase (Sigma-Aldrich, Dorset, UK) and dispase (Gibco; Invitrogen, Paisley, UK) for 1 h. One million cells were incubated overnight with anti-cytokeratin (2 μg ml−1 MNF116, DAKO (Glostrup, Denmark) and 5 μg ml−1 AE1AE3, Chemicon International (Temecula, CA, USA) MAB3412) and anti-vimentin (diluted 1:5 V9-2b, Antibodies for Research Applications BV, Gouda, The Netherlands) primary antibodies. Cells were washed with PBA (500 ml PBS/5% BSA/0.05% Tween-20) and treated for 30 min with fluorescently conjugated secondary antibodies, FITC-IgG1 (Southern Biotech, Birmingham, AL, USA) for cytokeratin detection and PE-IgG2b (Southern Biotech) for vimentin detection. Finally, the cells were resuspended in a solution of 10 μM propidium iodide (PI) (Sigma-Aldrich) and 0.1% RNase (Sigma-Aldrich) for 2–3 h or overnight before analysis.

Flow cytometry and analysis

Cell suspensions were analysed using a FACScan (BD Biosciences, San Jose, CA, USA) with CellQuest software (version 3; BD Biosciences). DNA content analysis was performed by acquiring 50 000 events from each sample and simultaneously measuring PI, FITC-labelled epithelial, and PE-labelled stromal cell populations (Corver et al, 1996; Figure 1C). The keratin-positive tumour epithelial cell population detected using flow cytometry ranged from 6% to 90% of the total events analysed. During acquisition, doublet discrimination using the PI area vs PI width signal was applied to remove cell clumps and restrict the analysis to single cell events, and noise/debris in front of the first G1 peak was also gated out (Figure 1A). The 530/30 band pass filter was used to detect FITC-labelled epithelial cells in the FL1 channel, PE-labelled stromal cells were detected using the 585/42 band pass filter in the FL2 channel and, finally, PI was detected using the 650 long pass filter in the FL3 channel. According to the published guidelines for DNA content analysis from FFPE tissues, samples with a coefficient of variation (CV) of >8% should be excluded from data analysis to ensure robust data generation (Hedley, 1993). All breast cancers included in the present study yielded a CV ranging between 3.05% and 7.84%, and the reproducibility of the technique was confirmed with 15 randomly selected samples (data not shown).
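The CV exclusion criterion can be illustrated with a minimal Python sketch. It assumes the CV is taken as the standard deviation of a G0/G1 peak divided by its mean; the function names and the example intensities are hypothetical, and in practice the CV is normally reported by the histogram-modelling software rather than computed from raw events in this way.

```python
from statistics import mean, pstdev

CV_CUTOFF_PERCENT = 8.0  # exclusion threshold quoted in the text (Hedley, 1993)


def peak_cv_percent(intensities):
    """Coefficient of variation (%) of the events in one G0/G1 peak."""
    return 100.0 * pstdev(intensities) / mean(intensities)


def passes_cv_criterion(intensities, cutoff=CV_CUTOFF_PERCENT):
    """True when the peak CV does not exceed the cut-off, so the sample is kept."""
    return peak_cv_percent(intensities) <= cutoff


# Hypothetical PI intensities (arbitrary channel units), not study data.
narrow_peak = [190, 195, 200, 200, 205, 210]
broad_peak = [100, 150, 200, 250, 300]
```

With these illustrative values the narrow peak would be retained and the broad peak excluded.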

Figure 1

The four DNA content categories based on multi-parametric analysis. (A) Dot plot showing doublet discrimination to ensure analysis of a single cell population. (B) PI-positive cells as a negative control. (C) Dot plot showing keratin-positive epithelial cell population (EP), and vimentin-positive stromal cell population (SP). (D) DNA content histograms showing an example of a low DI, intermediate DI, high DI, and a multiploid tumour. The epithelial tumour cell population (green) is shown overlapping the normal stromal cell population (red). Histograms in (E) show the mathematical modelling analysis using the ModFit algorithm. The color reproduction of this figure is available at the British Journal of Cancer journal online.

The DI and percentage SPF (SPF%) were calculated using ModFit 3.2.1 and WinList 6.0 (Verity Software House, Topsham, ME, USA). The two software packages were remotely linked, allowing ModFit to use the median of the diploid stromal cells to calculate the DI of the tumour (epithelial) cell population (mean of G0/G1 tumour epithelial peak/mean of G0/G1 reference stromal peak) (Hedley, 1993; Dayal et al, 2008; Corver and ter Haar, 2011). A data file contained all events except those at the lower and higher ends of the linear scale (Figure 1A), to allow the ModFit algorithm to correct for debris, clumps, and aggregates. ModFit analysis also included a gate to select all keratin-positive events in the FSC vs keratin dot plot, thus including doublets, noise and debris, allowing ModFit to handle data correction without manual interference.
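The DI ratio given in parentheses above can be written as a short Python sketch. This is only an illustration of the arithmetic, not the ModFit or WinList procedure; the function name and the channel values are hypothetical.

```python
from statistics import mean


def dna_index(tumour_g0g1, stromal_g0g1):
    """DI = mean G0/G1 position of tumour (epithelial) cells divided by
    the mean G0/G1 position of the diploid reference (stromal) cells."""
    if not tumour_g0g1 or not stromal_g0g1:
        raise ValueError("both peaks need at least one event")
    return mean(tumour_g0g1) / mean(stromal_g0g1)


# Hypothetical PI channel values for G0/G1 events, not study data.
stromal_events = [198, 200, 202]   # vimentin-positive reference cells
tumour_events = [297, 300, 303]    # keratin-positive tumour cells
example_di = dna_index(tumour_events, stromal_events)
```

A DI of 1.0 means the tumour G0/G1 peak coincides with the diploid stromal reference; the hypothetical values here give a ratio of 300/200.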

Statistical analysis

Flow cytometry data were compared with clinical and histopathological parameters including histological tumour grade, p53 mutation status, HER2 status, ER status, and PgR status. An in-house data analysis tool (INSPIRE), which includes the two-tailed Fisher’s exact test (FET), was used for preliminary statistical analysis (Quinlan et al, 2008). Disease recurrence, disease-free survival, and overall survival were studied using the Kaplan–Meier (KM) log-rank test. Significant results were further analysed to confirm independent associations using the backward stepwise Binary Logistic Regression (BLR) and Cox’s Regression (CR) statistical analysis. The strength of the associations was assessed by calculating the β coefficient. The P-values for each association were used to discriminate independent associations between the variables analysed. For all analyses, the null hypothesis was rejected at α level of 5% (P<0.05).
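For readers unfamiliar with the two-tailed Fisher’s exact test, the following Python sketch shows one common formulation for a 2 × 2 table: the P-value is the sum of the hypergeometric probabilities of all tables, with the same margins, that are no more likely than the observed table. It is not the INSPIRE tool, the function name is hypothetical, and the counts in the example are invented for illustration.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Invented counts, e.g. rows = DI category, columns = marker status.
p_value = fisher_exact_two_tailed(3, 1, 1, 3)
significant = p_value < 0.05  # the 5% alpha level stated in the text
```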

Results

Categorising primary breast tumours

In all, 201 primary breast cancers were grouped into four categories based on the DI of the tumour cell population (Figures 1D and E). Tumours with a DI between 0.76 and 1.14 were observed to be in the ‘low DI’ range (n=79), ‘intermediate DI’ tumours (n=42) had a DI ranging from 1.18 to 1.79, and ‘high DI’ tumours (n=23) had a DI of ≥1.80. Tumours with two or more peaks from one or more of the above three categories were classed as ‘multiploid’ tumours (n=57). The DNA index was consistent, and no variations were observed when DNA content analysis was repeated for 15 randomly selected tumours, supporting the reliability of the data generated. The DI cutoffs were determined based on published recommended standards (Hedley, 1993; Shankey et al, 1993). Figure 2 summarises the DI for all 201 samples used in this study.
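The four-way grouping can be sketched in Python as follows. The sketch reads the high DI range as 1.80 and above and treats any tumour with two or more tumour cell peaks as multiploid; values falling outside the reported ranges are returned as "unclassified", which is an assumption because the text gives no rule for them. The function names are hypothetical.

```python
def classify_peak(di):
    """Map the DI of a single tumour cell peak onto the reported ranges."""
    if 0.76 <= di <= 1.14:
        return "low"
    if 1.18 <= di <= 1.79:
        return "intermediate"
    if di >= 1.80:
        return "high"
    return "unclassified"  # assumption: no rule is given for other values


def classify_tumour(peak_dis):
    """Categorise a tumour from the DI of each tumour cell peak it contains."""
    if not peak_dis:
        raise ValueError("at least one DI value is required")
    if len(peak_dis) >= 2:
        return "multiploid"  # two or more tumour peaks
    return classify_peak(peak_dis[0])
```

For example, `classify_tumour([1.5])` gives "intermediate", whereas `classify_tumour([1.0, 1.9])` gives "multiploid".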

Figure 2

A bimodal DNA index distribution curve of 201 breast cancers. The number of samples is represented on the y axis and the DNA index is indicated on the x axis.

Based on the technical precision of multiparameter flow analysis (Corver et al, 2005), tumours in the low DI range could further be categorised into three groups: (i) group 1 (n=26), with 0.76 ≤ DI ≤ 0.95, (ii) group 2 (n=25), with 0.96 ≤ DI ≤ 1.05, and (iii) group 3 (n=28), with 1.06 ≤ DI ≤ 1.14 (Figure 3).
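As a small illustration of the low DI sub-grouping, the Python sketch below assigns a group number from the DI, assuming the DI is rounded to two decimal places as quoted in the text. The function name is hypothetical.

```python
def low_di_group(di):
    """Return 1, 2 or 3 for a low DI tumour, or None outside the low DI range."""
    di = round(di, 2)  # DI values in the text are quoted to two decimals
    if 0.76 <= di <= 0.95:
        return 1
    if 0.96 <= di <= 1.05:
        return 2
    if 1.06 <= di <= 1.14:
        return 3
    return None
```

Applied to the example DI values quoted in the Figure 3 legend (0.90, 0.97 and 1.11), this returns groups 1, 2 and 3, respectively.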

Figure 3

Low DI sub-categories. Histograms showing a group 1 low DI tumour with overlapping stromal (SP) and epithelial (EP) cell population and DI=0.90, group 2 low DI=0.97, and group 3 low DI tumour with DI=1.11.

The SPF% was also obtained for each tumour using ModFit analysis. Tumours were separated into three categories based on SPF% as described previously (Stal et al, 1994b): low SPF% <5% (n=105), medium SPF%=5–10% (n=41), and high SPF% >10% (n=55) tumours.
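The SPF% banding can be expressed as a simple Python function. This sketch reads the bands as below 5%, 5 to 10% inclusive, and above 10%; the function name is hypothetical.

```python
def spf_category(spf_percent):
    """Band a tumour by S-phase fraction, given as a percentage."""
    if not 0 <= spf_percent <= 100:
        raise ValueError("SPF% must lie between 0 and 100")
    if spf_percent < 5:
        return "low"
    if spf_percent <= 10:
        return "medium"
    return "high"
```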

Clinical associations

Univariate statistical analysis

Low DI tumours showed significant associations with good prognostic features, including HER2-negative status (P=0.03, FET), ER-positive status (P=0.003, FET), PgR-positive status (P=0.013, FET), p53 wild type (P=0.03, FET), tumour grade 1–2 (P<0.0001, FET), disease-free survival (P=0.0027, KM), and overall survival (P=0.0023, KM). Further analysis of the low DI cases (divided into subgroups 1, 2, and 3) indicated that group 2 low DI tumours associated with better prognostic markers when compared with the group 1 low DI tumours. The group 1 low DI tumour category (n=26) associated with poor prognostic features such as poor Nottingham prognostic index (NPI) (Galea et al, 1992) (P=0.03, FET) and tumour grade 3 (P=0.02, FET), whereas group 2 low DI tumours (n=25) associated with lymph node-negative status (P=0.02, FET). Group 3 tumours showed no significant associations.

Intermediate DI tumours showed an association with markers of poor prognosis, including ER-negative status (P=0.02, FET), p53 mutant status (P=0.0016, FET), tumour grade 3 (P=0.0014, FET), and disease recurrence (P=0.04, KM). High DI tumours were only associated with HER2-positive status (P=0.006, FET), and multiploid tumours showed no significant associations with any of the clinical parameters examined.

The above tumour categories were further assessed using clinical markers associated with the luminal equivalent (ER, PgR), HER2-positive and triple receptor-negative breast cancers (TNBC). Low DI tumours were significantly associated with clinical markers (ER+ve and/or PgR+ve) that are characteristic of the luminal type cancers (P=0.004, FET), whereas intermediate DI tumours associated with TNBC (P=0.007, FET). The molecular subtyping of these tumours based on microarray expression analysis was outside the scope of this study.

Tumours with high SPF (>10%) associated with ER-negative status (P=0.001, FET), PgR-negative status (P=0.001, FET), p53 mutant tumours and intermediate-high DI tumours (P=0.001, FET).

Multivariate statistical analysis

Significant associations obtained from univariate analysis were tested using CR (Table 2) and BLR analyses (Table 3) to determine independent associations. An independent association between tumour DNA content and overall survival was not confirmed by multivariate analysis (Table 3). However, low DI tumours independently associated with PgR-positive status (P=0.012, BLR), intermediate DI tumours associated with p53 mutant status (P=0.001, BLR), and high DI tumours independently associated with HER2-positive status (P=0.004, BLR) (Table 3; Figure 4). High SPF (>10%) independently associated (P=0.027, CR) with low patient survival (Table 2; Figure 4).

Table 2 Cox’s regression (CR) analysis
Table 3 Backward stepwise Binary Logistic Regression (BLR) statistical analysis
Figure 4

Independent associations of DNA content categories with prognostic markers. Independent associations between DNA content categories and molecular markers associated with breast cancer prognosis using BLR and CR analysis. The sense of association (negative or positive) and the corresponding level of confidence (P-value) are also indicated in the figure. The direction of the arrows indicates the direction of association found by BLR and CR. DI, DNA index; SPF%, S-phase fraction.

Discussion

This study has, for the first time, applied the multiparametric flow technique to FFPE tissue from 201 primary breast cancers. The technique allows assessment of tumour DNA content by differential staining of keratin expressing epithelial tumour cells (FITC labelled) from vimentin expressing normal stromal cells (PE labelled) (Corver et al, 2005). The ability of this technique to utilise routinely processed FFPE clinical material of mixed tumour and stromal tissues is beneficial compared with other methods of DNA analysis, such as deep sequencing, which requires a minimum of 50–70% tumour cells from frozen tissue and relies on other labour intensive techniques. Reassuringly, the DI values obtained in this study from FFPE sections show a similar distribution (Figure 2) to previously published data on fresh/frozen tissue material (Kimmig et al, 2001), underpinning the reliability of this multiparameter flow technique for DNA content analysis of FFPE tissues.

The association of low DI tumours with good prognostic indicators in breast cancer confirms previous findings (Cunningham et al, 1994; Spiethoff et al, 2000). Furthermore, reliable subgrouping of low DI FFPE tumours based on the DI, while not possible using the single parametric technique, demonstrated an association of group 2 low DI tumours with lymph node-negative status (P=0.02, FET), not previously reported using FFPE tissue. This suggests that the technical deficiencies of single parameter flow cytometry may have masked clinically relevant associations between lymph-node status and tumour DNA content (Beerman et al, 1990; Bergers et al, 1996; Spiethoff et al, 2000).

Tumours belonging to the intermediate DI, high DI, and multiploid categories (Figure 1D and E) could not be appropriately grouped in previous studies due to the technical limitations of the single parameter technique to differentiate between epithelial and stromal cell populations. As a result, multiploid tumours may have been incorrectly allocated to either the intermediate or high DI tumour groups. A reliable separation of intermediate DI and high DI tumours could only be achieved as exemplified here with the multiparametric flow method. The association of DNA content categories with different markers of prognosis (Figure 4) in the present study indicates that these tumours clinically behave differently. This highlights the clinical potential of multiparametric flow analysis of breast cancer to determine sub-groups of cancers within tumour categories previously identified by DNA content using single parametric flow analysis.

Biomarker associations

The PgR expression and tumour p53 mutation have been postulated as independent markers of prognosis in breast cancer (Overgaard et al, 2000; Liu et al, 2009) as confirmed here (Table 3; Figure 4). An association of low DI tumours with ER-positive and PgR-positive status has been previously reported (Leers and Nap, 2001); however, for the first time we demonstrate multivariate analysis showing an independent association of low DI tumours with PgR, but not ER, positive IHC.

Tumours with aneuploid DNA have been associated with poor prognostic features (Friedrich et al, 1997; Leers and Nap, 2001). On further interrogation these tumours revealed two distinct subgroups, (i) intermediate DI tumours, which associated with poor prognostic features including p53 mutation and triple negative status; and (ii) high DI tumours associated with HER2 positivity but not with any other markers of prognosis. Multivariate statistical analysis confirmed these associations suggesting the presence of distinct molecular categories within the aneuploid tumours of potential clinical significance.

High SPF% independently associated with poor survival using CR analysis. Previous studies have failed to identify SPF% as an independent prognostic marker using single parameter DNA content analysis in lymph node-positive breast cancer patients (Witzig et al, 1991, 1993) but have found an association with overall survival in lymph node-negative tumours (Cunningham et al, 1994) and, recently, in both early onset and advanced stage breast cancer (Vielh et al, 2005). In the present study, SPF% was not found to be significantly associated with lymph-node status; however, both lymph-node status and SPF% were independently associated with survival (Table 2; Figure 4). The SPF% data presented here are inherently different from previously reported studies as the tumour (epithelial) cell fraction SPF% has been specifically calculated, not the total SPF%. Interestingly, high SPF% also associated with intermediate and high DI tumours (P=0.001, FET), the tumour categories in this study associated with poor prognostic markers.

The identification of clinically distinct DNA content sub-groups in the present study suggests that a reliable DNA content analysis platform is required and achievable to investigate the complex biological mechanisms that drive cancer. The SPF% and DNA content analysis using multiparameter flow analysis associate significantly with prognosis and warrant further investigation. Moreover, an understanding of the basic biology responsible for tetraploidy and sub-tetraploidy in breast cancer is likely to have direct relevance to similar events in other common epithelial tumours.

Conclusion

Multiparameter DNA content analysis of FFPE primary breast cancers identifies categories of breast cancer, based on DI, associated with conventional clinical and pathological parameters of prognostic and therapeutic significance. Investigating the mechanisms involved in tumour cell DNA content may provide future therapeutic opportunities in breast cancer.

Ethical approval

Ethical approval was obtained from the Tayside Tissue Bank under delegated authority from the Tayside Local Research Ethics Committee for all patient samples included in this study.