Main

Histologic grading of breast cancer is liable to inter- and intra-observer variability and has suboptimal reproducibility1 because of the subjective nature of the three components that constitute the grading system: nuclear pleomorphism, tubule formation and mitotic count. The usefulness of nuclear morphometry by image analysis in providing a more objective and reproducible prognosis for breast cancer patients has been recognized for a long time.2, 3, 4, 5, 6, 7, 8 Prognostically important features express the nuclear size and shape (and in some cases the nuclear texture and the architecture of the tissue9) in a quantitative manner. Incorporating nuclear morphometry features into grading would therefore make sense, but wider acceptance of such an approach to histological grading has been hampered by the tedious and time-consuming nature of the manual segmentation of nuclei. Another contributing factor has been the lack of technology for high-throughput digitization of histological slides. Recently, whole slide imaging10, 11 has become more affordable and thus more widely accepted in pathology labs, with scanning and processing times constantly decreasing. Thus, fully digital pathology archives are already feasible.12 This development of scanning equipment has in turn prompted the development of automatic image analysis methods for histopathology images that aim at reducing or completely eliminating the manual input to the quantitative analysis of the tissue.13

In this study, we demonstrate the usefulness of automatic image analysis in breast cancer grading by employing a segmentation method to extract prognostically relevant morphometric features related to size from cancer nuclei in male breast cancer patients. This relatively rare type of cancer represents less than 1% of all breast cancers. Despite this, the mortality and morbidity of the disease are significant. Owing to the rare occurrence, large series are lacking and most of the knowledge is generalized from breast cancer in females. One previous study on 50 male breast cancer patients revealed that nuclear morphometry features from manually produced segmentations are predictive for the survival of the patients.7 In the work presented here, we extract two morphometric features, the mean nuclear area and standard deviation of the nuclear area, using a fully automatic segmentation method on whole slide images, and we analyze their prognostic value.

Materials and methods

Patients

The study population comprised 101 invasive male breast cancer patients. These are consecutive cases from the years 1986 to 2010 from four pathology labs in the Netherlands: St Antonius Hospital Nieuwegein, Diakonessenhuis Utrecht, University Medical Center Utrecht and Laboratory for Pathology East Netherlands. This group of patients was previously used to analyze the molecular sub-typing, fibrotic focus and hypoxia in male breast cancer.14, 15 The patients for whom survival data was not available were excluded in the current study. Age, tumor size and lymph node status were extracted from the pathology reports. Cases with isolated tumor cells were considered as lymph node positive. Hematoxylin and eosin (HE) slides were reviewed by three experienced observers (PJvD, RK, AHJV-M) to confirm the diagnosis and to characterize the tumor. Histological type (WHO), tubule formation, nuclear grade, mitotic activity index (MAI16) and histological grade according to the modified Bloom and Richardson17 score were recorded. The grades were assigned by consensus of the three observers in one microscope session. Prognostic information was obtained from the Integral Cancer Centre, the Netherlands. A summary of the clinicopathological data is given in Table 1.

Table 1 Summary of clinicopathological features of 101 male breast cancer patients

Tissue preparation and slide scanning

HE-stained slides were used to identify representative tumor areas. From these areas three 0.6 mm punch biopsies from formalin-fixed and paraffin-embedded tissue blocks were obtained and embedded in a recipient paraffin block to produce a tissue microarray as described previously.14 Sections of 4 μm were cut and stained with HE. The tissue microarray slides were digitized using a ScanScope XT whole slide scanner (Aperio, Vista, CA, USA). The digitization was done at ×40 magnification with a resolution of 0.25 μm/pixel. To reduce the storage requirements, JPEG2000 compression with a high quality factor was used. The tissue microarray cores were manually extracted and stored as separate image files. Entire cores, or parts of them, were removed from the analysis in certain cases, such as folded tissue, non-tumor tissue, poor fixation and poor digitization due to failed autofocusing. The latter two negatively affect the automatic segmentation method. In total, 16 cores were completely removed and 16 cores were partially removed. This includes two patients for whom all three cores were completely removed.

Image analysis

For automatic segmentation of the epithelial nuclei, we used an extension of our previously developed method,18 of which we give a short outline here. The method is also illustrated as a block diagram in Figure 1.

Figure 1

Overview of the automatic nuclei segmentation method.

The segmentation procedure consists of four main steps: (1) preprocessing, (2) multi-scale marker-controlled watershed segmentation, (3) postprocessing and (4) merging of the results from multiple scales. Color unmixing19 (also called color deconvolution), which is a special case of spectral unmixing,20 and a series of mathematical morphology operators21 are used to preprocess the image, removing irrelevant structures and substructures that may hamper the segmentation. An image transformation that highlights points of high radial symmetry22 and the regional minima of the preprocessed image are used to mark candidate nuclei locations. Watershed regions23 are grown from the markers, after which spurious regions are removed based on shape, texture and boundary saliency. The contours of the remaining regions are approximated with ellipses. The results from multiple scales, where the scale is defined by the size of the structuring elements for the preprocessing, are merged into a single segmentation using a region fitness criterion to resolve spatial concurrences.

The segmentation method was evaluated in a separate study on a set of 39 regions of size 1 × 1 mm from digitized breast cancer slides. The evaluation included detection (sensitivity and positive predictive value) and segmentation accuracy (median Dice coefficient), as well as the correspondence between the manual and automatic estimations of the mean and standard deviation of the nuclear area (correlation analysis). The sensitivity, median Dice coefficient and correlation were evaluated by comparison with manual segmentations produced with systematic random sampling.24, 25 The positive predictive value was evaluated by annotating a subset of the segmented regions as correct segmentations or spurious regions. The method showed overall good results in terms of sensitivity (0.86±0.09), positive predictive value (0.90±0.07), median Dice coefficient (0.89±0.02) and correlation of the mean nuclear area (r²=0.84), and the standard deviation of the nuclear area (r²=0.81) between the manual and automatic estimations.
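The evaluation measures named above follow standard definitions, which the minimal sketch below illustrates. It is not the evaluation code used in the study: the function names, the representation of a nucleus as a set of pixel coordinates, and the toy data are all assumptions made for the example.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two regions given as collections of (row, col) pixels."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def sensitivity(true_positives, false_negatives):
    """Fraction of manually annotated nuclei that were detected."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives, false_positives):
    """Fraction of automatically segmented regions that are real nuclei."""
    return true_positives / (true_positives + false_positives)


# Toy data (hypothetical): a 10 x 10 pixel manual outline and an automatic
# outline of the same size shifted down by two rows, so 80 pixels overlap.
manual = {(r, c) for r in range(10) for c in range(10)}
automatic = {(r, c) for r in range(2, 12) for c in range(10)}

dice = dice_coefficient(manual, automatic)  # 2 * 80 / (100 + 100)
```

In a full evaluation one Dice value would be computed per matched nucleus and the median reported per region; that aggregation is omitted here.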

Automatic nuclei segmentation was performed on all tissue microarray cores after which the mean and standard deviation of the nuclear area were calculated for all patients. The values were corrected to compensate for the systematic over- and under-estimation error with linear regression trained on the 39 regions mentioned before. Some example segmentation results are given in Figure 2.
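The feature extraction and correction step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 0.25 μm/pixel resolution comes from the scanning description, while the helper names, the use of the sample standard deviation, the direction of the regression (manual estimate regressed on automatic estimate) and all numeric values are hypothetical.

```python
from statistics import mean, stdev

PIXEL_SIZE_UM = 0.25  # scanner resolution stated in the text (um per pixel)


def pixel_count_to_um2(n_pixels, pixel_size_um=PIXEL_SIZE_UM):
    """Convert a nucleus area from a pixel count to square micrometres."""
    return n_pixels * pixel_size_um ** 2


def area_features(pixel_counts):
    """Mean and standard deviation of nuclear area for one patient.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    areas = [pixel_count_to_um2(n) for n in pixel_counts]
    return mean(areas), stdev(areas)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical paired estimates from reference regions (um^2).
auto_means = [40.0, 50.0, 60.0, 70.0]
manual_means = [45.0, 54.0, 63.0, 72.0]
slope, intercept = fit_line(auto_means, manual_means)

# Correct a new automatic estimate with the fitted line.
corrected = slope * 55.0 + intercept

# Features for one hypothetical patient with two segmented nuclei.
mean_area, sd_area = area_features([1600, 3200])
```

The same fit would be done separately for the mean and for the standard deviation of the nuclear area, each with its own slope and intercept.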

Figure 2

Example of the automatic nuclei segmentation results in whole slide images of male breast cancer (all images are reproduced at the same scale). The tumors in the first row have low mean nuclear area and the tumors in the second row have high mean nuclear area.

Statistical analysis

A result was considered statistically significant if P<0.05. As the number of patients with score I for nuclear atypia, tubule formation and Bloom and Richardson grading system was very low, scores I and II were pooled. The mean and standard deviation of the nuclear area were correlated with clinicopathological features using the independent sample t-test.
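As a reminder of what the independent sample t-test computes, the sketch below evaluates the pooled-variance t statistic for two groups. Whether equal variances were assumed in the study is not stated, so the pooled form is an assumption, and the group values are hypothetical. Obtaining the P value additionally requires the t distribution, which a statistics package would supply and which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical mean nuclear areas (um^2) for two groups of patients.
low_grade = [50.0, 55.0, 60.0]
high_grade = [60.0, 65.0, 70.0]

t_stat, dof = pooled_t_statistic(low_grade, high_grade)
```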

The median follow-up time for the patients was 5.7 years, so all survival analysis was based on the 5-year survival rates. For the univariate survival analysis, patients were divided into low and high groups, separately for the mean and for the standard deviation of the nuclear area. The low group included the patients in the first two tertiles, whereas the high group included the last tertile. The rationale behind this kind of dichotomization was to establish an analogy with the nuclear and Bloom and Richardson grades, for which approximately one third of the patients were assigned a high grade. Tumor size and MAI were dichotomized using previously defined cutoff values.15 Univariate survival analysis was done by plotting the Kaplan–Meier survival curves and performing the logrank test.
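The tertile-based dichotomization and the Kaplan–Meier estimate can be illustrated with a short sketch. It is a simplified illustration, not the analysis code of the study: the function names, the choice of `statistics.quantiles` for the tertile boundary and all data values are assumptions, and the logrank test is not implemented.

```python
from statistics import quantiles


def high_group_flags(values):
    """Flag patients in the upper tertile (True) vs the lower two tertiles (False)."""
    upper_cut = quantiles(values, n=3)[1]  # boundary between 2nd and 3rd tertile
    return [v > upper_cut for v in values]


def kaplan_meier(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon`.

    times:  follow-up time per patient (years)
    events: True if the patient died at that time, False if censored
    """
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == t)
        survival *= 1.0 - deaths / at_risk
    return survival


# Hypothetical mean nuclear areas (um^2) for nine patients.
mean_areas = [41.0, 44.0, 47.0, 50.0, 53.0, 56.0, 59.0, 62.0, 65.0]
flags = high_group_flags(mean_areas)

# Hypothetical follow-up data for one group of six patients.
times = [1, 2, 3, 4, 6, 7]
events = [True, False, True, False, True, False]
five_year_survival = kaplan_meier(times, events, horizon=5)  # (5/6) * (3/4)
```

In practice the estimate would be computed separately for the low and high groups and the two curves compared with the logrank test.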

Features that proved significant in univariate analysis were entered in multivariate survival analysis using Cox's proportional hazards model (forward stepwise selection). Due to the relatively high median age of the patients, age was also taken as a covariate in multivariate survival analysis.
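For reference, the general form of a Cox proportional hazards model is written out below. The notation is generic and not taken from the study: h_0(t) denotes the baseline hazard, x_1, ..., x_p the covariates (here these would be the dichotomized features and age), and β_1, ..., β_p the fitted coefficients.

```latex
h(t \mid x) = h_0(t)\,\exp\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

For a dichotomized covariate, the hazard ratio exp(β_j) compares the high group with the low group while the other covariates are held fixed.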

Results

The comparison of the mean and standard deviation of the nuclear area between patients grouped by clinicopathological features is summarized in Table 2. Significant differences were found for the mean nuclear area between patients with low and high nuclear atypia (P=0.032), low and high MAI (P=0.011), and Bloom and Richardson grades I and II vs III (P=0.007). For the standard deviation of the nuclear area, significant differences were observed between patients with low and high MAI (P=0.014), and Bloom and Richardson grades I and II vs III (P=0.047).

Table 2 Association between automatically assessed mean and standard deviation of the nuclear area on whole slide images with classical clinicopathological features in male breast cancer patients (t-test)

Results from the univariate survival analysis are given in Table 3, and the survival curves according to the mean and standard deviation of the nuclear area are presented in Figure 3. Large tumor size (P=0.036), low tubule formation (P=0.019), high MAI (P=0.015), high Bloom and Richardson grade (P=0.027), and high mean nuclear area (P=0.022) were associated with poor survival. Five-year survival rates for low and high mean nuclear area were 77% and 52%, respectively. The standard deviation of the nuclear area was not a significant predictor of outcome.

Table 3 Univariate survival analysis results of male breast cancer patients according to classic clinicopathological features and automatically extracted mean and standard deviation of nuclear area on whole slide images

Figure 3

Kaplan–Meier survival curves for male breast cancer patients grouped by automatically extracted nuclear morphometry features. (a) Mean nuclear area (P=0.022) (b) Standard deviation of the nuclear area (P=0.328).

In Cox proportional hazards regression, tumor size (P=0.017), tubule formation (P=0.035) and mean nuclear area (P=0.032) were retained as independent prognostic factors. The coefficients for the model are given in Table 4.

Table 4 Coefficients for the Cox proportional hazards model analyzing the additional prognostic value of classical and nuclear morphometry features in male breast cancer

Discussion

The goal of this study was to analyze the prognostic significance of automatically extracted nuclear morphometric features from whole slide images in male breast cancer patients. Of the two examined features, only the mean nuclear area provided significant prognostic value. In contrast to other studies, the lymph node status was not a univariate prognostic predictor of survival. It should be pointed out that this remains the case even if the patients with isolated tumor cells (n=4) are regarded as node negative. A significant difference in mean nuclear area was found between patients with low to moderate and high nuclear atypia as graded by an expert pathologist. This was expected, as subjective assessment of the size of the nuclei is a major part of the nuclear atypia scoring. Interestingly, nuclear atypia score on its own was not a significant prognostic factor, which is in agreement with previous studies in female breast cancer.26 In Cox regression, automatically extracted mean nuclear area was a prognostic factor independent of tumor size and tubule formation. These results are in agreement with multiple studies that have shown the prognostic significance of nuclear morphometry in female and male breast cancer patients.2, 3, 4, 5, 6, 7, 8 Altogether, this suggests that prognostication in male breast cancer could benefit from replacing classical nuclear atypia scoring according to the Bloom and Richardson grading system by automated nuclear morphometry of whole slide images.

Careful examination of the results revealed that many of the cases that have nuclei of small or intermediate size were given grade III for nuclear atypia due to the irregular chromatin texture and presence of large nucleoli and vice versa. Chromatin texture and presence of nucleoli are more easily evaluated by visual examination than nuclear size, and thus, arguably, they are more influential in the grading process. In this study, we did not consider automatically extracted nuclear texture features or nucleolar size, but this presents an interesting topic for future work.

As mentioned before, the recent advancements in slide scanning equipment and automatic image analysis methods for histopathology images may increase the use of quantitative methods in pathology. Software applications for analysis of immunohistochemically stained slides are already available from commercial vendors, some of which have approval by the US Food and Drug Administration.27 Such applications should be considered as additional decision support tools for pathologists, not as tools that overrule them.28 One application example would be presenting the automated mean nuclear area to the pathologist at the time of grading a particular case, together with mean nuclear area values of reference cases and their nuclear atypia score. This additional quantitative input would help ‘steer’ the decision for the nuclear atypia grading. These ‘hybrid’ approaches, however, are yet to be examined in an experimental setting.

It should be pointed out that the automatic estimation has some drawbacks. Poorly fixed or poorly stained tissue, inclusion of regions with severe necrosis or lymphocytic infiltration and failed autofocusing during the digitization may negatively affect segmentation performance, which in turn affects the estimation of the prognostic features. These situations, however, can be easily identified in a revision step by an experienced observer.

In conclusion, we present an automatic method for nuclear morphometry in male breast cancer grading. This approach, which uses whole slide images, offers all the benefits of a quantitative method while eliminating the tediousness of the previous interactive methods. With the increasing availability of slide scanning equipment in pathology labs, this kind of quantitative approach can be easily integrated in the workflow of routine pathology practice.