Main

Among European men, prostate cancer is the most commonly diagnosed and third most lethal cancer type (Ferlay et al, 2013). Randomised trials confirm that radical prostatectomy improves survival, at least for high-risk patients (Bill-Axelson et al, 2011; Wilt et al, 2012), and that radiation therapy or immediate hormonal therapy after radical prostatectomy may further reduce the risk of recurrence and death (Messing et al, 1999; Thompson et al, 2009). There is thus a need for identifying patients at high risk of recurrence in order to offer adjuvant treatment.

Clinical characteristics and pathological evaluations can be applied to estimate the risk of recurrence after radical prostatectomy. Gleason grading is currently the best prognostic marker, but it is subjective and has high observer variation (Melia et al, 2006; Bottke et al, 2013). Current risk-stratification tools incorporate a range of prognostic markers to obtain accurate prediction of cancer recurrence and mortality. One externally validated tool is Cancer of the Prostate Risk Assessment Postsurgical (CAPRA-S; Cooperberg et al, 2011; Punnen et al, 2014; Tilki et al, 2015), which combines preoperative PSA and pathological evaluations of Gleason score, surgical margins (SM), extracapsular extension (ECE), seminal vesicle invasion (SVI) and lymph node invasion (LNI). Objective methods that confirm and supplement the prediction of such tools are desired, but are challenging to identify due to several properties of prostate cancer, including tumour heterogeneity and slow growth (Algaba and Montironi, 2010; Andreoiu and Cheng, 2010).

The aim of this study was to develop and validate a novel prognostic marker for recurrence after radical prostatectomy that could supplement current risk stratifications. The marker objectively assessed disordered chromatin organisation by analysing the texture in images of DNA-specifically stained cell nuclei, and was thus a measure of epigenetic aberrations. Similar markers have proven to offer substantial prognostic information for a wide range of malignancies, including prostate cancer (Jorgensen et al, 1996; Yogesan et al, 1996; Potter et al, 1999; Veltri et al, 2004; Mohamed et al, 2009). Unlike the most similar methods, our marker stratifies on regions defined by the distance to the nuclear periphery, as some genes and chromosome territories reposition radially during carcinogenesis (Cremer et al, 2003; Leshner et al, 2016). In addition, we focus on the smallest cell nuclei and account for heterogeneity and variations in imaging equipment. The marker, termed Nucleotyping, was integrated with CAPRA-S for improved risk stratification.

Patients and methods

Patient material

A consecutive series of 317 patients treated with radical prostatectomy at the Norwegian Radium Hospital between 1987 and 2005 was included in the study. The study was approved by the Regional Committees for Medical and Health Research Ethics (REK) in Norway (REK no. S-07443a). A total of 307 patients had available tumour material and were eligible for analysis. Median follow-up time was 10 years (interquartile range (IQR), 7–14). The primary end point was time to recurrence, defined in accordance with Punt et al (2007), and recurrence was assessed with biopsy, digital rectal examination or imaging modalities. Pathology of all study specimens was centrally reviewed by an experienced uropathologist (LV) who was blinded with respect to patient outcome, using the 2005 ISUP consensus and the ADASP practice guidelines (Epstein et al, 2005, 2008).

Tumour block sampling and selection

On average, three tumour block samples (IQR, 3–4) were included for each patient in order to account for heterogeneity and the entailed sampling challenge. In DNA ploidy analysis, the patient was considered non-diploid if a non-diploid tumour block was sampled; otherwise, the patient was classified as diploid. This is in accordance with the DNA ploidy study by Pretorius et al (2009), which used a subset of the present sample and patient cohort. For Nucleotyping, we correspondingly represented the patient by the tumour block sample most indicative of recurrence (see Supplementary Methods).
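The two patient-level rules can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the sketch assumes that a larger block value is more indicative of recurrence, whereas the actual selection rule is given in the Supplementary Methods.

```python
def patient_is_non_diploid(block_ploidy_types):
    """A patient is non-diploid if any sampled tumour block is non-diploid."""
    return any(ploidy != "diploid" for ploidy in block_ploidy_types)


def representative_block_value(block_values):
    """Return the block value most indicative of recurrence.

    Assumption for illustration: larger values indicate recurrence more strongly.
    """
    return max(block_values)


ploidy_result = patient_is_non_diploid(["diploid", "tetraploid", "diploid"])
value_result = representative_block_value([0.2, 0.9, 0.4])
print(ploidy_result)  # one non-diploid block is enough
print(value_result)   # the largest of the three block values
```

Both rules take the most adverse finding across blocks, so a single aberrant focus determines the patient-level result.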

Image cytometry

Monolayers of isolated cell nuclei stained with Feulgen-Schiff were prepared from formalin-fixed paraffin-embedded tissue as described by Kristensen et al (2003). Images of cell nuclei and corresponding DNA contents were measured with a Zeiss Axioplan microscope equipped with a 546 nm green filter and a monochrome high-resolution digital camera (AxioCam MrM, Zeiss, Jena, Germany or C4742-95, Hamamatsu Photonics, Hamamatsu, Japan) and the Ploidy Work Station Grabber (Room4 Ltd, Crowborough, UK). Each pixel in the resulting images has a value, called the grey level, which reflects the local DNA density.

The imaged cell nuclei were automatically segmented and classified according to cell type by computer software. Trained personnel verified the cell classification and only epithelial cell nuclei were included for Nucleotyping, resulting in about 1400 nuclear images for each sample (IQR, 1000–1500).

Technical variation

The included samples were prepared over a 10-year period. The microscopy system and sample preparation technique were modified to some degree during these years, resulting in changed pixel resolution and image grey levels. The studied Nucleotyping method is founded on these attributes of the nuclear images, and it is thus crucial that they are consistent across all samples in order to avoid occlusion of prognostic information.

In order to compensate for the technical dissimilarities, we normalised the size and grey levels of the nuclear images based on automatic detection of diploid cells in each sample (see Supplementary Methods for details). After normalisation, diploid cells in all samples had on average the same nuclear size and estimated amount of DNA.

Nucleotyping

The chromatin organisation in cell nuclei was automatically analysed by examining the spatial variations in DNA density within each nucleus. The observed spatial variations gave a single numerical value for each patient, termed the chromatin value, which was categorised into a two-group risk assessment of the patient.

For a given pixel in an image of a DNA-specifically stained cell nucleus, the spatial variation in DNA density was characterised by the entropy of the grey levels in the surrounding pixels. Entropy is a concept originating from thermodynamics, where it is used as a measure of disorder. In our context, it was applied to quantify the spatial disorder in DNA density. If the surrounding pixels have similar DNA densities, then the entropy will be low, while the entropy will be high in nuclear regions with diverse DNA densities. It is thus closely related to the organisation of euchromatin and heterochromatin in the nucleus and can be seen as a descriptor of either epigenetic phenotype or image texture.
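The per-pixel entropy can be illustrated with a minimal sketch. It assumes a square neighbourhood and Shannon entropy in bits; the neighbourhood shape and size used in the study are not stated in this section, and the function name and the `radius` parameter are hypothetical.

```python
import math
from collections import Counter


def local_entropy(image, row, col, radius=1):
    """Shannon entropy (bits) of grey levels in a square window around (row, col).

    Assumption: pixels falling outside the image are simply ignored.
    """
    height, width = len(image), len(image[0])
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    counts = Counter(values)
    total = len(values)
    # H = sum over grey levels of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


uniform = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]  # similar DNA densities
mixed = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]    # diverse DNA densities
print(local_entropy(uniform, 1, 1))  # zero entropy: all neighbours identical
print(local_entropy(mixed, 1, 1))    # maximal for nine distinct values: log2(9)
```

The two toy windows reproduce the behaviour described above: identical grey levels give zero entropy, and nine distinct grey levels give the maximum possible for a 3×3 window.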

The entropy in a nuclear region was paired with the grey level value at the region centre. This pair was computed for every pixel in the nuclear image, and a table was created to store the number of times each pair occurred; the procedure is illustrated in Supplementary Figure 1. This table gives the grey level entropy matrix (GLEM; Yogesan et al, 1996), which has shown diagnostic and prognostic significance (Jorgensen et al, 1996; Dunn et al, 2011; Nielsen et al, 2012, 2015).
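The counting procedure behind the GLEM might be sketched as below. The window size, the number of entropy bins and the binning rule are assumptions made for illustration, and all names are hypothetical; the definitions actually used follow Yogesan et al (1996) and the Supplementary Methods.

```python
import math
from collections import Counter


def window_entropy(image, row, col, radius):
    """Shannon entropy (bits) of the grey levels in a square window."""
    height, width = len(image), len(image[0])
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    total = len(values)
    return sum((n / total) * math.log2(total / n) for n in Counter(values).values())


def grey_level_entropy_matrix(image, radius=1, n_entropy_bins=4):
    """Count how often each (grey level, entropy bin) pair occurs in the image."""
    max_entropy = math.log2((2 * radius + 1) ** 2)  # upper bound for a full window
    table = Counter()
    for r, row_values in enumerate(image):
        for c, grey in enumerate(row_values):
            entropy = window_entropy(image, r, c, radius)
            # Assumed binning: equal-width bins over [0, max_entropy]
            entropy_bin = min(int(entropy / max_entropy * n_entropy_bins), n_entropy_bins - 1)
            table[(grey, entropy_bin)] += 1
    return table


flat_nucleus = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
glem = grey_level_entropy_matrix(flat_nucleus)
print(dict(glem))  # every pixel falls in the same (grey level, entropy bin) cell
```

Each pixel contributes exactly one count, so the table entries always sum to the number of pixels analysed.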

Previous studies on texture analysis have shown that important diagnostic and prognostic information is found in the proximity of the nuclear periphery (Young et al, 1986; Nielsen et al, 1999, 2001; Wei et al, 2011). To include radially dependent information, we stratified the GLEM computation on each of five consecutive 10% radial regions of the nuclei (Supplementary Figure 2). Joining the GLEMs computed for each of these five radial regions resulted in a three-way table termed the radial GLEM (RGLEM); see Supplementary Methods for details. This table will describe DNA content and spatial variations in DNA density near the nuclear periphery.
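One possible way to assign pixels to radial regions is sketched below. It rests on several assumptions not confirmed in this section: that the 10% refers to relative depth (distance to the nearest non-nuclear pixel divided by the largest such distance in the nucleus), that region 0 is the outermost, and that pixels deeper than the fifth region are excluded. The function and parameter names are hypothetical.

```python
import math
from collections import Counter


def radial_region_labels(mask, n_regions=5, region_width=0.10):
    """Map each nuclear pixel to a radial region index, or None if too deep.

    Assumption: depth is the Euclidean distance to the nearest background pixel,
    where a one-pixel frame around the image also counts as background.
    """
    height, width = len(mask), len(mask[0])
    background = [
        (r, c)
        for r in range(-1, height + 1)
        for c in range(-1, width + 1)
        if not (0 <= r < height and 0 <= c < width and mask[r][c])
    ]
    depth = {
        (r, c): min(math.hypot(r - br, c - bc) for br, bc in background)
        for r in range(height)
        for c in range(width)
        if mask[r][c]
    }
    deepest = max(depth.values())
    labels = {}
    for pixel, d in depth.items():
        index = int((d / deepest) / region_width)
        labels[pixel] = index if index < n_regions else None
    return labels


square_nucleus = [[True] * 21 for _ in range(21)]  # toy 21 x 21 "nucleus"
labels = radial_region_labels(square_nucleus)
print(Counter(labels.values()))  # pixel counts per region; None marks the interior
```

Computing one GLEM per region index and stacking the five tables would then give a three-way table of the kind described above.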

Cell nuclei of widely different size have been shown to express distinct spatial variations in DNA density (Nielsen and Danielsen, 2006). Merging the characteristics of all nuclei may therefore degrade the prognostic ability, but can be dealt with by stratifying or selecting nuclei based on their size. Preliminary results led us to select the small nuclei (<2000 pixels) for this study, or the 1% smallest nuclei for patients with no small nuclei (only one patient in the study cohort). Consequently, the RGLEM was calculated for about 80 (6%) nuclei in each sample (IQR, 29–112).

An established algorithm was applied to obtain a single numerical value from the RGLEM of each tumour block sample, and the value most indicative of recurrence was automatically selected to represent the patient (see Supplementary Methods). The patient value was categorised into two outcome groups, representing either the indication of recurrence or the absence of such, using the conventional minimum Euclidean distance classification method where the patient is classified to the outcome group with most similar mean patient value (Duda et al, 2000). The patient grouping resulting from this procedure is hereafter referred to as the Nucleotyping classification. Figure 1 depicts the steps required to assess the Nucleotyping classification for a new patient.
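For a single numerical value, minimum Euclidean distance classification reduces to comparing the value with the two group means, which is equivalent to a fixed threshold midway between them. A minimal sketch, with hypothetical function names and toy values:

```python
from statistics import mean


def train_minimum_distance(values, recurred):
    """Return (mean of non-recurring group, mean of recurring group)."""
    mean_no_recurrence = mean(v for v, y in zip(values, recurred) if not y)
    mean_recurrence = mean(v for v, y in zip(values, recurred) if y)
    return mean_no_recurrence, mean_recurrence


def indicates_recurrence(value, group_means):
    """Classify to the group whose mean is closest to the patient value."""
    mean_no_recurrence, mean_recurrence = group_means
    return abs(value - mean_recurrence) < abs(value - mean_no_recurrence)


# Toy learning set: three non-recurring and three recurring patients
group_means = train_minimum_distance([1, 2, 3, 7, 8, 9], [False, False, False, True, True, True])
print(group_means)                           # group means 2 and 8, so the threshold is 5
print(indicates_recurrence(4, group_means))  # closer to the non-recurring mean
print(indicates_recurrence(6, group_means))  # closer to the recurring mean
```

Once the group means are fixed from the learning set, classifying a new patient requires only the patient value.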

Figure 1

How to obtain the Nucleotyping classification of a prostate cancer patient. (A) After surgery, nuclei of the prostate cancer specimen are imaged. (B) Spatial variations in DNA density near the nuclear periphery are described by a single value, which is the chromatin value for the patient. (C) A fixed threshold is applied to obtain the estimated prognosis, either recurrence or no recurrence.

Training and validating Nucleotyping

Computing the Nucleotyping classification requires a patient set because it applies an adaptive algorithm, that is, an algorithm that is automatically tuned to a patient cohort. The patients utilised to train a classifier cannot subsequently be applied to realistically evaluate its performance (Schulerud et al, 1998; Nielsen et al, 2008). We have therefore divided the patient material into a learning data set (n=154) and a validation data set (n=153) by systematically sampling every other patient after sorting by surgery date. The Nucleotyping classifier was adapted to the learning set and blindly applied to the validation set, providing an unbiased estimate of its predictive accuracy. Only a single classifier was validated, and thereby the multiple comparisons problem was avoided (Miller, 1981).
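The systematic split can be sketched as below, using synthetic surgery dates. The record layout and function name are hypothetical; assigning the first sorted patient to the learning set is an assumption, although it is consistent with the reported set sizes.

```python
from datetime import date, timedelta


def systematic_split(patients):
    """Sort by surgery date and assign every other patient to each set."""
    ordered = sorted(patients, key=lambda p: p["surgery_date"])
    return ordered[0::2], ordered[1::2]


# Synthetic cohort of 307 patients with made-up surgery dates
cohort = [
    {"patient_id": i, "surgery_date": date(1987, 1, 1) + timedelta(days=20 * i)}
    for i in range(307)
]
learning_set, validation_set = systematic_split(cohort)
print(len(learning_set), len(validation_set))  # sizes of the two sets
```

Sampling alternately along the surgery date keeps the two sets balanced over calendar time, which matters here because imaging equipment and sample preparation changed during the study period.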

Repeated random sampling was applied to reliably estimate the expected validation performance of the Nucleotyping classifier (Michiels et al, 2005). The combined learning and validation set was randomly divided into new equally sized learning and validation sets (1000 repetitions). The Nucleotyping classifier was adapted to each resampled learning set, which is crucial to avoid overoptimism (Ambroise and McLachlan, 2002; Schulerud and Albregtsen, 2004), and evaluated in the corresponding, disjoint validation set.
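The resampling loop might look like the sketch below. The classifier and accuracy measure are toy stand-ins (a midpoint threshold and the correct classification rate), the data are synthetic, and all names are hypothetical; the essential point is that the classifier is refitted inside every repetition.

```python
import random
from statistics import mean, median


def repeated_random_sampling(patients, fit, evaluate, repetitions=1000, seed=0):
    """Refit on a random half and evaluate on the disjoint other half, repeatedly."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repetitions):
        shuffled = patients[:]
        rng.shuffle(shuffled)
        half = (len(shuffled) + 1) // 2
        learning, validation = shuffled[:half], shuffled[half:]
        model = fit(learning)  # adapted to this learning set only
        scores.append(evaluate(model, validation))
    return scores


def fit_midpoint(learning):
    """Toy classifier: threshold midway between the two group means."""
    recurring = mean(v for v, y in learning if y)
    non_recurring = mean(v for v, y in learning if not y)
    return (recurring + non_recurring) / 2


def correct_rate(threshold, validation):
    return sum(1 for v, y in validation if (v > threshold) == y) / len(validation)


# Synthetic, perfectly separable patients: (value, recurred)
toy = [(v, False) for v in range(20)] + [(v, True) for v in range(100, 120)]
scores = repeated_random_sampling(toy, fit_midpoint, correct_rate, repetitions=100)
print(median(scores))
```

Fitting once on all patients and only resampling the evaluation step would leak outcome information into the classifier and inflate the estimate.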

DNA ploidy analysis

The grey level of a pixel is directly related to the DNA content at the pixel position. Adding the DNA content measurements of each nuclear pixel gives the total amount of DNA in the nucleus. This estimated DNA content of each cell in a tumour block sample was plotted as a histogram (see Supplementary Methods for details). Specially trained personnel classified these histograms as diploid, tetraploid or aneuploid according to specific criteria (see Pretorius et al, 2009).
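The summation and histogram steps can be sketched as follows, assuming the pixel values have already been converted to per-pixel DNA content measurements. The function names, the bin width and the toy values are hypothetical.

```python
def integrated_dna_content(dna_per_pixel, nucleus_mask):
    """Sum the per-pixel DNA content over all pixels inside the nucleus."""
    return sum(
        value
        for value_row, mask_row in zip(dna_per_pixel, nucleus_mask)
        for value, inside in zip(value_row, mask_row)
        if inside
    )


def dna_content_histogram(contents, bin_width):
    """Count nuclei per DNA content bin; keys are the lower bin edges."""
    histogram = {}
    for content in contents:
        bin_start = int(content // bin_width) * bin_width
        histogram[bin_start] = histogram.get(bin_start, 0) + 1
    return dict(sorted(histogram.items()))


pixels = [[0, 2, 0], [3, 4, 1], [0, 2, 0]]
mask = [[False, True, False], [True, True, True], [False, True, False]]
content = integrated_dna_content(pixels, mask)
histogram = dna_content_histogram([content, 13, 24, 11], bin_width=5)
print(content)
print(histogram)
```

In such a histogram, a dominant peak with a smaller peak at twice its DNA content is the pattern expected of a diploid population with cycling cells; the formal classification criteria are those of Pretorius et al (2009).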

CAPRA-S

The CAPRA-S score is the sum of category points assigned to PSA, Gleason score, SM, ECE, SVI and LNI, and can be used directly or grouped into low (score 0–2), intermediate (score 3–5) and high (score 6–12) risk of recurrence (Cooperberg et al, 2011). We integrated the Nucleotyping classification and the exact CAPRA-S score to obtain a Nucleotyping-pathological-classifier (NPC); based on the learning set, it was defined as the CAPRA-S score plus three if Nucleotyping indicated recurrence.
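The CAPRA-S grouping and the NPC score definition translate directly into code. The sketch below takes an already computed CAPRA-S score as input; the function names are hypothetical.

```python
def capra_s_risk_group(score):
    """Group a CAPRA-S score into low (0-2), intermediate (3-5) or high (6-12) risk."""
    if not 0 <= score <= 12:
        raise ValueError("CAPRA-S score must be between 0 and 12")
    if score <= 2:
        return "low"
    if score <= 5:
        return "intermediate"
    return "high"


def npc_score(capra_s_score, nucleotyping_indicates_recurrence):
    """NPC score: CAPRA-S score plus three if Nucleotyping indicates recurrence."""
    return capra_s_score + (3 if nucleotyping_indicates_recurrence else 0)


print(capra_s_risk_group(4))  # intermediate
print(npc_score(4, True))     # 4 + 3
print(npc_score(4, False))    # unchanged CAPRA-S score
```

Because the CAPRA-S score ranges from 0 to 12, the NPC score defined this way ranges from 0 to 15.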

Statistical analyses

Recurrence was used as the end point in survival analysis, and time to recurrence was calculated from surgery to recurrence, death or 31 December 2008, whichever came first. The estimated survival functions were compared using the Mantel–Cox log-rank test in univariate analysis of categorical variables and Wald’s χ2-test in univariate analysis of continuous variables and in multivariate analysis. The proportional hazards assumption was assessed using Schoenfeld residuals and found acceptable. A model’s ability to predict patient outcome was measured with the concordance index by Harrell et al (1982), and the average optimism estimate from 10 000 bootstrapped samples was applied to adjust the concordance index. The difference in concordance index between two models was assessed by the sign test on the bootstrapped samples. Decision curve analysis was performed to compare the net benefit of different models (Vickers and Elkin, 2006). Associations were evaluated using Pearson’s χ2-test and Mann–Whitney’s U-test. Patients with missing values for at least one included variable were excluded from the analysis. Statistical significance was defined as two-sided P<0.05. MATLAB 2012b (The MathWorks, Natick, MA, USA) was used for Nucleotyping, whereas survival analyses were carried out in Stata/SE 14.0 (StataCorp, College Station, TX, USA) and R v.3.1.3 (http://www.r-project.org).
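The concordance index for censored data can be sketched as below. This is a simplified illustration rather than the Stata or R routines used in the study: pairs with tied follow-up times are skipped, tied risk scores count as half, and the function name and toy data are hypothetical.

```python
def harrell_c_index(times, events, risks):
    """Fraction of usable patient pairs in which the higher risk recurs first.

    A pair is usable when the patient with the shorter time had an observed event.
    """
    concordant = 0.0
    usable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / usable


# Toy data: follow-up in years, event indicator (1 = recurrence), risk score
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [9, 7, 5, 3]))  # risk ordering matches outcome
print(harrell_c_index(times, events, [3, 5, 7, 9]))  # risk ordering reversed
print(harrell_c_index(times, events, [1, 1, 1, 1]))  # uninformative score
```

A value of 0.5 corresponds to an uninformative score and 1.0 to perfect ranking, which gives a scale for the indices of 0.71 to 0.78 reported in the Results.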

Results

Clinical and pathological characteristics of the studied patient cohort are summarised in Table 1. Few patients (6%) had low Gleason scores (≤6) and the majority (67%) had the intermediate Gleason score (7). All CAPRA-S scores were observed, but about half the patients (49%) were high-risk by CAPRA-S (score ≥6). The vast majority of the analysed small nuclei were of diploid cells in G0 or G1 phase, but all cell cycle stages and ploidy types were represented (Supplementary Figure 3).

Table 1 Characteristics of the studied patients

Nuclear texture analysis near the periphery of the smallest cell nuclei was significant for the validation cohort in univariate analysis (P=0.004; hazard ratio (HR) with 95% confidence interval (CI)=2.3 (1.3–4.1)) and when stratified on each CAPRA-S score (P=0.016; HR=2.3 (1.2–4.6)). Survival curves grouped by the Nucleotyping classification are shown in Supplementary Figure 4. The median (95% CI) sensitivity, specificity and correct classification rate of our epigenetic marker were 70% (54–83%), 65% (55–74%) and 67% (59–73%) in repeated random sampling. Similarly, the sensitivity, specificity and correct classification rate were 73%, 66% and 68% in the combined learning and validation set.
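For reference, the three classification measures can be computed as in the sketch below. It treats recurrence as a binary outcome and therefore ignores censoring; the function name and the toy data are hypothetical.

```python
def classification_rates(predicted_recurrence, observed_recurrence):
    """Sensitivity, specificity and correct classification rate for binary outcomes."""
    pairs = list(zip(predicted_recurrence, observed_recurrence))
    true_pos = sum(1 for p, o in pairs if p and o)
    true_neg = sum(1 for p, o in pairs if not p and not o)
    false_pos = sum(1 for p, o in pairs if p and not o)
    false_neg = sum(1 for p, o in pairs if not p and o)
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "correct_classification_rate": (true_pos + true_neg) / len(pairs),
    }


predicted = [True, True, False, False, True, False]
observed = [True, False, False, False, True, True]
rates = classification_rates(predicted, observed)
print(rates)
```

Sensitivity is the proportion of recurring patients for whom the marker indicated recurrence, and specificity is the proportion of non-recurring patients for whom it did not.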

In the combined set, Nucleotyping was significant in univariate analysis (P<0.001; HR=3.3 (2.1–5.1)) and with adjustment for CAPRA-S (P<0.001; HR=2.5 (1.5–4.1)). The epigenetic marker and CAPRA-S showed low positive correlation (Spearman’s ρ=0.36, P<0.001). Medians (IQRs) for CAPRA-S and NPC were 5 (4–8) and 7 (4–10). Their concordance indices after bootstrap optimism adjustment were 0.76 and 0.78, respectively, and the difference was significant in internal validation (P<0.001).

Figure 2A and B illustrate that our epigenetic marker substratified patients who were high-risk by clinical and pathological characteristics (CAPRA-S score ≥6: P=0.001, HR=2.4 (1.4–4.0)) and also patients at low or intermediate risk (CAPRA-S score ≤5: P=0.010, HR=3.5 (1.3–9.6)). This additional prognostic information could be incorporated in an NPC risk-stratification tool by letting Nucleotyping separate the intermediate- and high-risk CAPRA-S groups, as visualised in Figure 3. The bootstrap optimism-adjusted concordance index for the NPC risk groups was 0.75, which was 0.045 higher than for the CAPRA-S risk groups (P<0.001). Figure 4A displays the increased net benefit of the NPC risk stratification.

Figure 2

Kaplan–Meier curves of recurrence probability after radical prostatectomy grouped by Nucleotyping. (A) Patients with high CAPRA-S (score ≥6). (B) Patients with low or intermediate CAPRA-S (score ≤5). (C) Patients with low Gleason score (≤6). (D) Patients with Gleason score 3+4. (E) Patients with Gleason score 4+3. (F) Patients with high Gleason score (≥8). (G) Patients with diploid DNA ploidy type. (H) Patients with non-diploid DNA ploidy type. Abbreviations: CAPRA-S=Cancer of the Prostate Risk Assessment Postsurgical; NNR=Nucleotyping indicated no recurrence; NR=Nucleotyping indicated recurrence.

Figure 3

Flowchart illustrating the definition of the three NPC risk groups. Abbreviations: CAPRA-S=Cancer of the Prostate Risk Assessment Postsurgical; NPC=Nucleotyping-pathological-classifier.

Figure 4

Survival decision curves for predicting 10-year post-radical prostatectomy time to recurrence (TTR) for (A) all patients and (B) patients with Gleason score 7. CAPRA-S and NPC, both grouped in three risk groups, were converted into 10-year TTR probabilities before estimating net benefit. The integrated model demonstrates a higher net benefit. Abbreviations: CAPRA-S 3-group=the three Cancer of the Prostate Risk Assessment Postsurgical risk groups; NPC 3-group=the three Nucleotyping-pathological-classifier risk groups.

The prognostic impact of Nucleotyping within each Gleason category is shown in Figure 2C–F. Except in the small group of patients with Gleason score ≤6, which comprised 18 patients of whom only one had a recurrence, the epigenetic marker provided additional prognostic information in each Gleason category, with statistical significance for Gleason scores 3+4 and 4+3.

DNA ploidy was statistically significant in univariate analysis (P=0.010; HR=1.7 (1.1–2.5)), but not when stratified on each CAPRA-S score (P=0.83; HR=0.95 (0.61–1.48)). Nucleotyping was able to identify patients at increased risk of recurrence in each DNA ploidy group (Figure 2G and H). Univariate and multivariate analysis of all candidate markers are shown in Supplementary Table 1.

Patients with Gleason score 7

Nucleotyping was significant among patients with Gleason score 7 with adjustment for CAPRA-S (validation: P=0.006, HR=4.3 (1.4–13.1); combined set: P=0.002, HR=2.9 (1.4–5.8)). The bootstrap optimism-adjusted concordance index for the NPC score was 0.77 in this subgroup of the combined set, compared with 0.74 for the CAPRA-S score, a difference which was significant (P<0.001). Correspondingly, the three NPC risk groups gave a concordance index of 0.75, while the CAPRA-S risk groups gave a significantly lower index of 0.71 (P<0.001). The NPC risk stratification improved the net benefit compared with the three CAPRA-S risk groups (Figure 4B).

Heterogeneity analysis

In total, 199 (19%) tumour block samples indicated recurrence according to Nucleotyping; for any given patient, at most four samples indicated recurrence (Supplementary Figure 5). No samples indicated recurrence in 164 (53%) patients, who were the patients that Nucleotyping classified as not recurring. Although the epigenetic marker correctly identified 73% of the recurring patients, only 31% of their samples indicated recurrence. Typically, Nucleotyping indicated patient recurrence because of a single tumour block sample, but multiple samples indicating recurrence correlated with worse patient outcome (one vs multiple samples indicating recurrence: P=0.006, HR=1.9 (1.2–3.0); Supplementary Figure 6).

Sensitivity analyses

The prognostic value of Nucleotyping was robust to alterations in the applied definition of small cell nuclei and changes in number of analysed cells (Supplementary Figure 7). In addition, the epigenetic marker was a significant predictor of metastasis (P=0.003, HR=2.4 (1.3–4.3)) and cancer-specific survival (P=0.013, HR=2.6 (1.2–5.9)).

Discussion

Nucleotyping by texture analysis near the periphery of small cell nuclei was an independent prognostic marker in prostate cancer in both the validation set and the combined learning and validation set. The expected validation performance measured by repeated random sampling was similar to the performance in the combined set, justifying the use of the combined set for analysis.

Our novel epigenetic marker significantly enhanced CAPRA-S both as a score and as a three-group risk stratification. The integrated model was superior to CAPRA-S, as demonstrated by significantly increased concordance with patient outcome and improved net benefit (Figure 4). In particular, there was strong evidence for improved prediction of recurrence for patients at high risk by clinical and pathological characteristics (CAPRA-S score ≥6; Figure 2A).

Even though the Gleason scores were centrally reviewed by an experienced uropathologist, Nucleotyping substratified patients with intermediate Gleason score, either 3+4 or 4+3 (Figure 2D and E). Merging Nucleotyping with the three CAPRA-S risk groups markedly improved the prediction accuracy for these patients, increasing both the concordance index (from 0.71 to 0.75) and the net benefit. Proper identification of low-risk patients with Gleason score 7 tumours is challenging, but important, as illustrated by the difficulties in distinguishing Gleason score 3+4 patterns from 4+3 and the debate on including Gleason score 3+4 patients in active surveillance programmes. Our epigenetic marker thus meets a particular need for improved prognostication in this patient subgroup.

DNA ploidy analysis has been shown to provide prognostic information in prostate cancer (Pretorius et al, 2009; Böcking et al, 2014; Danielsen et al, 2015). In this patient series, we found DNA ploidy to be a significant prognostic marker in univariate analysis and that Nucleotyping provides additional information and is an even stronger prognostic marker (Figure 2G and H; Supplementary Table 1).

Marked heterogeneity was observed in our patient cohort. Among the recurring patients, only 31% of their samples had epigenetic characteristics indicating recurrence even though Nucleotyping correctly classified 73% of them. The risk of patient recurrence gradually increased with the number of samples indicating recurrence according to our epigenetic marker (Supplementary Figures 5 and 6), suggesting that the presence of similar epigenetic aberrations at multiple foci correlates with a more advanced disease. Further investigations are warranted to validate this prognostic trend and identify its mechanistic foundation.

Robustness is essential for a clinically applicable prognostic marker. The studied small cell nuclei constituted only 6% of all epithelial nuclei. Sensitivity analyses showed that Nucleotyping was robust to changes in number and size of analysed nuclei (Supplementary Figure 7). Moreover, the epigenetic marker delineated the patient outcome in analyses of time to metastasis and cancer-specific survival, demonstrating that the marker also correlates with later and harder oncologic endpoints.

The observed DNA density patterns in the proximity of the nuclear periphery (data not shown) were in accordance with observations by Nielsen et al (2001) in early ovarian cancer. Patients with recurrence were associated with decreased DNA density near the periphery of the nuclei and increased DNA density in the nuclear interior. As peripheral chromatin is typically more condensed in normal cell nuclei than interior chromatin, this agrees with observations that the radial arrangement is reduced in tumour cell nuclei (Cremer et al, 2003; Zuleger et al, 2011).

Although it is preferable that only a single tumour block sample is needed to assess a biomarker, the heterogeneity in prostate cancer implies that multiple sampling may be essential to obtain the best prognostication. Indeed, most pathological assessments, including Gleason grading, utilise all samples, and we have therefore applied multiple samples to assess the Nucleotyping classification.

Extensive PSA testing in recent years has led to earlier diagnosis and the discovery of disease that would previously not have been detected. Current prostate cancer patients thus have fewer and later recurrences and less aggressive clinical and pathological characteristics. The majority of patients included in this study were diagnosed prior to extensive PSA testing. The high recurrence rate and long follow-up in the present patient cohort make it particularly suitable for analysing patients whose prostate cancer is more advanced by today’s standards.

In conclusion, Nucleotyping by texture analysis near the periphery of the smallest cell nuclei is an objective and independent prognostic marker in prostate cancer. It was integrated with CAPRA-S and then significantly enhanced the risk-stratification of patients treated with radical prostatectomy. The improvement was most distinct for patients with high-risk CAPRA-S scores and for patients with Gleason score 7. Nucleotyping may therefore be applicable for treatment management of prostate cancer patients. An independent multicentre trial to validate the prognostic impact is warranted.