Main

Over 238 000 men are estimated to be diagnosed with prostate cancer in the United States this year, with 29 000 dying from the disease (American Cancer Society, 2013). Nearly 85% of these cancers will be detected by prostate-specific antigen (PSA) screening with diagnostic PSA level above 4.0 ng ml−1 (Glass et al, 2013). Two recently reported studies showed disparate findings with regard to the impact of PSA screening on prostate cancer-specific mortality, yet both showed that the single biggest risk of screening is overdetection (Andriole et al, 2009; Schröder et al, 2009). It is estimated that 10–56% of the cases of diagnosed prostate cancer would be ‘insignificant’ and not have had a meaningful clinical impact if they had remained undetected/untreated (Draisma et al, 2009). In low-risk patients, the Prostate Cancer Intervention versus Observation Trial demonstrated that no survival benefit can be observed for radical prostatectomy vs active surveillance (AS) after 10 years of median follow-up (Wilt et al, 2012). Despite this latter finding, the vast majority of men with prostate cancer are actively treated with surgery, radiation, androgen ablation, or all three (Cooperberg et al, 2010). This holds true even for those cancers deemed ideal for surveillance or less aggressive treatments (Barocas et al, 2008).

Overdiagnosed prostate cancers can cause a negative impact on the quality-adjusted life years following PSA screening (Heijnsdijk et al, 2012). It is difficult to assure a patient that a low-risk cancer at biopsy is truly ‘insignificant’ as biopsies inherently undersample the prostate (Harnden et al, 2008; Shaw et al, 2014). Using contemporary AS protocols and standard multicore biopsy techniques, patients with no evidence of a high Gleason score cancer on biopsy still carry a significant risk of having a high-grade tumour when the prostate is removed and sampled completely (Iremashvilli et al, 2012). This is troubling as a number of clinical end points have been linked to Gleason score, including TNM stage, response to different therapies, biochemical failure, progression to metastatic disease, and survival, including prostate cancer-specific survival and overall survival (Partin et al, 1995, 2001; Kattan et al, 1998, 2003; Freedland et al, 2003; Cooperberg et al, 2005). A recent study reports that 36% of tumours with a Gleason score of 6 on biopsy are upgraded at radical prostatectomy to a Gleason score of 7 or higher (Epstein et al, 2012). A similar upgrading rate of 34% as well as an upstaging rate of 13% was also reported in both the African American and other racial groups (Jalloh et al, 2014).

Given the inherent limitations of biopsies, additional means are required to assess tumour aggressiveness more accurately. Prostate cancer-specific methylation patterns detected in urine samples provide a potential source of additional information beyond what can be assessed by biopsy. In order to demonstrate that urine-sourced methylation patterns can discriminate between aggressive and low-risk tumours, we developed a urine-based multiplex methylation-specific PCR assay. The assay measures the copy number of β-Actin as a housekeeping gene and the methylation status of GSTP1 and APC, both of which have been shown previously to be associated with prostate cancer when detected in urine (Baden et al, 2011).

Initially, we used tissue specimens to test the ability of the assay biomarkers to discriminate Gleason scores in the primary tumour. We then used the assay to examine the methylation status of these markers in urine from men undergoing prostate cancer screening in order to develop the prediction model. The objective of the validation phase was to accurately predict adverse disease at radical prostatectomy (surgical Gleason ≥7 and/or pathological stage ≥T3a) using the assay on urine specimens taken before surgery. Should the assay significantly improve risk stratification, it would improve the accuracy of identifying men who do not harbour aggressive prostate cancer and can safely avoid unnecessary therapy.

Material and methods

Patient population and sample processing

To directly measure the assay biomarkers in primary tumour tissue, a retrospective collection of 104 formalin-fixed paraffin-embedded (FFPE) radical prostatectomy tumour samples was obtained from men who underwent radical prostatectomy at the Durham VA Medical Center. After evaluating the β-Actin control, 100 of these had a valid test result. There were no restrictions on the total PSA level or any other clinical parameters.

The biopsy cohort used to train the prediction models came from a prospective trial among 18 urology clinics across the United States and included 683 men. Subjects were enrolled in this study between September 2008 and January 2009. Inclusion requirements were age between 40 and 75, total PSA levels between 2.0 and 10.0 ng ml−1, and the completion of a digital rectal examination (DRE). Among men diagnosed with prostate cancer (n=263), pathological assessments from radical prostatectomy specimens were available for 39 men.

The pre-prostatectomy cohort used for validation of the assay contained 99 male subjects from four urology clinics. Enrolment in this study commenced in September 2008 and was completed by August 2012. Subject requirements were 40–75 years of age, total PSA level between 2.0 and 10.0 ng ml−1, completion of a DRE, a biopsy that confirmed prostate cancer, and a scheduled radical prostatectomy without neoadjuvant therapy.

For all the subjects in both the biopsy and pre-prostatectomy cohorts, 5–40 ml urine samples were collected following a DRE, and were processed as described previously (Baden et al, 2011). Following urine sample collection, subjects in the biopsy cohort underwent TRUS-guided needle biopsy of at least 10 cores. As such, for the biopsy cohort, urine samples were collected before biopsy. The urine samples were processed using the methylation assay. A valid assay test result was determined for 665 subjects in the biopsy cohort and 96 subjects in the pre-prostatectomy collection. A valid test result required that the amount of β-Actin in the sample was sufficient to produce an RT–PCR cycle threshold of <25 cycles.

Methylation assay

Methylation status was determined as previously described (Baden et al, 2011). Primers and Scorpion probes for two methylation markers (GSTP1 and APC) and an internal control (β-Actin) were chosen for use in a closed-tube format two-step PCR assay. Internal and external controls were used in each run: the internal control (β-Actin) to determine the adequacy of the DNA quantity and preparation, and the external control (targeted plasmid DNA) to confirm a lack of environmental contamination and to ensure that reaction mixtures were functioning appropriately. Sample results that were not valid were omitted from the analyses.

Statistical analyses

The prediction model is a logistic model developed in the biopsy cohort by fitting the biopsy cohort values of β-Actin, GSTP1, APC, age, DRE, and PSA to the binary response of a biopsy Gleason score of 7 or higher vs Gleason score of 6 or lower or no cancer. Values for APC were allowed to be transformed by a restricted cubic spline as directed by the response variable, whereas β-Actin, GSTP1, PSA, and DRE were not subjected to any transformations. The calibration between the outputs of the model and the observed probabilities was evaluated using bias-corrected estimates of the risk scores vs the observed values based on nonparametric smoothers. A logistic regression model using untransformed continuous PSA, age, and DRE measurements without β-Actin, GSTP1, and APC was also designed in the same manner (Harrell, 2001).
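As a reading aid, the model described above can be written in the following general form. The coefficient symbols and the spline function f are notation introduced here for illustration; the fitted values are not reported in the text, and the linear term for age is an assumption, since the text does not state whether age was transformed.

```latex
\[
\Pr(\text{biopsy Gleason} \ge 7 \mid x) = \frac{1}{1 + \exp\{-\eta(x)\}},
\qquad
\eta(x) = \beta_0 + \beta_1\,\text{Actin} + \beta_2\,\text{GSTP1} + f(\text{APC})
        + \beta_3\,\text{age} + \beta_4\,\text{DRE} + \beta_5\,\text{PSA}
\]
```

Here f denotes the restricted cubic spline applied to APC, and the clinical comparison model corresponds to dropping the β-Actin, GSTP1, and APC terms.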

Using the prediction model scores from men in the biopsy cohort, a dichotomous cutoff was chosen to select the bottom 20% of the low-risk scores, which was found to maximise the negative predictive value (NPV) for a biopsy Gleason ≥7. The prediction models were then applied to an independent pre-prostatectomy validation cohort (n=96) to test their ability to discriminate between adverse (surgical Gleason ≥7 and/or stage ≥pT3a) and low-risk (surgical Gleason ≤6 and stage ≤pT2c) disease on final pathology using the pre-established cutoff from the biopsy cohort.
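The cutoff selection described above can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis was done in R and SAS); the function names, the risk scores, and the outcomes below are all hypothetical, and the sketch simply takes the score at the 20th percentile as the cutoff and computes the NPV among subjects at or below it.

```python
def cutoff_at_fraction(scores, fraction=0.20):
    """Return the score at or below which roughly `fraction` of subjects fall."""
    ordered = sorted(scores)
    k = max(1, int(len(ordered) * fraction))  # size of the low-risk group
    return ordered[k - 1]


def negative_predictive_value(scores, is_high_grade, cutoff):
    """NPV = true negatives / all test-negatives (score <= cutoff)."""
    negatives = [hg for s, hg in zip(scores, is_high_grade) if s <= cutoff]
    if not negatives:
        raise ValueError("no subjects at or below the cutoff")
    true_negatives = sum(1 for hg in negatives if not hg)
    return true_negatives / len(negatives)


# Hypothetical model scores and biopsy outcomes (True = biopsy Gleason >= 7).
scores = [0.02, 0.04, 0.10, 0.15, 0.22, 0.30, 0.41, 0.55, 0.70, 0.88]
high_grade = [False, False, False, True, False, False, True, False, True, True]

cutoff = cutoff_at_fraction(scores, 0.20)
npv = negative_predictive_value(scores, high_grade, cutoff)
print(f"cutoff={cutoff}, NPV={npv:.2f}")
```

With tied scores, more than 20% of subjects may fall at or below the cutoff; a real analysis would need an explicit rule for ties.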

Bar charts and box plots were generated using MedCalc (MedCalc Software, Ostend, Belgium). Binomial confidence limits were calculated using the Wilson method in SAS software (SAS Institute Inc., Cary, NC, USA). Confidence intervals (CIs) of the prevalence-adjusted NPV were determined as outlined in Pepe (2004). Clinical variables were entered into a preoperative nomogram trained to predict organ-confined disease (available at nomograms.org); the output probabilities of this model were then used to make adverse disease predictions (Wang et al, 2006). Logistic modelling and all other statistical analyses were done using R (R Foundation for Statistical Computing, Vienna, Austria). P-values for group comparisons were determined by the two-sided Student’s t-test. The area under the receiver operating characteristic curve (AUC) was calculated using the Wilcoxon–Mann–Whitney test statistic.
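Two of the calculations named above, the Wilson binomial interval and the AUC obtained from the Wilcoxon–Mann–Whitney statistic, can be sketched in a few lines of Python. This is an illustration only: the authors used SAS and R, the function names are invented here, and the counts and scores are hypothetical rather than study data, so the output is not expected to reproduce the intervals reported in this paper.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default: 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def auc_mann_whitney(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    subject scores higher; ties count as half a win."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical example: 16 of 16 test-negative subjects free of adverse disease.
low, high = wilson_interval(16, 16)

# Hypothetical scores for subjects with and without adverse disease.
auc = auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(f"Wilson 95% CI: {low:.3f}-{high:.3f}; AUC: {auc:.3f}")
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for cohorts of this size; a rank-based formulation gives the same value more efficiently.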

Results

Formalin-fixed paraffin-embedded tissue evaluation of the radical prostatectomy specimens

Initial assessments of the predictive power of the methylated regions of GSTP1 and APC were made by evaluating the FFPE prostate cancer tissue of 100 radical prostatectomy subjects from the Durham VA Medical Center. Clinical and pathological characteristics of the patients are described in Table 1. GSTP1 and APC were found to have a significant increase in hypermethylation in patients with a surgical Gleason score ≥7 compared with those with a surgical Gleason score <7 (P<0.001 and P=0.024, respectively). Furthermore, the magnitude of this difference in methylation levels increased for both markers when the Gleason score <7 group was compared with samples that had both a Gleason score ≥7 and a tumour stage ≥pT3a (P<0.001 and P=0.002, respectively; Figure 1A).

Table 1 Clinicopathological characteristics of the study population
Figure 1

The relationship between GSTP1 and APC methylation status in FFPE tissue samples and patients with a Gleason score <7, a Gleason score ≥7, and a Gleason score ≥7 with a pathological tumour stage of at least T3a is represented by box plots in panel A. The quantitative PCR cycle threshold values are inversely related to the degree of hypermethylation. The assay scores calculated from urine results in the biopsy cohort are shown in panel B as a stacked bar chart representing the percentage of men with a biopsy Gleason ≥7, a biopsy Gleason <7, or no cancer. Bins were selected to contain an equivalent number of patients in each bin.

Biopsy cohort

Given the proof of principle that, at least on the tissue level, GSTP1 and APC could segregate low- and high-risk tumours, we then developed a multicentre biopsy cohort to construct the prediction model. Clinicopathological characteristics for these men are shown in Table 1. On biopsy, 263 of the 665 men with usable assay scores (40%) were diagnosed with prostate cancer and 120 (18%) had a Gleason score ≥7. Consistent with the association of the methylation status with adverse pathology at prostatectomy in the FFPE collection, the test score from the prediction model was significantly higher in men with a Gleason score ≥7 vs Gleason <7 or non-cancer preceding biopsy (P<0.001). As the individual’s test score increased, the probability of Gleason score ≥7 cancers increased with no appreciable increase in the risk of low-grade cancer (Figure 1B).

The prediction model trained with both the biomarkers and the clinical factors had a significantly increased AUC for predicting a Gleason score ≥7 on biopsy compared with a prediction model using the same clinical factors without the biomarkers (0.82 vs 0.69, P<0.001) in the biopsy cohort. A cutoff was chosen as a point that maximises the NPV of the assay’s ability to predict a biopsy Gleason score ≥7. At the cutoff that dichotomizes the bottom 20% of the low-risk scores, there is a 5% predicted probability of having a high biopsy Gleason score. At this cutoff, 20% of the patients in the biopsy cohort are classified as low risk with an NPV of 100% and there is a <5% predicted probability of a biopsy Gleason ≥7. The predicted probabilities of the model are well calibrated to the observed probabilities with a mean absolute error of 0.009.

Pre-prostatectomy cohort

To validate the model and cutoff, we prospectively tested the urine obtained from 96 subjects undergoing radical prostatectomy. The probability of high-grade cancer and adverse pathological features increased with an individual’s test score from the prediction model (Figure 2A and B). When combined with the Gleason result from biopsy, the test score accurately stratified men with adverse disease at prostatectomy (AUC=0.89), and outperformed the clinical model alone (AUC=0.79; Figure 3). Among the 64 subjects that had a biopsy Gleason <7, a publicly available preoperative nomogram designed to predict organ-confined disease had an AUC of 0.73 when used to predict adverse disease, whereas the assay model’s AUC remained high at 0.83.

Figure 2

Distribution of the urine-based assay test score in the pre-prostatectomy validation set and its association with the surgical Gleason score and adverse disease. The assay test score was divided into three bins (low, medium, and high), where the low bin represents the biopsy cohort established cutoff, and the medium and high bins were then chosen to contain an equivalent number of patients in each bin. In panel A the distribution is coloured by either the adverse or low-risk disease category, and in panel B the distribution is coloured by the surgical Gleason score.

Figure 3

Receiver operating characteristic (ROC) curves of the assay test score for the prediction of a surgical Gleason score ≥7 or a pathological stage of T3a or higher.

Of the subjects with a biopsy Gleason score <7, 48% were found to have adverse disease following surgery. When the test score from the prediction model was evaluated as a dichotomous variable using the cutoff determined from the biopsy cohort, with the criterion that a biopsy Gleason score ≥7 is categorised as adverse, it correctly identified 100% of the subjects with either a surgical Gleason score of 7 or greater, or a pathological tumour stage of T3a or greater (Table 2). When used to classify subjects with a biopsy Gleason <7, the NPV of the assay for adverse disease was 100% (95% CI: 86–100%).

Table 2 Assay and clinical prediction model performance data

Discussion

Many diagnosed prostate cancers are slow growing or indolent. So long as the tumour remains in that state, it does not imminently threaten the health of the patient. The European Randomized study of Screening for Prostate Cancer (ERSPC) trial determined that about 30% of the detected prostate cancers remain unlikely to progress to a point where they will cause a patient’s death (Schröder et al, 2012). At the same time, the ERSPC trial reported a 20% reduction in prostate cancer mortality at 11 years from PSA screening. The Göteborg PSA Screening trial reported a 44% reduction in prostate cancer mortality over 14 years (Hugosson et al, 2010). The resource-intensive nature of PSA screening becomes clear when one looks at the number of men who need to be screened to lower the mortality rate. ERSPC reported 1.07 fewer deaths from prostate cancer per 1000 men screened. Nevertheless, it is clear from the high frequency of aggressive treatment of low-risk prostate cancer that patients and their physicians are not favouring AS or less aggressive treatments (Dall’Era et al, 2008).

To address this, a useful test should not only show significance in its ability to distinguish patients with and without adverse disease but must also be clinically actionable. A recent study looked at various nomograms via regression analysis, adjusted for the total number of biopsies, and concluded that the AUCs only ranged from 0.52 to 0.67, and for any progression the AUCs ranged from 0.52 to 0.70 (Wang et al, 2013). Although this is not a pure prediction of harbouring higher grade and/or stage from biopsy to RP, it highlights the point that nomograms purported to predict indolent disease per se do not function well enough, and it sets the stage for the use of biomarkers of whatever form. Coupled with low-risk clinical factors, a high NPV with this methylation-based test should increase the confidence that a low-risk patient indeed has low-risk disease and provide added reassurance that AS and avoiding treatments will not negatively impact survival. In our validation cohort of men undergoing radical prostatectomy, 25% of the subjects with grade 6 or below on biopsy and 17% of the entire cohort could have been recommended against aggressive definitive treatment following biopsy using the assay, with an NPV of adverse findings at surgery of 100% (95% CI: 86–100%) among these men.

The noninvasive nature of this assay provides an opportunity for monitoring without the risks associated with repeat biopsy. With each biopsy, there is a significantly increased risk of hospitalisation due to serious infectious and noninfectious urological complications (Loeb et al, 2013). Furthermore, a negative biopsy coupled with a PSA level of 3 ng ml−1 or higher has been estimated to cause 20% of men to experience a distressful psychological effect (Macefield et al, 2010). A negative attitude to repeat biopsy has also been reported in 20% of the subjects in the Prostate Biopsy Effects cohort nested within the Prostate Testing for Cancer and Treatment study (Rosario et al, 2012). The high NPV observed in the biopsy cohort suggests that an assay of this nature may have a place earlier in the prostate care pathway continuum, and possibly provide subjects with an elevated PSA and no history of biopsy, or patients with a prior negative biopsy, greater resolution of their risk while minimising the distress of biopsy.

The shedding of cancerous cells from the prostate and their exit through the urine offers another opportunity to identify a cellular population that may otherwise have been missed, given that a needle core biopsy removes 1/3000th of the prostate. Using the methylation status of a series of markers secreted into the urine, the assay indicates the presence of prostate cancer with adverse features that may remain hidden from a biopsy. Hypermethylation events have been shown to correlate with an elevated risk of prostate cancer progression (Ellinger et al, 2008). Both GSTP1 and APC are hypermethylated in prostate cancer and have been shown to be strongly correlated with adverse pathological features (Jerónimo et al, 2004; Zhou et al, 2004; Bastian et al, 2005; Enokida et al, 2005).

Our data provide evidence of a strong association between the presence of adverse disease and the methylation status of both APC and GSTP1 when detected in urine. We are limited by the reliance on the biopsy Gleason score to fit our prediction model because of the unavailability of post-surgical findings in this set. Although the model can classify adverse disease, the predictive probabilities from the model should not be interpreted as a calibrated probability of adverse disease. Furthermore, any direct comparisons with a validated clinical nomogram would require that the nomogram was developed using an identical definition of adverse disease. Future larger studies will be necessary to narrow the CI around the performance parameters of the assay, and to understand the impact on patient and physician decision-making.