Main

Although prostate cancer (PCa) is highly prevalent and prostate-specific antigen (PSA) screening has led to an abundant diagnosis of disease, including many indolent cases, a substantial number of men will still develop symptomatic metastases or die from their cancer. The ability to identify individuals with a more aggressive disease phenotype would result in more appropriate initial treatment strategies. The need for novel biomarkers to add predictive capacity to existing clinical nomograms at the time of diagnosis is of utmost relevance.

Inherited germline susceptibility loci, such as single-nucleotide polymorphisms (SNPs), have the potential to be effective biomarkers, not only in screening for disease but also in contributing to predicting recurrence and response to specific treatments. Since the first PCa genome-wide association study (GWAS) in 2006, over 75 such SNPs have been associated with disease risk (Eeles et al, 2013). Although these SNPs are related to risk, far less is known about their ability to discriminate aggressive disease. To date, only a handful of studies have looked at the association of PCa risk loci and disease-specific end points (Penney et al, 2009; Wiklund et al, 2009; Gallagher et al, 2010; Pomerantz et al, 2011; Szulkin et al, 2012; Shui et al, 2014). Many of these studies have small cohorts, have variability across institutions, and lack granular data on disease extent, treatment and length of follow-up.

In this study, we aimed to assess the frequency of a large selection of validated PCa susceptibility alleles from previously reported GWAS within a single institution ascertainment of PCa patients with long-term follow-up in order to determine association between risk alleles and disease outcomes, including clinical disease progression and PCa-specific mortality (PCSM).

Materials and Methods

Study population

The study population consisted of 1354 men treated for localised PCa at the Memorial Sloan-Kettering Cancer Center between June 1988 and December 2007. Seven hundred and sixty-two individuals identified themselves as being of Ashkenazi Jewish descent, whereas over 90% of the remaining 592 were self-reported non-Jewish Caucasian individuals. Blood samples were drawn and medical records were collected as part of an institutional PCa research database using standardised questionnaires and chart abstraction forms. Pertinent clinical data included disease stage (TNM classification), Gleason score (from needle biopsy), PSA levels and age at diagnosis, as well as dates of biochemical recurrence (BR), development of castration-resistant metastasis (CM), PCSM and overall patient survival.

All patient records were reviewed by physicians to confirm the clinical end points being tested. Age at diagnosis was considered as the date of first positive prostate biopsy. BR was defined as a single measure of PSA ≥0.2 ng ml−1 after radical prostatectomy, and a value of ‘nadir +2’ after other therapy (Stephenson et al, 2006; Nielsen et al, 2008). CM was defined as time of progression of disease following initiation of antiandrogen therapy. Review of the patient’s death certificate and/or medical record identified cause of death. In accordance with an institutional research board-approved protocol, patient identifiers were removed at the time of genetic analysis.

Selection of SNPs and genotyping

A total of 75 susceptibility loci of interest were identified. The majority were selected on the basis of a significant association with PCa risk in previously published GWAS (n=67); the remainder were SNPs associated with PSA levels (n=6). Established PCa risk SNPs that we previously evaluated for association with disease end points in a subset of this cohort were excluded (Gallagher et al, 2010). SNPs were genotyped using the Mass ARRAY QGE iPLEX system (Sequenom, Inc., San Diego, CA, USA; Gabriel et al, 2009). PCR and extension primers for SNPs were designed over three separate multiplex assays using Mass ARRAY Assay Design 3.0 software (Sequenom, Inc). PCR and extension reactions were performed according to the manufacturer’s instructions, and extension product sizes were determined by mass spectrometry using the Sequenom iPLEX system. Duplicate test samples and negative controls were also included. In all, 14 of 75 SNPs (19%) failed quality control and were removed from the analysis. The remaining 61 SNPs had an average genotype call rate of 86%, with each SNP being in Hardy–Weinberg equilibrium (Tables 1 and 2). The minor allele frequency in the study cohort ranged from 1 to 49%. The average control rate among duplicate samples was 98%.
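The quality-control quantities mentioned above (minor allele frequency and Hardy–Weinberg equilibrium) can be illustrated with a short sketch. The text does not say which Hardy–Weinberg test was used, so the 1-df chi-square test below is an assumption. The function name and the genotype counts are hypothetical and are not taken from the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Minor allele frequency and 1-df chi-square test for Hardy-Weinberg equilibrium.

    Illustrative only: the source does not specify the HWE test it used.
    Arguments are counts of the three genotypes (AA, AB, BB).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return min(p, q), chi2, p_value

# Hypothetical counts that sit exactly at Hardy-Weinberg proportions (p = 0.7)
maf, chi2, p_value = hwe_chi_square(49, 42, 9)
```

A SNP whose observed heterozygote count departs strongly from expectation (for example 50, 0, 50) gives a large chi-square statistic and a small P-value, which would usually flag it for exclusion.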

Table 1 Prostate cancer risk polymorphisms genotyped and analysed in study cohort
Table 2 PSA-associated polymorphisms genotyped and analysed in study cohort

Statistical methods

Univariate Cox proportional hazards regression was used to investigate the association between each SNP and BR, CM, and PCSM. Each SNP was analysed under an additive model. The risk allele for each PCa SNP was defined as the allele associated with an increased risk of disease in the literature. Time at risk was calculated from the date of diagnosis to the date of event or date of last contact, and patients without the event were censored at their last follow-up date.
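For readers unfamiliar with the method, the following is a minimal sketch of a one-covariate Cox model fitted by Newton–Raphson on the partial likelihood, with the covariate standing in for the risk-allele count under an additive model. It is not the authors' implementation (the analyses were run in Plink and R). The function name, the Breslow handling of ties and the toy data are all assumptions made for illustration.

```python
import math

def cox_univariate(times, events, x, n_iter=25, tol=1e-9):
    """Fit log(HR) for a single covariate by Newton-Raphson (Breslow ties).

    times  : follow-up time for each subject
    events : 1 if the end point occurred, 0 if censored
    x      : covariate, e.g. number of risk alleles (0, 1 or 2)
    Returns (beta, hazard_ratio).
    """
    beta = 0.0
    for _ in range(n_iter):
        score = 0.0
        info = 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored subjects contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            score += x[i] - mean              # first derivative of the log partial likelihood
            info += s2 / s0 - mean * mean     # negative second derivative
        if info <= 0.0:
            break  # covariate has no variation in any risk set
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, math.exp(beta)

# Hypothetical toy data: carriers (x = 1) tend to fail earlier than non-carriers
beta, hr = cox_univariate([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1], [1, 0, 1, 1, 0, 0])
```

In this toy example the estimated hazard ratio is above 1, reflecting the earlier events among carriers. A production analysis would also report a standard error and confidence interval, which are omitted here for brevity.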

Multivariable analyses were conducted controlling for self-reported Ashkenazi Jewish ancestry; age at diagnosis; biopsy Gleason grade coded as a continuous variable (1=Gleason <=6, 2=Gleason 7, 3=Gleason >=8); and clinical stage coded as a continuous variable (1=T1, 2=T2, 3=T3/4).
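The covariate coding described above can be written out as a small sketch. The category boundaries follow the text. The function names and the input formats (a two-letter genotype string, a stage string such as "T2a") are hypothetical conventions chosen for illustration.

```python
def risk_allele_count(genotype, risk_allele):
    """Additive coding: number of copies (0, 1 or 2) of the risk allele.

    genotype is assumed to be a two-character string such as "TC".
    """
    return sum(1 for allele in genotype if allele == risk_allele)

def gleason_code(biopsy_gleason):
    """Biopsy Gleason score coded as 1 (<=6), 2 (7) or 3 (>=8)."""
    if biopsy_gleason <= 6:
        return 1
    if biopsy_gleason == 7:
        return 2
    return 3

def stage_code(clinical_t_stage):
    """Clinical stage coded as 1 (T1), 2 (T2) or 3 (T3/4).

    clinical_t_stage is assumed to look like "T1c", "T2a", "T3b" or "T4".
    """
    return min(int(clinical_t_stage[1]), 3)

# Hypothetical patient: heterozygous carrier, Gleason 7, clinical stage T2a
row = (risk_allele_count("TC", "C"), gleason_code(7), stage_code("T2a"))
```

Treating the Gleason and stage codes as continuous variables, as the text describes, assumes an equal change in log hazard between adjacent categories.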

Collection of blood samples for genetic testing began in 2000, and therefore, some cases diagnosed before 2000, and who died before 2000 (or who did not participate in blood sampling), were not included in this cohort. This scenario is referred to statistically as ‘left truncation.’ To account for this, we left-censored the interval from diagnosis to blood draw for each patient.
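One common way to handle left truncation is delayed entry, in which a patient contributes to a risk set only after the time of blood draw. The sketch below illustrates that idea. It is an assumption about how such an adjustment is typically implemented and may not match the exact procedure used here. The field names and the example patients are hypothetical.

```python
def risk_set(subjects, t):
    """IDs of subjects at risk at time t under delayed entry.

    Each subject is a dict with:
      "entry": years from diagnosis to blood draw (start of observation)
      "exit" : years from diagnosis to event or last follow-up
    A subject is at risk at t only if already enrolled and not yet exited.
    """
    return [s["id"] for s in subjects if s["entry"] < t <= s["exit"]]

# Hypothetical patients, times in years since diagnosis
subjects = [
    {"id": "A", "entry": 0.0, "exit": 10.0},  # blood drawn at diagnosis
    {"id": "B", "entry": 5.0, "exit": 12.0},  # diagnosed 5 years before blood draw
    {"id": "C", "entry": 1.0, "exit": 3.0},   # event or censoring at 3 years
]
at_4_years = risk_set(subjects, 4.0)
at_6_years = risk_set(subjects, 6.0)
```

Patient B is excluded from risk sets before year 5 because, by design, this patient could not have been observed to die before enrolment.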

To address multiple testing across the 61 SNPs examined, we applied a Bonferroni correction and defined statistical significance as P<0.0008 (0.05/61). All statistical analyses were conducted using Plink (v1.07) and R (v2.9.1) as we have previously described (Willis et al, 2012).
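The threshold can be reproduced with a line of arithmetic. The sketch below assumes a family-wise significance level of 0.05, which is consistent with the reported cut-off. The function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# 61 SNPs passed quality control; alpha = 0.05 is assumed
threshold = bonferroni_threshold(0.05, 61)

# A nominal P-value is declared significant only if it falls below the threshold
passes = 1.7e-5 < threshold
fails = 0.02 < threshold
```

Rounded to four decimal places, the threshold is 0.0008, so a nominal P-value of 0.02 does not survive correction, whereas one of 1.7 × 10−5 does.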

Results

One thousand three hundred and fifty-four patients were genotyped. Patient characteristics are presented in Table 3. The median age at diagnosis was 66 years (y) and median pre-operative PSA was 7.3 ng ml−1. Treatment at presentation was based on patient and physician preference. The majority of patients (93%) were treated with curative intent: 466 (34%) underwent radical prostatectomy with 804 (59%) receiving radiotherapy (RT) with or without antiandrogen therapy. A majority of patients (61%) had biopsy Gleason score 7, and 53% of patients with available clinical staging information had T2 disease. Median (interquartile range) follow-up for survivors was 10.4y (7.2–13.8). At last follow-up, BR was documented in 671 patients (49%), CM in 313 (23%), with 194 (14%) individuals having died from PCa. Median (interquartile range) BR-free survival was 8.1y (2.6–not reached) and median time to CM 21.4y (11.7–23.3). At 5y after PCa diagnosis, 98% of the study population were alive, 91% at 10y, 76% at 15y and 62% at 20y.

Table 3 Characteristics of study population

Univariate associations between susceptibility loci and PCa outcomes (P<0.05) are summarised in Table 4. In all, 2 of 61 SNPs, rs13385191 and rs339331, were associated with an increased risk of BR (P<0.05). Three SNPs were associated with CM (P<0.05); rs13385191 associated with an increased risk of CM (hazard ratio (HR)=1.26), with rs9284813 and rs11067228 both associated with decreased risk of CM (HR=0.75 and 0.74, respectively).

Table 4 Univariate associations between SNPs and PCa outcomes under a codominant model (P<0.05 by the 2 df test)

Seven SNPs showed associations on multivariable analysis with clinical end points (P<0.05). Again, rs13385191 (HR=1.36; 95% CI=1.03–1.81; P=0.02) and rs339331 (HR=1.47; 95% CI=1.04–2.08; P=0.02) were associated with an increased risk of BR. Four SNPs, rs13385191, rs1894292, rs17178655 and rs11067228, were associated with CM, with different directions of effect, and rs11902236 (HR=0.78; 95% CI=0.62–0.98; P=0.03) and rs4857841 (HR=0.78; 95% CI=0.62–0.98; P=0.04) were associated with PCSM (Table 5). Of note, none of these associations was significant after a Bonferroni correction for multiple testing was applied (P<0.0008).

Table 5 Multivariate associations between SNPs and PCa outcomes

We also asked if any of the SNPs were associated with PSA at diagnosis. One SNP, rs17632542, was significant after multiple test correction (P=1.7 × 10−5), with carriers of the risk allele [C] more likely to have lower PSA levels at diagnosis (Figure 1).

Figure 1

Box plot graph for rs17632542 (KLK3) illustrating PSA level at diagnosis with respect to allele (Common T, Het TC, Rare C).

Discussion

Several existing PCa nomograms incorporating clinico-pathological parameters such as Gleason score, TNM stage and PSA aid in predicting likelihood of disease recurrence (Kattan et al, 1998; Stephenson et al, 2009); however, they are limited in their prognostic capabilities. Novel biomarkers to identify aggressiveness of disease and likelihood of recurrence are required. Although much focus is currently being placed on analysis of somatic mutations in contributing to these predictive models (Erho et al, 2013; Karnes et al, 2013), germline genetic variants have certain unique advantages. Knowing the inherited genetic predisposition of an individual to develop recurrent disease and metastatic progression at the time of diagnosis would clearly inform decision making regarding best initial treatment strategy and the intensity and approach to follow-up.

Since the first PCa GWAS in 2006 (Amundadottir et al, 2006) up through the most recent addition of a further 23 susceptibility loci by the PRACTICAL consortium in 2013 (Eeles et al, 2013), over 75 SNPs known to be associated with PCa risk have been identified. The ability of these susceptibility loci, however, to predict disease aggressiveness and clinical outcomes is less clear. Although several studies have reported associations with disease-specific outcomes, results are often conflicting and inconsistent (Penney et al, 2009; Wiklund et al, 2009; Gallagher et al, 2010; Pomerantz et al, 2011; Szulkin et al, 2012). Most recently, Shui et al analysed the association of 47 PCa susceptibility loci with PCSM in a large cohort with over 1000 events and reported association of eight SNPs with disease-specific death (Shui et al, 2014). In this same study, however, susceptibility loci were not able to distinguish aggressive vs non-aggressive disease (Shui et al, 2014).

We believe our current study is the first to assess a large number of susceptibility loci with respect to all three clinical end points with extensive follow-up (median 10.4y). We found evidence of associations of several SNPs with all three clinical end points on both univariate and multivariable analyses (P<0.05). Importantly, however, when incorporating a Bonferroni correction for multiple testing (P<0.0008), the only persistent significant association was with rs17632542, a previously reported KLK3 variant (Gudmundsson et al, 2010; Klein et al, 2010; Kote-Jarai et al, 2011; Parikh et al, 2011), and PSA levels at diagnosis. Thus, the evidence for association at the other SNPs is only suggestive at this point and will need to be replicated in other studies.

rs17632542 lies within exon 4 of the KLK3 gene and has also previously been associated with PCa risk (Kote-Jarai et al, 2011; Parikh et al, 2011; Penney et al, 2011; Klein et al, 2012; Knipe et al, 2014). The minor allele (C) causes a non-synonymous amino-acid change from isoleucine to threonine at position 179 (Ile179Thr). In our analysis, carriers of the C allele had a lower PSA at diagnosis; however, there were no associations seen with age at diagnosis, disease stage, Gleason grade, family history of PCa or any of the disease-specific clinical end points. This direction of effect is consistent with other studies, such as that of Gudmundsson et al, who reported carriers of the rs17632542-T allele as having higher PSA levels (Gudmundsson et al, 2010). Interestingly, we have previously reported that rs17632542-C is associated with decreased PCa risk (OR=0.64 (CI=0.51–0.81) P=0.00019). It is plausible that harbouring the rare allele of rs17632542 (C) leads to a direct effect on the function of the PSA protein, possibly through regulatory effects (on transcription of the gene), through altered protein stability or through an effect on antigenicity and, as such, detectability in PSA tests. It is also plausible that patients with the rare allele may be less likely to undergo biopsy subsequent to PSA screening because of lower PSA levels, although they may harbour asymptomatic and indolent PCa.

Two other PSA-related SNPs showed associations on multivariable analysis (P<0.05). rs11067228 is located in a linkage disequilibrium block that contains the genes TBX3 (T-Box Transcription Factor 3) and OSTF1P1 (Osteoclast Stimulating Factor 1 Pseudogene 1), with the common allele (A) being previously associated with higher PSA levels (Gudmundsson et al, 2010). The same study reported no association with PCa, however, but did report an association with a greater probability of having a normal prostate biopsy (Gudmundsson et al, 2010). In our study, we observed rs11067228-G to be associated with a lesser chance of development of castrate-resistant disease (HR=0.79 (CI=0.63–0.99), P=0.04). rs17178655, an intronic variant in the microseminoprotein-β gene (β-MSP), was also seen to be associated with development of CM, with carriers of the minor allele (A) less likely to develop CM (HR=0.73 (CI=0.55–0.97) P=0.03). This SNP had previously been reported by our group to be associated with semen levels of both free and total PSA (P=0.0027) but interestingly not levels of β-MSP (Xu et al, 2010). In contrast with PSA, whereby risk of PCa increases with higher PSA levels, β-MSP levels measured in serum, urine and prostate tissue have been shown to be statistically significantly lower in men with PCa and even lower in men with aggressive disease (Nam et al, 2006; Whitaker et al, 2010).

rs13385191 is located in intron 6 of C2orf43 (chromosome 2 open reading frame 43) and achieved significance (P<0.05) on multivariable analysis for both clinical end points of BR (HR=1.36 (CI=1.03–1.81) P=0.02) and CM (HR=1.28 (CI=1.03–1.60) P=0.02). This SNP was initially reported by Takata et al in 2010 with the rare allele associated with increased risk of PCa in an Asian population (OR=1.15 (CI=1.10–1.21); Takata et al, 2010) and subsequently replicated in both European (OR=1.07 (CI=1.02–1.12); Lindstrom et al, 2012) and Chinese populations (OR=1.33 (CI=1.11–1.58); Long et al, 2012). Recently, Shui et al (2014) reported association of rs13385191 with PCSM, however, with the opposite direction of effect (OR=0.88 (CI=0.78–1.00) P=0.05). C2orf43 is a highly conserved gene (Long et al, 2012) and as such, may harbour important functional variants in or within close proximity to its location around 2p24.

Four additional SNPs (rs339331, rs1894292, rs11902236, rs4857841) showed associations with end points at P<0.05. Importantly, however, the allele conferring an increased risk of PCa in previous GWAS was, in our analysis, a predictor of less aggressive disease as measured by time to recurrence and disease-specific death (Table 4).

The above results and those from other similar analyses lead us to conclude that susceptibility loci that are associated with initiation and development of PCa are likely to differ from loci that predict disease progression and aggressiveness. The mechanisms and pathways contributing to a more aggressive disease phenotype are still elusive, and additional large-scale discovery studies focusing on disease-specific end points are required. Investigating the cumulative effect of PCa SNPs may well reveal more about the molecular mechanisms of PCa oncogenesis (Jiang et al, 2013). As we discover further risk loci, pathway analysis and computational statistical programmes will hopefully shed further light on these molecular mechanisms. In addition, we must also be aware that SNP function may vary among ethnic populations as has been suggested in other recent work (Jiang et al, 2014).

Our study has several limitations: although we report associations of a large selection of PCa risk loci, there are a number of reported GWAS SNPs that, owing to genotyping failures, were not included in the analysis. We also did not set out to discover any novel susceptibility loci or pathways. However, as strengths, we utilised a large sample size with extended follow-up and granular phenotypic data, which include detailed pathological and treatment variables.

Conclusions

The ability to discriminate individuals who are more or less likely to harbour an aggressive PCa phenotype and who are predisposed to disease recurrence has long been the focus of attention by the urologic oncology community. Existing nomograms are clinically useful but there is significant potential to increase their accuracy with the addition of new biomarkers and individual genetic predictors. In this study, we confirmed that rs17632542 in KLK3 is associated with PSA at diagnosis, confirming reproducibility across multiple cohorts. No significant association was seen between loci and disease-specific end points when accounting for multiple testing. This provides further evidence that known PCa risk SNPs do not predict likelihood of disease progression. Further, larger discovery analyses in cohorts with robust clinical end points are required to shed further light on germline predictors of disease recurrence to improve initial management and surveillance strategies.