Genetic architecture of prostate cancer in the Ashkenazi Jewish population

Background: Recently, numerous prostate cancer risk loci have been identified, some of which show association in specific populations. No study has yet investigated whether these single nucleotide polymorphisms (SNPs) are associated with prostate cancer in the Ashkenazi Jewish (AJ) population. Methods: A total of 29 known prostate cancer risk SNPs were genotyped in 963 prostate cancer cases and 613 controls of AJ ancestry. These data were combined with data from 1241 additional Ashkenazi controls and tested for association with prostate cancer. Correction for multiple testing was performed using the false discovery rate procedure. Results: Ten of twenty-three SNPs that passed quality control procedures were associated with prostate cancer risk at a false discovery rate of 5%. Of these, nine were originally discovered in studies of individuals of European ancestry. Based on power calculations, the number of significant associations observed is not surprising. Conclusion: We see no convincing evidence that the genetic architecture of prostate cancer in the AJ population is substantively different from that observed in other populations of European ancestry.

Recent studies have identified numerous single nucleotide polymorphisms (SNPs) that modify an individual's risk of developing prostate cancer (Amundadottir et al, 2006;Eeles et al, 2008;Gudmundsson et al, 2007a;Gudmundsson et al, 2008;Gudmundsson et al, 2007b;Haiman et al, 2007;Robbins et al, 2007;Thomas et al, 2008). Although some investigators have considered the possibility of heterogeneity between ethnic groups, where a SNP shows a different effect on prostate cancer risk depending on the population being studied, these studies only considered ethnic groups with different continents of ancestral origins (Haiman et al, 2007;Waters et al, 2009;Yamada et al, 2009;Hooker et al, 2010;Zheng et al, 2010). As alleles of numerous SNPs are known to vary in frequency across Europe (Bersaglieri et al, 2004), and population substructure is consistently observed in Americans with ancestry from different locations in Europe (Price et al, 2008;Tian et al, 2008), there is a possibility that prostate cancer risk alleles may have different effects in different populations of European ancestry.
Ashkenazi Jews are Jews whose ancestors come primarily from central and eastern Europe; the majority of North American Jews and a large proportion of Israeli Jews are of Ashkenazi ancestry. The global linkage disequilibrium (LD) profiles of the Ashkenazi Jewish (AJ) population do not seem to differ significantly from those of other populations of European ancestry. However, it has been suggested that there may be significant local differences in allele frequencies and haplotype structure between the Ashkenazi population and other European populations, including at loci associated with common cancers (Gold et al, 2008;Olshen et al, 2008;Price et al, 2008;Tian et al, 2008). Therefore, examination of known prostate cancer risk SNPs in the AJ population provides a unique opportunity to test for genetic heterogeneity at these loci among individuals of European ancestry.
Here, we report the results of a case-control association study of 29 previously identified prostate cancer SNPs in the AJ population. Our data argue against the hypothesis that risk alleles for prostate cancer generally have different effects in the Ashkenazi and non-Ashkenazi European ancestry populations.

MATERIALS AND METHODS
The corresponding institutional review boards and the National Genetic Committee of the Israeli Ministry of Health approved the studies. DNA samples from 963 prostate cancer cases were used in this study. Of these, 885 cases presented at Memorial Sloan-Kettering Cancer Center (MSKCC) with histologically confirmed prostate cancer and reported that all four grandparents were Jewish and from Eastern Europe. An additional 78 cases from Montreal, Canada, for whom both parents were of Ashkenazi ancestry, were included. These patients were treated for prostate cancer at one of the three McGill University affiliated hospitals: the Royal Victoria and Montreal General Hospital sites of the McGill University Health Center and the Jewish General Hospital. Control DNA was collected from 1854 healthy men in New York and Israel: 419 participants in the New York Cancer Project (Mitchell et al, 2004), 194 samples from the National Laboratory for the Genetics of Israeli Populations (NLGIP) (www.tau.ac.il/medicine/NLGIP/nlgip.htm), and 1241 healthy individuals from the Israeli Blood Bank. All controls self-reported that all four grandparents are of Ashkenazi ancestry. These samples have been previously described (Kirchhoff et al, 2004;Shifman et al, 2008;Tischkowitz et al, 2008).
A total of 29 SNPs of interest were identified based on previous reports of association with prostate cancer risk from recent genome-wide association studies (GWAS) and other studies (Table 1). Most of these SNPs were selected because one of the GWAS papers reported them as significantly associated with prostate cancer risk.
We also included numerous SNPs at 8q24 that were discovered in follow-up studies of this locus after its initial identification by linkage and association. Finally, we also included a SNP (rs7008482) reported as a prostate cancer risk SNP in the African-American population to see if this SNP had an effect in the Ashkenazi population despite not being associated in other European populations studied.
Samples from MSKCC, McGill University, the NY Cancer Project, and the NLGIP were genotyped using the Sequenom MassARRAY technology at MSKCC. This includes all cases and 31% of the controls. We designed two multiplex assays to genotype all 29 SNPs using Assay Design software (Sequenom, San Diego, CA, USA). PCR amplification and extension were performed using Sequenom iPLEX Gold reagents as per the manufacturer's protocol and analysed on the Sequenom MassARRAY system (Sequenom). Genotypes were called using the Typer 4.0 software package (Sequenom).
For quality control on the Sequenom data, we first manually inspected the cluster plots. The data were then processed with PLINK (Purcell et al, 2007). In all, 112 individuals with more than 20% missing data were removed. All SNPs had <20% missing data and showed no significant deviation from Hardy-Weinberg equilibrium in controls (P>0.01; Table 2). Six SNPs had significant differences in genotype calling rate between cases and controls (P<0.01; FDR<0.05) and were therefore removed from further consideration.
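The Hardy-Weinberg check in controls can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom chi-square approximation on genotype counts; the test PLINK actually applied to these data may be an exact test, and the function name `hwe_chi_square` is hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts.

    Illustrative approximation only; not necessarily the test PLINK ran.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Under the threshold quoted above, a SNP would be flagged when the returned P value in controls falls at or below 0.01.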
The controls from the Israeli Blood Bank were processed separately, as they were initially genotyped genome-wide as part of a separate study not related to cancer. These samples were fully anonymised immediately after collection, and genomic DNA was subsequently extracted from blood samples using the Nucleon kit (GE Healthcare, Piscataway, NJ, USA). The samples were genotyped on Illumina HumanOmni1-Quad arrays (Illumina, San Diego, CA, USA) according to the manufacturer's specifications under protocols approved by the Institutional Review Board of the North Shore-LIJ Health System. SNPs were filtered out on the following bases: call rate <98%, minor allele frequency <0.02, and Hardy-Weinberg exact test P<0.000001. Samples were filtered for cryptic identity and first- or second-degree relatedness using pairwise identity-by-descent (IBD) estimation (PI_HAT >0.20) in PLINK with 128 403 LD-pruned (r² >0.2) genome-wide SNPs, and for population stratification using principal component analysis with ancestry informative markers specific for the AJ population.
Table 1 abbreviations: alleles = major/minor alleles; chr. = chromosome; gene = nearby gene as reported in the cited literature; MAF = allele frequency in the controls of the cited paper for the minor allele as observed in the Ashkenazi Jewish population; prev. OR = previous odds ratio for the SNP as cited by the given paper; SNP = single nucleotide polymorphism. When MAF is >0.5, it indicates that the minor allele in the Ashkenazi Jewish population is the major allele in the study that initially reported the SNP.

Of the 23 SNPs that passed quality control from the Sequenom genotyping, 20 were directly genotyped on the Illumina chip. The remaining three SNPs were analysed only with the data from the Sequenom genotyping. Association analysis was performed in PLINK using logistic regression. Regression was performed twice, once without an adjustment for age and once with an adjustment for age at either diagnosis (cases) or sample collection (controls). Multiple testing was accounted for by holding the false discovery rate at 5% using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
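The Benjamini-Hochberg step-up procedure can be sketched as follows. This is a minimal illustration, not the code used in the study; the function name `benjamini_hochberg` is hypothetical and the input is any list of per-SNP P values.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at `fdr`.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k with p_(k) <= (k / m) * fdr; everything up to k is rejected.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a P value that fails its own rank threshold is still rejected when a larger P value further down the ordering passes its threshold.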
To compute the power to detect association for each SNP, we assumed the previously reported odds ratio (OR), allele frequencies in our control population, and a sample size based on the number of successfully genotyped cases and controls. We used a previously reported method to compute the power at a significance level of 0.05 (Klein, 2007).
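As a rough illustration of how these three inputs determine power, the sketch below uses a generic normal approximation for a two-sided allelic test on the log odds ratio. It is not the method of Klein (2007); the function name `allelic_power` and the choice of approximation are assumptions made here for illustration.

```python
import math
from statistics import NormalDist


def allelic_power(odds_ratio, control_maf, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumption: a Wald test on the log odds ratio from a 2x2 table of allele
    counts, with the case allele frequency implied by the odds ratio.
    """
    p0 = control_maf
    # Allele frequency in cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    # Standard error of the log odds ratio; each person contributes two alleles.
    se = math.sqrt(
        1.0 / (2 * n_cases * p1)
        + 1.0 / (2 * n_cases * (1.0 - p1))
        + 1.0 / (2 * n_controls * p0)
        + 1.0 / (2 * n_controls * (1.0 - p0))
    )
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(math.log(odds_ratio)) / se
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)
```

Under this approximation, power rises as the odds ratio moves away from 1, so a smaller true effect than the one first reported yields lower power than the nominal estimate.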
As a reference population of non-Ashkenazi white Americans, we used the GWAS data from the CGEMS Prostate Cancer GWAS - Stage 1 - PLCO (phs000207.v1.p1) in dbGaP (http://www.ncbi.nlm.nih.gov/gap), removing duplicate individuals. To test for heterogeneity of the OR between the CGEMS data and our data, we used the Breslow-Day test as implemented in PLINK.
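For two studies, the Breslow-Day test can be sketched as below. This is a minimal illustration assuming 2x2 tables of allele counts given as (a, b, c, d) = (risk allele in cases, other allele in cases, risk allele in controls, other allele in controls), without Tarone's correction; details of PLINK's implementation may differ, and all function names are hypothetical.

```python
import math


def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds ratio across 2x2 tables (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def expected_a(table, psi):
    """Expected count in cell a, given the table margins and common OR psi."""
    a, b, c, d = table
    r1, r2, c1 = a + b, c + d, a + c
    if abs(psi - 1.0) < 1e-12:
        return r1 * c1 / (r1 + r2)
    # Solve A * (r2 - c1 + A) = psi * (r1 - A) * (c1 - A) for A.
    qa = 1.0 - psi
    qb = (r2 - c1) + psi * (r1 + c1)
    qc = -psi * r1 * c1
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    low, high = max(0, c1 - r2), min(r1, c1)
    for root in ((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)):
        if low < root < high:
            return root
    raise ValueError("no admissible root for expected cell count")


def breslow_day_two_strata(table1, table2):
    """Breslow-Day statistic and P value (1 df) for two 2x2 tables."""
    tables = (table1, table2)
    psi = mantel_haenszel_or(tables)
    stat = 0.0
    for table in tables:
        a, b, c, d = table
        r1, r2, c1 = a + b, c + d, a + c
        exp_a = expected_a(table, psi)
        var = 1.0 / (
            1.0 / exp_a
            + 1.0 / (r1 - exp_a)
            + 1.0 / (c1 - exp_a)
            + 1.0 / (r2 - c1 + exp_a)
        )
        stat += (a - exp_a) ** 2 / var
    # Two strata give K - 1 = 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))
```

A small P value indicates that the two studies' odds ratios differ by more than sampling error would explain.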

RESULTS
In the current study, we genotyped 29 SNPs previously reported as being associated with prostate cancer risk in 963 AJ prostate cancer cases and 613 AJ controls. The overall genotype call rate (fraction of genotypes for which a call is made) was 95%. Quality control (QC) filtering of the Sequenom data, as described in the methods, left 23 SNPs; for 20 of these, we added data from 1241 male AJ controls genotyped on the Illumina Omni-1 Quad platform. As some controls came from Israel and some from the United States, we first asked whether allele frequencies differed between AJ controls by origin. Although two SNPs showed nominal differences in allele frequencies (rs9364554 and rs4242384; P<0.05), neither difference was significant after correcting for multiple testing. We were also concerned that the use of different genotyping platforms could lead to errors in our results. To test for this, we compared allele frequencies between individuals genotyped on the Illumina and Sequenom platforms and observed no differences (all nominal P>0.3).
We tested for association in a total of 875 cases and 1810 controls under an additive model. Before adjusting for age, 12 SNPs were nominally associated with prostate cancer risk (P<0.05; Table 3). Of these, 10 were significant at a false discovery rate of 5%. Among the 12 nominally significant SNPs, only one (rs7008482) shows a direction of effect opposite to that previously reported. As this SNP was identified in a case-control study of African-American men, we asked what effect it had in the stage 1 data from the CGEMS prostate cancer GWAS of white Americans (Yeager et al, 2007). Although not significantly associated with risk (P=0.2), this SNP has the same direction of effect that we observe in the Ashkenazi population (OR=0.92; 95% CI=0.81-1.04). We next asked whether adjusting for age would influence these results. After removing 26 individuals without age information and adjusting for age as a covariate, nine SNPs were nominally significant (P<0.05), of which three were significant at a false discovery rate of 5% (Table 3). Notably, seven SNPs are nominally significant both with and without age adjustment.
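The odds ratios and confidence intervals in this study come from logistic regression in PLINK. As a simpler illustration of how an OR and its 95% CI relate, the sketch below computes an allelic odds ratio with a Wald interval from a 2x2 table of allele counts; the function name `odds_ratio_ci` is hypothetical and this is not the regression model used here.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: risk / other allele counts in cases
    c, d: risk / other allele counts in controls
    Returns (odds_ratio, lower, upper).
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

An interval that spans 1, such as the 0.81-1.04 quoted above, corresponds to an association that is not significant at the matching level.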
We next asked whether there was any heterogeneity between the effect sizes we observed in the Ashkenazi population and those observed in other populations of European ancestry. To do so systematically, we used the stage 1 CGEMS data. Eighteen of the SNPs that we tested are also present in the CGEMS data. Of these, only one (rs4962416) shows heterogeneity (P=0.002). Although this SNP is associated with prostate cancer risk in the CGEMS stage 1 study (OR=1.3; 95% CI=1.2-1.5), we observe no evidence for association in the AJ population (OR=1.0; 95% CI=0.9-1.1).
For several of the SNPs, we did not replicate the association with prostate cancer risk observed in a number of prior studies (Kim et al, 2010). We note that the minor allele frequencies (MAF) at most of these SNPs in controls differ by at least 5 percentage points between the original report and our AJ individuals, and for seven SNPs the MAFs differ by at least 10 percentage points. As these differences in MAF could influence our power to replicate the previous findings, we computed the power using the initially observed ORs from prior studies for each SNP along with the MAF observed in the AJ controls. We found that, for the nine SNPs for which we do not observe association with prostate cancer risk, our power to detect an association ranges from 37 to 99%; there is >80% power to detect association at four of these SNPs (Table 3). We next asked whether we would expect to find 12 or fewer significant associations by chance given the computed powers. In 10 000 simulations, given the computed power, we expect to find 12 or fewer significant associations 16% of the time.
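The simulation described above can be sketched as follows, assuming each SNP is detected independently with probability equal to its computed power. The per-SNP powers belong to Table 3 and are not reproduced here, so any list passed in is a placeholder; the function name `prob_at_most` is hypothetical.

```python
import random


def prob_at_most(powers, observed, n_sim=10_000, seed=0):
    """Estimate P(number of significant SNPs <= observed) by simulation.

    Assumption: each SNP is an independent Bernoulli trial whose success
    probability is its power to detect the association.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        count = sum(rng.random() < power for power in powers)
        if count <= observed:
            hits += 1
    return hits / n_sim
```

With the study's per-SNP powers supplied as `powers` and `observed=12`, the returned fraction is the quantity reported as 16% in the text.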

DISCUSSION
Here, in the AJ population, we have replicated the association with prostate cancer risk for many of the prostate cancer risk SNPs tested. Overall, the effect of these SNPs in the AJ population is similar to that previously reported with the discovery of the SNPs. However, there are some SNPs for which we did not replicate the previously reported association despite having adequate power to do so. One potential explanation is the 'winner's curse', in which the first report of an association overestimates the magnitude of effect, leading to an inflated power estimation. In fact, for rs10486567, our estimate of 99% power was based on an OR of 0.74 as initially reported (Thomas et al, 2008). A more recent replication study suggests a more modest effect size of 0.84, which would result in only 80% power to detect the association in the present study (Prokunina-Olsson et al, 2010). Another possible explanation is that a previously reported risk SNP is truly not associated with prostate cancer in the AJ population, either due to differences in linkage disequilibrium patterns or the absence of the functional variant in the AJ population. This may explain the lack of association between rs16901979 and prostate cancer in the AJ population, as the observed OR for this rare SNP was 1.0 in our study.
Of the significant results, only one SNP (rs7008482) showed a direction of effect different from that previously reported. This SNP was first identified as a prostate cancer risk allele in the African-American population (Robbins et al, 2007). In a well-powered study of Japanese men, no evidence for association between this SNP and prostate cancer was observed (Yamada et al, 2009). Similarly, despite being present on the genome-wide genotyping chips used in several prostate cancer GWAS, this SNP was never reported as being associated with prostate cancer in European populations (Eeles et al, 2009;Gudmundsson et al, 2009;Yeager et al, 2009), though it does have the same direction of effect in the CGEMS stage 1 data (white American individuals) as we observe in the Ashkenazi population. As the direction of effect is opposite to that previously reported, more studies will be needed to determine whether this is a real prostate cancer risk SNP, perhaps tagging different functional alleles in different populations, or a false positive finding. Fine mapping this locus in the AJ and African-American populations could help shed light on this question.
Of the 18 SNPs also present in the CGEMS Stage 1 data set, only one (rs4962416) showed a significant heterogeneity of the OR between the CGEMS study and our AJ study. This SNP does not appear to be associated with prostate cancer in the AJ population. Although other SNPs appear to have ORs near 1, the weak effect of these SNPs in the initial reports makes it difficult to determine if these SNPs are truly not associated in the AJ population, illustrating population heterogeneity, or that our sample size is simply not large enough. Larger studies in which we are well powered to replicate the known prostate cancer associations will be necessary to answer this question. Furthermore, our sample size is likely too small to distinguish more subtle differences in the ORs between the populations.
We used controls from both the United States and Israel. Although we have previously found that AJ individuals from these two countries cluster similarly using principal components analysis (Gold et al, 2008), we nevertheless tested each SNP for allele frequency differences between Israeli and United States controls. As we did not find any significant difference, we do not think this is a major source of potential error in our study.
These results provide evidence of some differences in the genetic architecture of prostate cancer between the AJ population and other populations of European ancestry. However, these few differences are not enough evidence to argue that there are substantive differences in genetic susceptibility to prostate cancer between these populations. Further study of the genetics of prostate cancer in this unique population will be needed to understand to what extent genetic risk of prostate cancer is similar to that in other European populations and to what extent it is different.
This work is published under the standard license to publish agreement. After 12 months the work will become freely available and the license terms will switch to a Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License.