Introduction

Pancreatic neuroendocrine tumors (PNETs) are heterogeneous neoplasms that are generally considered rare, with an incidence of around 0.40 per 100,000 subjects/year1,2, but their incidence has more than doubled in the last 30 years1,3,4. PNETs represent only 2% of all pancreatic neoplasms by incidence, but 10% by prevalence2,5. Because of their rarity PNETs have been poorly studied and, compared to other cancer types, very little is known about environmental or genetic risk factors for their occurrence. The role of traditional cancer risk factors such as smoking and alcohol consumption remains controversial6,7,8. In contrast, type two diabetes (T2D) and family history of cancer have been consistently associated with PNET risk7,8.

The association with family history suggests that genetic risk factors may have a role in disease aetiology, at least in a subset of PNET cases. Despite this, the impact of genetic variability on disease incidence is poorly understood. Only a small number of case-control studies have been performed to uncover the genetic susceptibility to PNETs9,10,11,12 and no genome-wide association study (GWAS) has been performed yet.

With the goal of furthering our knowledge of PNET susceptibility we performed a case-control study considering the genetic variability of the CDKN2A/2B region. The selection of the region was motivated by a fairly large body of evidence pointing to a key role of this locus in pancreatic cancer onset and prognosis. For example, CDKN2A is commonly mutated or de-regulated in both endocrine and exocrine pancreatic cancer13. Genetic polymorphisms in the locus have been reported to be associated with type two diabetes mellitus (T2DM), which is one of the few suggested risk factors for PNETs14,15,16, suggesting a shared genetic background between T2DM and PNETs. Genetic variants belonging to the CDKN2A/2B region have been identified through GWAS as susceptibility markers for several human traits and diseases, including a large number of tumor types17,18,19,20,21,22,23,24. In addition, we have recently shown the association of the CDKN2A/2B-rs3217992 SNP with increased risk of pancreatic ductal adenocarcinoma (PDAC)25. The pleiotropic role of this region is explained by its crucial role in the regulation of the cell cycle26,27. Finally, in a manuscript investigating the genetic susceptibility to neuroendocrine tumors (NETs), Ter-Minassian and colleagues suggested an association between four SNPs in this region (representing two independent signals because of high linkage disequilibrium (LD)) and an increased risk of the disease11.

Our hypothesis was that common genetic variability at the locus could modulate the risk of developing PNETs, as has been shown for other cancer types.

Results

Data filtering and quality control

The origin of the population by country is shown in Table 1. None of the SNPs were out of Hardy-Weinberg equilibrium (HWE) in controls (p > 0.05). A total of 311 subjects (17 PNET cases and 294 controls) were removed after genotyping because they had a call rate < 75%. After the removal of these subjects the average SNP call rate was 95.54% with a minimum of 89.79% (rs3731246) and a maximum of 99.15% (rs3218009). The quality control analysis showed a concordance rate of 99.68% between the duplicate samples. After exclusions, 320 cases and 4,436 controls were used for statistical analyses.

Table 1 Study population.

SNPs main effect

We observed a statistically significant association between the A allele of the rs2518719 SNP and an increased risk of developing PNET (ORhom = 2.08, 95% CI 1.05–4.11, p = 0.035). The association was statistically significant only when comparing rare homozygotes with common homozygotes. None of the other SNPs showed any statistically significant association. The frequencies and distributions of the genotypes, the odds ratios (ORs) for the association of each polymorphism with PNET risk and the corresponding confidence intervals (CIs) are shown in Table 2.

Table 2 Associations between selected SNPs in the CDKN2A/2B region and PNET risk.

Possible functional effects

We used several bioinformatic tools to predict the possible functional relevance of the SNP showing the most significant association. RegulomeDB returned a score of 4 for rs2518719, suggesting the presence of a transcription factor binding motif and a DNase sensitivity peak. HaploReg also suggested the presence of a DNase sensitivity peak and indicated that the polymorphism alters the sequence recognized by the Neuron-Restrictive Silencer Factor (NRSF) regulatory repressor. No significant association between rs2518719 and the expression of any gene is reported in the GTEx project. We used the SNAP software to find SNPs in LD with rs2518719 and found 9 variants with a minimum LD (r2) of 0.760 (rs2188127, rs3731222, rs3731217, rs3731204, rs3731198, rs2811711, rs495490, rs575427, rs647188), but for these variants too there was no evidence of association with gene expression in GTEx.

Discussion

The background of common genetic susceptibility to sporadic PNETs is largely unknown. The rarity of the disease is certainly one reason for the scarcity of information on its genetic susceptibility. Only a small number of studies have been performed, the largest of which included 101 cases and 432 controls11. To further our understanding of the topic we conducted the largest study on the disease, with up to 320 cases and 4,436 controls, taking advantage of the framework of the Pancreatic Disease Research (PANDoRA) Consortium. The only finding of potential significance was that carriers of the rare A allele of the rs2518719 SNP had an increased risk of developing the disease. This SNP lies in the second intron of the CDKN2A gene, around 2,000 bp from the start of the 3′UTR. The rs2518719 SNP is in tight LD with another variant in the gene, rs3731217 (r2 = 0.925, D’ = 1 in Caucasians, as reported by 1000 Genomes), which lies in the first intron of the gene. This latter SNP is a well-known pleiotropic susceptibility polymorphism and it has been found to be associated with increased risk of developing childhood acute lymphoblastic leukemia28,29, differentiated thyroid carcinoma30 and salivary gland carcinoma31. In addition, the SNP was also reported to modulate the survival of oropharyngeal cancer patients32. Despite all this evidence indicating a role of the variant allele in the development of various diseases, no functional studies have been performed and therefore it is not possible to elucidate a possible direct effect of the SNP. It has been suggested that rs3731217 might be involved in the regulation of p53 gene expression, but given the aforementioned lack of direct functional evidence this remains highly speculative32.

The observation that the same allele is associated with an increased risk in different cancer types indicates a pleiotropic role for rs2518719/rs3731217 (or one of the other variants in tight LD) and also strongly suggests that the causal variant alters the function of the protein, the regulation of gene expression, or both, in a way that influences the chances of developing cancer in different organs. The 9p21.3 locus in general, and the CDKN2A/CDKN2B genes in particular, are classic examples of pleiotropic regions, since they are associated with a very large number of human traits and diseases17,18,19,22,23,24. Pleiotropic regions are probably more accessible DNA stretches than normal and therefore variability within them is more likely to be non-neutral than in a randomly selected DNA sequence. However, the regulation of pleiotropic regions is likely to be more complex than that of other parts of the genome, which makes it more difficult to understand the effect of genetic variation at each single locus.

The results from HaploReg suggested that rs2518719 could alter the sequence recognized by the NRSF regulatory repressor. NRSF is encoded by the RE1-Silencing Transcription factor (REST) gene, and its deregulation has been associated with the development of several tumors, including colon and lung cancer33. Therefore a possible explanation for the pleiotropic effect of this SNP may lie in the regulatory role of the sequence recognized by NRSF/REST. However, intriguing as it is, this functional explanation also remains speculative, especially considering that NRSF is primarily involved in the silencing of neural genes in non-neuronal tissues34. The lack of reliable functional data and eQTLs for the SNP can be explained by the fact that the entire CDKN2A/2B region is under very complex gene regulation.

The results from Ter-Minassian and colleagues, in their study on neuroendocrine tumors, are in agreement with what we found: they also suggest an increased risk, specifically of PNET, for individuals carrying the rare allele of rs2518719 and of the linked rs3731217 and rs3731198 variants11. In that paper the authors also observed an independent signal from the rs3731211 variant. We did not replicate this second signal, although, considering our sample size and their ORs, we had a power greater than 99% to detect it. Considering also the p value reported (p = 0.042), the most likely explanation for this discrepancy is that the earlier association arose from statistical fluctuation, a consequence of the small sample sizes imposed by the rarity of the disease.

In the light of multiple testing the association of rs2518719 is not statistically significant; however, considering the concordance with previous reports and the low statistical power imposed by the rarity of the disease, a Bonferroni correction is too strict. We therefore also used the False Positive Report Probability (FPRP35), and with a prior of 0.25 the association retains noteworthiness (posterior probability = 0.188). We chose a prior probability of 0.25 because the polymorphism is pleiotropic, or in almost complete LD with a pleiotropic variant, and because it has already been found to be associated with PNET risk. We are aware that the results given by the FPRP are indications and that the final confirmation of the association can only be given by functional studies.

The present study carries some limitations, such as limited clinical information on the sporadic PNET patients in terms of environmental and familial risk factors and disease stage and grade. However, our results, taken together with what was found by Ter-Minassian and colleagues, convincingly suggest that the pleiotropic CDKN2A region is associated with the risk of developing PNETs, as already observed for several other cancer types.

Materials and Methods

Study population

In the present study 320 sporadic PNET patients and 4,436 controls belonging to the Pancreatic Disease Research (PANDoRA) consortium were recruited in 4 European countries. Cases were sporadic, i.e. not observed in the context of genetic syndromes associated with PNET, such as MEN-1, MEN-2, VHL or TSC. Controls were recruited in the same hospitals as the cases, or at least in the same geographical region. All participants signed a written consent form. The study protocol was approved by the ethics board of the University of Heidelberg and was carried out according to the Declaration of Helsinki. Additional information on the PANDoRA consortium has been given elsewhere36.

SNPs selection

We investigated the common genetic variability in the CDKN2A/B region using tagging SNPs and potentially functional SNPs. Tagging SNPs were selected using a pairwise tagging method with a minimum r2 of 0.8 and a minimum minor allele frequency of 0.05. We identified 11 SNPs that efficiently tagged the selected gene region. Subsequently we added two putative miR-SNPs (rs1063192 and rs3217992), i.e. polymorphic variants predicted to alter the binding of one or more microRNAs to their target. The final selection therefore comprised 13 SNPs.
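The pairwise tagging step can be illustrated with a minimal greedy sketch. It is not the tagging software used in the study, which is not named here; the function name, the SNP labels and the r2 and allele frequency values are all hypothetical. Only the two thresholds (r2 of 0.8, minor allele frequency of 0.05) come from the text.

```python
def greedy_tag_snps(snps, maf, r2, min_r2=0.8, min_maf=0.05):
    """Greedy pairwise tagging sketch (hypothetical helper, not the study's tool).

    snps: ordered list of SNP identifiers
    maf:  dict mapping SNP -> minor allele frequency
    r2:   dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
    """
    # Keep only common variants.
    common = [s for s in snps if maf[s] >= min_maf]

    def captured_by(tag):
        # A tag captures itself and every SNP in LD with it at r2 >= min_r2.
        return {s for s in common
                if s == tag or r2.get(frozenset((tag, s)), 0.0) >= min_r2}

    uncovered = set(common)
    tags = []
    while uncovered:
        # Pick the SNP capturing the most still-uncovered SNPs;
        # ties are broken in favour of the earlier SNP in the list.
        best = max(common,
                   key=lambda s: (len(captured_by(s) & uncovered), -common.index(s)))
        tags.append(best)
        uncovered -= captured_by(best)
    return tags


# Hypothetical toy region with five SNPs.
snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
maf = {"snp1": 0.20, "snp2": 0.22, "snp3": 0.25, "snp4": 0.10, "snp5": 0.01}
r2 = {
    frozenset(("snp1", "snp2")): 0.90,
    frozenset(("snp2", "snp3")): 0.85,
    frozenset(("snp3", "snp4")): 0.30,
}
tags = greedy_tag_snps(snps, maf, r2)
```

In this toy example snp5 is dropped by the frequency filter, snp2 captures snp1 and snp3, and snp4 has to be genotyped on its own.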

Sample preparation and genotyping

For each sample DNA was extracted from whole blood using the AllPrep Isolation Kit (Qiagen, Hilden, Germany) or the Qiagen-mini kit (Qiagen, Hilden, Germany), according to the manufacturer’s protocol. Blood was kept frozen before the extraction. Genotyping was performed using KASP (KBioscience, Hoddesdon, UK) and TaqMan (Thermo Fisher Scientific, Waltham, MA, USA) technologies. Genotyping was carried out in 384-well plates using 5 ng of DNA for each sample. The order of DNA samples was randomized on plates in order to ensure that similar numbers of cases and controls were analyzed in each batch. Detection was performed using an ABI PRISM Viia7 sequence detection system with Viia7 software (Applied Biosystems, Foster City, CA, USA). The personnel performing the genotyping were blinded to the status of each subject (i.e. whether the DNA belonged to a case or a control subject). For quality control, duplicates of 10% of the samples were interspersed throughout the plates. In addition, we discarded all the samples that had a call rate < 75%.
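The two quality-control quantities described above, the per-sample call rate and the concordance between duplicates, can be sketched as follows. This is an illustration only: the function names and the encoding of a failed call as `None` are assumptions, and the genotypes are invented.

```python
def call_rate(genotypes):
    """Fraction of SNPs with a successful genotype call for one sample."""
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def filter_samples(samples, min_call_rate=0.75):
    """Keep samples whose call rate is at least min_call_rate (75% in the text)."""
    return {sid: g for sid, g in samples.items() if call_rate(g) >= min_call_rate}


def concordance(first, second):
    """Share of identical calls between a sample and its duplicate,
    counting only SNPs called in both runs."""
    pairs = [(a, b) for a, b in zip(first, second)
             if a is not None and b is not None]
    if not pairs:
        return None  # nothing to compare
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical samples genotyped at 13 SNPs; None marks a failed call.
samples = {
    "sample_A": ["AG"] * 10 + [None] * 3,   # 10/13 called, kept
    "sample_B": ["AG"] * 9 + [None] * 4,    # 9/13 called, discarded
}
kept = filter_samples(samples)
dup_concordance = concordance(["AG", "GG", None, "AA"], ["AG", "GG", "AG", "AG"])
```

With 13 SNPs per sample, the 75% threshold means that a sample needs at least 10 successful calls to be retained.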

Statistical analysis

Using Pearson’s chi-square test we checked the departure from Hardy–Weinberg equilibrium (HWE) for all SNPs in the control subjects of the study. Unconditional logistic regression, computing odds ratios (ORs), 95% confidence intervals (95% CIs) and p values, was used to estimate the association between the genotypes of all polymorphisms and PNET risk. The more common allele among the controls was assigned as the reference category and the co-dominant inheritance model was assessed. All analyses were adjusted for age, gender and geographic origin.
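As an illustration of the HWE check, the following sketch computes Pearson's chi-square statistic with one degree of freedom from the three genotype counts observed in controls. The function name and the genotype counts are hypothetical; the p value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test for departure from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the two homozygous genotypes
    and of the heterozygous genotype (n_ab) in controls.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts.
chi2_eq, p_eq = hwe_chi_square(360, 480, 160)    # exactly at HWE proportions
chi2_dev, p_dev = hwe_chi_square(500, 0, 500)    # no heterozygotes at all
```

A SNP would be flagged as departing from HWE when the p value falls below the chosen threshold (0.05 in the Results).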

Multiple testing

We used two methods to correct for multiple testing: a robust, conservative test (Bonferroni correction) and a Bayesian one. The threshold to declare an association significant with a Bonferroni correction is 0.0038 (0.05/13). Considering the vast a priori knowledge on the region, and on the SNPs in particular, we opted to also use the False Positive Report Probability (FPRP) method. The FPRP was developed by Wacholder and colleagues35 to assess whether an association is ‘noteworthy’ using a Bayesian approach that includes a priori knowledge of the variable under consideration. For associations with moderate to high prior evidence (e.g. association reported in a previous study, convincing functional evidence) the prior probabilities used are in the range 0.10–0.25, whereas lower prior probabilities are employed with decreasing information on the SNP and/or the relation between the SNP and the disease35,37.
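Both corrections can be sketched in a few lines. The FPRP is computed as alpha(1 - prior) / (alpha(1 - prior) + power x prior), where alpha is the observed p value and the power is evaluated for a chosen target OR under a one-sided normal approximation. The function names and the target OR of 2.0 are assumptions of this sketch and are not stated in the text; the OR, CI, prior and number of tests are those reported in this article.

```python
from math import log
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def fprp(odds_ratio, ci_low, ci_high, prior, target_or):
    """False Positive Report Probability sketch for an OR above 1.

    The standard error of log(OR) is recovered from the 95% CI.
    target_or is the OR the study is assumed to be looking for
    (an assumption of this sketch, not a value given in the text).
    """
    norm = NormalDist()
    se = (log(ci_high) - log(ci_low)) / (2 * norm.inv_cdf(0.975))
    z = log(odds_ratio) / se
    p_value = 2 * (1 - norm.cdf(z))            # two-sided observed p value
    # Power to detect target_or when the rejection threshold is the
    # observed log(OR) (one-sided normal approximation).
    power = 1 - norm.cdf((log(odds_ratio) - log(target_or)) / se)
    false_pos = p_value * (1 - prior)
    return false_pos / (false_pos + power * prior)


threshold = bonferroni_threshold(0.05, 13)     # 13 SNPs tested
value = fprp(2.08, 1.05, 4.11, prior=0.25, target_or=2.0)
```

Under these assumed inputs the sketch gives an FPRP of roughly 0.19. Lowering the prior raises the FPRP, which is why the choice of prior has to be justified by earlier evidence on the SNP.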

Bioinformatic analysis

We used several bioinformatic tools to assess the possible functional relevance of the SNP showing the most significant association with risk of developing PNET. RegulomeDB (http://regulome.stanford.edu/)38 and HaploReg v2B39 were used to identify the regulatory potential of the region surrounding the SNP. The GTEx portal web site40 was used to identify potential associations between the SNP and expression levels of nearby genes (eQTL). In addition we used the SNAP software41 to find SNPs in LD with the SNP that showed the strongest association with PNET risk, using a threshold of r2 = 0.70.
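The two LD measures used in this article, r2 and D’, can be computed from allele and haplotype frequencies as in the sketch below. It does not reproduce the SNAP software; the function name and the frequencies are hypothetical, chosen to show how D’ can equal 1 while r2 stays below 1 when the two minor alleles have slightly different frequencies.

```python
def ld_measures(p_a, p_b, p_ab):
    """Pairwise LD between two biallelic SNPs.

    p_a:  frequency of allele A at the first SNP
    p_b:  frequency of allele B at the second SNP
    p_ab: frequency of the haplotype carrying both A and B
    Returns (r2, d_prime).
    """
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


# Hypothetical frequencies: allele A (10%) always occurs on haplotypes
# that also carry allele B (10.8%).
r2, d_prime = ld_measures(p_a=0.10, p_b=0.108, p_ab=0.10)
passes_threshold = r2 >= 0.70                 # r2 threshold used with SNAP
```

In this toy case D’ is 1 because one of the four possible haplotypes is absent, while r2 is lower because the two alleles do not have identical frequencies.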

Additional Information

How to cite this article: Campa, D. et al. Common germline variants within the CDKN2A/2B region affect risk of pancreatic neuroendocrine tumors. Sci. Rep. 6, 39565; doi: 10.1038/srep39565 (2016).
