Cell cycle genes and ovarian cancer susceptibility: a tagSNP analysis

Background: Dysregulation of the cell cycle is a hallmark of many cancers including ovarian cancer, a leading cause of gynaecologic cancer mortality worldwide. Methods: We examined single nucleotide polymorphisms (SNPs) (n=288) from 39 cell cycle regulation genes, including cyclins, cyclin-dependent kinases (CDKs) and CDK inhibitors, in a two-stage study. White, non-Hispanic cases (n=829) and ovarian cancer-free controls (n=941) were genotyped using an Illumina assay. Results: Eleven variants in nine genes (ABL1, CCNB2, CDKN1A, CCND3, E2F2, CDK2, E2F3, CDC2, and CDK7) were associated with risk of ovarian cancer in at least one genetic model. Seven SNPs were then assessed in four additional studies with 1689 cases and 3398 controls. Association between risk of ovarian cancer and ABL1 rs2855192 found in the original population [odds ratio, OR BB vs AA 2.81 (1.29–6.09), P=0.01] was also observed in a replication population, and the association remained suggestive in the combined analysis [OR BB vs AA 1.59 (1.08–2.34), P=0.02]. No other SNP associations remained suggestive in the replication populations. Conclusion: ABL1 has been implicated in multiple processes including cell division, cell adhesion and cellular stress response. These results suggest that characterization of the function of genetic variation in this gene in other ovarian cancer populations is warranted.

mismatch repair genes), which are rare in the general population and estimated to account for no more than 10-15% of ovarian cancer (Chen et al, 2006; Lancaster et al, 2007). Owing to a consensus that genetic factors play a role in susceptibility to ovarian cancer, ovarian cancer case-control studies targeting specific pathways have emerged (Dicioccio et al, 2004; Auranen et al, 2005; Beesley et al, 2007; Song et al, 2007; Mann et al, 2008; Pearce et al, 2008; Quaye et al, 2008) and some report nominally significant associations with ovarian cancer risk (Buller et al, 1997; Berchuck et al, 2004; Dicioccio et al, 2004; Kelemen et al, 2008; Pearce et al, 2008; Sellers et al, 2008).
Dysregulation of the cell cycle is a hallmark of many cancers (Butt et al, 2008; Nam and Kim, 2008), and control and timing of the cell cycle involve checkpoints and regulatory pathways that ensure the fidelity of DNA replication and chromosome segregation (Elledge, 1996). These processes involve a large collection of key molecules, which are excellent candidates for ovarian cancer susceptibility variants. These include the cyclins (CCNA1, CCNA2, CCNB1, CCNB2, CCND1, CCND2, CCND3, CCNE1, CCNE2, CCNG1, CCNG2), cyclin-dependent kinases (CDKs: CDK2, CDK4, CDK6, CDK7, CDC2), CDK inhibitors (CDKN1A, CDKN1B, CDKN2A, CDKN2B, CDKN2C, CDKN2D) and CDC2 regulators (CDC25A, CDC25B). The catalytic subunit of CDKs is activated by one of many activating subunits, the cyclins. Cyclin levels oscillate during the cell cycle, and cyclin-CDK complexes finely regulate progression through the cell cycle. Inhibitors of CDK promote cell cycle arrest and may affect response to mitogenic stimuli. In addition to the cyclins, CDKs and CDK inhibitors, the E2F family of transcription factors is a critical element, as are the E2F family's dimerization partners TFDP1 and TFDP2, and CUL1 and SKP2, which are involved in the SCF ubiquitin ligase complex. In addition, Rb (and two Rb-like genes) regulates progression of cells from G1 to S to G2 phases. CCND, CCNE and E2F are over-expressed in a variety of cancers, including ovarian cancer (D'Andrilli et al, 2004), and data emanating from an immunohistochemical study of ovarian cancer (Hashiguchi et al, 2004) reveal alteration of G2 in ovarian cancer specimens. The SCF ubiquitin ligases are well-characterized mammalian cullin RING ubiquitin ligases (Frescas and Pagano, 2008), and this complex is an essential element in the CDKNA-CDK2 S phase. SKP2 activates CDK2 and CDK1 by directing the degradation of CDKN1B (p27) and CDKN1A (p21).
SKP2 is also known to target tumour suppressor proteins p21 and CDKN1C, resulting in protein degradation (Frescas and Pagano, 2008). Activation and inactivation of CDKs is an additional crucial process, and dysregulation may be involved in cell transformation. Other important kinases include ABL1, a non-receptor tyrosine kinase that may regulate the CDC2 kinase (Lin et al, 2004), and PLK1, a cell cycle regulated kinase (Yuan et al, 2002).

MATERIALS AND METHODS
This study used a two-stage approach: a discovery set comprising two populations and a replication set comprising four additional populations. SNPs with suggestive statistical significance in the discovery set were carried through to the replication set to validate the results. Details for these sets and SNP selection are provided below.

Discovery set
The discovery population comprised 2051 women participating in an ongoing ovarian cancer case-control study at the Mayo Clinic (MAY) and Duke University (NCO), recruited between June 1999 and March 2006, as described earlier (Sellers et al, 2008). Study protocols were approved by the Institutional Review Boards at both institutions, and study participants provided written informed consent. Cases were women for whom a diagnosis of histologically confirmed primary epithelial ovarian cancer was ascertained within 1 year of consent. Information on known and suspected ovarian cancer risk factors was collected by in-person interviews, including race/ethnicity, menstrual and reproductive history, use of exogenous hormones, medical and surgical history, tobacco use, education level, height and weight 1 year before interview, and family history of breast and ovarian cancer in first- or second-degree relatives. DNA was extracted from fresh peripheral blood using the Gentra AutoPure LS Puregene salting out methodology (Gentra Inc, Minneapolis, MN, USA). For NCO samples with limited DNA available, whole-genome amplification (WGA) was performed using the REPLI-G protocol (Qiagen) with 200 ng genomic DNA as input, yielding high molecular weight DNA and reproducible genotype data. Of the 2051 eligible participants, 1967 (95.6%) were successfully genotyped, including 1770 white, non-Hispanic participants used in this report (829 cases and 941 controls).

Replication sets
Four case -control study populations were included in a replica-

Replication SNP selection and genotyping
SNPs with log-additive P-values <0.05 were considered for replication. In addition, for SNPs not selected under the log-additive model, but with a suggestion of association in either dominant or recessive models, a more stringent threshold was applied (P-value ≤0.03) for inclusion in the replication (statistical methods described below). One of these SNPs (CDK2 rs2069414) could not be genotyped using TaqMan, the replication platform, and another (CCND3 rs3218086) was replaced by rs3218092, which was in linkage disequilibrium (LD; r² = 0.95) with rs3218086 and had earlier been genotyped. Thus, six replication SNPs were genotyped at the Strangeways Research Laboratory using TaqMan designed assays, following the manufacturer's recommended protocols; rs3218092 had been similarly assayed. Each assay used 10 ng DNA in a 5 µl reaction volume with TaqMan universal PCR Master Mix (Applied Biosystems, Warrington, UK); primer and probe sequences, as well as assay conditions, are available on request. TaqMan Allele Discrimination Sequence Detection software (Applied Biosystems) was used to determine genotype calls. SNP call rates were >0.95 and replicate concordance was >0.99.
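The two-tier selection rule described above can be sketched as a small function. This is only an illustration of the stated thresholds, not the authors' code; the function name and argument names are hypothetical, and the P-values in the usage lines are made-up inputs.

```python
def select_for_replication(p_additive, p_dominant=None, p_recessive=None):
    """Return True if a SNP meets the replication thresholds stated in the text.

    Hypothetical helper: the names and the use of None for "model not
    assessed" are assumptions made for this sketch.
    """
    # Primary criterion: log-additive P-value below 0.05.
    if p_additive is not None and p_additive < 0.05:
        return True
    # Secondary criterion: dominant or recessive P-value at or below 0.03.
    secondary = [p for p in (p_dominant, p_recessive) if p is not None]
    return any(p <= 0.03 for p in secondary)


# Illustrative (made-up) inputs:
print(select_for_replication(0.04))                    # passes on the log-additive model
print(select_for_replication(0.20, p_recessive=0.03))  # passes on the stricter threshold
print(select_for_replication(0.20, p_recessive=0.04))  # not selected
```

Under this rule a SNP with a recessive-only P-value of 0.04 is not carried forward, which matches the treatment of the CCNB2 variant reported in the Results.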

Statistical analyses
Discovery set participants were examined initially and restricted to white, non-Hispanic participants. Departures from Hardy-Weinberg equilibrium (HWE) for each SNP were examined using Pearson goodness-of-fit χ² tests or, for SNPs with minor allele frequencies <5%, exact tests (Weir, 1996). One SNP (rs12527393 in E2F3) had HWE P-value <0.001 among controls and was excluded from analysis. Pairwise LD was estimated using r² statistics and graphically displayed using Haploview v14.1 (Barrett et al, 2005). Unconditional logistic regression analysis was used to estimate OR and 95% CI for risk of ovarian cancer associated with each SNP. Primary tests of association assumed a log-additive (multiplicative) genotypic effect, equivalent to the Armitage test for trend. We also performed separate comparisons of women with one copy (OR AB vs AA) and two copies (OR BB vs AA) of the minor allele to women with no copies (reference). Secondary analyses examined dominant and recessive SNP effects. All analyses were adjusted for the design variables of age and geographic region, as well as the following potential confounding variables found to be associated with ovarian cancer risk in the discovery set (P-value <0.05): body mass index, postmenopausal hormone use, oral contraceptive use, parity and age at first birth. Replication association testing was similarly carried out for each SNP using unconditional logistic regression analyses as described above. Associations were examined by site, as well as combined across sites, adjusting for age. Analyses were conducted including and excluding the discovery set participants, adjusting for age and study site. Two sets of P-values were calculated for the replication set: one based on the simple comparison-wise error rate and one accounting for the number of replication tests using a Bonferroni correction.
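As an illustration of the Pearson goodness-of-fit test for HWE mentioned above, the following minimal sketch computes the 1-degree-of-freedom χ² statistic from genotype counts using only the Python standard library. The function name and the example counts are hypothetical, and the exact test used for low-frequency SNPs is not shown.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts and returns (chi2, p_value) with 1 df.
    Hypothetical helper written for illustration only.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p                      # estimated frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic SNP) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up control genotype counts that sit exactly at HWE proportions:
chi2, p_value = hwe_chi_square(360, 480, 160)
print(chi2, p_value)
```

A SNP would be flagged under the rule in the text when this P-value, computed among controls, falls below 0.001.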

RESULTS
Distributions of risk factor information for the discovery set have been described earlier (Sellers et al, 2005; Kelemen et al, 2008). Generally, case-control differences were similar across both discovery sites: overall, cases tended to be more obese, to have lower parity, to report a greater family history of ovarian cancer and to be more likely to have used hormone therapy (NCO site) or oral contraceptives (MAY site). Of the 288 SNPs attempted, 269 (93.4%) passed quality control and were included in the analysis. Eleven variants in nine genes showing significance at P-value <0.05 for adjusted (multivariate) analyses using log-additive (ordinal), recessive or dominant models are shown in Table 2. Assuming a log-additive model, variants in five genes revealed significant associations (P-value <0.05): ABL1 rs2855192, CDKN1A rs776246, CCND3 rs3218086, CDK2 rs2069414 and E2F3 rs7760528. SNPs in two of these genes (ABL1 and E2F3) revealed additional evidence of a recessive effect, whereas SNPs in CDKN1A, CCND3 and CDK2 revealed additional evidence of a dominant association (Table 2). Although our analysis used the log-additive model as the primary analysis, there were two additional SNPs, rs2448343 in CDC2 and rs12656449 in CDK7, with non-significant P-values in the log-additive model, but significant P-values using a recessive model: OR 0.67 (95% CI 0.50-0.89), P = 0.006 and OR 2.91 (95% CI 1.11-8.05), P = 0.03, respectively. CCNB2 rs1486878 (OR 1.50, 95% CI 1.05-2.15) also suggested association only with a recessive model (P = 0.04).
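The three genetic models referred to in these results differ only in how the number of minor alleles carried by each woman is coded as a covariate in the logistic regression. The sketch below shows the conventional coding; the function name and model labels are hypothetical and are not taken from the authors' analysis code.

```python
def code_genotype(minor_allele_count, model):
    """Map a minor-allele count (0, 1 or 2) to the regression covariate.

    Conventional coding, shown for illustration:
      log-additive: 0, 1, 2 (each copy multiplies the odds by the same factor)
      dominant:     0, 1, 1 (one or two copies vs none)
      recessive:    0, 0, 1 (two copies vs none or one)
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "log-additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError("unknown model: " + model)


# Coding of AA, AB and BB genotypes (B = minor allele) under each model:
for model in ("log-additive", "dominant", "recessive"):
    print(model, [code_genotype(g, model) for g in (0, 1, 2)])
```

A SNP such as CDC2 rs2448343 can therefore be significant under the recessive coding alone if only minor-allele homozygotes differ in risk.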
Eight of the 11 significant SNPs were chosen for replication. These included ABL1 rs2855192, CDKN1A rs7767246, CCND3 rs3218086 (which was substituted with rs3218092, r² = 0.95), E2F3 rs7760528 and CDK2 rs2069414 (the latter of which was excluded because of lack of TaqMan assay conversion) based on the log-additive model P-value <0.05, E2F2 rs760607 based on dominant model, P-value of 0.02, and CDC2 rs2448343 and CDK7 rs12656449 based on recessive model, P-value of 0.01 and 0.03, respectively, in the discovery set analysis (Table 2). Table 3 provides results for site-specific and combined replication analyses. For one SNP, rs2855192 in ABL1, the results were similar to those obtained in the discovery sample set, in one of the four sites (STA), with a log-additive increase in risk (P-value = 0.03, Table 3; Figure 1) and also consistent with a recessive effect. Combined analysis of all sites revealed a suggestion of a recessive association (OR for homozygous minor allele genotypes compared with homozygous major allele, OR BB vs AA 1.59, 95% CI 1.08-2.34, P-value = 0.02). Excluding the discovery sites, this association was attenuated (OR 1.40, 95% CI 0.89-2.19, P-value = 0.14) (Table 3). E2F3 rs7760528 was associated with ovarian cancer risk in one replication population (UKO), but did not remain significant in the combined analysis. CDKN1A rs776246 and CDC2 rs2448343 were associated with risk in one population each (MAL and OPS, respectively), but the risk estimates were in the opposite direction to that found in the discovery set and not considered replications. CDC2 rs2448343 was significantly associated using all datasets assuming a recessive model only. None of the replication results remained statistically significant after correction for multiple testing (data not shown).
For SNPs in CCND3, CDK7, E2F2 and E2F3, no replication of the initial result was seen in any of the replication sites, and the combined analysis did not reveal any significant findings (Table 3; Figure 1).
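As a rough consistency check on the combined ABL1 estimates reported above, a Wald P-value can be recovered from an odds ratio and its 95% CI, because the CI is symmetric on the log scale. The sketch below is an approximation and not the authors' adjusted logistic regression; the function names are hypothetical, and the number of tests used in the Bonferroni step (seven, one per replication SNP) is an assumption, since the text does not state the number corrected for.

```python
import math


def wald_p_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald P-value from an OR and its 95% CI.

    The standard error of log(OR) is taken as the CI width on the log
    scale divided by 2 * z_crit. Returns (z, p_value).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Reported combined estimates for ABL1 rs2855192 (BB vs AA):
_, p_all = wald_p_from_ci(1.59, 1.08, 2.34)    # all sites
_, p_repl = wald_p_from_ci(1.40, 0.89, 2.19)   # excluding discovery sites
print(round(p_all, 3), round(p_repl, 3))

# Assumed number of replication tests: 7 (hypothetical, see lead-in).
print(bonferroni(p_all, 7))
```

The recovered values are close to the reported P-values of 0.02 and 0.14, and under the assumed seven tests the adjusted P-value for the combined analysis exceeds 0.05, in line with the statement that no result remained significant after correction.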

DISCUSSION
This study used a two-stage approach to assess the contribution of inherited variation in 39 cell cycle genes to the risk of epithelial ovarian cancer and found some evidence of association at ABL1 rs2855192. Cell cycle dysregulation is a hallmark of the malignant state, and the function of genetic variation in cell cycle genes, including in ovarian cancer, has been reported in a number of studies (Goode et al, 2009); this study extends the prior findings by the inclusion of 26 and 28 additional genes, respectively. In the discovery set, SNPs in several genes were found to be associated with the risk of ovarian cancer; of these, five genes (ABL1, CCND3, CDKN1A, E2F3 and CDK2) were significant in log-additive models (P-value <0.05). This study also found four additional variants in CCNB2, CDC2, CDK7 and E2F2 (rs3328203) to be significant assuming a recessive model only. One additional variant in E2F2 (rs76067) was found to be associated assuming a dominant model, but not in the log-additive model. Replication testing of seven SNPs revealed one SNP in ABL1 that was associated with risk in one of the four replication populations assessed (also from the US) and was significant overall under a recessive model. However, once adjustments for multiple comparisons were made, no significant association was noted for any variant.
ABL1 is a ubiquitously expressed, non-receptor tyrosine kinase, encoding both cytoplasmic and nuclear kinases (Preyer et al, 2007). The ABL1 gene is expressed as either a 6 or 7 kb mRNA transcript, with alternatively spliced first exons spliced to exons 2-11. ABL1 has been implicated in processes of cell differentiation, cell division, cell adhesion and cellular stress response (Wang, 1993; Kharbanda et al, 1995; Lewis et al, 1996; Barila and Superti-Furga, 1998). A t(9;22) translocation, which results in the head-to-tail fusion of the BCR and ABL1 genes, is present in many cases of chronic myelogenous leukaemia (De Keersmaecker and Cools, 2006). The DNA-binding activity of ABL1 tyrosine kinase is regulated by CDC2-mediated phosphorylation, suggesting a cell cycle function for ABL1 (Welch and Wang, 1993). The tyrosine kinase activity of nuclear ABL1 is regulated in the cell cycle through a specific interaction with Rb (Welch and Wang, 1993). When in the cytoplasm, ABL1 responds to growth factor and adhesion signals to regulate F-actin dynamics (Woodring et al, 2003). As acquired resistance to imatinib is associated with mutations in the kinase domain of BCR-ABL that interfere with drug binding, it is possible that a coding SNP in ABL1 modulates the imatinib response (Crossman et al, 2005). The associated SNP, rs2855192, is in intron 1 and its functional aspects are unknown; this SNP was a tagSNP, but did not tag any other SNPs (i.e. it was in a singleton bin with r² < 0.8 with other HapMap SNPs). ABL1 was included in this study because of its role in the cell cycle; however, the cytoplasmic form of ABL1 may have a function in cell adhesion in addition to DNA binding when localized to the nucleus.
In an earlier study, variants in CDKN1B and CDKN2A/2B were found to be associated with ovarian cancer risk in a combined analysis of 3601 cases and 5705 controls. In this study, no variant in either of these genes was significant in the discovery set (Table S2) and so none was carried forward to the replication phase. In another study using imputed genotypes, based on data from five independent ovarian cancer studies (Goode et al, 2009), the signal observed for CDKN1A in the MAY + NCO dataset was not supported by imputation of genotypes in the other four studies, consistent with the replication data in this report. For the CDK2 variant rs2069391, which could not be genotyped in the replication set in this study (discovery set log-additive OR 1.36, CI 1.03-1.78), imputation revealed a signal in the earlier combined analysis (log-additive OR 1.21, CI 1.01-2.09), which included five of the six populations in this study (Goode et al, 2009). A strength of this study was its comprehensive nature in terms of the number of genes and number of tagSNPs and inclusion of putatively functional SNPs. Owing to the large number of tests (269 SNPs × 3 genetic modes of inheritance), caution in interpreting the data is warranted; no adjustment was made for multiple testing because of a lack of complete independence of tests. An additional strength of this study is the inclusion of four replication populations, which improves power (Ioannidis et al, 2001; Morgan et al, 2007), although replication genotyping of only the top 2% of SNPs limited the power of our two-stage approach. In recent meta-analyses and pooled analyses of 161 cancer genetic association studies (Dong et al, 2008), close to one-third of all associations were reported to be statistically significant and many of the false positive associations arose from small studies with multiple subset analyses.
Therefore, we consider this analysis a preliminary screen of the cell cycle pathway and one which indicates modest evidence for association with disease risk for only one gene, ABL1. Additional examination of ABL1 rs2855192, and of other SNPs with suggestive discovery set results, is warranted in additional studies within the ovarian cancer consortium.
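The multiple-testing caution raised in the Discussion (269 SNPs tested under 3 genetic models) can be put in rough numbers. The sketch below assumes independent tests, which the text explicitly notes does not hold here, so the figures are indicative only; the function name is hypothetical.

```python
def multiple_testing_summary(n_snps=269, n_models=3, alpha=0.05):
    """Summarize the testing burden under an independence assumption.

    Returns (n_tests, expected_false_positives, bonferroni_threshold).
    Illustrative only: correlated SNPs and models make the true
    effective number of tests smaller than n_tests.
    """
    n_tests = n_snps * n_models
    # Expected count of nominally significant results if no SNP is associated.
    expected_false_positives = n_tests * alpha
    # Per-test threshold that keeps the family-wise error rate at alpha.
    bonferroni_threshold = alpha / n_tests
    return n_tests, expected_false_positives, bonferroni_threshold


n_tests, expected_fp, threshold = multiple_testing_summary()
print(n_tests, expected_fp, threshold)
```

Under these assumptions roughly 40 nominally significant tests would be expected by chance alone, which is consistent with treating the 11 discovery-stage variants as a preliminary screen.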