Epithelial ovarian cancer (EOC) has a heritable component that remains to be fully characterized. Most identified common susceptibility variants lie in non-protein-coding sequences. We hypothesized that variants in the 3′ untranslated region at putative microRNA (miRNA)-binding sites represent functional targets that influence EOC susceptibility. Here, we evaluate the association between 767 miRNA-related single-nucleotide polymorphisms (miRSNPs) and EOC risk in 18,174 EOC cases and 26,134 controls from 43 studies genotyped through the Collaborative Oncological Gene–environment Study. We identify several miRSNPs associated with invasive serous EOC risk (odds ratio=1.12, P=10−8) mapping to an inversion polymorphism at 17q21.31. Additional genotyping of non-miRSNPs at 17q21.31 reveals stronger signals outside the inversion (P=10−10). Variation at 17q21.31 is associated with neurological diseases, and our collaboration is the first to report an association with EOC susceptibility. An integrated molecular analysis in this region provides evidence for ARHGAP27 and PLEKHM1 as candidate EOC susceptibility genes.
Genome-wide association studies (GWAS) have identified hundreds of genetic variants conferring low-penetrance susceptibility to cancer1. More than 90% of these variants lie in non-protein-encoding sequences, including non-coding RNAs and regions containing regulatory elements (that is, enhancers, promoters, untranslated regions (UTRs))1. The emerging hypothesis is that common variants within non-coding regulatory regions influence expression of target genes, thereby conferring disease susceptibility1.
MicroRNAs (miRNAs) are short non-coding RNAs that regulate gene expression post-transcriptionally by binding primarily to the 3′UTR of target messenger RNA (mRNA), causing translational inhibition and/or mRNA degradation2,3,4. MiRNAs have been shown to have a key role in the development of epithelial ovarian cancer (EOC)2. We5,6 and others7 have found evidence that various miRNA-related single-nucleotide polymorphisms (miRSNPs) are associated with EOC risk, suggesting they may be key disruptors of gene function and contributors to disease susceptibility8,9. However, studies of miRSNPs that affect miRNA–mRNA binding have been restricted by small sample sizes, and therefore have limited statistical power to identify associations at genome-wide levels of significance7,8,9. Large-scale studies and more systematic approaches are warranted to fully evaluate the role of miRSNPs and their contribution to disease susceptibility.
Here, we use the in silico algorithms TargetScan10,11 and PicTar12,13 to predict miRNA:mRNA-binding regions involving genes and miRNAs relevant to EOC, and align identified regions with SNPs in the Single Nucleotide Polymorphism database (dbSNP) (Methods). We then genotype 1,003 miRSNPs (or tagging SNPs with r2>0.80) in 18,174 EOC cases and 26,134 controls from 43 studies from the Ovarian Cancer Association Consortium (OCAC) (Supplementary Table S1). Genotyping was performed on a custom Illumina Infinium iSelect array designed as part of the Collaborative Oncological Gene–environment Study (COGS), an international effort that evaluated 211,155 SNPs and their association with ovarian, breast and prostate cancer risk. Our investigation uncovers 17q21.31 as a new susceptibility locus for EOC, and we provide insights into candidate genes and possible functional mechanisms underlying disease development at this locus.
Seven hundred and sixty-seven of the 1,003 miRSNPs passed genotype quality control (QC) and were evaluated for association with invasive EOC risk; most of the miRSNPs that failed QC were monomorphic (see Methods). Primary analysis of 14,533 invasive EOC cases and 23,491 controls of European ancestry revealed four strongly correlated SNPs (r2=0.99; rs1052587, rs17574361, rs4640231 and rs916793) that mapped to 17q21.31 and were associated with increased risk (per allele odds ratio (OR)=1.10, 95% confidence interval (CI) 1.06–1.13) at a genome-wide level of significance (10−7); no other miRSNPs had associations stronger than P<10−4 (Supplementary Fig. S1). The most significant association was for rs1052587 (P=1.9 × 10−7), and effects varied by histological subtype, with the strongest effect observed for invasive serous EOC cases (OR=1.12, P=4.6 × 10−8) (Table 1). No heterogeneity in ORs was observed across study sites (Supplementary Fig. S2).
Rs1052587, rs17574361 and rs4640231 reside in the 3′UTR of microtubule-associated protein tau (MAPT), KAT8 regulatory NSL complex subunit 1 (KANSL1/KIAA1267) and corticotrophin-releasing hormone receptor 1 (CRHR1) genes, at putative binding sites for miR-34a, miR-130a and miR-34c, respectively. The fourth SNP, rs916793, is perfectly correlated with rs4640231 and lies in a non-coding RNA, MAPT-antisense 1. The 17q21.31 region contains a ~900-kb inversion polymorphism14 (chr 17: 43,624,578–44,525,051, human genome build 37), and all three miRSNPs and the tagSNP are located within the inversion (Fig. 1).
Chromosomes with the non-inverted or inverted segments of 17q21.31, known as haplotype 1 (H1) and haplotype 2 (H2), respectively, represent two distinct lineages that diverged ~3 million years ago and have not undergone any recombination event14. The four susceptibility alleles identified here reside on the H2 haplotype, which is reported to be rare in Africans and East Asians but is common (frequency >20%) and exhibits strong linkage disequilibrium (LD) among Europeans14, consistent with our findings. The H2 haplotype has a frequency of 22% among European women in our primary analysis (Table 1) but only 3.2% and 0.3% among Africans (151 invasive cases, 200 controls) and Asians (716 invasive cases, 1,573 controls), respectively.
To increase genomic coverage at this locus, we evaluated an additional 142 non-miRSNPs at 17q21.31 that were also genotyped as a part of COGS in the same series of OCAC cases and controls. We also imputed genotypes using data from the 1000 Genomes Project15. These approaches identified a second cluster of strongly correlated SNPs (r2>0.90) in a distinct region proximal to the inversion (centred at chromosome 17: 43.5 MB, human genome build 37) that was more significantly associated with the risk of all invasive EOCs (P=10−9) and invasive serous EOC specifically (P=10−10) than the cluster of identified miRSNPs (Fig. 1). Association results and annotation for SNPs in this second cluster are shown in Supplementary Table S2; this cluster includes three directly genotyped SNPs (rs2077606, rs17631303 and rs12942666), with the strongest association observed for rs2077606 among all invasive cases (OR=1.12, 95% CI: 1.08–1.16, P=7.8 × 10−9) and invasive serous cases (OR=1.15, 95% CI: 1.12–1.19, P=3.9 × 10−10). These SNPs were chosen for genotyping in COGS because they had shown evidence of association as modifiers of EOC risk in BRCA1 gene mutation carriers by the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA)16. Several imputed SNPs in strong LD (r2>0.90) were more strongly associated with risk than their highly correlated genotyped SNPs (Supplementary Table S2). This risk-associated region at 17q21.31 is distinct from a previously reported ovarian cancer susceptibility locus at 17q21 (ref. 17); neither the genotyped nor the imputed SNPs we report here are strongly correlated (maximum r2=0.01) with SNPs from the 17q21 locus (spanning 46.2–46.5 MB, build 37).
Genotype clustering was poor for rs2077606, but clustering was good for its correlated SNP, rs12942666 (r2=0.99) and so results for this SNP are presented instead (Supplementary Fig. S3; Table 1). Subgroup analysis revealed marginal evidence of association for rs12942666 with endometrioid (P=0.04), but not mucinous or clear cell EOC subtypes (Table 1), and results were consistent across studies (Supplementary Fig. S4). Rs12942666 is correlated with the top-ranked miRSNP, rs1052587 (r2=0.76) (Fig. 1). To evaluate whether associations observed for rs12942666 and rs1052587 represented independent signals, stepwise logistic regression was used; only rs12942666 was retained in the model. This suggests that the cluster which includes rs12942666 is driving the association with EOC risk that was initially identified through the candidate miRSNPs.
Functional and molecular analyses
To evaluate functional evidence for candidate genes, risk-associated SNPs, and regulatory regions at 17q21.31, we examined a 1-MB region centred on rs12942666 using a combination of locus-specific and genome-wide assays and in silico analyses of publicly available data sets, including The Cancer Genome Atlas (TCGA) Project18 (see Methods). Rs12942666 and many of its correlated SNPs lie within introns of Rho GTPase activating protein 27 (ARHGAP27) or its neighbouring gene, pleckstrin homology domain containing family M (with RUN domain) member 1 (PLEKHM1) (Supplementary Table S2). There are another 15 known protein-coding genes within the region: KIF18B, C1QL1, DCAKD, NMT1, PLCD3, ACBD4, HEXIM1, HEXIM2, FMNL1, C17orf46, MAP3K14, C17orf69, CRHR1, IMP5 and MAPT (Fig. 2a).
To evaluate the likelihood that one or more genes within this region represent target susceptibility gene(s), we first analysed expression, copy number variation and methylation involving these genes in EOC tissues and cell lines (Fig. 2b–g; Supplementary Tables S3 and S4). Most genes showed significantly higher expression (P<10−4) in EOC cell lines versus normal ovarian cancer precursor tissues (OCPTs); ARHGAP27 showed the most pronounced difference in gene expression between cancer and normal cells (P=10−16) (Fig. 2b and Supplementary Table S3). For nine genes, we also found overexpression in primary high-grade serous (HGS) EOC tumours versus normal ovarian tissue in at least one of two publicly available data sets, the TCGA series of 568 tumours18 and/or the Gene Expression Omnibus series GSE18520 data set consisting of 53 tumours19 (Fig. 2c and Supplementary Table S3). Analysis of DNA copy number variation in TCGA revealed frequent loss of heterozygosity in this region rather than copy number gains (Supplementary Fig. 5a–b; Supplementary Methods). We observed significant hypomethylation (P<0.01) in ovarian tumours compared to normal tissues for DCAKD, PLCD3, ACBD4, FMNL1 and PLEKHM1 (Fig. 2d and Supplementary Table S4), which is consistent with the overexpression observed for DCAKD, PLCD3 and FMNL1. Taken together, these data suggest that the mechanism underlying overexpression may be epigenetic rather than based on copy number alterations.
We evaluated associations between genotypes for the top risk SNP rs12942666 (or a tagSNP) and expression of all genes in the region (expression quantitative trait locus (eQTL) analysis) in normal OCPTs, lymphoblastoid cell lines and primary ovarian tumours from TCGA. The only significant eQTL association observed (P<0.05) in normal OCPTs was for ARHGAP27 (P=0.04) (Fig. 2e; Supplementary Table S3). Because rs12942666 was not genotyped in tissues analysed in TCGA, we used data for its correlated SNP rs2077606 (r2=0.99) to evaluate eQTLs in tumour tissues. Rs2077606 genotypes were strongly associated with PLEKHM1 expression in primary HGS-EOCs (P=1 × 10−4) (Fig. 2f; Supplementary Table S3). We also detected associations between rs12942666 (and rs2077606) genotypes and methylation for PLEKHM1 and CRHR1 in primary ovarian tumours (P=0.020 and 0.001, respectively) using methylation quantitative trait locus analyses (Fig. 2g; Supplementary Table S4). Finally, the Catalogue of Somatic Mutations in Cancer database20 showed that nine genes in the region, including PLEKHM1, have functionally significant mutations in cancer, although for most genes mutations were not reported in ovarian carcinomas (Supplementary Table S3).
Taken together, these data suggest that several genes at the 17q21.31 locus may have a role in EOC development. The risk-associated SNPs we identified fall within non-coding DNA, suggesting the functional SNP(s) may be located within an enhancer, insulator or other regulatory element that regulates expression of one of the candidate genes we evaluated. One hypothesis emerging from these molecular analyses is that rs12942666 (or a correlated SNP) mediates regulation of PLEKHM1, a gene implicated in osteopetrosis and endocytosis21, and/or ARHGAP27, a gene that may promote carcinogenesis through dysregulation of Rho/Rac/Cdc42-like GTPases22. To identify the most likely candidate for being the causal variant at 17q21.31, we compared the difference between log-likelihoods generated from un-nested logistic regression models for rs12942666 and each of 198 SNPs in a 1-MB region featured in Supplementary Table S2. As expected, the log-likelihoods were very similar due to the strong LD; no SNPs emerged as having a likelihood ratio >20 for being the causal variant.
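One possible reading of the log-likelihood comparison described above is sketched below in Python, for illustration only. The function names, the SNP labels and the log-likelihood values are hypothetical and are not taken from the study; the sketch simply assumes that each SNP has been fitted in its own logistic regression model and that the ratio of two likelihoods is the exponential of the difference in log-likelihoods.

```python
import math


def likelihood_ratio(loglik_candidate, loglik_reference):
    """Return L_candidate / L_reference given the two model log-likelihoods."""
    return math.exp(loglik_candidate - loglik_reference)


def favoured_over_reference(reference_loglik, candidate_logliks, threshold=20.0):
    """List candidate SNPs whose likelihood exceeds the reference SNP's
    likelihood by more than `threshold`-fold (20 is the cut-off named in the text)."""
    return [
        snp
        for snp, loglik in candidate_logliks.items()
        if likelihood_ratio(loglik, reference_loglik) > threshold
    ]


# Hypothetical log-likelihoods for a reference SNP and two other SNPs.
reference = -1000.0
candidates = {"snpA": -999.5, "snpB": -996.0}
flagged = favoured_over_reference(reference, candidates)
```

Under strong LD the log-likelihoods of correlated SNPs differ very little, so a list like `flagged` would be expected to stay empty, which is the pattern reported in the text.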
To explore the possible functional significance of rs12942666 and strongly correlated variants (r2>0.80), we then generated a map of regulatory elements around rs12942666 using ENCyclopedia of DNA Elements (ENCODE) data and formaldehyde-assisted isolation of regulatory elements sequencing analysis of OCPTs (Supplementary Methods). We observed no evidence of putative regulatory elements coinciding with rs12942666 or correlated SNPs (Fig. 3a). A map of regulatory elements in the entire 1-MB region can be seen in Supplementary Fig. 5c–f. We subsequently used in silico tools (ANNOVAR23, SNPinfo24 and SNPnexus25) to evaluate the putative function of possible causal SNPs (Supplementary Methods). Of 50 SNPs with possible functional roles, more than 30 reside in putative transcription factor binding sites (TFBS) within or near PLEKHM1 or ARHGAP27; 12 SNPs may affect methylation or miRNA binding, and two are non-synonymous coding variants predicted to be of no functional significance (Supplementary Table S2).
As most of the top-ranked 17q21.31 SNPs with putative functions (including two of the top directly genotyped SNPs, rs2077606 and rs17631303) are predicted to lie in TFBS (Supplementary Table S2), we used the in silico tool, JASPAR26, to further examine TFBS coinciding with these SNPs. Two SNPs scored high in this analysis (Supplementary Table S5); the first, rs12946900, lies in a GAGGAA motif and canonical binding site for SPIB, an Ets family member27. Ets factors have been implicated in the development of ovarian cancer and other malignancies28, but little evidence supports a specific role for SPIB in EOC aetiology. The second hit was for rs2077606, which lies in an E-box motif CACCTG at the canonical binding site for ZEB1 (chr. 10p11.2), a zinc-finger E-box binding transcription factor that represses E-cadherin29,30 and contributes to epithelial–mesenchymal transition in EOCs31.
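The idea of scoring a candidate binding site for each allele of a SNP can be illustrated with a generic position weight matrix calculation. The sketch below is not JASPAR's matrix for ZEB1 or its scoring parameters: the base counts are hypothetical toy values built around the CACCTG E-box, the pseudocount and uniform background are assumptions, and the position of the variant within the motif is chosen arbitrarily.

```python
import math

BASES = "ACGT"


def log_odds_pwm(count_columns, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for counts in count_columns:
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append(
            {b: math.log2((counts[b] + pseudocount) / total / background) for b in BASES}
        )
    return pwm


def motif_score(pwm, sequence):
    """Sum the log-odds score of each base in a sequence of motif length."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match motif length")
    return sum(column[base] for column, base in zip(pwm, sequence))


# Hypothetical counts: 17 observations of the consensus base, 1 of each other base.
consensus = "CACCTG"
toy_counts = [{b: (17 if b == c else 1) for b in BASES} for c in consensus]
pwm = log_odds_pwm(toy_counts)

ref_score = motif_score(pwm, "CACCTG")  # allele that preserves the E-box
alt_score = motif_score(pwm, "CACCTA")  # allele that disrupts one position
```

A lower score for one allele suggests weaker predicted binding at that site, which is the kind of evidence that would motivate follow-up of a SNP in a transcription factor binding site.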
We analysed expression of SPIB and ZEB1 in primary ovarian cancers using TCGA data; we found no significant difference in SPIB expression in tumours compared with normal tissues (Fig. 3bi). In contrast, ZEB1 expression was significantly lower in primary HGS-EOCs compared with normal tissues (P=0.005) (Fig. 3bii). We validated this finding using qPCR analysis in 123 EOC and OCPT cell lines (P=8.8 × 10−4) (Fig. 3biii). As rs2077606 lies within an intron of PLEKHM1, this gene is a candidate target for ZEB1 binding at this site. Our eQTL analysis also suggests ARHGAP27 is a strong candidate ZEB1 target at this locus; ARHGAP27 expression is highest in OCPT cell lines carrying the minor allele of rs2077606 (P=0.034) (Fig. 3ci). Although we observed no eQTL associations between rs2077606 and ZEB1 expression in lymphoblastoid cell lines (Fig. 3cii), we found evidence of eQTL between rs2077606 and ZEB1 expression in HGS-EOCs (P=0.045) (Fig. 3ciii). ZEB1 binding at the site of the common allele is predicted to repress gene expression whereas loss of ZEB1 binding conferred by the minor allele may enable expression of ARHGAP27, consistent with the eQTL association in OCPTs (Fig. 3ci). Although these data support a repressor role for ZEB1 in EOC development and suggest ARHGAP27 may be a functional target of rs2077606 (or a correlated SNP) in OCPTs through trans-regulatory interactions with ZEB1, it is important to investigate additional hypotheses as we continue to narrow down the list of target susceptibility genes, SNPs, and regulatory mechanisms that contribute to EOC susceptibility at this locus.
The present study represents the largest, most comprehensive investigation of the association between putative miRSNPs in the 3′UTR and cancer risk. This and the systematic follow-up to evaluate associations with EOC risk for non-miRSNPs in the region identified 17q21.31 as a new susceptibility locus for EOC. Although the miRSNPs identified here may have some biological significance, our findings suggest that other types of variants in non-coding DNA, especially non-miRSNPs at the 17q21.31 locus, are stronger contributors to EOC risk. It is possible, however, that highly significant miRSNPs exist that were not identified in our study because (a) they were not pre-selected for evaluation (that is, they do not reside in a binding site involving miRNAs or genes with known relevance to EOC, or they reside in regions other than the 3′UTR3,4) and/or (b) they were very rare and could not be designed or detected with our genotyping platform and sample size, respectively. Despite these limitations, the homogeneity between studies of varying designs and populations in the OCAC and the genome-wide levels of statistical significance imply that all detected associations are robust. Furthermore, molecular correlative analyses of genes within the region suggest that cis-acting genetic variants influencing non-coding DNA regulatory elements, miRNAs and/or methylation underlie disease susceptibility at the 17q21.31 locus. Finally, these studies point to a subset of candidate genes (that is, PLEKHM1, ARHGAP27) and a transcription factor (that is, ZEB1) that may influence EOC initiation and development.
This novel locus is one of eleven loci now identified that contains common genetic variants conferring low penetrance susceptibility to EOC in the general population17,32,33,34. Genetic variants at several of these loci influence risks of more than one cancer type, suggesting that several cancers may share common mechanisms. For example, alleles at 5p15.33 and 19p13.1 are associated with estrogen-receptor-negative breast cancer and serous EOC susceptibility32,35, and variants at 8q24 are associated with risk of EOC and other cancers17,36. Genetic variation at 17q21.31 is also associated with frontotemporal dementia–spectrum disorders, Parkinson’s disease, developmental delay and alopecia37,38,39,40,41,42. Through COGS, the CIMBA also recently identified 17q21.31 variants that modify EOC risk in BRCA1 and BRCA2 carriers (P<10−8 in BRCA1/2 combined)16. In particular, rs17631303, which is perfectly correlated with rs2077606 and rs12942666, was among the top-ranking SNPs detected by CIMBA16. Consistent with our findings, CIMBA also provide data that suggest EOC risk is associated with altered expression of one or more genes in the 17q21.31 region16. Thus, results from this large-scale collaboration support a role for this locus in both BRCA1/2- and non-BRCA1/2-mediated EOC development. Before these findings can be integrated with variants from other confirmed loci and non-genetic factors to predict women at greatest risk of developing EOC and provide options for medical management of these risks, continued efforts will be needed to fine map the 17q21.31 region and to fully characterize the functional and mechanistic effects of potential causal SNPs in disease aetiology and development.
Forty-three individual OCAC studies contributed samples and data to the COGS initiative. Nine of the 43 participating studies were case-only (GRR, HSK, LAX, ORE, PVD, RMH, SOC, SRO, UKR); cases from these studies were pooled with case–control studies from the same geographic region. The two national Australian case–control studies were combined into a single study to create 34 case–control sets. Details regarding the 43 participating OCAC studies are summarized in Supplementary Table S1. Briefly, cases were women diagnosed with histologically confirmed primary EOC (invasive or low malignant potential), fallopian tube cancer or primary peritoneal cancer ascertained from population- and hospital-based studies and cancer registries. The majority of OCAC cases (>90%) do not have a family history of ovarian or breast cancer in a first-degree relative, and most have not been tested for BRCA1/2 mutations as a part of their parent study. Controls were women without a current or prior history of ovarian cancer with at least one ovary intact at the reference date. All studies had data on disease status, age at diagnosis/interview, self-reported racial group and histologic subtype. Most studies frequency-matched cases and controls on age group and race.
Selection of candidate genes and SNPs
To increase the likelihood of identifying miRSNPs with biological relevance to EOC, we reviewed published literature and consulted public databases to generate two lists of candidate genes: (1) 55 miRNAs reported to be deregulated in EOC tumours compared with normal tissue in at least one study43,44,45,46, and (2) 665 genes implicated in the pathogenesis of EOC through gene expression analyses47,48, somatic mutations49, or genetic association studies50,51. Many genes were identified through the Gene Prospector database51, a web-based application that selects and prioritizes potential disease-related genes using a highly curated, up-to-date database of genetic association studies.
Using each candidate gene list as input, we identified putative sites of miRNA:mRNA binding with the computational prediction algorithms TargetScan version 5.1 (refs 10, 11) and PicTar12,13 (Supplementary Methods). Each algorithm generated start and end coordinates for regions of miRNA binding, and dbSNP52 version 129 was mined to identify SNPs falling within the designated binding regions. Of 3,246 unique miRSNPs that were identified, 1,102 obtained adequate design scores using Illumina’s Assay Design Tool. The majority (n=1,085, 98.5%) of the 1,102 SNPs resided in predicted sites of miRNA binding (and therefore represent miRSNPs), while the remainder (n=17) are tagSNPs (r2>0.80) for miRSNPs that were not designable or had poor-to-moderate design scores. Ninety-nine of the 1,102 SNPs failed during custom assay development, leaving a total of 1,003 SNPs that were designed and genotyped.
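The step of matching SNP positions to predicted binding regions can be sketched as a simple interval lookup. This is a minimal illustration, assuming 1-based inclusive coordinates; the function name, the record layout and all coordinates and identifiers in the example are hypothetical and do not reproduce the TargetScan, PicTar or dbSNP file formats.

```python
from collections import defaultdict


def snps_in_binding_sites(sites, snps):
    """Return (snp_id, mirna) pairs for SNPs lying inside a predicted binding region.

    sites: iterable of (chrom, start, end, mirna), 1-based inclusive coordinates.
    snps:  iterable of (snp_id, chrom, pos).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, mirna in sites:
        by_chrom[chrom].append((start, end, mirna))
    for regions in by_chrom.values():
        regions.sort()  # sort by start so the scan below can stop early

    hits = []
    for snp_id, chrom, pos in snps:
        for start, end, mirna in by_chrom.get(chrom, []):
            if start > pos:
                break  # every later region starts beyond this SNP
            if pos <= end:
                hits.append((snp_id, mirna))
    return hits


# Hypothetical binding regions and SNP positions.
sites = [("17", 100, 107, "miR-x"), ("17", 200, 207, "miR-y"), ("1", 50, 57, "miR-x")]
snps = [("snp1", "17", 103), ("snp2", "17", 150), ("snp3", "1", 57), ("snp4", "2", 50)]
hits = snps_in_binding_sites(sites, snps)
```

For genome-scale inputs an interval tree or a sorted sweep over both lists would scale better; the linear scan is kept here for readability.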
Genotyping and QC
The candidate miRSNPs selected for the current investigation were genotyped using a custom Illumina Infinium iSelect Array as part of the international COGS, an effort to evaluate 211,155 genetic variants for association with the risk of ovarian, breast and prostate cancer. Samples and data were included from several consortia, including OCAC, the Breast Cancer Association Consortium, the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA) and the Prostate Cancer Association Group to Investigate Cancer-Associated Alterations in the Genome (PRACTICAL). Although one of the primary goals of COGS was to replicate and fine-map findings from the pooled GWAS of each consortium, this effort also aimed to genotype candidate SNPs of interest (such as the miRSNPs). The genotyping and QC process has been described recently in our report of OCAC’s pooled GWAS findings34. Briefly, COGS genotyping was conducted at six centres, two of which were used for OCAC samples: McGill University and Génome Québec Innovation Centre (Montréal, Canada) (n=19,806) and Mayo Clinic Medical Genomics Facility (n=27,824). Each 96-well plate contained 250 ng genomic DNA (or 500 ng whole genome-amplified DNA). Raw intensity data files were sent to the COGS data coordination centre at the University of Cambridge for genotype calling and QC using the GenCall algorithm.
One thousand two hundred and seventy-three OCAC samples were genotyped in duplicate. Genotypes were discordant for greater than 40 per cent of SNPs for 22 pairs. For the remaining 1,251 pairs, concordance was greater than 99.6 per cent. In addition, we identified 245 pairs of samples that were unexpected genotypic duplicates. Of these, 137 were phenotypic duplicates and judged to be from the same individual. We used identity-by-state to identify 618 pairs of first-degree relatives. Samples were excluded according to the following criteria: (1) 1,133 samples with a conversion rate (the proportion of SNPs successfully called per sample) of less than 95 per cent; (2) 169 samples with heterozygosity >5 s.d. from the intercontinental ancestry-specific mean heterozygosity; (3) 65 samples with ambiguous sex; (4) 269 samples with the lowest call rate from a first-degree relative pair; (5) 1,686 samples that were either duplicate samples that were non-concordant for genotype or genotypic duplicates that were not concordant for phenotype. A total of 44,308 eligible subjects including 18,174 cases and 26,134 controls were available for analysis.
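As an illustration of the duplicate check described above, the following sketch computes genotype concordance for one pair of duplicate samples. The function name, the genotype encoding and the choice to skip SNPs with a missing call in either sample are assumptions made for this example rather than details given in the text.

```python
def genotype_concordance(calls_a, calls_b):
    """Proportion of SNPs with identical calls in two samples.

    Genotypes are any comparable labels (for example "AA", "AB", "BB");
    None marks a missing call, and SNPs missing in either sample are skipped.
    """
    compared = [
        (a, b) for a, b in zip(calls_a, calls_b) if a is not None and b is not None
    ]
    if not compared:
        raise ValueError("no SNPs with calls in both samples")
    matches = sum(1 for a, b in compared if a == b)
    return matches / len(compared)


# Hypothetical calls for a duplicate pair at four SNPs.
sample_1 = ["AA", "AB", None, "BB"]
sample_2 = ["AA", "AB", "BB", "AB"]
concordance = genotype_concordance(sample_1, sample_2)
```

A pair would then be compared against the thresholds quoted in the text, for example flagging pairs discordant at more than 40 per cent of SNPs.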
The process of SNP selection by the participating consortia has been summarized previously34. In total, 211,155 SNP assays were successfully designed, including 23,239 SNPs nominated by OCAC. Overall, 94.5% of OCAC-nominated SNPs passed QC. SNPs were excluded if: (1) the call rate was <95% with MAF>5% or <99% with MAF<5% (n=5,201); (2) they were monomorphic upon clustering (n=2,587); (3) P-values of HWE in controls were <10−7 (n=2,914); (4) there was greater than 2% discordance in duplicate pairs (n=22); (5) no genotypes were called (n=1,311). Of 1,003 candidate miRSNPs genotyped, 767 passed QC criteria and were available for analysis; the majority of miRSNPs that were excluded were monomorphic (n=158, 67%). Genotype intensity cluster plots were visually inspected for the most strongly associated SNPs.
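The SNP exclusion criteria listed above can be expressed as a small filter, shown here as a sketch. The function name, the argument names and the order in which the criteria are checked are hypothetical; the text does not say how a MAF of exactly 5% was handled, so the stricter 99% call-rate threshold is assumed for it.

```python
def snp_qc_failure(call_rate, maf, hwe_p_controls, duplicate_discordance):
    """Return the first QC criterion a SNP fails, or None if it passes.

    call_rate, maf and duplicate_discordance are proportions in [0, 1];
    hwe_p_controls is the Hardy-Weinberg equilibrium P-value in controls.
    """
    if call_rate == 0:
        return "no genotypes called"
    if maf == 0:
        return "monomorphic"
    # Assumption: MAF of exactly 5% falls under the stricter threshold.
    required_call_rate = 0.95 if maf > 0.05 else 0.99
    if call_rate < required_call_rate:
        return "low call rate"
    if hwe_p_controls < 1e-7:
        return "HWE deviation in controls"
    if duplicate_discordance > 0.02:
        return "duplicate discordance"
    return None


# Hypothetical SNP summaries.
common_ok = snp_qc_failure(0.97, 0.20, 0.4, 0.0)
rare_low_call = snp_qc_failure(0.97, 0.02, 0.4, 0.0)
```

The second example shows why the MAF-dependent threshold matters: a 97% call rate passes for a common SNP but fails for a SNP with MAF below 5%.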
HapMap DNA samples for European (CEU, n=60), African (YRI, n=53) and Asian (JPT+CHB, n=88) populations were also genotyped using the COGS iSelect. We used the program LAMP53 to estimate intercontinental ancestry based on the HapMap (release no. 23) genotype frequency data for these three populations. Eligible subjects with >90 per cent European ancestry were defined as European (n=39,773) and those with greater than 80 per cent Asian or African ancestry were defined as Asian (n=2,382) or African (n=387), respectively. All other subjects were defined as being of mixed ancestry (n=1,766). We then used a set of 37,000 unlinked markers to perform principal components analysis within each major population subgroup. To enable this analysis on very large sample sizes, we used an in-house program written in C++ using the Intel MKL libraries for eigenvectors (available at http://ccge.medschl.cam.ac.uk/software/).
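The ancestry assignment rule described above can be written as a short function, given here as a sketch. The function and argument names are hypothetical; the inputs are assumed to be the estimated ancestry proportions for one subject, as would be produced by a program such as LAMP.

```python
def ancestry_group(european, asian, african):
    """Assign an ancestry group from estimated ancestry proportions (0 to 1)."""
    if european > 0.90:
        return "European"
    if asian > 0.80:
        return "Asian"
    if african > 0.80:
        return "African"
    return "Mixed"


# Hypothetical subjects.
groups = [
    ancestry_group(0.95, 0.03, 0.02),
    ancestry_group(0.10, 0.85, 0.05),
    ancestry_group(0.60, 0.30, 0.10),
]
```

Note the asymmetric thresholds: a subject with 85% European ancestry is classed as mixed, whereas 85% Asian or African ancestry is sufficient for assignment to that group.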
Tests of association
We used unconditional logistic regression treating the number of minor alleles carried as an ordinal variable (log-additive model) to evaluate the association between each SNP and EOC risk. Separate analyses were carried out for each ancestry group. The model for European subjects was adjusted for population substructure by including the first five eigenvectors from the principal components analysis. African- and Asian ancestry-specific estimates were obtained after adjustment for the first two components representing each respective ancestry. Due to the heterogeneous nature of EOC, subgroup analysis was conducted to estimate genotype-specific ORs for serous carcinomas (the predominant histologic subtype) and the three other main histological subtypes of EOC: endometrioid, mucinous and clear cell. Separate analyses were also carried out for each study site, and site-specific ORs were combined using a fixed-effect meta-analysis. The I2 statistic was estimated to quantify the proportion of total variation due to heterogeneity across studies, and the heterogeneity of ORs between studies was tested with Cochran’s Q statistic. The R statistical package ‘r-meta’ was used to generate forest plots. Statistical analysis was conducted in PLINK54.
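The combination of site-specific estimates can be illustrated with a standard inverse-variance fixed-effect meta-analysis, sketched below together with Cochran's Q and I2. This is a generic textbook formulation rather than the code used in the study; the function name and the example log odds ratios and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(log_ors, standard_errors):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios.

    Returns the pooled OR, the standard error of the pooled log OR,
    Cochran's Q and I2 (as a percentage).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2: share of total variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"or": math.exp(pooled), "se": pooled_se, "q": q, "i2": i2}


# Hypothetical per-site estimates: two studies with equal precision.
result = fixed_effect_meta([0.0, 0.2], [0.1, 0.1])
```

Under the null hypothesis of homogeneity, Q is compared against a chi-squared distribution with (number of studies minus 1) degrees of freedom.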
Imputation of genotypes at 17q21.31
To increase genomic coverage, we imputed genotype data for the 17q21.31 region (chr17: 40,099,001–44,900,000, human genome build 37) with IMPUTE2.2 (ref. 55) using phase 1 haplotype data from the January 2012 release of the 1000 Genomes Project15. For each imputed genotype, the expected number of minor alleles carried was estimated (as weights). IMPUTE provides estimated allele dosage for SNPs that were not genotyped and for samples with missing data for directly genotyped SNPs. Imputation accuracy was estimated using an r2 quality metric. We excluded imputed SNPs from analysis where the estimated accuracy of imputation was low (r2<0.3).
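The expected number of minor alleles can be computed directly from the three imputed genotype probabilities, as in the sketch below. The function names and the example probabilities are hypothetical, and the probabilities are assumed to be ordered as (major homozygote, heterozygote, minor homozygote); the r2 cut-off of 0.3 is the one stated in the text.

```python
def expected_minor_allele_dosage(p_hom_major, p_het, p_hom_minor, tol=1e-6):
    """Expected minor-allele count: 0*P(hom major) + 1*P(het) + 2*P(hom minor)."""
    if abs(p_hom_major + p_het + p_hom_minor - 1.0) > tol:
        raise ValueError("genotype probabilities must sum to 1")
    return p_het + 2.0 * p_hom_minor


def keep_imputed_snp(r2, min_r2=0.3):
    """Retain an imputed SNP unless its estimated accuracy is below the cut-off."""
    return r2 >= min_r2


# Hypothetical imputed genotype probabilities for one subject at one SNP.
dosage = expected_minor_allele_dosage(0.1, 0.6, 0.3)
```

The resulting dosage takes values between 0 and 2 and can replace the observed allele count as the predictor in the log-additive logistic regression model.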
Functional studies and in silico analysis of publicly available data sets
We performed the following assays for each gene in the 1-MB region centred on the most significant SNP at the 17q21.31 locus (see Supplementary Methods): gene expression analysis in EOC cell lines (n=51) compared with normal cell lines from OCPTs56, including ovarian surface epithelial cells and fallopian tube secretory epithelial cells (n=73) and CpG island methylation analysis in HGS ovarian cancer (HGS-EOC) tissues (n=106) and normal tissues (n=7). Genes in the region were also evaluated in silico by mining publicly available molecular data generated for primary EOCs and other cancer types, including TCGA analysis of 568 HGS EOCs18, the Gene Expression Omnibus series GSE18520 data set of 53 HGS EOCs19 and the Catalogue Of Somatic Mutations in Cancer database20.
We used these data (1) to compare gene expression between (a) EOC cell lines and normal cell lines and (b) tumour tissue and normal tissue from TCGA, (2) to compare gene methylation status in HGS-EOCs and normal tissue, (3) to conduct gene eQTL analyses to evaluate genotype–gene expression associations in normal OCPTs, lymphoblastoid cells and HGS-EOCs and (4) to conduct methylation quantitative trait locus analyses in HGS-EOCs to evaluate genotype–gene methylation associations. Data from ENCODE57 were used to evaluate the overlap between regulatory elements in non-coding regions and risk-associated SNPs. ENCODE describes regulatory DNA elements (for example, enhancers, insulators and promoters) and non-coding RNAs (for example, miRNAs, long non-coding and piwi-interacting RNAs) that may be targets for susceptibility alleles. However, ENCODE does not include data for EOC-associated tissues, and activity of such regulatory elements often varies in a tissue-specific manner57,58. Therefore, we profiled the spectrum of non-coding regulatory elements in ovarian surface epithelial cells and fallopian tube secretory epithelial cells using a combination of formaldehyde-assisted isolation of regulatory elements sequencing and RNA sequencing (Supplementary Methods).
How to cite this article: Permuth-Wey, J. et al. Identification and molecular characterization of a new ovarian cancer susceptibility locus at 17q21.31. Nat. Commun. 4:1627 doi: 10.1038/ncomms2613 (2013).
Freedman, M. L. et al. Principles for the post-GWAS functional characterization of cancer risk loci. Nat. Genet. 43, 513–518 (2011).
Dahiya, N. & Morin, P. J. MicroRNAs in ovarian carcinomas. Endocr. Relat. Cancer 17, F77–F89 (2010).
Lytle, J. R., Yario, T. A. & Steitz, J. A. Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5′ UTR as in the 3′ UTR. Proc. Natl Acad. Sci. USA 104, 9667–9672 (2007).
Lee, I. et al. New class of microRNA targets containing simultaneous 5′-UTR and 3′-UTR interaction sites. Genome Res. 19, 1175–1183 (2009).
Permuth-Wey, J. et al. LIN28B polymorphisms influence susceptibility to epithelial ovarian cancer. Cancer Res. 71, 3896–3903 (2011).
Permuth-Wey, J. et al. MicroRNA processing and binding site polymorphisms are not replicated in the Ovarian Cancer Association Consortium. Cancer Epidemiol. Biomarkers Prev. 20, 1793–1797 (2011).
Liang, D. et al. Genetic variants in MicroRNA biosynthesis pathways and binding sites modify ovarian cancer risk, survival, and treatment response. Cancer Res. 70, 9765–9776 (2010).
Ryan, B. M., Robles, A. I. & Harris, C. C. Genetic variation in microRNA networks: the implications for cancer research. Nat. Rev. Cancer 10, 389–402 (2010).
Sethupathy, P. & Collins, F. S. MicroRNA target site polymorphisms and human disease. Trends Genet. 24, 489–497 (2008).
Lewis, B. P., Burge, C. B. & Bartel, D. P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20 (2005).
TargetScanHuman, http://genes.mit.edu/targetscan (2009).
Krek, A. et al. Combinatorial microRNA target predictions. Nat. Genet. 37, 495–500 (2005).
PicTar, pictar.mdc-berlin.de/. Accessed January 2009.
Stefansson, H. et al. A common inversion under selection in Europeans. Nat. Genet. 37, 129–137 (2005).
1,000 Genomes, http://www.1000genomes.org/page.php (2012).
Couch, F. J. et al. Genome-wide association study in BRCA1 mutation carriers identifies novel loci associated with breast and ovarian cancer risk. PLoS Genet. doi: 10.1371/journal.pgen.1003212 (2013).
Goode, E. L. et al. A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nat. Genet. 42, 874–879 (2010).
Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011).
Mok, S. C. et al. A gene signature predictive for outcome in advanced ovarian cancer identifies a survival factor: microfibril-associated glycoprotein 2. Cancer Cell 16, 521–532 (2009).
Forbes, S. A. et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 39, D945–D950 (2011).
Tabata, K. et al. Rubicon and PLEKHM1 negatively regulate the endocytic/autophagic pathway via a novel Rab7-binding domain. Mol. Biol. Cell 21, 4162–4172 (2010).
Katoh, Y. & Katoh, M. Identification and characterization of ARHGAP27 gene in silico. Int. J. Mol. Med. 14, 943–947 (2004).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
Xu, Z. & Taylor, J. A. SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies. Nucleic Acids Res. 37, W600–W605 (2009).
Dayem Ullah, A. Z., Lemoine, N. R. & Chelala, C. SNPnexus: a web server for functional annotation of novel and publicly known genetic variants (2012 update). Nucleic Acids Res. 40, W65–W70 (2012).
JASPAR, http://jaspar.cgb.ki.se/ (2012).
Ray, D. et al. Characterization of Spi-B, a transcription factor related to the putative oncoprotein Spi-1/PU.1. Mol. Cell Biol. 12, 4297–4304 (1992).
Fujimoto, J. et al. Clinical implications of expression of ETS-1 related to angiogenesis in metastatic lesions of ovarian cancers. Oncology 66, 420–428 (2004).
Spaderna, S. et al. The transcriptional repressor ZEB1 promotes metastasis and loss of cell polarity in cancer. Cancer Res. 68, 537–544 (2008).
Peinado, H., Olmeda, D. & Cano, A. Snail, Zeb and bHLH factors in tumour progression: an alliance against the epithelial phenotype? Nat. Rev. Cancer 7, 415–428 (2007).
Bendoraite, A. et al. Regulation of miR-200 family microRNAs and ZEB transcription factors in ovarian cancer: evidence supporting a mesothelial-to-epithelial transition. Gynecol. Oncol. 116, 117–125 (2010).
Bolton, K. L. et al. Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nat. Genet. 42, 880–884 (2010).
Song, H. et al. A genome-wide association study identifies a new ovarian cancer susceptibility locus on 9p22.2. Nat. Genet. 41, 996–1000 (2009).
Pharoah et al. GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer. Nat. Genet. doi: 10.1038/ng.2564 (2013).
Couch, F. J. et al. Common variants at the 19p13.1 and ZNF365 loci are associated with ER subtypes of breast cancer and ovarian cancer risk in BRCA1 and BRCA2 mutation carriers. Cancer Epidemiol. Biomarkers Prev. 21, 645–657 (2012).
Ghoussaini, M. et al. Multiple loci with different cancer specificities within the 8q24 gene desert. J. Natl. Cancer Inst. 100, 962–966 (2008).
Coppola, G. et al. Evidence for a role of the rare p.A152T variant in MAPT in increasing the risk for FTD-spectrum and Alzheimer's diseases. Hum. Mol. Genet. 21, 3500–3512 (2012).
Ghidoni, R. et al. The H2 MAPT haplotype is associated with familial frontotemporal dementia. Neurobiol. Dis. 22, 357–362 (2006).
Koolen, D. A. et al. A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism. Nat. Genet. 38, 999–1001 (2006).
Tobin, J. E. et al. Haplotypes and gene expression implicate the MAPT region for Parkinson disease: the GenePD Study. Neurology 71, 28–34 (2008).
Li, R., Brockschmidt, F. F., Kiefer, A. K., Stefansson, H. & Nyholt, D. R. Six novel susceptibility loci for early-onset androgenetic alopecia and their unexpected association with common diseases. PLoS Genet. 8, e1002746 (2012).
Edwards, T. L. et al. Genome-wide association study confirms SNPs in SNCA and the MAPT region as common risk factors for Parkinson disease. Ann. Hum. Genet. 74, 97–109 (2010).
Dahiya, N. et al. MicroRNA expression and identification of putative miRNA targets in ovarian cancer. PLoS ONE 3, e2436 (2008).
Iorio, M. V. et al. MicroRNA signatures in human ovarian cancer. Cancer Res. 67, 8699–8707 (2007).
Nam, E. J. et al. MicroRNA expression profiles in serous ovarian carcinoma. Clin. Cancer Res. 14, 2690–2695 (2008).
Yang, H. et al. MicroRNA expression profiling in human ovarian cancer: miR-214 induces cell survival and cisplatin resistance by targeting PTEN. Cancer Res. 68, 425–433 (2008).
Tothill, R. W. et al. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome. Clin. Cancer Res. 14, 5198–5208 (2008).
Zorn, K. K. et al. Gene expression profiles of serous, endometrioid, and clear cell subtypes of ovarian and endometrial cancer. Clin. Cancer Res. 11, 6422–6430 (2005).
Landen, C. N. Jr, Birrer, M. J. & Sood, A. K. Early events in the pathogenesis of epithelial ovarian cancer. J. Clin. Oncol. 26, 995–1005 (2008).
Fasching, P. A. et al. Role of genetic polymorphisms and ovarian cancer susceptibility. Mol. Oncol. 3, 171–181 (2009).
Yu, W., Wulf, A., Liu, T., Khoury, M. J. & Gwinn, M. Gene Prospector: an evidence gateway for evaluating potential susceptibility genes and interacting risk factors for human diseases. BMC Bioinform. 9, 528 (2008).
NCBI dbSNP database, http://ncbi.nlm.nih.gov/SNP (2009).
Sankararaman, S., Sridhar, S., Kimmel, G. & Halperin, E. Estimating local ancestry in admixed populations. Am. J. Hum. Genet. 82, 290–303 (2008).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 (Bethesda) 1, 457–470 (2011).
Lawrenson, K. et al. Senescent fibroblasts promote neoplastic transformation of partially transformed ovarian epithelial cells in a three-dimensional model of early stage ovarian cancer. Neoplasia 12, 317–325 (2010).
Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
Heintzman, N. D. et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112 (2009).
We thank all the individuals who participated in this research and all the researchers, clinicians and administrative staff who have made possible the many studies contributing to this work. In particular, we thank D. Bowtell, P. Webb, A. deFazio, D. Gertig, A. Green, P. Parsons, N. Hayward and D. Whiteman (AUS); D.L. Wachter, S. Oeser and S. Landrith (BAV); G. Peuteman, T. Van Brussel and D. Smeets (BEL); the staff of the genotyping unit, S. LaBoissière and F. Robidoux (McGill University and Génome Québec Innovation Centre); U. Eilber and T. Koehler (GER); L. Gacucova (HMO); P. Schuermann, F. Kramer, T.-W. Park-Simon, K. Beer-Grondke and D. Schmidt (HJO); G.L. Keeney, C. Hilker and J. Vollenweider (MAY); the state cancer registries of AL, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA and WYL (NHS); L. Paddock, M. King, U. Chandran, A. Samoila and Y. Bensman (NJO); M. Insua and R. Evey (Moffitt); The Sherie Hildredth Ovarian Cancer Foundation (ORE); M. Sherman, A. Hutchinson, N. Szeszenia-Dabrowska, B. Peplonska, W. Zatonski, A. Soni, P. Chao and M. Stagner (POL); C. Luccarini, P. Harrington, the SEARCH team and ECRIC (SEA); the Scottish Gynaecological Clinical Trials Group and SCOTROC1 investigators (SRO); W.-H. Chow, Y.-T. Gao, G. Yang and B.-T. Ji (SWH); I. Jacobs, M. Widschwendter, E. Wozniak, N. Balogun, A. Ryan and J. Ford (UKO); M. Notaridou (USC); C. Pye (UKR); and V. Slusher (U19). The COGS project is funded through a European Commission’s Seventh Framework Programme grant (agreement number 223175—HEALTH-F2-2009-223175). The Ovarian Cancer Association Consortium is supported by a grant from the Ovarian Cancer Research Fund, thanks to donations by the family and friends of Kathryn Sladek Smith (PPD/RPCI.07).
The scientific development and funding for this project were in part supported by the US National Cancer Institute (R01-CA-114343 and R01-CA114343-S1) and the Genetic Associations and Mechanisms in Oncology (GAME-ON): an NCI Cancer Post-GWAS Initiative (U19-CA148112). This study made use of the data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk/. Funding for the project was provided by the Wellcome Trust under award 076113. The results published here are in part based upon data generated by The Cancer Genome Atlas Pilot Project established by the National Cancer Institute and National Human Genome Research Institute. Information about TCGA, and the investigators and institutions who constitute the TCGA research network, can be found at http://cancergenome.nih.gov/. D.F.E. is a Principal Research Fellow of Cancer Research UK. G.C.-T. and P.M.W. are supported by the National Health and Medical Research Council. B.K. holds an American Cancer Society Early Detection Professorship (SIOP-06-258-01-COUN). L.E.K. is supported by a Canadian Institutes of Health Research Investigator award (MSH-87734). M.G. acknowledges NHS funding to the NIHR Biomedical Research Centre. A.C.A. is a Cancer Research UK Senior Cancer Research Fellow.
Funding of the constituent studies was provided by the American Cancer Society (CRTG-00-196-01-CCE); the California Cancer Research Program (00-01389V-20170, N01-CN25403, 2II0200); the Canadian Institutes for Health Research (MOP-86727); Cancer Council Victoria; Cancer Council Queensland; Cancer Council New South Wales; Cancer Council South Australia; Cancer Council Tasmania; Cancer Foundation of Western Australia; the Cancer Institute of New Jersey; Cancer Research UK (C490/A6187, C490/A10119, C490/A10124, C536/A13086, C536/A6689); the Celma Mastry Ovarian Cancer Foundation; the Danish Cancer Society (94-222-52); the Norwegian Cancer Society, Helse Vest, the Norwegian Research Council; ELAN Funds of the University of Erlangen-Nuremberg; the Eve Appeal; the Helsinki University Central Hospital Research Fund; Imperial Experimental Cancer Research Centre (C1312/A15589); the Ovarian Cancer Research Fund; Nationaal Kankerplan of Belgium; Grant-in-Aid for the Third Term Comprehensive 10-Year Strategy for Cancer Control from the Ministry of Health Labour and Welfare of Japan; the L & S Milken Foundation; the Radboud University Nijmegen Medical Centre; the Polish Ministry of Science and Higher Education (4 PO5C 028 14, 2 PO5A 068 27); the Roswell Park Cancer Institute Alliance Foundation; the US National Cancer Institute (K07-CA80668, K07-CA095666, K07-CA143047, K22-CA138563, N01-CN55424, N01-PC067001, N01-PC035137, P01-CA017054, P01-CA087696, P50-CA105009, P50-CA136393, R01-CA014089, R01-CA016056, R01-CA017054, R01-CA049449, R01-CA050385, R01-CA054419, R01-CA058598, R01-CA058860, R01-CA061107, R01-CA061132, R01-CA063678, R01-CA063682, R01-CA064277, R01-CA067262, R01-CA071766, R01-CA074850, R01-CA076016, R01-CA080742, R01-CA080978, R01-CA087538, R01-CA092044, R01-095023, R01-CA106414, R01-CA122443, R01-CA136924, R01-CA112523, R01-CA114343, R01-CA126841, R01-CA149429, R03-CA113148, R03-CA115195, R37-CA070867, R01-CA83918, U01-CA069417, U01-CA071966, P30-CA15083, PSA 042205, and Intramural research funds); the NIH/National Center for Research Resources/General Clinical Research Grant (M01-RR000056); the US Army Medical Research and Materiel Command (DAMD17-98-1-8659, DAMD17-01-1-0729, DAMD17-02-1-0666, DAMD17-02-1-0669, W81XWH-10-1-02802); the Department of Defense Ovarian Cancer Research Program (W81XWH-07-1-0449); the National Health and Medical Research Council of Australia (199600 and 400281); the German Federal Ministry of Education and Research of Germany Programme of Clinical Biomedical Research (01 GB 9401); the state of Baden-Württemberg through Medical Faculty of the University of Ulm (P.685); the German Cancer Research Center; Pomeranian Medical University; the Minnesota Ovarian Cancer Alliance; the Mayo Foundation; the Fred C. and Katherine B. Andersen Foundation; the Malaysian Ministry of Higher Education (UM.C/HlR/MOHE/06) and Cancer Research Initiatives Foundation; the Lon V. Smith Foundation (LVS-39420); the Oak Foundation; the OHSU Foundation; the Mermaid I project; the Rudolf-Bartling Foundation; the UK National Institute for Health Research Biomedical Research Centres at the University of Cambridge, Imperial College London, University College Hospital ‘Womens Health Theme’ and the Royal Marsden Hospital; WorkSafeBC.
The authors declare no competing financial interests.
Lists of consortium members appear in Supplementary Notes 1, 2 and 3.