The age-standardised incidence rate of ovarian cancer in Europe is 13 per 100 000, but its high fatality makes it the fifth leading cause of cancer deaths among women in this region (Ferlay et al, 2013). Among European women aged 55–74 years, the 5-year relative survival rate for ovarian cancer overall is 37.1% (Oberaigner et al, 2012), which, however, would increase up to 90% if the tumours were detected at the localised stages (American Cancer Society, 2014), implying the importance of early detection in reducing the ovarian cancer mortality rate.

In the epidemiological literature, late menopause (Franceschi et al, 1991; Tsilidis et al, 2011a), early menarche (Gong et al, 2013), nulliparity (Adami et al, 1994; Tsilidis et al, 2011a), miscarriage (Braem et al, 2012), hormone replacement therapy (HRT) (Beral et al, 2007; Greiser et al, 2007; Tsilidis et al, 2011b), endometriosis (Pearce et al, 2012; Heidemann et al, 2014), genital powder use (Terry et al, 2013), family history of breast/ovarian cancer (Cramer et al, 1983; Kazerouni et al, 2006), body mass index (BMI; Olsen et al, 2013), cigarette smoking (Faber et al, 2013), and pre-existing diabetes (Lee et al, 2013) have been reported to increase ovarian cancer risk, whereas breast-feeding (Jordan et al, 2012), contraception (oral contraceptives (OCs), intrauterine device (IUD), and tubal ligation) (Collaborative Group on Epidemiological Studies of Ovarian Cancer et al, 2008; Ness et al, 2011; Tsilidis et al, 2011a; Rice et al, 2012; Sieh et al, 2013; Rice et al, 2014), hysterectomy and unilateral ovariectomy (Rice et al, 2012, 2014), and possibly alcohol drinking (Genkinger et al, 2006; Kelemen et al, 2013) have shown protective effects. A risk prediction model based on a selection of these factors could help identify women who reach a minimal risk level to benefit from targeted prevention measures such as cancer screening or use of chemopreventive agents. Recently, Pfeiffer et al (2013) developed an ovarian cancer risk prediction model for US women. Their model included parity, HRT, OC use, and family history of breast/ovarian cancer, and showed modest discriminatory power (concordance statistic=0.59) in external validation.

In the present study, we aimed to build an ovarian cancer risk prediction model for women in Western Europe using data from the European Prospective Investigation into Cancer and Nutrition (EPIC), with particular interest in examining whether the discriminatory power could be improved by considering more epidemiological risk factors.

Material and Methods

The EPIC cohort

The EPIC study is a multicentre, population-based cohort study including >520 000 participants (367 903 women), recruited between 1992 and 2000 in 23 study centres across 10 European countries (Norway, Sweden, Denmark, the United Kingdom, the Netherlands, Germany, France, Spain, Italy, and Greece). The study rationale and design have been described in detail elsewhere (Riboli and Kaaks, 1997; Riboli et al, 2002). At recruitment, all study participants completed questionnaires on lifestyle factors (including smoking history and alcohol drinking) and medical history, and for women, menstrual and reproductive histories, and history of HRT (details in Supplementary Table S1). Menopausal status at the time of enrolment was determined using information on recent menstrual cycles, hysterectomy, ovariectomy, and current HRT. Women aged 46–55 years with no or incomplete data to determine menopausal status were classified as peri-menopausal or unknown menopausal status. Baseline anthropometric data, including height, weight, and body circumference data were measured directly in all study centres except France, Norway, and the Oxford centre in the United Kingdom, where the data were self-reported. All study participants provided written informed consent, and the local ethics review boards of the participating institutions gave approval for the studies.

Prospective ascertainment of disease outcome and vital status

Through the follow-up period (until end of 2009, varying by centre), new cancer cases were identified through record linkage with regional or national cancer registries (Norway, Denmark, the United Kingdom, the Netherlands, Spain, and Italy), or by a combination of active follow-up, linkages to health insurance records and complementary requests of records from pathology registries (Germany, France, and Greece). In the present study, ovarian cancer was defined as ovarian, fallopian tube, and primary peritoneal cancer (ICD-O-2 codes C56.9, C57.0, and C48, respectively). Data on vital status were obtained from death registries at the regional or national level.


We excluded women with the following characteristics: (1) history of cancer (except non-melanoma skin cancer) at recruitment (n=19 707); (2) bilateral ovariectomy (n=10 500); (3) never menstruated (n=61); (4) incomplete follow-up data (n=2205); or (5) did not return the baseline questionnaire (n=509). We further excluded women whose baseline age was <45 years (n=69 215) because the incidence rate of ovarian cancer in this age group is extremely low (Quirk et al, 2002), and some reproductive factors (e.g., parity) are subject to change among younger women. Participants from Norway and Sweden were excluded from analyses, as most of the reproductive risk factors considered in the present study were not collected in these countries. After these exclusions, the final study population consisted of 202 206 women.

Statistical analysis

After reviewing the literature and considering available data, we identified the following factors as candidate predictors: menopausal status, age at menopause, age at menarche, number of full-term pregnancies (FTPs), age at first FTP, duration of breast-feeding, number of miscarriages, unilateral ovariectomy, hysterectomy, HRT, OC use, IUD use, BMI, smoking status, alcohol consumption, and pre-existing diabetes. Information on family history of breast cancer was missing for 53.0% of the study participants, and information on family history of ovarian cancer was not collected, thus these two factors were not considered in our model building process.

Multiple imputation of missing data

Missing data of individual candidate predictors were mostly sporadic and occurred in all participating study centres. However, a complete-data analysis would contain only 120 827 subjects (including 453 cases). To avoid this loss, we imputed the missing values with five-fold multivariate imputation by chained equations (Buuren and Groothuis-Oudshoorn, 2013), resulting in five complete data sets. All statistical analyses were conducted on the five data sets. The imputation quality is summarised in Supplementary Table S2.

Derivation of the risk prediction model

We built a cause-specific competing risk model from two components. As one part, the risk of incident ovarian cancer was estimated with a multivariable Weibull model. This was adjusted for the competing risk of death or other incident cancer with a Gompertz model as a second part in a modular fashion, as suggested by Benichou and Gail (1990).

All models were fit on each of the five imputed data sets, stratified by country. Age was used as the underlying timescale and observations were regarded as left-truncated at recruitment.

The development of incident ovarian cancer was estimated with a multivariable Weibull proportional hazards (PH) model (Collet, 2003) including all candidate predictors. The choice of this parametric model was based upon graphical comparisons with the non-parametric Nelson–Aalen estimate of the cumulative incidence (Supplementary Figure S1 in Supplementary Material). Age at exit was defined as the age at diagnosis of any cancer (except non-melanoma skin cancer), death from non-cancer causes, withdrawal from the cohort, or end of follow-up, whichever came first. Age at menopause, duration of HRT, duration of OC use, number of FTPs (0, 1, 2, 3, and 4+), duration of breast-feeding, and BMI were included in the models as continuous variables. Power transformations of these variables (Royston et al, 1999) did not improve the model fit (Supplementary Table S3). We created indicators for menopausal status, parity, HRT use, and OC use, and centred age at menopause, number of FTPs, age at the first FTP, duration of HRT use, and duration of OC use. Indicators for menopausal status, parity, HRT use, and OC use were coded as interaction parameters to reflect the conditional relations among variables. Information regarding how the candidate predictors were coded is detailed in Supplementary Table S4.

We fit one model including all the described risk factors. To derive a parsimonious model of similar prediction quality, we also performed backward elimination within the Weibull PH model. To preserve all important predictors, we used an inclusion threshold of P=0.1 and finally retained any predictor selected in at least three of the five imputed data sets (Vergouwe et al, 2010). Eventually, a Weibull PH model including the final predictors was fitted on each of the five imputed data sets, and the parameter estimates were combined according to Rubin’s rules (Rubin, 1987). A relative risk score was calculated as the sum of the products of the individual predictor values and the associated linear covariate parameter estimates. We found no violation of the proportionality assumption for the final predictors from the Schoenfeld residuals (Supplementary Table S5).
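
The pooling step described above can be illustrated with a minimal sketch of Rubin's rules for a single coefficient. The function name, the example estimates, and the use of a normal quantile (1.96) instead of Rubin's t-based degrees of freedom are assumptions made for illustration; none of the numbers are EPIC results.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates  : per-imputation point estimates (e.g. log hazard ratios)
    std_errors : per-imputation standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)                      # pooled point estimate
    within = mean(se ** 2 for se in std_errors)  # mean within-imputation variance
    between = variance(estimates)                # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between       # Rubin's total variance
    return q_bar, math.sqrt(total)


# Hypothetical estimates for one predictor from five imputed data sets.
betas = [0.020, 0.022, 0.019, 0.023, 0.021]
ses = [0.008, 0.008, 0.009, 0.008, 0.008]
beta, se = pool_rubin(betas, ses)
hazard_ratio = math.exp(beta)
# Simplification: normal-approximation interval rather than Rubin's t reference.
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

The between-imputation term is inflated by (1 + 1/m), so with only five imputations the extra uncertainty due to missing data is not understated.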

Incident cancers other than primary ovarian cancer and deaths due to non-cancer causes were modelled as competing events with a country-stratified Gompertz model using age as the timescale. Again the parametric model class was decided upon after graphical comparison with the non-parametric Nelson–Aalen estimate, and estimates from the imputed data sets were combined under Rubin’s rules (Supplementary Figure S2 in Supplementary Material). To estimate the absolute 5-year risk of developing ovarian cancer for women at age t in a cause-specific competing risk context, we finally used the following equation adapted from Benichou and Gail (1990):

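$$P(t, t+5; x) = \int_{t}^{t+5} h_1(u; x)\,\exp\left\{-\int_{t}^{u}\left[h_1(\nu; x) + h_2(\nu)\right]\,d\nu\right\}\,du$$

In this equation, h1(u; x) is the multivariable Weibull hazard function for incident ovarian cancer, h2(ν) is the Gompertz hazard function for the competing events, and the exponential term expresses the probability of remaining alive and free of any cancer at age u.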
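
As a rough numerical illustration of this cause-specific absolute risk, the sketch below integrates a Weibull hazard for ovarian cancer against survival from both causes using Simpson's rule. The parameterisations of the Weibull and Gompertz hazards, all parameter names, and the example values are assumptions chosen for illustration; they are not the country-specific estimates of this study and need not match the parameterisation used by the 'eha' package.

```python
import math


def five_year_risk(t, lin_pred, shape, scale, gomp_a, gomp_b,
                   horizon=5.0, steps=200):
    """Absolute risk of the event of interest between ages t and t + horizon.

    Assumed hazards (hypothetical parameterisation):
      h1(u; x) = exp(lin_pred) * (shape / scale) * (u / scale) ** (shape - 1)
      h2(v)    = gomp_a * exp(gomp_b * v), with gomp_b != 0
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    rr = math.exp(lin_pred)

    def h1(u):
        return rr * (shape / scale) * (u / scale) ** (shape - 1)

    def cum_hazard(u):
        # Closed-form integrals of h1 and h2 from age t to age u.
        h1_cum = rr * ((u / scale) ** shape - (t / scale) ** shape)
        h2_cum = (gomp_a / gomp_b) * (math.exp(gomp_b * u) - math.exp(gomp_b * t))
        return h1_cum + h2_cum

    def integrand(u):
        # Hazard of the event times probability of being event-free at age u.
        return h1(u) * math.exp(-cum_hazard(u))

    width = horizon / steps
    total = integrand(t) + integrand(t + horizon)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(t + i * width)
    return total * width / 3


# Hypothetical parameters for a 60-year-old woman with a reference risk score.
risk_with_competing = five_year_risk(60, 0.0, 4.0, 300.0, 1e-4, 0.08)
risk_without_competing = five_year_risk(60, 0.0, 4.0, 300.0, 0.0, 0.08)
```

Setting the competing hazard to zero reduces the integral to 1 − exp(−H1), which gives a convenient check on the numerical integration; with a non-zero competing hazard the absolute risk is always somewhat lower.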

Assessment of model quality with cross-validation

We performed a five-fold cross-validation to obtain estimates of model predictive accuracy from data sections that were kept aside from model fitting. The discriminatory power was evaluated using the overall concordance index (C-index), a modification of the area under the receiver-operating characteristic curve (AUROC) adapted to survival data (Pencina and D’Agostino, 2004). To assess the overall calibration, we calculated the ratio of the expected (E) to the observed (O) number of incident ovarian cancer cases within the first 5 years of follow-up. The expected number of cases was calculated as the sum of the 5-year absolute ovarian cancer risks among women who either remained alive and free of any cancer or developed ovarian cancer or competing events in the first 5 years of follow-up. The 95% confidence interval (CI) of the E/O was calculated as E/O × exp(±1.96 × √(1/O)). We also assessed the agreement between the estimated and observed numbers of cases across each tenth of the predicted 5-year absolute risk using the Hosmer–Lemeshow (H-L) test and provide the corresponding calibration plot. In addition, the calibration slope was calculated from a regression on the cross-validation data, again combining estimates across the multiple imputations with Rubin’s rules.
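
To make the idea of a concordance index for survival data concrete, the following is a simplified pairwise sketch. It is a Harrell-type estimator on a plain follow-up timescale, not the exact overall C of Pencina and D’Agostino (2004), and it ignores the left truncation and age timescale used in this study; the function name and toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Fraction of usable pairs in which the earlier case has the higher score.

    times  : follow-up time until event or censoring
    events : 1 if the event of interest occurred, 0 if censored
    scores : predicted risk scores (higher means higher predicted risk)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i has the event strictly
            # before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data: four subjects, the third one censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    scores=[0.9, 0.4, 0.5, 0.1],
)
print(c_index)  # 0.8 (4 of 5 usable pairs are concordant)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of event times by the risk score.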

Missing data imputation was performed using the ‘mice’ package in R (version 3.0.1, R Foundation for Statistical Computing, Vienna, Austria). The Weibull models and the Gompertz models were fitted with the R package ‘eha’. The other statistical analyses were performed using SAS (version 9.2, SAS Institute, Cary, NC, USA).


Results

After a median follow-up time of 11.7 years (range: 0.1–16.6), 791 primary ovarian cancers were diagnosed (median age at diagnosis: 63.4 years; range: 45.6–98.7), of which 324 were diagnosed within the first 5 years of follow-up. Other primary cancers were diagnosed in 9975 women, and 6386 women died due to non-cancer causes. The distribution of the baseline characteristics of the study population is presented in Table 1. Median age at recruitment was 52.4 years (range: 45.0–77.8). A total of 61.5% of women were postmenopausal, and the median age at menopause was 50 years (range: 12–67). Nearly 30% of women had ever used HRT and more than half (51.9%) of women had taken OCs. A relatively high frequency of missing values was observed for age at menopause (among postmenopausal women, 22.6%), duration of OC use (among ever users, 12.7%), and number of miscarriages (10.1%). Baseline BMI was on average 24.5 kg m−2 (range: 13.0–74.5).

Table 1 Distribution of baseline characteristics among the study population in the EPIC cohort presented as median (range) or percent

Risk factor effects were little affected by model selection, as can be seen from the parameter estimates and hazard ratios (Table 2). After backward selection, the risk factors menopausal status and age at menopause, HRT use and duration of HRT, OC use and duration of OC use, parity and number of FTPs, unilateral ovariectomy, and BMI remained in the model. Older age at menopause, longer duration of HRT use, and higher BMI were associated with an increased ovarian cancer risk, whereas OC use, longer duration of OC use, parity, more FTPs, and unilateral ovariectomy were associated with a reduced risk. Using these predictors’ coefficients, a woman’s ovarian cancer relative risk score was calculated as: RR=exp(0.019 × meno_stat+0.034 × meno_age+0.086 × hrt+0.057 × hrt_dur −0.181 × oc −0.034 × oc_dur −0.308 × parity −0.094 × ftps −0.691 × uni_ovariect+0.021 × bmi).
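
The relative risk score above can be written as a small function. This is only a sketch: it assumes that every predictor has already been coded and centred as described in Supplementary Table S4 (the centring values are not given in the text), and the function and dictionary names are hypothetical.

```python
import math

# Coefficients copied from the relative risk score reported in the text.
COEFFICIENTS = {
    "meno_stat": 0.019,
    "meno_age": 0.034,
    "hrt": 0.086,
    "hrt_dur": 0.057,
    "oc": -0.181,
    "oc_dur": -0.034,
    "parity": -0.308,
    "ftps": -0.094,
    "uni_ovariect": -0.691,
    "bmi": 0.021,
}


def relative_risk_score(predictors):
    """Return exp(sum of coefficient * value) for pre-coded predictors.

    Assumption: values are coded and centred as in Supplementary Table S4.
    """
    missing = set(COEFFICIENTS) - set(predictors)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    linear_predictor = sum(
        COEFFICIENTS[name] * predictors[name] for name in COEFFICIENTS
    )
    return math.exp(linear_predictor)


# A woman at the reference level of every predictor has a score of 1.
reference = {name: 0.0 for name in COEFFICIENTS}
rr_reference = relative_risk_score(reference)

# Same woman, but with a unilateral ovariectomy (indicator set to 1).
rr_ovariectomy = relative_risk_score({**reference, "uni_ovariect": 1.0})
```

With all other predictors at their reference values, the unilateral ovariectomy indicator alone roughly halves the score, since exp(−0.691) is close to 0.5.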

Table 2 Combined HRs and 95% CIs and β-coefficients for risk predictors in the full and in the selected model

The country-specific Weibull and Gompertz parameter estimates are provided as Supplementary Table S6. The deciles of the relative risk score are translated into absolute risk levels across an age range from 45 to 80 years (Figure 1). For women with the lowest and the highest tenth relative risk score, the corresponding 5-year risk at 45 years of age was 0.04% and 0.10%, respectively. The highest risk was observed at 68 years of age, ranging from 0.10% for the lowest tenth risk score to 0.24% for the highest tenth risk score, followed by a decline in older age groups.

Figure 1

Predicted age-specific 5-year absolute ovarian cancer risk at decile cutoffs of the relative risk score. RR=exp(0.019 × meno_stat + 0.034 × meno_age + 0.086 × hrt + 0.057 × hrt_dur −0.181 × oc −0.034 × oc_dur −0.308 × parity −0.094 × ftps −0.691 × uni_ovariect + 0.021 × bmi). Country effect was fixed at the average level.

The C-index from cross-validation was 0.64 (95% CI: 0.58, 0.70) for the full model and 0.64 (95% CI: 0.57, 0.70) for the selected model (Table 3), implying modest discrimination. With respect to calibration, the selected model predicted 293 ovarian cancers to occur during the first 5 years of follow-up, in contrast to the 324 cases that were actually observed (E/O=0.90; 95% CI: 0.81, 1.01). This underestimation can be observed in eight decile groups in the calibration plot of the selected model (Figure 2), although the H-L test gave no evidence for miscalibration in general (P=0.14). With estimates around 0.9, the calibration slope confirmed the above-described tendency towards overfitting, although the CIs did not indicate statistical significance.
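
The reported E/O ratio and its interval can be checked with a few lines of arithmetic. The sketch assumes the common log-scale approximation E/O × exp(±1.96 × √(1/O)) for the 95% CI; the function name is hypothetical, and the inputs are the expected (293) and observed (324) case counts quoted above.

```python
import math


def expected_observed_ratio(expected, observed, z=1.96):
    """Return (E/O, lower limit, upper limit) using a log-scale CI."""
    ratio = expected / observed
    half_width = z * math.sqrt(1 / observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


ratio, lower, upper = expected_observed_ratio(293, 324)
print(f"E/O = {ratio:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
# E/O = 0.90 (95% CI: 0.81, 1.01)
```

Because the interval depends only on the observed count, its relative width is about ±11% here; an upper limit just above 1 is why the underestimation is described as a tendency and not as significant miscalibration.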

Table 3 Discrimination and calibration of full and selected model from five-fold cross-validation
Figure 2

Predicted versus observed number of cases in the first 5 years of the follow-up by risk deciles.

We also externally validated Pfeiffer’s model (Pfeiffer et al, 2013) in a subgroup of EPIC women (n=66 493) who had information on family history of breast cancer, resulting in an overall C-index of 0.55 (95% CI: 0.52, 0.59). In this subset, our model had a C-index of 0.63 (95% CI: 0.51, 0.76) and an E/O ratio of 0.90 (95% CI: 0.74, 1.08). The E/O for Pfeiffer’s model was 1.35 (95% CI: 1.12, 1.63), indicating significant overestimation.


Discussion

We developed a risk prediction model for ovarian cancer based on epidemiological questionnaire data from European women aged 45 years and over. The risk factors menopausal status, age at menopause, duration of HRT, duration of OC use, unilateral ovariectomy, number of FTPs, and BMI were selected as major predictors. Cross-validation indicated that this model’s discriminatory power was modest (C-index: 0.64; 95% CI: 0.57, 0.70). Our model showed acceptable internal calibration, although the absolute risk was somewhat underestimated (E/O=0.90; 95% CI: 0.81, 1.01).

In preventive oncology, an important use of ovarian cancer risk prediction models is to identify women who are at high risk and thus may benefit from targeted interventions, such as population-based screening programs (van Nagell and Hoff, 2013; Menon et al, 2014), or chemoprevention (e.g., use of low-dose, non-steroidal anti-inflammatory drugs (Baandrup et al, 2013; Trabert et al, 2014)). To our knowledge, only two epidemiologic risk models have been developed so far. The more recent one was developed by Pfeiffer et al (2013), incorporating the data that had been used to develop the earlier model by Rosner et al (2005). The predictors in the Pfeiffer model were also included in our model except for family history of breast/ovarian cancer. We observed that including additional predictors (namely, age at menopause, unilateral ovariectomy, and BMI) slightly improved the discriminatory power, compared with the modest discriminatory statistic of 0.60 for the two US models. Application of the Pfeiffer model to the eligible part of our data showed less discriminatory ability than previously reported and substantial overestimation of the 5-year absolute risk, which may have been due to limited generalisability of predictive models between US and European populations. In the presence of the selected risk factors, other risk factors such as smoking, alcohol intake, or prevalent diabetes did not contribute independently to the predictive capacity of our model. The modest performance of the model is comparable to the discriminative potential of existing breast cancer risk-assessment tools, which also have a C-statistic of around 60%. Such models are applied in prevention programs that are generally minimally invasive, including careful surveillance (as in chemoprevention (e.g., USPSTF, 2002)), and are also used as inclusion criteria in prevention trials (e.g., McCaskill-Stevens et al, 2013).

While the comparatively large study population and high number of ovarian cancer cases were strengths of our model development, our study had several limitations that may compromise the predictive accuracy of our model. First of all, the EPIC cohort lacked or had incomplete information on several risk factors for ovarian cancer, first among these family history of breast/ovarian cancer, as ovarian cancer is strongly associated with mutations in BRCA1/2 genes (Powell, 2014). Tubal ligation, endometriosis, talcum use, and other possible risk factors such as polycystic ovary syndrome (Barry et al, 2014) were also missing from our database. We also had no updated information on bilateral ovariectomy after recruitment, which may affect the competing risk estimates and model calibration but should not affect the discriminatory power. Regarding HRT, it has been suggested that oestrogen-only therapy has a stronger association with ovarian cancer risk than oestrogen/progestin therapy (Greiser et al, 2007; Tsilidis et al, 2011b); distinguishing HRT subtypes may therefore improve the predictive accuracy. However, information on the use of specific HRT subtypes was only available for current users at baseline. Finally, the number of ovarian cancer cases in our study was not large enough for more detailed risk modelling by ovarian cancer subtypes, as defined by histology or molecular (‘type-I/type-II’) characteristics (Shih and Kurman, 2004; see Supplementary Table S7 for details on subtypes). In a large US case–control study (1571 cases), a history of endometriosis, parity, previous tubal ligation, and previous hysterectomy showed stronger associations with endometrioid/clear cell ovarian tumours or tumours classified as type-I (Merritt et al, 2013), implying that it might be more appropriate to build subtype-specific risk models for targeted prevention rather than a single model shared by all subtypes.

Our ovarian cancer risk prediction model has not been externally validated. However, we applied cross-validation to estimate model validity from parts of the data that were not used for model building, and this added credibility to our findings. Calibration towards country of origin was necessary in the model building process; this implies that the reported variation in ovarian cancer incidence rates in European countries (Ferlay et al, 2013) persisted even after adjustment for the risk factors considered in the model building process. Thus, model generalisation is difficult and validity may be impaired by population heterogeneity, as can also be seen in the poor performance of the Pfeiffer model on our data, although that model performed well on US cohort data.

An ovarian cancer risk prediction model based on clinical symptoms such as abdominal pain and rectal bleeding was first developed by Goff et al (2007) in a US case–control study, showing promising levels of sensitivity (57% and 80% for early-stage and advanced-stage disease, respectively) and specificity (87% and 90% for women below and above the age of 50 years). A similar model recently developed in a UK study showed fairly high discriminatory accuracy (AUROC=0.84) with regard to 2-year risk in the internal validation (Hippisley-Cox and Coupland, 2012), suggesting that the predictive accuracy of our model could be improved by adding (pre-)clinical symptoms. However, the expanded model may not perform well in early identification of women who are at high risk yet asymptomatic.

In summary, the ovarian cancer risk model we built using non-invasively measured epidemiologic risk factors showed a modest discriminatory power in a Western European population, comparable to previously developed models on US cohort data. Future studies should consider adding informative biomarkers to possibly improve the predictive ability of the model.