Main

Ovarian cancer is a common cause of cancer death in women from developed countries (Jemal et al, 2011). Most women have advanced stage disease at initial presentation and, despite debulking surgery and platinum-based chemotherapy, the majority of patients with advanced disease relapse within 2 years (McGuire et al, 1996; Piccart et al, 2000; du Bois et al, 2003; Ozols et al, 2003) and are offered more chemotherapy.

Recurrent ovarian cancer comprises a heterogeneous group of patients, with a wide variation in response to chemotherapy as well as in progression-free and overall survival (ICON and AGO collaborators, 2003; Pfisterer et al, 2006; Pujade-Lauraine et al, 2010). For over two decades, relapse-free interval or the platinum-chemotherapy-free interval has been used to predict the likelihood of response to subsequent chemotherapy and guide treatment decisions (Blackledge et al, 1989; Gore et al, 1990; Hoskin et al, 1991; Markman et al, 1991). Patients who initially respond to platinum-based chemotherapy and who subsequently have a relapse-free interval of 6 months or longer are classified as ‘platinum-sensitive’ (Thigpen et al, 1994). Most of these patients are then offered further platinum-based combination chemotherapy.

In addition to using time to recurrence to predict platinum sensitivity, a number of other factors may also be important predictors of clinical outcomes in patients with recurrent ovarian cancer. In a study of patients previously treated with platinum chemotherapy, tumour size, serous histology and number of organs and anatomic sites of disease involvement were more important than presumed ‘platinum sensitivity’ as predictors of response to subsequent chemotherapy (Eisenhauer et al, 1997). Other chemotherapy trials have identified age, performance status, CA-125, residual tumour volume after surgical cytoreduction and ascites to be independent predictors of progression-free survival (PFS) and overall survival (Cantu et al, 2002; Gordon et al, 2004; Pfisterer et al, 2006; Ferrero et al, 2007).

No tool is currently available to combine all these putative prognostic predictors into a summary measure for prediction of individual patient outcome. In this study, we aimed to develop and validate a prognostic nomogram that uses widely available pre-treatment clinical and laboratory data to improve our ability to predict PFS in patients with recurrent ovarian cancer receiving platinum-based chemotherapy. Our goal was to develop a tool that could be used to stratify the patients according to risk in clinical trials by examining multiple prognostic factors simultaneously. We expect that a better prognostic classification will lead to more precise identification and selection of patients for entry as well as improved stratification in prospective clinical trials. This tool could also be used to better inform patients with recurrent ovarian cancer regarding likely outcomes with further chemotherapy.

Materials and methods

Study population

The training cohort consisted of 976 patients enrolled in the CALYPSO study (Pujade-Lauraine et al, 2010) between April 2005 and September 2007. These ‘platinum-sensitive’ patients had been treated for ovarian cancer with first- or second-line platinum-based chemotherapy, including taxane therapy. Patients were randomised to CPLD (carboplatin and pegylated liposomal doxorubicin) or to CP (carboplatin and paclitaxel).

The validation cohort comprised 366 ‘platinum-sensitive’ patients enrolled in the AGO-OVAR 2.5 Study (Pfisterer et al, 2006) between September 1999 and April 2002. These patients had previously received only platinum-based chemotherapy; prior taxane treatment was not required for eligibility. Patients were randomised to CG (carboplatin and gemcitabine) or C (carboplatin) alone.

In both studies, patients were treated with a planned total of six cycles of chemotherapy in the absence of progressive disease or unacceptable toxicity. Patients benefiting from treatment could continue beyond six cycles at the discretion of the investigators.

Statistical method

The primary end point was PFS by the RECIST (Response Evaluation Criteria In Solid Tumors) criteria (Therasse et al, 2000). Twenty-five variables related to baseline patient and disease characteristics, haematological, biochemical and tumour marker parameters, past treatments and trial chemotherapy received were examined univariately in the training cohort. Multivariable Cox proportional-hazards analysis (Cox, 1972), stratified according to treatment received in the trial, was performed with backward stepwise selection, and only statistically significant variables (P<0.05) were retained. The variables were assigned points on a scale for constructing the nomogram.

Patients were grouped by quartile on the basis of the predicted probability of PFS. The first quartile (score ≤30) formed the good-prognosis group. The middle two quartiles (31 ≤ score ≤ 68) were combined to form the intermediate-prognosis group. The final quartile (score ≥69) formed the poor-prognosis group.
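The grouping rule can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: it takes the cutoffs to be ≤30, 31 to 68 and ≥69, assumes the total score is an integer between 0 and 100, and the function name `prognosis_group` is hypothetical.

```python
def prognosis_group(score: int) -> str:
    """Map an integer nomogram score (0-100) to a prognosis group.

    Assumed cutoffs: <=30 good, 31-68 intermediate, >=69 poor.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score <= 30:
        return "good"
    if score <= 68:
        return "intermediate"
    return "poor"


if __name__ == "__main__":
    for s in (12, 45, 80):  # hypothetical scores, for illustration only
        print(s, prognosis_group(s))
```

Because the cutoffs were derived from quartiles of the training cohort, the proportions falling into each group in another cohort need not be 25%, 50% and 25%.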

We validated the nomogram using several approaches. Harrell's concordance index (C-index), a measure of discrimination equivalent to the area under the receiver-operating characteristic curve for survival data, was calculated with the model refitted 200 times by bootstrap resampling in the training cohort. The C-index estimates the proportion of all usable pairs of patients in which the patient with the longer predicted PFS is the one who actually remained progression-free for longer (Harrell et al, 1996). The nomogram was then applied to patients in the validation cohort, and the C-index was calculated and compared with that of the training cohort. In both cohorts, we also compared the C-index of the nomogram with that of a prognostic classification based on the platinum-chemotherapy-free interval alone. Calibration, which refers to the ability of the nomogram's predictions to match the observed PFS across the entire spread of the data in the validation cohort, was examined visually by comparing the actual vs predicted PFS for each of the three prognosis groups. Tests of goodness-of-fit were used to compare observed and predicted PFS over deciles of the risk score. A significant P-value for this statistic indicates lack of calibration of the model (i.e., a significant difference between expected and observed PFS) (May et al, 2003).
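For readers unfamiliar with the C-index, the following minimal Python sketch shows the pairwise counting behind it. It is a simplified illustration under stated assumptions, not the software used in the study: pairs with tied event times are ignored, ties in the risk score count as half-concordant, and the function name and toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(time: Sequence[float],
                    event: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Proportion of usable pairs in which the higher-risk patient progresses first.

    time  : follow-up time for each patient
    event : 1 if progression was observed, 0 if censored
    risk  : risk score (higher means shorter predicted PFS)
    """
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if time[i] < time[j]:  # patient i progressed before j's last follow-up
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


if __name__ == "__main__":
    # Toy data, for illustration only.
    times = [5, 8, 12, 20]
    events = [1, 1, 0, 1]
    scores = [70, 40, 50, 10]
    print(harrell_c_index(times, events, scores))
```

A value of 0.5 corresponds to chance ordering and 1.0 to perfect ordering. The bootstrap correction described above would repeat the model fitting on resampled data and subtract the average optimism; that step is omitted here.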

Five pre-treatment variables (platinum-free interval, serum CA-125, size of the largest tumour, number of organ sites of metastasis and serum white blood count) were identified as significant in the training cohort. An organ site is defined as the presence of metastasis within the organ, regardless of the extent of metastasis within that organ. Surface involvement is considered as a peritoneal site. Information on CA-125 was not available from the AGO-OVAR 2.5 Study. We therefore validated the nomogram as follows: the model was refitted in the training cohort with the four pre-treatment variables other than CA-125, and the resulting risk stratification was then applied to patients in the validation cohort.

Results

The median PFS was significantly shorter in the validation cohort than in the training cohort (log-rank P<0.0001) (Table 1; Figure 1). Women in the validation cohort had poorer performance status (ECOG ≥1) than those in the training cohort. More patients had a tumour size ≥5 cm in the validation cohort, and fewer patients had received second-line treatment. The two cohorts did not differ significantly otherwise.

Table 1 Baseline characteristics of patients in the training and validation cohorts
Figure 1

PFS in the training (CALYPSO) and validation (AGO-OVAR 2.5) cohorts.

Development of the nomogram and internal validation

In the training cohort, the median follow-up duration was 17.5 months (range 0–37.5). The proportion of women alive and progression-free at 12 months was 44.3% (95% CI (confidence interval), 40.9–47.6%) (Figure 1). In multivariable analysis based on 955 patients with complete information (Table 2), the platinum-free interval, the size of the largest tumour, serum CA-125 level, the number of organ sites of metastasis and serum white blood count were significant predictors of PFS.

Table 2 Multivariable proportional hazard regression model, stratified for treatment received, for predicting progression-free survival using data from the training cohort (CALYPSO)

A point scale was used to assign points to these five variables in the nomogram based on the multivariable model (Table 2). The sum of the points assigned for each variable was rescaled to range from 0 to 100 (Figure 2). Estimated median PFS and probability of PFS at 12 months are obtained by drawing a vertical line from the total points axis straight down to the outcome axes. In this nomogram, the size of the largest tumour contributed up to 30 points to the variation in PFS relative to all of the other predictors. The relative contributions of the other variables were platinum-free interval (27 points), CA-125 (21 points), number of organ sites of metastasis (12 points) and serum white blood count (10 points).
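One common way to obtain such a 0 to 100 scale is to divide each variable's largest possible contribution to the Cox linear predictor by the sum of those maxima. The Python sketch below illustrates that rescaling. It is an assumption about the general technique, not a description of the authors' procedure, and the input values are placeholders chosen only so that the output reproduces the points reported above; they are not the coefficients in Table 2.

```python
from typing import Dict


def rescale_to_points(max_effects: Dict[str, float]) -> Dict[str, float]:
    """Rescale each variable's maximum effect so that the maxima sum to 100."""
    total = sum(max_effects.values())
    if total <= 0:
        raise ValueError("maximum effects must sum to a positive number")
    return {name: 100.0 * effect / total for name, effect in max_effects.items()}


# Placeholder maximum effects on the log-hazard scale (hypothetical values).
placeholder_effects = {
    "tumour_size": 0.60,
    "platinum_free_interval": 0.54,
    "ca125": 0.42,
    "organ_sites": 0.24,
    "white_blood_count": 0.20,
}

if __name__ == "__main__":
    for name, pts in rescale_to_points(placeholder_effects).items():
        print(f"{name}: {pts:.0f} points")
```

A patient's total score is then the sum of the points read off for each of the five variables, with 100 corresponding to the least favourable value of every predictor.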

Figure 2

Nomogram for predicting PFS in platinum-sensitive recurrent ovarian cancer. Points are assigned for tumour size, platinum-chemotherapy-free interval, CA-125, number of organ sites of metastasis and serum white blood cell count by drawing a line upward from the corresponding values to the ‘Points’ line. The sum of these five points, plotted on the ‘Total points’ line, corresponds to predictions of median PFS and probability of PFS at 12 months. A web-based electronic version of this nomogram is available at http://roconline.ctc.usyd.edu.au.

The model showed good discrimination (C-index, 0.645; bootstrap-corrected, 0.641). The good-prognosis group (score ≤30) comprised 229 patients (24%) with a median PFS of 16.1 months (95% CI, 13.9–20.9) and 1-year PFS of 64.6% (95% CI, 57.4–70.8%). The intermediate-prognosis group (31 ≤ score ≤ 68) comprised 502 patients (53%) with a median PFS of 11.8 months (95% CI, 10.6–12.0) and 1-year PFS of 45.6% (95% CI, 40.9–50.3%). The poor-prognosis group (score ≥69) comprised 224 patients (23%) with a median PFS of 9.0 months (95% CI, 8.9–9.1) and 1-year PFS of 20.9% (95% CI, 15.5–26.8%). Figure 3A also illustrates the discriminatory value of the nomogram according to the three prognosis groups (log-rank P<0.0001).

Figure 3

PFS stratified according to prognosis groups in (A) training cohort, (B) training cohort (without CA-125) and (C) validation cohort.

When CA-125 was omitted from the multivariable model, the C-index for the nomogram decreased to 0.631 (bootstrap-corrected, 0.626). The good-prognosis (222 patients), intermediate-prognosis (486 patients) and poor-prognosis (247 patients) groups had median PFS of 17.4, 11.7 and 9.1 months, respectively (Figure 3B; log-rank P<0.0001).

External validation of the nomogram

In the validation cohort, the median follow-up was 23.9 months (range 0–44.2). Information on CA-125 was unavailable from this cohort. The C-index, when the nomogram (without CA-125) was applied, was 0.594.

Only 53 patients (15.6%) were classified with this nomogram as having good prognosis and 221 (65.0%) and 66 (19.4%) patients were classified as having intermediate and poor prognosis, respectively. Figure 3C shows the good discrimination between the three prognosis groups (log-rank P=0.008). The good-prognosis group had a median PFS of 9.1 months (95% CI, 6.5–11.0) and 1-year PFS of 32.2% (95% CI, 19.9–45.2%). The intermediate-prognosis group had a median PFS of 7.8 months (95% CI, 6.8–8.6) and 1-year PFS of 23.4% (95% CI, 17.9–29.3%). The poor-prognosis group had a median PFS of 6.4 months (95% CI, 4.2–7.8) and 1-year PFS of 12.1% (95% CI, 5.7–21.2%).

The calibration plot of the actual vs predicted PFS for each of the three prognosis groups (Figure 4) indicates that the nomogram does not systematically underpredict or overpredict for any of the three groups (goodness-of-fit (log-likelihood ratio test) χ2=11.41 (8 degrees of freedom), P=0.18).
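As a check on the arithmetic, the P-value for a χ2 statistic of 11.41 on 8 degrees of freedom can be reproduced with the closed-form upper tail of the chi-square distribution for an even number of degrees of freedom. The Python sketch below is a worked example only; the function name is hypothetical and the formula applies to even degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses the identity: for df = 2m,
    P(X >= x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    p = chi2_upper_tail_even_df(11.41, 8)
    print(f"P = {p:.2f}")
```

The result rounds to the reported P=0.18. Because the test is not significant, it gives no evidence of a difference between predicted and observed PFS across risk deciles.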

Figure 4

Calibration of the prognostic nomogram in the validation cohort at 12 months.

Nomogram stratification compared with classification based on platinum sensitivity only

In the training cohort, complete ‘platinum-sensitive’ patients (platinum-free interval >12 months) had a longer median PFS than partial ‘platinum-sensitive’ patients (platinum-free interval 6–12 months) (12.2 vs 9.2 months, log-rank P<0.0001). The C-index was 0.571.

In the validation cohort, complete ‘platinum-sensitive’ patients also had a longer median PFS than partial ‘platinum-sensitive’ patients (8.5 vs 5.7 months, log-rank P=0.0003). The C-index was 0.560.

Web-based interface

A web-based version of our nomogram, ROC Online, provides individualised estimates of PFS based on values of the identified characteristics and is available at http://roconline.ctc.usyd.edu.au.

Discussion

We developed a prognostic nomogram to predict PFS in women with platinum-sensitive recurrent ovarian cancer receiving platinum-based chemotherapy by using widely available baseline clinical and laboratory information from the 955 patients in the CALYPSO study. This nomogram uses the five factors with the most significant influence on PFS from a range of prognostic variables commonly accepted as important in this patient population. When validated in an independent population, this nomogram provided good discrimination for classifying prognosis.

Randomised trials report wide variation in PFS for patients with platinum-sensitive recurrent ovarian cancer, ranging from a few weeks to >3 years (ICON and AGO collaborators, 2003; Pfisterer et al, 2006; Pujade-Lauraine et al, 2010). Rather than relying on two or three prognostic stratification factors in randomised trials, this nomogram evaluates multiple prognostic predictors simultaneously and should therefore allow better patient stratification in trials. Although these predictors are not new, this nomogram ranks the importance of each predictor variable relative to the others. This study confirms that a longer platinum-chemotherapy-free interval is associated with better PFS. However, relative to the other four predictors, the platinum-chemotherapy-free interval contributed only 27 points to the total prognostic score of 100. The size of the largest tumour had greater prognostic significance, contributing 30 points. CA-125 (21 points), number of organ sites of metastasis (12 points) and serum white blood count (10 points) were also individually important contributors to PFS information.

When the nomogram was used in the validation cohort, the C-index was 0.594. This means that, for two randomly selected patients whose order of progression can be determined, the nomogram has a 59% chance of predicting the shorter PFS for the patient whose disease actually progresses first. Since the validation was performed without CA-125 information, it is likely to underestimate the true performance of the nomogram. Our analysis of the training cohort with five predictors, including CA-125, showed a C-index of 0.645. Bootstrap correction to adjust for overfitting of the prognostic model changed the C-index minimally (0.641). In both the training and validation data sets, the platinum-chemotherapy-free interval alone performed poorly in discriminating a patient's prognosis (C-indices 0.559 and 0.558, respectively). Prognostic stratification is improved with the combination of five predictors over the platinum-chemotherapy-free interval alone.

In the absence of a cancer staging system for patients with recurrent ovarian cancer, the nomogram could be used to stratify patients on the basis of our risk-score cutoff points in randomised trials. A consistent definition of risk based on this nomogram will allow selection of a more homogeneous cohort of patients, ensure better balance of important prognostic factors in various arms of a randomised study and improve international collaboration in clinical trials through adherence to an identical prognostic classification.

The nomogram can also be used to enrich clinical trials by targeting specific risk groups. For example, only poor-prognosis patients could be recruited to trials of novel approaches to treatment. In contrast to ‘all comers’ studies, ‘enrichment’ trials will have more power to detect the treatment effect and can substantially reduce the patient accrual needed. Such an approach is also ethically desirable because it can minimise patients’ exposure to experimental treatment of unproven benefit.

This nomogram is also a pragmatic tool that uses readily available clinical information to turn complex statistical estimates into simple prognostic information for oncologists and patients. Most patients with advanced cancer would like information about their prognosis (Hagerty et al, 2004). However, a major barrier is the difficulty of providing an accurate estimate of prognosis, particularly in patients with incurable cancers (Mackillop and Quirt, 1997). Recent work by others to develop simple rules for estimating typical, best-case and worst-case scenarios in advanced breast cancer provides an important advance in personalising discussion between oncologists and their patients regarding prognosis (Kiely et al, 2011). The present work is an important first step towards providing personalised prognostic information in ‘platinum-sensitive’ recurrent ovarian cancer. This tool can be used as a platform that can be further adapted as we refine our understanding of the biology of ovarian cancer with novel biomarkers and improvements in therapeutics.

The nomogram improves our understanding of the relationship between disease burden and PFS. It has been argued that conventional imaging techniques do not easily detect peritoneal carcinomatosis in patients with recurrent ovarian cancer (Hopper et al, 1996; Gronlund et al, 2004; Ferrandina et al, 2008). As demonstrated in this study, organ sites of metastasis and size of tumour, as determined by these imaging techniques, do not capture all of the prognostic information. This nomogram therefore included CA-125 as an additional prognostic marker to reflect the unmeasurable disease burden. However, discrimination of outcome based on histology is limited because of the small number of patients within each of the non-serous histological subtypes.

This study also supports the role of the white blood cell count as a potential marker of inflammation and prognosis. Although the reason for the association between elevated white cell count and PFS is uncertain, several studies have reported that it was an adverse prognostic factor in patients with recurrent ovarian cancer (Bishara et al, 2008; Cho et al, 2009). An elevated white cell count, together with elevated C-reactive protein and other pro-inflammatory cytokines and hypoalbuminaemia, is thought to be a marker of the tumour–host interaction involved in the anorexia–cachexia syndrome in patients with cancer (Sharma et al, 2008; McMillan, 2009), and these markers are associated with poor tolerance of, and increased toxicity from, standard-dose chemotherapy.

This nomogram has a number of limitations. First, it was developed and validated on data from highly selected patients who were enrolled in two large phase III trials. Its applicability to patient groups outside clinical trial settings and to patients treated with non-platinum treatments (Monk et al, 2010) needs to be tested further. Second, the applicability to patients with a platinum-free chemotherapy interval of <6 months remains unclear. Third, this nomogram did not include molecular or genetic biomarkers that have been reported to have independent prognostic value (Schultheis et al, 2008). The incremental value of these biomarkers, in addition to the factors used in our nomogram, in predicting PFS remains unknown and warrants further research. Despite these limitations, this nomogram represents a major improvement over current practice in prognostication of patients with recurrent ovarian cancer. We anticipate that this nomogram will stimulate ongoing research that will lead to improvements over time as our understanding of the biology of ovarian cancer progresses and a larger number of effective therapies becomes available.

In conclusion, we have developed a robust prognostic nomogram to predict PFS in patients with platinum-sensitive recurrent ovarian cancer undergoing platinum-based chemotherapy. This nomogram represents an improvement in prognostication over the platinum-free chemotherapy interval alone. This tool can facilitate the design and implementation of future collaborative randomised and non-randomised clinical trials. It also represents an important first step towards providing prognostic information for patients with this life-threatening illness to guide treatment selection.