Main

Clinical management in the recurrent epithelial ovarian cancer (EOC) setting is not standardized. Although numerous chemotherapeutic regimens are currently recommended (National Comprehensive Cancer Network, 2013a), there is insufficient evidence from clinical trials to demonstrate that any single treatment is superior to any other, particularly in terms of overall survival (OS; Coleman et al, 2013), and, in practice, the choice of treatment remains empiric. Chemoresponse assays are one approach that has the potential to improve the selection of a clinically effective therapy among the many options. Although there have been multiple studies providing encouraging data, the use of chemoresponse assays in the treatment of recurrent EOC continues to be debated (Schrag et al, 2004; Burstein et al, 2011).

Most of the previous chemoresponse assay studies in EOC were retrospective analyses evaluating patients who had tissues assayed over many years, making it difficult to ensure the quality of the assay data and its consistency with the clinical data (Holloway et al, 2002; Matsuo et al, 2010; Pant et al, 2010). Recently, a large, prospective study evaluating the clinical relevance of a chemoresponse assay (ChemoFx, Precision Therapeutics, Inc., Pittsburgh, PA, USA) in recurrent or persistent EOC was reported (Rutherford et al, 2013). The study showed a significant improvement in the clinical outcome for patients treated with therapies to which their tumours were assayed as sensitive compared with those treated with therapies assayed as non-sensitive, supporting the clinical benefit of the assay. However, questions remain regarding this assay’s prognostic and/or predictive utility given that (1) cross-drug response is common in ovarian cancer (i.e., if a tumour is sensitive to treatment A, then the same tumour may also be sensitive to treatments B, C, D, etc.), making it challenging to select a specific therapy among the multiple available treatments; and (2) the assay may simply reflect the intrinsic biology of a tumour (i.e., patients with assay-sensitive treatment results may simply have better prognoses than patients with assay-resistant results, regardless of the treatment clinically administered) and, thus, this assay may have only a prognostic role. As such, it is important to further evaluate the ability of this assay to function as a predictive marker, that is, select specific effective therapies for personalised treatment.

A prognostic marker is a clinical or biological characteristic that ‘identifies patients with differing risks of a specific outcome, such as progression or death’ (Sargent et al, 2005), regardless of treatment administered. Prognostic markers are helpful for identifying patients that are at high risk of relapse and therefore are potential candidates for alternative management strategies. In contrast, predictive markers ‘are associated with response (benefit) or lack of response to a particular therapy relative to other available therapy’ (McShane, 2012) and can be used to identify patients most likely to benefit from a specific therapy. In a clinical setting where a number of treatment options with similar impact on patient outcome are available, the utility of a predictive marker for individualised therapy would generally be greater than the utility of a prognostic marker.

The current study sought to address whether this chemoresponse assay could function as a predictive marker, with the capacity to discern specific therapies that are likely to be more effective and those that are not. The study results reported herein are consistent with REMARK guidelines (McShane et al, 2005).

Materials and Methods

Study Population

The current analyses were conducted using the evaluable cohort of 262 women with persistent or recurrent EOC from a recent prospective, multisite, noninterventional clinical study (Trial registration ID: NCT00288275; Rutherford et al, 2013). All patients were enrolled under an IRB-approved protocol and provided written informed consent consistent with all federal, state and local regulations before participating in the study. The study design, patient eligibility, chemoresponse assay methodology and patient outcome assessment have been described elsewhere (Rutherford et al, 2013). Briefly, patients were treated with one of 15 prospectively defined, common chemotherapy options available for recurrent EOC (National Comprehensive Cancer Network, 2013a), based on the medical judgment of the oncologist, who was blinded to the assay results. Fresh tumour tissue samples were collected at the time of recurrence and then assayed for chemoresponse against this panel of therapies. Each therapy for each patient tumour sample was classified by the assay as sensitive (S), intermediate (I) or resistant (R). A CONSORT diagram describing the study population is presented as Supplementary Figure 1.

Statistical analysis

The primary endpoint of this study was progression-free survival (PFS), defined as the length of time from the start of therapy until the date of first documented disease progression or death. Disease progression was measured by radiologic examination, physical examination and CA-125 measurements using RECIST or GCIG criteria. Assessment was performed every other cycle during the treatment, every 3 months for the first 2 years, every 6 months for the next 3 years and annually thereafter. The association of chemoresponse assay results with PFS was assessed using the Cox proportional hazards model (Cox, 1972); the proportional hazards assumption was tested by examining the relationship between scaled Schoenfeld residuals and time (Grambsch and Therneau, 1994). The Kaplan–Meier method was used to estimate the probability and duration of PFS (Kaplan and Meier, 1958), and the log-rank test was used to test for the differences between PFS curves by assay category (Peto and Peto, 1972).
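For readers less familiar with the Kaplan–Meier method, the product-limit estimate and the median PFS derived from it can be sketched as follows. This is a minimal illustration in Python, not the SAS code used in the study; the function names and the follow-up times in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up times (e.g., months from start of therapy)
    events : 1 = progression or death observed, 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached
```

As a usage example with made-up data, `median_survival(kaplan_meier([2, 4, 4, 6, 9, 12], [1, 1, 0, 1, 1, 0]))` gives the first event time at which the estimated curve falls to 50% or below. Subjects censored at a tied time are counted as still at risk at that time, which is the usual convention.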

Several novel approaches were used to evaluate the predictive value of the assay (Supplementary Figure 2). First, association of the assay with PFS was compared between match and mismatch analyses. The match analysis was performed using the assay result for the administered therapy (assayed therapy=administered chemotherapy) for each patient; the mismatch analysis was performed using the assay results for a treatment randomly selected from all assayed treatments, not necessarily matching the administered therapy, for each patient (assayed therapy≠administered chemotherapy). Both univariate and multivariate analyses controlling for other clinical factors were conducted. Using this approach, if the assay is neither prognostic nor predictive, the HR of PFS for assay-sensitive vs assay-non-sensitive for both match and mismatch would be 1.0; if the assay is only prognostic, then both HRs for match and mismatch would be <1.0 and have identical values. However, if the assay has both prognostic and predictive value, then both HRs for PFS for match and mismatch would be <1.0, and the HR for match would be lower than HR for mismatch. In short, if the assay is a predictive marker, then the association of PFS with match assays is expected to be stronger than with mismatch assays. To obtain the HR for mismatch, the assay result for one therapy for each patient was randomly selected with equal probability from the (up to) 15 study-designated therapies, and the association with PFS was calculated. This procedure was repeated 3000 times and the mean HR for mismatch was assessed.
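The mismatch resampling described above can be sketched in a few lines. The following Python sketch is illustrative only and is not the authors' implementation (the study analyses were run in SAS): it assumes assay results are binarised as sensitive (1) vs non-sensitive (0), fits a one-covariate Cox model with a simple damped Newton–Raphson routine, and uses hypothetical names (`cox_hr_binary`, `mismatch_hrs`, `summarize`) throughout.

```python
import math
import random


def cox_hr_binary(times, events, x, iters=50):
    """Hazard ratio (x=1 vs x=0) from a one-covariate Cox model.

    Damped Newton-Raphson on the Breslow partial likelihood.
    """
    beta = 0.0
    for _ in range(iters):
        score = 0.0
        info = 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored subjects contribute only to risk sets
            s0 = s1 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += x_j * w
            mean = s1 / s0
            score += x_i - mean
            info += mean * (1.0 - mean)  # variance of a 0/1 covariate
        if info == 0.0:
            break  # no risk set mixes both groups; beta stays as is
        step = max(-1.0, min(1.0, score / info))  # damping: cap step size
        beta += step
        if abs(step) < 1e-8:
            break
    return math.exp(beta)


def mismatch_hrs(times, events, assay_panels, n_resamples=3000, seed=0):
    """HR distribution when one assayed therapy is drawn at random per patient.

    assay_panels[i] lists the 0/1 sensitivity calls for patient i across
    all therapies assayed for that patient (equal selection probability).
    """
    rng = random.Random(seed)
    hrs = []
    for _ in range(n_resamples):
        picked = [rng.choice(panel) for panel in assay_panels]
        hrs.append(cox_hr_binary(times, events, picked))
    return hrs


def summarize(hrs, hr_match):
    """Mean mismatch HR and the fraction of mismatch HRs below the match HR."""
    mean_hr = sum(hrs) / len(hrs)
    frac_below = sum(h < hr_match for h in hrs) / len(hrs)
    return mean_hr, frac_below
```

Under these assumptions, a predictive assay would be expected to give a match HR that sits in the lower tail of the values returned by `mismatch_hrs`, so that `summarize` reports a small fraction of mismatch HRs below it. The nested loops are quadratic in the number of patients, so a production analysis would use an established survival library rather than this sketch.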

Second, the influence of cross-drug response (either sensitivity or resistance to all assayed therapies) was evaluated by calculating a multiple drug response index (MDRI). This index represents the percentage of all assayed therapies to which a patient's tumour scored as sensitive. Patients were classified into four groups based on their MDRI and administered clinical treatment: sensitive to all assayed therapies (MDRI=100%) and therefore treated with a sensitive therapy (SA), sensitive to some therapies (0<MDRI<100%) and treated with a sensitive therapy (SP), non-sensitive to all therapies (MDRI=0%) and therefore treated with a non-sensitive therapy (RA), and non-sensitive to some therapies (0<MDRI<100%) and treated with a non-sensitive therapy (RP). This analysis investigates whether the observed association between assay result and clinical outcome is driven by the inherent chemosensitivity of the tumour (i.e., homogeneous response to all therapies) or is due to the ability of the assay to specifically identify sensitive therapies. In this approach, the chemoresponse assay is suggested to be a predictive marker when the PFS for sensitive categories (SA, SP) differs from the PFS for non-sensitive categories (RA, RP) and no substantial difference is observed within each category (SA vs SP and RA vs RP). In other words, PFS should be associated with the chemosensitivity of the tumour for the administered therapy, independent of MDRI (i.e., regardless of homogeneous or heterogeneous assay responses across all of the assayed therapies).
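The MDRI and the four-group classification follow directly from the definitions above and can be sketched as below. The data layout (a dictionary mapping therapy name to the S/I/R call) and the function names are assumptions made for illustration; only S is counted as sensitive, with I and R treated as non-sensitive, as in the match analysis.

```python
def mdri(calls):
    """Percentage of assayed therapies scored sensitive ('S').

    calls : dict mapping therapy name -> 'S', 'I' or 'R'
    """
    n_sensitive = sum(1 for call in calls.values() if call == "S")
    return 100.0 * n_sensitive / len(calls)


def classify(calls, administered):
    """Assign a patient to SA, SP, RA or RP.

    administered : name of the therapy given clinically (a key of calls)
    """
    index = mdri(calls)
    if index == 100.0:
        return "SA"  # sensitive to all, so necessarily treated sensitive
    if index == 0.0:
        return "RA"  # non-sensitive to all, so necessarily treated non-sensitive
    # Heterogeneous panel: group depends on the administered therapy's call.
    return "SP" if calls[administered] == "S" else "RP"
```

For example, a hypothetical patient with calls `{"carboplatin": "S", "paclitaxel": "R", "topotecan": "I"}` has an MDRI of about 33% and falls into SP if treated with carboplatin, or RP if treated with either of the other two agents.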

In addition, a two-dimensional clustering analysis was conducted using the chemoresponse assay scores for seven single-agent therapies (carboplatin, cisplatin, gemcitabine, pegylated liposomal doxorubicin (PLD), paclitaxel, docetaxel and topotecan). These seven agents, alone or in combination, comprise the 15 treatments included in this study. This analysis examines the level of correlation between assay results for different therapies and also explores whether patients could be clustered into distinct groups, based on their assay results. If the assay simply reflects the intrinsic biology of a tumour, these clusters may have different prognostic profiles. The clustering analysis was performed using a hierarchical clustering algorithm based on the complete linkage method (Kaufman and Rousseeuw, 1990).
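The complete linkage rule used here merges, at each step, the two clusters whose most distant members are closest to each other. A minimal Python sketch of that rule is given below; it is not the R code used in the study, and the Euclidean distance, the function names and the toy scores in the usage note are assumptions, since the distance metric is not stated in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points, n_clusters):
    """Agglomerative clustering with complete linkage.

    Repeatedly merges the pair of clusters with the smallest
    maximum pairwise distance until n_clusters remain.
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Two-dimensional clustering applies the same routine twice: once to the rows of the score matrix (patients) and once to its transpose (therapies), for example `complete_linkage(scores, 3)` and `complete_linkage(list(zip(*scores)), 2)` for a hypothetical patients-by-therapies list `scores`.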

All analyses were performed using SAS version 9.3 (SAS Institute, Cary, NC, USA), except for the clustering analysis, which was implemented using R version 3.01 (cran.r-project.org). All reported P-values are two-sided, with P<0.05 considered statistically significant.

Results

A total of 335 patients were clinically eligible for inclusion in the study, with 262 of them evaluable for analysis. Patient characteristics for the 262-patient cohort have been previously reported (Rutherford et al, 2013) (Table 1) and are comparable to the superset of clinically eligible patients (Supplementary Table 1). Of the 15 therapies prospectively defined in the protocol, 12 were administered clinically, including both single-agent and combination therapies (Rutherford et al, 2013).

Table 1 Patient characteristics

Association of clinical outcome with chemoresponse assay results: match/mismatch analysis

As previously reported, the assay result for match was significantly associated with PFS (HR=0.67, 95% CI=0.50–0.91, P=0.009), with patients treated with an assay-sensitive therapy showing an improvement in PFS as compared with assay-non-sensitive (I+R) patients (Figure 1). This association was consistent after controlling for clinical covariates (HR=0.66, 95% CI=0.47–0.94, P=0.020; Rutherford et al, 2013). The average prognostic value of assay results for multiple different therapies was examined using the assay results for mismatch, in which the assay result for one treatment was randomly selected from the (up to) 15 designated therapies with equal probability for each patient, and the association with PFS was estimated. Based on 3000 repeated re-samplings, the mean HR for mismatch was 0.81 (95% range=0.66–0.99) (Figure 2A). Based on the distribution of HRs for mismatch, only 3.4% of HRs were <0.67 (Figure 2A). When multivariate analysis was performed, the HRs were 0.66 and 0.88, based on match and mismatch, respectively, with only 0.7% of HRs from mismatch <0.66 (Figure 2B). The results for OS were similar. In univariate analysis, the HRs for death were 0.61 and 0.76, based on match and mismatch, respectively, with 5.3% of HRs from mismatch <0.61; in multivariate analysis, the HRs were 0.59 and 0.79, based on match and mismatch, respectively, with 3.6% of HRs from mismatch <0.59. The HRs obtained from both match and mismatch analyses demonstrate the prognostic value of the assay (i.e., the HRs for both match and mismatch are significantly less than 1.0). More importantly, these results also indicate that the HR for match is materially lower than that for mismatch, which supports the predictive nature of the assay.

Figure 1

Association of clinical treatment assay result (match) with PFS. Patients treated with assay-sensitive treatments experienced a median PFS of 8.8 months, whereas those treated with assay-non-sensitive treatments experienced a median PFS of 5.9 months. Reproduced by kind permission of Elsevier from Rutherford et al (2013).

Figure 2

Association of randomly selected assay result (mismatch) with PFS. The mean HR for mismatch was calculated from repeated (3000) simulations in univariate (A) and multivariate (B) analyses.

Cross-drug response and impact on clinical outcome

Assay tumour responses differ by therapy, and cross-drug resistance was evident in this population. Of 262 tumours, 123 (47%) were identified as non-sensitive to all tested therapies (RA), whereas 24 (9%) were defined as sensitive to all tested therapies (SA); the remainder (44%) were sensitive to at least one, but not all, of the tested therapies. For those patients showing a heterogeneous pattern of response, 51 were treated with sensitive therapies (SP), and 64 were treated with non-sensitive therapies (RP). The median PFS was 9.1, 8.8, 5.9 and 5.9 months for SA, SP, RP and RA, respectively, with a significant difference between SA+SP vs RA+RP, as previously shown (HR=0.67, 95% CI=0.50–0.91, P=0.009; Rutherford et al, 2013), as well as a difference of approximately 3 months between SA vs RA and between SP vs RP, but without meaningful differences between SP vs SA or between RP vs RA (Figure 3).

Figure 3

PFS for patients displaying homogeneous vs heterogeneous assay responses. Patients were classified into four groups based on their calculated multiple drug response index (MDRI) and administered clinical treatment: sensitive to all assayed therapies (MDRI=100%) and treated with a sensitive therapy (SA), sensitive to some therapies (0<MDRI<100%) and treated with a sensitive therapy (SP), non-sensitive to all therapies (MDRI=0) and treated with a non-sensitive therapy (RA), and non-sensitive to some therapies (0<MDRI<100%) and treated with a non-sensitive therapy (RP).

The impact of cross-drug response on PFS was further assessed based on MDRI using the Cox model. In univariate analysis, patients whose tumours were sensitive to more therapies (i.e., higher MDRI) experienced better prognoses, as expected (HR=0.96, 95% CI=0.92–0.99, P=0.034). However, MDRI was no longer significant in multivariate analysis (HR=1.02, 95% CI=0.95–1.09, P=0.629), whereas the association between assay result for administered therapy and PFS remained evident (HR=0.60, 95% CI=0.36–1.02, P=0.057; Table 2).

Table 2 Univariate and multivariate analyses of the association of PFS with chemoresponse assay results

Chemoresponse assay scores for the seven single-agent therapies were further evaluated using two-dimensional clustering analysis. Therapies are represented along the x axis, with carboplatin, cisplatin and doxorubicin clustering together and paclitaxel, docetaxel, topotecan and gemcitabine clustering together. Along the y axis, patients were classified into three clusters (A, B and C), consisting of 63%, 36% and 1% of the study population, respectively (Figure 4). Cluster C was excluded from the analysis owing to the limited number of patients in this group. There is no evidence that this unsupervised clustering was significantly associated with either clinical characteristics or patient outcome (data not shown). These results, together with the data shown in Figure 3 and Table 2, indicate that this assay is likely not prognostic of outcome independent of treatment in the way that some clinical factors (e.g., age, stage, grade, etc.) are. Rather, the assay’s prognostic value (as observed in the mismatch analysis) is likely attributable to cross-drug response (i.e., correlation of assay results for randomly selected therapies with assay result for clinically administered therapy), as anticipated.

Figure 4

Two-dimensional clustering analysis based on chemoresponse assay data for single-agent treatments. Therapies are represented along the x axis, and patients are represented along the y axis. Red indicates lower assay score (i.e., sensitive), and yellow indicates higher assay score (i.e., resistant).

Discussion

The development and validation of predictive markers has become increasingly vital to improving patient outcomes, especially in the face of an increasing number of available therapies. The current study investigated the association of specific treatment outcomes in 262 prospectively accrued persistent and recurrent EOC patients with assay results blinded to the investigator and patient. Using several approaches, the predictive value of a chemoresponse assay was evaluated and supported.

The analyses conducted herein address common clinical concerns regarding the predictive properties of a chemoresponse assay (Markman, 2011), including the ability of the assay to discern specific therapies that are likely to be more effective among the multiple clinically relevant treatment options. Specifically, the ability of a chemoresponse assay to identify treatment-specific sensitivity has been questioned given the clinical and biological factors that may limit the ability of an assay to be predictive of patient response. For example, since cross-drug response to different therapies is a common phenomenon in chemoresponse assays and it is thought that patients can exhibit multidrug resistance, there is concern that these factors may limit the predictive ability of an assay. Furthermore, it is hypothesised that the improved outcome observed for patients treated with assay-sensitive therapies may be driven by the inherent biology of the tumour represented by the chemoresponse profile (i.e., homogeneous sensitivity vs homogeneous resistance to all therapies), suggesting that the assay may simply reflect the inherent biological-based tumour behaviour and, hence, patient prognosis. The studies described herein attempt to address each of these concerns.

First, the match/mismatch analysis assesses whether cross-drug response and correlated assay results across different therapies limit the predictive ability of the assay. This simple approach, evaluating the ability of the assay to predict patient outcome based on the assay result for the clinically administered therapy (match) as compared with a randomly selected treatment assay result for the same patient (mismatch), showed that the association of outcome and assay result for match was stronger than for mismatch (HR: 0.67 vs 0.81, respectively), indicating that a patient receiving an assay-sensitive therapy would be more likely to experience a better outcome than a patient treated empirically.

Next, the heterogeneity of assay response for a given patient to multiple therapies was evaluated, showing that nearly half (44%) of the patients included in the study demonstrated a heterogeneous assay response across the therapies tested (i.e., were sensitive to at least one therapy tested, but not all), whereas the remainder were homogeneously sensitive (9%) or non-sensitive (47%). Comparison of PFS for heterogeneous patients vs those displaying homogeneous responses demonstrated that improved outcome in assay-sensitive patients is unlikely to be due to the inherent (presumably homogeneous) chemosensitivity of a tumour. Further, multivariate analysis demonstrated that an increased or decreased percentage of assay-sensitive results across all therapies tested was not associated with patient outcome when the matched assay result was included. Therefore, although tumour sensitivity or resistance to multiple therapies is evident in a portion of recurrent EOC patients, improved outcome was associated with clinical administration of a sensitive therapy, regardless of whether that tumour displayed homogeneous or heterogeneous assay responses.

Finally, unsupervised clustering of patients, based on assay results for multiple single-agent therapies, was associated with neither common clinical factors nor patient outcome, further supporting that this assay is not just reflective of intrinsic tumour biology or common clinical factors normally associated with tumour prognosis (i.e., not only prognostic).

With rapid advancements in cancer research and the number of new therapies being developed, clinical interest lies in identifying predictive markers for personalised therapy. Although many molecular-based biomarkers have been clinically accepted (Cobleigh et al, 1999; Slamon et al, 2001; Amado et al, 2008; Karapetis et al, 2008; Mok et al, 2009; Van Cutsem et al, 2009; Rosell et al, 2012) and are supported by regulatory bodies’ approvals and inclusion in standardized treatment paradigms for other cancer types (Allegra et al, 2009; Burstein et al, 2010; Keedy et al, 2011; National Comprehensive Cancer Network, 2013b, 2013c, 2013d), unfortunately, to date, no predictive markers have been clinically validated for use in EOC. Although numerous clinical factors (e.g., age, stage, grade, histology) are used extensively in making treatment decisions especially in the recurrent EOC setting, these factors are limited in their ability to discern specific therapies that are likely to be more effective in second-line treatment and beyond.

Clinical validation of prognostic and predictive markers requires study designs that differ from those employed in traditional drug trials (Institute of Medicine, 2011). Clinical validation for prognostic markers is more straightforward, as only the clinical outcomes between marker-positive and marker-negative patients are compared within a cohort of uniformly treated patients. If a marker is significantly associated with outcome when the effects of other possible confounding factors are controlled, then the marker is considered to be prognostic. In contrast, the clinical studies required to demonstrate the clinical validation of predictive markers are more complex. Often, data from previously conducted randomized drug trials are employed to conduct such validations if archived tissue is available and appropriate for the marker. This strategy may be advantageous in certain instances where outcomes require prolonged follow-up to achieve a meaningful difference in survival.

Clinical study designs for markers have been extensively discussed and reviewed in the literature (Simon and Maitournam, 2004; Sargent et al, 2005; Freidlin et al, 2010; Tajik et al, 2013). The biomarker-stratified design, which uses the marker status to guide analysis but not to assign treatment, is considered to be more efficient regarding required study sample size, can be used when the number of markers and treatment choices is limited, and can investigate a marker’s predictive properties. However, this approach is not directly compatible with chemoresponse assays where a larger number of therapies (multiple markers) are evaluated simultaneously (e.g., the response profile of each therapy is considered to be a separate marker). The biomarker-strategy design, in which patients are randomized to two arms (assay-directed vs empirical treatment), is a reasonable approach for evaluating multiple markers and has been recommended for evaluation of chemoresponse assays (Blue Cross and Blue Shield Association, 1995, 2000, 2002; Schrag et al, 2004; Burstein et al, 2011). However, this design is inefficient in its use of patients (i.e., large sample size not logistically feasible) and may not clearly distinguish between better treatments in the marker arm vs the standard of care arm (Freidlin et al, 2010; Buyse et al, 2011; Ziegler et al, 2012; Buyse and Michiels, 2013; Center for Medical Technology Policy, 2013). Furthermore, this approach has been shown to be subject to a treating physician ‘learning effect’ bias where, across the study duration, empirically selected treatments became more similar to those indicated in the assay-informed arm, as described by Cree et al (2007).

Over the last two decades, numerous assessments have called for more thorough evaluation of the validation and justification of chemoresponse assays for clinical use (Blue Cross and Blue Shield Association, 1995, 2000, 2002; Schrag et al, 2004; Burstein et al, 2011). The study design uniformly recommended in these assessments is based on the biomarker-strategy design, specifically an interventional (i.e., treatment assigned by marker), randomized, two-arm (assay-directed vs empiric treatment assignment) marker study design. Although the design proposed in these assessments can theoretically address a few markers simultaneously and has been historically referenced as the ‘gold standard’, it would require a larger number of patients, would be subject to the physician ‘learning effect’ bias and would still not be able to cleanly evaluate the predictive marker elements. Importantly, alternate approaches to evaluating both predictive and prognostic marker elements can be executed with a more achievable sample size (Wieand, 2005). As a result, the biomarker-strategy approach is not the only design that can effectively validate a predictive marker and may not be the preferred approach (Center for Medical Technology Policy, 2013).

The match/mismatch analysis method employed in this study represents a promising alternate approach to validating a chemoresponse assay, in that it has an achievable sample size, retains the ability to assess both prognostic and predictive marker properties and is appropriate for use with multiple markers (Simon and Maitournam, 2004; Maitournam and Simon, 2005; Simon, 2005; Mandrekar and Sargent, 2009, 2010; Simon, 2010; Center for Medical Technology Policy, 2013). This is especially important in recurrent EOC, where numerous clinically equivalent therapies are available for use and enrolment of a large sample size is not logistically feasible. Furthermore, this approach has been previously recommended for assessment of chemoresponse assays and is expected to be able to evaluate whether the assay is predictive of response (Wieand, 2005). This method does lack the ability to assess each therapy/marker individually owing to prohibitive sample size requirements, which increase with each individual therapy/marker evaluated.

In both the current analyses and the prior analysis reported by Rutherford et al, the association of assay result with clinical outcome was evaluated using two strata (sensitive vs non-sensitive), with the thresholds for each treatment predefined in an external reference sample, independent of clinical outcome. Analysis of quantitative data, such as a receiver operating characteristic (ROC) curve, may further illustrate assay performance; however, due to the individualised nature of establishing assay treatment concentrations and response category thresholds for each distinct treatment, assay results cannot be directly compared across different treatments (i.e., the assay result range varies across individual treatments).

Clinical validations for chemoresponse assays, which simultaneously assess multiple markers/therapies, must be carefully considered. Through further analysis of a prospective study and by using several analytical approaches, the current study further evaluated the clinical value of a chemoresponse assay. The results provide reasonable evidence that this assay is a predictive marker, with the capacity to discern specific therapies that are likely to be more effective, and women with recurrent EOC may benefit from assay-informed therapy selection.