Main

The clinical outcome of patients with metastatic colorectal cancer (mCRC) has significantly improved during the last decade, partly due to the increased availability of targeted drugs. The addition of bevacizumab to fluoropyrimidine-based chemotherapy has resulted in a prolonged overall and progression-free survival, and is considered a standard option in first-line treatment of mCRC (Hurwitz et al, 2004; Kabbinavar et al, 2008; Saltz et al, 2008; Tebbutt et al, 2010; Cunningham et al, 2013). Until recently, the optimal duration of systemic therapy including bevacizumab in first-line treatment of mCRC was not well established.

The phase 3 CAIRO3 and AIO 0207 trials showed that maintenance treatment with fluoropyrimidine and bevacizumab is the preferred strategy in mCRC patients with stable disease or better after induction treatment with a fluoropyrimidine, oxaliplatin and bevacizumab, as it maintains disease control and quality of life without relevant toxicity (Hegewisch-Becker et al, 2015; Simkens et al, 2015; Quidde et al, 2016). However, not all patients may benefit from this strategy. The ability to identify subgroups of patients in whom a treatment break is safe and, conversely, those in whom continuous treatment is a prerequisite for better survival would improve clinical decision-making and reduce therapy costs.

In this individual patient data (IPD) meta-analysis of the CAIRO3 and AIO 0207 trials with updated follow-up, we aim to provide more precise estimates of treatment effects regarding the use of fluoropyrimidine plus bevacizumab maintenance treatment after induction treatment with combination chemotherapy and bevacizumab. In addition, we aim to identify patient subgroups according to clinical and pathological characteristics that benefit most from fluoropyrimidine and bevacizumab maintenance treatment or observation.

Materials and methods

Study design and participants

This analysis is based on the individual patient data from two open-label, randomised phase 3 trials on maintenance treatment vs observation in first-line treatment of mCRC: CAIRO3 (NCT00442637) and AIO 0207 (NCT00973609) (Hegewisch-Becker et al, 2015; Simkens et al, 2015). The CAIRO3 study, a superiority trial conducted by the Dutch Colorectal Cancer Group, took place in 64 hospitals in The Netherlands between 30 May 2007 and 15 October 2012. The AIO 0207 study, a non-inferiority trial conducted by the AIO Studien gGmbH, enrolled patients from 106 institutions (55 hospitals and 51 private practices) in Germany between 17 September 2009 and 21 February 2013. Detailed eligibility criteria, ethical approvals, treatment protocols and outcomes have been reported elsewhere (Hegewisch-Becker et al, 2015; Simkens et al, 2015). In brief, eligible patients in both trials were older than 18 years, had WHO/ECOG performance status (PS) 0–2 and histologically proven colorectal adenocarcinoma with distant metastases, were previously untreated for metastatic disease, and had stable disease or a partial or complete response according to Response Evaluation Criteria in Solid Tumours (RECIST, version 1.1) after induction treatment with a fluoropyrimidine, oxaliplatin and bevacizumab.

In the two-armed CAIRO3 study, patients with stable disease or better after 6 cycles (18 weeks) induction treatment with capecitabine, oxaliplatin and bevacizumab (CAPOX-B) in whom reintroduction of oxaliplatin appeared feasible were randomised (1:1) to either observation or maintenance treatment with capecitabine and bevacizumab (CAP-B). Patients were not enrolled if they had experienced toxicity from the fluoropyrimidine, oxaliplatin, or bevacizumab during induction treatment that would prevent its safe continuation or reintroduction. Induction treatment was not an integral part of the trial. Randomisation was stratified by previous adjuvant chemotherapy (yes vs no), response to induction treatment (stable disease vs complete or partial response), WHO/ECOG PS (0 vs 1), serum lactate dehydrogenase (LDH) concentrations (normal vs abnormal), and treatment centre.

In the three-armed AIO 0207 study, eligible patients were registered prior to the start of a 24-week induction treatment with a fluoropyrimidine (infusional fluoropyrimidine or capecitabine), oxaliplatin and bevacizumab. The choice of a standard protocol (i.e. FOLFOX, CAPOX-B) was left to the local investigator’s discretion. Patients with stable disease or better and without option for metastasectomy after 24 weeks of induction treatment were randomised (1:1:1) to either maintenance treatment with any fluoropyrimidine and bevacizumab, bevacizumab monotherapy, or observation. Premature discontinuation of oxaliplatin or other drugs (for example, due to toxicity) during induction treatment was allowed. Randomisation was stratified by response to induction treatment (stable disease vs complete or partial response), treatment with oxaliplatin (stopped before termination of induction treatment vs ongoing until end of induction phase), previous adjuvant therapy (with oxaliplatin vs without oxaliplatin vs no adjuvant treatment), and WHO/ECOG PS (0–1 vs 2). Patients from the bevacizumab monotherapy arm were excluded from the present analysis. All patients in both trials provided written informed consent.

Study treatments

In the CAIRO3 study, maintenance treatment consisted of capecitabine 625 mg m−2 orally twice daily continuously, plus bevacizumab 7.5 mg kg−1 intravenously every 3 weeks. Patients with progressive disease in either the observation or maintenance arm were to receive reintroduction of the induction treatment regimen, that is, CAPOX-B. Reintroduced CAPOX-B was to be continued until progression, death, or an unacceptable adverse event, whichever occurred first. If CAPOX-B reintroduction was not possible after all due to persisting sensory neuropathy (grade 2) or any other reason, the treatment choice was left to the local investigator’s discretion.

In the AIO 0207 study, randomised patients received either continuation of a fluoropyrimidine (infusional every 2 weeks, or capecitabine every 3 weeks in standard dosages; the fluoropyrimidine could be switched between induction and maintenance treatment) plus bevacizumab (7.5 mg kg−1 every 3 weeks, or 5 mg kg−1 every 2 weeks), or bevacizumab monotherapy (same dosage), or no treatment. Maintenance treatment was continued until disease progression, unacceptable toxicity, surgical resection, other ablative treatment, at patient’s request, or local investigator’s decision. If either the fluoropyrimidine or bevacizumab was discontinued before progression, the remaining drug was continued as monotherapy in the fluoropyrimidine plus bevacizumab arm. At first progression, all patients were to receive reintroduction of the induction treatment regimen (i.e., any fluoropyrimidine, oxaliplatin plus bevacizumab) according to protocol. Reintroduction included all drug components of the induction treatment, except for those that could not be used due to persistent toxicity or contraindications. If reintroduction of the induction treatment regimen was not possible for any reason, the choice of treatment was at the local investigator’s discretion.

Outcomes

Patients in both trials were assessed for disease status according to RECIST criteria. The primary end points in both trials (time to second progression upon reintroduction of the induction treatment regimen in CAIRO3, and time to failure of strategy in AIO 0207) were comparable in definition. The primary end point in this IPD meta-analysis was second progression-free survival (PFS2), defined as the interval between randomisation and second progression (for those who had a first progression) while under treatment with reintroduction of a fluoropyrimidine, oxaliplatin and bevacizumab, or until the beginning of another treatment (including a new drug), death or end of trial for patients who did not have a second progression. PFS2 was regarded as equal to first progression-free survival (PFS1) if patients did not receive reintroduction of the induction regimen for any reason, or if a valid response evaluation was not performed. Secondary end points in both trials included time until first progression (PFS1), and overall survival (OS). PFS1 was defined as the interval between randomisation and first progression while under maintenance treatment or observation, or until death or end of trial for patients without progression. OS was defined as the time from randomisation to death from any cause or date of last follow-up, at which point patients who were still alive were censored. Cut-off dates for the present analysis were March 2017 for CAIRO3, and December 2016 for AIO 0207.
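The end point definitions above can be read as a small set of decision rules. The following Python sketch is only an illustration of those rules under stated assumptions: the `PatientRecord` fields, the function names and the end point labels are hypothetical and do not come from either trial database, and the sketch deliberately returns a label for what ended the interval rather than deciding which outcomes count as events or as censoring, since the text does not specify this.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    """Hypothetical record; all times are months since randomisation."""
    end_of_follow_up: float
    first_progression: Optional[float] = None
    death: Optional[float] = None
    reintroduced: bool = False               # induction regimen reintroduced
    valid_response_evaluation: bool = False  # evaluated during reintroduction
    second_progression: Optional[float] = None
    new_treatment_start: Optional[float] = None


def pfs1(p: PatientRecord) -> Tuple[float, str]:
    """First progression, else death, else end of trial."""
    if p.first_progression is not None:
        return (p.first_progression, "first_progression")
    if p.death is not None:
        return (p.death, "death")
    return (p.end_of_follow_up, "end_of_trial")


def pfs2(p: PatientRecord) -> Tuple[float, str]:
    """Second progression on reintroduction, with the PFS1 fallback rule."""
    # PFS2 equals PFS1 without reintroduction or a valid response evaluation.
    if p.first_progression is None or not (
        p.reintroduced and p.valid_response_evaluation
    ):
        return pfs1(p)
    if p.second_progression is not None:
        return (p.second_progression, "second_progression")
    # No second progression: earliest of new treatment or death, if any.
    later = [
        (t, label)
        for t, label in (
            (p.new_treatment_start, "new_treatment"),
            (p.death, "death"),
        )
        if t is not None
    ]
    return min(later) if later else (p.end_of_follow_up, "end_of_trial")
```

For example, a patient with first progression at 4.0 months who never received reintroduction has a PFS2 of 4.0 months, identical to PFS1.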

Statistical analysis

This pooled analysis was based on individual patient data of the intention-to-treat population of the CAIRO3 and AIO 0207 trials, comprising all patients who were randomised to fluoropyrimidine plus bevacizumab maintenance treatment or observation. Patients from the bevacizumab monotherapy arm of the AIO 0207 study were excluded from the analyses, since the CAIRO3 study did not include this treatment option.

First, the median duration of follow-up was calculated for the pooled study population using the reverse Kaplan–Meier method. Survival curves were estimated with the Kaplan–Meier method and compared with the log-rank test. Next, we performed subgroup analyses, including the following parameters: age (<70 vs ≥70 years at randomisation), sex (male vs female), primary tumour location (colon vs rectum or rectosigmoid), response to induction treatment (stable disease vs complete or partial response), WHO/ECOG PS (0 vs 1–2), number of metastatic sites (1 vs >1), stage of disease and primary tumour resection status (synchronous, resected vs synchronous, non-resected vs metachronous disease), serum LDH at randomisation (normal vs elevated), platelet count (<400 vs ≥400 × 10^9 l−1) and serum CEA (≤20 vs >20 ng ml−1) at start of induction treatment, and RAS/BRAF mutation status (RAS plus BRAF V600E wild-type vs RAS mutant vs BRAF V600E mutant). No power, sample size, or sensitivity calculations were done as these subgroup analyses were exploratory in nature. We analysed overall and subgroup treatment effects using mixed effect Cox models with study as random intercept to take clustering of patients within studies into account, and treatment (and any co-variables) as fixed effects to calculate hazard ratios (HRs) and 95% confidence intervals (CIs). We refrained from including a random treatment slope per study as none of the models improved significantly with such an extension. Analyses were stratified for prior adjuvant chemotherapy, response to induction treatment, and WHO/ECOG PS, and adjusted for the following potential confounders by including them as co-variables: age, sex, stage, primary tumour location, primary tumour resection, number of metastatic sites, LDH at randomisation, and the interval between primary diagnosis and randomisation. Subgroup analyses regarding stage of disease combined with primary tumour resection status were not adjusted for stage and primary tumour resection.
Patients with missing values in variables relevant for a particular analysis were excluded from that analysis. Interaction terms between treatment and each subgroup variable were used to assess and test heterogeneity of treatment effects. Inspection of Schoenfeld residuals showed that the proportional hazards assumption was not violated. We report nominal, two-sided P-values (significance level set to 0.05), without taking multiple testing into account. Statistical analyses were performed using IBM SPSS Statistics, version 21.0 (Armonk, NY: IBM Corp) and R version 3.0.3 (particularly library coxme version 2.2–5).
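To illustrate the reverse Kaplan–Meier method mentioned above, the sketch below estimates a Kaplan–Meier median and then reuses the same routine with the event indicator flipped, so that censoring (still alive at last contact) becomes the event and death becomes censoring. It is a minimal, standard-library Python sketch on made-up toy data, not the SPSS or R code used for the analysis; the function names are hypothetical, and the median is taken as the first time at which the estimated survival drops to 0.5 or below, which is one of several conventions.

```python
from typing import List, Optional


def km_median(times: List[float], events: List[bool]) -> Optional[float]:
    """Kaplan-Meier median: first time with estimated S(t) <= 0.5.

    Returns None when the curve never reaches 0.5.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            if survival <= 0.5:
                return t
        at_risk -= n_leaving
    return None


def median_follow_up(times: List[float], died: List[bool]) -> Optional[float]:
    """Reverse Kaplan-Meier: treat censoring as the event of interest."""
    return km_median(times, [not d for d in died])


# Toy data (months, death indicator); purely illustrative values.
toy_times = [2.0, 3.0, 4.0, 5.0, 8.0]
toy_died = [True, False, True, False, False]
toy_follow_up = median_follow_up(toy_times, toy_died)
```

Flipping the indicator in this way keeps patients who died early from pulling the follow-up estimate down, which is the usual motivation for the reverse method.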

Results

Patients

By pooling individual patient data from both trials, including both treatment arms of CAIRO3 and two out of three treatment arms of AIO 0207, we obtained data of 871 patients: 437 assigned to the observation group and 434 assigned to the fluoropyrimidine plus bevacizumab maintenance treatment group (Figure 1). Patient characteristics were comparable between treatment groups, except for a higher percentage of patients aged ≥70 years in the observation group (Table 1). Differences in overall patient characteristics between CAIRO3 and AIO 0207 (bevacizumab monotherapy arm excluded) were found regarding WHO/ECOG PS, prior adjuvant chemotherapy, primary tumour location, stage of disease combined with primary tumour resection status, and serum LDH at randomisation (Supplementary Table 1).

Figure 1

Trial profiles for CAIRO3 and AIO 0207. IPD = individual patient data; OS = overall survival; PFS1 = first progression-free survival; PFS2 = second progression-free survival.

Table 1 Patient characteristics

Efficacy

Median follow-up time for all patients was 68.5 months (IQR 54.6–87.0 months). Overall, there was a significant benefit from maintenance treatment compared with observation for PFS1 (HR 0.40 (95% CI 0.34–0.47)) and the primary end point PFS2 (HR 0.70 (0.60–0.81)). The benefit of maintenance treatment was observed in all subgroups that were investigated (Figures 2 and 3), although for patients with metachronous disease this was non-significant in PFS2 (at a nominal P-value for significance of 0.05). In particular, primary tumour location was not predictive of the benefit of maintenance treatment or observation. Patients with elevated compared to normal platelet count at start of induction treatment showed a significant interaction in favour of maintenance treatment regarding PFS1 (HR 0.32 (95% CI 0.24–0.42) vs HR 0.45 (0.37–0.55), nominal P-value for interaction (Pinteraction)=0.042), and PFS2 (HR 0.55 (95% CI 0.42–0.72) vs HR 0.77 (0.64–0.93), nominal Pinteraction=0.040), respectively. Supplementary Table 2 shows efficacy outcomes in the pooled study population and individual studies for PFS1 and PFS2. Supplementary Tables 3 and 4 show individual study results regarding subgroup analyses for PFS1 and PFS2.

Figure 2

Forest plot showing adjusted treatment effects for PFS1 in subgroups with P-values for heterogeneity across subgroups. Analyses were performed using a mixed effect Cox model with study as random intercept and treatment (and any co-variables) as fixed effects. Subgroup analyses were stratified for prior adjuvant chemotherapy, response to induction treatment, WHO/ECOG PS, and adjusted for age, sex, stage, primary tumour location, primary tumour resection, number of metastatic sites, LDH at randomisation, and interval between primary diagnosis and randomisation. Subgroup analyses for ‘stage of disease and primary tumour resection status’ were not adjusted for stage and primary tumour resection. CR/PR=complete or partial response; SD=stable disease.

Figure 3

Forest plot showing adjusted treatment effects for PFS2 in subgroups with P-values for heterogeneity across subgroups. Analyses were performed using a mixed effect Cox model with study as random intercept and treatment (and any co-variables) as fixed effects. Subgroup analyses were stratified for prior adjuvant chemotherapy, response to induction treatment, WHO/ECOG PS, and adjusted for age, sex, stage, primary tumour location, primary tumour resection, number of metastatic sites, LDH at randomisation, and interval between primary diagnosis and randomisation. Subgroup analyses for ‘stage of disease and primary tumour resection status’ were not adjusted for stage and primary tumour resection. CR/PR=complete or partial response; SD=stable disease.

The overall treatment effect on OS did not reach statistical significance, either in the individual trials or when the data were pooled (HR 0.91 (95% CI 0.78–1.05)) (Figure 4). In fact, the overall treatment effect for OS was significantly different between the two trials (likelihood ratio P-value=0.008). While maintenance treatment versus observation resulted in a clinically relevant increase in median OS in CAIRO3, this was not observed in AIO 0207 (Supplementary Table 2). Subgroup analyses for OS showed a marked heterogeneity with opposite results between the two trials (Supplementary Table 5). Despite this, the combined data suggested that maintenance treatment improved OS for female sex (nominal Pinteraction=0.003) and complete or partial response as best response on induction treatment (nominal Pinteraction=0.035) (Figure 4).

Figure 4

Forest plots showing adjusted treatment effects for OS in subgroups with P-values for heterogeneity across subgroups. Analyses were performed using a mixed effect Cox model with study as random intercept and treatment (and any co-variables) as fixed effects. Subgroup analyses were stratified for prior adjuvant chemotherapy, response to induction treatment, WHO/ECOG PS, and adjusted for age, sex, stage, primary tumour location, primary tumour resection, number of metastatic sites, LDH at randomisation, and interval between primary diagnosis and randomisation. Subgroup analyses for ‘stage of disease and primary tumour resection status’ were not adjusted for stage and primary tumour resection. CR/PR=complete or partial response; SD=stable disease.

Treatment upon first progression

After first progression, 407 (47%) of 871 patients underwent reintroduction of the induction treatment regimen. Out of these 407 patients, 377 (93%) received reintroduction of all components, that is, fluoropyrimidine, oxaliplatin and bevacizumab. The percentage of patients that underwent reintroduction according to protocol was significantly lower in the fluoropyrimidine plus bevacizumab group compared with the observation group (165 out of 429 (38%) vs 242 out of 437 (55%), respectively, P<0.001). The percentage of patients that received reintroduction of the induction treatment regimen was significantly higher in CAIRO3 compared with AIO 0207 (304 out of 557 (54%) vs 103 out of 309 (33%), respectively, P<0.001). Subsequent therapies received during the course of metastatic disease were comparable between the two trials and within treatment groups, although anti-EGFR therapy was more frequently received by patients in AIO 0207 compared with CAIRO3 (84 out of 314 (27%) vs 102 out of 557 (18%), respectively; Table 2).
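The comparisons of reintroduction rates above can be checked approximately from the reported counts. The text does not state which test produced the P-values, so the sketch below assumes a pooled two-proportion z-test purely for illustration; the function name is hypothetical and only the Python standard library is used.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    x1: int, n1: int, x2: int, n2: int
) -> Tuple[float, float, float, float]:
    """Pooled two-proportion z-test; returns (p1, p2, z, two-sided P)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Counts as reported in the text.
# Maintenance arm vs observation arm: 165/429 vs 242/437.
arm_result = two_proportion_z_test(165, 429, 242, 437)
# CAIRO3 vs AIO 0207: 304/557 vs 103/309.
trial_result = two_proportion_z_test(304, 557, 103, 309)
```

Under this assumed test, both comparisons give two-sided P-values well below 0.001, in line with the reported significance levels.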

Table 2 Treatment upon first progression

Discussion

This IPD meta-analysis of the CAIRO3 and AIO 0207 trials with updated follow-up confirms the benefit of fluoropyrimidine plus bevacizumab maintenance treatment compared with observation in first-line treatment of mCRC. Despite differences in the study design of CAIRO3 and AIO 0207, our pooled results show that fluoropyrimidine plus bevacizumab maintenance treatment is more effective compared with no treatment for PFS1 and the primary end point PFS2, regardless of the investigated subgroups.

By using individual patient data, this pooled analysis distinguishes itself from study-level meta-analyses (Berry et al, 2015; Pereira et al, 2015; Stein et al, 2016; Zhao et al, 2016). Our pooled subgroup analyses provide the best available evidence on predictors of response to fluoropyrimidine and bevacizumab maintenance treatment compared with observation thus far. All investigated subgroups showed a significant benefit from maintenance treatment regarding PFS1 and PFS2, except for patients with metachronous disease in PFS2. The latter may be partly due to the small number of patients with metachronous disease assigned to maintenance treatment (n=76). Another possible explanation could be a (partial) chemoresistance due to previous adjuvant treatment (Mekenkamp et al, 2010), since 108 out of 201 patients (54%) with metachronous disease received prior adjuvant chemotherapy. There is growing evidence that primary tumour sidedness (right colon: caecum-transverse colon; left colon: splenic flexure-rectum) influences prognosis and therapy response in mCRC patients (Petrelli et al, 2016; Holch et al, 2017). Although the specific data on sidedness were lacking in the present analysis, our findings do not suggest a predictive role of primary tumour location (colon vs rectosigmoid or rectum) for the benefit of maintenance treatment or observation. Patients with elevated compared to normal platelet count at start of induction treatment showed a significant interaction in favour of maintenance treatment regarding PFS1 and PFS2. Given the exploratory nature of our subgroup analyses, these findings do not allow definitive conclusions. Nonetheless, our results are in line with the MRC COIN trial, which previously showed that patients with elevated baseline platelet count had inferior survival and quality of life with intermittent chemotherapy, and should therefore not receive a treatment break (Adams et al, 2011).

Regarding OS, it should be noted that neither trial was designed or powered to show a difference in this end point. The overall treatment effect for OS differed significantly between CAIRO3 and AIO 0207, which limits the credibility of subgroup analyses regarding this end point. There was no significant difference in overall treatment effect when data were pooled. Although subgroup analyses for OS showed a marked heterogeneity between the two trials, significant interactions with maintenance treatment were observed for OS in females and in patients with complete or partial response as best response to induction treatment. The latter subgroup was also a significant predictor for the effect size of maintenance treatment in OS in the initial subgroup analyses of CAIRO3 (Simkens et al, 2015). This may be partly explained by the fact that pooled OS results were more influenced by CAIRO3 due to a larger sample size per arm.

There are several reasons that could explain the diverging overall treatment effect in OS between CAIRO3 and AIO 0207. For instance, OS can be highly influenced by subsequent treatment lines (Shi et al, 2015). In our analysis, therapies received during subsequent treatment lines were comparable between both trials, except for a higher proportion of patients who received anti-EGFR therapy in AIO 0207 compared with CAIRO3. Systematic differences in the sequence of agents used or in the total number of agents received were beyond the scope of the present analysis, since the data are likely to be too limited for a proper investigation of their impact. Furthermore, several important differences exist between CAIRO3 and AIO 0207 regarding patient registration (after vs before the start of induction therapy), fluoropyrimidine maintenance protocols (capecitabine vs any fluoropyrimidine), duration of induction treatment (18 vs 24 weeks), and exclusion of patients who experienced toxicity from oxaliplatin during induction treatment that precluded reintroduction of this agent (yes vs no). These differences in study designs, together with varying study populations, could have influenced treatment outcomes, especially regarding OS.

The rate of reintroduction according to protocol was significantly higher in CAIRO3 (54%) compared with AIO 0207 (33%). This is likely to be related to the exclusion of patients who were not eligible for oxaliplatin reintroduction in CAIRO3. It may also be related to a higher cumulative oxaliplatin dose resulting from the longer induction period in AIO 0207, suggesting that a 24-week induction period may be too long. These differences between CAIRO3 and AIO 0207 in number of cycles and cumulative doses administered during the induction and reintroduction phase may have influenced OS outcomes.

Our findings support the ESMO consensus guidelines recommendation that a combination of a fluoropyrimidine plus bevacizumab is the optimal maintenance treatment following induction treatment with fluoropyrimidine, oxaliplatin and bevacizumab (Van Cutsem et al, 2016). Our results suggest that both patients with poor prognostic characteristics and patients with favourable prognostic characteristics derive a significant benefit from maintenance treatment. Clearly, alternative outcome measures and factors should be considered in the treatment decision-making process such as quality of life (QoL) and a patient’s cultural and social preferences. Although inclusion of QoL measures in this IPD meta-analysis was difficult due to differences in time points of assessment and compliance rates, the individual trials reported comparable findings in the QoL analyses. Both trials showed that active maintenance treatment was not associated with a detrimental effect on QoL when compared with no treatment (Simkens et al, 2015; Quidde et al, 2016). Most importantly, treatment decisions should be individualised after a thorough discussion with the patient. This should include discussion of the estimated survival time, time free from cancer-related symptoms, side-effects and treatment constraints, and the impact on career and family life (social and financial), as stated in the ESMO consensus guidelines (Van Cutsem et al, 2016).

In conclusion, this IPD meta-analysis shows that fluoropyrimidine plus bevacizumab maintenance treatment is effective in mCRC patients with stable disease or better after induction treatment with a fluoropyrimidine, oxaliplatin, and bevacizumab, with a significant benefit in PFS1 and PFS2. Subgroup analyses did not identify any subpopulations that derived comparable benefit from observation after induction treatment.