Efficacy and Safety of Bevacizumab Combined with Chemotherapy for Managing Metastatic Breast Cancer: A Meta-Analysis of Randomized Controlled Trials

Although the FDA revoked the metastatic breast cancer (MBC) indication for bevacizumab (BEV) in 2011, BEV combined with paclitaxel is included in the NCCN breast cancer guidelines. This systematic assessment was performed to evaluate the efficacy and safety of BEV + chemotherapy (CHE) for managing MBC. PubMed and EMBASE were searched for original articles written in English and published before July 2015. Progression-free survival was significantly improved in the CHE + BEV arms compared to the CHE arms in the overall group and in the human epidermal growth factor receptor 2-negative group (HR 0.75, 95% CI: 0.68–0.84, P < 0.001; HR 0.75, 95% CI: 0.69–0.82, P < 0.001, respectively). There was no significant improvement in overall survival in the CHE + BEV arms compared to the CHE arms. Significantly more grade 3 febrile neutropenia, hypertension, proteinuria, and cardiac events were observed in the CHE + BEV arm; these events are controllable and reversible. Severe bleeding occurred more often in the BEV + taxane arms and in patients with brain metastases. Because CHE + BEV significantly increases progression-free survival in patients with MBC, it should be considered as a treatment option for these patients, provided that the target population and the combined CHE drugs are selected appropriately.


Selection of RCTs.
The study flow diagram illustrates the inclusion and exclusion of studies in this systematic assessment [7][8][9][11][12][13][14]. In accordance with our search strategy, the database search identified 113 abstracts. Following the primary screening, 86 abstracts were excluded. Twenty of the remaining 27 full-text articles were excluded for various reasons: they involved neoadjuvant BEV therapy, represented repeated or single-arm studies, or analyzed only quality of life or toxicity. A final set of seven articles was included in the quantitative synthesis. The PRISMA checklist can be found online in Supplementary Table S1.

Risk of bias in the included studies.
Due to inadequate randomized sequence generation and inadequate allocation concealment, we assessed three RCTs as having an unclear risk of selection bias. Due to their open-label trial design, we assessed four RCTs as having a high risk of performance and detection bias (Fig. 2).
Main characteristics of the studies included in the systematic assessment. Six trials were randomized phase III trials, and one was a randomized phase II trial. The primary endpoint was PFS in six trials and ORR in one trial. There were 4456 patients in this assessment, of whom 2691 were included in the CHE + BEV group and 1765 in the CHE group. There were 2848 HER2-negative patients in four trials and 1608 non-HER2-negative patients in three trials. There were 1312 patients with triple-negative breast cancer. Regarding treatment regimens, 1321 patients received BEV + CAP versus CAP, 1429 received BEV + DOC versus DOC, and 1126 received BEV + PAC versus PAC (Table 1).
Efficacy analysis. OS, hazard ratio (HR), and 95% confidence interval (CI) data were reported in four of the RCTs. One RCT reported only the OS and Kaplan-Meier curves; we used the Engauge Digitizer V4.1 (http://digitizer.sourceforge.net/) screenshot tool and a formula proposed by Parmar to estimate the HR and 95% CI for this study [15,16]. There was no significant heterogeneity in OS between the CHE + BEV and CHE groups (P > 0.05), so a fixed effects model was applied. The overall analysis indicated no significant improvement in OS in the CHE + BEV group compared to the CHE group (HR 0.95, 95% CI: 0.86-1.05, P = 0.313).
Complete PFS, HR, and 95% CI data were reported in seven RCTs. There was significant heterogeneity in PFS between the CHE + BEV and CHE groups (P < 0.10) in the overall analyses and non-HER2-negative subgroup analyses, so a random effects model was selected. However, there was no significant heterogeneity in PFS between the CHE + BEV and CHE groups in the HER2-negative subgroup analysis (P > 0.10), so a fixed effects model was applied. The overall analysis and HER2-negative subgroup analysis indicated significantly improved PFS in the CHE + BEV group compared to the CHE group (HR 0.75, 95% CI: 0.68-0.84, P < 0.001; HR 0.75, 95% CI: 0.69-0.82, P < 0.001; respectively) (Fig. 4). The non-HER2-negative subgroup analyses did not yield similar results (HR 0.78, 95% CI: 0.57-1.05, P > 0.05) (Fig. 4).
The PFS was significantly improved in the CHE + BEV group compared to the CHE group (HR 0.61, 95% CI: 0.47-0.80, P < 0.001) in patients with triple-negative MBC. Begg's test and Egger's test identified no significant publication bias in the overall, non-HER2-negative subgroup, and HER2-negative subgroup analyses (all P > 0.05).
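As a reading aid, the reported hazard ratios can be checked for internal consistency by moving to the log scale. The sketch below assumes that each 95% CI is symmetric on the log scale (a Wald-type interval), which is the usual convention but is not stated explicitly in this article; the function name `hr_to_log_scale` is introduced here for illustration only. It recovers the log HR, its standard error, and a two-sided P-value from a reported HR and CI.

```python
import math
from statistics import NormalDist


def hr_to_log_scale(hr, ci_low, ci_high, level=0.95):
    """Recover log(HR), its standard error, the z statistic and a two-sided
    Wald P-value from a reported hazard ratio and confidence interval.

    Assumption: the interval is symmetric around log(HR).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return log_hr, se, z, p


# Values taken from this article
pfs_overall = hr_to_log_scale(0.75, 0.68, 0.84)   # overall PFS
pfs_tnbc = hr_to_log_scale(0.61, 0.47, 0.80)      # triple-negative PFS
os_overall = hr_to_log_scale(0.95, 0.86, 1.05)    # overall OS (Discussion)

for label, (log_hr, se, z, p) in [("PFS overall", pfs_overall),
                                  ("PFS triple-negative", pfs_tnbc),
                                  ("OS overall", os_overall)]:
    print(f"{label}: logHR={log_hr:.3f}, SE={se:.3f}, z={z:.2f}, P={p:.3g}")
```

Under this assumption, the two PFS estimates give P-values well below 0.001, and the OS estimate gives a P-value of roughly 0.31, in line with the values reported in the text.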
Complete ORR, RR, and 95% CI data were reported in seven RCTs. There was significant heterogeneity in ORR between the CHE + BEV and CHE groups in the overall analyses and non-HER2-negative subgroup analyses, so a random effects model was applied. However, the heterogeneity in ORR in the HER2-negative subgroup was not significant (P > 0.05), so a fixed effects model was applied. The overall and HER2-negative subgroup analyses indicated a significantly improved ORR in the CHE + BEV group compared to the CHE group (RR 1.37, 95% CI: 1.18-1.59, P < 0.001; RR 1.33, 95% CI: 1.20-1.47, P < 0.001; respectively) (Fig. 6). The non-HER2-negative subgroup analysis indicated no significant improvement in ORR in the CHE + BEV group (RR 1.54, 95% CI: 0.93-2.55, P = 0.090). Begg's test and Egger's test indicated no significant publication bias in the HER2-negative subgroup and non-HER2-negative subgroup analyses (all P > 0.05). In the overall population, Begg's test revealed no significant publication bias (P = 0.108), while Egger's test indicated significant publication bias (P = 0.013).
Toxicity analysis. We extracted toxicity rates from all seven RCTs. There was no significant heterogeneity in the toxicity rates (P > 0.05) between the CHE + BEV and CHE groups, except for the incidence of hypertension, so a fixed effects model was applied to these outcomes. The heterogeneity for hypertension was significant (P < 0.05), so a random effects model was applied. There were significant increases in grade ≥ 3 febrile neutropenia, hypertension, proteinuria, and cardiac events in the CHE + BEV group (Table 2).
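The ORR and toxicity comparisons are expressed as risk ratios. For readers less familiar with this measure, the following minimal sketch shows how a single-trial RR and its 95% CI can be computed from event counts using the standard log-scale (Katz) approximation. The counts are hypothetical and chosen only for illustration; they are not data from any included trial, and the helper names `risk_ratio` and `excludes_null` are not part of the original analysis, which pooled trial data in RevMan and Stata.

```python
import math
from statistics import NormalDist


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, level=0.95):
    """Risk ratio with a log-scale (Katz) confidence interval."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt - 1 / n_trt
                       + 1 / events_ctl - 1 / n_ctl)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


def excludes_null(ci_low, ci_high):
    """True when the interval does not contain RR = 1 (no effect)."""
    return ci_low > 1 or ci_high < 1


# Hypothetical trial: 120/300 responders with CHE + BEV, 90/300 with CHE
rr, low, high = risk_ratio(120, 300, 90, 300)
print(f"RR={rr:.2f}, 95% CI: {low:.2f}-{high:.2f}")

# Intervals reported in this article
print(excludes_null(1.18, 1.59))  # overall ORR
print(excludes_null(0.93, 2.55))  # non-HER2-negative ORR
```

The same interval logic explains why the non-HER2-negative ORR result is reported as non-significant: its 95% CI spans 1.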

Discussion
BEV is an active targeted agent that significantly improves survival and controls clinical symptoms in many types of cancer. However, the role of BEV in MBC has been controversial.
BEV was approved by the FDA in 2008 for first-line treatment of HER2-negative MBC in combination with PAC. In 2011, the FDA revoked the MBC indication for BEV because of unexpected clinical outcomes and a higher incidence of severe toxicity in the BEV + CHE group. Nevertheless, at the insistence of experts, BEV + PAC is included in the National Comprehensive Cancer Network (NCCN) breast cancer guidelines. This systematic assessment was performed to provide a more comprehensive analysis of the efficacy and safety of BEV + CHE for managing MBC. Although some trials have achieved positive results in terms of PFS, others have reached opposite conclusions. Our overall analysis indicated a significantly improved PFS in the CHE + BEV group compared to the CHE group (HR 0.75, 95% CI: 0.68-0.84, P < 0.001), which is consistent with the conclusion of Valachis [10]. The analyses of the HER2-negative, triple-negative, DOC + BEV versus DOC, and PAC + BEV versus PAC subgroups yielded results similar to those of the overall analysis. Given this significant prolongation of PFS, BEV should be a treatment option for MBC patients in combination with CHE, especially in combination with PAC or DOC and in HER2-negative MBC patients.
The possible reasons that some studies did not observe an improved PFS are as follows: 1. BEV was used as a general therapy rather than as a targeted therapy and was not administered to patients with a specific molecular phenotype. 2. In the AVADO trial, the primary analysis determined that both 7.5 mg/kg BEV and 15 mg/kg BEV significantly improved PFS; however, an updated analysis determined that only the 15 mg/kg BEV arm experienced this benefit. Treatment assignments were not blinded following the primary data analysis, and potential investigator bias during tumor progression assessments may have negatively influenced the final result [9]. 3. The AVF2119g trial (CAP + BEV versus CAP) did not meet its primary endpoint of prolonged PFS (4.9 versus 4.2 months), but improved PFS was observed in the CAP cohort in the RIBBON-1 trial (8.6 versus 5.7 months) [8,14]. This latter positive finding suggests that the AVF2119g findings may have been due to the more heterogeneous nature of the study and to the higher number of patients with advanced MBC rather than to a lack of effectiveness of the combination therapy. 4. In the AVEREL trial, the investigator-assessed PFS HR was 0.82 (P = 0.0775), whereas the independent review committee-assessed PFS HR was 0.72 (P = 0.0162). The discrepancy between these two PFS values could have resulted from differences in imaging and lesion selection, the use of non-radiographic data, and clinical perceptions of new lesions [11].
Despite the striking and promising improvements in PFS in these studies, the systematic analysis indicated no significant improvement in OS in the CHE + BEV group compared to the CHE group (HR 0.95, 95% CI: 0.86-1.05, P = 0.313). OS is considered the gold standard for clinical outcome; however, PFS has been used as an alternative endpoint to identify potential benefit at an earlier time point [17]. We found that the addition of BEV to CHE did not significantly prolong OS, but this is by no means a definite indication that BEV has no value in prolonging survival in MBC patients. Possible explanations include the following: 1. PFS was designated as the primary endpoint in six of the RCTs, and ORR was designated as the primary endpoint in one RCT. According to statistical requirements, these study designs required an adequate number of patients to yield sufficient power to detect improvements in the median PFS or ORR. Therefore, these trials were not designed or adequately powered to detect differences in OS. 2. Many factors affect the final analysis of OS. After discontinuing their assigned treatment, the majority of patients received additional lines of treatment that included either CHE or hormonal agents, and patients were permitted to cross over from the CHE + PLA arm to the CHE + BEV arm [8]. Subsequent treatment data were not collected and analyzed for these patients, which may have compromised the ability to detect an improvement in OS [18,19]. Under such conditions, PFS better reflects the efficacy of BEV for the treatment of MBC than OS. 3. Treatment assignments were unblinded after the primary data analysis, which created the potential for investigator bias in tumor assessments, thereby potentially affecting the OS analysis [9]. 4. The AVADO trial results confirmed that the improvement in PFS was more pronounced in patients with high plasma VEGF-A concentrations after DOC + BEV treatment than in those with low VEGF-A concentrations [20]. Identifying and analyzing patient subgroups who show a BEV-specific molecular phenotype may lead to better survival outcomes. 5. The efficacy of BEV combination therapy may be affected by synergistic or antagonistic effects between BEV and different CHE regimens [21,22]. Therefore, the chemotherapeutic drugs that synergize with BEV are worthy of further research.
Treatment benefits and risks are equally important to patients, and the efficacy and safety of a drug are equally important in clinical trials. Given that BEV + CHE significantly improves PFS, the safety of BEV determines its fate. Hamilton EP summarized the BEV safety data and concluded that BEV is generally well tolerated and that the majority of adverse events are mild and manageable [23]. Huang H's meta-analysis revealed an increased risk of fatal adverse events in patients receiving BEV who had non-small cell lung cancer, pancreatic cancer, prostate cancer, or ovarian cancer. However, fatal adverse events were less common among breast cancer patients who were treated with BEV (RR 0.61; 95% CI, 0.39-0.95) [24]. The fact that BEV has also been associated with certain severe toxicities should not be ignored, and these events should be carefully analyzed to avoid their occurrence. Our meta-analysis indicated that severe neutropenia (≥ Grade 3), venous thromboembolic events (≥ Grade 3), arterial thromboembolic events (any grade/≥ Grade 3), and gastrointestinal perforation (any grade/≥ Grade 3) were infrequent and occurred at similar rates in the two arms. Severe febrile neutropenia, hypertension and proteinuria (all ≥ Grade 3) were significantly more common in the BEV combination group, but these adverse events are controllable and reversible in clinical practice. With the exception of higher rates of bleeding complications (5.4% and 1.7%) in the BEV + taxane arm of the RIBBON-1 trial [8] and in the RIBBON-2 trial [12], BEV + CHE does not significantly increase the incidence of serious bleeding (≥ Grade 3). Patients taking anticoagulants or aspirin and those with treated CNS metastases, occult brain metastases or developing brain metastases were included in these trials, and this may partially explain the increased risk of serious bleeding [8,9,12,25]. Severe cardiac events (≥ Grade 3) were apparently increased in the CHE + BEV group.
However, the small number of events and the presence of potential confounders, including left chest wall radiation, left ventricular ejection fraction < 50% at study entry, pericardial metastatic involvement, and prior anthracycline exposure, render this comparison uncertain. To avoid severe toxicity, clinicians should carefully select patients and avoid drug combinations that can lead to severe toxicity.
Scientific Reports | 5:15746 | DOI: 10.1038/srep15746
In summary, this systematic assessment indicates that CHE + BEV therapy confers clinical benefit in terms of increased PFS and ORR in patients with MBC, especially in HER2-negative MBC patients. Although CHE + BEV did not significantly improve OS, numerous confounding factors in the trials mean that we cannot simply dismiss the clinical value of BEV with respect to OS. However, this combination therapy is associated with frequent adverse events. Thus, CHE + BEV should be considered as a treatment option for patients with MBC, provided that the target population and the combined chemotherapy drugs are selected appropriately. A major barrier to developing and implementing anti-angiogenic treatments is the difficulty in identifying patients who may benefit from BEV therapy. There is an urgent need to develop predictive molecular biomarkers that can guide patient selection and ensure the selection of optimal timing and the most synergistic chemotherapeutic drug combinations in patients with MBC. Large-scale studies with long-term follow-up are needed to answer these questions.

Methods
Literature search strategy. We performed a systematic assessment according to Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) criteria [26]. PubMed and EMBASE were searched for original articles written in English and published before July 2015.

RCT selection and exclusion criteria. The following inclusion criteria were utilized: 1. The trial was prospective, properly randomized, controlled, well-designed, and matched for age, sex, tumor stage, and performance status or Karnofsky performance status. 2. Subjects were patients with MBC, and histological or cytological confirmation was required. 3. Control arm patients received CHE, CHE + PLA or CHE + TRA (collectively referred to as the CHE group); and experimental arm patients received CHE + BEV, CHE + PLA + BEV, or CHE + TRA + BEV (collectively referred to as the CHE + BEV group). 4. The endpoints were OS, PFS, ORR, and toxicity rates. 5. Explicit survival information or survival curves in the original article were presented as censored at last follow-up, with a follow-up rate of > 95%. 6. Whenever trials with overlapping patient populations were encountered, we included only the trial with the longest follow-up.

Data collection and extraction. Two investigators (Qin Li and Pengfei Zhao) independently assessed all the identified abstracts according to the predefined inclusion criteria. If only one investigator considered an abstract eligible, the full text of the article was retrieved, and both investigators reviewed it in detail. An arbiter (Han Yan) resolved any discrepancies, or the investigators contacted the authors of the original study. We extracted and evaluated the variables, including author names, journal, publication year, sample size per arm, performance status, treatment regimens, line of treatment, median patient age, sex ratio, tumor stage, and prespecified efficacy and safety outcomes.
Assessment of methodological quality. Using the Cochrane Handbook for Systematic Reviews of Interventions [26], the two investigators independently assessed the methodological quality of the included studies and resolved any disagreements by discussion. The investigators evaluated the risk of bias in the studies using Review Manager Software (RevMan Version 5.1, The Nordic Cochrane Center, The Cochrane Collaboration, Copenhagen, Denmark).

Statistical analysis.
We performed a systematic assessment using RevMan Version 5.1.7 (http://ims.cochrane.org/revman, The Nordic Cochrane Center) and Stata 11.0 (StataCorp LP, College Station, TX, USA). We investigated heterogeneity using Cochran's Q test and the I² statistic. P > 0.1 and I² < 50% indicated a lack of inter-study heterogeneity, in which case we calculated pooled estimates of the HR and risk ratio (RR) across studies using a fixed effects model (Mantel-Haenszel method). P < 0.1 and I² > 50% indicated that the studies were heterogeneous, in which case we applied a random effects model (DerSimonian-Laird method). The principal measurements of effects were the HR and RR; these data are presented with a 95% confidence interval. All reported P-values are from 2-sided versions of the respective tests; P < 0.05 was considered statistically significant. Publication and selection bias were investigated through funnel plots using Egger's test and Begg's test [27,28].
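The model-selection rule described above can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not the RevMan or Stata implementation: it pools log-scale effects by generic inverse variance (rather than the Mantel-Haenszel method named in the text), computes Cochran's Q, I², and the DerSimonian-Laird between-study variance, and falls back to the random effects model whenever the two fixed-effects criteria (P > 0.1 and I² < 50%) are not both met. The handling of mixed cases, the function names `chi2_sf` and `pool`, and the input values are all assumptions introduced here; the inputs are hypothetical and are not trial data.

```python
import math
from statistics import NormalDist


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / gamma(a + 1), t = x / 2.
    """
    if x <= 0:
        return 1.0
    t = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)                # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))     # closed form for df = 1
    while a < df / 2:
        q += t ** a * math.exp(-t) / math.gamma(a + 1)
        a += 1
    return q


def pool(log_effects, ses, p_cut=0.10, i2_cut=50.0):
    """Pool log HRs or log RRs from at least two studies."""
    w = [1 / s ** 2 for s in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q and I^2 (in percent)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    if p_het > p_cut and i2 < i2_cut:           # no evidence of heterogeneity
        model, est, se = "fixed", fixed, math.sqrt(1 / sum_w)
    else:                                       # assumed fallback: random
        w_re = [1 / (s ** 2 + tau2) for s in ses]
        est = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
        model, se = "random", math.sqrt(1 / sum(w_re))

    z = NormalDist().inv_cdf(0.975)
    return {"model": model, "Q": q, "p_het": p_het, "I2": i2, "tau2": tau2,
            "effect": math.exp(est),
            "ci": (math.exp(est - z * se), math.exp(est + z * se))}


# Hypothetical hazard ratios with equal standard errors on the log scale
similar = pool([math.log(h) for h in (0.70, 0.80, 0.75)], [0.10] * 3)
mixed = pool([math.log(h) for h in (0.50, 0.75, 1.10)], [0.10] * 3)
print(similar["model"], round(similar["I2"], 1))
print(mixed["model"], round(mixed["I2"], 1))
```

With the homogeneous inputs the rule selects the fixed effects model; with the divergent inputs it selects the random effects model and yields a wider confidence interval, which is the behavior the threshold rule is meant to produce.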