Introduction

Inhibition of cyclin-dependent kinase 4 (CDK4) and cyclin-dependent kinase 6 (CDK6) in combination with endocrine therapy is the first-line standard of care for hormone receptor-positive, human epidermal growth factor receptor 2-negative (HR+/HER2−) locally advanced or metastatic breast cancer (MBC)1,2,3. In phase III trials, the CDK4/6 inhibitors palbociclib, ribociclib, and abemaciclib have shown a consistent improvement in progression-free survival (PFS) when combined with an aromatase inhibitor (AI), fulvestrant, or tamoxifen4,5, with hazard ratios for PFS ranging between 0.50 and 0.59. In contrast, while individual trials of ribociclib with endocrine therapy have reported statistically significant improvement in overall survival (OS), such improvements have not been reported for palbociclib or abemaciclib6,7,8,9,10,11. This is reflected in the National Comprehensive Cancer Network guidelines, in which ribociclib is the only category 1 preferred first-line treatment option for HR+/HER2− MBC in combination with an AI, whereas both abemaciclib and ribociclib are category 1 preferred first-line options in combination with fulvestrant3.

CDK4/6 inhibitors can be associated with significant symptom burden that may limit tolerability and impact patients’ health-related quality of life12. In a pooled analysis of clinical trials, more than 70% of older patients had their treatment dose reduced and more than 15% discontinued treatment13. Tolerability is a key metric for CDK4/6 inhibitors given that the duration of treatment can extend beyond 2 years, especially when used in the first-line setting.

A robust analysis of both relative efficacy and relative tolerability is therefore of interest to help clinicians and patients make informed decisions about the optimal agent to be used.

Methods

Search strategy and study selection

A network meta-analysis, registered in PROSPERO (registration number CRD42023392416), was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines14. Inclusion criteria comprised phase III randomized controlled trials (RCTs) in which patients with HR+/HER2− metastatic breast cancer were treated with a CDK4/6 inhibitor in combination with endocrine therapy (AI, fulvestrant or tamoxifen) compared to endocrine therapy alone in the first- or second-line setting. There was no limitation on year or language of publication. Meta-analyses, single-arm trials, and observational studies were excluded. Only studies of human subjects were included. When more than one publication was identified for the same clinical trial, data from the most recent or complete report were included.

A search strategy was constructed using ClinicalTrials.gov. Titles and abstracts identified by this strategy were screened independently by two reviewers (C.K. and E.A.) for inclusion; disagreements were resolved by consensus. The following variables were extracted from all eligible manuscripts: year of publication, median duration of follow-up, study sample size and the treatment in the experimental and control groups. For each approved CDK4/6 inhibitor, data were extracted on efficacy and on pre-specified common and serious treatment-related adverse events. For efficacy outcomes, the study-reported hazard ratios (HR) and respective 95% confidence intervals (CI) for overall survival (OS) were extracted. For safety and tolerability, the data extracted included treatment-related death, treatment discontinuation due to adverse events and selected adverse events (AEs). For hematological toxicities, data were extracted on grade 3–4 neutropenia, anemia, and thrombocytopenia. For gastrointestinal (GI) toxicities, data were extracted on both grade 1–2 and grade 3–4 diarrhea, nausea, and vomiting. Additional data were extracted on grade 1–2 stomatitis, grade 1–2 fatigue and/or asthenia, grade 3–4 venous thromboembolism (VTE), grade 3–4 transaminitis, grade 3–4 dyspnea and/or cough, grade 3–4 infection, grade 3–4 prolonged QT and grade 1–2 alopecia. The number of events and the number of patients at risk were extracted individually for both the CDK4/6 inhibitor and control groups in each trial. Outcome measures were obtained from the most recently published manuscripts and cross-referenced with data in the ClinicalTrials.gov registry to ensure consistency.

Data synthesis and statistical analysis

When more than one study reported data for either efficacy or safety and tolerability outcomes, these were pooled in a meta-analysis using RevMan 5.4 (Cochrane Collaboration, Copenhagen, Denmark). For efficacy, HR for OS and associated 95% CI were pooled using the generic inverse variance method. For the toxicity profile, the odds ratio (OR) and associated standard error (SE) for each adverse event were calculated relative to endocrine therapy alone using the Mantel–Haenszel method. Pooling was performed using fixed-effects modeling irrespective of statistical heterogeneity. Due to the expected differences in endocrine therapy and patient characteristics between studies, analyses were performed separately for each endocrine therapy backbone (AI/tamoxifen or fulvestrant) to compare ribociclib and abemaciclib to palbociclib. A network meta-analysis was then performed using WinBUGS within Microsoft Excel (Microsoft Corp, Redmond, WA). A post-hoc sensitivity meta-analysis was also performed, repeating the analysis with post-hoc data from one trial in which there were substantial missing data. Statistical tests were two-sided, and statistical significance was defined as p < 0.05. No correction was made for multiple statistical testing.
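The two fixed-effect pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration of the standard generic inverse-variance and Mantel–Haenszel formulas, not the RevMan implementation used in the study. The function names, the input layout, and the example numbers are all assumptions introduced for illustration; the example inputs are hypothetical and are not data from the included trials. The SE of the Mantel–Haenszel log OR uses the Robins–Greenland–Breslow variance estimator, which is an assumption about how such an SE would typically be obtained.

```python
import math

Z_CRIT = 1.959964  # normal quantile for a two-sided 95% interval


def pool_inverse_variance(estimates):
    """Fixed-effect generic inverse-variance pooling of hazard ratios.

    estimates: list of (hr, ci_low, ci_high) on the ratio scale.
    Returns (pooled_hr, ci_low, ci_high).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, ci_low, ci_high in estimates:
        # Recover the SE of log(HR) from the reported 95% CI.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_CRIT)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(hr)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_CRIT * pooled_se),
        math.exp(pooled_log + Z_CRIT * pooled_se),
    )


def mantel_haenszel_or(tables):
    """Fixed-effect Mantel-Haenszel pooled odds ratio.

    tables: list of (events_trt, n_trt, events_ctl, n_ctl), one per trial.
    Returns (pooled_or, se_log_or), with the SE taken from the
    Robins-Greenland-Breslow variance estimator.
    """
    sum_r = sum_s = sum_pr = sum_ps_qr = sum_qs = 0.0
    for a, n_trt, c, n_ctl in tables:
        b, d = n_trt - a, n_ctl - c  # non-events in each arm
        n = n_trt + n_ctl
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s
    var_log_or = (
        sum_pr / (2 * sum_r ** 2)
        + sum_ps_qr / (2 * sum_r * sum_s)
        + sum_qs / (2 * sum_s ** 2)
    )
    return sum_r / sum_s, math.sqrt(var_log_or)


# Hypothetical inputs for illustration only (not data from the included trials).
example_hrs = [(0.80, 0.60, 1.00), (0.80, 0.60, 1.00)]
example_tables = [(10, 100, 5, 100)]
pooled_hr, hr_low, hr_high = pool_inverse_variance(example_hrs)
pooled_or, se_log_or = mantel_haenszel_or(example_tables)
```

Pooling two identical studies leaves the point estimate unchanged and narrows the interval, which is a quick sanity check on the weighting. The sketch does not handle adverse events with zero events in every control arm, where the Mantel–Haenszel denominator is zero.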

Ethics approval

This study was exempt from ethics board approval since it used publicly available data exclusively.

Results

The study selection schema is shown in Fig. 1. Seven phase III RCTs were included in the analysis: PALOMA-2, PALOMA-3, MONALEESA-2, MONALEESA-3, MONALEESA-7, MONARCH-2, and MONARCH-3 (refs. 6,7,8,9,10,11,15,16). In total, the analysis comprised 4415 patients, of whom 2718 received a CDK4/6 inhibitor (1153 ribociclib, 791 palbociclib, 774 abemaciclib). In 4 RCTs (1441 patients), the endocrine therapy backbone was an AI or tamoxifen and in 3 RCTs (1277 patients) it was fulvestrant. The median follow-up was 70.2 months (range: 48.7–97.2 months). Characteristics of the studies are outlined in Table 1.

Figure 1 PRISMA flow diagram.

Table 1 Characteristics of included studies.

Efficacy

In the meta-analysis of the CDK4/6 inhibitors with an AI backbone, palbociclib had a non-significantly worse OS compared to ribociclib and abemaciclib (HR 1.26 [95% CI 0.88–1.80, p = 0.21] and 1.19 [95% CI 0.80–1.76, p = 0.39], respectively). There was no significant difference in OS with ribociclib compared to abemaciclib (HR 1.06 [95% CI 0.80–1.41, p = 0.70]). For the fulvestrant backbone, palbociclib had similar OS compared to both ribociclib and abemaciclib (HR 1.12 [95% CI 0.75–1.66, p = 0.59] and 1.08 [95% CI 0.72–1.61, p = 0.73], respectively). Similarly, there was no significant difference in OS between ribociclib and abemaciclib (HR 0.96 [95% CI 0.66–1.42, p = 0.85]). Table 2 summarizes all indirect comparisons between the 3 different CDK4/6 inhibitors (forest plots for these analyses are shown in the Supplementary File).
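The relationship between a reported HR, its 95% CI and its p value, and the way an indirect comparison can be formed from two trials sharing a comparator, can be illustrated with a short Python sketch. The indirect comparison shown is the frequentist Bucher adjusted indirect comparison, used here as a simpler stand-in and not the WinBUGS model reported in the Methods. All function names are assumptions for illustration, and the inputs to the indirect comparison are hypothetical. Only the first check uses numbers from the text (HR 1.26, 95% CI 0.88–1.80), and it assumes a normal approximation on the log scale.

```python
import math

Z_CRIT = 1.959964  # normal quantile for a two-sided 95% interval


def se_from_ci(ci_low, ci_high):
    """SE of log(HR) implied by a 95% CI on the ratio scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z_CRIT)


def two_sided_p(hr, ci_low, ci_high):
    """Two-sided p value from a Wald z-test on log(HR)."""
    z = math.log(hr) / se_from_ci(ci_low, ci_high)
    return math.erfc(abs(z) / math.sqrt(2))


def bucher_indirect(hr_a, ci_a, hr_b, ci_b):
    """Indirect HR of A versus B through a common comparator.

    hr_a, ci_a: HR and (low, high) CI of A versus the comparator.
    hr_b, ci_b: HR and (low, high) CI of B versus the comparator.
    Returns (hr, ci_low, ci_high, p).
    """
    log_hr = math.log(hr_a) - math.log(hr_b)
    # Variances add because the two trials are independent.
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    p = math.erfc(abs(log_hr / se) / math.sqrt(2))
    return (
        math.exp(log_hr),
        math.exp(log_hr - Z_CRIT * se),
        math.exp(log_hr + Z_CRIT * se),
        p,
    )


# Consistency check on a result reported in the text.
p_reported = two_sided_p(1.26, 0.88, 1.80)

# Hypothetical inputs for illustration only (not trial data).
ind_hr, ind_low, ind_high, ind_p = bucher_indirect(
    0.75, (0.60, 0.94), 0.60, (0.48, 0.75)
)
```

Because the variances of the two direct estimates add, an indirect estimate is always less precise than either of the direct comparisons it is built from, which is consistent with the wide intervals reported above.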

Table 2 Differences in OS between the CDK4/6i with any ET or AI backbone with the PALOMA-2 sensitivity analysis.

In the PALOMA-2 trial, OS data were missing in 13% of the participants in the experimental arm and 21% in the control arm. In the post-hoc analysis utilizing data from the PALOMA-2 trial which excluded missing data, there was a smaller magnitude association with worse OS with palbociclib compared to ribociclib and abemaciclib (HR 1.14 [95% CI 0.80–1.63, p = 0.46] and 1.08 [95% CI 0.73–1.60, p = 0.70], respectively). This lower magnitude effect remained statistically non-significant.

Safety and tolerability

Differences in safety and tolerability were observed between the 3 different CDK4/6 inhibitors (see Table 3). With the AI/tamoxifen backbone, abemaciclib had significantly more GI toxicity than palbociclib, including more grade 1–2 vomiting and grade 1–2 diarrhea. Grade 3–4 neutropenia was significantly lower with abemaciclib; however, grade 3–4 infections were significantly higher. Grade 3–4 transaminitis was also higher with abemaciclib. Compared to palbociclib, ribociclib had significantly more GI toxicity, with more grade 1–2 nausea, grade 1–2 vomiting and grade 3–4 vomiting, as well as more grade 3–4 transaminitis. In comparison to ribociclib, abemaciclib had significantly more diarrhea of any grade and more grade 3–4 anemia.

With the fulvestrant backbone, abemaciclib had significantly more GI toxicity than palbociclib, including all-grade nausea, grade 1–2 vomiting, and grade 3–4 diarrhea. Abemaciclib had less grade 3–4 neutropenia than palbociclib but more grade 3–4 infections. Furthermore, grade 3–4 dyspnea/pneumonitis was higher with abemaciclib. Compared to palbociclib, ribociclib had significantly more grade 3–4 QT prolongation and grade 3–4 transaminitis. Ribociclib also had more GI toxicity than palbociclib, including more grade 1–2 nausea, grade 1–2 vomiting, and grade 1–2 diarrhea. Ribociclib had less grade 1–2 fatigue/asthenia and less grade 3–4 neutropenia than palbociclib, but more grade 3–4 infections.

Table 3 Adverse events between the CDK4/6i with any ET or AI backbone.

Compared to ribociclib and palbociclib, abemaciclib had more treatment discontinuation secondary to adverse events. There was no significant difference between ribociclib and palbociclib. Treatment-related death was higher with abemaciclib compared to the other CDK4/6 inhibitors (see Table 3). This association was statistically significant for the comparison between abemaciclib and ribociclib, and approached but did not meet statistical significance for the comparison between abemaciclib and palbociclib.

Discussion

Three CDK4/6 inhibitors have been approved for use in combination with endocrine therapy for HR+/HER2− MBC. While all have shown superiority over endocrine therapy alone, their relative efficacy, safety and tolerability are unknown as no head-to-head trials have been performed. PFS effects have been very consistent across all CDK4/6i trials, with HR ranging between 0.50 and 0.59 and with meta-analyses not suggesting any statistically significant or clinically meaningful differences in PFS between drugs17. Therefore, differentiation in efficacy between the drugs has rested mainly on OS. In this study, we performed a network meta-analysis to indirectly evaluate the differences in OS and safety profile of these agents. Our results show that differences in OS between the three agents are non-significant and that, in most cases, effect sizes are not clinically meaningful irrespective of statistical significance. However, as expected, marked differences in safety and tolerability were identified.

While no statistically significant difference in OS was observed between the 3 CDK4/6 inhibitors, there was a non-significant association between palbociclib and shorter OS compared with the other CDK4/6 inhibitors. The reasons for this are unclear but may reflect trial design rather than inter-drug differences. The OS analysis for PALOMA-2 was limited by a substantial proportion of missing data: OS was missing in 13% of the participants in the experimental arm and 21% in the control arm. In a post-hoc sensitivity analysis of the PALOMA-2 trial excluding participants with missing OS data, larger magnitude relative (HR 0.87 vs 0.96) and absolute effects (difference in median OS 7 vs 2.7 months) were observed. However, as expected given the loss of power associated with any sensitivity analysis, the effect remained non-significant15,18. Using these post-hoc data in our meta-analysis resulted in lower magnitude differences in OS between palbociclib and the other CDK4/6 inhibitors. These effects remained non-significant and, based on thresholds recommended by the American Society of Clinical Oncology, were of borderline clinical meaningfulness19.

Another notable difference between these trials relates to the potential for informative censoring. The difference between study arms in the proportion of patients who were censored for reasons other than end of follow-up (e.g. premature loss to follow-up due to AEs or withdrawal of consent) was higher in ribociclib studies than in palbociclib studies (> 5% in MONALEESA-2 versus < 1% in PALOMA-2). The reasons for unbalanced censoring are unclear, but it may affect both cross-trial comparisons of different CDK4/6 inhibitors and meta-analytic comparisons20,21.

Consistent with prior reports, substantial differences in safety and tolerability were observed between the different CDK4/6 inhibitors22. In general, compared to ribociclib and abemaciclib, palbociclib showed more frequent hematological toxicity, but less frequent gastrointestinal toxicity. Patients with HR+/HER2− MBC report shortness of breath, fatigue, pain and vomiting as the most bothersome symptoms affecting their quality of life23. Furthermore, in a meta-analysis of phase 3 breast cancer trials, patients reporting more diarrhea had lower health-related quality of life and worse physical function24. Discussing side effect differences between drugs is an important way to increase patients’ satisfaction and adherence to treatment25. Of note, there were more treatment-related deaths reported with abemaciclib than with the other CDK4/6 inhibitors, although this observation was only statistically significant in the comparison of abemaciclib with ribociclib. This finding should be interpreted with caution, as these deaths may have been related to the breast cancer despite not meeting imaging criteria for progression. It can be difficult to distinguish treatment-related from disease-related causes of death, especially among patients with breast cancer who do not have disease that is measurable by Response Evaluation Criteria in Solid Tumors (RECIST)26. However, with data suggesting that mechanisms of resistance to CDK4/6 inhibitors seem uniform between the different agents, the higher odds of non-cancer deaths with abemaciclib relative to placebo compared to other CDK4/6 inhibitors are an important observation27.

Our study has some limitations. This is a literature-based network meta-analysis rather than an analysis of individual patient data. Although the included studies were generally homogeneous, there were differences in endocrine therapy backbone and patient populations (e.g. menopausal status) and, as detailed above, there was concern for post-randomization differences such as missing data and potential for unbalanced informative censoring among some studies. To address inter-study heterogeneity, our analysis compared studies with the same endocrine therapy backbone. This limits heterogeneity but results in a smaller sample size for comparison and consequently reduced statistical power. There is therefore an incomplete ability to assess the assumption of transitivity. However, in the absence of direct comparisons, assessment of relative efficacy, safety, and tolerability needs to be based on indirect comparisons, ideally using network meta-analytic methods as in this study. The main limitation of a network meta-analysis in this setting is that, unlike an individual patient data analysis in which the unit of analysis is the individual study participant, the unit of analysis in a meta-analysis is the individual trial. With only seven trials included, statistical power is reduced and this may decrease the certainty of our analysis.

In summary, despite differences between trials in effect sizes and statistical significance, this network meta-analysis found no statistically significant difference in OS between the different CDK4/6 inhibitors. Significant differences between CDK4/6i were observed for safety and tolerability outcomes. Real-world data analyses may help to identify whether there is a meaningful inter-drug difference in efficacy, safety or tolerability.