Introduction

Allogeneic hematopoietic stem cell transplantation (Allo-HSCT) is a curative option for patients with acute myeloid leukemia (AML) but can be limited by high procedure-related toxicity and donor availability. Since the early 2000s, the development of non-myeloablative conditioning (NMAC) regimens has contributed substantially to extending the feasibility of Allo-HSCT to patients who are ineligible for myeloablative conditioning (MAC) regimens due to advanced age and/or comorbid conditions [1,2,3]. More recently, the worldwide use of haploidentical stem cell transplantation (Haplo-SCT) based on post-transplantation cyclophosphamide (PT-Cy) as part of graft-versus-host disease (GVHD) prophylaxis [4,5,6] has facilitated treatment of many patients lacking an HLA-matched donor. However, the optimal conditioning regimen for AML patients remains unknown. T cell-replete Haplo-SCT with PT-Cy was pioneered by the Johns Hopkins group using an NMAC regimen (cyclophosphamide + fludarabine + 2-Gy total body irradiation [CyFluTBI]) and a bone marrow (BM) graft, as reported in 2008 [4]. Although this platform results in low incidences of GVHD and non-relapse mortality (NRM), allowing its use in older and/or comorbid patients [7, 8], the relapse incidence (RI) among patients with myeloid malignancies is still a concern because lower conditioning intensity is usually associated with a higher risk of relapse [4, 9]. In 2015, the Johns Hopkins group confirmed these results by reporting an RI of 45% at 3 years in patients with intermediate disease risk index (DRI) AML [10]. To overcome this drawback, conditioning regimens based on various doses and combinations of anti-leukemic drugs such as thiotepa, busulfan, treosulfan and/or melphalan were developed [11,12,13]. In these settings, RI seems lower than previously reported using NMAC, ranging from 20% to 35%.
The reduced intensity conditioning (RIC) regimen based on thiotepa, reduced-dose busulfan and fludarabine (TBF) is widely used, notably in Europe, as a more intensive alternative to NMAC in patients with AML who are still unfit for truly MAC regimens [14, 15]. However, the benefit of such conditioning intensification may be counterbalanced by an excess of toxicity, GVHD and NRM, making the choice of conditioning therapy difficult, notably for frail patients with AML in complete remission (CR). To date, no prospective study comparing TBF versus (vs.) CyFluTBI in CR AML has been published. We thus performed this retrospective comparison on behalf of the Acute Leukemia Working Party (ALWP) of the European Society for Blood and Marrow Transplantation (EBMT).

Patients and methods

Design and selection criteria

This retrospective study included data entered into the Promise EBMT database registry. All patients gave signed informed consent for data collection and subsequent analysis. This study was conducted in accordance with the principles of the Declaration of Helsinki. Selection criteria were: (1) adult patients (age ≥ 18 years at time of transplant) with CR1 or CR2 AML; (2) Haplo-SCT between 2009 and 2019; (3) no prior Allo-HSCT; (4) PT-Cy as part of GVHD prophylaxis; (5) no in vivo depletion using antithymocyte globulin (ATG) or alemtuzumab; (6) no ex vivo T-cell depletion; and (7) TBF RIC or CyFluTBI NMAC regimen. The list of participating centers is provided in Supplementary Appendix 1.

Transplantation modalities

All patients in the TBF group received a maximal total busulfan dose of 6.4 mg/kg or 260 mg/m² (2-day equivalent IV busulfan) and a total thiotepa dose ranging from 5 mg/kg to 12 mg/kg. All patients in the CyFluTBI group received 2 Gy TBI, fludarabine and cyclophosphamide as previously described by Luznik et al. [4]. Patients received either BM or peripheral blood stem cells (PBSC) as graft source. In addition to PT-Cy, all types of GVHD prophylaxis were allowed except in vivo or ex vivo T-cell depletion.

Statistical analyses

Baseline patient, disease, and transplantation characteristics were expressed as frequency and percentage for qualitative variables and as median and interquartile range (IQR) for quantitative variables. Baseline variables were compared between conditioning regimens using the Chi-square or Fisher’s exact test for qualitative variables and the Wilcoxon test for quantitative variables. All time-to-event analyses started from the time of Allo-HSCT, and patients were censored at last follow-up in the absence of a relevant event. For overall survival (OS), death from any cause was the relevant event. For leukemia-free survival (LFS), AML relapse and death from any cause were considered as relevant events. For GVHD-free, relapse-free survival (GRFS), grade III-IV acute GVHD (according to Glucksberg classification [16]), extensive chronic GVHD (according to Seattle classification [17]), AML relapse, and death from any cause were the relevant events [18]. The primary endpoint was LFS. The Kaplan–Meier estimator was used to calculate survival probabilities of OS, LFS and GRFS [19]. Cumulative incidences of GVHD, relapse and NRM were computed taking competing risks into account, as recommended [20, 21]. Death and relapse prior to GVHD were the competing events for GVHD. AML relapse and death before relapse (i.e. NRM) were mutually competing events for the calculation of RI and NRM. Because we expected a differential impact of conditioning intensity across age groups, we compared TBF vs. CyFluTBI separately in patients aged < 60 years and in those aged ≥ 60 years. In each age subgroup, univariate analyses were performed using the Log-rank test for survival outcomes (OS, LFS and GRFS) and Gray’s test for cumulative incidence outcomes (GVHD, RI and NRM).
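The two estimators named above can be illustrated with a minimal sketch. The Python code below is not the R code used for this study; it is a standard-library illustration, on invented toy data, of the Kaplan–Meier product-limit estimator and of the nonparametric cumulative incidence estimator in the presence of competing risks. The function names and the cause coding (0 = censored, 1 = relapse, 2 = NRM) are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate; events: 1 = event, 0 = censored.
    Returns [(t, S(t))] at each distinct event time."""
    n_at_risk = len(times)
    leaving = Counter(times)  # everyone leaving the risk set at t
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    surv, curve = 1.0, []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving[t]
    return curve


def cumulative_incidence(times, causes, cause):
    """Cumulative incidence of `cause` with all other causes competing.
    causes: 0 = censored, 1, 2, ... = type of first event."""
    n_at_risk = len(times)
    leaving = Counter(times)
    any_event = Counter(t for t, c in zip(times, causes) if c != 0)
    of_cause = Counter(t for t, c in zip(times, causes) if c == cause)
    event_free, cif, curve = 1.0, 0.0, []
    for t in sorted(leaving):
        d_k = of_cause.get(t, 0)
        if d_k:
            # probability of being event-free just before t, times the
            # cause-specific hazard at t
            cif += event_free * d_k / n_at_risk
            curve.append((t, cif))
        d_all = any_event.get(t, 0)
        if d_all:
            event_free *= 1.0 - d_all / n_at_risk
        n_at_risk -= leaving[t]
    return curve


# Invented toy data (months from transplant); NOT study data.
times = [2, 3, 5, 7, 8, 11]
causes = [1, 2, 0, 1, 0, 2]  # hypothetical coding: 1 = relapse, 2 = NRM

lfs = kaplan_meier(times, [1 if c else 0 for c in causes])
ri = cumulative_incidence(times, causes, cause=1)
nrm = cumulative_incidence(times, causes, cause=2)
```

Because relapse and NRM are the only two ways to fail LFS, the three curves satisfy 1 − LFS(t) = RI(t) + NRM(t) at every time point, which is why a naive 1 − Kaplan–Meier estimate that treats the competing event as censoring would overestimate each incidence.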
The hazard ratios (HR), together with their 95% confidence intervals (95% CI) and p values, were estimated using a multivariate Cox proportional-hazards regression model, providing a cause-specific HR in the presence of competing risks. To adjust the HRs for conditioning regimen, the following covariates were included: number of CR (CR1 vs. CR2), AML cytogenetic risk group according to ELN classification [22] (good or intermediate vs. poor vs. failed or missing), patient age (continuous, impact per 3-year increment) at Haplo-SCT, year (continuous, impact per 3-year increment) of Haplo-SCT, sex mismatch (female to male vs. other), donor cytomegalovirus (CMV) serostatus (positive vs. negative) and graft source (BM vs. PBSC). For the older group (≥60 years) the Karnofsky performance score (KPS < 90 vs. ≥90) was also included. These variables were selected because of known clinical relevance, impact in univariate analysis and/or statistically significant imbalance between TBF and CyFluTBI groups. Differences were considered statistically significant when p < 0.05. Center effect was taken into account by introducing a random effect or ‘frailty’ into the model. Statistical analyses were performed with R software version 4.0.2 (https://www.r-project.org).
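As a reading aid for the HR, 95% CI and p value triplets reported in the Results, the sketch below shows how the three quantities relate under a Wald (normal) approximation on the log-hazard scale. This is an assumption made for illustration: the exact test used by the frailty model is not specified in the text, and published values are rounded, so the recomputed p value is only approximate. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error of log(HR), the Wald z statistic and the
    two-sided p value from a hazard ratio and its 95% confidence interval.
    Assumes the CI is symmetric on the log scale (Wald interval)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Values reported in the Results for poor cytogenetic risk and RI in
# patients aged <60 years: HR = 2.00, 95% CI = 1-3.98.
se, z, p = wald_from_ci(2.00, 1.00, 3.98)
print(f"se(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the recomputed p value falls close to the reported 0.049, which illustrates why a confidence interval whose lower bound rounds to 1 can still correspond to a p value just below 0.05.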

Results

Patient characteristics

A total of 490 patients were analyzed, including 203 patients younger than 60 years and 287 patients aged 60 years or older (Table 1). Among patients below 60 years of age, 106 and 97 received TBF and CyFluTBI, respectively. In the TBF group, year of Haplo-SCT was significantly more recent (2018 vs. 2016, p < 0.01) and donor CMV serostatus was more frequently positive (72% vs. 47%, p < 0.01) compared to the CyFluTBI group. Moreover, patients in the TBF group were more frequently given cyclosporine A (CSA) and mycophenolate mofetil (MMF) as additional GVHD prophylaxis compared to the CyFluTBI group (71% vs. 34%, p < 0.01). Conversely, tacrolimus and MMF were given to 17% and 53% of patients in the TBF and CyFluTBI groups, respectively. No significant imbalance was observed across conditioning groups in terms of AML cytogenetic risk, secondary AML, CR1 or CR2 AML, female to male Allo-HSCT, graft source, age, and KPS.

Table 1 Patient and transplantation characteristics.

In the cohort of older patients, 170 and 117 patients received TBF and CyFluTBI, respectively. Compared to patients in the TBF group, those in the CyFluTBI group were significantly older (median age 68 vs. 65 years, p < 0.01), more frequently received PBSC as graft source (68% vs. 50%, p < 0.01), more frequently had a KPS < 90 (48% vs. 23%, p < 0.01) and negative donor CMV status (52% vs. 38%, p = 0.02). Additional GVHD prophylaxis was more frequently based on CSA and MMF in the TBF group (74% vs. 56%, p < 0.01).

Impact of conditioning intensity in patients younger than 60 years

In patients aged <60 years, 2-year LFS was 63% and 41% (p = 0.01), 2-year OS was 63% and 51% (p = 0.25), 2-year RI was 15% and 43% (p < 0.01) and 2-year NRM was 22% and 16% (p = 0.12) for TBF and CyFluTBI, respectively (Fig. 1, Table 2). The cumulative incidence of neutrophil recovery at day +42 was 94% and 95% in the TBF and CyFluTBI groups, respectively (median [IQR]: TBF: 21 days [16–24], CyFluTBI: 19 days [17–21]). The 100-day incidence of acute grade II–IV GVHD was 25% and 21% (p = 0.51), 2-year incidence of chronic GVHD was 33% and 28% (p = 0.79) and 2-year GRFS was 50% and 34% (p = 0.09) for TBF and CyFluTBI, respectively. The incidence of grade III–IV acute GVHD at day +100 was 10% and 7% in the TBF and CyFluTBI groups, respectively. Multivariate analysis confirmed the higher RI (HR = 3.59, 95% CI = 1.75–7.37, p < 0.01), lower LFS (HR = 1.98, 95% CI = 1.22–3.22, p < 0.01) and lower OS (HR = 1.73, 95% CI = 1.04–2.88, p = 0.04, Table 3) in the CyFluTBI group aged <60 years. No significant difference between conditioning groups was observed in NRM, incidence of acute and chronic GVHD, or GRFS (Table 3). Beyond the type of conditioning regimen, multivariate analysis showed that compared with patients transplanted in CR1, those with CR2 AML had a higher risk of NRM (HR = 3.19, 95% CI = 1.55–6.55, p < 0.01) and lower OS (HR = 1.70, 95% CI = 1.02–2.81, p = 0.04). In addition, poor cytogenetic risk was associated with significantly higher RI (HR = 2.00, 95% CI = 1.00–3.98, p = 0.049) and lower LFS (HR = 1.74, 95% CI = 1.01–3.01, p = 0.047). Age, donor CMV serostatus, gender mismatch, and graft source did not significantly influence any endpoint. Hazard ratios, 95% CIs and p values for all covariates included in the multivariate model are provided in Supplementary Table 1.

Fig. 1: Outcome after Haplo-SCT in patients aged <60 years according to conditioning regimen.

LFS (a), OS (b), RI (c) and NRM (d).

Table 2 Univariate analyses: influence of conditioning regimen on outcomes after Allo-HSCT.
Table 3 Multivariate analyses: influence of conditioning regimen on outcomes after Allo-HSCT. Hazard ratio, 95% confidence intervals and p values for all covariates included in the model are provided in Supplementary Tables 1 and 2.

Impact of conditioning intensity in patients older than 60 years

In patients aged ≥60 years, the 2-year LFS was 48% and 49% (p = 0.76), 2-year OS was 54% and 55% (p = 0.84), 2-year RI was 22% and 28% (p = 0.09) and 2-year NRM was 30% and 23% (p = 0.20) for TBF and CyFluTBI, respectively (Fig. 2, Table 2). The cumulative incidence of neutrophil recovery at day +42 was 95% and 92% in the TBF and CyFluTBI groups, respectively (median [IQR]: TBF: 19 days [17–22], CyFluTBI: 18 days [16–22]). The 100-day incidence of acute grade II–IV GVHD was 24% and 27% (p = 0.56), 2-year incidence of chronic GVHD was 26% and 33% (p = 0.11) and 2-year GRFS was 40% and 34% (p = 0.17) for TBF and CyFluTBI, respectively. The incidence of grade III–IV acute GVHD at day +100 was 16% and 8% in the TBF and CyFluTBI groups, respectively. Multivariate analysis confirmed that conditioning regimen did not significantly influence LFS (HR = 0.90, 95% CI = 0.56–1.44, p = 0.65), OS (HR = 0.81, 95% CI = 0.49–1.32, p = 0.39) or RI (HR = 1.78, 95% CI = 0.90–3.50, p = 0.10), but showed that CyFluTBI was associated with a significantly lower risk of NRM (HR = 0.48, 95% CI = 0.25–0.92, p = 0.03, Table 3). No significant difference between conditioning groups was observed with respect to incidence of acute and chronic GVHD and GRFS (Table 3). Other factors that independently influenced outcomes in multivariate analyses were graft source and AML cytogenetic risk. The use of PBSC was significantly associated with a lower risk of relapse (HR = 0.52, 95% CI = 0.27–0.98, p = 0.04) and a higher risk of grade II–IV acute GVHD (HR = 1.95, 95% CI = 1.10–3.44, p = 0.02), without significantly influencing chronic GVHD, NRM, LFS, GRFS or OS. As observed in the younger cohort, poor risk cytogenetics was associated with a significantly higher RI (HR = 2.56, 95% CI = 1.32–4.98, p < 0.01) and lower LFS (HR = 1.73, 95% CI = 1.07–2.78, p = 0.03). Age, donor CMV serostatus, gender mismatch, KPS, and the number of CR (CR1 vs. CR2) did not significantly influence any endpoint.
Hazard ratios, 95% CIs and p values for all covariates included in the multivariate model are provided in Supplementary Table 2.

Fig. 2: Outcome after Haplo-SCT in patients aged ≥60 years according to conditioning regimen.

LFS (a), OS (b), RI (c) and NRM (d).

Discussion

In the context of AML, the optimal conditioning regimen remains unknown. Several studies have previously reported RIC vs. MAC comparisons, in the setting of both HLA-matched and haploidentical Allo-HSCT, showing that RIC regimens are a valuable alternative for patients who are unfit for MAC regimens [23,24,25,26,27,28,29]. However, the lower NRM following RIC might be counterbalanced by a higher risk of relapse compared to MAC, underlining that fine-tuning of conditioning intensity remains important [27, 29, 30]. Moreover, intensity and myeloablative potential are highly heterogeneous across the various types of RIC regimens, ranging from truly NMAC to RIC that contain combinations of low dose myeloablative drugs. It is important to note that these definitions can be ambiguous, notably when high dose PT-Cy is used. Although the CyFluTBI regimen was initially reported as an NMAC regimen by the Baltimore group [4, 7], it does not strictly fit the EBMT definition of an NMAC regimen and could also be considered a RIC regimen [31]. This would also be the case using the recent transplant conditioning intensity (TCI) score, highlighting the limits of such classifications [32]. This complexity explains in part why the role of conditioning intensity across the various types of RIC regimens is still a matter of debate. Only one prospective trial has randomized an NMAC vs. a RIC regimen: Blaise et al. showed that, in the context of matched related donor Allo-HSCT, NMAC (fludarabine + 2 Gy TBI) was associated with lower NRM compared to RIC (fludarabine + oral busulfan 8 mg/kg), but also with a higher risk of relapse, thus resulting in similar survival [33]. However, there is no similar comparative study in the context of Haplo-SCT for AML. Haplo-SCT using the PT-Cy platform was initially reported with the CyFluTBI NMAC regimen and a BM graft [4].
This resulted in low incidences of both GVHD and NRM, making this regimen a valuable option for patients with advanced age and/or comorbidities [7, 8]. However, the RI remains high for patients with AML, as previously reported by the Johns Hopkins group [10]. Compared to the truly NMAC CyFluTBI regimen, a TBF regimen – even in its RIC version – carries much more myeloablative potential due to the combination of two anti-leukemic alkylating agents. Accordingly, many studies have reported the efficacy of a TBF regimen for AML patients, in both HLA-matched and haploidentical Allo-HSCT [14, 15, 34,35,36]. However, some studies also showed a high incidence of NRM following PT-Cy Haplo-SCT prepared with TBF [36, 37]. Pagliardini et al. reported promising outcomes for CR1 AML patients receiving a TBF regimen (2-year RI and NRM of 17% and 16%, respectively) while patients with more advanced AML had a higher risk of NRM (38%) [36]. Bazarbachi et al. recently reported on behalf of the ALWP of the EBMT that TBF (RIC and MAC) for Haplo-SCT decreased the risk of relapse compared to the fludarabine-busulfan platform (RIC and MAC) in matched unrelated donor Allo-HSCT, but with a significant increase in the risk of NRM, leading to similar survival [37]. Taken together, these results suggest that TBF RIC could be more effective than CyFluTBI NMAC for AML patients undergoing Haplo-SCT but might also lead to a higher risk of NRM.

In the present study, we report the first comparison of TBF RIC vs. CyFluTBI NMAC regimens in the context of PT-Cy Haplo-SCT for AML patients. The results indicate that in patients younger than 60 years, the use of TBF provides significantly better LFS than CyFluTBI due to lower RI, without a significant increase in the risk of NRM. Interestingly, multivariate analysis showed that, apart from the conditioning regimen, the only factors that independently influenced outcome were disease related (disease status [CR1 vs. CR2] and cytogenetics). Of note, Haplo-SCT in CR2 was the only factor that was significantly associated with NRM. This is in line with the previous report of Pagliardini et al. suggesting that TBF is safe and effective for CR1 AML patients, but should be used with caution in patients with more advanced disease [36].

By contrast, older patients did not benefit from such conditioning intensification. Indeed, despite a non-significant lower risk of relapse after TBF, the significantly higher risk of NRM using TBF precluded any benefit in LFS or OS. It is important to note that the significantly higher risk of NRM after TBF was observed although the patients in the CyFluTBI group had significantly lower KPS and higher HCT-CI. This underlines the major impact of conditioning intensity on NRM. Interestingly, the use of PBSC rather than BM was significantly associated with a lower risk of relapse. This underlines the importance of graft source selection to increase the graft-versus-leukemia (GVL) effect, notably in older patients, who seem not to benefit from conditioning intensification.

Our study highlights that conditioning intensity across the RIC regimens still plays a major role in Haplo-SCT for AML. In addition, we think that a single optimal RIC regimen that fits all CR AML patients is unlikely to exist. Indeed, we showed that the optimal balance between the conditioning-related anti-leukemic effect and its toxicity might depend on age. Beyond age, patient-related factors such as the HCT comorbidity index (HCT-CI) and geriatric evaluation should also be considered to fine-tune conditioning intensity in a more personalized approach [38]. However, in this retrospective study, we were not able to fully analyze these parameters due to missing values. In addition, the nature of our analysis made it impossible to identify the reason for choosing an NMAC or a RIC regimen, thus limiting in part the interpretation of the results. In older patients, the significantly lower KPS and higher HCT-CI observed in the NMAC group suggest that these variables were possibly used to select conditioning intensity, but other unavailable parameters probably influenced the physician's choice. In addition, we were not able to test the impact of conditioning regimen intensity according to pre-transplantation minimal residual disease (MRD) because this parameter was unavailable in most cases. This is an obvious limitation in interpreting our data, given that pre-transplantation MRD is recognized as a strong predictor of outcome in AML. Thus, although this study does not identify the optimal conditioning regimen, our results provide a basis that may help the design of future prospective trials investigating conditioning intensity in AML patients undergoing RIC PT-Cy Haplo-SCT.

To conclude, in CR AML patients who will not receive a truly myeloablative regimen prior to PT-Cy Haplo-SCT, patient age could be used to determine the conditioning intensity. Younger patients seem to benefit from conditioning intensification from CyFluTBI to TBF regimens, while older ones do not. For the latter, post-transplantation therapies should be actively investigated as an alternative to conditioning intensification in order to decrease relapse after Haplo-SCT.