Introduction

In high-risk adult acute lymphoblastic leukemia (ALL), allogeneic hematopoietic cell transplantation (alloHCT) represents the treatment option with the highest potential for cure. Hence, alloHCT has been increasingly used over the years [1, 2], resulting in an age-dependent 5-year overall survival (OS) of between 41 and 66% [3,4,5]. For patients up to about 45 years of age, total body irradiation (TBI) ≥ 12 Gray (Gy) represents the standard backbone for conditioning, which is applied in combination with different chemotherapeutic agents (etoposide, cyclophosphamide, fludarabine) [6,7,8]. However, these classical myeloablative conditioning regimens are accompanied by relevant toxicity in older patients [9, 10]. Therefore, dose-adapted or intermediate intensity conditioning (IIC) regimens such as fludarabine/TBI 8 Gy (FluTBI8) are frequently used in patients over the age of 45 years. These regimens are still myeloablative but contain reduced dosages of classical conditioning elements. Considering patients’ comorbidities, availability of irradiation facilities and patients’ or physicians’ preference, irradiation-free alternatives have been developed, with fludarabine/busulfan being the most popular combination. In Germany, the recommended regimen is fludarabine/busulfan 6.4 mg/kg (FluBu6.4) [11], whereas in other countries fludarabine/busulfan 9.6 mg/kg (FluBu9.6) is used more frequently. While several studies have investigated reduced-intensity and IIC regimens in ALL [12,13,14,15,16,17], no direct comparison has addressed the role of TBI among IIC regimens with respect to both efficacy and toxicity.

Materials and methods

Study design and cohort

A retrospective, European Society of Blood and Marrow Transplantation (EBMT) registry-based study was performed. The EBMT is a non-profit scientific society representing more than 600 transplant centers, mainly in Europe, which are required to report all consecutive stem cell transplantations to the registry, including follow-up once a year. Data are managed in a central database with internet access, in which each EBMT center is represented. Annual audits are performed to verify data accuracy. EBMT centers commit to obtaining informed consent according to the local regulations applicable at the time of transplantation in order to report pseudonymized data to the EBMT. Inclusion criteria were: (1) age >45 years, (2) allogeneic hematopoietic cell transplant (alloHCT) from matched sibling or matched unrelated donors between 2005 and 2020 for ALL in first complete remission, (3) conditioning with either FluTBI8, FluBu6.4 or FluBu9.6. The study was approved by the general assembly of the Acute Leukemia Working Party of the EBMT.

Endpoints and definitions

Overall survival (OS) and leukemia-free survival (LFS) were the major endpoints of interest. The cumulative incidence of non-relapse mortality (NRM) and relapse incidence (RI), as well as acute and chronic graft-versus-host disease (a/cGvHD) and GvHD-free, relapse-free survival (GRFS) were also analyzed. OS was defined as the interval from date of alloHCT to date of last follow-up (LFU) or date of death, regardless of cause. LFS was calculated as the interval between the date of alloHCT and death, relapse, or last follow-up. NRM was defined as death without previous relapse or progression, and GRFS as survival from alloHCT without aGvHD grade III–IV, without severe cGvHD and without evidence of relapse. Acute and chronic GvHD were classified as previously described [18, 19]. Measurable residual disease (MRD) status was evaluated according to local standards, including BCR::ABL PCR, flow cytometry and individually identified IgHV or T-cell receptor rearrangements, and was included in the analysis as reported by the participating centers.
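As an illustration of how these definitions translate into time-to-event data, the following minimal Python sketch derives each endpoint from a single patient record. The `PatientRecord` class, its field names, and the helper functions are hypothetical and do not reflect the EBMT registry schema; times are assumed to be months from alloHCT, and ties between relapse and death are counted as relapse.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PatientRecord:
    """Hypothetical record; all times are months from alloHCT."""
    last_follow_up: float
    death: Optional[float] = None
    relapse: Optional[float] = None
    agvhd_grade_3_4: Optional[float] = None
    severe_cgvhd: Optional[float] = None


def _first_event(times: List[Optional[float]], last_follow_up: float) -> Tuple[float, bool]:
    """Return (time, event_observed) for the earliest observed event, else censor at last follow-up."""
    observed = [t for t in times if t is not None]
    if observed:
        return min(observed), True
    return last_follow_up, False


def os_endpoint(p: PatientRecord) -> Tuple[float, bool]:
    # OS: death from any cause is the event.
    return _first_event([p.death], p.last_follow_up)


def lfs_endpoint(p: PatientRecord) -> Tuple[float, bool]:
    # LFS: relapse or death, whichever comes first.
    return _first_event([p.death, p.relapse], p.last_follow_up)


def grfs_endpoint(p: PatientRecord) -> Tuple[float, bool]:
    # GRFS: aGvHD grade III-IV, severe cGvHD, relapse, or death, whichever comes first.
    return _first_event(
        [p.death, p.relapse, p.agvhd_grade_3_4, p.severe_cgvhd], p.last_follow_up
    )


def nrm_endpoint(p: PatientRecord) -> Tuple[float, str]:
    # NRM: death without previous relapse; relapse is the competing event.
    if p.relapse is not None and (p.death is None or p.relapse <= p.death):
        return p.relapse, "relapse"
    if p.death is not None:
        return p.death, "nrm"
    return p.last_follow_up, "censored"


if __name__ == "__main__":
    patient = PatientRecord(last_follow_up=30.0, death=20.0, relapse=12.0, agvhd_grade_3_4=3.0)
    print(os_endpoint(patient), lfs_endpoint(patient), grfs_endpoint(patient), nrm_endpoint(patient))
```

For the example patient, the same clinical course yields a different event time for each endpoint, which is why OS, LFS and GRFS can diverge between groups.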

Statistical analysis

Patient-, disease-, and transplant-related characteristics were compared between the three conditioning regimens using the chi-square or Fisher’s exact test for categorical variables, and the Wilcoxon test for continuous variables. The probabilities of OS, LFS, and GRFS were calculated using the Kaplan-Meier estimate. The probabilities of relapse incidence (RI), NRM, and acute and chronic GvHD were estimated using cumulative incidence curves. For GvHD, death and relapse were considered competing events. Univariate analyses were performed using the log-rank test for OS, LFS, and GRFS, and Gray’s test for RI, NRM and GvHD. Multivariate analysis was performed using a Cox proportional-hazards model which included variables differing significantly (p < 0.05) between the groups, factors known to be associated with outcomes, plus a center frailty effect to take into account the heterogeneity across centers. For all comparisons, follow-up was censored at 2 years in order to account for the differences in follow-up between the 3 groups.
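The estimators named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code used for the analysis (which was run in SPSS and R): the function names, the event coding (0 = censored, 1 = relapse, 2 = NRM) and the toy data are all hypothetical. It shows administrative censoring at 24 months, the Kaplan-Meier estimate, and a cumulative incidence estimate that treats other event types as competing risks.

```python
from typing import List, Sequence, Tuple


def censor_at(times: Sequence[float], events: Sequence[int], horizon: float) -> Tuple[List[float], List[int]]:
    """Administratively censor follow-up beyond `horizon` (event code 0 = censored)."""
    out_t, out_e = [], []
    for t, e in zip(times, events):
        if t > horizon:
            out_t.append(horizon)
            out_e.append(0)
        else:
            out_t.append(t)
            out_e.append(e)
    return out_t, out_e


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier curve as [(t, S(t))]; events: 1 = event, 0 = censored."""
    surv = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - d / at_risk
        curve.append((t, surv))
    return curve


def cumulative_incidence(times: Sequence[float], events: Sequence[int], cause: int) -> List[Tuple[float, float]]:
    """Cumulative incidence of `cause` with all other non-zero codes as competing events."""
    surv_prev = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e != 0}):
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        d_any = sum(1 for u, e in zip(times, events) if u == t and e != 0)
        cif += surv_prev * d_cause / at_risk
        surv_prev *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve


if __name__ == "__main__":
    # Toy data (months from alloHCT); not taken from the study.
    times = [3, 6, 6, 10, 14, 20, 30, 40]
    events = [1, 2, 0, 1, 0, 2, 1, 0]
    t24, e24 = censor_at(times, events, horizon=24)
    lfs = kaplan_meier(t24, [1 if e else 0 for e in e24])
    ri = cumulative_incidence(t24, e24, cause=1)
    nrm = cumulative_incidence(t24, e24, cause=2)
    print(lfs[-1], ri[-1], nrm[-1])
```

In this construction, the cumulative incidences of relapse and NRM add up to one minus the LFS estimate at every time point. Naively applying one minus Kaplan-Meier to each cause separately would not preserve that identity, which is the usual argument for cumulative incidence curves in the presence of competing events.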

Results were expressed as the hazard ratio (HR) with the 95% confidence interval (95% CI). All tests were two-sided with a type I error rate fixed at 0.05. The Bonferroni correction was used to control the type I error when testing the differences among the three levels of the conditioning. Statistical analyses were performed with SPSS 25.0 (IBM Corp., Armonk, NY, USA) and R 4.0.2 (R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/).
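For readers unfamiliar with the Bonferroni correction, a minimal Python sketch follows. It assumes the correction is applied by multiplying each pairwise p-value by the number of comparisons (equivalently, by testing each comparison at 0.05 divided by that number); the text does not specify which form was used. The function name and the p-values are hypothetical and are not study results.

```python
from typing import Dict


def bonferroni(p_values: Dict[str, float]) -> Dict[str, float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


if __name__ == "__main__":
    # Hypothetical raw p-values for the three pairwise regimen comparisons.
    raw = {
        "FluTBI8 vs FluBu6.4": 0.01,
        "FluTBI8 vs FluBu9.6": 0.02,
        "FluBu6.4 vs FluBu9.6": 0.50,
    }
    for name, p_adj in bonferroni(raw).items():
        verdict = "significant" if p_adj < 0.05 else "not significant"
        print(f"{name}: adjusted p = {p_adj:.3f} ({verdict})")
```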

Results

Patients’ characteristics

In total, 501 patients were identified (Philadelphia chromosome negative (Ph-) B-ALL, n = 139; Ph+ B-ALL, n = 296; T-ALL, n = 66). Conditioning for alloHCT comprised FluTBI8 (n = 262), FluBu6.4 (n = 188) or FluBu9.6 (n = 51). Patient characteristics revealed imbalances among the three cohorts with respect to median age (p < 0.0001), ALL subtype (p = 0.025), HCT-CI (p = 0.0002), in-vivo T-cell depletion (p < 0.0001), CMV patient/donor serostatus (p = 0.01) and median year of transplant (p < 0.0001), whereas all other features were balanced. Median follow-up from transplant was 21, 53, and 32 months, p < 0.0001. Further information on patient- and transplant-related characteristics is shown in Table 1.

Table 1 Patient and transplant characteristics.

Outcome

After conditioning with FluTBI8, FluBu6.4 or FluBu9.6, OS at two years from alloHCT was 68.5%, 57%, and 62.2%, respectively (p = 0.06). RI at two years differed significantly among the three groups: 24.7% among patients receiving FluTBI8, and 37.3% and 30.9% after FluBu6.4 and FluBu9.6, respectively (p = 0.014). This translated into a difference in LFS, which was 58% after FluTBI8 versus 42.7% and 45% after FluBu6.4 and FluBu9.6, respectively (p = 0.003). Among the three groups, cumulative incidence of NRM at two years was not different (FluTBI8: 17.3%, FluBu6.4: 20.1%, and FluBu9.6: 24%, p = 0.38), and GRFS was 39.9%, 34.3%, and 40.1%, respectively (p = 0.29). Time-to-event outcomes are illustrated in Fig. 1 and Supplementary Table 1.

Fig. 1: Comparison of FluTBI8, FluBu6.4 (FB2) and FluBu9.6 (FB3).

Relapse incidence (a), non-relapse mortality (b), leukemia-free survival (c) and overall survival (d).

GvHD and cause of death

The cumulative incidence of aGvHD grade II–IV did not vary among the groups, being 26.3% for FluTBI8 and 26.3% and 29.2% for FluBu6.4 and FluBu9.6, respectively (p = 0.9). Acute GvHD grade III–IV was 9%, 8.8% and 8.2% for the three regimens (p = 0.97). Regarding chronic GvHD (limited and extensive), cumulative incidences were 45.7%, 33.2% and 31.1% (global p value = 0.018, but the differences were not confirmed in multivariate analysis). For extensive cGvHD, cumulative incidences were 23.5%, 15.9% and 10.9%, p = 0.07 (see Supplementary Tables 1, 2).

Leukemia relapse was the main cause of death in all three cohorts (51.7% of all fatalities), followed by GvHD (20.2%) and infections (15.7%). Patients receiving FluTBI8 showed the highest rate of infection-associated deaths (24.3% as compared to 15.8%/8.2% after FluBu9.6/FluBu6.4). The highest rate of death caused by GvHD was seen with FluBu9.6 (36.8%), while FluBu6.4 (23.5%) and FluTBI8 (12.2%) showed lower rates. Further details on other causes of death are shown in Table 2.

Table 2 Causes of death.

Multivariate analysis

In comparison to FluTBI8, RI was significantly higher after FluBu6.4 (hazard ratio [HR] [95% confidence interval (CI)]: 1.85 [1.16–2.95], p = 0.01), but not after FluBu9.6 (HR: 1.51 [0.82–2.77], p = 0.19). LFS was inferior after conditioning with both FluBu6.4 (HR: 1.56 [1.09–2.23], p = 0.014) and FluBu9.6 (HR: 1.63 [1.02–2.58], p = 0.039) as compared to FluTBI8. However, OS was not significantly different among the three subgroups. Risk of NRM, GRFS, aGvHD II–IV, aGvHD III–IV and both limited and extensive cGvHD were not influenced by conditioning type either.
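As a reading aid, the relationship between a hazard ratio, its 95% CI and the p-value can be sketched in Python. The sketch assumes a Wald-type interval that is symmetric on the log scale, so the standard error of log(HR) can be recovered from the CI width; the frailty model used in the study may compute p-values differently, so only approximate agreement with the reported values should be expected. The function name `wald_p_from_hr` is hypothetical.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_hr(hr: float, lower: float, upper: float) -> float:
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)  # SE of log(HR)
    z = math.log(hr) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


if __name__ == "__main__":
    # HRs and CIs as reported in the text (reference: FluTBI8).
    reported = {
        "RI, FluBu6.4": (1.85, 1.16, 2.95, 0.01),
        "RI, FluBu9.6": (1.51, 0.82, 2.77, 0.19),
        "LFS, FluBu6.4": (1.56, 1.09, 2.23, 0.014),
        "LFS, FluBu9.6": (1.63, 1.02, 2.58, 0.039),
    }
    for label, (hr, lo, hi, p_reported) in reported.items():
        p = wald_p_from_hr(hr, lo, hi)
        print(f"{label}: recomputed p = {p:.3f}, reported p = {p_reported}")
```

Under this approximation the recomputed p-values land close to the reported ones, which gives a quick internal consistency check on the tabulated HRs and intervals.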

Other factors significantly influencing outcomes were increasing age (per 10 years, no interaction between age and conditioning), Ph+ ALL subtype, and MRD status at start of conditioning for alloHCT: Age was associated with higher NRM and lower OS, LFS, and GRFS. Patients with Ph+ ALL had a significantly lower RI and a better OS, LFS, and GRFS, whereas MRD had a negative impact on RI, LFS and GRFS. Results of the multivariate analysis are shown in Table 3 and Supplementary Table 2.

Table 3 Multivariate analysis of risk factors on outcome parameters.

Discussion

To the best of our knowledge, this is the first analysis evaluating the role of TBI for ALL patients aged >45 years transplanted in first complete remission receiving an IIC regimen. The most frequently used radiation-free regimens (FluBu6.4 and FluBu9.6) have been used for comparison.

Among standard conditioning regimens in ALL, TBI-containing regimens (usually comprising ≥12 Gy) were associated with better antileukemic efficacy as compared to irradiation-free protocols [8, 20, 21]. Hence, ≥12 Gy TBI-based regimens are regarded as standard for myeloablative conditioning for alloHCT in younger adults and children with ALL [11, 22, 23]. However, NRM was increased among patients >45 years of age receiving standard protocols [9, 10, 24]. Relevant toxicities have been described in particular for TBI compared to TBI-free conditioning regimens, with non-infectious pulmonary toxicity being a relevant issue in this context [25,26,27,28]. Therefore, dose-adapted IIC regimens have been developed. In contrast to the observations made after standard conditioning, our comparison of IIC regimens did not show increased toxicity of the TBI-containing protocol as compared to busulfan-based regimens. Overall, NRM was 23.1% after TBI and 20.7% and 26.8% after FluBu6.4 and FluBu9.6 conditioning, respectively. With respect to the causes of death, lethal infectious complications were more frequent after TBI-based conditioning, which may be a consequence of increased mucosal toxicity as a possible cause of bacterial infections. Particular attention should therefore be paid to mucosal protection and early antimicrobial intervention in patients receiving TBI. In contrast to data reported following standard conditioning protocols, non-infectious lung toxicity was not observed as a frequent cause of death among patients receiving FluTBI8. Overall, GvHD was a more frequent cause of death among patients receiving a busulfan-based regimen, although the cumulative incidence of severe a/cGvHD was not different.

With respect to efficacy, we found that FluTBI8 was associated with a lower RI than FluBu6.4, and a superior LFS (58% at two years) as compared to both FluBu cohorts (42.7%, HR: 1.56 [1.09–2.23] and 45%, HR: 1.63 [1.02–2.58]). These data suggest a superior antileukemic potential of intermediate dose TBI in comparison to chemotherapy-based conditioning. Similar to our data, a recent study comparing FluTBI8 to FluBu9.6 given before alloHCT in acute myeloid leukemia observed an improved LFS for patients <50 years receiving TBI [29]. In contrast, an earlier study from the ALWP of the EBMT in patients with ALL found no difference between FluBu6.4 and a conditioning containing TBI at a non-myeloablative dose of 2 Gy, suggesting that a minimal dosage of TBI is necessary for an antileukemic effect [16]. Similarly, no difference was observed in a recent randomized phase III trial comparing standard dose busulfan to myeloablative TBI in younger patients with standard risk ALL [30]. On the other hand, 8 Gy might also represent an optimal upper dose for TBI, given that in a recent EBMT study, identical LFS and RI were observed after 8 Gy and 12 Gy TBI-based conditioning [31].

Despite lower RI and superior LFS following TBI-based conditioning, OS was not significantly different among recipients of TBI as compared to both FluBu groups. This might be explained in part by a higher percentage of patients with Ph+ ALL in the two FluBu cohorts, given that the introduction of tyrosine kinase inhibitors (TKI) both into the induction therapy [32,33,34,35], and in the prophylaxis and treatment of relapse after alloHCT [36,37,38] has significantly improved the outcome of this patient subgroup. Unfortunately, available data among our patients were insufficient to estimate the influence of TKI given preemptively or for maintenance following alloHCT. A second reason for similar OS among the three cohorts could be a higher upper age limit among TBI recipients (76.1 vs 72.0 and 65.6 years). In the multivariate analysis, increasing age showed a significant negative impact on all outcome parameters, except RI, a finding in line with previous observations [9, 39].

Data on MRD before start of conditioning were available for about 74% of all patients. Among these, MRD positivity showed a significantly negative impact on RI and LFS. The evaluation of MRD as a predictive factor for post-transplant outcome was not the focus of our study, in particular considering the fact that MRD was examined according to local standards with different levels of sensitivity. Nevertheless, this finding is in line with other studies analyzing the role of MRD status before alloHCT for outcome [20, 40, 41]. Importantly, MRD did not modify the role of TBI-based conditioning on RI and LFS.

Some limitations of our study need to be considered. First, the reason why patients were selected to receive their respective conditioning regimen could not be evaluated retrospectively. Further, the retrospective design was associated with imbalances in several risk factors among the cohorts, which we tried to account for when fitting the multivariate models. Nevertheless, as discussed above, the lower percentage of Ph+ patients among TBI recipients might have counterbalanced the superior antileukemic effect of TBI. In contrast, imbalances concerning median age did not influence NRM as one may have expected. Similarly, different rates of in-vivo T-cell depletion (TCD) and differences concerning the year of transplantation among cohorts did not appear to significantly influence outcome. A deleterious effect of TCD on anti-tumor efficacy of chemotherapy-based RIC alloHCT has been observed in a large registry study [42]. However, only 4% of patients analyzed in that study suffered from ALL. In general, ALL is regarded as a disease with lower sensitivity to a graft-versus-leukemia effect [43]. Hence, TCD might be less relevant among ALL patients than in myeloid diseases or more slowly proliferating lymphoid disorders.

In conclusion, this study represents the first direct comparison of intermediate intensity conditioning regimens comprising TBI versus chemotherapy in ALL patients >45 years. Within the limits of a retrospective registry analysis, antileukemic efficacy was stronger after TBI-based conditioning in this cohort of 501 patients, as shown in the multivariate analysis by a lower RI and longer LFS. However, despite similar NRM rates, this only translated into a non-significant advantage in OS. Independent of the conditioning regimen, increasing age, MRD positivity and Ph+ ALL were the most important factors for overall outcome.