Introduction

Conventional therapy for patients with transfusion-dependent thalassemia (TDT) has to be lifelong and often results in iron overload and severe organ dysfunction, leading to significant morbidity and mortality over time [1, 2]. Although gene therapy has shown promising results in clinical trials [3,4,5,6], hematopoietic stem cell transplantation (HSCT) is the only curative treatment option that is widely available and has been performed for decades in TDT [7]. Initially, myeloablative conditioning regimens consisting of busulfan and cyclophosphamide followed by HLA-matched sibling donor (MSD) HSCT represented the standard of care in TDT patients [8,9,10,11,12]. Despite chelation therapy, previous blood transfusions and subsequent organ damage due to iron overload compromise HSCT outcome [10]. Accordingly, signs of inadequate chelation therapy and increased iron overload leading to end-organ damage have been used for risk stratification (Pesaro [13]). High rates of graft failure (GF) and treatment-related mortality (TRM) pose significant challenges in high-risk patients [12, 13]. In addition, a healthy MSD is not available for the majority of patients. With progress in high-resolution HLA typing and supportive management as well as the application of risk-adapted protocols, the outcome after HSCT from a well-matched unrelated donor (UD) has become comparable to results obtained with an MSD, although higher rates of complications and graft versus host disease (GvHD) are frequently observed [7, 14,15,16,17]. Due to the unfavorable toxicity profile of busulfan (lung, brain, gonadal, sinusoidal obstruction syndrome (SOS)), several transplant centers have started to use reduced-toxicity protocols with treosulfan, fludarabine and thiotepa (TFT) instead. These concepts are characterized by more intensive immunosuppression with lower organ toxicity [18, 19].
On the other hand, there is some concern in terms of increased rates of mixed chimerism (MC) and GF in treosulfan-based concepts in TDT [20]. These issues prompted us to conduct this retrospective analysis of HSCT in thalassemia patients reported to the German pediatric registry for stem cell transplantation and cell therapy (PRSZT) between 2011 and 2020.

First, we assessed newer concepts of matched (10/10) unrelated donor (MUD) and mismatched (9/10) unrelated donor (MMUD) HSCT in comparison with standard MSD-HSCT regarding outcome (overall survival (OS), thalassemia-free survival (TFS), thalassemia-free and persisting chronic-GvHD-free survival (TGFS), MC, GF, TRM and GvHD). Second, we compared treosulfan- and busulfan-based conditioning regimens in order to identify critical differences that need to be addressed by the concept. Finally, we wanted to identify factors preventing MC and GF without leading to severe GvHD (especially in TFT-based regimens) to support the development of improved, risk-adapted concepts of low toxicity.

Methods

Data source

This is a retrospective multicenter registry analysis from the German PRSZT, which is a nationwide association of pediatric transplant centers. Informed consent for registration and data collection was obtained from all patients and/or their legal guardians following the principles of the Declaration of Helsinki (IRB approval #1979–2013).

Patients

Patients who received a first allogeneic HSCT for TDT between June 2011 and February 2020 were included. Only HSCTs from an MSD, a matched (10/10) family donor other than a sibling (MFD), a MUD or a 9/10 MMUD were analyzed. A follow-up of at least 12 months was sought in all survivors.

Outcomes and definitions

The primary endpoints were OS (time from HSCT to death from any cause or last follow-up), TFS (time from HSCT to graft failure (recurrence of transfusion dependency) or death, whichever occurred first, or last follow-up), and TGFS (defined as TFS, but with persisting chronic GvHD (cGvHD) as an additional event). Secondary endpoints included GF, lowest and last chimerism, incidence of acute GvHD grade III-IV (aGvHD III-IV), extensive chronic GvHD (extCGvHD), and cytomegalovirus reactivations. Lowest chimerism was defined as the lowest percentage of donor-derived hematopoiesis ever documented in the post-transplant period, and MC as <95% donor cells. aGvHD and cGvHD severity were graded according to Glucksberg criteria [21]. Patients were stratified according to the Pesaro risk classification [13], except for patients for whom relevant data were missing. Grafalon® (formerly ATG-Fresenius, ATG-F) 60 mg/kg body weight (b.w.) and Thymoglobuline® 10 mg/kg b.w. were considered high-dose ATG, everything below low-dose ATG. Based on the initially targeted mean ciclosporin A (CSA) trough level, we divided the patients into three groups (≤100 µg/l (low), 101–149 µg/l (medium) and ≥150 µg/l (high)).
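The cut-offs defined in this section can be written down as a small set of classification rules. The following is a minimal Python sketch, not the registry's actual code: the function and variable names are hypothetical, doses at or above the stated threshold are assumed to count as high dose, and non-integer CSA targets between 100 and 101 µg/l are assumed to fall into the medium group.

```python
# Hypothetical helper names; thresholds are taken from the definitions above.
ATG_HIGH_DOSE_MG_PER_KG = {"grafalon": 60.0, "thymoglobuline": 10.0}


def csa_group(target_trough_ug_l: float) -> str:
    """Group a mean targeted CSA trough level (in µg/l)."""
    if target_trough_ug_l <= 100:
        return "low"
    if target_trough_ug_l < 150:
        # Assumption: values between 100 and 101 µg/l are treated as medium.
        return "medium"
    return "high"


def atg_dose_group(product: str, total_dose_mg_per_kg: float) -> str:
    """Classify an ATG dose; assumes doses at or above the threshold are high."""
    threshold = ATG_HIGH_DOSE_MG_PER_KG[product.lower()]
    return "high" if total_dose_mg_per_kg >= threshold else "low"


def is_mixed_chimerism(lowest_donor_pct: float) -> bool:
    """MC is defined as a lowest donor chimerism below 95%."""
    return lowest_donor_pct < 95.0


if __name__ == "__main__":
    print(csa_group(120))                  # medium
    print(atg_dose_group("Grafalon", 30))  # low
    print(is_mixed_chimerism(93))          # True
```

Keeping the thresholds in one place makes it easier to see how a change in a cut-off would move patients between groups.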

Statistical methods

Demographic, baseline and treatment variables as well as outcome parameters were reported for the entire study population and then separately according to donor type and conditioning regimen. Categorical data were summarized by absolute and relative frequencies and compared by Fisher’s exact (chi square) test. For continuous variables, median (range) was calculated and then compared using the Mann–Whitney U or Kruskal–Wallis H test. Survival probabilities were estimated by Kaplan–Meier methodology and compared using the log-rank test. The impact of the following variables on outcome was assessed: recipient age and sex, last serum ferritin level before HSCT, liver iron concentration [22, 23], Pesaro risk classification, sex matching as well as cytomegalovirus and Epstein-Barr virus status matching between recipient and donor, donor type, pre-conditioning therapy, conditioning regimen, antithymocyte globulin (ATG) application, dosage of ATG application, stem cell source and cell counts, mean targeted CSA trough level, and GvHD prophylaxis. Multivariable Cox regression analysis was limited to cytomegalovirus reactivation due to the small number of events regarding OS, TFS, aGvHD III-IV, extCGvHD and GF, with several subgroups having no events. Regarding mixed chimerism, a multivariable logistic regression analysis was performed which included all variables that were significant (p < 0.05) in univariable analysis. Median follow-up was calculated using the inverse Kaplan–Meier method. P-values were 2-sided and were considered statistically significant if <0.05. Statistical analyses were performed with SPSS version 27.0 (IBM SPSS Inc, IL, USA), SAS version 9.4 (Cary, NC, USA) and R version 4.1.2.
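To illustrate the Kaplan–Meier estimate and the inverse (reverse) Kaplan–Meier method for median follow-up mentioned above, here is a minimal pure-Python sketch. It is not the analysis code used in the study, which was run in SPSS, SAS and R. The function names and the five-patient data set are hypothetical and serve only to show the mechanics.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up time per patient
    events -- True/1 if the event occurred, False/0 if censored
    """
    data = sorted(zip(times, events), key=lambda pair: pair[0])
    n_at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(bool(data[i][1]))
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= Fraction(n_at_risk - n_events, n_at_risk)
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def median_follow_up(times, events):
    """Inverse Kaplan-Meier: censoring is the event, death is censoring."""
    reversed_events = [not bool(e) for e in events]
    for t, s in kaplan_meier(times, reversed_events):
        if s <= Fraction(1, 2):
            return t
    return None  # median not reached


if __name__ == "__main__":
    # Hypothetical toy data: follow-up in years, 1 = death, 0 = alive at last contact.
    years = [0.5, 1.0, 2.0, 3.0, 4.0]
    died = [1, 0, 0, 1, 0]
    for t, s in kaplan_meier(years, died):
        print(t, float(s))
    print("median follow-up:", median_follow_up(years, died))
```

Exact fractions are used so that the comparison with 0.5 does not depend on floating-point rounding. In practice an established survival package would be used instead of hand-written code.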

Results

Patient and donor characteristics

124 pediatric and young adult patients received allogeneic HSCT for TDT between June 2011 and February 2020 at 15 different pediatric transplant centers. Characteristics for the whole group as well as stratified according to donor subgroups are presented in Table 1 and Supplementary Table 1.

Table 1 Patient and HSCT characteristics as well as outcome in the entire cohort of 124 patients and differences according to donor subgroup.

Transplant characteristics

Details regarding the HSCT procedure are also shown in Table 1. 57 patients (46%) received their graft from a UD. Busulfan-containing conditioning was applied in 32 patients and always followed a busulfan-fludarabine (BF)-based approach (for details see Supplementary results). All 92 patients with treosulfan-based conditioning received TFT. 109 of 119 patients with ATG serotherapy received the last dose on day −3 or later.

Overall survival, TFS and TGFS

The outcome stratified by donor type is shown in Fig. 1a–c. Median follow-up of surviving patients was 3.2 years (range 0.6–9.2). 4y-OS in the entire cohort was excellent (95.1%), with four deaths occurring after 1st and one after 2nd HSCT. Deaths in two patients were associated with aGvHD grade IV. The other two died due to systemic candida infection and systemic cytomegalovirus disease with pulmonary failure, respectively. All four patients who died after 1st HSCT were older than 12 years and had received highly immunosuppressive regimens with pre-conditioning and either high-dose ATG or post-transplant cyclophosphamide with low-dose ATG serotherapy. There was a trend towards worse TFS with increasing HLA disparity (Fig. 1b). When analyzing TGFS, this trend became significant, with patients following MMUD transplant reaching only 73.2% (Fig. 1c).

Fig. 1: Outcome according to donor and lowest chimerism in relation to CD3+ cell count in the entire cohort.

a–c Presentation of OS, TFS and TGFS according to donor groups: (a) OS, (b) TFS and (c) TGFS. MFD matched family donor other than sibling, MSD matched sibling donor, MUD matched unrelated donor 10/10, MMUD mismatched unrelated donor 9/10. d–f Relationship of lowest donor chimerism (percentage of donor cells) and CD3+ cell count in the graft. In 99 patients with available CD3+ cell count, the relationship of lowest donor chimerism and CD3+ cell count in the graft (×10⁷/kg body weight) is depicted with the specification of (d) donor, (e) graft source and (f) acute GvHD. MFD matched family donor other than sibling, MSD matched sibling donor, MUD matched unrelated donor 10/10, MMUD mismatched unrelated donor 9/10, PBSC peripheral blood stem cells, TCD ex vivo T-cell depletion.

Engraftment

The overall neutrophil engraftment rate was 99.2%. One patient died on day +13 before reaching neutrophil engraftment. Time to neutrophil engraftment was significantly delayed after MSD-HSCT (Supplementary Table 1) as well as when cord blood (CB) or bone marrow (BM) grafts were used (Supplementary Fig. 1a). Median time to platelet engraftment also depended mainly on stem cell source: patients receiving peripheral blood stem cells (PBSC, n = 26) had a fast median platelet engraftment of 19 days, whereas this was significantly delayed (median 29 days, P = 0.011) in patients with BM (n = 94, see also Supplementary Fig. 1b).

Graft failure and chimerism

Eight patients suffered from GF (Table 1). One patient following MMUD transplant achieved only neutrophil engraftment 2 weeks after HSCT, but subsequently rejected the graft a few days later. The remaining patients (7/8) showed sustained graft function during the first months after HSCT, even with complete donor chimerism in the majority of these patients. However, this was followed by autologous recovery and secondary GF between 6 weeks and 9 months after HSCT, requiring regular red cell transfusions. All GF (5 after BM and 3 after ex vivo T-cell depleted PBSC (PBSC-TCD)) occurred in the group of 35 patients that had received TFT conditioning without pre-conditioning therapy and less than 6 × 10⁷/kg b.w. CD3+ cells in the graft as well as medium to high targeted CSA trough levels (Supplementary Tables 2, 3). The apparent association of GF and TFS with sex was caused by more transplantations of girls in regimens that were of higher risk.

Mixed donor chimerism was a frequently observed phenomenon (Tables 1, 2; Fig. 1d–f; Supplementary Table 4; Supplementary Fig. 2), especially in MSD-HSCT (with TFT-conditioning (27/40)). However, risk of secondary graft failure in pre-existing mixed chimerism increased with HLA-disparity (MSD 5% (2/40), MFD 20% (1/5), MUD 29% (2/7), MMUD 40% (2/5)). Risk factors for the development of mixed chimerism (<95%) included MSD, missing pre-conditioning, low CD3+ cell count, and higher CSA levels (Table 3). Risk of mixed chimerism below 75% was higher with TFT-conditioning.
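The percentages of secondary graft failure among patients with pre-existing mixed chimerism follow directly from the counts given above. A short Python sketch of this arithmetic (variable names are illustrative only):

```python
# Secondary GF among patients with pre-existing MC, (events, total) per donor
# type, using the counts reported in the text.
secondary_gf_in_mc = {
    "MSD": (2, 40),
    "MFD": (1, 5),
    "MUD": (2, 7),
    "MMUD": (2, 5),
}


def percent(events: int, total: int) -> float:
    """Proportion of events expressed as a percentage."""
    return 100.0 * events / total


rates = {donor: percent(*counts) for donor, counts in secondary_gf_in_mc.items()}

if __name__ == "__main__":
    for donor, rate in rates.items():
        print(f"{donor}: {rate:.0f}%")
```

The denominators in the non-MSD groups are small (5 to 7 patients), so each single event shifts the percentage by 14 to 20 points.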

Table 2 Differences in patient and HSCT characteristics as well as outcome in 124 patients receiving different conditioning regimens.
Table 3 Risk factors for mixed donor chimerism in 98 patients with available data analyzed by multivariable logistic regression analysis.

In 56 HSCTs from UD that were evaluable regarding chimerism, none of 19 patients who received either unmanipulated PBSC (n = 12; CD3+ ≥6 × 10⁷/kg) or BM with high amounts of CD3+ cells in the transplant (n = 7; ≥6 × 10⁷/kg) developed MC. In contrast, MC was observed in 4/10 HSCTs with PBSC-TCD as well as in 9/27 with BM containing lower amounts of CD3+ cells in the graft (<6 × 10⁷/kg).

Next, we analyzed a cohort of 20 patients who received TFT conditioning and HSCT from MSD with ATG, with BM as graft and without pre-conditioning therapy (and who therefore represented a homogeneous cohort potentially at higher risk for MC). Seven of eight patients with low targeted CSA levels and early cessation of immunosuppression had complete chimerism at last follow-up and a lowest chimerism of at least 93%. None received donor lymphocyte infusions (DLI)/boost. Of the other thirteen patients with medium to high targeted CSA levels, six received DLI, only three had full chimerism at last follow-up (one with DLI/boost), two had GF and eight had a last chimerism of 15–90%.

Based on cumulative incidences for GF (Supplementary Tables 2, 3), multivariable logistic regression analysis (Table 3) and on pathophysiological considerations we developed a score potentially suitable to predict the risk of mixed donor chimerism in our patient cohort. The four adjustable variables “myeloablation”, “pre-conditioning”, “CD3+ cell count with graft source” and “mean trough level of targeted CSA” were included (Table 4).

Table 4 Lowest donor chimerism and graft failure in relation to scoring system that reflects myeloablation, pre-conditioning therapy, CD3+ cell count with graft source and mean targeted trough level of CSA.

GvHD

In general, severe aGvHD III-IV started early, between day +10 and day +40, approximately around the time of neutrophil engraftment. None of the patients with MSD suffered from aGvHD III-IV (P log-rank 0.003; Supplementary Table 2; Table 1). aGvHD III-IV was mainly observed in patients transplanted from UDs (12 of 13). In addition to a higher degree of HLA disparity, graft source and cell count also differed significantly between UDs and MSDs. PBSC were administered more often and larger amounts of MNC and CD3+ cells were given, in contrast to BM as graft source (Table 1 and Supplementary Table 1). Nevertheless, reduced GvHD prophylaxis (low level of targeted CSA (P log-rank 0.007) and/or reduction/replacement of MTX by MMF (P log-rank <0.001)) seemed to be the main factor for severe aGvHD III-IV in HSCTs from UD following TFT conditioning (Table 5). Stem cell source and CD3+ cell count in the graft were no major contributors to aGvHD III-IV, as only one patient with unmanipulated PBSC and high CD3+ cell count in the graft developed severe aGvHD (Fig. 1e, f), whereas five patients did so after BM and three after PBSC-TCD transplants. GvHD prophylaxis in patients with UD, TFT concept and unmanipulated PBSC comprised ATG Grafalon® (30–60 mg/kg, with the last dose mostly given on day −1), CSA with a targeted trough level of at least 120 µg/l, and usually three doses of MTX.

Table 5 Univariable analysis of acute GvHD III-IV analyzed in patients with unrelated donor and TFT conditioning (N = 45).

Chronic GvHD (cGvHD) with systemic treatment (twelve extensive, one limited) also mainly occurred in UDs (Table 1). Extensive cGvHD was often associated with high ferritin of ≥3000 µg/l (P log-rank 0.022) before HSCT and the occurrence of severe aGvHD III-IV (P log-rank 0.001) (Supplementary Table 2). In 9/12 patients extensive cGvHD resolved.

Complications

The rate of relevant complications increased in parallel with the gradually increasing HLA disparity between different donor groups (Table 1). In a multivariable analysis of 89 patients at higher risk for cytomegalovirus reactivations (recipient cytomegalovirus IgG positive), patients with MMUD (HR 5.02; 95% CI 1.79–14.07; P = 0.002), high dosage of ATG (HR 3.68; 95% CI: 1.22–11.10; P = 0.021) or pre-conditioning therapy (HR 3.34; 95% CI: 1.43–7.80; P = 0.005) showed a significantly increased risk of cytomegalovirus reactivation (Supplementary Table 5).
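As a reading aid for the hazard ratios above: if the 95% confidence intervals are Wald intervals that are symmetric on the log scale (an assumption here, although this is the usual output of Cox regression software), the point estimate is the geometric mean of the interval limits, and the standard error of the log hazard ratio can be recovered from the interval width. A minimal Python sketch with hypothetical function names:

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% standard normal quantile


def hr_from_ci(ci_low: float, ci_high: float) -> float:
    """Point estimate implied by a log-symmetric confidence interval."""
    return math.sqrt(ci_low * ci_high)


def log_hr_se(ci_low: float, ci_high: float, z: float = Z_95) -> float:
    """Standard error of log(HR) implied by a Wald confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)


if __name__ == "__main__":
    # (reported HR, CI lower, CI upper) as given in the text
    reported = {
        "MMUD": (5.02, 1.79, 14.07),
        "high-dose ATG": (3.68, 1.22, 11.10),
        "pre-conditioning": (3.34, 1.43, 7.80),
    }
    for name, (hr, low, high) in reported.items():
        print(name, hr, round(hr_from_ci(low, high), 2), round(log_hr_se(low, high), 3))
```

The wide intervals reflect the limited number of events in each subgroup.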

Comparison of BF-based and TFT conditioning concepts

Outcome (OS, TFS, TGFS) was similar in patients with TFT and BF-based conditioning (Table 2), although there was a trend towards more GF, lower TFS and higher rates of MC (27.5% versus 6.2% for lowest chimerism<75%) in TFT-based concepts (Supplementary Fig. 2a). On the other hand, patients with BF-based concepts suffered from almost twice as many severe complications and showed delayed platelet engraftment (Supplementary Table 6). The incidence of severe aGvHD was similar in both conditioning groups but patients with BF-based regimens received more intense GvHD prophylaxis (ATG dose higher, often a third agent in addition to CSA and methotrexate, higher CSA target levels).

Discussion

Many important insights have been gained from previous large retrospective, multicenter reports in TDT [8,9,10, 20, 24, 25]. The present study contributes new aspects, in particular regarding the interdependence of pre-conditioning therapy, conditioning regimen, stem cell source with CD3+ cell counts, donor type and GvHD prophylaxis. This was made possible by a very granular set of data including, for example, ferritin, CD3+ cell count in the graft, pre-conditioning therapy, details of GvHD prophylaxis, time course of chimerism, post-transplant cellular therapy, cytomegalovirus reactivations and relevant complications. In comparison to other reports [8, 10, 16, 26], the majority of patients had access to well-managed conservative treatment before HSCT (Pesaro class 1 or 2, median ferritin 1800 µg/l). With a 4y-OS of 95.1%, 4y-TFS of 90.3% and 4y-TGFS of 87.9%, the participating centers achieved a very good overall outcome, especially considering the high proportion of UD (MUD 22%, MMUD 24%). Although outcome with MMUD was significantly inferior, in line with other reports [20, 24, 27], MMUD-HSCT seems justifiable if an appropriate conditioning concept is used, even though the combined risk of GvHD and infections is likely to be higher. In addition, MC occurred frequently (47.2%), occasionally leading to the administration of post-transplant cellular therapy (19/124). This prompted us to focus on preventive and predisposing settings, which eventually led us to develop an algorithm to assist in the development of future risk-adapted immunosuppressive strategies.

When comparing BF- and TFT-based therapies in our cohort, both approaches achieved similar results. However, the two regimens posed different challenges (TFT: more MC and GF; BF-based: more complications and need for more intense GvHD prophylaxis [28, 29]). These issues underline that patient outcome largely depended on the adaptation of GvHD prophylaxis to the conditioning concept, on timely management of mixed chimerism and on the handling of potential complications.

Primary graft failures and rejections were no major problem in our cohort, which was dominated by highly immunosuppressive conditioning regimens. On the other hand, mixed chimerism was a significant concern, especially in MSD-HSCT with TFT conditioning. In addition to fewer differences in minor histocompatibility antigens in MSD and therefore reduced T cell alloreactivity, the pediatric setting with predominant use of bone marrow in MSD-HSCT, leading to low CD3+ and mononuclear cell counts in the graft, probably also contributed to the high rates of MC and delayed neutrophil engraftment in this subgroup.

In order to better understand the interaction of factors influencing donor chimerism, we performed a multivariable regression analysis regarding mixed chimerism. We also propose a scoring system that assesses the risk of MC in our cohort. It is based on four major influencing variables: high CD3+ cell count in the graft (≥6 × 10⁷/kg), myeloablation with busulfan, pre-conditioning therapy and low targeted CSA levels, all of which had a protective impact. Unmanipulated PBSC, myeloablation with busulfan, and pre-conditioning therapy are general risk-reducing factors for MC/GF that have already been described in other reports [11, 20, 26]. However, the T cell count in the graft seemed to be an important protective parameter for MC in our cohort. So far, only one report has described the association of high CD3+ cell count in the graft and reduced GF rate in TDT [30]. This was a single-center study that differed significantly from ours in many aspects. The protective effect of low targeted CSA levels in our cohort is mainly attributed to TFT concepts. If these regimens are combined with reduced CD3+ cell count and are given without pre-conditioning therapy, a well-adapted concept of GvHD prophylaxis seems to be of major importance.

In MSD-HSCT with Pesaro risk class 1–2 and well-controlled iron load before HSCT, a regimen with TFT, ATG, BM and without pre-conditioning therapy might be feasible, because CSA levels can be kept low and rapidly reduced due to the very low risk of severe GvHD. In UD-HSCT, however, the risk of severe aGvHD III-IV is significantly increased. Even with bone marrow as graft source, higher targeted CSA levels seem to be necessary especially at the time of engraftment and shortly thereafter. As a consequence, other measures such as either application of pre-conditioning or higher CD3+ cell count in the graft (for instance by application of unmanipulated PBSC [15, 26]) might be necessary to avoid MC and secondary graft failure. Unmanipulated PBSC seem promising in combination with treosulfan-based regimens [26] and high resolution HLA typing because risk of infections is not elevated and risk of severe aGvHD or cGvHD can be balanced by adequate GvHD prophylaxis.

The combination of extensive pre-conditioning therapy and a high amount of ATG, although protective against GF, might be challenging due to higher incidences of cytomegalovirus reactivations and possibly also other infectious complications as well as TRM. Based on our retrospective data, such concepts might be considered primarily for patients at high risk of GF (such as MMUD and/or Pesaro class 3), with careful virus monitoring and anti-infective prophylaxis.

This retrospective, multicenter analysis is limited by the comparison of various concepts differing in several HSCT characteristics. In addition, the explanatory power of the statistical analysis was largely limited due to the small number of events. Nevertheless, we have performed a very detailed data analysis, which enabled us to describe important protective factors for secondary GF and MC such as CD3+ cell count in the graft or GvHD prophylaxis. These aspects may be used for optimization of risk-adapted conditioning regimens and the development of randomized studies in TDT. Ultimately, clinical experience with specific concepts, such as management of concept-specific complications and control of GvHD prophylaxis over time, certainly plays an essential role in final outcome.