Introduction

Myelofibrosis is a Philadelphia chromosome-negative myeloproliferative neoplasm (MPN) with a heterogenous clinical phenotype characterized by splenomegaly, progressive anemia, ineffective hematopoiesis, and constitutional symptoms. It can present as a primary disease (primary myelofibrosis, PMF) or evolve from polycythemia vera or essential thrombocythemia (secondary myelofibrosis) [1, 2].

Despite recent improvements in the field of JAK inhibitors and novel agents with alternative targets, allogeneic hematopoietic stem cell transplantation (alloHCT) remains the only potentially curative treatment for patients with myelofibrosis to date [2, 3]. However, alloHCT is associated with treatment-related morbidity and mortality, especially in older and comorbid myelofibrosis patients [4, 5]. Acute and chronic graft-versus-host disease (GvHD) are two major life-threatening complications for patients undergoing transplant and they appear to occur more frequently in patients with MPN rather than in those with other hematological malignancies, configuring a field of deep unmet medical needs [6].

In vivo T-cell depletion with rabbit antithymocyte/antilymphocyte globulin as part of the conditioning regimen prior to alloHCT is widely used as an addition to the standard GvHD prophylaxis with a calcineurin inhibitor (cyclosporine or tacrolimus) and either methotrexate or mycophenolate mofetil. Currently, there are two commercial rabbit antithymocyte/antilymphocyte globulin formulations available for clinical practice: antithymocyte globulin (ATG) (thymoglobulin) is generated by immunizing rabbits with human thymocytes whereas antilymphocyte globulin (ATLG) is derived from rabbits vaccinated with the human Jurkat T-cell line. Patient outcomes for the two preparations seem to be similar [7, 8].

Several randomized clinical trials have shown that ATG/ATLG can both prevent incidence and severity of acute and chronic GvHD after alloHCT from unrelated and related donors [9,10,11,12,13,14,15]. However, evidence on their role in the setting of myelofibrosis is scarce. The aim of this study was to evaluate the impact of ATLG on clinical outcomes of patients with myelofibrosis undergoing an alloHCT in a large international retrospective study.

Methods

Data collection

Patients with PMF, post-polycythemia vera and post-essential thrombocythemia myelofibrosis (post-PV MF and post-ET MF) undergoing first alloHCT between 1992 and 2022 were included in this study. GvHD prophylaxis consisted either of ATLG (Grafalon, Neovii Biotech, Graefelfing, Germany) or non-ATLG-based strategies. Data were collected from 5 European and US academic transplant centers: the University Medical Center Hamburg-Eppendorf (Hamburg, Germany); the West German Cancer Center (Essen, Germany); the Hannover Medical School (Hannover, Germany); the Fred Hutchinson Cancer Research Center (Seattle, WA); and the Leukemia Program, Department of Hematology and Medical Oncology, Cleveland Clinic, (Cleveland, OH). Patients with transformation to secondary acute leukemia at the time of transplant were not included in this study. Detailed information on transplant-, disease- and patient-specific characteristics was collected including the DIPSS score, which was calculated prior to transplantation [16]. Patients gave informed consent and the study is in accordance with the Helsinki declaration.

Statistical analysis

Main endpoints of this study were GvHD-free and relapse-free survival (GRFS), which was defined as the absence of grade III-IV acute GvHD, severe chronic GVHD, relapse, or death from any cause [17], and acute GvHD.

Further endpoints were chronic GvHD, relapse incidence, non-relapse mortality, and overall survival. Non-relapse mortality was defined as death from any cause other than recurring disease with relapse as a competing event, whereas overall survival was defined as time from alloHCT to latest follow-up or death from any cause. For the outcome of relapse, death without relapse was the competing event. Acute and chronic GvHD were diagnosed and classified according to previously described criteria [18, 19]. Grade II-IV were included to calculate incidence of acute GvHD. For the development of GvHD, relapse and death without relapse were competing risks.

Categorical variables were compared with the Chi-squared method while continuous variables were analyzed using the Mann–Whitney-test. Survival probabilities were estimated by means of the Kaplan-Meier method and probabilities of acute and chronic GvHD, relapse and non-relapse mortality were computed using the cumulative incidence function, accounting for competing risks. To evaluate independent effects on the endpoint of GRFS, a multivariable model using the Cox regression was developed. Additionally, a Dependent Dirichlet Process model for censored data was used in case of potential violations of the ubiquitous proportional hazards assumption. Variables with a p value of <0.05 were considered to be significant. Statistical analyses were done with SPSS version 29.0.1 and R statistical software version 4.0.3.

Results

Patients

Overall, we identified 707 patients, of whom 469 received ATLG-based GvHD prophylaxis, while 238 received non-ATLG-based GvHD prophylaxis. Within the non-ALTG cohort, 13 patients underwent mismatched related donor transplant and were treated with post-transplant cyclophosphamide. In the ATLG group, patients with a matched related donor were administered 15 or 30 mg/kg ATLG, whereas patients with a matched unrelated donor were given either 30 or 60 mg/kg ATLG. In addition, all patients received either tacrolimus/cyclosporine plus mycophenolate mofetil and/or methotrexate as GvHD prophylaxis. No significant differences were observed between the two groups with regards to median age at alloHCT (58 years for both) nor frequency of driver mutations, year of transplant and disease risk according to DIPSS. Patients in the ATLG and the non-ATLG group differed for other characteristics, including HLA-match, performance status, ruxolitinib exposure prior to HCT, and conditioning intensity. Notably, the most common stem cell source in the ATLG and the non-ATLG cohort was in both cases peripheral blood (98% vs. 95%) and the most frequently used conditioning regimen was busulfan-fludarabine for patients receiving ATLG-based GvHD prophylaxis (60%) and cyclophosphamide-busulfan for the non-ATLG group (55%). Patient characteristics are summarized in Table 1.

Table 1 Patient characteristics.

Engraftment and GvHD

Neutrophil engraftment was similar between both groups (P = 0.41), showing incidence at 28 days of 95% for the ATLG group vs. 93% for the non-ATLG group. The cumulative incidence of acute GvHD grade II-IV at day +180 was significantly different between the two groups and amounted to 30% (95% CI, 26–34%) for patients receiving ATLG-based GvHD prophylaxis vs. 56% (95% CI, 50–62%) for the non-ATLG cohort (P < 0.001; Table 2). Median time to acute GvHD was significantly different between both groups (P = 0.009), being 24 days for the ATLG group vs. 32 days for the non-ATLG group. Acute GvHD grade III-IV occurred in 20% vs. 25% in the cohort given or not given ATLG, respectively (P = 0.01), with severe acute GvHD grade IV in 5% vs. 10%. No difference in incidence of mild-to-severe chronic GvHD was observed at 6 years, showing 49% (95% CI, 45-54%) for the ATLG group vs. 50% (95% CI, 44–57%) for the non-ATLG group (P = 0.52). In contrast to acute GvHD, median time to chronic GvHD was longer for the ATLG group (P = 0.05), being 207 days vs. 182 days for patients receiving no ATLG prior to transplant. Absolute rates for severe chronic GvHD were lower for the ATLG group (7%) vs. non-ATLG group (18%; P = 0.04).

Table 2 Patient outcomes according to ATLG use.

Survival and relapse

With a median follow-up of 5.9 years (95% CI, 4.8–6.8 years) for the ATLG group and 6.9 years (95% CI, 5.8–8.2 years) for the non-ATLG group (P = 0.07), the 3-year estimates for the composite endpoint of GRFS were 53% (95% CI, 48–58%) for the ATLG group vs. 47% (95% CI, 40–53%) for the non-ATLG group (P = 0.04; Fig. 1) and the 6-years estimates of GRFS were 45% (95% CI, 40–50%) vs. 37% (95% CI, 27–46%), respectively (P = 0.02). The 6-year estimates for overall survival were 64% (95% CI, 59–68%) for patients receiving an ATLG-based GvHD prophylaxis vs. 60% (95% CI, 54–67%) for the non-ATLG cohort (P = 0.53) and the 6-year cumulative incidence of relapse was 12% (95% CI, 8–16%) vs. 15% (95% CI, 12–18%), respectively (P = 0.62). The early non-relapse mortality 1 and 3 years after transplant was 20% (95% CI, 17–24%) and 27% (95% CI, 23–31%) for the ATLG group vs. 19% (95% CI, 14–24%) and 29% (95% CI, 24–35%) for the non-ATLG group (P = 0.80 and 0.55, respectively).

Fig. 1: Impact of ATLG on transplant outcomes.
figure 1

GRFS, overall survival, relapse incidence, and non-relapse mortality for the ATLG and non-ATLG cohort.

Factors on GRFS

In terms of donor type, the use of a mismatched unrelated donor was significantly associated with worse GRFS, showing a hazard ratio of 1.45 (95% confidence interval, 1.13–1.86; P = 0.003), whereas other donor types did not impact risk for GvHD, relapse, or death. The intensity of conditioning therapy appeared to influence outcome, showing lower rates of GRFS for myeloablative compared to reduced intensity conditioning (hazards ratio, 1.21; 95% confidence interval, 1.02–1.45; P = 0.03). The complete univariable analysis on GRFS including cause-specific hazards is shown in Table 3.

Table 3 Univariate analysis on GRFS.

HLA-match did not appear to confound the comparison between ATLG vs. no ATLG, at least in univariable analysis for matched related or unrelated donors, whereas no significant difference in GRFS was found for both groups in the mismatched unrelated setting (Fig. 2). In matched related donor alloHCT, the 6-year estimates of GRFS were 52% (95% CI, 41–62%) for the ATLG group vs. 40% (95% CI, 29–51%) for the non-ATLG group (P = 0.05). In matched unrelated donor alloHCT, 6-year GRFS was 46% (95% CI, 32-53%) vs. 39% (95% CI, 29–48%; P = 0.03). In mismatched unrelated donor transplants, 6-year GRFS was similar in univariable analysis, showing 31% (95% CI, 21–41%) vs. 32% (95% CI, 12–51%; P = 0.70).

Fig. 2: Impact of ATLG in matched related or unrelated donor HCT.
figure 2

GRFS in matched related, matched unrelated, and mismatched unrelated donor transplantation for the ATLG and non-ALTG cohort.

Multivariable analysis

We then developed a multivariable model on GRFS, adjusting for potential confounders and interactions (ASXL1 and driver mutation status, performance status, HLA-match, conditioning intensity and patient sex). Additionally, a Dependent Dirichlet Process model for censored data was developed, because of violations of the ubiquitous proportional hazards assumption (Fig. 3). The final model was adjusted for a potential interaction of ATLG use and HLA-match, identifying no significant interaction (P = 0.10). As a result, the use of ATLG was independently associated with improved GRFS, showing a hazard ratio of 0.62 (95% CI, 0.46–0.82). Other factors associated with worse GRFS were ASXL1 mutation at time of alloHCT, reduced performance status, mismatched unrelated donor alloHCT, and JAK2 driver mutation genotype.

Fig. 3: Multivariate analysis on GRFS.
figure 3

Forest plot and corresponding risk ratios of GRFS for clinical and molecular subgroups.

Discussion

To the best of our knowledge, this is the first and largest study evaluating the role of ATLG on transplant outcomes and independent predictors of GRFS in myelofibrosis patients undergoing first alloHCT. We found that the administration of ATLG significantly and independently improved GRFS, with 6-year estimates of 45% for ATLG and 37% for non-ATLG-based GvHD prophylaxis. In our myelofibrosis cohort, this effect was mainly driven by a significant reduction in acute GvHD for ATLG-based GvHD prophylaxis. Incidence of chronic GvHD was similar between both strategies. The effect of ATLG on GRFS was irrespective of HLA-match of patient and donor in multivariable analysis, and most pronounced in patients receiving matched (related or unrelated) donor alloHCT.

Several randomized studies have reported a beneficial impact of ATLG on incidence and severity of GvHD. Kröger et al. conducted a randomized, open-label trial of ATLG added to standard GvHD prophylaxis in 168 patients undergoing matched related donor alloHCT and found significantly lower rates of chronic GvHD and higher GRFS following ATLG without a decrease in incidence of acute GvHD [10]. Other randomized studies among patients receiving alloHCT from unrelated donors have reported that the administration of ATLG resulted in significantly lower rates of acute and chronic GvHD, and with the exception of one study [14], improved GRFS [9]. However, these prospective studies included almost exclusively acute leukemia patients. Of note, the incidence of chronic GvHD in our patients was relatively high, compared to the studies mentioned, whose reported rates were around 30% and resulted lower in the ATLG cohort [9, 10, 14, 15]., but similar to those of other studies including solely myelofibrosis patients [5, 20, 21].

To our knowledge, only one retrospective EBMT study investigated the role of ATLG in 287 myelofibrosis patients and found reduced rates of acute grades II-IV GvHD, while no influence on chronic GvHD, GRFS, overall survival, relapse risk and non-relapse mortality was observed [20]. However, this study also included patients receiving thymoglobulin, which might have affected outcomes, as seen for other previous experiences [7]. Compared to our data, Robin et al. reported lower rates of acute GvHD, but included only patients receiving matched related donor alloHCT. Moreover, due to the lack of data on late acute GvHD, any GvHD occurring after day 100 post-transplant was considered as chronic GvHD in that study. Interestingly, in contrast to our findings, the EBMT analysis showed no significant difference in acute GvHD grade III-IV between the ATLG and non-ATLG cohort.

Another relevant barrier to a successful patient’s outcome remains disease relapse. There have been some contradictory results regarding the incidence of relapse following the administration of ATLG. While few studies have reported an increased risk for relapse in patients receiving ATLG-based GvHD prophylaxis [22, 23], our data, in line with several previous studies[9, 10, 15, 20], suggest no adverse impact of ATLG on absolute recurrence rates of the underlying disease. In addition, incidence of relapse in our cohort was relatively low compared to previously reported rates[20, 24, 25]. Most recently, we were able to show that the risk of relapse was significantly lower in myelofibrosis patients who developed GvHD [26]. These results are in line with an EBMT study showing that the occurrence of GvHD within two years after transplant reduced the risk of relapse in long-term survivors [5]. Stern et al. reported in 2014 that the strength of GvHD and graft-versus-tumor correlation was especially high in patients with myelofibrosis [6]. In our cohort, in spite of significantly lower rates of acute GvHD grade III-IV, we did not find a reduction in non-relapse mortality or an increase in overall survival following the administration of ATLG. This might be due to the fact, that only severe acute GvHD (especially grade IV) seems to be associated with higher risk for death [26].

An essential aspect in alloHCT, which is particularly important for patient counseling, is the selection of the optimal conditioning regimen and intensity prior to transplant. To date, no conditioning intensity has shown to be superior to the other with regards to survival outcomes in myelofibrosis [27, 28]. In this study, the administration of higher intensity conditioning resulted in worse GRFS, although patients treated with myeloablative conditioning were significantly younger (median, 55 years) than patients who received reduced-intensity conditioning (median, 60 years). Concerning the impact of the different conditioning regimens, we showed significantly higher GRFS in patients receiving busulfan-fludarabine-based conditioning. These results are in line with a recent study from CIBMTR favoring busulfan-fludarabine in the setting of myeloablative and reduced-intensity conditioning [29].

In terms of donor choice, the use of mismatched unrelated donors has been reported to be associated with worse survival outcomes [30, 31], while matched related and matched unrelated donor alloHCT showed similar outcomes in several studies [28, 29]. In our cohort, GRFS was significantly worse in individuals transplanted from a mismatched unrelated donor. Notably, we included a minority of 13 patients receiving alloHCT from a mismatched related donor. Although numbers of haploidentical donor alloHCT performed with post-transplant cyclophosphamide have been increasing in the past years, data on its use in myelofibrosis patients remain scarce. Small studies reported contradictory results of haploidentical alloHCT using the post-transplant cyclophosphamide strategy in the setting of myelofibrosis with high relapse and rejection rates [32, 33].

Despite the relatively high GRFS of our cohort compared to other studies [28, 29], GRFS in myelofibrosis remains a significant burden for patients. Compared to other indications such as acute leukemia [17], GRFS appeared similar while events through which this composite endpoint was driven were significantly different. In myelofibrosis, outcome in GRFS is mainly driven by higher incidence of acute and chronic GvHD. Whether outcomes can be improved by new treatment modalities such as the continuation of JAK inhibitors throughout the peri-transplant period until stable engraftment is currently being investigated [34,35,36].

Finally, we acknowledge several limitations that primarily originate from the retrospective nature of our study. We cannot exclude selection bias but aimed to control for it by applying multivariable analysis. Given that patients were included from 5 distinct transplant centers, there is heterogeneity in BMT platforms used. Molecular data other than driver mutations was not available for most individuals. Moreover, we were not able to collect data on timing of ATLG administration nor absolute lymphocyte count, both of which were previously found to influence efficiency of ATLG or transplant outcomes [14, 37].

In conclusion, this collaborative study on myelofibrosis patients undergoing first alloHCT showed significantly improved GRFS following the use of ATLG, both for matched related and matched unrelated donors. This effect was mainly driven by a significant reduction in acute GvHD, whereas incidence of chronic GvHD was similar between the ATLG and non-ATLG group.