Introduction

The advent of tyrosine kinase inhibitors (TKIs) has profoundly changed the outlook for patients with chronic myeloid leukemia (CML). The majority of patients with this previously fatal disease achieve a long-lasting clinical, hematological, cytogenetic and, in a substantial proportion, molecular remission. Still, most patients probably require lifelong drug treatment.1, 2, 3 Hematopoietic stem cell transplantation (HSCT) from an allogeneic healthy donor achieves a consistent eradication of the malignant clone in the majority of patients. On the downside, HSCT is limited to patients with an available donor and remains associated with significant morbidity and mortality.4, 5 Hence, the use of HSCT has shifted over the past decade from an early intervention to a treatment deferred to later lines of therapy.6, 7, 8

Current recommendations advise that HSCT should be reserved for patients who are resistant or intolerant to at least one second-generation TKI or for patients with blastic phase. HSCT is primarily considered as a salvage procedure.9, 10 There are no prospective randomized studies to prove this concept. The recommendations are based on the stunning early results of imatinib compared with interferon-α-based therapies.1, 3, 11 They are, in part, supported by an earlier prospective randomized study from our group where HSCT was associated with increased early mortality.12

This view might need to be modified. Transplant-related mortality has declined, standardization and accreditation have improved treatment and risk factors for transplant outcome have been defined.13, 14, 15 In a recent study with predetermined criteria, survival of carefully selected patients with low-risk scores and allogeneic HSCT was similar to that of a risk-matched group of patients treated with imatinib.16 HSCT offers a reasonable option for patients with advanced disease and low transplant risk, but the outlook for patients transplanted with high transplant risk in advanced refractory disease remains limited.15, 17 With better knowledge of risk factors than was available for CML-study III, this study aimed to look for clinically relevant differences between early allogeneic HSCT and best drug treatment in patients eligible for both strategies. Clinically relevant differences comprise higher probabilities of survival and molecular remission, and of living without treatment, with fewer symptoms and a higher Karnofsky score. A further aim was to identify, supported by prognostic scores, risk groups with a particular benefit.

Patients and methods

Study design

This prospective randomized multicenter CML-study IIIA of the German CML-study group followed the previously published concept of CML-study III.12 Patients with newly diagnosed CML were assessed for eligibility for HSCT (for details see Supplementary eMethods). A suitable patient was genetically randomized to primary allogeneic HSCT (group A) if a matched family donor was reported to be available; if no matched family donor was identified, the patient was randomized to best available drug treatment (group B). As a key difference from CML-study III, risk factors associated with allogeneic HSCT were integrated.14 Teams were urged to perform HSCT within 1 year from diagnosis. They were instructed to refrain from interferon-α treatment within 90 days before HSCT to avoid an interferon-associated increase of non-relapse mortality.18 As a second difference, and according to some promising preliminary data, autologous SCT was added to the control arm in an attempt to improve best available drug treatment.19

Patients

Between July 1997 and January 2004, a total of 722 patients were consecutively registered at the central data management office by 143 participating centers (35 university based, 72 municipal hospital based and 36 office based)20 (see Supplementary eAcknowledgements). Of these, 669 patients with BCR-ABL-positive CML fulfilled the inclusion criteria and entered the study. All patients considered eligible for transplantation (n=427) were randomized within 2 months from registration to group A (n=166) or best available drug treatment (group B; n=261) (Figure 1). Patients not eligible for HSCT were older and had a higher disease score compared with the eligible ones; patients in group A were significantly younger (38 vs 41 years; P=0.03) compared with those in group B; otherwise, no statistically significant differences between groups A and B were observed (Table 1). Patients were analyzed as of May 2014.

Figure 1

CONSORT flow diagram of enrollment, allocation, follow-up and analysis of patients. In group B, all 261 patients started with best available drug treatment. During the course of disease, 131 patients received an allogeneic HSCT with an unrelated donor in first CP. Their survival time was censored at the day of transplant. Ph, Philadelphia.

Table 1 Baseline characteristics at diagnosis

Allogeneic HSCT

A total of 305 patients received allogeneic HSCT, 151 patients (144 with a matched, 7 with a mismatched family donor) in group A, 148 patients in group B (all with an unrelated donor; 131 in first chronic phase (CP), 17 in advanced phase) and 6 originally non-eligible patients in group C (Figure 1 and Supplementary eTable 1). The median time from diagnosis to transplantation from a matched related donor was 7 months (range: 2–33 months). The recommended treatment before HSCT was hydroxyurea. The source of stem cells was bone marrow for 64 patients (42%) and peripheral blood for all others. In the 30 centers of four countries, preparative regimens were classified according to the CIBMTR (Center for International Blood and Marrow Transplant Research) functional definitions.21 Standard conditioning was applied in 116 patients (77%). Graft-versus-host disease prophylaxis was primarily cyclosporine based (62%).

Drug treatment

At the time of recruitment to this study in the pre-tyrosine kinase inhibitor era, the recommended primary drug treatment consisted of interferon in combination with hydroxyurea and low-dose cytosine arabinoside as described before.12 Once imatinib became available, it was offered in the case of interferon failure or upon request. Over time, a total of 183 patients received imatinib or a second-generation TKI, 47 in group A (28%) and 136 in group B (84% of 113 patients who did not undergo transplantation and 28% of 148 patients who underwent unrelated donor HSCT). After starting with imatinib (n=181), 16 patients switched to dasatinib and 12 patients to nilotinib.

A total of 44 patients in CP were randomized to autologous HSCT. There was no difference between autologous HSCT and drug treatment, hence they were analyzed as ‘drug treatment’.19

Molecular analysis

BCR-ABL transcript levels were determined every 3 months until year 3, then every 6 months as described previously.22 A major molecular response was defined by a BCR-ABL/ABL ratio of 0.1 or lower, and a complete molecular response, by undetectable BCR-ABL transcripts using total ABL as an internal sensitivity control.
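The two response definitions above can be written as a small decision rule. The following is only an illustrative sketch, not the study's laboratory procedure: the function name, the use of `None` to encode "no transcripts detected" and the label strings are assumptions made for this example, and the requirement of an adequate ABL control is not modelled.

```python
from typing import Optional

# Cut-off for a major molecular response, taken as quoted in the text.
MMR_THRESHOLD = 0.1


def classify_molecular_response(ratio: Optional[float]) -> str:
    """Classify a single BCR-ABL/ABL measurement.

    `ratio` is None when no BCR-ABL transcripts were detected
    (hypothetical encoding chosen for this sketch).
    """
    if ratio is None:
        return "complete"  # undetectable transcripts
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    if ratio <= MMR_THRESHOLD:
        return "major"  # ratio of 0.1 or lower
    return "none"
```

A measurement exactly at the cut-off counts as a major molecular response, following the wording "0.1 or lower".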

Ethics

The protocol was approved by the ethics committee of the Medical Faculty Mannheim of the University of Heidelberg and by the local ethics committees of participating centers. Written informed consent was obtained from all patients before entering the study. The study protocol is registered with ClinicalTrials.gov (NCT00025402).

Statistical analysis

Outcome of groups A and B was analyzed according to the intention-to-treat principle. The primary end point was survival time from diagnosis. Secondary end points included ongoing CML therapy, absence of detectable BCR-ABL transcripts, presence of symptoms and Karnofsky performance score, all at 10 years after diagnosis. Survival times of patients in group B who received an unrelated donor HSCT in first CP were censored at the day of transplantation, as these transplantations were performed independently of prior treatment success and early transplant-related mortality cannot be linked to survival probabilities with best available drug treatment. Patients undergoing transplantation in accelerated phase or blast crisis were not censored, as drug treatment had failed before. Survival probabilities were estimated by the Kaplan–Meier method and compared with the log-rank test. Assuming a late crossing of survival curves for the main comparison between groups A and B,23 overall survival and survival after the crossing were compared with the Wilcoxon–Gehan test. The influence of HSCT as salvage treatment in blast crisis was analyzed by Simon–Makuch curves24 and the Mantel–Byar test.25 Patients were stratified for disease risk by the Euro score26 calculated at diagnosis and for transplant risk by the European Group for Blood and Marrow Transplantation (EBMT) score, calculated at the time of allogeneic HSCT.14 To be able to compare survival probabilities from diagnosis and to avoid time-to-transplantation bias, adapted EBMT scores were computed for all 427 patients eligible for HSCT. The algorithm for calculation and the determination of median P-values are given in the Supplementary eMethods.
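The censoring rule for group B and the Kaplan–Meier estimate can be illustrated with a short sketch. It is not the SAS code used for the study: the function names, the record layout and the toy data are all invented for illustration, and times are in arbitrary units.

```python
from typing import List, Optional, Tuple


def survival_record(follow_up: float, died: bool,
                    unrelated_hsct_first_cp_at: Optional[float]) -> Tuple[float, bool]:
    """Return (time, event) for a group B patient.

    A patient transplanted from an unrelated donor in first chronic phase
    is censored at the day of transplant; everyone else keeps the observed
    follow-up time and vital status.
    """
    t = unrelated_hsct_first_cp_at
    if t is not None and t <= follow_up:
        return (t, False)
    return (follow_up, died)


def kaplan_meier(records: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Product-limit estimate as a list of (death time, survival probability)."""
    ordered = sorted(records)
    at_risk = len(ordered)
    surv = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same time point.
        while i < len(ordered) and ordered[i][0] == t:
            deaths += int(ordered[i][1])
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve


# Invented toy data: (follow-up, died, time of unrelated HSCT in first CP).
toy = [(2.0, True, None), (5.0, True, 3.0), (4.0, True, None), (6.0, False, None)]
curve = kaplan_meier([survival_record(*p) for p in toy])
```

In the toy data the second patient dies after transplantation, but the death does not count against drug treatment because the record is censored at the transplant date.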

The significance level α was chosen to be 0.05 two-sided. All survival comparisons within subgroups are understood as explorative analyses; adjustments for multiple comparisons were not performed. Patients' characteristics at baseline were descriptively compared using Fisher's test or Wilcoxon–Mann–Whitney U-test, as appropriate.
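For a categorical baseline characteristic, a two-sided Fisher test on a 2×2 table can be sketched as follows. This is a generic textbook formulation (summing the probabilities of all tables no more likely than the observed one), not the study's SAS procedure, and the function name is an assumption made for this example.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)
```

A P-value below the chosen two-sided α of 0.05 would be read as a descriptive difference between the groups, in line with the explorative character of these comparisons.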

Code availability

All analyses were performed with the program package SAS 9.2 (SAS Institute, Cary, NC, USA).

Results

Survival

Median survival of all 669 patients was not reached. Ten-year survival probability was 0.61 (95% confidence interval (CI): 0.57–0.65). Median observation time for living patients was 12.1 years (range, 7.3–16.1 years). Survival probabilities were better for the 427 patients eligible for HSCT than for the 242 non-eligible patients (Supplementary eFigure 1a).

Overall and before the crossing of the survival curves at 4.5 years, survival was not significantly different between groups A and B. Before the crossing, 36 patients (22% of 166) in group A (primary HSCT) died, whereas in group B (best available drug treatment), 31 (12% of 261) died and 128 (49%) were censored because of allogeneic HSCT with an unrelated donor in first CP (Figure 2). From 4.5 years on, survival probabilities in group A were significantly higher (Wilcoxon–Gehan test: P<0.001). Ten-year survival probabilities were 0.76 (95% CI: 0.69–0.82) in group A and 0.69 (95% CI: 0.61–0.76) in group B.

Figure 2

Kaplan–Meier estimates of overall survival of the 427 patients stratified according to genetic randomization. Of 427 patients, 166 were randomized to early allogeneic HSCT (group A) and 261 patients to best available drug treatment (group B). Analysis was performed by intention to treat. In group B, the survival time of patients receiving an allogeneic HSCT with an unrelated donor was censored at the day of transplant. The overall survival differences between the two curves were not significant (Wilcoxon–Gehan test). At 1, 5 and 10 years, horizontal crossbars indicate the upper and lower limits of the 95% CIs for the estimated survival probabilities (s.p.).

Euro score, EBMT risk score, adapted EBMT risk score and outcome

The Euro score was an appropriate prognostic system for the 261 patients with drug treatment. Patients with low, intermediate or non-evaluable risk had a significantly better survival compared with the high-risk group (P=0.002; Supplementary eFigure 1b) with no significant difference between the former groups. The EBMT risk score was able to discriminate survival within the 151 patients actually transplanted in group A (overall P=0.002; Supplementary eFigure 1c) as well as between all 305 patients transplanted within the whole patient population (Supplementary eFigure 1d).

As expected, the adapted EBMT risk score at diagnosis had a prognostic impact on outcome in group A (Supplementary eFigure 2a) but not in group B (Supplementary eFigure 2b). Hence, five prognostic groups at diagnosis could be differentiated: three risk groups defined by the adapted EBMT score from diagnosis for group A and two risk groups defined by the Euro score in group B (Figure 3a). Patients of group A with adapted EBMT scores 0–1 (10-year survival probability: 0.85 (95% CI: 0.74–0.92)) had significantly higher survival probabilities (median P<0.001) as compared with patients with high-risk Euro score of group B (10-year survival probability: 0.41 (95% CI: 0.19–0.63)) and also as compared with the 230 non-high-risk patients (median P=0.047; 10-year survival probability: 0.73 (95% CI: 0.65–0.80)). In contrast, there were no statistically significant survival differences for patients in group A with adapted EBMT score 2 (10-year survival probability: 0.70 (95% CI: 0.58–0.79)) or scores 3–4 (10-year survival probability: 0.68 (95% CI: 0.45–0.83)) when compared either with the non-high-risk or the high-risk patients of group B.

Figure 3

(a) Kaplan–Meier estimates of overall survival since diagnosis of the 427 patients eligible for allogeneic HSCT with a related donor. At diagnosis, all 166 patients randomized to group A were stratified according to the adapted EBMT risk score at diagnosis and all 261 patients randomized to group B were stratified according to disease risk (Euro score) at diagnosis. Analysis was performed by intention to treat. In group B, the survival time of patients receiving an allogeneic HSCT with an unrelated donor was censored at the day of transplant. With EBMT score 0 or 1 in group A, survival probabilities were significantly higher compared with the survival probabilities of Euro high- (log-rank test: median P<0.001) and Euro non-high-risk patients (log-rank test: median P=0.047) in group B. At 1 and 5 years, horizontal crossbars indicate the upper and lower limits of the 95% CIs for the estimated survival probabilities. The abbreviation ‘s.p.’ stands for ‘survival probability’. (b) Simon–Makuch estimates of overall survival of 48 patients with blast crisis. Patients were stratified according to the reception of allogeneic HSCT as salvage therapy. All patients started in the non-transplant group. If transplanted, patients changed to the HSCT group at the time of transplant (finally, 23 were transplanted). Survival differences were not significant (Mantel–Byar test). At 1 year, horizontal crossbars indicate the upper and lower limits of the 95% CIs for the estimated survival probabilities.

Blast crisis occurred in 48 (11%) of the 427 eligible patients. Of these, 23 received HSCT for treatment of blast crisis. Median time to transplantation after the beginning of blast crisis was 3.6 months. Thirteen of the 23 patients had started imatinib therapy before HSCT. Of the 25 patients without HSCT in blast crisis, 15 had started imatinib treatment before blast crisis, 6 started imatinib treatment after blast crisis and 4 received neither imatinib nor HSCT. Survival probabilities were not significantly different between patients treated or not with HSCT (Mantel–Byar test; Figure 3b).

Causes of death

The causes of death differed between groups A and B (P<0.001; Supplementary eTable 2). Transplant-related mortality was the most frequent cause of death with 63% of all causes in group A. Disease progression was the most frequent cause of death in group B (60%; not considering causes of death if transplanted with an unrelated donor in first CP). Ten percent (group A) and 23% (group B) of causes of death were reported as not directly related to the disease or the transplant.

Patient status at 10 years

Concerning all comparative analyses on patient status at 10 years, the 131 recipients of unrelated donor HSCT in first CP were not considered. Reporting on the 296 remaining patients according to the intention-to-treat principle, 291 patients (98%) had sufficient follow-up. At 10 years, 122 (75%) of the 162 patients with a related donor and 86 (67%) of the 129 patients without related donor were still alive (Table 2).

Table 2 Status at 10.0 years after diagnosis for 296 patients eligible for allogeneic HSCT

At the time of analysis, significantly more patients in group A (56% of 157, (95% CI: 48–64%)) were alive and without CML treatment than in group B (6% of 127, (95% CI: 2–11%); P<0.001) and significantly more patients in group A (56% of 140, (95% CI: 48–65%)) than in group B (39% of 126, (95% CI: 30–48%); P=0.005) were in molecular remission. Details on the kind of treatment given at 10 years are provided by Supplementary eTable 3.
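The text does not state how the confidence intervals for these proportions were computed. As a rough illustration only, the sketch below uses the normal-approximation (Wald) interval, which is an assumption and not necessarily the authors' method. The count of 88 in the usage note is back-calculated from the rounded "56% of 157" and is likewise an assumption, not a reported figure.

```python
from math import sqrt
from typing import Tuple


def wald_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a proportion.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    # Clip to the valid range of a proportion.
    return (max(0.0, p - half_width), min(1.0, p + half_width))
```

Calling `wald_interval(88, 157)` gives limits of about 0.48 and 0.64, which is consistent with the interval quoted for group A; for proportions close to 0 or 1, such as the 6% in group B, an exact or Wilson interval would usually be preferred.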

There were no significant differences between groups A and B regarding the number of patients with a Karnofsky score below 80% or reporting symptoms.

Patients transplanted in first CP with an unrelated donor

In group B, 131 patients were transplanted in first CP with an unrelated donor. Owing to the unknown reasons for patient selection, the inherent selection of 131 less frail patients surviving up to the day of transplantation, and the ‘guaranteed’ survival between diagnosis and the day of transplantation, an unbiased comparative analysis between these patients and either the patients of group A or the 130 patients of group B not transplanted in first CP was not possible. With this in mind, 10 years after transplantation, 118 of the 131 patients had either died (n=54) or were still alive (n=64, 54%). Of the 118 patients, 50% were without therapy, 45% had undetectable BCR-ABL, 45% had a Karnofsky score >80 and 36% were without symptoms. Thirteen patients were alive but had not been observed for 10 years.

Discussion

Long-term overall survival in this multicenter prospective randomized CML-study IIIA was not different whether patients were randomized to primary HSCT or to best available drug treatment. There was no difference in performance status and symptom reporting between the groups at time of the analysis. However, significantly more patients assigned to early transplant were in complete molecular remission at 10 years and free of CML treatment. Outcome was determined by three key factors: disease risk, transplant risk and treatment allocation. Integrating these predetermined factors at diagnosis into the intention-to-treat analysis, patients with HSCT and a low EBMT risk score (0–1) showed no excess mortality and fared significantly better than patients with no donor. In contrast, the concept of salvage HSCT in advanced disease despite a high EBMT score failed.

The present data contradict some previous findings and current concepts.1, 2, 9, 10 Their relevance can be questioned in the era of tyrosine kinase inhibitors, of high-resolution molecular HLA typing and of the availability of more than 20 million typed unrelated volunteer donors worldwide, when outcome after a well-matched unrelated donor HSCT might be even better than after a sibling donor transplant.1, 27, 28, 29 Moreover, the present results are derived from a study designed 20 years ago in the pre-imatinib era. Nevertheless, we consider the main findings valid. Outcome after HSCT can be predicted, as has been repeatedly validated. Risks are approximately additive; outcome is substantially better for young patients with early disease, a short time interval from diagnosis and a well-matched, gender-identical donor than for older patients with more advanced disease, a longer time interval from diagnosis and a mismatched female donor for a male recipient, independent of stem cell source or transplant technique.14

There are some notes of caution. The results differ substantially from those of the previous CML-study III.12 This could be explained by differences in the risk profile of the patients transplanted with a related donor in first CP in group A (144 of 166 patients in CML-study IIIA).23 Significantly improved survival results for these patients led to a different outcome in the comparison between groups A and B and thus to different conclusions. In the CML-study IIIA, transplants had to be performed whenever possible within 1 year from diagnosis. In addition, outcome of HSCT has significantly improved over the past decade. The majority of the patients (51%) received imatinib or a second-generation TKI before transplantation or in the later course of their disease. In a complex multicenter study involving many participants, patients and donors, local physicians, CML-study centers and transplant teams, different opinions prevail. The impact of standardized strategies in qualified centers has only recently been recognized.15, 20 Timing was not always consistent, transplant protocols or drug treatment schedules could have been modified. A statistical model can neither adjust for all variables nor consider a subjective decision for HSCT with an unrelated donor transplant in first CP. However, randomization of patients with a related donor, analysis in accordance with intention to treat and the use of an adapted EBMT score at time of diagnosis considerably reduced the selection and the time-to-transplant bias. Reducing this bias, results showed no early excess mortality for the HSCT group.

Outcome of HSCT with high transplant risk score has not improved over the past decade.15 In our series, the comparative analysis of the salvage treatment after start of blast crisis was limited by the small patient number (n=48) and the unknown reasons why some patients were selected for HSCT. Independent of that, no patient with EBMT risk score 6 or 7 survived long term.

Although the best available drug treatment has meanwhile considerably improved and a significant survival difference from Euro non-high-risk patients is less likely, the favorable survival results of the patients with low EBMT score are noteworthy on their own, in particular as the results are based on an intention-to-treat analysis and on a median observation time beyond 10 years. Without the expense of early excess mortality, patients with newly diagnosed high-risk CML and non-responders to first-line TKI can benefit from an early low-risk HSCT through improved long-term survival, shorter time of treatment, a higher rate of molecular remissions and lower health-care costs. Our data fit with a recent comparison in the imatinib era. Patients with an early low-risk HSCT showed no early excess mortality but a significantly higher rate of molecular remissions as compared with those on imatinib treatment.16 Taken together, these recent data and the results from this long-term study indicate that patients with a low transplant score who failed initial tyrosine kinase inhibitors might be considered for an early HSCT rather than rescue drug treatment. Assessment of donor availability will be a prerequisite to achieve this goal, as will forgoing HSCT in the absence of a low-risk donor. HSCT for advanced disease with a high transplant risk should not be advocated; ongoing drug treatment or best supportive care might be the better option and will also save costs.30, 31 The overall strategy should involve close cooperation between local physicians, treatment centers and transplant teams. The concept of donor versus no donor should probably change to a risk-adapted strategy, defined by disease and transplant risk for acquired hematological disorders in general.