Introduction

Since long-term remission rates for Hodgkin lymphoma (HL) patients with early-stage favorable disease are already above 90% after combined modality first-line therapy (CMT) [1,2,3,4,5], research is putting more emphasis on improving the balance between long-term side effects on the one hand and tumor control on the other [6].

A widely used standard of care for early-stage favorable HL is treatment with two cycles of doxorubicin, bleomycin, vinblastine, and dacarbazine (ABVD) followed by 20 Gy involved-site radiotherapy (RT). However, the RT consolidation within the CMT concept is assumed to be more harmful than the chemotherapy [7,8,9,10]. Therefore, several trials have assessed whether the use of positron emission tomography (PET) may allow to omit RT from the standard regimen, including the German Hodgkin Study Group (GHSG) HD16 trial [10,11,12,13]. In HD16, we aimed at demonstrating non-inferiority when omitting consolidation RT in patients with a negative PET after two cycles of ABVD (PET-2) in terms of progression-free survival (PFS) as compared to CMT. Secondly, we analyzed whether a positive PET-2 was a risk factor for PFS among patients who were treated with CMT. Here, we present the follow-up analysis on the final data status reporting the long-term results of the international randomized GHSG phase III HD16 trial.

Methods

Study design and patients

The randomized phase 3 trial HD16 was conducted at 250 sites in Germany, Switzerland, Austria, and the Netherlands. It was approved by the responsible ethics committees, conducted according to the Declaration of Helsinki and the ICH-GCP guidelines, registered at ClinicalTrials.gov (NCT00736320), and completed enrollment in December 2015 [13]. All patients provided written informed consent before enrollment.

We recruited patients aged 18–75 years with newly diagnosed, histology-proven classical HL in clinical stages I or II, or nodular lymphocyte-predominant HL in Ann Arbor stages IB, IIA, or IIB, without any of the GHSG risk factors large mediastinal mass (≥a third of the maximal thoracic diameter), extranodal lesions, elevated erythrocyte sedimentation rate (≥50 mm/h without B symptoms, ≥30 mm/h with B symptoms), or ≥3 involved nodal areas. Further details of study design and inclusion/exclusion criteria have been previously published [13].

Randomization

Patients were centrally randomized 1:1 between two parallel treatment groups before starting treatment: CMT consisting of two cycles of ABVD followed by a centrally reviewed PET/CT-based staging (PET-2) and 20 Gy IF-RT or PET-2-guided treatment, with two cycles of ABVD for all patients and 20 Gy IF-RT only for those with a positive PET-2 as defined by a Deauville score (DS) ≥ 3. Randomization was stratified according to recruiting site, age (<45 vs. ≥45 years), sex, B symptoms, disease localization (supradiaphragmatic vs infradiaphragmatic), albumin level (<40 g/L vs ≥40 g/L), and presence vs. absence of initial bulk ≥5 cm in largest diameter. Patients, investigators, and central review panel were masked to treatment allocation until a central review of PET-2 was completed.

Procedures

Procedures have been published in detail [13] and are summarized in the supplement. ABVD was given as previously described [10]. PET-2 was performed between day 22 and 35 of the second ABVD cycle, centrally reviewed by a multidisciplinary panel of experts masked to treatment group allocation and rated according to the DS using the mediastinal blood pool as cut-off for positivity (DS ≥ 3) [14]. Patients with progressive disease were to be taken off study treatment. IF-RT was centrally planned for all patients based on initial staging imaging [15].

Patients were followed for at least 5 years within the study. If separate written informed consent was given, individual follow-up was extended until the end of the study, scheduled 5 years after enrollment of the last patient.

Outcomes

The primary endpoint was PFS, defined as time from completion of staging until progression, relapse, or death from any cause. If none of these events had occurred, PFS was censored at the date of last information on the disease status. Predefined secondary endpoints reported herein were overall survival (OS; time from completion of staging until death from any cause or censored at the date of last information on the patient being alive), time to second primary malignancy (SPM; calculated from completion of staging until first SPM diagnosis or censored at the date of last information on disease status, accounting for death as a competing risk), time to in-field and out-field recurrence, with in-field recurrence defined as progression or relapse with at least one localization within the (potential) radiation field, and out-field recurrences as those with at least one localization outside of the (potential) radiation field (calculated from completion of staging until first respective event or censored at the date of last information on disease status, accounting for death or other HL recurrence as competing risks), cardiac function in terms of mean left ventricular ejection fraction (LVEF) after 5 years of follow-up, time to first childbirth (calculated in female patients from last day of study therapy until the day of birth of the first child born after study therapy, or censored at the date of last information on disease status, accounting for death as a competing risk), as well as frequency of children born after therapy and use of cryopreservation.

Statistical analysis

The HD16 trial had two independent co-primary objectives. The first question addressed whether RT could be omitted from standard CMT after a negative PET-2 without a clinically relevant loss of tumor control. Non-inferiority of ABVD alone over standard CMT would be established if the upper limit of the 2-sided 95% CI for the hazard ratio (HR) for PFS was below the predefined non-inferiority margin of 3.01 in a per-protocol analysis of the PET-2-negative patient population. Second, the HD16 trial aimed to assess whether a positive PET-2 represented a risk factor for PFS among patients assigned to receive CMT by demonstrating superiority of patients with a negative PET-2 result from the standard CMT group over those with a positive PET-2 from both treatment groups. Details on statistical methods and sample size calculation have been published previously [13].

OS as well as in-field and out-field recurrences, were analyzed as secondary endpoints for the before mentioned RT and PET objectives. SPM were analyzed in the PET-2-negative population to complement the primary non-inferiority objective. Cardiac function was analyzed in the entire study population by assigned treatment group separately for male and female patients in a complete-case analysis. Fertility outcomes were analyzed by assigned treatment groups in pre-defined subgroups of male patients aged 18–60 years at enrollment and female patients aged 18–40 years at enrollment.

We analyzed PFS and OS using the Kaplan-Meier method, including HRs and 95% CIs obtained from Cox regression models and log-rank tests where applicable. To assess the prognostic impact of PET-2 independently from baseline factors, this analysis was performed by multivariate Cox regression including all stratification factors (except for recruiting site). SPM, in-field and out-field recurrences and childbirth after therapy were analyzed using cumulative incidence functions, with sub-distribution HRs and p values obtained from univariate Fine-Gray models. Other secondary endpoints were analyzed by means of descriptive statistics. Analysis sets remained unchanged from the primary analysis [13]. The non-inferiority test was primarily performed in the per-protocol population, excluding all patients with severe protocol deviations. Sensitivity analyses and all other analyses were done by intention-to-treat. However, all patients dropping out before the central review of PET-2 were excluded from analyses regarding the main objectives of the trial (ITTPET population). We did post-hoc subgroup analyses of female patients, male patients, and patients below the age of 50 years at enrollment for the non-inferiority objective; results are reported in Supplementary Fig. 2. All authors had access to primary clinical trial data. Data were analyzed in the GHSG Trial Coordination Center. We used SAS version 9.4 for all analyses. All analyses are based on the final data status of July 2021.

Data sharing

The datasets generated and analyzed during the study and single patient data can be made available upon reasonable request. Decisions regarding data sharing will be made on a case-by-case basis by the corresponding author considering data protection and other applicable regulations.

Results

A total of 1150 patients were enrolled and randomized between November 25, 2009, and December 29, 2015 (Fig. 1). Among 1007 patients with a centrally reviewed PET-2, 340 were rated positive, with DS 3 in 218, DS 4 in 122, and DS 5 in 0 cases. Among the 667 patients with a negative PET-2, the per-protocol set comprises 628 patients, 328 of which were treated with standard CMT and 300 with ABVD alone. Patient characteristics have been published [13] and were similar between randomized treatment groups in the ITTPET as well as the PET-2-negative per-protocol population (Supplementary Tables 12).

Fig. 1: Trial profile.
figure 1

Abbreviations: CMT combined-modality treatment, PET-2 positron emission tomography after two cycles of chemotherapy, HL Hodgkin lymphoma, ITT intention-to-treat, PP per-protocol. *Deauville score 1–2. **Deauville score ≥3.

With a median follow-up of 64 months, estimated 5-year PFS in the PET-2-negative per-protocol population was 94.2% (91.6–96.9) with CMT and 86.7% (82.5–90.9) with ABVD alone (Fig. 2A). With a HR of 2.05, the respective 95% CI ranged from 1.20 to 3.51 and thus included the pre-defined margin for non-inferiority of 3.01. The ITT analysis led to similar results (5-year PFS 94.0% [91.4–96.6] with CMT and 86.6% [82.4–90.7] with ABVD alone, HR 1.99 [1.19–3.34], Supplementary Fig. 1A). The PFS difference primarily resulted from an incidence of in-field recurrences in the ABVD only group, with 5-year cumulative incidences of 2.0% (0.4–3.7) after CMT vs. 10.4% (6.7–14.1) after ABVD alone (p = 0.0002, Fig. 2B), while out-field recurrences were more balanced between the groups (5-year cumulative incidence 3.7% [1.5–5.9] vs 6.4% [3.3–9.5], p = 0.37, data not shown). Most patients received high-dose chemotherapy (HDCT) and autologous stem-cell transplantation (ASCT) at progression or relapse (Table 1).

Fig. 2: Kaplan-Meier estimates for the PET-2-negative per-protocol population.
figure 2

A Progression-free survival. B Cumulative incidence of in-field recurrences. C Cumulative incidence of second primary malignancies. D Overall survival. Abbreviations: PFS progression-free survival, CMT combined-modality treatment, ABVD doxorubicin, bleomycin, vinblastine, and dacarbazine, SPM second primary malignancy.

Table 1 Outcomes of the PET-2-negative per-protocol population.

SPM occurred in 17 and 12 patients treated with CMT and ABVD alone, respectively (Table 1), with corresponding 5-year cumulative incidences of 4.6% (2.1–7.1) vs. 4.2% (1.6–6.8, p = 0.57, Fig. 2C).

With a median follow-up for OS of 66 months, 9 and 3 patients have died, including four deaths from SPM and one from HL (Table 1). In the per-protocol analysis, estimated 5-year OS was 98.3% (96.9–99.8) with CMT and 98.8% (97.4–100) with ABVD alone (p = 0.14, Fig. 2D, ITT analysis: Supplementary Fig. 1B).

In addition to the 340 patients with a positive PET-2, i.e., DS ≥ 3, 353 patients with a negative PET-2, i.e., DS 1-2, were assigned to receive CMT. With a median follow-up of 64 months, estimated 5-year PFS was 94.0% (91.4–96.6) after a negative PET-2 and 90.3% (86.9–93.6) after a positive PET-2 (p = 0.012, Fig. 3A). With 9 and 8 deaths, respectively, OS was similar in the PET-2-negative and positive subgroups (98.4% [97.1–99.8] vs. 98.6% [97.2–100]; p = 0.43, Table 2, Fig. 3B). Using the less conservative and more commonly applied cut-off of DS 4 for PET-2 positivity, the prognostic value of PET-2 became more apparent with estimated 5-year PFS rates of 94.0% (91.9–96.0) after a DS 1–3 and 83.6% (76.6–90.6) after a DS ≥ 4 (p < 0.0001, Fig. 3C). Again, there was no OS difference (Fig. 3D). There were both more in-field and out-field recurrences in patients with positive PET-2 (Supplementary Fig. 3).

Fig. 3: Kaplan-Meier estimates for PET-2-negative and PET-2-positive patients assigned to receive combined-modality treatment.
figure 3

A Progression-free survival, DS 1–2 vs DS ≥ 3. B Overall survival, DS 1–2 vs DS ≥ 3. C Progression-free survival, DS 1–3 vs DS ≥ 4. D Overall survival, DS 1–3 vs DS ≥ 4. Abbreviations: PET-2 positron emission tomography after two cycles of chemotherapy, PFS progression-free survival, DS, Deauville score. *Cox model adjusted for stratification factors age, sex, B symptoms, disease localization (supra- vs. infra diaphragmatic), albumin level (<4 g/dL vs. ≥4 g/dL), and bulky disease.

Table 2 Outcomes of PET-2-negative and PET-2-positive patients assigned to receive combined-modality treatment.

Cardiac toxicity after 5 years was not substantially different neither between treatment groups nor between male and female patients with regard to LVEF (Table 3).

Table 3 Cardiac toxicities.

Within a median follow-up of 63 months from the end of study therapy, 54 out of 293 women of age 18-40 at enrollment reported to have given birth. Cumulative incidences of childbirth after 5 years were 24.0% (15.9–32.2) in patients assigned to standard CMT and 17.9% (10.4–25.5) with PET-guided treatment (Fig. 4). Cryopreservation before start of therapy was documented in 10 (19%) of 54 women who later had children, and one reported to have used cryogenic material (fertilized oocytes, after receiving CMT as well as treatment for breast cancer).

Fig. 4: Cumulative incidence of first childbirth among female patients aged 18–40 years at enrollment.
figure 4

Abbreviations: CMT combined-modality treatment, PET-2 positron emission tomography after two cycles of chemotherapy.

Within a median follow-up of 62 months after therapy, 37 of 590 male patients aged 18–60 years at enrollment reported to have had children. In 28 (76%) of these patients, sperm cryopreservation had been done before therapy. However, information about whether cryogenic material had been used was not available.

Discussion

This follow-up analysis of the GHSG HD16 trial confirms that RT cannot be omitted from the treatment of early-stage favorable HL after a negative PET-2 without a clinically relevant and statistically significant loss of efficacy. We could not observe any disadvantage of standard CMT over PET-2-guided treatment in terms of acute or late toxicities. OS remains on a high level and was - with the available follow-up - not impaired by the omission of RT, likely due to effective second-line therapies. We also could confirm PET-2-positivity as risk factor regarding PFS in patients receiving CMT, indicating the need for further improvement and innovative approaches in this group of patients.

In the primary HD16 analysis [13], non-inferiority of ABVD alone could be clearly ruled out in the primary per-protocol analysis. However, the sensitivity analysis in the ITT population showed a smaller PFS difference at that time. In the updated analysis presented herein, ITT results are much more pronounced and fully support the conclusion that omission of RT leads to inferior PFS in PET-2-negative patients.

The fact that in the CMT group of the ITT population only two relapses more have been observed compared with the per-protocol population, and both patients dropped out of the per-protocol population because they refused RT, supports this result as well. Moreover, the majority of relapses after ABVD alone were within a theoretical IF-RT field, highlighting an important role of RT in local tumor control and preventing in-field relapses.

Here, the question arises, how to improve the negative predictive value of PET-2 in order to allow for the omission of RT in selected patients. Novel tools such as circulating tumor DNA and metabolic tumor volume are currently under evaluation and might improve risk prediction in the future [16, 17].

Our results are in line with results from other randomized phase III trials such as the European Organisation for Research and Treatment of Cancer (EORTC)/The Lymphoma Study Association (LYSA)/Fondazione Italiana Linfomi (FIL) H10 and the UK National Cancer Research Institute (NCRI) RAPID studies [11, 18]. However, the authors of the RAPID trial drew different conclusions from their results by accepting a greater loss of efficacy for omitting RT, despite missing the pre-defined margin for non-inferiority. A joint analysis of the H10 and RAPID trials showed that relapse rates in patients with ABVD alone were higher than with CMT in the first two years. Afterwards, however, relapse rates in both groups were similar [19]. This assumption cannot be supported by our data and is in contrast to the findings in >20,000 early-stage HL patients from the US National Cancer Data Base, reporting superior OS with CMT versus chemotherapy only [20].

Even though PFS is significantly better in the standard CMT group compared to ABVD alone, OS continues to be very high in both groups – regardless of the higher rate of relapses for patients who did not receive RT. This can be attributed to the very effective second-line treatment options in HL. Most patients with relapse received HDCT and ASCT, which proves to be an effective treatment for relapse of HL, especially after initial early-stage disease [21,22,23]. Patients who received ABVD alone were treated with this intensive second-line therapy as well. Again, these results are in line with the results of H10 and RAPID [11, 18]. The authors of the RAPID trial concluded that the significant difference in PFS can be neglected since OS rates are still high in both groups. In line with the EORTC H10 investigators, we believe, however, that in spite of the high OS rates, the difference in PFS is clinically relevant and particularly of high importance for patients in terms of quality of life [24]. Additionally, while some patients were spared from first-line RT, they still received a very toxic and strenuous second-line treatment due to relapse to allow for these high OS rates. We postulate that most of these patients could have been spared from intensive second-line therapy if they would have had consolidation RT as part of CMT.

Of note, the findings from the present analysis comprising the whole study population of the HD16 trial and from the H10 and RAPID studies that only included individuals with classical HL also appear to hold true for patients with early-stage favorable nodular lymphocyte-predominant HL (NLPHL). A recent subgroup analysis of the patients with NLPHL treated within the HD16 study indicated that individuals presenting with this rare HL subtype also require consolidation RT irrespective of the PET-2 results to achieve the optimal disease control [25].

Importantly, there are not more SPM in the CMT group and therefore, late toxicities caused by RT seem to play a much smaller role than initially predicted – limited by the fact that the follow-up still is rather short and that SPM might occur 20 to 30 years after CMT. This is of great importance since late toxicities caused by RT have contributed to the decline in OS of HL patients in the past [26]. An analysis by Baues et al., including patients treated within the HD16 study, demonstrated that the rate of acute side effects with IF-RT was very low, with almost no grade 3 and mostly grade 1 toxicity [26]. This can be attributed to smaller irradiation fields, both due to technical improvements but also rather localized disease in early-stage HL given the absence of the risk factor ≥3 nodal areas and lower radiation doses. With the current standard consisting of involved-site RT, side effects are expected to decrease further [26]. Moreover, our data do neither demonstrate any strong negative impact on cardiac function measured by LVEF nor on fertility measured by the rate of child births.

With prolonged follow-up, PFS rates continue to be significantly higher for PET-2-negative patients. Even years after end of CMT, there are still more recurrences in patients with a positive PET-2, which are mostly located outside of the radiation field. Thus, improving response rates and importantly PFS in these patients could be achieved by intensifying chemotherapy instead of intensifying or modifying RT. This approach has been investigated in the EORTC H10 trial and showed promising results when intensifying chemotherapy to two cycles of escalated BEACOPP (bleomycin, etoposide, doxorubicin, cyclophosphamide, vincristine, procarbazine, prednisone) in case of PET-positivity after two cycles of ABVD [18].

Some of the inherent limitations of HD16 have been addressed already in the initial publication [13]. Additionally, although median follow-up is now 64 months, some late toxicities, such as SPMs or cardiovascular disease, may still not be evaluable sufficiently due to their delayed onset. In addition, data on birth rates and gonadal dysfunction, including whether or not in vitro fertilization was performed, is limited to a subset of patients due to limited data availability. Despite these limitations, the present update of this large international randomized phase III trial provides important long-term data and more robust findings than initially published.

In conclusion, this follow-up analysis of the GHSG HD16 trial confirms that RT cannot be omitted from treatment of early-stage favorable HL in case of a negative PET-2 without a significant and clinically relevant loss in efficacy. Accordingly, CMT remains standard treatment for early-stage HL patients. Since PET-2 positivity (≥DS 4) has a relevant negative impact on PFS, innovative treatment strategies are needed to improve outcomes after first-line treatment in these patients.