Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of malignant lymphoma. It has an aggressive clinical course but also a high chance of treatment response and cure. The addition of rituximab (R) to standard-of-care combination chemotherapy CHOP (doxorubicin, cyclophosphamide, vincristine and prednisolone) has significantly improved patient outcomes since its introduction in the mid 2000s [1,2,3]. The positive development in DLBCL survival is illustrated by decreasing relapse rates, improved long-term progression-free and overall survival over time [1, 4, 5]. Despite this trend, every fourth DLBCL patient who is treated with curative intent is still expected to experience progressive disease or relapse [6, 7], resulting in a worsened prognosis [8]. On the other hand, achieving a lasting remission for at least 2 years from diagnosis and primary treatment has emerged as a good indicator for a favourable prognosis long term [7, 9, 10].

Population-based cancer survival is often reported using net survival. Although this measure is useful for making comparisons between groups or across time (where mortality due to other causes may differ), it is less useful for understanding real-world probabilities. The aim of this study was to provide real-world summary measures of the chance of a lasting remission, which can be useful in risk communication and clinical decision-making. We used multistate models to quantify patient trajectories from remission after first-line treatment for DLBCL, allowing for repeated occurrences of both relapse and remission.

Materials and methods


All patients diagnosed with DLBCL between 2007 and 2014 were first identified in the Swedish lymphoma register (SLR) (N = 4247 after excluding primary CNS lymphoma, primary mediastinal large B-cell lymphoma and transformed/discordant lymphomas). The SLR contains detailed clinical information on lymphoma characteristics at primary diagnosis, such as stage, WHO performance status, serum lactate dehydrogenase level (S-LDH) and extranodal sites. SLR also contains information on first-line treatment and treatment response. International prognostic index (IPI) was calculated with one point assigned for each of the following factors: age >60 years, Stage III/IV disease, elevated serum-LDH, WHO performance status ≥2 (patient is unable to carry out work and/or bedridden) and one or more extranodal sites [11]. Patients were also classified into risk groups based on the age-adjusted IPI (aaIPI), with one point assigned for each of the factors: Stage III/IV disease, elevated serum-LDH and WHO performance status ≥2.
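The scoring rules above can be sketched in a few lines of Python. This is only an illustration of the point assignment as described in the text; the `Patient` class, its field names and the example values are hypothetical and do not correspond to variables in the SLR.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Baseline characteristics needed for the IPI (field names are illustrative)."""
    age: int               # age at diagnosis, years
    stage: int             # Ann Arbor stage, 1-4
    ldh_elevated: bool     # serum LDH above the upper normal limit
    who_ps: int            # WHO performance status, 0-4
    extranodal_sites: int  # number of extranodal sites


def ipi(p: Patient) -> int:
    """One point per risk factor, using the cut-offs given in the text (range 0-5)."""
    return sum([
        p.age > 60,
        p.stage >= 3,
        p.ldh_elevated,
        p.who_ps >= 2,
        p.extranodal_sites >= 1,
    ])


def aaipi(p: Patient) -> int:
    """Age-adjusted IPI: stage, LDH and performance status only (range 0-3)."""
    return sum([p.stage >= 3, p.ldh_elevated, p.who_ps >= 2])


def ipi_group(score: int) -> str:
    """Grouping used later in the analysis: 0-1, 2-3 or 4-5 risk factors."""
    return "0-1" if score <= 1 else "2-3" if score <= 3 else "4-5"


# Hypothetical patient: 72 years, Stage IV, elevated LDH, WHO PS 1, no extranodal site.
example = Patient(age=72, stage=4, ldh_elevated=True, who_ps=1, extranodal_sites=0)
print(ipi(example), aaipi(example), ipi_group(ipi(example)))
```

For this hypothetical patient, age, stage and LDH each contribute one point to the IPI, while only stage and LDH count towards the aaIPI.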

In 2017–2019, a national review of all medical charts for patients diagnosed with DLBCL 2007–2014 was conducted to confirm recorded treatment responses and relapses and to identify unrecorded relapses. Among relapsed patients, a more detailed data collection on later treatment lines and treatment responses was also performed.

In this study, patients who were treated curatively (i.e., who received three or more cycles of anthracycline-based chemotherapy or equivalent) and responded to the treatment (i.e., who had achieved complete (CR) or partial remission (PR) at their final treatment evaluation with CT or PET-CT) were included (n = 2941). Among all patients who experienced a relapse after CR/PR, a more detailed medical record review could be performed in the majority of the cases (471/538, 88%). Reasons for not being included in the detailed data review were either lack of informed consent to review the medical records for patients who were still alive (n = 13, 2%), or that the complete medical records following relapse could not be identified (n = 36, 7%).

Information on dates of death was obtained for all 2941 patients by linking the cohort to the national Swedish cause-of-death register through the use of the unique national identity numbers assigned to all Swedish residents [12]. Patients were followed from the end of first-line treatment until death or October 31, 2017, whichever came first. The study was approved by the regional ethics committee in Stockholm, Sweden.

Statistical analysis

Conceptual modelling framework

Population-based cancer survival is often reported using net survival (e.g., cause-specific survival (CSS) or relative survival (RS)), and summarised at, e.g., 2 or 5 years after diagnosis. Although net survival measures are useful for making comparisons between groups or across time (where mortality due to other causes may differ), the interpretation of net survival is limited to the hypothetical scenario where patients are assumed immune to death from competing causes [13, 14]. An assumption of immortality is also present when estimating, e.g., time to progression or relapse if deaths (due to any cause) are censored in the analysis. To relax this assumption, patient survival estimated in the presence of competing risks can be used as a real-world summary measure of the anticipated prognosis [14, 15].
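The distinction can be made explicit for the simplest case of relapse with death as a competing event. The notation below is introduced here purely for illustration and is not taken from the cited references: $h_1$ denotes the cause-specific hazard of relapse and $h_2$ the hazard of death without relapse.

```latex
% h_1(t): cause-specific hazard of relapse
% h_2(t): hazard of death without relapse (symbols introduced for illustration)
\begin{align*}
  \text{Net-type probability (deaths censored):}\quad
    & F_1^{\text{net}}(t) = 1 - \exp\!\left(-\int_0^t h_1(u)\,du\right) \\
  \text{Real-world probability (competing risks):}\quad
    & F_1(t) = \int_0^t h_1(u)\,
      \exp\!\left(-\int_0^u \{h_1(s) + h_2(s)\}\,ds\right) du
\end{align*}
```

Because $\exp\{-\int_0^u (h_1 + h_2)\,ds\} \le \exp\{-\int_0^u h_1\,ds\}$, it follows that $F_1(t) \le F_1^{\text{net}}(t)$: censoring deaths yields a relapse probability that applies only in a world where patients cannot die first.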

Our approach to modelling complex disease pathways and their associated probabilities in the presence of competing risks is via multistate models [16]. For illustration, the multistate model that was used to describe patient trajectories in this study is depicted in Fig. 1. The model was defined by multiple remission and relapse states. The multistate model can be described as a stochastic process with eight states (illustrated by boxes in Fig. 1) and seven possible transitions (illustrated by arrows in Fig. 1).

Fig. 1: Illustration of the multistate model used to define the transitions (arrow) between different states (boxes).

The flow of individuals through the various patient trajectories is indicated by the numbers on each transition. (1) In all, 471 (88%) of 538 relapsing patients had follow-up data beyond the first relapse and could contribute to later transitions. (2) In total, 41 patients responded to later line treatment (following the second relapse) and 14 patients had a third relapse. No patients responded to later line treatment following the third relapse.

All patients were included at remission after first-line treatment (starting state). Patients who relapsed transitioned into the “First relapse” state (n = 538), from which they could subsequently enter the “Second remission” state if they responded to the relapse treatment and achieved a new remission (n = 208). Similarly, patients who experienced a second relapse entered the “Second relapse” state (n = 105). Patients who progressed on relapse treatment remained in the relapsed state until they eventually died. Each transition between states can be viewed as a survival model accounting for competing events at each transition. The model was used to obtain summary measures for transition probabilities, including probabilities of ever visiting a state, and length of stay at different points of follow-up.
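As a rough sketch, the state structure can be encoded as a transition matrix in which each permitted move receives a transition number. The remission and relapse states follow the text; splitting death into four absorbing states (one per originating state) is an assumption made here so that the counts match the eight states and seven transitions mentioned above, and the death-state labels are hypothetical.

```python
# Remission/relapse labels follow the text. The four separate death states are
# an assumption made to arrive at eight states and seven transitions.
STATES = [
    "First remission",
    "First relapse",
    "Second remission",
    "Second relapse",
    "Death in first remission",
    "Death after first relapse",
    "Death in second remission",
    "Death after second relapse",
]

TRANSITIONS = [
    ("First remission", "First relapse"),
    ("First remission", "Death in first remission"),
    ("First relapse", "Second remission"),
    ("First relapse", "Death after first relapse"),
    ("Second remission", "Second relapse"),
    ("Second remission", "Death in second remission"),
    ("Second relapse", "Death after second relapse"),
]


def transition_matrix(states, transitions):
    """Square matrix whose entry [i][j] holds the 1-based transition number
    for a permitted move i -> j, or None when the move is not allowed."""
    index = {s: k for k, s in enumerate(states)}
    matrix = [[None] * len(states) for _ in states]
    for number, (src, dst) in enumerate(transitions, start=1):
        matrix[index[src]][index[dst]] = number
    return matrix


TMAT = transition_matrix(STATES, TRANSITIONS)

# States with no outgoing transition are absorbing.
absorbing = [s for s, row in zip(STATES, TMAT) if all(c is None for c in row)]
print(absorbing)
```

Each non-empty cell corresponds to one transition-specific survival model, and each row with more than one non-empty cell defines a set of competing events.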

Approach to modelling transition rates

The multistate model can be specified as a combination of transition-specific survival models and the possible transitions defined by a transition matrix (Supplementary Fig. 1). In this study, all transition-specific survival models were modelled using flexible parametric models [17], fitted on the log hazard scale using restricted cubic splines to estimate the baseline hazard. The number of degrees of freedom for each transition model was chosen by re-fitting the model with a range of knots (different numbers and localisation) and comparing the Akaike information criterion (AIC) and Bayesian information criterion (BIC) (Supplementary Table 1). Transition probabilities were calculated assuming a Markov renewal or clock-reset process, where time since entry in the current state was used as the underlying time scale for the process. The reason for this approach was that time since e.g. relapse was considered to be of greater importance for the transition probabilities than time since the first remission. Confidence intervals (CI) were calculated using parametric bootstrap [16, 18].
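To illustrate what the clock-reset assumption means in practice, the following minimal simulation draws patient paths in which the time scale restarts at every state entry. It is not the estimation procedure used in the study: the Weibull parameters are hypothetical placeholders rather than fitted values, death is merged into a single state for brevity, and the state names are shortened.

```python
import random

# Hypothetical Weibull (shape, scale in years) per transition; NOT study estimates.
# Each hazard is a function of time since entry into the current state.
HAZARDS = {
    "Remission 1": {"Relapse 1": (0.8, 12.0), "Death": (1.2, 25.0)},
    "Relapse 1": {"Remission 2": (1.0, 1.5), "Death": (1.0, 1.0)},
    "Remission 2": {"Relapse 2": (0.9, 2.0), "Death": (1.2, 20.0)},
    "Relapse 2": {"Death": (1.0, 0.8)},
}


def simulate_path(rng, horizon):
    """Return [(entry_time, state), ...] for one patient up to `horizon` years."""
    t, state = 0.0, "Remission 1"
    path = [(t, state)]
    while state in HAZARDS:
        # Draw a latent time for every competing transition; the earliest wins.
        # random.weibullvariate takes (scale, shape) in that order.
        draws = {dst: rng.weibullvariate(scale, shape)
                 for dst, (shape, scale) in HAZARDS[state].items()}
        dst = min(draws, key=draws.get)
        t += draws[dst]  # clock-reset: the sojourn time starts from zero at entry
        if t > horizon:
            break
        state = dst
        path.append((t, state))
    return path


def state_at(path, time):
    """State occupied at `time`: the last state entered at or before it."""
    return [s for entry, s in path if entry <= time][-1]


def occupancy(n, time, seed=1):
    """Simulated probability of being in each state at `time`."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        s = state_at(simulate_path(rng, time), time)
        counts[s] = counts.get(s, 0) + 1
    return {s: c / n for s, c in counts.items()}


probs = occupancy(n=20000, time=2.0)
for s, p in sorted(probs.items()):
    print(f"{s}: {p:.3f}")
```

Under a clock-forward (Markov) alternative, the hazards after relapse would instead depend on time since first remission; the clock-reset choice reflects the view that time since the most recent event matters more.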

For estimating the probability of remaining in remission by clinical subgroups, we considered eight groups defined by the patients' age at diagnosis (≤60, 61–70, 71–80 and >80 years) combined with their within-group age-adjusted International prognostic index (aaIPI: <2 or ≥2 risk factors). The transition rate to each state was allowed to differ for each clinical subgroup through the inclusion of interaction terms between the group and the baseline hazard function.

We also estimated the impact of sex, Stage (I, II, III, IV), WHO performance status, S-LDH (normal, elevated), number of extranodal sites (0, 1, ≥2) and IPI (0–1, 2–3, 4–5 risk factors) on the difference in the 2-year remission transition probabilities and in the length of stay [16, 19]. For example, with respect to transition probabilities, a difference of −16.1% units comparing patients with Stage IV to patients with Stage I means that the probability of being in remission after 2 years is 16.1% units lower among Stage IV patients than among Stage I patients. The difference in length of stay gives an estimate of how much the time spent in a particular state is impacted (due to factors included in the model). Continuing the example from above, a difference in the 2-year length of stay of −84.7 days means that patients with Stage IV lose, on average, 84.7 days in the remission state during the first 2 years of follow-up, compared to patients with Stage I. Both types of estimates were adjusted for age and calendar year of diagnosis (both included as continuous variables using restricted cubic splines) and sex.
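The two contrasts are closely linked: length of stay in a state up to time t is the area under that state's occupancy probability curve from 0 to t. The sketch below illustrates this with a simple trapezoidal rule. The probability values are made up for the illustration and are not study estimates, and the study itself obtained these quantities from the fitted models rather than from a coarse grid.

```python
def length_of_stay(times, probs):
    """Trapezoidal integral of a state-occupancy curve: expected time spent in
    the state over [times[0], times[-1]], in the units of `times`."""
    return sum((t1 - t0) * (p0 + p1) / 2
               for t0, t1, p0, p1 in zip(times, times[1:], probs, probs[1:]))


# Hypothetical probabilities of being in first remission (not study data).
years = [0.0, 0.5, 1.0, 1.5, 2.0]
stage_i = [1.00, 0.96, 0.93, 0.91, 0.90]
stage_iv = [1.00, 0.88, 0.80, 0.76, 0.74]

# Difference in the 2-year probability of being in remission.
diff_prob = stage_iv[-1] - stage_i[-1]

# Difference in 2-year length of stay, converted from years to days.
diff_los_days = (length_of_stay(years, stage_iv)
                 - length_of_stay(years, stage_i)) * 365.25

print(f"Difference in probability: {100 * diff_prob:.1f}% units")
print(f"Difference in length of stay: {diff_los_days:.1f} days")
```

A negative difference in length of stay therefore summarises the whole gap between the two curves over the first 2 years, not only the gap at the 2-year mark.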

Lastly, the assumption of proportional hazards was formally tested using likelihood ratio tests by comparing nested models with and without interaction effects between age and year of diagnosis, respectively, and time. For the variables of main interest (sex, stage, WHO performance status, S-LDH, number of extranodal sites and IPI) we included interaction terms with time to accommodate non-proportional hazards irrespective of the level of significance in order not to impose restrictions to the transition rates.
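In generic form, such a likelihood ratio test compares the maximised log-likelihoods of the two nested models. The symbols below are introduced for illustration only.

```latex
% \ell_0: maximised log-likelihood without the covariate-by-time interaction
% \ell_1: maximised log-likelihood with the interaction
% d: number of additional parameters in the larger model
\[
  \mathrm{LR} = 2\,(\ell_1 - \ell_0)
  \;\overset{\text{approx.}}{\sim}\; \chi^2_d
  \quad \text{under proportional hazards}
\]
```

A small p-value indicates that allowing the covariate effect to vary over time improves the fit, i.e. evidence against proportional hazards for that covariate.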

All statistical analyses were performed using Stata v.17 (StataCorp. 2019. Stata Statistical Software: Release 17. College Station, TX: StataCorp LLC) and the Stata packages merlin and multistate. More details of the statistical analysis and modelling approach, together with Stata code and a simulated dataset, are available in the Statistical Appendix.

Results

A total of 2941 patients who responded to first-line curative treatment were included in the study. The median age at diagnosis was 67 years and 44% were women (Table 1). The vast majority received CHOP in combination with rituximab (91%) and of all patients, 90% completed at least six cycles. The median follow-up was 5 years (range: 0–10.6 years). Of the 538 (18%) patients who experienced a relapse during follow-up, 385 (72%) relapsed within 2 years, and 33 (6% of relapses, 1% of all patients) relapsed after more than 5 years. Among the patients who relapsed and had their relapse records reviewed (471), 208 (44%) responded to relapse treatment, i.e., had a second remission, and out of those, 105 (50%) later had a second relapse.

Table 1 Clinical characteristics for 2941 patients diagnosed with Diffuse large B-cell lymphoma in Sweden between 2007 and 2014, and who achieved complete/partial remission after first-line curative treatment (at least three cycles).

In the entire study cohort, the proportion of patients who remained in first remission dropped substantially during the first year after treatment completion, from 100% at the start of follow-up to 86.1% (95% CI: 84.6–87.4) (Fig. 2a). At 2 years, 80.6% (95% CI: 78.9–82.2) of the patients remained in first remission and an additional 2.8% (95% CI: 2.1–3.6) were in second remission after relapse treatment. A total of 13.2% (95% CI: 11.9–14.6) had relapsed, whereas 6% had died whilst in first remission. From the second year onwards, the annual decline in the proportion of patients still alive and in first remission stabilised at ~3–4% units per year (Fig. 2a). Five years after first-line treatment completion, 69.5% (95% CI: 67.7–71.3) of all patients were alive and in a lasting remission from first-line treatment and in total 17.4% (95% CI: 16.0–18.9) had relapsed.

Fig. 2: Probabilities of being in a given state by time since first remission (in years) plotted together with the probability of having transitioned to a subsequent state (i.e., where did the patients leave to).

Panel a shows the probability of being in remission and the probability of subsequently having transitioned into either first relapse or death. Panel b shows the probability of being in the state “first relapse” and the probability of subsequently having transitioned into either second remission or death. Panel c shows the probability of being in “second remission” after relapse treatment and the probability of subsequently having transitioned into a second relapse or death. Panel d shows the probability of being in a second relapse and subsequently having transitioned into death (the only possible transition). Note: in order to provide greater detail to the figure, the scale of the y axis differs between panels.

The relapse risk peaked around 7 months after first remission, and at 2 years after remission, 6.2% (95% CI: 5.2–7.5) of the entire cohort had died following a relapse (without achieving a second remission) while 5.3% (95% CI: 4.2–6.6) had achieved a second remission. For patients achieving a second remission, the majority transitioned into a second relapse (Fig. 2c).

The prognosis varied by clinical risk factors. When stratifying patients by age and age-adjusted IPI score (aaIPI < 2 versus aaIPI ≥2), the initial drop of patients in remission was more pronounced in the latter group (irrespective of age at diagnosis). For example, 88.4% (95% CI: 85.4–90.9) and 84.0% (95% CI: 80.4–87.0) of patients aged 61–70 and 71–80 years with aaIPI<2 remained in remission after two years, compared to 78.3% (95% CI: 74.3–81.2) and 65.7% (95% CI: 60.8–70.3) for patients with the same age and aaIPI≥2 (Table 2). This initial drop in the probability of remission was a reflection of the relapse risk early during follow-up (Fig. 3). At 2 years, a total of 8.6% (95% CI: 5.6–13.0) and 9.2% (95% CI: 6.6–12.5) of patients aged 61–70 and 71–80 years with aaIPI<2 had experienced a relapse compared to 16.7% (95% CI: 13.3–20.7) and 21.2% (95% CI: 17.4–25.5) of patients with aaIPI ≥2. Naturally, the proportion of patients who died when in remission from their DLBCL was higher in the older age groups.

Table 2 Two-year probabilities (%) of being in (b) or having visited (v) a given state by age group and age-adjusted IPI.
Fig. 3: Probabilities of being in a given state at different times since first remission (in years) by age group and age-adjusted international prognostic index (aaIPI).

a Age ≤60 years, aaIPI <2; b age ≤60 years, aaIPI ≥2; c age 61–70 years, aaIPI <2; d age 61–70 years, aaIPI ≥2; e age 71–80 years, aaIPI <2; f age 71–80 years, aaIPI ≥2; g age >80 years, aaIPI <2, and h age >80 years, aaIPI ≥2.

The probability of remaining in first-line remission for 2 years or longer was slightly higher for women than for men (Fig. 4). In contrast, advanced stage (Stage III–IV), performance status ≥1, elevated S-LDH, one or more extranodal sites and IPI score >1 were associated with a lower probability of remission at 2 years (after adjustment for age at diagnosis, calendar year and sex). For example, the chance of a lasting remission (at least 2 years) was reduced by 10.3% units (95% CI: −12.9, −7.6) for patients with IPI scores of 2–3 and by 20.4% units (95% CI: −25.0, −15.7) for those with IPI scores of 4–5 compared to those with IPI scores of 0–1.

Fig. 4: Impact of clinical patient and DLBCL factors on the probability of being in remission and length of stay.

The clinical characteristics are contrasted in terms of the difference in the 2-year remission transition probability and length of stay in days. All contrasts are adjusted for age at diagnosis, calendar year of diagnosis and sex. a Difference in transition probability at 2 years; b difference in length of stay at 2 years.

To translate these probabilities into the amount of time lost in the remission state, we estimated the impact of the clinical factors on the 2-year difference in length of stay. As an example, across the first 2 years of follow-up, poor performance status (≥2) and Stage IV disease both shortened the time in remission by ~3 months compared to asymptomatic patients and Stage I patients, respectively (Fig. 4 and Supplementary Table 1). The patients with the highest IPI scores (4–5) lost 3.5 months (out of a potential 2 years) in remission (difference in length of stay: −107 days, 95% CI: −135.2, −80.0), and patients with IPI scores 2–3 lost 1.6 months (difference in length of stay: −50.1 days, 95% CI: −64.1, −36.1) compared to patients with the lowest IPI score.
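The conversion from days to months behind these figures can be reproduced as follows. The average month length of 365.25/12 days is an assumed conversion factor, since the text does not state which one was used.

```python
DAYS_PER_MONTH = 365.25 / 12  # average month length; an assumed conversion factor


def days_to_months(days: float) -> float:
    """Convert a difference in length of stay from days to months."""
    return days / DAYS_PER_MONTH


# Point estimates of days lost in remission over 2 years, as reported in the text.
contrasts = [
    ("IPI 4-5 vs 0-1", 107.0),
    ("IPI 2-3 vs 0-1", 50.1),
    ("Stage IV vs Stage I", 84.7),
]

for label, days in contrasts:
    print(f"{label}: {days_to_months(days):.1f} months")
```

With this factor, 107 days and 50.1 days round to 3.5 and 1.6 months, in line with the figures quoted above.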

Discussion

Overall, DLBCL patients achieving remission following first-line immunochemotherapy have a good long-term prognosis. We found that 8 of 10 patients were alive and relapse-free at 2 years after end of treatment. However, 13.2% had experienced a relapse, leading to a much worse prognosis. In fact, only 40% of those who relapsed within 2 years also achieved a second remission in that time (5.3% of the whole cohort), and the majority of patients achieving a second remission later transitioned into a new relapse. We also quantified the impact of well-known risk factors on the real-world probability of remaining in remission at 2 years after treatment and found that patients with, e.g., the highest IPI scores (4–5) had a probability that was 20% units lower, and spent 3.5 months less in remission, than patients with the lowest IPI scores (0–1).

The estimated relapse risks from the multistate model were well in line with previously reported numbers in patients achieving first-line remission [1, 10]. The majority of relapses in DLBCL occur within the first 2 years [10, 20, 21] and several studies have pointed to the importance of reaching 2 years of relapse-free disease as a milestone for a future favourable prognosis. In fact, several population-based studies have shown that patients reaching this important milestone have a life expectancy that resembles that of the general population, with only minimal life loss thereafter [7, 9, 10]. These findings have led to changes in several international clinical guidelines of recommended follow-up schemes where the previous recommendation of 5 years of clinical follow-up has been shortened to 2 years for patients remaining in remission at that point [22,23,24].

The difficulties in treating relapsing patients, and the poor prognosis after relapse, are reflected in this study by the relatively low proportion of patients who enter a second remission as well as the fact that the majority of patients in the second remission transitioned into a second relapse. The standard-of-care for younger and fit patients at first relapse is still salvage multiagent chemotherapy followed by consolidation with autologous stem cell transplantation (ASCT). However, about half of the patients are not eligible for such aggressive treatment [25] and only half of those who are eligible eventually undergo ASCT [26]. Of those who complete the treatment, 35–45% relapse again [27,28,29]. The prognosis for patients who are ineligible for ASCT is poor [30]. More recently, the standard-of-care for patients with early relapse or primary refractory disease has been challenged, as CAR-T (chimeric antigen receptor T cell) therapies have shown benefits compared to ASCT (ZUMA-7, Locke F NEJM 2021). CAR-T cell therapy has been available for selected patients in Sweden at second or later relapses in the last few years, but was not in use in routine care during the study period. Therefore, the current population-based results serve as a benchmark of outcomes before the CAR-T era.

The probability of remaining in remission, obtained via this multistate modelling framework, is closely related to measures widely used in clinical trials: progression-free survival (PFS) (defined as time from randomisation to first progression or death from any cause) and disease-free survival (DFS) (defined as time from randomisation to disease recurrence or death from any cause) [31]. However, the multistate model goes further, as it is not limited to the estimated time to first event. In contrast to studies that have investigated patient outcomes in isolation (e.g., relapse or death), we adopted a multistate model approach to gain insights into the interplay of events that the patients may encounter, including death, second remission and second relapses.

Even in this large population-based cohort, the numbers of patients in remission after the first relapse and subsequently experiencing a second and third relapse were small, which prevented us from exploring in more detail the role of prognostic factors beyond the second relapse in this multistate model approach. Another limitation was that the Swedish lymphoma register does not record molecular data, e.g. cell of origin, or MYC/BCL-2/BCL-6 translocations. Even so, the IPI score can be viewed as a surrogate for biological heterogeneity [25, 32] and, although it was developed in the 1990s [11], it is still used in clinical practice and has been shown to be a robust clinical prognostic score also in the rituximab era [33].

By using register data in combination with new methods for analysing the course of disease events, we can gain understanding and provide measures that are useful for risk communication and health care planning. In this population-based study, we present a comprehensive overview of the real-world prognosis and patient trajectories for patients in remission after DLBCL in the rituximab era. Our results illustrate how the probability of a lasting remission vis-a-vis the risk of relapse and death evolves as a function of follow-up time and by established prognostic factors.

To conclude, we found that the prognosis for patients responding to first-line treatment was overall favourable, as over 80% had durable first remissions of at least 2 years. However, more than one in eight patients are expected to relapse before reaching this milestone and a majority of those reaching a second remission later transition into a second relapse.