Association of progression-free or event-free survival with overall survival in diffuse large B-cell lymphoma after immunochemotherapy: a systematic review

To investigate progression-free survival (PFS) and event-free survival (EFS) as early efficacy endpoints in diffuse large B-cell lymphoma (DLBCL), this systematic review included phase III randomized controlled trials (RCTs), phase II trials, and retrospective studies of patients with newly diagnosed DLBCL receiving rituximab-containing chemotherapy, identified through database searches up to 2019. Quality control was performed, and studies with high risk of bias were excluded. Prediction models were first established using the RCTs, and then externally validated in the phase II and retrospective populations. Trial-level surrogacy analysis was conducted by correlating the logarithmic (log) hazard ratio (HR) for PFS or EFS with the log HR for overall survival (OS). Correlation analysis at treatment arm-level was performed between 1-, 2-, 3-, and 5-year PFS or EFS rates and 5-year OS. The correlation was evaluated using the Pearson correlation coefficient r in weighted linear regression, with weight equal to the number of patients. Sensitivity analyses were performed to assess the consistency of the predictive model by leaving one subgroup of trials out at a time. Twenty-six phase III RCTs, 4 phase II trials, and 47 retrospective studies were included. In trial-level surrogacy, PFS (r, 0.772; 95% confidence interval [CI], 0.471–0.913) and EFS (r, 0.838; 95% CI, 0.625–0.938) were associated with OS. For rituximab immunochemotherapy treatment arms in RCTs, there was a linear correlation between 1- to 5-year PFS (r, 0.813–0.873) or EFS (r, 0.853–0.931) and 5-year OS. Sensitivity analysis demonstrated reasonable overall consistency. The correlation between PFS and OS was externally validated using independent phase II and retrospective data (r, 0.795–0.897). We recommend PFS and EFS as earlier efficacy endpoints in patients with DLBCL primarily treated with rituximab-containing immunochemotherapy.


Introduction
Diffuse large B-cell lymphoma (DLBCL) is the most common aggressive lymphoma subtype. Immunochemotherapy, mostly with rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP), has become the standard treatment over the past decade [1][2][3][4]. However, 15-40% of patients are refractory to initial immunochemotherapy, or relapse after complete response (CR). Such patients have poor outcomes, mainly depending on the risk group [5]. There is an urgent need to find more effective agents or regimens for high-risk patients in the immunochemotherapy era.
These authors contributed equally: Jie Zhu, Yong Yang
Overall survival (OS) is the gold-standard treatment endpoint in randomized controlled trials (RCTs). However, OS as the primary endpoint requires a large sample size and long follow-up time to observe the survival benefit, leading to high clinical development costs and delays in introducing novel drugs. When used as the primary endpoints in clinical trials, early efficacy endpoints such as progression-free survival (PFS) and event-free survival (EFS) may require a smaller sample size and shorter evaluation time than OS, and have been established in some malignancies [6][7][8]. Trial- and individual-level studies have demonstrated that 24-month PFS and EFS may be considered early efficacy endpoints for OS in DLBCL [9][10][11][12]. However, these studies may not be comprehensive because they included only the 13 available RCTs whose investigators were willing to disclose individual patient data, and were therefore based on a subset of all potentially eligible trials [1][2][3][4][12][13][14][15][16][17][18][19][20][21]. The association of PFS or EFS with OS has not been specifically addressed at trial- or treatment arm-level in RCTs on patients treated with immunochemotherapy; furthermore, its association and predictive value have not been externally validated. We investigated PFS and EFS as efficacy endpoints in DLBCL in the rituximab era through literature-based analysis at both trial- and treatment arm-level. The correlation between PFS and OS was validated in independent cohort studies to confirm its significant role in guiding clinical practice.

Inclusion and exclusion criteria
This study was exempted from review by the institutional review board because it used existing data and enrolled no human subjects. Eligible studies were phase III RCTs, phase II trials, and retrospective studies investigating the long-term survival of DLBCL patients who received first-line rituximab-containing immunochemotherapy. Studies were excluded if they met any of the following conditions: phase I trial; transformed or relapsed/refractory DLBCL; inadequate survival data; patients serologically positive for HIV, hepatitis B/C virus, or Epstein-Barr virus; sample size of <100 patients per arm; or patients with DLBCL constituting <80% of the total sample.
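The exclusion rules above can be read as a simple screening predicate. The following is a minimal sketch only: the `Study` record and all of its field names are hypothetical and were not part of the review, which screened studies manually.

```python
from dataclasses import dataclass

# Hypothetical record; the field names only mirror the stated criteria.
@dataclass
class Study:
    design: str                  # "phase III RCT", "phase II", "retrospective", or "phase I"
    first_line_rituximab: bool   # first-line rituximab-containing immunochemotherapy
    transformed_or_relapsed: bool
    adequate_survival_data: bool
    virus_positive_cohort: bool  # HIV, hepatitis B/C virus, or Epstein-Barr virus
    smallest_arm_size: int       # patients in the smallest arm
    dlbcl_fraction: float        # share of DLBCL patients in the whole sample

ELIGIBLE_DESIGNS = {"phase III RCT", "phase II", "retrospective"}

def is_eligible(s: Study) -> bool:
    """Return True only if none of the exclusion conditions applies."""
    if s.design not in ELIGIBLE_DESIGNS:
        return False
    if not s.first_line_rituximab or s.transformed_or_relapsed:
        return False
    if not s.adequate_survival_data or s.virus_positive_cohort:
        return False
    if s.smallest_arm_size < 100 or s.dlbcl_fraction < 0.80:
        return False
    return True

# Two made-up studies for illustration.
example = Study("phase III RCT", True, False, True, False, 150, 0.95)
too_small = Study("retrospective", True, False, True, False, 80, 1.00)
print(is_eligible(example), is_eligible(too_small))
```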

Literature search
Studies published before 31 December 2019 were identified via a systematic literature search of MEDLINE, Embase, and PubMed using the keyword "DLBCL AND rituximab", restricted to RCTs, phase II trials, and retrospective studies. Formal publications and meeting abstracts were included. Two authors (J.Z. and J.T.) conducted the literature search independently and reviewed the results with a third author (S.N.Q.). When there was disagreement about study inclusion, J.Z., J.T., and S.N.Q. carefully reviewed the potentially eligible study again, and the disagreement was resolved by consensus.

RCT inclusion and quality control
All potentially eligible RCTs were assessed for risk of bias in seven domains (random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other bias) using the Cochrane Collaboration tool. All information used in the assessment was acquired from formal publications, meeting abstracts, trial registry information on ClinicalTrials.gov (www.clinicaltrials.gov), and e-mail contact with trial designers. RCTs with high risk of bias in any domain were excluded.

Phase II trial and retrospective study inclusion and quality control
To validate the RCT findings, we analyzed the relationship between PFS and OS using phase II and retrospective data. For single-arm phase II trials and retrospective cohort studies, quality was assessed, with a maximum 9-star score, using the Newcastle-Ottawa scale (NOS) in terms of selection, comparability, and outcome [38]. Studies with low to moderate risk of bias (≥6 stars) were included in the statistical analysis. For the LNH2007-3B randomized phase II trial [39], the risk of selection bias was assessed using the Cochrane Collaboration tool.
A total of 1129 abstracts were reviewed. After excluding 865 unqualified records, the full texts of 264 records were reviewed. We excluded 203 ineligible studies, and included 61 studies in the quality assessment (Supplemental Table 2). After excluding 10 studies with high risk of bias, a total of 47 retrospective studies and 4 phase II trials with 67 rituximab immunochemotherapy treatment arms were included in the external validation (Fig. 1b). The average NOS score was 6.9 stars. A total of 14,936 patients were included, with each arm containing 100-1322 patients (median, 177). The median follow-up time ranged from 1.2 to 7.2 years (Table 2).
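As a quick consistency check on the selection flow reported above, the counts can be recomputed step by step. This is only an arithmetic sketch using the numbers stated in the text; the variable names are illustrative.

```python
# Counts taken from the text; variable names are illustrative only.
abstracts_reviewed = 1129
excluded_on_abstract = 865
full_texts_reviewed = abstracts_reviewed - excluded_on_abstract

excluded_on_full_text = 203
quality_assessed = full_texts_reviewed - excluded_on_full_text

excluded_high_risk_of_bias = 10
included = quality_assessed - excluded_high_risk_of_bias

retrospective_studies, phase_ii_trials = 47, 4

print(full_texts_reviewed, quality_assessed, included)
# The included total should equal the sum of the two study types.
print(included == retrospective_studies + phase_ii_trials)
```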

Endpoint definition
In the RCTs [1-4, 13-19, 22-36, 39], OS was defined as the time from randomization to death from any cause. EFS was defined heterogeneously, but generally from randomization to any treatment failure, including disease progression, death, and treatment discontinuation for any reason (e.g., adverse effects or withdrawal). PFS was generally measured from the time of randomization to disease progression, relapse, or death from any cause (Supplemental Table 3). In the retrospective studies, OS was generally defined as the time from diagnosis or treatment to death from any cause, and PFS as the time from diagnosis or treatment to disease progression, relapse, or death from any cause (Supplemental Table 4).

Data extraction
In the RCTs, patient characteristics, sample size, follow-up period, primary endpoint, standard and treatment arms, hazard ratio (HR), absolute EFS/PFS rates (year 1, 2, 3, 5), and 5-year OS were extracted (Table 1). For a repeatedly reported RCT, we included the most recent result with the longest follow-up time. All results of the standard and treatment arms were based on the intention-to-treat population. For the phase II trials and retrospective studies, patient characteristics, sample size, median follow-up time, treatment, absolute PFS rates (year 1, 2, 3, 5), and 5-year OS were extracted (Table 2). As described previously [90], the HRs or survival rates at the different time points were extracted from the full text (labeled "*") or from the Kaplan-Meier survival curve using Engauge Digitizer software.

Correlation evaluation
The correlation analyses of the RCTs, weighted by trial size, were performed at both trial- and rituximab immunochemotherapy arm-level; treatment arms using conventional CHOP (like) regimens were not included in the arm-level analysis. At trial-level, the correlation of log HR (PFS) or log HR (EFS) with log HR (OS) was estimated using the Pearson correlation coefficient r in weighted linear regression, with weight equal to trial sample size. At rituximab immunochemotherapy arm-level, the linear correlation between the 1-, 2-, 3-, and 5-year PFS or EFS rates and the 5-year OS rate was also evaluated by the correlation coefficient r, with weight equal to the sample size of each treatment arm. A value of r close to 1 indicated a strong association, and the 95% confidence intervals (CIs) of r were obtained using the bootstrap method with 1000 replications.
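The weighted correlation and bootstrap interval described above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the authors' SPSS workflow: the hazard ratios and sample sizes are made up, and the percentile bootstrap that resamples whole trials is one common choice (the text does not specify which bootstrap variant was used).

```python
import math
import random

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)

def bootstrap_ci(x, y, w, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r, resampling trials with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        if len(set(idx)) < 3:
            continue  # skip degenerate resamples with too few distinct trials
        stats.append(weighted_pearson([x[i] for i in idx],
                                      [y[i] for i in idx],
                                      [w[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical trial-level data (NOT taken from the review).
log_hr_pfs = [math.log(h) for h in (0.60, 0.75, 0.90, 1.00, 1.10, 0.85)]
log_hr_os = [math.log(h) for h in (0.70, 0.80, 0.95, 1.02, 1.05, 0.92)]
n_patients = [400, 250, 600, 320, 500, 280]

r = weighted_pearson(log_hr_pfs, log_hr_os, n_patients)
ci_low, ci_high = bootstrap_ci(log_hr_pfs, log_hr_os, n_patients)
print(f"r = {r:.3f}; 95% CI, {ci_low:.3f} to {ci_high:.3f}")
```

The same two functions apply unchanged at arm level by passing landmark PFS or EFS rates, 5-year OS rates, and arm sizes instead of log HRs and trial sizes.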

Sensitivity analysis
Phase III RCTs were classified into five subgroups according to study purpose. To assess the consistency and robustness of the developed predictive model across different settings, sensitivity analyses were performed by leaving one subgroup of trials out at a time. The correlation coefficient r and its 95% CI were reported in the same way for the trial-level and treatment arm-level correlations.
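A leave-one-subgroup-out analysis of this kind can be sketched as below. The subgroup labels, log HR values, and sample sizes are hypothetical placeholders; only the procedure (drop one subgroup, recompute the weighted r on the remaining trials) follows the description above.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)

def leave_one_subgroup_out(trials):
    """Recompute the weighted r with each subgroup removed in turn."""
    results = {}
    for left_out in sorted({t["subgroup"] for t in trials}):
        kept = [t for t in trials if t["subgroup"] != left_out]
        results[left_out] = weighted_pearson(
            [t["log_hr_pfs"] for t in kept],
            [t["log_hr_os"] for t in kept],
            [t["n"] for t in kept],
        )
    return results

# Hypothetical trials: (subgroup label, log HR for PFS, log HR for OS, sample size).
rows = [
    ("subgroup_1", -0.51, -0.36, 400), ("subgroup_1", -0.43, -0.33, 820),
    ("subgroup_2", -0.11, -0.05, 600), ("subgroup_2", 0.02, 0.04, 350),
    ("subgroup_3", -0.16, -0.08, 280), ("subgroup_3", 0.10, 0.05, 500),
    ("subgroup_4", -0.29, -0.22, 250), ("subgroup_4", -0.05, 0.01, 310),
    ("subgroup_5", 0.00, 0.02, 320), ("subgroup_5", -0.20, -0.17, 450),
]
trials = [dict(subgroup=s, log_hr_pfs=p, log_hr_os=o, n=n) for s, p, o, n in rows]

sensitivity = leave_one_subgroup_out(trials)
for name, r_value in sensitivity.items():
    print(f"without {name}: r = {r_value:.3f}")
```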

External validation of RCT prediction model in phase II trials and retrospective studies
We validated our findings by applying the predictive linear regression models to the phase II and retrospective studies with adequate survival data. The predicted 5-year OS rate was calculated from the actual 1-5-year PFS rates in the phase II or retrospective studies using the linear regression models established from the RCTs. For example, the equation "5-year OS = α × 1-, 2-, 3-, or 5-year PFS + β" was derived from the RCTs, and the reported 1-5-year PFS rates from the phase II and retrospective studies were entered into these models to generate the predicted 5-year OS rates. The actual and predicted 5-year OS rates were plotted in scatter plots. Statistical analysis was performed in SPSS (version 21.0, IBM Inc.); data visualization was performed using the ggplot2 package in R software (version 3.3.2, R Foundation for Statistical Computing).
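The derive-then-validate procedure described above can be illustrated with a short sketch. All numbers are invented for illustration and the fitted α and β are not the review's coefficients; the closed-form weighted least-squares fit is a standard stand-in for the SPSS regression, and the unweighted correlation between actual and predicted OS is an assumption, since the text does not state whether that step was weighted.

```python
import math

def fit_weighted_line(x, y, w):
    """Weighted least squares for y = alpha * x + beta."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    alpha = sxy / sxx
    beta = my - alpha * mx
    return alpha, beta

def pearson(a, b):
    """Unweighted Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical RCT arms: 2-year PFS (%), 5-year OS (%), and arm size.
rct_pfs2 = [62.0, 68.0, 71.0, 75.0, 80.0, 84.0]
rct_os5 = [58.0, 63.0, 69.0, 70.0, 77.0, 82.0]
rct_n = [120, 300, 210, 450, 180, 260]

# Step 1: derive the model "5-year OS = alpha * 2-year PFS + beta" from the RCT arms.
alpha, beta = fit_weighted_line(rct_pfs2, rct_os5, rct_n)

# Step 2: apply it to hypothetical validation arms and compare with actual OS.
val_pfs2 = [65.0, 72.0, 78.0, 83.0]
val_os5 = [60.0, 68.0, 73.0, 80.0]
predicted_os5 = [alpha * p + beta for p in val_pfs2]

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
print(f"r(actual, predicted) = {pearson(val_os5, predicted_os5):.3f}")
```

Because the prediction is a linear function of PFS with a positive slope, the correlation between actual and predicted OS equals the correlation between PFS and actual OS in the validation arms; agreement with the line of identity additionally depends on α and β transferring across cohorts.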

Data sharing statement
For original data, please contact yexiong12@163.com.

Trial-level correlation between treatment effects of PFS or EFS on OS in RCTs
Of 26 RCTs (Table 1), 20 (77%), 1 (4%), and 1 (4%) reported one, two, and three pairs of PFS HR and OS HR, respectively. A significant correlation was observed after [...] (Fig. 3b). Sensitivity analyses demonstrated good consistency in most subgroups, except when leaving out the R-CHOP (like) vs. CHOP (like) subgroup (r = 0.732; 95% CI, 0.278-0.941), for reasons similar to those for PFS (Supplemental Fig. 1b). These results confirm that treatment gain in PFS or EFS can predict OS benefit at trial level with acceptable consistency. Overall, the sensitivity analyses continued to demonstrate robust consistency in terms of the correlation coefficient r. When the 10 trials in the R-CHOP (like) with rituximab + intensified/de-escalated chemotherapy subgroup, which account for nearly half of all treatment arms, were left out (Supplemental Fig. 1c-f), the findings remained consistent, with wider confidence intervals owing to the reduced number of arms.

External validation of association of PFS with OS in Phase II and retrospective studies
Sixty-seven treatment arms from the phase II and retrospective studies were used for external validation. As EFS was not available in the retrospective studies, only PFS prediction models could be evaluated. Using the PFS predictive models from the RCTs (Fig. 4), we calculated the predicted 5-year OS rate for each retrospective study using the actual 1-, 2-, 3-, or 5-year PFS rate (Table 2). The simple regression line between the actual and predicted 5-year OS approached the diagonal line, indicating that the predicted OS approximated the actual OS. The predicted 5-year OS rate correlated significantly with the actual 5-year OS rate, with the correlation coefficient r ranging from 0.795 to 0.897 (Fig. 6a-d). This finding validates the premise that PFS is predictive of OS.
Chemotherapy regimens: R, rituximab; R-ACVBP, rituximab, doxorubicin, cyclophosphamide, vindesine, bleomycin, and prednisone; R-chemo, rituximab-based chemotherapy; R-CHOP, rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone; R-COMP, rituximab, cyclophosphamide, non-pegylated liposomal doxorubicin, vincristine, and prednisone; R-CyclOBEAP, rituximab, cyclophosphamide, vincristine, bleomycin, etoposide, doxorubicin, and prednisolone. aaIPI age-adjusted International Prognostic Index, AGR albumin globulin ratio, ALI advanced lung cancer inflammation index, Arg72 arginine at codon 72, ASCT autologous stem cell transplantation, B2MG beta-2 microglobulin, BM bone marrow, DM diabetes mellitus, FU follow-up, GC germinal center, GCB germinal center B-cell, ICPS inflammation-based cumulative prognostic score, IPI International Prognostic Index, LMR lymphocyte-to-monocyte ratio, NA not available, NCCN-IPI National Comprehensive Cancer Network-IPI, No. number of patients, NOS Newcastle-Ottawa scale, OS overall survival, PET/CT positron emission tomography/computed tomography, PFS progression-free survival, PNI prognostic nutritional index, RT radiotherapy, sIL-2Rα soluble interleukin-2 receptor-α, SNP single-nucleotide polymorphism.

Discussion
This is a large-scale, comprehensive study combining data from high-quality phase III RCTs, phase II trials, and retrospective studies to assess the association of the early efficacy endpoints PFS and EFS with OS in patients with DLBCL primarily treated with immunochemotherapy. Consistent with previous findings [9][10][11][12], analyses of the 26 qualified RCTs showed that improved PFS or EFS correlated with OS benefit at trial level. There was a linear correlation between 1-5-year PFS or EFS and 5-year OS rates at treatment arm level. The comprehensive sensitivity analyses indicated an acceptable overall consistency of the developed predictive model across settings. The external validation showed good calibration between the actual and predicted 5-year OS rates based on the 1-5-year PFS rates in the phase II and retrospective studies. These findings provide new evidence supporting the clinical use of PFS and EFS as early efficacy endpoints for evaluating treatment benefit and accelerating approval of superior treatments.
Previous studies, primarily using 13 RCTs conducted before 2015, concluded that the early efficacy endpoints of EFS or PFS are strongly related to OS at both individual and trial level [9][10][11]. The survival of DLBCL patients who achieved PFS or EFS at 24 months is almost equal to that of the age- and sex-matched general population [9][10][11][12]. Therefore, 2-year EFS or PFS are accepted as early efficacy endpoints. Although the use of individual patient data allows better characterization of important covariates that affect survival, it restricts the analysis to a limited number of RCTs, and the analysis is not easily replicated by independent researchers. In most recently published trials and in clinical practice, there are multiple effective agents not only as initial treatment but also in second-line or salvage settings. Any validation of an early efficacy endpoint is relevant only within the context in which the validation occurred. These factors prompted re-examination and external validation of the correlation between PFS or EFS at the given time points and OS. The present literature-based analysis relied on data from RCTs, phase II trials, and retrospective studies to assess the validity of the early efficacy endpoints, and represents a critical step toward understanding the impact of immunochemotherapy on PFS or EFS and OS in DLBCL. With strict inclusion criteria and quality control, we included large-scale, qualified RCTs for trial-level surrogacy analysis, and phase II trials and retrospective studies for external validation. The correlation of PFS or EFS with OS was well established for DLBCL at both the trial and treatment arm level from the RCTs. Furthermore, the correlation between 1-5-year PFS and OS was externally validated by analyzing the phase II and retrospective data. Consistent with previous studies [9][10][11][12], these results highlight the significant role of PFS and EFS as early efficacy endpoints in designing prospective trials.
[...] (Fig. 4), the predicted 5-year OS, as calculated according to the actual 1-, 2-, 3-, and 5-year PFS from the phase II trials and retrospective data (Table 2), is plotted against the actual 5-year OS. The predicted OS approximates the actual OS, as indicated by the regression line approaching the diagonal line, i.e., the line of identity; r indicates the correlation coefficient. PFS progression-free survival; OS overall survival.
As the association of improved PFS or EFS with prolonged OS in DLBCL in this study is straightforward, the use of PFS and EFS as early efficacy endpoints captures not only survival but also treatment-related events, disease relapse, and progression. Compared with long-term OS, dynamic assessment of PFS or EFS at 1-3 years has a lower likelihood of confounding by subsequent or salvage treatment. Innovative treatment strategies with a large magnitude of effect on PFS or EFS for high-risk patients with DLBCL may have a large effect on OS in RCTs. Importantly, we found that PFS or EFS as early as 1 year correlated with 5-year OS at the treatment arm-level, mainly because the majority of patients were at high risk of early relapse and poor post-progression survival. Consistent with this finding, other studies have demonstrated that 70% of disease failures occurred within the first year after treatment, but rarely after 5 years [9,12]. For patients who achieved EFS at 12 and 24 months, the risk of relapse in the next 5 years dropped to 13% and 8%, respectively [9]. If patients experienced progression or relapse within 2 years, the median OS after disease progression was only 7.2 months [11].
The strengths of this study include the quality control design, large sample size, external validation of PFS outcomes, and current standard treatment. First, the data were obtained from high-quality RCTs, phase II trials, and retrospective studies that enrolled large-scale cohorts (>31,000 patients) with newly diagnosed DLBCL uniformly treated with rituximab-containing immunochemotherapy. This allowed us to minimize the selection bias that arises from a limited number of RCTs or from heterogeneity in treatment options. This comprehensive surrogacy study at trial- and treatment arm-level complements previous evidence and strengthens the clinical use of PFS and EFS as early efficacy endpoints. Second, the positive relationships between the 1-5-year PFS and 5-year OS rates were externally validated using independent data that included patients across different countries with varied eligibility criteria, immunochemotherapy regimens, radiotherapy, and follow-up times. As a variety of immunochemotherapy regimens was investigated in a heterogeneous population, we could examine variability in treatment outcomes, which improved the generalizability of our study. Our generation and validation of prediction models describing the association between the 1-5-year PFS and 5-year OS rates is unique. Validation of the RCT-derived models in independent cohorts improved the reliability of the conclusions.
The study limitations include the lack of individual patient data and of standardized definitions of endpoints and follow-up assessments. First, this is a literature-based systematic review without individual patient data; therefore, patient-level surrogacy could not be assessed. Second, precise modeling requires standardized definitions of endpoints and standardized follow-up assessments or surveillance strategies in DLBCL trials, which was not feasible in our study. For example, while PFS was calculated from the date of randomization in RCTs, it was generally calculated from diagnosis or initial therapy in retrospective studies. In addition, EFS events typically comprised PFS events as well as unplanned treatment, treatment discontinuation, and toxic events, because EFS was used to evaluate the safety, toxicity, or compliance of a novel therapy. Moreover, EFS events were defined inconsistently across trials and depended on the trial design and purpose. In clinical practice, the exact date of disease progression is difficult to determine precisely, so the reported PFS or EFS event date naturally depended on the frequency of, and interval between, consecutive clinical visits and imaging assessments. Such inherent heterogeneity in the interval and frequency of assessments across studies can be neither removed nor quantified. Third, the prediction model developed in this study was based on findings in patients treated with anthracycline-based immunochemotherapy, and its extrapolation to other treatments would be speculative. The impact of post-progression management was beyond the scope of this study, and such information is not routinely collected in clinical trials. If more effective salvage treatments emerge and post-progression survival is significantly prolonged in the future, the prediction model should be modified and optimized accordingly. Fourth, the correlation between EFS and OS was not externally validated in the retrospective populations, because EFS is generally not reported in retrospective studies.
In conclusion, our assessment of a large sample of high-quality data for patients with DLBCL provides high-level evidence that PFS and EFS are valid early efficacy endpoints for OS in the immunochemotherapy era.
The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and the decision to submit the paper for publication.
Author contributions YXL, SNQ, and CH designed the study, analyzed the data, and revised the paper. JZ performed the literature search and quality assessment, extracted and analyzed data, and wrote the first draft of the paper. JT performed the literature search. YY, BC, SLW, and JRD analyzed data. CH supervised data analysis.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.