Results of two consecutive treatment protocols in Polish children with acute lymphoblastic leukemia

The aim of the study was to retrospectively compare the effectiveness of the ALL IC-BFM 2002 and ALL IC-BFM 2009 protocols and the distribution of risk groups by the two protocols after minimal residual disease (MRD) measurement as well as its impact on survival. We reviewed the medical records of 3248 patients aged 1–18 years with newly diagnosed ALL who were treated in 14 hemato-oncological centers between 2002 and 2018 in Poland. The overall survival (OS) of 1872 children with ALL treated with the ALL IC 2002 protocol was 84% after 3 years, whereas the OS of 1376 children with ALL treated with the ALL IC 2009 protocol was 87% (P < 0.001). The corresponding event-free survival rates were 82% and 84% (P = 0.006). Our study shows that the ALL IC-BFM 2009 protocol improved the results of children with ALL compared to the ALL IC-BFM 2002 protocol in Poland. This analysis confirms that MRD marrow assessment on day 15 of treatment by FCM-MRD is an important predictive factor.


Scientific Reports | (2020) 10:20168 | https://doi.org/10.1038/s41598-020-75860-6

cancer in which the assessment of early response to therapy by monitoring minimal residual disease (MRD) has been shown to be an essential tool in making therapeutic decisions 1,4. The International Berlin-Frankfurt-Münster Study Group (I-BFM-SG) includes national study groups from over 30 countries around the world that collaborate in working committees to address the essential aspects of childhood leukemia and lymphoma research. The European cooperative groups assessed modifications to essentially all elements of therapy in randomized trials 5. In the 1990s, I-BFM-SG research reported that the detection of MRD at two consecutive time points (on days 33 and 78 of therapy) is helpful for distinguishing patients with a good prognosis (standard risk, MRD-SR) from patients with an intermediate prognosis (intermediate risk, MRD-IR) or a poor prognosis (high risk, MRD-HR) 6. Gradually, the I-BFM group expanded to countries with fewer skills and less experience with intensive chemotherapy schedules, so the study was adapted to local conditions. The ALL IC-BFM 2002 protocol was the first intercontinental randomized clinical trial of the I-BFM-SG and was recommended for countries with limited resources for PCR-based MRD assessment. Based on the pioneering findings of the I-BFM-SG in measuring the early peripheral blood response to prednisone on day 8 and bone marrow blasts on day 15, all patients could be categorized into risk groups by available methods. In the next study protocol, ALL IC-BFM 2009, a new stratification based on MRD evaluation was applied. The ALL IC-BFM group agreed to use only flow cytometry (FCM) for MRD analysis in this study, as the use of PCR-based methods appeared difficult in most participating countries.
One of the main aims of the ALL IC 2009 protocol was to determine whether the outcome of SR patients, defined by the ALL IC 2002 criteria for SR together with an FCM-MRD burden < 0.1% on day 15, would be better than the outcome expected from stratification by the ALL IC 2002 criteria alone 5.
The Polish Pediatric Leukemia/Lymphoma Study Group joined the ALL IC-BFM 2002 study in 2002 and subsequently the ALL IC-BFM 2009 study.
In this study, we report the demographics, incidence and clinical outcomes of children with ALL in the Polish population. The aim of the study was to retrospectively compare the effectiveness of the ALL IC-BFM 2002 and ALL IC-BFM 2009 protocols and the distribution of patients into risk groups under the two protocols after MRD measurement, as well as its impact on survival. We were also interested in whether our analysis would confirm high event-free survival in the risk groups, especially in the SR-MRD group.

Results
Patient characteristics. The demographics of the two groups of patients treated with the ALL IC 2002 (1872 pts) and ALL IC 2009 (1376 pts) protocols are presented in Table 1. There were statistically significant differences between these groups in the following features: average age, WBC count at diagnosis, risk group, BCR/ABL1-positive fusion gene, immunophenotype and BM result on days 15 and 33 of the induction phase. We additionally calculated the effect size for each comparison. Effect size allows the strength of the relationship between variables in each test to be assessed independently of sample size. Based on the weak effect sizes for age, WBC count and immunophenotype, we conclude that the differences in these variables between the groups, although statistically significant (P < 0.05), were not important from a clinical standpoint.
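The point that an effect size is independent of sample size while a test statistic is not can be illustrated with a short sketch. The source does not state which effect-size measure was used; Cramér's V is assumed here purely for illustration, and the counts are hypothetical, not study data.

```python
import math


def cramers_v_and_chi2(table):
    """Return (Cramér's V, chi-square statistic) for a table of counts.

    `table` is a list of rows; all row and column totals must be non-zero.
    Cramér's V is an assumed choice of effect size, not the paper's stated one.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k)), chi2


# Hypothetical counts: rows = protocol, columns = feature present / absent.
small = [[30, 970], [20, 980]]
# Same proportions, ten times as many patients.
large = [[count * 10 for count in row] for row in small]

v_small, chi2_small = cramers_v_and_chi2(small)
v_large, chi2_large = cramers_v_and_chi2(large)

print(f"small cohort: V = {v_small:.4f}, chi2 = {chi2_small:.2f}")
print(f"large cohort: V = {v_large:.4f}, chi2 = {chi2_large:.2f}")
```

With identical proportions, the chi-square statistic grows tenfold with the cohort (and the P value shrinks accordingly), whereas V stays the same. This is why a difference can be statistically significant in a large cohort and still weak in size.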
The difference in the frequency of BCR/ABL1 rearrangement (3.5% for ALL IC 2002 and 1.4% for ALL IC 2009) arose because some of the children with Ph+ ALL were treated with the EsPhALL (European intergroup study on postinduction treatment of Ph+ ALL) protocol, in which imatinib was incorporated into combination chemotherapy regimens from 2012/2013. There were no significant differences between these groups in the remaining features. Most children were male. Splenomegaly and hepatomegaly were reported in the majority of children (over 50%), and mediastinal involvement by leukemic cells was rare. KMT2A (MLL) rearrangements were detected in fewer than 2% of patients. Most patients had a good response to prednisone on day 8.
The ALL IC 2009 protocol. The date of the last follow-up was 31 December 2019. The median follow-up time for the entire group was 3.36 years. A total of 88 (14.3%) deaths were noted in the entire group, including 2 (2.3% of all deaths) in the SR group, 40 (45.5%) in the IR group, and 46 (52.3%) in the HR group. Relative to the size of each risk group, the observed death rate was 1.1% (2/192 pts) in the SR group, 4.5% (38/868 pts) in the IR group, and 15.2% (48/316 pts) in the HR group. The cumulative death risk at 5 years was estimated to be 1.0% (SE 0.007) in the SR group, 6.1% (SE 0.014) in the IR group, and 18% (SE 0.026) in the HR group.
In 14 (16%) cases, death occurred during induction therapy before complete remission (CR) was achieved, and the estimated death rate during induction therapy was 1.02% (0% for the SR group, 0.65% for the IR group, and 0.37% for the HR group). A total of 31 (35%) deaths were noted during the CR phase (death rate of 2.25%, with 0.15% in the SR group, 0.8% in the IR group, and 1.3% in the HR group). Deaths related to relapse were noted in 40 (45.4%) children.
Relapses were observed in 128 children (9.3%), of which 5 (3.9% of all relapses) were in the SR group, 80 (62.5%) were in the IR group, and 43 (33.6%) were in the HR group. The cumulative relapse risk was 4.6% (SE 0.020) in the SR group, 13.6% (SE 0.016) in the IR group, and 20.8% (SE 0.031) in the HR group.
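The crude rates quoted for the ALL IC 2009 cohort appear to be simple proportions of the 1376 treated patients. The sketch below assumes that reading and uses only counts given in the text; the variable and function names are illustrative. Crude proportions of this kind ignore differences in follow-up time, which is why they are not expected to match the cumulative 5-year risk estimates.

```python
# Risk-group sizes quoted in the text for the ALL IC 2009 cohort.
group_sizes = {"SR": 192, "IR": 868, "HR": 316}
cohort_size = sum(group_sizes.values())

# Event counts quoted in the text for the whole cohort.
events = {
    "induction deaths": 14,
    "deaths in CR": 31,
    "relapses": 128,
}


def crude_rate_percent(count, total):
    """Crude rate: events as a percentage of all patients at risk."""
    return 100.0 * count / total


rates = {
    name: round(crude_rate_percent(count, cohort_size), 2)
    for name, count in events.items()
}

print(f"cohort size: {cohort_size}")
for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Under this assumption the three risk groups sum to the full cohort, and the computed proportions reproduce the 1.02%, 2.25% and 9.3% figures reported for induction deaths, deaths in CR and relapses.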
Comparison of the treatment results of both protocols. The overall survival (OS) rates of children with ALL treated with the ALL IC 2002 and ALL IC 2009 protocols were 84% and 87%, respectively, after 3 years in Poland, and the difference was statistically significant (P < 0.001); the corresponding event-free survival (EFS) rates of the two analyzed groups were 82% and 84% after 3 years (P = 0.006). The OS and EFS of the two groups of children with ALL and by risk groups are presented in Figs. 1 and 2, respectively. The cumulative incidence of death (CID) in the whole group was higher with the ALL IC 2002 protocol than with the ALL IC 2009 protocol, and the difference was statistically significant (P < 0.001) (Fig. 3a). The CID analyzed by risk group showed that IR and HR patients treated with the ALL IC 2002 protocol had a higher CID than those treated with the ALL IC 2009 protocol (P < 0.001) (Fig. 3b-d). The cumulative incidence of relapse (CIR) in the whole group was higher for ALL IC 2002, but the difference was not significant (Fig. 4a). The CIR analyzed by risk group showed a significant difference only for SR children (P = 0.02) (Fig. 4b-d).

Discussion
This study is a large retrospective analysis comparing the results of treatment with two protocols, ALL IC 2002 and ALL IC 2009, for children with ALL in Poland. In our cohort, the distribution into risk groups (especially the standard risk group) and the response to induction therapy differed significantly between the two protocols. FCM-based MRD, which was implemented for risk stratification in the ALL IC 2009 protocol, had a substantial impact, leading to more accurate risk group assignments and induction phase outcome predictions. The level of MRD was evaluated only on day 15 of the induction phase of the protocol. MRD is known to be a strong prognostic factor for determining risk groups and, subsequently, the intensity of post-induction therapy (from significant treatment reduction to mild or strong intensification) 5,8–11. MRD measurement in childhood ALL studies is based on molecular (PCR) or flow cytometric methods, and the choice between them depends on the resources available. In Poland, the RQ-PCR method was not available during therapy with the ALL IC 2009 protocol.
This was the first clinical trial in which MRD was determined prospectively during and after remission induction therapy. In this study, MRD was determined in bone marrow specimens by PCR on days 19 and 46 of remission induction and in week 7 of continuation treatment. The authors reported that two of these time points (days 19 and 46) were valuable for the stratification of patients into risk groups 14.
In our study, we present the results of two ALL treatment regimens in Poland between 2002 and 2018. We observed that the overall survival of children treated with the ALL IC 2009 protocol was significantly improved compared to that of children treated with the ALL IC 2002 protocol, and this was particularly pronounced in IR and HR children. Similarly, the event-free survival rate of patients treated with the ALL IC 2009 protocol was higher than that of patients treated with the ALL IC 2002 protocol. Our analysis confirmed significantly higher event-free survival in the SR-MRD group of patients compared to the SR patients according to the ALL IC 2002 criteria. Regarding the cumulative incidence of death and the cumulative relapse rate of the two analyzed groups of children, we noticed that these rates were higher in children treated with the ALL IC 2002 protocol. The improvement in outcomes could be attributed to the introduction of FCM-based MRD assessment, the redefinition of stratification into risk groups and the development of supportive therapy (novel broad-spectrum antibiotics, antifungal drugs).
The results of the ALL IC 2009 protocol are comparable to those of reported studies conducted by major collaborative groups 3,15.
Pui et al. presented a review of the impact of collaborative studies (14 study groups) on the heterogeneity and treatment of ALL in children and teenagers between 1995 and 2011. These studies confirmed that MRD assessment is the most predictive indicator in both B-cell and T-cell ALL. The authors reported the clinical outcomes of several therapeutic regimens. The 5-year EFS ranged from 75.9% (AIEOP-95) to 87.3% (NOPHO-2000 and SJCRH XV), and the 5-year OS ranged from 85.4% (CoALL-97) to 93.5% (SJCRH XV) 16–21.
In conclusion, our study demonstrates that the ALL IC-BFM 2009 protocol improved the clinical outcomes of children with ALL compared to the ALL IC-BFM 2002 protocol in Polish pediatric oncohematology centers. This analysis confirms that the bone marrow assessment on day 15 of treatment (early time point) by FCM-MRD is an important predictive factor. Due to the MRD assessment, many patients were redirected from the SR group to the IR group, resulting in the EFS being higher in the SR group.

Patients and methods
Study group. The study protocols were approved by the Ethics Committee of the Medical University of Lublin.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Informed consent was obtained from the parents or guardians of participants under 18 years of age.
Definition of remission status. The prednisone response in the peripheral blood (PB) and the bone marrow morphological response were evaluated in both protocols. The absolute blast count (ABC) in PB was evaluated on day 8, after 7 days of prednisone pre-phase and one dose of intrathecal methotrexate on day 1. Prednisone-good responders (PGRs) are patients with an ABC on day 8 of < 1000 blasts/μL PB, and prednisone-poor responders (PPRs) are patients with an ABC on day 8 of ≥ 1000 blasts/μL. M1 marrow was defined as bone marrow with < 5% blasts, M2 marrow as bone marrow with 5% to 24% blasts, and M3 marrow as bone marrow with ≥ 25% blasts on the 15th and 33rd days of the induction phase of therapy. Relapse, assessed after the first complete remission, was defined as ≥ 25% blasts in the bone marrow or disease involvement elsewhere.
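The response definitions above can be written as two small classification functions. This is a minimal sketch of the stated thresholds only; the function names are hypothetical, and treating fractional values between 24% and 25% blasts as M2 is an assumption, since the text gives the M2 range as 5% to 24%.

```python
def prednisone_response(abc_per_ul: float) -> str:
    """Classify the day-8 peripheral blood response to prednisone.

    PGR: absolute blast count < 1000/uL; PPR: >= 1000/uL.
    """
    if abc_per_ul < 0:
        raise ValueError("absolute blast count cannot be negative")
    return "PGR" if abc_per_ul < 1000 else "PPR"


def marrow_status(blast_percent: float) -> str:
    """Classify bone marrow by blast percentage (days 15 and 33).

    M1: < 5% blasts; M2: 5% up to (but not including) 25%; M3: >= 25%.
    Values between 24% and 25% are treated as M2 (an assumption).
    """
    if not 0 <= blast_percent <= 100:
        raise ValueError("blast percentage must be between 0 and 100")
    if blast_percent < 5:
        return "M1"
    if blast_percent < 25:
        return "M2"
    return "M3"


print(prednisone_response(850))   # below the 1000/uL cut-off
print(marrow_status(12.0))        # between 5% and 25% blasts
```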
MRD assessment was used only in the ALL IC 2009 protocol. MRD can be detected by PCR (polymerase chain reaction) or FCM (flow cytometry) methods. Real-time quantitative PCR (RQ-PCR)-based MRD assessment detects rearrangements of the immunoglobulin and T-cell receptor genes. It offers high analytical sensitivity (< 10⁻⁵) and is highly standardized, but some patients lack a suitable PCR marker. The methodology is highly complex and costly. At that time, we did not have such specialized laboratories in our country, and the determination of PCR-MRD required the development and implementation of research standards for residual disease, together with the establishment of a specialized network of reference laboratories at children's oncohematology centers throughout the country.
In the ALL IC 2009 protocol, MRD was analyzed by flow cytometry. This method is available in most countries. Flow cytometry is less sensitive (10⁻³ to 10⁻⁴) than PCR-MRD, but its advantages are speed, availability and economic feasibility. The monoclonal antibody panels used in MRD monitoring were designed from the initial immunophenotype of each case by selecting markers conjugated to an eight-color immunostaining panel (EuroFlow 8-color antibody panel). MRD was evaluated in bone marrow samples on day 15 of the induction phase with a sensitivity of 0.01% or better. According to the ALL IC 2009 protocol, patients classified as SR should have FCM-MRD < 0.1%; SR patients with FCM-MRD > 0.1% and < 10% are upgraded to IR, and SR patients with FCM-MRD > 10% are included in the HR group. Patients classified as IR with FCM-MRD > 10% are included in the HR group, and all others remain IR.
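The day-15 FCM-MRD stratification rule can be sketched as a single function. This follows only the thresholds stated in the text. Two points are assumptions, because the text does not specify them: values exactly equal to 0.1% or 10% are assigned to IR, and patients already classified as HR stay in HR. The function name is hypothetical.

```python
def restratify_by_fcm_mrd(initial_group: str, mrd_percent: float) -> str:
    """Return the final risk group after day-15 FCM-MRD assessment.

    initial_group: "SR", "IR" or "HR" before MRD assessment.
    mrd_percent:   FCM-MRD on day 15, as a percentage of marrow cells.
    """
    if initial_group not in {"SR", "IR", "HR"}:
        raise ValueError(f"unknown risk group: {initial_group!r}")
    if not 0 <= mrd_percent <= 100:
        raise ValueError("MRD percentage must be between 0 and 100")

    # Assumption: MRD never downgrades a patient already classified as HR.
    if initial_group == "HR":
        return "HR"
    # SR or IR with FCM-MRD > 10% is moved to HR.
    if mrd_percent > 10:
        return "HR"
    # SR is kept only when FCM-MRD is < 0.1%.
    if initial_group == "SR" and mrd_percent < 0.1:
        return "SR"
    # Everything else (including the boundary values, by assumption) is IR.
    return "IR"


for group, mrd in [("SR", 0.05), ("SR", 2.0), ("SR", 15.0), ("IR", 15.0)]:
    print(group, mrd, "->", restratify_by_fcm_mrd(group, mrd))
```

Written this way, the rule makes explicit that MRD can only keep or raise a patient's risk group, which is consistent with the shift of patients from SR to IR described in the discussion.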
Definition of central nervous system (CNS) status. CNS status 1 was defined as no clinical or imaging evidence of CNS disease and no blasts on cytospin of cerebrospinal fluid (CSF). CNS status 2 was defined as pleocytosis ≤ 5/μL with clearly identified blasts on cytospin of blood-contaminated CSF. CNS status 3 was defined as nontraumatic lumbar puncture with pleocytosis > 5/μL, a mass lesion in imaging studies of the brain and/or meninges, or cranial nerve palsy unrelated to other origin, even if the CSF is blast-free.

Table 2. Patients treated according to these protocols