Treatment response is a strong outcome predictor for childhood acute lymphoblastic leukemia (ALL). Here, we evaluated the predictive impact of flow cytometric blast quantification assays (absolute blast count, BC, and blast reduction rate, BRR) in peripheral blood (pB) and/or bone marrow (BM) at early time points of induction therapy (days 0, 8 and 15) on the remission status in the AIEOP-BFM-ALL 2000 protocol. At the single parameter level (905 patients), the strongest predictive parameter for the remission status, defined as a dichotomous minimal residual disease (MRD) parameter (positive/negative), was the BC at day 15 in BM (cutoff: 17 blasts/μl; 50 vs 15%; odds ratio: 5.6; 95% confidence interval: 4.1–7.6, P<0.001), followed by the BRR at day 15 in BM and by the BC at day 8 in pB (odds ratios: 3.8 and 2.6, respectively). In the multiple regression analysis (440 patients), BC in pB (d0 and d8) and in BM (d15) as well as BRR at day 8 in pB were significantly contributing variables, with an overall correct prediction rate of 74.8%. These data show that the quantitative assessment of early response parameters, especially absolute BCs at day 15 in BM, has a predictive impact on the remission status after induction therapy.
Evaluation of treatment response with morphology, flow cytometry (FCM) or molecular techniques constitutes one of the strongest predictors of outcome in childhood acute lymphoblastic leukemia (ALL).1, 2, 3, 4 Most of the improvements in the sensitivity of response evaluation, and in how early in the course of the disease it can be performed, have been successfully integrated into clinical trials for optimized risk assessment and therapy stratification.5, 6 The cytomorphological assessment of therapy response to a 7-day induction prephase with corticosteroids has provided the first treatment-related parameter for risk assignment and treatment stratification within the AIEOP-BFM-ALL study protocols for almost 20 years.7 Patients with a reduction in the peripheral blood (pB) blast count to less than 1000/μl have a more favorable prognosis (prednisone good responders) than patients with peripheral blast counts (BCs) remaining above this limit (prednisone poor responders).7, 8, 9 Although the day 8 prednisone response is a very robust and clinically well-established parameter for risk assignment, there are some limitations and difficulties to be considered. It identifies only a limited number of patients with a high risk of relapse, and because it relies on an absolute cutoff value, it does not reflect individual blast reduction rates (BRRs). Furthermore, the cytomorphological assessment of the day 8 BC in pB can sometimes be difficult and misleading.
Real-time quantitative PCR is used in most minimal residual disease (MRD)-based treatment protocols to detect individual clonal immunoglobulin or T-cell receptor gene rearrangements at two time points, after induction (day 33) and after consolidation (day 78), but the early blast reduction kinetics during induction therapy have not been addressed with molecular techniques. For this purpose, we consider FCM much more suitable and faster in providing clinically relevant results at these early treatment time points.10, 11, 12
To improve the predictive impact of early blast reduction parameters, we investigated the potential of quantitative FCM (QFCM) in pB and bone marrow (BM) at diagnosis (day 0) and early follow-up time points (days 8 and 15) to predict MRD in BM on day 33 of the AIEOP-BFM-ALL 2000 study protocol. The QFCM data were analyzed with receiver operating characteristic (ROC) curve analysis to screen for the optimal prognostic parameters with regard to time point, absolute or relative blast quantification and determination of cutoff values.13, 14, 15, 16 These parameters were then tested in a binary logistic regression model. In addition to the absolute counts of persisting blasts, we calculated BRRs, which reflect the efficacy of the cytoreductive therapy according to relative changes in the blast load between different time points of the treatment protocol. Projected onto the different treatment phases of the ALL-BFM protocol, BRRs on day 8 (BRRd8) and day 15 (BRRd15) reflect the in vivo response to the prednisone prephase and to steroids plus one dose each of daunorubicin, vincristine, and asparaginase, respectively. Overall, 905 uniformly treated patients with childhood precursor B-cell ALL were prospectively analyzed for the purposes of this study.
Materials and methods
Patients and samples
Between January 2000 and September 2006, a total of 3900 samples from patients with precursor B-cell ALL (age: 1–18 years) treated according to the AIEOP-BFM-ALL 2000 protocol were analyzed for MRD with research-only FCM in four centers of the study group. In all centers, pB and BM samples of the patients were received at diagnosis and from follow-up during induction treatment: pB at diagnosis (n=798), BM at diagnosis (n=558), pB at day 8 (n=770), BM at day 15 (n=869), BM at day 33 (n=905) and BM at days 15 and 33. Disparities of the number of samples between time points are due to incomplete accrual or unavailability of samples for quantification. MRD investigations were approved as a part of the international trial by the institutional ethical committees, and were carried out according to informed consent guidelines. A brief outline of the induction treatment used in the trial has been summarized recently.6 Absolute MRD values (blasts per μl) were calculated using the cell count of the sample (assessed with a conventional hemocytometer) and the relative MRD estimate from the FCM analysis.5
Standardized operating protocols were used for sample preparation and staining as described recently.12 Commercially available products were used for erythrocyte lysis (BD FACS Lysing Solution; Becton Dickinson (BD) Biosciences, San José, CA, USA) as well as for permeabilization (Fix & Perm, Caltag Laboratories, Hamburg, Germany).
All monoclonal antibodies were selected because of high-quality performance after screening several similar products from different sources. Monoclonal antibodies were strategically assembled into fixed quadruple combinations of those markers that have proven to be of highest relevance for MRD studies in ALL. The panel of combinations was limited to a maximum of five tubes for BCP-ALL: CD20/CD10/CD19/CD34, CD58/CD10/CD19/CD34, CD10/CD34/CD19/CD45, CD10/CD11a/CD19/CD45 and CD10±CD20/CD38/CD19/CD34 (ordered by channels 1–4). Repetitive triple marker backbones were useful for stable discrimination of similar cellular phenotypes in different tubes. To avoid influences of fluorochrome interactions on MRD detection (particularly relevant with tandem dyes), we preferred fluorescein isothiocyanate (channel 1) or phycoerythrin (channel 2) as labels of monoclonal antibodies against the most relevant aberrations. To ensure a good separation of normal and malignant cells, antigens with aberrantly low expression on leukemic blasts were stained with monoclonal antibodies conjugated with strong fluorochromes such as phycoerythrin. From this panel of combinations, at least two were chosen for follow-up of individual patients according to the leukemic phenotype at diagnosis and during follow-up.
The instrument set-up was optimized by analyzing Calibrite beads (BD Biosciences) and normal adult pB cells stained with CD4/CD8/CD3/CD45. DAKO FluoroSpheres 6-Peak particles (Dako, Hamburg, Germany) with assigned values of molecules of equivalent soluble fluorochrome were used for longitudinal monitoring of instrument performance stability. Erythrocyte lysis efficiency was monitored per sample by an extra staining combination of SYTO16 (Molecular Probes, Leiden, The Netherlands), a live-cell-permeant nucleic acid stain (channel 1), together with CD10/CD19/CD45.
Immunophenotyping at diagnosis was performed by collecting at least 30 000 events, whereas for MRD measurements 300 000 events were acquired. Cell acquisition was performed with FACSCalibur (BD Biosciences) or FC 500 cytometers (Beckman Coulter, Miami, FL, USA) using the CellQuest (BD Biosciences), Paint-A-Gate (BD Biosciences) or RXP (Beckman Coulter) software for data analysis. Leukemic cells and normal B lymphocytes were identified in a bivariate dotplot as CD19-positive cells with low side scatter. As defined earlier, the MRD status was considered positive if an accumulation of at least 10 clustered events displaying lymphoid scattering properties and leukemia-associated immunophenotypic characteristics was detected.17 Distortion of the proportional MRD quantification by irrelevant non-nucleated events, such as erythrocytes, platelets, and debris, was avoided by relating MRD only to SYTO16-positive, that is nucleated, cellular events.
FCM measurements provided absolute BCs (blasts/μl) in pB (BCpB) at days 0 and 8 and in BM (BCBM) at days 0, 15 and 33. BRR values were calculated as absolute BC changes in relation to the initial blast burden: BRRd8=(BCpBd0−BCpBd8)/BCpBd0 × 100; BRRd15=(BCBMd0−BCBMd15)/BCBMd0 × 100. The FCM-MRD status in the day 33 BM was considered as a categorical parameter with the dichotomous outcome values ‘MRD-positive’ or ‘MRD-negative.’ The cutoff values for BRRd8 and BRRd15 were calculated according to ROC curve analyses, and BRR values above and below the cutoff values were categorized as BRRhigh or BRRlow, respectively.15, 16
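The two quantities defined above can be sketched in a few lines of Python. This is a minimal illustration, not the study's software: the function names and the example cell counts are hypothetical, and the absolute count is assumed to be the product of the nucleated cell count per μl and the blast fraction among nucleated events.

```python
def absolute_blast_count(nucleated_cells_per_ul: float, blast_fraction: float) -> float:
    """Blasts per µl, assuming BC = nucleated cell count (per µl) x blast fraction.

    blast_fraction is the FCM estimate of blasts among nucleated events (0..1).
    """
    if not 0.0 <= blast_fraction <= 1.0:
        raise ValueError("blast_fraction must lie between 0 and 1")
    return nucleated_cells_per_ul * blast_fraction


def blast_reduction_rate(bc_baseline: float, bc_followup: float) -> float:
    """BRR in percent: (BC at day 0 - BC at follow-up) / BC at day 0 x 100."""
    if bc_baseline <= 0:
        raise ValueError("baseline blast count must be positive")
    return (bc_baseline - bc_followup) / bc_baseline * 100.0


# Hypothetical patient: values are invented for illustration only.
bc_pb_d0 = absolute_blast_count(20_000, 0.80)  # pB at diagnosis
bc_pb_d8 = absolute_blast_count(4_000, 0.10)   # pB at day 8
brr_d8 = blast_reduction_rate(bc_pb_d0, bc_pb_d8)
print(f"BCpBd0={bc_pb_d0:.0f}/µl, BCpBd8={bc_pb_d8:.0f}/µl, BRRd8={brr_d8:.1f}%")
```

Note that a BRR can be negative if the blast count rises between the two time points; the formula is undefined when no blasts are measured at baseline, which the sketch rejects explicitly.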
Predictive parameters were generated according to the depicted scheme (Figure 1). Comparison of categorical variables was performed with Fisher's exact test or χ2 test including risk estimation and calculation of the odds ratio (OR) with a 95% confidence interval (CI). All calculations were performed with SPSS (version 15.0, SPSS Inc., Chicago, IL, USA). For the generation of ROC curves, the determination of cutoff levels and the depiction of the forest plot, MedCalc (version 220.127.116.11; MedCalc Software, Mariakerke, Belgium) was used.
Non-categorical, continuous variables of the early response parameters were tested in a binary logistic regression model.13, 14 Consecutive samples for each time point were available for 440 of 905 patients analyzed. This population was randomly assigned into a ‘learning sample’ (n=284) and a ‘test sample’ (n=156). Outcome parameters of this model were defined as follows: positive measurement and positive prediction=true positive (TP), negative measurement and negative prediction=true negative (TN), negative measurement and positive prediction=false positive (FP), positive measurement and negative prediction=false negative (FN), positive predictive value (PPV)=TP/(TP+FP), negative predictive value (NPV)=TN/(TN+FN), sensitivity=TP/(TP+FN), specificity=TN/(FP+TN) and overall correct prediction (OCP)=(TP+TN)/total n.
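As an illustration of these outcome parameters, the following sketch computes them from paired measured and predicted MRD statuses using the standard confusion-matrix conventions (a false positive is a measured-negative patient predicted positive; a false negative is a measured-positive patient predicted negative). The function name and the ten example patients are hypothetical.

```python
from typing import Dict, Sequence


def prediction_metrics(measured: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Confusion-matrix metrics; True means MRD-positive.

    measured  = day 33 FCM-MRD status actually observed
    predicted = status predicted from the early response parameters
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    tp = sum(m and p for m, p in zip(measured, predicted))
    tn = sum((not m) and (not p) for m, p in zip(measured, predicted))
    fp = sum((not m) and p for m, p in zip(measured, predicted))
    fn = sum(m and (not p) for m, p in zip(measured, predicted))
    return {
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "OCP": (tp + tn) / len(measured),
    }


# Ten invented patients: 3 measured MRD-positive, 7 measured MRD-negative.
measured = [True, True, True] + [False] * 7
predicted = [True, True, False, True] + [False] * 6
for name, value in prediction_metrics(measured, predicted).items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against empty denominators (for example, no positive predictions at all), which a production version would need to handle.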
Quantification and descriptive analysis of blast persistence at early time points
The CD19-positive blast cell population was identified and gated according to the leukemia-associated immunophenotype defined by the differential expression of CD10, CD20, CD58, CD45 and CD34 as compared with normal B cells in the sample. The percentage of gated blast cells was converted to absolute BCs per μl using the percentage of nucleated cells determined with the SYTO16 nuclear staining in the sample and a second platform cell counter for the enumeration of nucleated cells in the sample. The comparison of the FCM-MRD-negative vs FCM-MRD-positive patient groups showed that patients with a negative FCM-MRD status had a lower BCpBd0 and BCBMd15, but no difference was observed with regard to BCBMd0 and BCpBd8 (Table 1). Furthermore, patients with a negative FCM-MRD status at day 33 had significantly higher BRRs at both early time points (days 8 and 15) than patients with a positive MRD status (Table 1).
ROC curve analyses of early response parameters
The diagnostic performance of each of the single BC parameters with respect to the dichotomous end point of remission quality after induction was investigated with ROC curve analyses. ROC curves display the relationship between true-positive and false-positive rates for a test across various threshold values used to diagnose a dichotomous condition. In order to determine which of the various time points and specimens is best suited as an assay to diagnose the end point condition FCM-MRD status at day 33, we compared the areas under the curve (AUCs) of the ROC curves (Figure 2 and Table 2). The diagonal line in the plots indicates where the curve would rest if the tests were completely unreliable (AUC=0.5). The AUC measures the ability of the test variable to distinguish between the MRD-positive and MRD-negative groups. When there is a perfect separation, the curve will be in the upper left corner with an AUC of 1.0. Moreover, ROC curve analysis also defines the criterion value (‘cutoff point’) corresponding with the highest accuracy for the separation of MRD-positive and MRD-negative patients (minimal false-negative and false-positive results), which is indicated with a ° sign in the plots. According to the highest AUC values of the ROC curve analyses, both day 15-related parameters, BCBMd15 and BRRd15, showed the best discrimination between patients with a positive or negative FCM-MRD status with an AUC of 0.76 and 0.71, respectively, but the difference between these two parameters did not reach statistical significance (P=0.07, Table 2). Although the day 8 parameters also discriminated MRD-positive and MRD-negative patients on day 33, the AUCs of the day 8 parameters were lower than those of the day 15 parameters (Figure 2). Similar to day 15, there was no significant difference between absolute (BC) and relative (BRR) blast quantification on day 8 (Table 2).
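The two ROC quantities described here can be sketched in plain Python. This is a hedged illustration rather than a reproduction of the MedCalc procedure: the AUC is computed as the probability that a randomly chosen MRD-positive patient has a higher value than a randomly chosen MRD-negative patient (ties count one half), and the cutoff is chosen by maximizing the Youden index (sensitivity + specificity − 1), which is an assumption about the criterion. Function names and blast counts are hypothetical.

```python
from typing import Sequence, Tuple


def roc_auc(values_pos: Sequence[float], values_neg: Sequence[float]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic)."""
    wins = 0.0
    for p in values_pos:
        for n in values_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(values_pos) * len(values_neg))


def best_cutoff(values_pos: Sequence[float], values_neg: Sequence[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A patient is classified as test-positive when the value exceeds the cutoff.
    """
    best = None
    for c in sorted(set(values_pos) | set(values_neg)):
        sens = sum(v > c for v in values_pos) / len(values_pos)
        spec = sum(v <= c for v in values_neg) / len(values_neg)
        youden = sens + spec - 1.0
        if best is None or youden > best[0]:
            best = (youden, c, sens, spec)
    return best[1], best[2], best[3]


# Invented day 15 BM blast counts (blasts/µl) for illustration only.
mrd_positive = [5, 20, 40, 100]
mrd_negative = [1, 2, 8, 10, 30]
print("AUC:", roc_auc(mrd_positive, mrd_negative))
print("cutoff, sensitivity, specificity:", best_cutoff(mrd_positive, mrd_negative))
```

For a parameter where low values indicate risk, such as the BRR, the comparison direction would be reversed (or the values negated) before applying the same functions.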
Combined application of the early response parameters: estimation of odds
Patients were categorized with respect to the cutoff points of each single test in groups with high and low BCs and groups with a low or high BRR in order to estimate the likelihood for a positive MRD status between these groups. In addition, we combined the four groups for BRRd8 and BRRd15 as follows: (1) BRRlow=BRRd8low+BRRd15low and BRRd8low+BRRd15high and BRRd8high+BRRd15low with n=263 (59.8%) and (2) BRRhigh=BRRd8high+BRRd15high with n=177 (40.2%).
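The grouping rule above assigns a patient to BRRhigh only when both rates are high; every other combination falls into BRRlow. A minimal sketch follows, with two labeled assumptions: the cutoff values passed in are placeholders (the ROC-derived cutoffs are not stated in the text), and a value exactly equal to the cutoff is treated as low.

```python
def brr_category(brr_percent: float, cutoff_percent: float) -> str:
    """'high' if the BRR lies above the cutoff, otherwise 'low' (ties -> 'low', an assumption)."""
    return "high" if brr_percent > cutoff_percent else "low"


def combined_brr_group(brr_d8: float, brr_d15: float,
                       cutoff_d8: float, cutoff_d15: float) -> str:
    """BRRhigh requires a high BRR on day 8 AND day 15; all other combinations are BRRlow."""
    both_high = (brr_category(brr_d8, cutoff_d8) == "high"
                 and brr_category(brr_d15, cutoff_d15) == "high")
    return "BRRhigh" if both_high else "BRRlow"


# Placeholder cutoffs, NOT the study's values.
CUTOFF_D8, CUTOFF_D15 = 90.0, 99.0
print(combined_brr_group(97.5, 99.9, CUTOFF_D8, CUTOFF_D15))
print(combined_brr_group(97.5, 80.0, CUTOFF_D8, CUTOFF_D15))

# Group shares reported in the text: 263 and 177 of 440 patients.
print(round(100 * 263 / 440, 1), round(100 * 177 / 440, 1))
```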
The calculation of ORs for these groups showed that patients with a high BC in pB at days 0 and 8 and with a high BC in the BM on day 15 have significantly higher odds of a positive MRD status after induction than patients with low BCs at these time points (Figure 3). For example, patients with a high BC in the day 15 BM (cutoff=17 blasts/μl) have 5.6-fold higher odds of MRD positivity than patients with a low BC at that time point (OR for MRD (yes/no)=5.6, 95% CI: 4.1–7.6, P<0.001). This is also reflected in the BRR at day 15. The odds of a positive MRD status for patients with a low BRRd15 are 3.8-fold higher than for patients with a high BRRd15 (OR=3.8, 95% CI: 2.5–5.6, P<0.001). The combination of the BRRd8 and BRRd15 groups showed that patients categorized into the BRR group with a low blast reduction had the second highest odds, next to patients with a high BCBMd15, of residual leukemia after induction compared with patients with a high BRR (OR=4.1, 95% CI: 2.5–6.4, P<0.001). BCBMd0 is the parameter with the least predictive impact (OR=1.9, 95% CI: 0.9–4.1, P=0.12) (Figure 3).
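The odds ratio arithmetic can be checked with a short sketch. Two assumptions are made: the '50 vs 15%' quoted in the abstract is read as the proportion of MRD-positive patients above vs below the day 15 cutoff, and the confidence interval uses the common log-based (Woolf) approximation, which may differ from the method applied in the original analysis. The 2×2 counts are hypothetical, since the underlying table is not given in the text.

```python
import math
from typing import Tuple


def odds_ratio_from_proportions(p_exposed: float, p_unexposed: float) -> float:
    """OR from two event proportions: [p1 / (1 - p1)] / [p2 / (1 - p2)]."""
    return (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """OR and approximate 95% CI from a 2x2 table (Woolf log method).

    a = high BC, MRD-positive    b = high BC, MRD-negative
    c = low BC,  MRD-positive    d = low BC,  MRD-negative
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# 50% vs 15% MRD positivity gives an OR close to the reported 5.6.
print(round(odds_ratio_from_proportions(0.50, 0.15), 2))

# Hypothetical counts chosen only to match those proportions.
or_, lo, hi = odds_ratio_with_ci(a=50, b=50, c=15, d=85)
print(f"OR={or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The small difference between the computed ratio and the reported OR of 5.6 is consistent with the percentages having been rounded. The width of the interval depends on the real group sizes, so the interval printed for the hypothetical counts should not be compared with the published one.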
Regression analysis and prediction model
In order to analyze the contribution of each of the early blast reduction variables to the FCM-MRD BM status after induction, a multiple logistic regression model using the QFCM parameters, BCpBd0, BCBMd0, BCpBd8, BCBMd15, BRRd8 and BRRd15, was calculated (Table 3). The data set of 440 patients with complete analyses at all time points was randomly partitioned into a training set with 284 patients and a test set with 156 patients. Except for BCBMd0 (P=0.65) and BRRd15 (P=0.078), all other parameters remained significantly contributing variables (BCpBd0 P=0.001, BCBMd15 P=0.003, BCpBd8 P=0.013, and BRRd8 P=0.035) to the regression model with an overall correct prediction rate of 71.8% for the training set and 74.4% for the test set. Other outcome parameters of this model included a positive predictive value of only 23.7 and 25.0% as well as a specificity of 71.0 and 76.0% for the training and test set, respectively. Of the 71 or 76% truly negative patients in both sets, QFCM correctly identifies 96.8 and 93.8% of these truly negative patients, respectively. These results emphasize that early negative QFCM results are useful in providing reassurance that a patient will have an FCM-MRD-negative remission on day 33 (Table 3). Table 4 summarizes test performance parameters for the single assays, the BRR groups and the regression model, including sensitivity, specificity, positive and negative predictive values, and overall correct prediction rates as defined in the section Materials and methods.
The prognostic value of MRD detection with QFCM has been established across different treatment protocols for childhood and adult ALL. Most of these studies were able to show the potential usefulness of a risk stratification based on the quantification of MRD at almost every time point in the treatment protocol.5, 9, 18, 19 Stratification algorithms included in general multiple MRD measurements at various time points during or after induction and before consolidation. Considering early time points, the MRD measurement in the day 19 BM of the St Jude total therapy protocols for childhood ALL proved to be of independent prognostic importance.5 More recently, the clinical significance of day 8 MRD in pB and day 29 MRD in BM has been shown in a large cohort of patients enrolled in the studies of the Children's Oncology Group.18
We focused in this study on the evaluation of the early in vivo treatment efficacy using cytoreduction response parameters measured by QFCM with the end point of remission quality after induction. Different time points and samples were assessed to propose an optimal diagnostic tool, which is able to distinguish patients at risk for MRD-positive remission. The advantage of the early identification of these patients is obviously given by the possibility to modify the therapeutic strategy with either intensification or de-escalation of treatment earlier in the course of the disease. Currently, in the AIEOP-BFM-ALL 2000 protocol, the decision algorithm for stratification relies mainly on clinical grounds including morphological evaluation of the BM on day 33, and on the PCR-MRD results from BM on days 33 and 78, thus taking place after the completion of induction and consolidation.6, 20, 21 Very early treatment response during induction has been assessed only by cytomorphology on day 8 in pB, which, however, identifies only a small number of prednisone poor responders, in particular within the precursor B-cell ALL subgroup.8 Early blast reduction kinetics analyzed with QFCM contains much more detailed information, which may be utilized for the consideration of treatment stratifications according to the early therapy efficacy.
To this end, we performed flow cytometric blast quantification assays in pB at diagnosis and on day 8 as well as in BM at diagnosis and on day 15 and calculated BRRs on days 8 and 15. The day 33 FCM-MRD status was used as a dichotomous end point for these parameters. In order to evaluate the diagnostic performance and the ability of each of these assays to discriminate between FCM-MRD-positive and FCM-MRD-negative patients after induction, we applied ROC curve analysis, which also defined cutoff points to categorize patients into groups with high and low BRRs. From these single assay analyses, multiple performance criteria have been generated, including sensitivity, specificity, positive and negative predictive values, and overall correct prediction. The choice to use any one of these assays for clinical purposes depends not only on these performance criteria but must also include an interpretation in the light of the clinical circumstances for the benefit of the patient. With regard to sensitivity and specificity, there is always a trade-off balanced by their relative clinical importance. If a stratification decision is intended to rely on the identification of patients with a low risk of MRD positivity, the specificity of the BCBMd15 (84.3%) indicates that a relatively large proportion of truly negative patients is correctly identified. In contrast, a therapy stratification intended to intensify treatment on the basis of a positive BCBMd15 will have to accept a low sensitivity, with only half of the truly positive patients (50.9%) being identified.
For the single-parameter analyses, BCBMd15 and BRRd8 had the best diagnostic performance with sensitivities (that is, correct prediction of MRD positivity) of 50.9 and 46%, specificities (that is, correct prediction of MRD negativity) of 84.3 and 74.8%, and overall correct prediction rates of 69.5 and 65.6%, respectively (Table 2). The BRRd15 assay was also highly specific on a single-parameter level. However, in the multiple regression model, BRRd15 did not remain a significantly contributing variable. This apparent contradiction is explained by the fact that the diagnostic value of the BRRd15 is (i) highly correlated (collinear) with the BCBMd15 and (ii) attenuated by the combination with the least accurate parameter, BCBMd0. Nonetheless, the combination of BRRd8 and BRRd15 generates two groups of patients with low and high BRRs with different risks for MRD-positive remission. Patients within the BRRlow group have a significantly higher risk for MRD than patients within the BRRhigh group (60 vs 40%, OR=4.1). In terms of clinical applicability, the combination of different QFCM parameters in the first 2 weeks of therapy allows the definition of a group of 40% of patients with a low risk of MRD. The most important single predictive parameter for attaining an MRD-negative remission is a BCBMd15 of less than 17 blasts/μl.
The prognostic significance of early blast clearance has been recognized before in smaller patient cohorts by Coustan-Smith et al.,5 who reported a 6% cumulative relapse rate among 51 patients with less than 0.01% blasts in the day 19 BM. Of particular relevance are the most recent data by Borowitz et al.,18 who showed that increasing levels of day 8 MRD in pB are strongly associated with a progressively poorer outcome. Our data, therefore, are in line with the importance of early MRD detection in the other therapy protocols. The optimal time point of the MRD measurement may nonetheless depend on the treatment schedule. In the AIEOP-BFM-ALL 2000 trial, day 15 reflects the disease response to a multidrug treatment, whereas day 8 reflects only the steroid response; this may explain the better performance of the day 15 time point as compared with day 8 in our study. For clinical implementation in the AIEOP-BFM-ALL trials, it therefore seems rational to use either the day 15 BCBM with a cutoff of 17 blasts/μl or the BRR groups defined by the combination of QFCM parameters for the identification of patients with a low risk of MRD-positive remission.
In conclusion, we have shown that QFCM is helpful in providing reassurance that patients with low BCs and high BRRs have a low probability of an MRD-positive remission after induction. In particular, the BCBMd15 measured by exact flow cytometric quantification of blast cells may open rational possibilities for an early treatment modification, including the reduction of therapy in those patients with an excellent early response.
We thank all participants and institutions of the AIEOP-BFM-ALL 2000 trial for their close cooperation in providing samples and data from their patients. We are also grateful to M Dunken (Berlin), B Oestereich (Berlin), A Schumich (Vienna), G Froschl (Vienna), Z Husak (Vienna) and O Maglia (Monza) for excellent technical assistance and C Feiler (Berlin) for data management. This work was supported by Wilhelm Sander Stiftung (Grant no. 2004.072.1). Peter Rhein was supported by Deutsche José Carreras Leukämie-Stiftung (Grant no. DJCLS-F05/09). Michael Dworzak was supported by the Austrian National Bank (Grant no. 10962) and Giuseppe Gaipa was supported by Fondazione Tettamanti and Fondazione Cariplo.