Efficacy and safety of different doses of cytarabine in consolidation therapy for adult acute myeloid leukemia patients: a network meta-analysis

Cytarabine (Ara-C) in consolidation therapy plays an important role in preventing relapse in AML patients who have achieved complete remission, but the optimum dose remains elusive. In this network meta-analysis, we compared the benefit and safety of high-, intermediate- and low-dose Ara-C [HDAraC (>2 g/m2, ≤3 g/m2 twice daily), IDAraC (≥1 g/m2, ≤2 g/m2 twice daily) and LDAraC (<1 g/m2 twice daily)] in consolidation, based on ten randomized phase III/IV trials from 1994 to 2016, which included 4008 adult AML patients. According to the results, HDAraC in a dosage of 3 g/m2 twice daily significantly improved disease-free survival (DFS) compared with IDAraC [hazard ratio (HR) 0.87, 95% credible interval (CrI) 0.79–0.97] and LDAraC (HR 0.86, 95% CrI 0.78–0.95). Subgroup analysis further showed that the DFS advantage of HDAraC was confined to patients with favorable cytogenetics and was not seen in the other cytogenetic groups. Compared with LDAraC, HDAraC [odds ratio (OR) 6.04, 95% CrI 1.67–21.49] and IDAraC (OR 3.80, 95% CrI 1.05–12.85) were associated with a higher risk of grade 3–4 non-haematological toxicity. However, no significant difference between HDAraC and IDAraC was found. These findings suggest that Ara-C in a dosage of 3 g/m2 twice daily provides the maximal anti-relapse effect.

Risk of bias in included studies. All included trials have been published as full manuscripts, and most of them have a low risk of bias (Fig. S1). The randomization sequence was adequately generated in nine of the ten trials 7-12,16-18 and was not reported in one trial 15; we judged this trial as having an unclear risk. Allocation was adequately concealed in six of the ten trials 7,8,11,16-18 and was not reported in four trials 9,10,12,15; we judged these trials as having an unclear risk. Given the unambiguous study treatments and "strict" endpoints (DFS and OS), we did not anticipate any impact of lack of blinding on outcomes. For treatment-related toxicity, all studies used pre-planned standard grading methods and a uniform follow-up scheme for all study groups. We judged all ten trials as low risk. In all ten trials, the intention-to-treat principle was followed and drop-outs were less than 10%. All pre-planned outcomes were addressed.
Pairwise meta-analysis. HRs for DFS and OS could be estimated in all ten trials including 4008 patients 7-12,15-18 and in eight trials including 3932 patients 8-12,15-17, respectively. In pairwise comparisons across all cytogenetics (Fig. 2a and b), HDAraC in consolidation significantly improved DFS (HR 0.80, 95% CI 0.70-0.91, p = 0.001) and OS (HR 0.84, 95% CI 0.70-0.99, p = 0.04) when compared with LDAraC. For both endpoints, no significant difference was found in the other comparisons. The same results were obtained with both fixed and random effect models because no significant heterogeneity was found in any comparison (I2 = 0). In the subgroup analysis stratified by cytogenetics (Fig. 2c and d), HRs for DFS were available in five studies including 2406 patients 7,9,10,12,16. Compared with IDAraC, HDAraC in consolidation significantly improved DFS (HR 0.43, 95% CI 0.33-0.57, p < 0.00001) in patients with favorable cytogenetics. No significant difference in DFS or OS was found in the other comparisons.
ORs for haematological toxic effects, infection and other non-haematological toxic effects could be estimated in four, seven and four trials, respectively. Four studies did not report the overall number of haematological toxic effects, but separately reported the numbers of grade 3-4 leukopenia, thrombocytopenia and neutropenia. We used the largest of the three numbers to calculate the trial-specific ORs for haematological toxic effects 8,10,12,18. Four studies did not report the overall number of non-haematological toxic effects, but separately reported the numbers of individual non-haematological toxic reactions, and we used their sum to calculate the trial-specific ORs for non-haematological toxic effects 8,11,12,18. Two studies separately reported toxic effects in each course, but not the overall number during consolidation therapy, and we used the largest number to calculate the trial-specific ORs 8,11. Network comparisons of grade 3-4 toxic effects are presented in Fig. 5. No significant difference was found for haematological toxic effects or infection among the different doses of Ara-C. For other non-haematological toxic effects, HDAraC (OR 6.04, 95% CrI 1.67-21.49) and IDAraC (OR 3.80, 95% CrI 1.05-12.85) were associated with a higher risk when compared with LDAraC. No significant difference between HDAraC and IDAraC was found.
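The imputation rules described above, followed by a trial-specific OR, can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, capping the summed count at the arm size and the 0.5 continuity correction for empty cells are assumptions not stated in the text, and the confidence interval uses the standard Woolf (log) method.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def haematological_events(leukopenia, thrombocytopenia, neutropenia):
    """Use the largest of the three separately reported grade 3-4 counts."""
    return max(leukopenia, thrombocytopenia, neutropenia)


def non_haematological_events(individual_counts, n_patients):
    """Sum the individually reported reactions.

    Assumption: the sum is capped at the arm size so that the
    number of non-events cannot become negative.
    """
    return min(sum(individual_counts), n_patients)


def per_course_events(course_counts):
    """Use the largest per-course count when no overall count is given."""
    return max(course_counts)


def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds ratio of arm A versus arm B with a Woolf 95% CI."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    if min(a, b, c, d) == 0:
        # Assumption: add 0.5 to every cell when any cell is empty.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(or_) - Z95 * se),
          math.exp(math.log(or_) + Z95 * se))
    return or_, ci


# Hypothetical arm-level counts for illustration only.
events_high = haematological_events(40, 55, 48)
events_low = haematological_events(35, 42, 39)
print(odds_ratio(events_high, 100, events_low, 100))
```

Because summing individual reactions can count one patient more than once, estimates built this way are likely to overstate event numbers, which is one reason the toxicity results call for cautious interpretation.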
In all network comparisons, no significant inconsistency was indicated in node-splitting analysis.
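The idea behind checking direct against indirect evidence can be illustrated with a simple frequentist analogue. The sketch below is not the Bayesian node-splitting model used in the study; it is a Bucher-style adjusted indirect comparison with a z-test, and all function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def indirect_estimate(log_ab, se_ab, log_cb, se_cb):
    """Indirect A-versus-C effect through the common comparator B.

    On the log scale, d_AC = d_AB - d_CB, and the variances add.
    """
    return log_ab - log_cb, math.sqrt(se_ab ** 2 + se_cb ** 2)


def inconsistency_test(log_direct, se_direct, log_indirect, se_indirect):
    """Two-sided z-test for disagreement between direct and indirect evidence."""
    diff = log_direct - log_indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# Hypothetical log-HRs and standard errors for illustration only.
log_ind, se_ind = indirect_estimate(math.log(0.80), 0.07, math.log(0.95), 0.08)
diff, z, p_value = inconsistency_test(math.log(0.87), 0.05, log_ind, se_ind)
print(f"difference {diff:.3f}, z {z:.2f}, p {p_value:.3f}")
```

A p value below 0.05 would flag disagreement between the two sources of evidence for that comparison.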
Sensitivity analysis. Sensitivity analysis showed that excluding the three studies using the non-conventional Ara-C doses in induction did not alter the overall effect size for DFS across all cytogenetics (Supplementary

Discussion
A variety of strategies to prevent relapse in AML patients have been explored for over 30 years. One of the most notable advances is the standard post-remission chemotherapy for adult AML patients established on the basis of the CALGB 8525 protocol: single-agent high-dose Ara-C in a dosage of 2-3 g/m2 twice daily on days 1, 3 and 5 for at least two cycles 4-6. Given the toxicity and high price of HDAraC, numerous randomized trials have explored dosage de-escalation 7,8,10-12,15. However, most of those comparing IDAraC or LDAraC (usually in combination with other drugs) with "CALGB style" HDAraC failed to show a significant improvement in any survival endpoint. On the contrary, the evidence tended to favor HDAraC in some trials: in the Medical Research Council (MRC) AML 15 trial, halving the dosage from 3 g/m2 to 1.5 g/m2 was associated with a strong trend towards a higher cumulative incidence of relapse 11; and a per-protocol analysis in the SAL AML 2003 trial showed an OS advantage in the single-agent HDAraC group 10. Evaluation of these individual trials has led to two different opinions on Ara-C dosage in consolidation for adult AML patients: (1) in consideration of its comparable therapeutic effect and lower toxicity, IDAraC in a dosage of 1-1.5 g/m2 over 3 days with a cumulative dose of 6-18 g should be recommended as a new standard 19; (2) because the existing evidence lacks important and consistent improvements in outcome, HDAraC in a dosage of 2-3 g/m2 over 3 days remains the standard for post-remission chemotherapy 5. At present, there is no consensus on the standard post-remission chemotherapy, and the optimal dose of Ara-C remains unclear. Therefore, a network meta-analysis is needed to address this issue.
To the best of our knowledge, this is the first meta-analysis assessing the benefit and toxicity of different doses of Ara-C. Our results show that, for adult AML patients, HDAraC in a dosage of 3 g/m2 twice daily in consolidation chemotherapy significantly improves DFS, reducing the hazard of relapse or death by at least 13% when compared with lower-dose Ara-C (≤2 g/m2 twice daily); this advantage is confined to patients with favorable cytogenetics and is not seen in the other cytogenetic groups. Among the ten trials in our meta-analysis, the SAL AML 96, Acute Leukemia French Association (ALFA) 9802 and Australasian Leukaemia and Lymphoma Group (ALLG) M7 trials used non-conventional doses of Ara-C in induction therapy that were distinct from those in the other trials, so we performed a sensitivity analysis excluding these three trials. Further, in the Japan Adult Leukemia Study Group (JALSG) AML 201 trial, a single dose of 2 g/m2 twice daily in consolidation was classified as IDAraC. However, the relatively more frequent injections of Ara-C in this trial resulted in a cumulative dose of 60 g, which belongs to the high cumulative dose range. We therefore performed a sensitivity analysis reclassifying this trial as HDAraC. In addition, we made comparisons for younger adults aged <65 in a sensitivity analysis (Supplementary Figs S6-S8). None of the three sensitivity analyses altered the DFS benefit of HDAraC.
In this meta-analysis, we found that HDAraC and IDAraC in consolidation chemotherapy were associated with a higher risk of grade 3-4 non-haematological toxic effects when compared with LDAraC. Importantly, however, we found no significant difference between HDAraC and IDAraC in either grade 3-4 haematological or non-haematological toxic effects.
Our study has several strengths and important implications. First, rather than only comparing HDAraC with IDAraC or LDAraC in individual trials, our study included all the comparable randomized trials using different doses of Ara-C in consolidation within a single meta-analysis and compared these dosages simultaneously, achieving greater statistical power and avoiding potential selection bias. Second, the RCTs included were multicenter, randomized phase III/IV trials performed at the national level by cooperative study groups, and the generally high quality of these trials supports the reliability of the results. Third, using Bayesian network methods, we compared dosages indirectly when head-to-head comparisons were insufficient and obtained precise estimates of effect by jointly evaluating direct and indirect comparisons. Fourth, we performed several sensitivity analyses to test the robustness of the results, and the conclusion remained valid. Our synthesis of the existing evidence provides useful information on the clinical value of HDAraC, which should be reconsidered in clinical care and future research.
Potential limitations of our study should be noted. First, like most published meta-analyses, our analysis is based on summary data from the published literature rather than individual patient data, which limits the detail that can be captured regarding subgroups. We could not evaluate outcomes for clinically relevant subgroups other than cytogenetic risk. Therefore, our findings need to be considered as average effects. Second, few trials purely compared different doses of Ara-C without other chemotherapeutic agents in consolidation 7,11. The impact of other chemotherapeutic agents in our study could not be completely eliminated. As Ara-C is to date the most active compound in consolidation therapy, we believe that the other agents used in the included RCTs played complementary roles in Ara-C based therapy. Thus, our estimates remain valid. Third, the reporting of toxic effects was incomplete and inconsistent in the included trials, and thus we had to use imputed data as described in our results. Our meta-analysis on toxicity should be interpreted with some caution.
In conclusion, our meta-analysis shows that Ara-C in a dosage of 3 g/m2 twice daily provides the maximal therapeutic effect in consolidation chemotherapy for adult AML patients. Although it is associated with a higher risk of grade 3-4 non-haematological toxicity than low-dose Ara-C in a dosage <1 g/m2, the difference in toxicity between the doses of 3 g/m2 and 1-2 g/m2 is non-significant.

Methods
This study was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Ethics approval and consent to participate. Ethics approval for this network meta-analysis was not required.
Literature search and study selection. We underwent searches of PubMed, the Cochrane database and Embase, combing the search terms "cytarabine"; acut* and leukem*/leukaem*/leucem*/leucaem*/aml; myelo* or nonlympho* from January 1994 to June 2016 without language restriction. Two independent reviewers (W.D. and C.D.) conducted study selection based on the "PICOS" criteria (i.e., Patient, Intervention, Comparator, Outcome, Study design): • P: Adults aged 15 years or older and have newly diagnosed acute myeloid leukaemia (either de novo or secondary) or high-risk myelodysplastic syndrome. • I and C: Different doses of Ara-C performed in two or more arms in consolidation.
The trials that included only patients with acute promyelocytic leukaemia were excluded. We also searched for additional trials in the reference list of relevant reviews, meta-analysis and bibliographies in the discipline. Only the most updated or most inclusive data for a given study was included.
Data extraction and risk of bias assessment. Two reviewers (W.D. and C.D.) separately recorded trial design, entry criteria, patient characteristics, adequacy of induction therapy (regimens used and percentage of patients achieving CR), Ara-C treatment in consolidation randomization, cumulative dose of Ara-C per course, follow-up and outcomes (disease-free survival, overall survival, grade 3-4 haematological and non-haematological toxic effects).
Risk of bias of individual trials was assessed independently by the same reviewers with the Cochrane risk of bias tool 20. Conflicts were resolved by consensus.
Statistical analysis. The primary outcome in our study was disease-free survival (DFS). Secondary pre-specified endpoints included overall survival (OS) and treatment-related grade 3-4 haematological toxic effects, infection and other non-haematological toxic effects. These outcomes were defined in accordance with the revised International Working Group criteria for therapeutic trials in AML 21. We measured hazard ratios (HRs) for time-to-event outcomes (DFS and OS) and odds ratios (ORs) for dichotomous data (grade 3-4 toxic effects). When HRs were not explicitly provided, we estimated them according to the method detailed by Tierney and colleagues 22.
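One of the indirect approaches of this kind, estimating a log-HR from a reported log-rank p value and the number of events, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and inputs are hypothetical, the approximation V = events x n_research x n_control / (n_research + n_control)^2 is taken from the published summary-data methods as recalled here rather than from this study, and the direction of effect has to be supplied by the reader because a p value carries no sign.

```python
import math
from statistics import NormalDist


def log_hr_from_p_value(p_two_sided, total_events, n_research, n_control,
                        favours_research=True):
    """Approximate log(HR) and its standard error from a log-rank p value.

    V approximates the log-rank variance, O - E is recovered from the
    z-score of the p value, and log(HR) = (O - E) / V with variance 1 / V.
    """
    if not 0 < p_two_sided < 1:
        raise ValueError("p value must lie strictly between 0 and 1")
    v = total_events * n_research * n_control / (n_research + n_control) ** 2
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    # A negative O - E means fewer events than expected in the research arm.
    o_minus_e = -z * math.sqrt(v) if favours_research else z * math.sqrt(v)
    return o_minus_e / v, math.sqrt(1 / v)


# Hypothetical trial for illustration only: 1:1 randomization, 100 events.
log_hr, se = log_hr_from_p_value(0.05, 100, 100, 100)
print(f"HR {math.exp(log_hr):.2f}, SE of log-HR {se:.2f}")
```

With equal allocation the variance term reduces to a quarter of the total events, so the precision of the recovered estimate depends on the event count rather than on the number of patients.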
Two types of meta-analysis were conducted. First, standard pairwise comparisons were built with STATA 12.0 (StataCorp, College Station, TX, USA). Both fixed and random effect models were reported. In all the comparisons, we used fixed effect models if the heterogeneity across trials was not significant (a P value > 0.10 in the χ2 test or an I2 < 50%); otherwise, we explored the heterogeneity and used random effect models 23. Second, mixed network comparisons were built with WinBUGS 1.4.3 (MRC Biostatistics Unit, Cambridge, UK), allowing direct and indirect evidence to be combined into a single overall point estimate. Treatment effects were estimated by posterior means with corresponding 95% credible intervals (CrIs), which are the Bayesian analog of 95% confidence intervals (CIs) 24. Both fixed and random effect models were applied with non-informative uniform and normal prior distributions, running 50,000 iterations with a burn-in of 10,000 iterations and a thinning interval of 50 to obtain the posterior distributions of the model parameters 25. The deviance information criterion (DIC) statistics were then used to compare the two models: the effect model with the relatively lower DIC value indicated lower heterogeneity across trials and a simpler model, and the corresponding results were chosen for summary estimation 26. Convergence of iterations was evaluated with the Gelman-Rubin-Brooks statistic 27. The probability of each treatment in the ranking was evaluated from its posterior probabilities, obtained by counting the proportion of iterations in the Markov chain in which the HR or OR of each treatment took each rank 28,29. Results from the network meta-analysis were compared with those from the standard pairwise meta-analysis to evaluate whether there was inconsistency. Node-splitting analysis was also applied to evaluate inconsistency for closed loops in the network 30,31. Significant inconsistency was indicated if node-splitting analysis gave P < 0.05 for disagreement between direct and indirect evidence.
Figure 5. Network meta-analysis for haematological toxic effects, infection and other non-haematological toxic effects. Upper triangles denote pooled odds ratios (ORs). The column dose range is compared with the row dose range. In each cell, the first and second lines show the fixed-effect and random-effect estimates, respectively. Numbers in parentheses indicate 95% credible intervals. ORs with Bayesian p value < 0.05 are in red. Lower triangles denote the Bayesian deviance information criterion (DIC) statistics from the fixed- and random-effects models. Cumulative probabilities of each dose range ranking first, second and third best are based on the effect model with the lower DIC value.
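The ranking step, counting the proportion of Markov chain iterations in which each dose range takes each rank, can be sketched as below. This is an illustrative sketch only: the posterior draws are simulated with hypothetical means rather than taken from WinBUGS output, the function name is an assumption, and lower log-HR versus a common reference is treated as better.

```python
import random


def rank_probabilities(draws):
    """Estimate P(rank) per treatment from posterior draws.

    draws: list of dicts mapping treatment name to its log effect
    versus a common reference in one retained iteration (lower is better).
    Returns {treatment: [P(rank 1), P(rank 2), ...]}.
    """
    treatments = sorted(draws[0])
    counts = {t: [0] * len(treatments) for t in treatments}
    for draw in draws:
        ordered = sorted(treatments, key=lambda t: draw[t])
        for rank, treatment in enumerate(ordered):
            counts[treatment][rank] += 1
    n = len(draws)
    return {t: [c / n for c in counts[t]] for t in treatments}


# Simulated draws with hypothetical means and spread, for illustration only.
rng = random.Random(0)
simulated = [
    {
        "HDAraC": rng.gauss(-0.15, 0.05),
        "IDAraC": rng.gauss(-0.01, 0.05),
        "LDAraC": 0.0,  # reference dose range
    }
    for _ in range(800)
]
for treatment, probs in rank_probabilities(simulated).items():
    print(treatment, [round(p, 3) for p in probs])
```

Each row of probabilities sums to one, and accumulating along a row gives the cumulative probability of ranking first, first or second, and so on.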
In the subgroup analysis, we assessed the DFS and OS benefit for cytogenetic risk subgroups: patients with favorable, intermediate and unfavorable cytogenetic risk, classified by cytogenetic abnormalities 32,33. The robustness of the main findings was further tested by additional sensitivity analyses, in which we looked for systematic differences in induction and consolidation strategies.
Publication bias could not be formally evaluated because of the small number of studies included in each direct comparison. Although the potential for this bias is real given the small number of studies and the for-profit interest, we judged that this concern was not likely to decrease certainty in the evidence.