Mutations in DNMT3A, U2AF1, and EZH2 identify intermediate-risk acute myeloid leukemia patients with poor outcome after CR1

Intermediate-risk acute myeloid leukemia (IR-AML) is a clinically heterogeneous disease, for which optimal post-remission therapy is debated. The utility of next-generation sequencing information in decision making for IR-AML has yet to be elucidated. We retrospectively studied 100 IR-AML patients, defined by European LeukemiaNet classification, who had mutational information at diagnosis, received intensive chemotherapy and achieved complete remission (CR) at Cleveland Clinic (CC). The Cancer Genome Atlas (TCGA) data were used for validation. In the CC cohort, median age was 58.5 years, 64% had normal cytogenetics, and 31% required >1 induction cycle to achieve CR1. In univariable analysis, patients carrying mutations in DNMT3A, U2AF1, and EZH2 had worse overall and relapse-free survival. After adjusting for other variables, the presence of these mutations maintained an independent effect on survival in both CC and TCGA cohorts. Patients who did not have the mutations and underwent hematopoietic cell transplant (HCT) had the best outcomes. HCT improved outcomes for patients who had these mutations. RUNX1 or ASXL1 mutations did not predict survival, and performance of HCT in patients with these mutations did not confer a significant survival benefit. Our results provide evidence of clinical utility in considering mutation screening to stratify IR-AML patients after CR1 to guide therapeutic decisions.


Introduction
Acute myeloid leukemia (AML) is a heterogeneous disease characterized by impaired differentiation and increased proliferation of myeloid progenitors. With the widespread use of high-throughput sequencing techniques, AML has been genetically characterized as a complex polyclonal disease with multiple somatically acquired driver mutations and disease evolution over time 1 . In 2013, the Cancer Genome Atlas (TCGA) Research Network profiled the genomes of 200 adult de novo AML patients and identified 23 commonly mutated genes, which were classified into 9 categories 2 . Subsequently, a more comprehensive study of the driver mutation landscape in AML enabled a full genomic classification with nonoverlapping subgroups 3 . These and multiple other studies have yielded important prognostic information and have led cytogenetically normal AML with NPM1 or biallelic CEBPA mutations in the absence of FLT3-ITD to be placed in a favorable risk category 4 . Moreover, patients with RUNX1, ASXL1, and TP53 mutations have recently been added to the adverse risk group in the European LeukemiaNet (ELN) 2017 classification 5 .
Despite the increase in our knowledge of the biology of AML, treatment algorithms have not changed substantially over the last 40 years 6 . Regardless of their mutations and risk stratification, the majority of eligible patients receive intensive induction chemotherapy, with a primary goal of achieving complete remission (CR). This is followed by post-remission therapy tailored according to risk profile, whereby chemotherapeutic consolidation is preferred in favorable-risk AML and allogeneic hematopoietic cell transplant (HCT) is favored in poor-risk AML 5 . The optimal post-remission therapy for intermediate-risk AML, which comprises nearly half of all cases, is debated. Many are evaluated for HCT at the time of CR1 7 , with reduced intensity conditioning (RIC) approaches now available for use in older populations or those with co-morbidities [8][9][10][11] . The prognostic role of certain mutations within intermediate-risk AML is unclear, but might aid in decision making for this clinically heterogeneous patient population.
The use of mutational data to inform clinical practice is an active area of research. In a large study of 664 AML patients treated on two phase 3 trials conducted by the German AML Cooperative Group, mutations in RUNX1, SRSF2, U2AF1, and SF3B1 were found to be independent risk factors for achievement of CR1 12 . In addition, multivariable analysis of a smaller cohort of intermediate-risk AML patients (defined by cytogenetics alone) identified ASXL1 and FLT3-ITD as factors predicting lower chances of achieving CR1 13 . These mutations, along with a few others (e.g., DNMT3A, TP53), are known to be associated with poor overall survival (OS) 14,15 . However, whether these mutations retain their predictive value after achievement of CR1, at which time the clones harboring these mutations might have already been eradicated, is unknown. Furthermore, the utility of this genetic information in decision making for intermediate-risk AML patients who have achieved CR1 has not been determined. Therefore, we set out to investigate the clinical relevance of recurrent driver gene mutations in a well-characterized cohort of intermediate-risk AML patients who were homogeneously treated at a single center and achieved CR1. The analysis aims to revisit the predictors of outcome at CR1 and to identify mutations that portend poor prognosis and may therefore help select patients who could benefit from HCT.

Patient cohorts and study eligibility
Cleveland Clinic (CC) cohort
The diagnosis of AML was made or revised according to the 2016 WHO criteria 16 . Cytogenetic analysis was performed on metaphases from bone marrow aspirates taken at diagnosis, and the risk was ascribed by ELN 2010 criteria as intermediate-I or -II 17 . Namely, patients who had normal cytogenetics and harbored NPM1 or biallelic CEBPA mutations in the absence of FLT3-ITD were excluded. A total of 1589 AML patients treated at the Cleveland Clinic between 2002 and 2016 were screened for eligibility, of whom 825 were in the ELN intermediate-risk category, and 355 received intensive induction chemotherapy.
We retrospectively analyzed 100 intermediate-risk AML patients who had available pretreatment myeloid mutational data and achieved CR1 after intensive chemotherapy. Patients who did not receive intensive therapy or achieve CR after 1 or more lines of induction (i.e., primary refractory) were excluded.
The induction regimen was 7 + 3 (i.e., 7 days of 100 mg/m² cytarabine plus 3 days of anthracycline), and patients who had persistent disease at day 14 marrow were reinduced with 7 + 3 or 5 + 2 (i.e., 5 days of cytarabine plus 2 days of anthracycline). All patients achieved CR1 with or without count recovery. CR was defined as less than 5% bone marrow blasts with evidence of normal maturation of other marrow elements, no peripheral blast cells or extramedullary disease, peripheral blood neutrophil counts above 1 × 10⁹/L, and platelet counts above 100 × 10⁹/L. Complete response with inadequate count recovery (CRi) was defined as a response meeting the criteria of CR, except for residual neutropenia and/or thrombocytopenia 5 . Induction therapy was followed by consolidation chemotherapy (i.e., high or intermediate dose cytarabine) or allogeneic HCT. Clinical data and patient samples were collected prospectively with patient consent and approved by the Institutional Review Board in accordance with the Declaration of Helsinki.
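The response definitions above can be written as a small decision rule. The following is a minimal sketch, not the authors' implementation: the class name, field names, and the "no CR" label are hypothetical, and it assumes that CRi requires every marrow and blast criterion of CR with only the neutrophil and/or platelet thresholds unmet.

```python
from dataclasses import dataclass


@dataclass
class ResponseAssessment:
    """Hypothetical container for the values used in the CR/CRi definitions."""
    marrow_blast_pct: float        # bone marrow blasts, percent
    normal_maturation: bool        # normal maturation of other marrow elements
    peripheral_blasts: bool        # any peripheral blast cells present
    extramedullary_disease: bool   # any extramedullary disease present
    neutrophils_e9_per_l: float    # neutrophil count, x 10^9/L
    platelets_e9_per_l: float      # platelet count, x 10^9/L


def classify_response(a: ResponseAssessment) -> str:
    """Return 'CR', 'CRi', or 'no CR' following the definitions in the text."""
    disease_cleared = (
        a.marrow_blast_pct < 5.0
        and a.normal_maturation
        and not a.peripheral_blasts
        and not a.extramedullary_disease
    )
    if not disease_cleared:
        return "no CR"
    # CR additionally requires count recovery; otherwise the response is CRi.
    counts_recovered = (
        a.neutrophils_e9_per_l > 1.0 and a.platelets_e9_per_l > 100.0
    )
    return "CR" if counts_recovered else "CRi"
```

For example, a patient with 2% marrow blasts, no peripheral or extramedullary disease, neutrophils of 1.5 × 10⁹/L, and platelets of 60 × 10⁹/L would be classified as CRi under this rule.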

TCGA cohort
A large publicly available database of 200 clinically annotated adult de novo AML patients with genomic and epigenomic profiling was created in 2013 2 . The study analyzed patients from a single-institution tissue banking protocol at Washington University, and samples were selected to represent morphologic and cytogenetic subtypes of AML. For external validation of our findings, we studied 48 ELN intermediate-risk AML patients from this cohort, who received intensive induction chemotherapy and achieved CR1. Patients who were unfit for intensive chemotherapy, and those with primary refractory disease and acute promyelocytic leukemia were excluded.

Sample processing, DNA sequencing and mutation analysis
Multi-amplicon deep sequencing was performed as previously described 18 . DNA was extracted from bone marrow or peripheral blood mononuclear cells collected at the time of diagnosis. In 10 patients, subsequent serial samples were also studied. A TruSeq Custom Amplicon panel (TSCA; Illumina, San Diego, CA, USA) targeting coding exons of 62 genes with available evidence in myeloid neoplasms was used for deep sequencing (Supplementary Table). For germline confirmation, mutations were analyzed in non-clonal CD3 positive T cells whenever DNA was available. Bidirectional sequencing was performed by standard techniques using an ABI 3730xl DNA analyzer (Applied Biosystems, Foster City, CA, USA). The GATK 3.3 pipeline was used to extract putative variants, following recommended best practices for variant discovery. Variants with at least 10 positive reads and a variant allelic frequency of at least 5% were prioritized for further processing and annotation. Generated VCF files were used as input for Annovar and were annotated with multiple databases (dbSNP138, COSMIC, ExAC). Variants found in ExAC with an allelic frequency >0.0001 were excluded. Variant allelic frequencies (VAFs) of mutations were adjusted according to the zygosity and copy number confirmed by single nucleotide polymorphism (SNP)-array. In addition, mutations in NPM1, FLT3, and CEBPA were also tested using standard methods. The sequencing method for patients in the TCGA database has been described previously 2 .
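The variant prioritization thresholds described above can be illustrated with a short filter. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the `Variant` class and its field names are hypothetical, the 5% VAF threshold is read as a lower bound, and the population-frequency cutoff is assumed to exclude variants with an ExAC allele frequency above 0.0001. The VAF adjustment for zygosity and copy number is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_ALT_READS = 10    # minimum number of reads supporting the variant
MIN_VAF = 0.05        # minimum variant allelic frequency (assumed lower bound)
MAX_POP_AF = 0.0001   # assumed population allele frequency cutoff


@dataclass
class Variant:
    """Hypothetical record for one called variant."""
    gene: str
    alt_reads: int
    total_reads: int
    population_af: Optional[float] = None  # None if absent from the database

    @property
    def vaf(self) -> float:
        # Unadjusted VAF: supporting reads divided by total depth.
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_filters(v: Variant) -> bool:
    """Apply the read-depth, VAF, and population-frequency thresholds."""
    if v.alt_reads < MIN_ALT_READS or v.vaf < MIN_VAF:
        return False
    if v.population_af is not None and v.population_af > MAX_POP_AF:
        return False  # likely germline polymorphism
    return True


def prioritize(variants: List[Variant]) -> List[Variant]:
    """Keep only the variants that pass all three thresholds."""
    return [v for v in variants if passes_filters(v)]
```

A variant with 40 supporting reads out of 100 and no population-database entry would be retained, whereas one with 9 supporting reads, or one with a population allele frequency of 0.01, would be dropped.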

Statistical analysis
Data were presented with percentage proportions for categorical variables and medians for continuous variables. Comparison of the distribution of categorical variables was carried out with either χ² test or Fisher's exact test. Comparison of numerical variables between groups was performed using Wilcoxon rank sum test. Whisker plot boxes denote median and 25th and 75th percentiles, and ends of the whiskers display minimum and maximum values. Cox proportional hazards regression was used for univariable and multivariable analysis to identify the impact of clinical and genomic variables on survival outcomes. Data were presented with hazard ratios (HR) and 95% confidence intervals (CI). Overall survival (OS) was defined as the time from diagnosis until death or the last follow-up. Patients who were alive were censored at the last follow-up date. Relapse-free survival (RFS) was calculated from the time point of CR1 until the time of relapse or death or the last follow-up. Patients without relapse or death at last follow-up were censored. The survival analysis was based on Kaplan-Meier method and the log-rank test was used to compare the curves. P values were two-tailed and considered significant when <0.05. All analyses were performed using JMP software v.12.2.0 (SAS Inc. Cary, NC, USA).
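To make the censoring rules concrete, the Kaplan-Meier product-limit estimate and the median survival derived from it can be sketched in a few lines. The study itself used JMP; the pure-Python functions below are illustrative only, their names are hypothetical, and the follow-up times in the usage note are invented for the example rather than taken from either cohort.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Product-limit estimate.

    times  : follow-up time per patient (e.g., months from diagnosis or CR1)
    events : True if the event (death, or relapse/death for RFS) occurred,
             False if the patient was censored at last follow-up
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with events and censored patients both leave the risk set.
        at_risk -= n_leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

With illustrative follow-up times of 2, 4, 4, 6, and 8 months, where the second patient at 4 months and the patient at 8 months are censored, the estimate steps down at 2, 4, and 6 months and the median is reached at 6 months. If the curve never falls to 0.5, the function returns `None`, which corresponds to a median that is "not reached".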

Patient characteristics
Clinical characteristics of patients are summarized in Table 1. In the CC cohort, the median age was 58.5 years (range, 24 to 75 years). Forty-eight percent were female, 76% had de novo AML, 19% had secondary AML after an antecedent hematologic disorder (sAML), and 5% had treatment-related AML (tAML). Since all patients in the TCGA cohort had de novo AML, there were more patients with ELN intermediate-I risk category (i.e., normal cytogenetics) as compared to the CC cohort (81 vs 64%, p = 0.007). Concomitantly, patients in the TCGA cohort had a higher WBC count and bone marrow blast percentage at diagnosis. In the CC cohort, initial 7 + 3 induction chemotherapy resulted in CR in 58%, CRi in 11%, and persistent disease in 31% of patients based on day 14 bone marrow biopsy. Of the 31 patients with persistent disease, 20 were re-induced with 5 + 2, and 11 received 7 + 3. Best response achieved was CR in 23 patients and CRi in 8 patients. The number of patients who received HCT and the time of HCT were not significantly different between cohorts. However, there were more patients who underwent haploidentical HCT in our cohort, and 13% of the transplanted patients in the TCGA cohort received autologous HCT.
The median OS and RFS of the CC cohort were 24 (range, 2 to 108 months) and 14 months (range, 1 to 70 months), respectively. Sixty patients remained in remission after a median follow-up of 14.5 months (range, 2 to 108 months). The survival outcomes of the TCGA cohort were similar to ours with a median OS of 24.4 (range, 2.2 to 118.1 months) and median RFS of 13.4 months (range, 1.7 to 77.3 months).
Outcomes of patients with DNMT3A, U2AF1, and EZH2 mutations based on the performance of HCT
Based on the finding that mutations and HCT were the major predictors of outcome after CR1, we analyzed the outcomes of patients harboring mutations in relation to HCT (Fig. 4). In both CC and TCGA cohorts, patients who did not have the mutations and underwent HCT had the best outcomes. In the CC cohort, the median OS and RFS for these patients were 45 months and not reached, respectively, as compared to 22 and 13 months for patients who did not undergo HCT (p = 0.05 for OS and p = 0.002 for RFS). In the TCGA cohort, performance of HCT did not have a significant impact on OS and RFS in patients without these mutations. However, HCT improved outcomes for patients who had these mutations in both cohorts. In the CC cohort, median OS and RFS for these patients who had HCT were 14 and 11 months, respectively, as compared to 7.5 and 5 months in patients who did not undergo HCT (p = 0.04 for OS and p = 0.002 for RFS). In the TCGA cohort, patients who harbored mutations and did not undergo HCT had significantly worse OS (median, 6.3 vs 24.6 months, p = 0.001) and RFS (median, 7.8 vs 12.5 months, p = 0.003), while there were no statistically significant differences between survival of patients with mutations who underwent HCT and patients who did not have the mutations. In view of the recently proposed ELN 2017 criteria that place intermediate-risk patients with RUNX1 or ASXL1 mutations into the adverse risk category, which leads to the recommendation of HCT for this group, we explored outcomes of these patients at the time of CR1 (Supplementary Figure 1). Presence of these mutations did not predict OS and RFS in our cohort, and performance of HCT in mutated patients did not confer a statistically significant survival benefit.

Driver mutation dynamics and clinical course
In light of the above results, we next evaluated the serially collected samples from 10 patients (Fig. 5). The serial sequencing data reveal informative aspects of the clonal evolution of AML. The majority of patients harbored >1 clone at diagnosis, and these clones were eradicated with intensive therapy. For example, patient 1 had DNMT3A and IDH2 mutations at VAF >40%, which disappeared after 7 + 3. The patient underwent HCT in CR1 and was disease-free at 25 months follow-up. Patients 2 and 3 had U2AF1 mutated clones, which were absent following HCT in CR1. However, both of them relapsed shortly after HCT and died at 12 and 14 months follow-up, respectively. Additionally, patient 4 harbored three different mutations, which did not predict survival in this study, and they were eradicated with induction chemotherapy. However, the patient had evidence of an emerging DNMT3A clone, and relapsed 9 months after achieving CR1. Of interest, patient 5 had clones with FLT3-ITD, DNMT3A, RUNX1, and WT1 mutations, which persisted after chemotherapy and increased in size at the time of relapse. This patient died 2 months after the second sampling. Finally, patient 6 had a dominant BCORL1 clone, which disappeared with therapy, and relapsed with the emergence of new ASXL1 and TET2 mutations. The patient achieved CR3 after relapse therapy and was disease-free at 40 months follow-up. The evidence of clonal evolution for three additional patients is summarized in Supplementary Figure 2. We could not demonstrate driver mutations at diagnosis and during follow-up in one patient.

Discussion
The optimal post-remission consolidation therapy for intermediate-risk AML has been debated for years due to the clinical heterogeneity in outcomes of this group. These patients were sub-categorized based on normal vs abnormal cytogenetics (e.g., trisomy 8, MLLT3-MLL rearrangement) as intermediate-I and -II, respectively, but subsequent studies demonstrated no survival difference between them, which has led to the merging of the two groups in the recently proposed classification 5,20 . Considering the insufficiency of cytogenetic data to distinguish the differences in outcomes of this group, we hypothesized that driver mutations may have an effect in predicting outcomes of intermediate-risk AML and aid in clinical decision making with a particular focus on performance of HCT. Since achieving CR after intensive induction is a prerequisite for long-term survival in AML and the 1-year mortality rate is >75% for primary refractory cases 21 , the present study focused on patients who achieved CR with 1 or more induction courses. This strategy enabled a more accurate assessment of the impact of HCT vs consolidation with chemotherapy on outcomes of patients stratified by mutational status. In two independent cohorts, we demonstrated that mutations in DNMT3A, U2AF1, and EZH2 genes were independent predictors of relapse and OS in patients who achieved CR, and performance of HCT in this group translated into a significant survival benefit.
One major difference between the CC and TCGA cohorts was that all patients in the latter had de novo AML, which accounted for higher WBC count, bone marrow blast percentage, and percentage of patients with normal cytogenetics in TCGA cohort. Donor sources were also slightly different for transplanted patients. However, none of these differences were found to predict survival in our univariable analysis. The cohorts were otherwise similar in terms of demographics, numbers of transplanted patients, and median survival durations.
The prognostic significance of DNMT3A and U2AF1 mutations in AML has been shown in a few other studies 12,22,23 . DNMT3A mutations in patients <60 years and U2AF1 mutations were independently associated with a lower CR rate after induction therapy 12 . In a large cohort of AML patients reported by Ley et al. 15 , DNMT3A mutation was an independent predictor of poor survival in the intermediate-risk cytogenetics group. Additionally, EZH2 inactivation was associated with poor prognosis in myelodysplastic syndromes, while its role in AML remains unclear 24 . In this study, patients with mutated EZH2 had significantly shorter OS and RFS, and a model integrating these three mutations could stratify intermediate-risk AML patients into two risk groups both in the CC and TCGA cohorts. Furthermore, eradication of these clones with induction chemotherapy, followed by HCT, led to long-term survival in patients with available serial sequencing data. On the contrary, emergence or persistence of these mutations was associated with worse outcomes. Therefore, assessments of the mutations at diagnosis and during follow-up might offer an opportunity to improve prognostication and management of intermediate-risk AML. While patients with mutations who underwent HCT had improved OS and RFS, the median survival was still shorter than that of transplanted patients without mutations, indicating the importance of developing further therapies targeting these clones.
Integration of the accumulating information on gene mutations, cytogenetics, and other markers into risk stratification and management algorithms is a critical yet challenging task. Since the demonstration of favorable prognosis associated with NPM1 and biallelic CEBPA mutations, they have been incorporated into prognostic models and routinely tested in all patients at the time of diagnosis 4,17 . HCT is no longer recommended at CR1 for patients with these mutations, who were once classified in the intermediate-risk category and transplanted without additional survival benefit. With rigorous efforts towards better risk stratification in the intermediate-risk group, several studies reported adverse outcomes in patients with RUNX1 and ASXL1 mutations [25][26][27][28][29][30] . These reports convinced the ELN panel and led to the re-classification of intermediate-risk patients harboring these two mutations into the poor-risk group 5 . However, when we stratified our cohort based on RUNX1/ASXL1 mutational status, no survival difference was appreciated. More importantly, performance of HCT did not extend survival at a significant level. These results might be attributed to two factors. First, our cohort was selected for patients who achieved CR, and mutations in RUNX1/ASXL1 are known to be associated with refractory disease. Therefore, the poor prognostic impact of these mutations might disappear with the achievement of CR. Second, the lack of survival difference, as well as the absence of a survival benefit in transplanted patients, might be due to the relatively low number of patients enrolled. However, an important biological characteristic reported by multiple studies is that the mutations in RUNX1 and ASXL1 tend to co-occur, but are mutually exclusive with mutations in NPM1 and CEBPA 2,3,12 .
While previous reports have shown independent prognostic value of RUNX1 and ASXL1, some focused only on cytogenetically normal AML without adjusting for NPM1 and biallelic CEBPA in multivariable models 26,29 , while in other reports, the impact on OS was lost when adjusted for these mutations 25,27,30 . Therefore, the survival differences in these studies are skewed, as RUNX1/ASXL1 mutated intermediate-risk patients were compared with those who had favorable NPM1 and CEBPA mutations. Based on the results of the present study and the issues highlighted in previous reports, the utility of RUNX1 and ASXL1 mutations in stratifying intermediate-risk AML is debatable, and further investigation of HCT in patients with these mutations is warranted.
There are potential limitations in our work, mainly related to the retrospective nature of this study. These include missing sequencing data and samples in a proportion of patients, and different types of transplantation and conditioning regimens. Furthermore, in the absence of a matched control sample, distinguishing somatic and germline variants is challenging. Despite these limitations, clinical and molecular data were available in the majority of the original patient population, and the results were successfully validated in an external cohort. In addition, the landscape of truly somatic mutations in the analyzed genes has been well defined from large-scale genomic studies, which allowed us to make confident predictions. Finally, comparisons of HCT should be treated with caution, since no statistical method can adjust for the unmeasured selection factors involved in a retrospective analysis.
Collectively, our results provide evidence of clinical utility in considering mutation screening to stratify intermediate-risk AML patients after CR1 to guide therapeutic decisions. Mutations in DNMT3A, U2AF1, and EZH2 might be useful to select patients who would benefit from HCT. On the contrary, RUNX1 and ASXL1 mutations were not as useful for identifying patients with poor prognosis. A prospective validation of our findings is needed, and we believe that our results may contribute to improving prognostication of patients with AML and the design of clinical trials.