Prognostic value of measurable residual disease monitoring by next-generation sequencing before and after allogeneic hematopoietic cell transplantation in acute myeloid leukemia

Given limited studies on next-generation sequencing-based measurable residual disease (NGS-MRD) in acute myeloid leukemia (AML) patients after allogeneic hematopoietic stem cell transplantation (allo-HSCT), we longitudinally collected samples before and after allo-HSCT from two independent prospective cohorts (n = 132) and investigated the prognostic impact of amplicon-based NGS assessment. Persistent mutations were detected pre-HSCT (43%) and 1 month after HSCT (post-HSCT-1m, 20%). All persistent mutations at both pre-HSCT and post-HSCT-1m were significantly associated with post-transplant relapse and worse overall survival. Changes in MRD status from pre-HSCT to post-HSCT-1m indicated a higher risk for relapse and death. Isolated detectable mutations in genes associated with clonal hematopoiesis were also significant predictors of post-transplant relapse. The optimal time point of NGS-MRD assessment depended on the conditioning intensity (pre-HSCT for myeloablative conditioning and post-HSCT-1m for reduced-intensity conditioning). Serial NGS-MRD monitoring revealed that most residual clones at both pre-HSCT and post-HSCT-1m in patients who never relapsed disappeared after allo-HSCT. Reappearance of mutant clones before overt relapse was detected by the NGS-MRD assay. Taken together, NGS-MRD detection has a prognostic value at both pre-HSCT and post-HSCT-1m, regardless of the mutation type, depending on the conditioning intensity. Serial NGS-MRD monitoring was feasible to compensate for the limited performance of the NGS-MRD assay.


Introduction
Disease relapse remains the major cause of treatment failure in acute myeloid leukemia (AML) treated with allogeneic hematopoietic stem cell transplantation (allo-HSCT) 1 . Measurable residual disease (MRD) detection is highly valuable in predicting relapse and survival in AML patients in complete remission (CR) 2,3 . However, real-time quantitative polymerase chain reaction-based detection has limited applicability for some targets, and multiparametric flow cytometry is difficult to standardize. Thus, novel means of detecting MRD that can be standardized and applied to all AML patients are needed. In this context, next-generation sequencing (NGS) enables reliable detection of patient-specific mutations covering complete genes at both the time of diagnosis and CR 4 .
However, recent studies have demonstrated not only the consistent prognostic value of NGS-based MRD (NGS-MRD) detection, but also its limitations related to its sensitivity and specificity and its inability to correctly discriminate between residual leukemia and clonal hematopoiesis 5-13 . In earlier studies, NGS-MRD assessments were performed after high-dose induction treatment, which may be a suitable approach for selecting the appropriate consolidation treatment 8,9,12 . A few studies have shown that NGS-MRD detection in the setting of allo-HSCT can help predict clinical outcomes 7,10,11,13,14 . However, the NGS-MRD assessments in those studies were generally only performed pre-transplantation, and no NGS-MRD data after transplantation were collected except for one study 10 . Moreover, the results were discordant, possibly due to differences in cohorts, sample sources (bone marrow (BM) or peripheral blood (PB)), NGS techniques, the definition of MRD positivity, and strategies for mutations related to clonal hematopoiesis (i.e., clonal hematopoiesis of indeterminate potential (CHIP)). These mutations include DNMT3A, TET2, and ASXL1 (DTA), IDH2, IDH1, and other less prevalent mutations, such as JAK2, CBL, SRSF2, and SF3B1 7,11,13,14 . Thus, the prognostic value of NGS-MRD in the setting of allo-HSCT in AML is yet to be fully elucidated. In addition, the clinical impact of dynamic changes in persistent mutations before and after allo-HSCT has not been clarified to date. The influence of different transplant strategies, such as that of the conditioning intensity, also needs to be properly evaluated in prospective studies.
Thus, this study aimed to investigate the role of NGS-MRD detection in the setting of allo-HSCT to ultimately elucidate the optimal time points, cutoff values, candidates, role of DTA or CHIP, and influence of transplant strategy. Towards this goal, we collected samples and clinical data from two independent prospective cohorts and longitudinally tracked clonal changes before and after allo-HSCT.

Study design and patients
This study evaluated 146 patients with AML who underwent allo-HSCT at CR in two independent prospective cohorts in the Catholic Hematology Hospital between 2013 and 2018. The inclusion criteria were age over 19 years and availability of BM DNA both at diagnosis and in CR before allo-HSCT. Cohort 1 included patients who received transplants from matched unrelated donors and haploidentical familial donors. Cohort 2 included patients who received transplants from similar donors to those in Cohort 1 plus transplants from matched sibling donors. The patient inclusion process is shown in Supplementary Fig. S1.
Samples were obtained at the following time points: (1) the time of diagnosis, (2) pre-HSCT (before conditioning therapy, median: 27 days before transplantation, range: 10-42 days), (3) post-HSCT (1, 3, and 6 months and yearly thereafter), and (4) at relapse. Among the 132 (90%) patients who had somatic mutations at diagnosis, 114 (86%) had available BM DNA at 1 month after allo-HSCT (post-HSCT-1m, median: 28 days after transplantation, range: 26-30 days). Cohort 1 had a higher incidence of CR2, and Cohort 2 included more elderly patients. No other pre-transplant characteristics significantly differed between the two cohorts (Supplementary Table S1). The treatment courses and transplantation procedures were performed as previously described 15 . The Institutional Review Board of the Catholic Medical Center approved the current study. Informed consent was obtained from all subjects and all analyses were performed according to the Institutional Review Board guidelines and the tenets of the Declaration of Helsinki. Cohorts 1 and 2 were registered at ClinicalTrials.gov (#NCT01751997) and the Clinical Research Information Service (#KCT0002261), respectively.

NGS-MRD detection
NGS analysis was performed using St. Mary's customized NGS panel for acute leukemia, i.e., the "SM Acute leukemia panel." Ion AmpliSeq Technology (Thermo Fisher Scientific) was used to amplify 67 genes (Supplementary Table S2) using an Ion Chef™ system (Thermo Fisher Scientific) and an Ion S5 XL Sequencer (Thermo Fisher Scientific) 16 .
Annotated variants were classified into four tiers according to the Standards and Guidelines of the Association for Molecular Pathology 17 . Bioinformatics analysis was carried out using both customized and manufacturer-provided pipelines. Variants were selected and annotated using analytics algorithms and public databases 18 . Subsequently, trackable somatic mutations specific to each patient were selected. For NGS-MRD, we carefully inspected the mutations and determined the residual variant allele fraction (% VAF), which was calculated by dividing the number of mutant sequencing reads by the number of total sequencing reads. All mutations were manually verified using the Integrative Genomics Viewer 19 . Across all time points, the mean of on-target reads, depth of on-target regions, and uniformity were 99.4%, 2406×, and 96.9%, respectively. Details of the quality control metrics are summarized in Supplementary Table S3.
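The VAF definition above can be written out as a short calculation. The following is a minimal sketch, not the pipeline used in the study; the function name `vaf_percent` and the read counts are hypothetical and chosen only for illustration.

```python
def vaf_percent(mutant_reads: int, total_reads: int) -> float:
    """Return the variant allele fraction as a percentage of total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must lie between 0 and total_reads")
    return 100.0 * mutant_reads / total_reads


# Hypothetical read counts (assumption, not study data).
example = vaf_percent(mutant_reads=12, total_reads=2400)
print(f"{example:.2f}%")
```

With these illustrative counts, 12 mutant reads out of 2400 total reads correspond to a VAF of 0.50%.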

Statistical analysis
Categorical variables were compared using the Chi-square test or Fisher's exact test, while continuous variables were analyzed with the Mann-Whitney U test. Overall survival (OS) and disease-free survival (DFS) curves were plotted using the Kaplan-Meier method and analyzed with the log-rank test. The cumulative incidence of relapse (CIR) and non-relapse mortality (NRM) were estimated with cumulative incidence functions, treating non-relapse death and relapse as competing risks for relapse and NRM, respectively. Cumulative incidence was compared across groups using the Gray test. Results were expressed as the hazard ratio with a 95% confidence interval (95% CI). For multivariate analysis, variables with a p-value of <0.10 in the univariate analysis were entered into a Cox proportional hazards model or proportional hazards model for a sub-distribution of competing risk factors. A detailed description is provided in the Supplementary materials. All statistical analyses were performed using SPSS, version 13.0 (SPSS, Inc., Chicago, IL) and R software (version 3.4.1, R Foundation for Statistical Computing, 2017).
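To illustrate how a cumulative incidence estimate handles competing risks, the sketch below implements the standard nonparametric (Aalen-Johansen type) estimator in plain Python. It is an illustration under stated assumptions, not the SPSS or R code used in the study: the function name `cumulative_incidence`, the event coding (0 = censored, 1 = relapse, 2 = non-relapse death), and the six-patient data set are all hypothetical, and the Gray test and regression models are not implemented here.

```python
from collections import Counter


def cumulative_incidence(times, events, cause):
    """Nonparametric cumulative incidence of `cause` under competing risks.

    times  : follow-up time for each patient
    events : 0 = censored, 1 = relapse, 2 = non-relapse death (assumed coding)
    cause  : event code of interest
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    surv = 1.0  # probability of being free of any event just before time t
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(c for e, c in counts.items() if e != 0)
        if d_any:
            # Increment: event-free so far, then the event of interest at t.
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= sum(counts.values())
    return curve


# Hypothetical follow-up data (assumption, not study data).
times = [2, 3, 3, 5, 8, 10]
events = [1, 2, 0, 1, 0, 1]
relapse_curve = cumulative_incidence(times, events, cause=1)
nrm_curve = cumulative_incidence(times, events, cause=2)
print(relapse_curve[-1], nrm_curve[-1])
```

Unlike one minus a Kaplan-Meier estimate that censors the competing event, the increments here are weighted by the probability of being free of any event, so the CIR and NRM estimates cannot sum to more than 1.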

Patient characteristics
Overall, 132 pre-HSCT and 114 post-HSCT-1m patients underwent NGS-MRD. Persistent mutations were detectable in 43% and 20% of pre-HSCT and post-HSCT-1m samples, respectively. Table 1 describes the characteristics of the patients with or without persistent mutations at each time point. Persistent pre-HSCT mutations were more frequent in Cohort 2 than in Cohort 1, whereas there was no significant between-group difference in the rate of persistent post-HSCT-1m mutations. No significant differences in patient characteristics were observed according to the presence of persistent mutations at each time point, except for older age at pre-HSCT and a greater incidence of CR2 at post-HSCT-1m in patients with persistent mutations.

Landscape of somatic mutations and dynamic changes in allelic burden at diagnosis and during the peri-transplant period
The genetic landscapes of all patients are shown in Fig. 1a and Supplementary Fig. S2. We detected a total of 389 somatic mutations in 47 genes of the 132 patients, with a median of 3 mutations (interquartile range, IQR: 2-4 mutations) per patient. The most common somatic mutation was in CEBPA, followed by that in DNMT3A, NPM1, and NRAS. The median VAF of mutations in the initial samples was 34.39% (IQR: 10.80-45.87%). In paired pre-HSCT samples, we detected 97 mutations in 57 patients, including 90 mutations detected in initial samples and 7 CR-specific mutations not present in initial samples. The median VAF of mutations in the pre-HSCT samples was 2.69% (IQR: 0.38-16.36%). We next analyzed post-HSCT-1m mutations and detected 26 mutations in 23 patients, with the most common being mutations in DNMT3A and TET2. The median VAF of mutations in the post-HSCT-1m samples was 0.19% (IQR: 0.13-0.60%). We observed a significant reduction in VAF from diagnosis, pre-HSCT, to post-HSCT-1m (Fig. 1b). Particularly, allo-HSCT had a significant impact on the remaining pre-HSCT mutations, clearing additional DNMT3A (16/19, 84%), TET2 (13/14, 71%), and ASXL1 (2/2, 100%) mutations. Summaries of the MRD clearance rate of each mutation are shown in Supplementary Table S4. By molecular pathway, chromatin/cohesion, DNA methylation, and RNA splicing had lower mutational clearance at pre-HSCT. They were further cleared (over 80% clearance) at post-HSCT-1m.
We next classified patients into three groups according to pre-HSCT and post-HSCT-1m NGS-MRD status as follows (Fig. 2b): persistent MRD positivity group (n = 21), negative conversion of MRD positivity group (n = 30), and persistent MRD negativity group (n = 61). The risk of relapse was greatest in the persistent MRD positivity group and least in the persistent MRD negativity group (Fig. 2c). Survival analysis also showed significantly different DFS and OS among the three groups (Fig. 2d). This was supported by the results of the multivariate analysis (Supplementary Table S7). Of the two patients showing positive conversion of post-HSCT-1m NGS-MRD, one died of relapsed AML.
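The grouping by paired MRD status can be expressed as a small lookup. The sketch below is illustrative only: the function name `mrd_trajectory`, the boolean inputs (taken here to mean that a trackable mutation was detected at that time point), and the example patients are assumptions rather than study data. It also labels the positive conversion pattern seen in two patients.

```python
from collections import Counter


def mrd_trajectory(pre_positive: bool, post_positive: bool) -> str:
    """Label a patient by NGS-MRD status at pre-HSCT and post-HSCT-1m."""
    if pre_positive and post_positive:
        return "persistent MRD positivity"
    if pre_positive:
        return "negative conversion"
    if post_positive:
        return "positive conversion"
    return "persistent MRD negativity"


# Hypothetical (pre-HSCT, post-HSCT-1m) statuses for five patients.
patients = [(True, True), (True, False), (False, False), (False, False), (False, True)]
group_counts = Counter(mrd_trajectory(pre, post) for pre, post in patients)
for group, n in group_counts.items():
    print(f"{group}: n = {n}")
```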

Effects of conditioning intensity on the prognostic value of NGS-MRD detection at each time point
Given the significant differences in patient age and transplant-related characteristics (Supplementary Table S8) and the different degrees of dependence of transplant outcomes on graft-versus-leukemia effects according to conditioning intensity, we evaluated the impact of NGS-MRD at each time point according to conditioning intensity. In patients who received myeloablative conditioning (MAC, n = 58), pre-HSCT NGS-MRD detection was significantly associated with post-transplant relapse (Fig. 4a). However, there was no difference in relapse according to post-HSCT-1m NGS-MRD status. This may be partially attributable to the higher rate of NRM in NGS-MRD-positive patients than that in NGS-MRD-negative patients (Fig. 4b). In contrast, in patients who received reduced-intensity conditioning (RIC, n = 74), post-HSCT-1m NGS-MRD detection was significantly associated with post-transplant relapse, while there was no difference in relapse according to pre-HSCT NGS-MRD status (Fig. 4c). There was no difference in NRM according to NGS-MRD status at each time point in the RIC group (Fig. 4d). Consequently, in the MAC group, survival was significantly worse in the pre-HSCT NGS-MRD-positive patients than that in the NGS-MRD-negative patients. In the RIC group, post-HSCT-1m NGS-MRD-positive patients had worse survival than the NGS-MRD-negative patients (Supplementary Fig. S6).

Clonal dynamics of mutations including later clearance and evolution after transplantation
We found that 67% (n = 38) of patients who were NGS-MRD positive at pre-HSCT and 52% (n = 12) of those who were NGS-MRD positive at post-HSCT-1m did not experience relapse (Supplementary Table S9). Among the 38 patients who were NGS-MRD positive at pre-HSCT, 23 (60.5%) converted to being NGS-MRD negative at post-HSCT-1m. The negative conversion rate did not significantly differ according to conditioning intensity (MAC vs. RIC: 58% vs. 65%). Of the 12 (28.9%) patients with persistent mutations at post-HSCT-1m, 7 (58%) had DTA mutations. We performed an NGS-MRD assay on BM samples taken 3 months after transplantation in 11 of these 12 patients. Ten patients became NGS-MRD negative, whereas one patient still had a persistent DNMT3A mutation (Fig. 1c).
BM samples at relapse were available in 17 patients (Supplementary Table S10). Most (16/17, 94%) of these patients had some or all of the same mutations at the time of both diagnosis and relapse. Longitudinal tracking revealed the appearance of detectable mutations at 2 or 3 months before relapse in three patients (#87, Fig. 1d; #89, Fig. 1e; #116, Fig. 1f). In addition, one patient (#90) with a KRAS mutation at initial diagnosis showed a DNMT3A mutation at post-HSCT-1m, which was thought to be of donor origin (Fig. 1g). The VAF of the DNMT3A mutation was markedly decreased at relapse, while three clonal mutations of the FLT3, NRAS, and PTPN genes had evolved at 29 months post-transplant.

Discussion
We evaluated the prognostic value of the NGS-MRD assay in AML patients who underwent allo-HSCT at CR in two independent prospective cohorts. NGS-MRD detection had prognostic value at both pre-HSCT and post-HSCT-1m, in each cohort, irrespective of mutation type, including DTA or CHIP mutations. Notably, we demonstrated that the prognostic impact of detectable mutations at each time point depended on the conditioning intensity and provided evidence for the benefit of serial NGS-MRD monitoring after allo-HSCT.
There is limited evidence on the prognostic value of dynamic changes in mutational clones detected by the NGS-MRD assay in the setting of allo-HSCT in AML. In this study, mutational dynamics assessed by the NGS-MRD assay before and after transplantation showed a profound decrease in VAFs, but a relatively high persistence of DTA and CHIP mutations. However, most remaining pre-HSCT mutations, even DTA and CHIP mutations, disappeared after allo-HSCT. Any persistent mutations at pre-HSCT and post-HSCT-1m were significantly associated with post-transplant relapse and worse survival. Moreover, changes in MRD status from pre-HSCT to post-HSCT-1m enabled further identification of patients at higher risk for relapse and worse survival. These investigations, including dynamic changes in NGS-MRD status, are distinct from previous reports on the NGS-MRD assay in the setting of allo-HSCT, which contained no post-transplantation data and suggested prognostic value of NGS-MRD at pre-HSCT. One study emphasized the prognostic value of post-transplant NGS-MRD (at 21 days after allo-HSCT) rather than NGS-MRD at pre-HSCT 10 . Given those discordant data and the limitations of previous studies, such as the retrospective nature of data from smaller cohorts 7,11,13 or the use of PB rather than BM at a single time point 11,14 , the reliability of our data is supported by consistent results in two independent cohorts and the use of BM for the NGS-MRD assay.
Persistent DTA mutations are considered to be due to clonal hematopoiesis rather than residual leukemia. They have limited prognostic value after high-dose induction treatment 8,9,12 . Given the discordant findings on the role of persistent DTA mutations at pre-HSCT and the scarcity of information on the role of such mutations at post-HSCT in previous studies 10,11,13,14 , our data based on NGS-MRD detection clearly demonstrated the prognostic impact of persistent mutations in any gene (DTA or CHIP) at both pre-HSCT and post-HSCT-1m. Our findings suggest that these mutations are reliable MRD markers of post-transplant relapse. Allo-HSCT is a therapeutic approach to changing a patient's hematopoietic system with donor tissue. Thus, any mutation, even one associated with clonal hematopoiesis, is expected to disappear if AML is cured. This idea is supported by the eventual clearance of persistent DTA and CHIP mutations in patients who never relapsed in our study.
The prognostic impact of the conditioning intensity on NGS-MRD detection was addressed in a phase III trial (BMT CTN 0901) that compared MAC with RIC. The results of that trial showed that pre-transplant NGS-MRD detection in PB is associated with post-transplant relapse in patients who undergo RIC, rather than in those who undergo MAC 14 . However, the trial did not include post-transplant NGS-MRD detection data. As such, the findings need to be validated as a limited number of genes (n = 13) were sequenced from pre-HSCT blood DNA, with no data at diagnosis or post-HSCT. In addition, the CIR of the RIC group (47% at 1 year) in the trial was higher than that in other randomized phase III trials (17-30%) 20,21 . The current study, which sequenced a broader array of 67 genes from BM DNA at multiple time points during the peri-transplant period, demonstrated that persistent pre-HSCT mutations were associated with post-transplant relapse in patients who received MAC rather than in those who received RIC. Meanwhile, persistent post-HSCT-1m mutations were associated with post-transplant relapse in patients who received RIC rather than in those who received MAC. The limited impact of persistent post-HSCT-1m mutations might be biased by the high NRM in MRD-positive patients. A recent study on patients who received mostly MAC showed the significance of NGS-MRD at 21 days after HSCT 10 . Meanwhile, the prognostic value of NGS-MRD clearly differed according to time point (better for post-HSCT-1m than that for pre-HSCT) in the RIC group. The reliability of these findings is supported by the lack of difference in NRM rate according to MRD status in our study and the similarity between the CIR (20%) in this study and that of the RIC groups in previous randomized phase III trials 20,21 . These results suggest that the prognostic impact of NGS-MRD at pre-HSCT depends on the conditioning intensity in the opposite manner to that shown in the BMT CTN 0901 study 14 .
Later time points appeared to be more reliable for NGS-MRD detection in the RIC group, which was more susceptible to graft-versus-leukemia effects than the MAC group. Further studies are needed to identify the precise effect of conditioning intensity on NGS-MRD results at different time points, using prospective cohorts of patients who are evenly distributed between the MAC and RIC groups.
We used conventional NGS-MRD and found that the most valuable VAF cutoff was 0% at both pre-HSCT and post-HSCT. At this cutoff, sensitivity was improved due to exclusion of mutations with low read depth, high background error rate, and allelic imbalance 18 . Using the NGS-MRD assay, we demonstrated the later clearance of persistent mutations after allo-HSCT, indicating a graft-versus-leukemia effect. Of note, we found that the NGS-MRD assay enables the detection of mutations before an overt relapse. Moreover, longitudinal analyses of relapsed samples revealed various conditions including different responses of mutations to treatment, mutational selection after treatment, and evolution of mutations during the peri-transplant period, thus increasing the utility of NGS-MRD. Interestingly, we were able to schematize donor-originated clonal hematopoiesis in detail, which could be discriminated from donor cell leukemia because it appeared just after allo-HSCT and disappeared during relapse. Taken together, these data provide evidence for the validity of serial NGS-MRD monitoring after allo-HSCT, although the technique needs to be upgraded with improved sequencing methods with higher sensitivity and a minimal error rate 22,23 .
In conclusion, persistent mutations at both pre-HSCT and post-HSCT-1m were associated with high risks of relapse and mortality regardless of mutation type, including DTA and CHIP. The optimal time point of NGS-MRD assessment depended on the conditioning intensity (pre-HSCT for MAC and post-HSCT-1m for RIC). Serial NGS-MRD monitoring after transplantation is a feasible way to compensate for the limited sensitivity and specificity of conventional NGS. The usefulness of NGS-MRD monitoring will facilitate trials investigating the feasibility of MRD-driven decision-making for risk-adapted approaches to reducing relapse in AML.