Clonal evolution and clinical implications of genetic abnormalities in blastic transformation of chronic myeloid leukaemia

Blast crisis (BC) predicts dismal outcomes in patients with chronic myeloid leukaemia (CML). Although additional genetic alterations play a central role in BC, the landscape and prognostic impact of these alterations remain elusive. Here, we comprehensively investigate genetic abnormalities in 136 BC and 148 chronic phase (CP) samples obtained from 216 CML patients using exome and targeted sequencing. One or more genetic abnormalities are found in 126 (92.6%) out of the 136 BC patients, including the RUNX1-ETS2 fusion and NBEAL2 mutations. The number of genetic alterations increases during the transition from CP to BC, and this increase is markedly suppressed by tyrosine kinase inhibitors (TKIs). The lineage of the BC and prior use of TKIs correlate with distinct molecular profiles. Notably, genetic alterations, rather than clinical variables, contribute to a better prediction of BC prognosis. In conclusion, genetic abnormalities can help predict clinical outcomes and can guide clinical decisions in CML.

Chronic myeloid leukaemia (CML) is a myeloproliferative disorder caused by the BCR-ABL1 gene fusion generated in the Philadelphia (Ph) chromosome, der(22)t(9;22)(q34;q11.2). Recently, the prognosis of CML has been dramatically improved by the development of tyrosine kinase inhibitors (TKIs) targeting the BCR-ABL1 fusion protein. However, a minority of patients in the chronic phase (CP) fail to respond to TKI therapy, progress to blast crisis (BC), and show dismal clinical outcomes 1 . While a mutation in the BCR-ABL1 kinase domain is known to be one of the major determinants of TKI resistance and a risk for blastic transformation 2 , additional genetic alterations have been hypothesised to be necessary for the progression to BC. In fact, recent studies have demonstrated several driver mutations acquired during blastic transformation 3-17 . However, the current understanding of the genetic basis of TKI resistance and progression of CML-CP to BC remains limited by the small number of patients and/or genes analysed in each study, as well as the paucity of matched CP and BC samples.
Another point of interest is the improved 5-year overall survival (OS) of BC patients from 16% during 2000-2004 to 33% during 2010-2016, which may be attributed to increased use of TKIs 18 . TKIs may be effective for certain patients with BC and contribute to improved survival; however, the majority of BC patients no longer show response to TKIs. If this is indeed the case, it is of considerable clinical importance to predict which BC patients can respond to TKI, for better management of CML. Unfortunately, only a few clinical factors or biomarkers are currently known to be correlated with clinical outcomes of BC patients treated with TKI-based regimens.
In this study, we investigated a large cohort of CML patients to reveal the landscape of genetic lesions in CML during both CP and BC. We aimed to identify genetic alterations that correlated with TKI-resistance and blastic transformation, as well as those that predicted clinical outcomes. For this, genetic alterations in both CP and BC samples, including paired CP and BC samples, were analysed using unbiased sequencing.

Results
Clonal evolution of CML. First, we performed whole-exome sequencing (WES) of paired CP and BC samples obtained from 52 patients with CML, to identify genetic alterations that were relevant to the clonal evolution to CML-BC, at a mean depth of 157 ( Fig. 1a, Supplementary Fig. 1a, and Supplementary Table 1). On average, 5.3 (0-21) nonsynonymous single-nucleotide variants (SNVs; or 0.088/Mb) were acquired during disease progression from CP to BC, with a median time of 26.7 months (0.7-155.1; Fig. 1b and Supplementary Fig. 1b). Notably, a Poisson regression model revealed that the number of mutations acquired during progression from CP to BC was independently and positively correlated with the interval between the progression (P = 9.4 × 10 −12 ), and negatively correlated with TKI therapy after CP diagnosis (P = 9.3 × 10 −3 ; Fig. 1b). The correlation with the number of recurrent mutations in CML-BC was not clear (Supplementary Fig. 1c). A similar trend was observed in a previous cohort of paired samples 13 , even though our findings did not show statistical significance owing to the small number of evaluable samples after quality control (n = 13; Supplementary  Fig. 1d). These results suggest that transformation from CP to BC is associated with accumulation of somatic mutations with time in the absence of effective therapy, and this accumulation is noticeably suppressed by TKI therapy, which may prevent the transformation from CP to BC.
In the CML-BC samples, mutations were frequently found in the driver genes implicated in myeloid malignancies, including RUNX1, ABL1, ASXL1, BCOR/BCORL1, TP53, and WT1 ( Fig. 1c and Supplementary Fig. 1e). We also identified recurrent mutations in recently reported genes in BC, such as UBE2A 13,14 and SETD1B 13 , as well as in previously unreported genes, such as KLC2 and NBEAL2. Deep amplicon sequencing of these mutations at a mean depth of ×2589 in paired CP and BC samples revealed that ASXL1 mutations were already present in the CP samples, whereas other major drivers, including RUNX1, ABL1, and TP53 mutations were initially absent in CP and emerged during progression to BC (Fig. 1d). In a few patients, mutations in other genes, such as WT1 and IDH2, were also found in the corresponding CP samples with lower tumour cell fractions (TCFs) calculated by variant allele frequencies (VAFs) than those found in the BC samples ( Supplementary Fig. 1f). These results suggest distinct roles for the different mutations in the progression of CML. TCFs of ASXL1 mutations were increased in nine, decreased in three, and almost stable (<10% difference) in three patients during disease progression from CP to BC. Almost all patients with ASXL1 mutations showed acquisition of other additional genetic abnormalities during progression to BC (93.3%, 14 out of 15 cases), including mutations (12/15) in RUNX1 (4/15), TP53 (3/15), BCOR (2/15), and SETD1B (2/15) genes. Typically, at least one accompanying mutation had TCFs comparable to ASXL1 mutations and was probably present in the major clones in the BC samples ( Supplementary Fig. 2). Therefore, ASXL1-mutated CP clones may be preferentially selected and may evolve by acquiring other drivers during the clonal development to BC.
We also performed sequencing-based copy-number analysis 19 . Copy-number alterations (CNAs) were frequently found in BC, but were rarely present in CP (Fig. 2a), suggesting that CNAs were also acquired during progression from CP to BC. Frequently identified CNAs in BC included −7/del(7p), +8, del(17p), amp (17q), +21, and an extra Ph chromosome (+Ph). We also analysed structural variations (SVs) close to the gene bait regions based on an algorithm, utilising both breakpoint-containing junction read pairs and improperly aligned read pairs 20 . Although the ability to detect SVs in our pipeline depends on the location of breakpoints and gene baits (see "Methods"), this approach led to the identification of an inversion event, resulting in a RUNX1-ETS2 fusion in a patient with myeloid BC, which was confirmed by performing reverse transcription PCR (RT-PCR) and Sanger sequencing ( Fig. 2b and Supplementary Fig. 3). Quantitative RT-PCR (RT-qPCR) at several time points demonstrated that this fusion was already present at the time of CP diagnosis, and the burden of this fusion, which was reduced after successful chemotherapy for CML-BC, was correlated with that of BCR-ABL1 (Fig. 2c). Thus, RUNX1-ETS2 may play a role in the rapid progression of CML-CP to BC.
Genetic landscape of CML-BC. Next, we performed targeted capture sequencing that covered 104 myeloid tumour-associated genes, including candidates of drivers found by WES in an additional 60 BC and 19 CP samples at a mean depth of ×585 (Supplementary Table 2 involved genes implicated in epigenetic regulation and signalling, such as chromatin modification, DNA methylation, transcription factors, the cohesin complex, and signalling pathways. In addition to KLC2 and NBEAL2, we found another recurrent mutational target, PHIP, which was mutated in two patients. Sequencing-based copy-number analysis disclosed the presence of complex CNAs (defined as ≥3 abnormal CNAs) in 26.5% of the BC patients, although complex karyotypes in CML-BC have previously been reported at a lower rate (10-12%) based on conventional karyotyping 23 , whose resolution is relatively limited.
We further explored the relationship between the genetic abnormalities and the lineage of BC (Fig. 3). Certain lesions, such as those attributed to +21, +8, +19, and ASXL1 and TP53 mutations, were enriched in myeloid crisis compared to those in lymphoid crisis, while others were enriched in lymphoid BC, which included CDKN2A/B and IKZF1 deletions, −7/del(7p), and −9/del(9p). In contrast, abnormalities such as RUNX1 mutations and +Ph were almost equally observed in both crises. The rearrangement of immunoglobulin and/or T-cell receptor genes, which was assessed by WES, was confirmed in most evaluable lymphoid samples and in a few samples of myeloid crisis. Analysis of pairwise correlations between the genetic lesions showed the existence of co-occurring patterns depending on the combination of lesions ( Supplementary Fig. 6). Among these, the most conspicuous correlations were those observed between +6, +8, +19, and +21, which were highly specific to myeloid crisis. Another visible correlation was between −7/del(7p), −9/del(9p), and CDKN2A/B deletions, which were highly specific to lymphoid crisis. We also found that del(17p) co-occurred with TP53 mutations, +Ph, +8, and amp(17q). Taken together, although substantial overlaps in genetic lesions were observed, myeloid and lymphoid BC cases were distinct in terms of lineage commitment, as well as their molecular profiles.
Fig. 2c caption (partial): The horizontal and vertical axes represent the time from CP diagnosis (Dx) and levels of the indicated transcripts, respectively. The patient received imatinib (Ima) treatment after CP diagnosis and achieved CHR, while the BCR-ABL1 burden was not reduced significantly. Approximately 15 months after CP diagnosis, there was an abrupt development of BC, which was successfully treated with cytarabine and hydroxyurea in addition to imatinib. Thereafter, BCR-ABL1 transcript levels showed continuous reductions, and were below the threshold of CMR at 15 years after BC diagnosis. The RUNX1-ETS2 transcript was detected at the time of CP diagnosis and increased markedly during BC development, declined following chemotherapy with imatinib, and was undetectable 4 years after BC transformation.
We next investigated whether a prior history of TKI therapy influenced the genetic profile in BC by comparing genetic alterations in BC patients with and without a prior history of TKI therapy. As expected, ABL1 mutations were almost exclusively found in patients who had received TKIs, representing the most frequent mutation (Supplementary Fig. 7). All [...] Table 2).
Univariate analysis of genetic lesions observed in >5% of the patients revealed a negative prognostic impact of ASXL1 mutations, del(17p), i(17q) (isochromosome 17q, resulting in one copy of 17p and three copies of 17q), +19, +21, hyperdiploidy (as defined by presence of ≥48 chromosomes assessed by sequencing-based copy-number profiling), and complex CNAs (Table 2). Conspicuously, patients with concurrent TP53 mutations and del(17p), which were predictive of biallelic targeting of TP53, and i(17q), showed an especially grim outcome (Fig. 4). Consistent with a previous report demonstrating the association of multiple-hit TP53 mutations with complex karyotypes and poor outcomes in myelodysplastic syndromes 25 , three out of four cases with biallelic TP53 mutations in CML-BC patients had a complex karyotype. After adjustment for blast lineage and TKI-based therapy to correct for the possible effects of an association between the clinical and genetic factors, complex CNAs, −7/del(7p), and amp(17q) showed significant association with OS (Table 2 and Supplementary Fig. 8). As shown in Fig. 5a, several genetic lesions were considerably more strongly associated with the lineage of BC, prior TKI-based treatment history, and/or prognosis in CML-BC than clinical features, such as age, sex, white blood cell (WBC) count, haemoglobin levels, and platelet count. Therefore, genetic abnormalities may be good biomarkers for predicting clinical outcomes in BC patients.
We next analysed survival, focusing on 59 TKI-treated patients, because TKI-based therapy has been shown to significantly improve OS and thus is the current therapy of choice for patients with CML-BC 1,18,26 . In the univariate analysis, WBC count, lineage of blasts, ASXL1 and BCOR mutations, complex CNAs, del(17p), i(17q), +19, and +21 were significantly associated with OS (Supplementary Fig. 9a). We also evaluated the relative effects of genetic alterations using Cox proportional hazard regression modelling with a standard backward selection of clinical and genetic variables, and identified ASXL1 mutations, complex CNAs, i(17q), and +21 as independent predictors of worse prognosis (Fig. 5b). To internally validate this finding, modelling was repeated 100 times by bootstrapping, in which all four variables of the final model were selected at a frequency of >70%, with a mean concordance statistic of 0.74 (Supplementary Fig. 9b). Based on the number of these unfavourable factors, TKI-treated BC patients were classified into three subgroups showing distinct prognosis, where the 2-year OS rate was 65.0%, 17.1%, and 0% for patients with 0, 1, and ≥2 unfavourable genetic risk factors, respectively (P = 3.9 × 10 −12 ; Fig. 5c).
We also analysed an independent external cohort reported in a recent publication 13 , in which 17 CML-BC patients were evaluable for survival with 12 receiving TKI-based therapy. Although the number of cases was limited, several similar associations were observed in the external cohort. TKI-based therapy was associated with better OS, while ASXL1 mutations and i(17q) predicted poor prognosis (Supplementary Fig. 10a). Even though the difference was not statistically significant, owing to the small number of samples analysed, patients with genetic risk factors tended to show a poor prognosis (Supplementary Fig. 10b). Therefore, our results suggest that genetic risk factors may help identify a subset of patients who may be refractory to TKI therapy.
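The three-tier stratification of TKI-treated BC patients by the number of unfavourable genetic factors can be sketched as a simple counting rule. The following is an illustrative sketch only: the lesion labels and the function name `risk_group` are hypothetical encodings chosen for this example, and the 2-year OS figures are simply the values quoted in the text, not a fitted model.

```python
# The four independent adverse factors named in the text (Fig. 5b).
# The string labels are an assumed encoding, not one used by the authors.
ADVERSE_FACTORS = ("ASXL1 mutation", "complex CNAs", "i(17q)", "+21")

# 2-year OS rates (%) quoted in the text for each subgroup (Fig. 5c).
REPORTED_2Y_OS = {"0 factors": 65.0, "1 factor": 17.1, ">=2 factors": 0.0}


def risk_group(lesions):
    """Assign a TKI-treated BC patient to one of three subgroups.

    `lesions` is any collection of lesion labels found in the patient.
    Only the four adverse factors above contribute to the count.
    """
    n = sum(1 for factor in ADVERSE_FACTORS if factor in lesions)
    if n == 0:
        return "0 factors"
    if n == 1:
        return "1 factor"
    return ">=2 factors"


if __name__ == "__main__":
    patient = {"RUNX1 mutation", "ASXL1 mutation", "+21"}
    group = risk_group(patient)
    print(group, REPORTED_2Y_OS[group])
```

Lesions outside the four-factor list (for example, a RUNX1 mutation) do not change the group in this sketch, which mirrors the final multivariate model described above.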
Genetic landscape and clinical outcome of CML-CP. Finally, we explored the effect of genetic abnormalities on the clinical outcomes of CML-CP patients. Our cohort contained more patients who ultimately developed BC (48%, 71/148) compared to other cohorts, because we intentionally included paired CP and BC samples to investigate the molecular pathogenesis of the clonal evolution in CML (Fig. 1, Fig. 6a, and Supplementary Table 3). The median age at diagnosis for patients with CP was 49 (14-88) years and 77.7% of the patients had received TKI therapy, with imatinib being the most commonly used therapeutic agent (80%). Based on the best response, TKI induced complete haematologic response (CHR) in 17.9%, major/complete cytogenetic response (MCyR/CCyR) in 16.9%, and major/complete molecular response (MMR/CMR) in 64.2% of the patients.
In total, additional genetic alterations were found in 25.7% of the patients with CP at diagnosis. As expected based on the analysis described above (Fig. 1), only ASXL1 was mutated in CP at a frequency comparable to that in BC, while other mutations, including TET2, KMT2D, PTPN11, RUNX1, and WT1, were found at much lower frequencies compared to those in BC (Fig. 6a, b, Supplementary Table 6, and Supplementary Data 2). Since only ASXL1 mutations were frequently found in CP and are common in age-related clonal haematopoiesis 27 , we evaluated the correlation between ASXL1 mutations and patient age. Consistent with previous reports 10,13,28 , many patients with ASXL1 mutations were younger than 60 years at the time of diagnosis of CP, as opposed to those with age-related clonal haematopoiesis. Moreover, there was no significant impact of age on the frequency of ASXL1 mutations in CML-CP patients (Fig. 6c). We also evaluated whether genetic alterations in CP could predict BC progression in patients treated with TKIs. Interestingly, patients who received TKI and later experienced BC progression more often harboured at least one mutation and/or CNA (38.5%, 15/39) than those without BC progression (15.8%, 12/76; P = 1.0 × 10 −2 ; Fig. 6d). Thus, even when rarely observed, genetic alterations in CP play a role in driving CML cells to undergo [...]
[Figure residue: forest-plot axis "Odds ratio (95% CI)" with P values; row labels are TKI history and the genetic lesions from RUNX1 to +Ph.]
[...] presented with a particularly dismal outcome, which was consistent with previous studies reporting poor outcomes in TP53-mutated patients with other myeloid neoplasms, such as acute myeloid leukaemia and myelodysplastic syndromes 19,25,31-33 , and in CML-BC patients with i(17q) 34-36 .
Fig. 3 caption (partial): Categories of mutations are depicted in different colours, and "multiple" indicates ≥2 distinct alterations found in the same gene in the same patient. The forest plot shows odds ratios with 95% confidence intervals (CI) for enrichment of each genetic lesion in myeloid BC. The dashed line represents an odds ratio of 1. Positive and negative odds ratios are indicated by red and blue colours, respectively. Genetic lesions found in >10 cases were included. P values were calculated using the Fisher's exact test. b Summary of genetic lesions in all 136 BC patients. Each column indicates one patient. Lineage of blasts and prior history of TKI therapy before BC diagnosis are also shown. Categories of alterations are depicted in different colours, and "multiple" indicates ≥2 distinct mutations found in the same gene in the same patient. The rearrangement of immunoglobulin (IG) and T-cell receptor (TCR) genes is shown in cases analysed by WES. NA not available.
Moreover, ASXL1 mutations, complex CNAs, i(17q), and +21 were independent predictors of poor prognosis in patients receiving TKI-based therapy based on multivariate analysis. Despite the limited number of samples analysed, our results indicate that genetic abnormalities can be effectively used as biomarkers to predict the outcome of BC patients to guide clinical decisions. As the ABL1 mutation itself was not associated with a worse prognosis of CML-BC and was almost always accompanied by other mutations, targeting other mutations in combination with TKIs may be a promising treatment strategy in the future. Of note, sensitivity to various drugs is associated with the mutational profile of patients with acute myeloid leukaemia 37 , raising the possibility that in the near future, treatment may be personalised depending on individual mutational profiles. Exome analysis of serial samples revealed that CML cells accumulated somatic mutations over time, which was markedly suppressed by TKI-based therapy.
Although the precise reason for this remains unclear, it is possible that the application of TKIs eliminated the rapidly cycling BCR-ABL1-harbouring cells while sparing the slow-cycling cells, which are less likely to accumulate somatic mutations. The TKI-mediated reduction in blastic transformation may be attributable to a drastic reduction in the number of tumour cells at risk of acquiring driver mutations for BC. However, our results suggest that the suppression of random mutational events may also contribute to the reduced risk of BC in TKI-treated patients.
As previously reported 3-17 , most CML-BC patients in our cohort acquired additional mutations, which not only included mutations previously reported in haematological malignancies and CML-BC, but also included previously unreported recurrent mutations, such as NBEAL2, PHIP, and KLC2 mutations. NBEAL2 encodes a protein with a role in megakaryocyte alpha-granule biogenesis and has been reported to be mutated in patients with Grey platelet syndrome 38-40 and factor XIII deficiency 41 . In contrast, to the best of our knowledge, mutations in PHIP and KLC2 have not been reported in the context of haematological disorders. We also identified a RUNX1-ETS2 fusion. RUNX1 rearrangements are well-known driver events, which were first identified in core-binding factor leukaemia (RUNX1-RUNX1T1) and are present in a variety of myeloid and lymphoid malignancies, which are represented by RUNX1-MECOM in CML-BC, ETV6-RUNX1 in B-precursor acute lymphoblastic leukaemia, and other fusions with unknown partners 42 . The fusion discovered in our cohort is another RUNX1 rearrangement involving ETS2, which encodes a key haematopoietic transcription factor. Considering that genetic alterations in RUNX1 are not restricted to SNVs, but also appear as CNAs and SVs, application of NGS-based techniques that detect multiple variant types may be important for monitoring CML patients. Among the mutations detected in CML, ASXL1 mutations were more likely to be present at the time of CP diagnosis, and showed comparable TCFs between CP and BC samples. Thus, ASXL1 mutations are unique compared to other mutations, which were rarely detected in CP samples but expanded with the onset of BC.
Since patients with CP who subsequently developed BC were more likely to harbour mutations, such as ASXL1 mutations, than those who did not progress, and because BC patients with ASXL1 mutations had a poor prognosis, CP patients harbouring ASXL1 mutations may require careful management. As most patients received imatinib as the first-line TKI for CP in our cohort, second- or third-generation TKIs may be better candidates for therapy to improve the outcomes of these patients, as suggested by a recent study 29 . As expected, myeloid and lymphoid BC cases exhibited distinct molecular profiles, which was in agreement with a previous report that patients with CML-BC presented with distinct additional chromosomal alterations depending on the lineage of BC 11 . The genetic profile of CML-BC is also influenced by a prior history of TKI therapy before BC diagnosis, which may explain, at least in part, the differences in the genetic profiles of BC patients. In summary, our study demonstrates the diverse mutational profiles and clonal evolution of CML in a large cohort and bridges genetic abnormalities and clinical features, including outcomes. Our results will hopefully lead to the development of efficacious therapy and strategies for better management of patients with CML.
Information for prior history of TKI is missing in two patients treated by TKI for BC.

Methods
Patient samples. We performed WES, targeted capture sequencing, and/or deep amplicon sequencing, using 112 CML-BC and 71 CP samples at diagnosis from 130 patients at ten institutions enrolled in this study, according to the protocols approved by the Institutional Review Boards (Fig. 1a and [...]
Detection and quantification of fusion transcripts. RT-PCR for detection of BCR-ABL1 fusion transcripts 43 and RT-qPCR with TaqMan assay for BCR-ABL1 (ref. 44 ) were conducted with our lab-specific conversion factor obtained from the international reference laboratory at Adelaide, Australia for International Scale (IS) calculation. BCR-ABL1 levels were expressed as IS 45 , along with a log reduction value. SYBR-green RT-qPCR was performed using the ABI 7900HT system (Applied Biosystems, Foster City, CA) for RUNX1-ETS2 determination using freshly frozen samples, identical to those used for quantification of the BCR-ABL1 transcript, according to the manufacturer's instructions. The primer set for the measurement of the RUNX1-ETS2 transcript comprised RUNX1-ETS2-QF (5′-CTTCACAAACCCACCGCAAG-3′) and RUNX1-ETS2-QR (5′-AGGGAGTCTGAGCTCTCGAAG-3′) with the same ABL1 internal control gene as for BCR-ABL1.
Each RT-qPCR reaction was performed in duplicate and the relative expression in follow-up samples was compared with that of the diagnostic sample, using comparative quantification (2^−ΔΔCt method) and expressed as log reduction.
For cases without germline controls, we also independently analysed the WES data of the CP and BC samples without using controls, and filtered the candidate mutations using the criteria used for targeted capture sequencing, as described below, to rescue potentially overlooked recurrent mutations already present at the time of CP diagnosis. By doing so, we created three mutation lists for cases analysed for both CP and BC without germline controls, i.e., somatic mutations acquired in BC compared to CP, and independently called candidate recurrent mutations in CP and BC. We then merged the mutation lists to generate a single list for each BC sample. To analyse the number of SNVs acquired during disease progression (Fig. 1b and Supplementary Fig. 1b, d), mutations were called for BC samples using CP as control, and somatic mutations with (i) Fisher's exact P value <0.01; (ii) VAF > 0.05; (iii) P value for EBCall < 0.001 were filtered by excluding (i) variants occurring in repetitive genomic regions and (ii) SNPs listed in the databases described above. For the external cohort 13 , WES data of both CP and BC samples were available for 15 patients. Of these, 13 were subjected to analysis after excluding 1 case that lacked information on progression time from CP to BC and 1 in which a much lower depth was observed compared to the other samples. The estimated TCFs harbouring the relevant mutation were calculated from the total copy number (TCN) of the region and observed VAF values as follows 19 : TCF = TCN × VAF for deletions, TCF = 2 × VAF for regions without copy-number changes, and TCF = TCN × VAF for gains.
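The TCF formulas above can be illustrated with a short calculation. This is a minimal sketch rather than the authors' code: the function name `tumour_cell_fraction` is hypothetical, and capping the result at 1.0 is an assumption added here to absorb sampling noise in the VAF. Note that the copy-neutral case (TCF = 2 × VAF) is the same expression as TCN × VAF with TCN = 2, so one formula covers all three cases.

```python
def tumour_cell_fraction(vaf, tcn=2.0):
    """Estimate the tumour cell fraction (TCF) from a variant allele frequency.

    vaf: observed variant allele frequency, between 0 and 1.
    tcn: total copy number of the region (2 for copy-neutral regions,
         <2 for deletions, >2 for gains).
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    if tcn <= 0:
        raise ValueError("tcn must be positive")
    tcf = tcn * vaf
    # Assumption for this sketch: cap at 1.0, since a fraction of cells
    # cannot exceed 100% even if noisy VAF estimates suggest otherwise.
    return min(tcf, 1.0)


if __name__ == "__main__":
    print(tumour_cell_fraction(0.25))           # copy-neutral region
    print(tumour_cell_fraction(0.50, tcn=1.5))  # region with a deletion
    print(tumour_cell_fraction(0.20, tcn=3.0))  # region with a gain
```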
Targeted capture sequencing. Targeted capture sequencing was performed using the SureSelect custom kit (Agilent Technologies), for which 104 genes were selected from those found to be mutated in the WES data of 52 CML-CP and BC pairs, and/or known oncogenes or tumour suppressor genes in haematological malignancies. Sequencing, alignment, and mutation calling were performed as per the WES analysis, except for the filtering criteria. Relevant somatic mutation data with (i) VAF > 0.05; (ii) depth > 100; (iii) P value for EBCall < 0.0001, were filtered by exclusion based on (i) synonymous SNVs; (ii) variants present only in unidirectional reads; (iii) variants occurring in repetitive genomic regions; (iv) missense SNVs with VAF of 0.4-0.6 or <0.04; and (v) known variants listed in SNP databases (as described in the "Whole-exome sequencing" section). In addition, (i) pathogenic mutations found in >1 haematological malignancies in the COSMIC database (v84) and (ii) mutations expected to cause premature termination of protein translation were reviewed by using a less stringent filter with (i) VAF > 0.02; (ii) depth > 50; (iii) P value for EBCall < 0.001, and data were manually curated.
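The primary (stringent) filter for targeted capture sequencing can be expressed as a predicate over annotated variants. The sketch below is illustrative only: the `Variant` record and its field names are hypothetical, the annotations (strand support, repeat regions, SNP database membership) are assumed to be computed upstream, and the less stringent rescue filter with manual curation is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    vaf: float           # variant allele frequency
    depth: int           # sequencing depth at the site
    ebcall_p: float      # P value reported for EBCall
    consequence: str     # e.g. "missense", "synonymous", "nonsense"
    bidirectional: bool  # supported by reads in both directions
    in_repeat: bool      # lies in a repetitive genomic region
    known_snp: bool      # listed in an SNP database


def passes_strict_filter(v):
    """Apply the inclusion and exclusion criteria stated in the text."""
    # Inclusion: VAF > 0.05, depth > 100, EBCall P < 0.0001.
    if not (v.vaf > 0.05 and v.depth > 100 and v.ebcall_p < 0.0001):
        return False
    # Exclusions (i)-(iii) and (v).
    if v.consequence == "synonymous":
        return False
    if not v.bidirectional or v.in_repeat or v.known_snp:
        return False
    # Exclusion (iv): missense SNVs with VAF of 0.4-0.6 or <0.04.
    # The <0.04 clause is kept as stated; it cannot trigger here because
    # VAF > 0.05 is already required above.
    if v.consequence == "missense" and (0.4 <= v.vaf <= 0.6 or v.vaf < 0.04):
        return False
    return True


if __name__ == "__main__":
    v = Variant(0.22, 350, 1e-6, "missense", True, False, False)
    print(passes_strict_filter(v))
```

In this sketch a heterozygous-looking missense variant (VAF near 0.5) is rejected while a truncating variant at the same VAF is retained, which reflects the stated intent of separating likely germline polymorphisms from somatic events in samples without germline controls.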
CNA and SV analysis. CNAs and SVs were detected using the CNACS algorithm and Genomon-SV pipeline, respectively 19,20 . Briefly, CNACS analyses sequencing depth and allele frequencies of heterozygous SNPs to determine genome-wide copy numbers and detect CNAs 19 . We also used the ExomeDepth package in R 48 to detect microdeletion events occurring in the exons of the IKZF1 and RUNX1 genes. CNA data were manually curated using IGV. The Genomon-SV pipeline detects SVs by utilising both breakpoint-containing junction read pairs and improperly aligned read pairs 20 . Putative SVs detected by the Genomon pipeline were filtered by removing (i) those with Fisher's exact P value >0.03 and (ii) those present in control normal samples, and the breakpoints of the remaining SVs were manually inspected using IGV. Since sequencing reads are enriched in gene exons and target genes in WES and targeted capture sequencing, respectively, detection of SV breakpoints is limited to regions close to those covered by gene baits. Nevertheless, a few SVs were detected even when they occurred in intronic regions. For instance, our pipeline could detect a RUNX1-ETS2 fusion as the breakpoint was observed in the intronic region close to the ETS2 gene exon (Supplementary Fig. 3a). Moreover, we were able to identify the BCR-ABL1 fusion gene with the typical breakpoint, in the intronic region, through targeted capture sequencing and WES. Through targeted capture sequencing, we identified BCR-ABL1 fusions in 51 out of 60 cases (85%); however, only 4 out of 76 cases (5.3%) were detected by WES. Therefore, the ability to detect SVs in our pipeline largely depended on the location of breakpoints and gene baits. Rearrangements of immunoglobulin and T-cell receptors were detected as microdeletion events involving loci observed in the WES analysis. Complex CNAs were defined as the presence of ≥3 abnormal CNAs detected by CNACS. We considered the coamplification of 9q and 22q derived from +Ph abnormality as a single event. Hyperdiploidy and hypodiploidy were defined as two or more gains (≥48) and losses (≤44) of chromosomes assessed by CNACS, respectively, and the chromosomal gains and losses counted under these two categories were excluded from the count for complex CNAs.
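The counting rules for complex CNAs, +Ph, and hyper/hypodiploidy can be made concrete with a small sketch. This is one possible reading of the definitions above, not the authors' implementation: the function `summarise_cnas`, its arguments, and the decision to drop whole-chromosome gains (or losses) from the complex-CNA count only when the sample qualifies as hyperdiploid (or hypodiploid) are all assumptions made for illustration.

```python
def summarise_cnas(chrom_gains, chrom_losses, segmental, extra_ph=False):
    """Summarise copy-number findings for one sample.

    chrom_gains / chrom_losses: whole-chromosome events, e.g. ["+8", "+21"].
    segmental: arm-level or focal events, e.g. ["del(17p)", "amp(17q)"].
    extra_ph: True if coamplification of 9q and 22q (+Ph) was detected;
              it is counted as a single event.
    """
    hyperdiploid = len(chrom_gains) >= 2   # two or more gains (>=48)
    hypodiploid = len(chrom_losses) >= 2   # two or more losses (<=44)

    counted = list(segmental)
    if extra_ph:
        counted.append("+Ph")  # 9q and 22q gains collapse into one event
    # Assumed reading: events already captured as hyper/hypodiploidy
    # do not also count towards complex CNAs.
    if not hyperdiploid:
        counted.extend(chrom_gains)
    if not hypodiploid:
        counted.extend(chrom_losses)

    return {
        "hyperdiploid": hyperdiploid,
        "hypodiploid": hypodiploid,
        "n_cnas": len(counted),
        "complex_cnas": len(counted) >= 3,
    }


if __name__ == "__main__":
    print(summarise_cnas(["+8"], [], ["del(17p)", "amp(17q)"]))
    print(summarise_cnas(["+8", "+19", "+21"], [], []))
```

Under this reading, a sample with +8, +19, and +21 alone is reported as hyperdiploid but not as carrying complex CNAs, whereas +8 together with del(17p) and amp(17q) reaches the threshold of three events.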
Statistical analysis. Statistical analyses were performed using R (v3.5.0). Comparisons between groups were based on the two-sided Wilcoxon rank-sum test for continuous data and the Fisher's exact test for categorical data. The correlation between the number of acquired mutations and time necessary for progression from CP to BC (Fig. 1b) 49 . Molecular responses for CML were defined as follows: haematologic remission, IS ≥ 10% for CHR, IS < 10% for MCyR, IS < 1% for CCyR, IS < 0.1% for MMR, and IS < 0.0032% for CMR. Survival analysis was performed for 99 patients with CML-BC for whom survival and treatment data for BC were available, and observations were censored at the last follow-up. The median follow-up was 3.2 years in surviving patients, and 24 (24.2%) patients were alive at the last follow-up. The Kaplan-Meier method was used to estimate the OS, and the differences in OS were assessed using the survival package in R. The effects of genetic lesions on OS were evaluated by using the Cox proportional hazards regression model, and data were adjusted for clinical factors that were significantly associated with OS in the univariate analysis. For blood counts, the following rounded median values were used as the threshold: 50,000 (×10 3 /uL) for WBC counts, 9.3 (g/dL) for haemoglobin levels (Hb), and 96,000 (×10 3 /uL) for platelet counts (PLT). Old age was defined as age of a patient ≥60 years, as per a previous study 18 . Multivariate analysis was performed for patients treated with TKI-based regimens (n = 59) by performing Cox proportional hazards regression modelling with a stepwise selection of variables using P value to exclude variables, in which genetic or clinical factors with a univariate Cox P value <0.10 were considered. To evaluate the validity of the established model, we also conducted bootstrapping 100 times to construct the test models with the factors subjected to the multivariate modelling using the rms package in R. 
Each model was assessed for validity by calculating the concordance statistic and frequency of each covariate included in the models. All P values were calculated using two-sided tests, and P < 0.05 was considered to be statistically significant.
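The IS thresholds used above to define responses in CML can be written as a simple lookup. The sketch below is illustrative: the function names are hypothetical, the "CHR" label assumes that haematologic remission has been confirmed separately, and the log-reduction helper assumes the common convention of expressing IS relative to a standardised baseline of 100%, which the text does not spell out.

```python
import math


def best_response_category(is_percent):
    """Map a BCR-ABL1 level on the International Scale (%) to a response label.

    Thresholds follow the definitions in the text: IS >= 10% for CHR
    (given haematologic remission), < 10% MCyR, < 1% CCyR, < 0.1% MMR,
    and < 0.0032% CMR.
    """
    if is_percent < 0:
        raise ValueError("IS value cannot be negative")
    if is_percent < 0.0032:
        return "CMR"
    if is_percent < 0.1:
        return "MMR"
    if is_percent < 1:
        return "CCyR"
    if is_percent < 10:
        return "MCyR"
    return "CHR"  # assumes haematologic remission was confirmed clinically


def log_reduction(is_percent):
    """Log10 reduction relative to an assumed 100% IS baseline."""
    if is_percent <= 0:
        raise ValueError("IS value must be positive to take a logarithm")
    return math.log10(100.0 / is_percent)


if __name__ == "__main__":
    for level in (25.0, 5.0, 0.5, 0.05, 0.002):
        print(level, best_response_category(level), round(log_reduction(level), 2))
```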
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The WES data in this study are deposited in the European Genome-phenome Archive under accession code EGAS00001005075. The data is available under restricted access, and access can be obtained by contacting S.O. (sogawa-tky@umin.ac.jp). The public WES data used in this study are available in the European Genome-phenome Archive under accession code EGAS00001003071, and the European Nucleotide Archive under accession code PRJEB20846 (Supplementary Table 8). The remaining data are available within the article, Supplementary Information, or available from the authors upon request.