Introduction

Chronic Myeloid Leukemia (CML) is a unique model for the evolution of cancer and is classified as a triphasic myeloproliferative neoplasm based on clinical and pathological characteristics. Premalignant leukemia stem cells (LSCs) are, in fact, generated in the bone marrow nich by unknown mutagenesis processes1,2. In this step, the disease may be undetectable for a decade or more. Further oncogenic processes cause LSCs to develop into leukemia progenitor cells (LPCs)3. Leukemogenic alterations mainly affect five classes of regulatory proteins: signaling pathway components, transcription factors (TFs), epigenetic regulators (ERs), tumor suppressor genes (TSGs), and components of the spliceosome4.

Without therapeutic interventions or in the case of drug resistance, CML will progress to accelerated phase (AP) and then into an acute leukemia phase that is called blastic phase or blast crisis5. Even in the advent of Tyrosin Kinase Inhibitor therapy, blastic phase survival is less than 12 months. In general, blast crisis in chronic myeloid leukemia is still a fatal disease6,7.

Over the past decade using Massively Parallel Sequencing (MPS) or Next Generation Sequencing (NGS) the multi-dimensional genomic view with higher resolution of molecular profiling has become possible8. It was shown that the combination of exome, genome and RNA Sequencing data9, called integrative sequencing10,11,12 paves the way to better understand the mechanisms and pathogenesis of the cancer and can potentially improve the clinical management and finding new therapeutic targets specially in the advanced phase patients13.

In the present study, we conducted integrated genomic sequencing through an in-house filtering algorithm to uncover the leukemogenic important/actionable findings and investigate the myeloid leukemogenesis model in three Iranian myeloid blast crisis CML patients.

Methods

Subjects

Three patients (from 3 unrelated families) with the clinical diagnosis of blastic phase CML were recruited from the Shariati Hospital in Tehran, Iran. Inclusion criteria were based on clinical presentations and hematological findings in the peripheral blood and bone marrow, according to the WHO 2016-guideline14.

The study was approved by the Ethical Committee of the Iran University of Medical Sciences (Code: IR.IUMS.REC 1395.95-04-87-30235). All subjects and/or their legal guardian(s) gave their informed consent to participate. All methods, including obtaining informed consent, were carried out in compliance with the above-mentioned ethical standards and guidelines.

Whole Exome Sequencing (WES) along with Chromosome-seq and RNA Sequencing were performed on extracted DNA and RNA from peripheral blood samples in the blast crisis phase.

Whole exome sequencing

Peripheral blood samples of patients were collected in EDTA containing tubes and used to extract genomic DNA by Blood SV-mini kit (GeneAll Biotechnology Co., LTD, South Korea) according to the manufacturer instruction. The extracted genomic DNA samples were subjected to WES using Agilent SureSelect V6-post Capture/Target Enrichment Kit. For the third patient TWIST Human Core Exome Kit (Twist Bioscience, USA) was used according to the manufacturer’s instruction. The enriched libraries were sequenced on Illumina platforms (Illumina Inc., CA, USA). Table S1 shows a summary report of data analysis metrics from the patients’ WES.

WES data analysis (variant filtering)

Paired-end sequence reads were mapped to the refseq (NCBI) human reference genome (GRCh38 assembly) by BWA-MEM. Picard software was used to mark duplicate reads. Genome Analysis Tool Kit (GATK) software Mutect2 package was used to call single nucleotide variants (SNVs) and short insertions or deletions (Indels). The intronic variants were removed, however the 100 bp flanking each exon was included in VCF files using interval/bed file.

Several tools were used to annotate and interpret variants including CGI, Varsome, InterVar, CancerVar, and Franklin. The annotated VCF files were filtered to find Tier1 or Tier2 variants according to AMP classification and pathogenic (P), likely pathogenic (LP), or Leaning pathogenic variants based on ACMG classification.

All Tier 1 or Tier 2 variants in AMP classification and the Tier 3 variants that were pathogenic (P), likely pathogenic (LP), or Leaning pathogenic VUS in ACMG classification were selected as potentially important findings (PIFs). PIFs known to be therapeutic, prognostic or diagnostic biomarkers were annotated as potentially actionable findings (PAFs).

RNA sequencing

RNA from whole blood in PaxGene tube was extracted and processed by the TruSeq Stranded Total RNA with Ribo-Zero Gold Kit (Illumina Inc., CA, USA) using 1 µg of DNAse treated RNA. Then paired-end sequencing was performed on Illumina platforms.

CLC genomics workbench 20 was used for RNA sequencing and fusion gene analyses. RNA sequencing results were used for determining the outlier expression analysis of genes harboring important/actionable variants, analyzing the fusion transcripts as well as confirmation of variants found in WES.

Chromosome-sequencing

Vista™ Chromosome Sequencing-100 K an NGS based Whole Genome Sequencing was performed in BGI Clinical Laboratories to evaluate copy number variations (≥ 100 Kb).

Cross validation of the identified variants

The identified SNVs and small Indels in WES were visually validated in the mapping read-tracks after importing WES BAM files in the CLC genomics workbench 20. Besides, the validations of identified WES variants were performed using RNASeq BAM files by confirming both variant and gene expression.

The detected CNVs in chromosome-seq data were validated by CNV analysis from WES data using CLC genomics workbench 20.

Ethics approval and consent to participate

The study was approved by the Ethical Committee of the Iran University of Medical Sciences (Code: IR.IUMS.REC 1395.95-04-87-30235). Informed consent was obtained from all subjects and/or their legal guardian(s) for participation.

Results

Clinical findings

The patient (I), was a 66-year-old female presented by fever, hepatosplenomegaly, leukocytosis, anemia, thrombocytopenia. The blast level in the peripheral blood was 25%. Bone marrow aspiration and immunophenotyping confirmed the myeloid blastic crisis. The patient had a history of thyroid cancer that had been treated many years prior to the development of CML. She was taken to the hospital at the age of 63 with a primary diagnosis of chronic phase CML in terms of leukocytosis with myeloid precursors in the peripheral blood and a positive Philadelphia chromosome in karyotype.

The patient (II), a 55-year old female presented by leukocytosis, anemia, thrombocytopenia. The diagnosis of chronic phase CML was established 15 years before blastic phase at the age of 40 years. There was 36% blast in the peripheral blood. Bone marrow aspiration and immunophenotyping confirmed the myeloid blastic crisis. At blast crisis, the peripheral blood karyotype revealed a clone that was positive for both the philadelphia chromosome and isochromosome 17q, as well as a clone with paracentric inversion in 15q.

Patient (III) was a 45-year old male presented by leukocytosis, anemia and thrombocytopenia. The diagnosis of chronic phase CML was established two years before blastic phase at the age of 43 years. There was 40% blast in the peripheral blood. Bone marrow aspiration and immunophenotyping confirmed the myeloid blastic crisis.

BCR-ABL1 fusion transcript (b13-a2) was also found in the RNA-seq fusion gene study of all patients during the blast crisis phase.

Table 1 summarizes the patients' detailed clinical and paraclinical findings at chronic phase CML diagnosis. The clinical and paraclinical findings at myeloid blast crisis are presented in the Table 2.

Table 1 Clinical and laboratory findings for the CML patients at diagnosis/chronic phase.
Table 2 Clinical and laboratory findings for the CML patients in MBC phase.

Genetic findings

WES analysis identified sixteen PIFs (Figs. 1, 2 and 3) in which ABL1-Y272H, ASXL1-A627Gfs*8, GATA2-L359S, ABL1-F359V, ASXL1-G646Wfs*12, PTPN11-G503V and IKZF1-R208Q variants were classified as PAFs. The ABL1-I418T, PIK3CB-R231H, CHEK2:c.-4C > T, BRCA2-P3292L, ABL1-E459G, PRPF8-G1796R, PHF6-R274Q, WT1-R435X and ATM-C117Y variants were potentially important cancer variants which were not actionable (Table 3).

Figure 1
figure 1

The paired-end mapping read-track displaying reads and detected variants in the first patient. The potentially important findings from WES data analysis is presented in the CLC genomics workbench software. (A) hg38 sequence track with WES mapped reads track; (B) hg38 sequence track with RNA-Seq mapped reads track; (C) Gene expression track; (D) gene, mRNA and CDS tracks; (E) the variant genomic coordinate on hg38 and the variant name. P1: patient No.1.

Figure 2
figure 2

The paired-end mapping read-track displaying reads and detected variants in the second patient. The potentially important findings from WES data analysis is presented in the CLC genomics workbench software. (A) hg38 sequence track with WES mapped reads track; (B) hg38 sequence track with RNA-Seq mapped reads track; (C) Gene expression track; (D) gene, mRNA and CDS tracks; (E) the variant genomic coordinate on hg38 and the variant name. P2: patient No.2.

Figure 3
figure 3

The paired-end mapping read-track displaying reads and detected variants in the third patient. The potentially important findings from WES data analysis is presented in the CLC genomics workbench software. (A) hg38 sequence track with WES mapped reads track; (B) hg38 sequence track with RNA-Seq mapped reads track; (C) Gene expression track; (D) gene, mRNA and CDS tracks; (E) the variant genomic coordinate on hg38 and the variant name. P3: patient No.3.

Table 3 Potentially actionable (PAF) and potentially important (PIF) findings in WES analysis of BC-CML patients.

Genetic findings in the patient no.1

The patient was a BCR-ABL1 positive CML case in the myeloid blast crisis phase. She had abnormal karyotype with Philadelphia chromosome. In chromosome-seq there were no specific copy number alterations.

In WES she had six PIFs which contributed in all classes of leukemogenic genes (Fig. 1) except splicing components (class V). The p.Y272H variant on ABL1 (class I), p.A627Gfs*8 on ASXL1 (class III) and p.L359S on GATA2 (class II) were PAFs.

Genetic findings in the patient no.2

The patient was a BCR-ABL1 positive CML case in the myeloid blast crisis phase. She had abnormal complex karyotype with Philadelphia chromosome and isochromosome 17q in one clone and paracentric inversion in 15q in the another stemline (46,XX, t(9;22)(q34;q11.2),i(17)(q10)[17] /46,sl,inv(15)(q21,q25)[33]). In chromosome-seq (Fig. S1) there were deletions in 15q and 17p, and duplication in 17q (46,XX,del(15q26.3); del(17p11.2p13.3); dup(17p11.2q25.3)). The TP53 gene is a class II leukemogenic gene (transcription factor) located in 17p13.1 which is deleted in this patient.

In WES she had six PIFs that contribute in all classes of leukemogenic genes (Fig. 2). Considering the TP53 involvement by 17p deletion, this patient has seven PIFs. The p.F359V variant in ABL1 (class I), p.G646Wfs*12 in ASXL1 (class III) and TP53 (class IV) deletion were PAFs.

Genetic findings in the patient no.3

The patient was a BCR-ABL1 positive myeloid blast crisis CML. In chromosome-seq (Fig. S1) he had several deletions in 7p and 7q (46,XY,del(7p14.3p21.1); del(7p11.2p14.3); del(7q11.23q21.11)). The IKZF1 (7p12.2), EZH2 (7q35-q36) and CUX1 (7q22) are important leukemogenic genes which are located in the deletion regions. EZH2 gene is an epigenetic regulator gene (H3K27 methyltransferase; class III) and CUX1 is a transcription factor (class II) regulating TP53 and ATM.

He also had four PIFs in WES (Fig. 3). In terms of the pathogenic deletions in the short and long arms of chromosome 7 affecting EZH2 and CUX1, this patient has six PIFs which involve four classes of leukemogenic genes. PTPN11:p.G507V (class I) and IKZF1:p.R65Q variants (class II) were PAFs. The WT1:p.R435X (class II) and ATM:p.C117Y (class IV) variants were not PAFs.

IKZF1 variant (p.R65Q) in combination with del(7p11.2p14.3) may affect IKZF1 gene product as an important transcription factor in myeloid malignancies.

Discussion

The ability to analyze and model the leukemogenesis process may aid in the effective management and treatment of hematologic malignancies. According to the proposed “Slot machine” model of leukemogenesis by Murati et al.4, progression from chronic phase into secondary acute myeloid leukemia (blast crisis) can be modeled by the combination of at least four steps that one classes of leukemogenic genes is affected in each step.

Using integrated genomic sequencing to assess the genes affected by SNVs, Indels and CNVs, in the present study all MBC-CML patients had at least five leukemogenic genes involved. The detected SNV and Indel variants in WES represent all classes of leukemogenic genes (Fig. 4) including signaling pathway gene (ABL1-Y272H, ABL1-I418T, ABL1-F359V, ABL1-E459G, PIK3CB-R231H, PTPN11-G503V), transcription factors (GATA2-L359S, PHF6-R274Q, IKZF1-R208Q, WT1-R435X), epigenetic regulators (ASXL1-A627Gfs*8, ASXL1-G646Wfs*12), tumor suppressor and DNA repair genes (BRCA2-P3292L, ATM-C117Y, CHEK2:c.-4C > T) and components of the spliceosome (PRPF8-G1796R).

Figure 4
figure 4

The detected SNV and Indel variants in WES related to leukemogenic gene classes. The classes of leukemogenic genes includ signaling pathway genes (class I), transcription factors (class II), epigenetic regulators (class III), tumor suppressor and DNA repair genes (class IV) and components of the spliceosome (class V).

The detected variants in different signaling pathway related genes including ABL1, PIK3CB and PTPN11 in MBC-CML patients may affect ErbB, PI3K-Akt, Ras-MAPK cascade and JAK-STAT signaling pathways15,16,17. ABL1 was involved in the first and second patients. Four important missense variants were detected in ABL1 (p.Y272H, p.I418T, p.F359V, p.E459G). All variants were located in the tyrosine kinase domain of ABL1 and were previously reported in CML (COSM12576, COSM12605, COSM1460549)18. Mutations in the ABL1 kinase domain, in particular, have been identified in CML patients in the blastic phase19. Occurrence of 2 or more mutations in BCRABL1 fusion gene following TKI therapy has been reported in CML progression20.

PIK3CB-R231H is an important missense variant in the Ras Binding domain (RB domain). This deleterious variant in PIK3CB gene has not been reported in the hematological malignancies including CML (https://www.mycancergenome.org/content/gene/pik3cb/). This gene plays a critical role in solid tumor development16,21.

The third patient’s PTPN11-G507V variant is located in the Phosphatase (PH) domain. Although PTPN11 mutations have been documented in blastic phase CML, the p.G507V alteration is a novel variant in CML and has only been identified in other myeloid malignancies such as JMML, AML and MDS. (COSM14271).

There were several important variants in different transcription factor genes including GATA2, PHF6, WT1 and IKZF1 in MBC-CML patients. In addition, TP53, IKZF1 and CUX1 genes were affected by chromosomal deletions/CNVs. GATA2-L359S variant in the first patient is located on the zinc finger GATA domain. It has been indicated that GATA2 germline mutations strongly predispose patients to leukemia22. GATA2 mutations were described in MBC-CML; however p.L359S variant has not been previously reported in blastic CML. p.L359S variant was suggested as a predisposing germline likely pathogenic variant in MDS/AML23.

PHF6-R274Q variant in patient No.2 is located in the PHD-like zinc binding domain. Mutations in PHF6 also have been reported in the blastic phase CML while p.R274Q variant is a novel mutation in CML which only reported in Early T-cell precursor (ETP) acute lymphoblastic leukemia (COSM306061).

Biallelic involvement of IKZF1 was observed in the third patient. The detected missense variant was located in the C2H2 zinc finger domain of the IKZF1 (p.R65Q). Because of the 7p deletion, this variant showed a high variant allele fraction (VAF = 100%) in WES analysis. IKZF1 mutations had been found in both chronic and blastic phase CML24.The patient with IKZF1 mutation also had a mutation in PTPN11. It has been shown that in AML patients, mutations in IKZF1 and PTPN11 are associated with aggressive clinical course and primary resistant to chemotherapy25. CUX1 gene involvement was detected in the third patient in the cytogenetic studies in terms of 7q deletion. CUX1 is a transcription factor regulating TP53 and ATM26.

17p deletion in TP53 gene was established by cytogenetic analysis in the second patient. TP53 is a transcription factor located in 17p13.1. TP53 regulate cell cycle, DNA repair and apoptosis. It was observed that 45% of blast phase MPN (MPN-BP) patients have a TP53-related defect such as TP53 gene mutations or haploinsufficiency27. Involvement of TP53 has been described in both chronic and blastic phases of CML28,29.

Additional chromosomal abnormalities (ACAs) have been discovered in the second and the third patients. Patients with high-risk ACA, such as i(17q) and -7/7q-, are known to have a poor response to TKIs and a greater risk of disease progression. -7/7q- are less common in BC than i(17q), although they have a greater unfavorable influence on prognosis30. The patient number 3 with multiple deletions in the chromosome 7 has shorter time to MBC.

WT1-R435X is another significant nonsense variant discovered in the third patient. WT1 is a tumor suppressor gene which has known function as a transcriptional regulator31. This mutation causes loss of function in WT1 by the activation of NMD (Nonsense mediated decay) mechanism. Mutations in WT1 were reported in the blastic phase CML32, Wilms tumor and desmoplastic small round cell tumor (COSM21401). Pathogenic variants in this gene are correlated with poor prognosis in the patients with myelodysplastic syndromes33. The p.R435X germline mutation in WT1 has been classified as pathogenic which is associated to Wilms tumor in the clinvar (RCV000003671).

Two frameshift single nucleotide deletions were identified in ASXL1 (A627Gfs*8, G646Wfs*12). ASXL1 mutations are more frequent in secondary AML and may contribute to the development of CML to AML. Moreover, ASXL1 mutation might be associated with an aggressive phenotype in myeloid malignancies4.

It has been shown that mutations in ABL1 kinase domain, ASXL1, IKZF1, TP53 are common in CML patients who develop BC12,30,34. ASXL1 mutation is the most frequent mutation in poor outcome patients. In addition, individuals with an ASXL1 mutation had a longer median time to develop BC12. In the current study, the first and the second patients with ASXL1 mutation had longer time to MBC than the third patient.

EZH2 gene deletion was established by chromosome-seq in the third patient. EZH2 gene is an epigenetic regulator gene (H3K27 methyltransferase). Loss-of-function mutations in EZH2 have been reported in primary myelofibrosis (PMF), MDS and MDS/MPN overlap syndromes27.

Several important variants have been documented in the tumor suppressor genes specially the genes involved in the DNA damage-recognition and DNA repair pathways including CHEK2, ATM and BRCA2. The CHEK2 c.-4C > T variant is an important Tier3 variant. CHEK2 has a role in DNA damage-recognition and repair, cell cycle and P53 signaling pathway. The c.-4C > T variant is located in the kozak sequence which may affect translation35. This variant has been reported in clinvar as VUS for hereditary cancer-predisposing syndrome (RCV000131287).

ATM-C117Y variant is a Tier2 missense variant. ATM is involved in homologous recombination, cell cycle and P53 signaling pathway36. ATM has a key role in the DNA damage-response pathway and loss of function mutations in ATM may promote acceleration of the blast crisis in CML37. The ATM-CHEK2-p53 axis has been shown to play an important role against cancer initiation by inducing apoptosis, cell cycle arrest or senescence in the cancer cells27.

BRCA2-P3292L variant is a missense Tier2 variant. BRCA2 is a tumor suppressor gene which contributes to homologous recombination38. It has been indicated that 20% of the patients with Therapy-Related Myeloid Neoplasms (t-MN) among breast cancer survivors have a germline mutation in the BRCA1/2, CHEK2, TP53, or PALB2 genes, which are important in DNA repair pathways27.

PRPF8 was the only RNA splicing factor gene with a significant variant. It is a core component of U2-type and U12-type spliceosome complexes39. PRPF8-G1796R variant in the second patient is a missense Tier3 variant. This variant has been classified as driver mutation and has not been previously reported.

The self-renewal, differentiation and proliferation are three main processes which are impaired in malignancies. In the current study, the integrated genomic sequencing detected important variants in twelve different cancer genes. Mutations in signaling genes may induce proliferation of malignant cells while mutations in the epigenetic regulators (ERs) or tumor suppressor genes (TSG) may promote self-renewal in the progenitor cells and transcription factors mutations may affect differentiation. Mutations in the spliceosome component may also involve differentiation and self-renewal of progenitor cells40. Therefore, the alterations in ABL1, PTPN11 and PIK3CB signaling molecules may potentiate the proliferation process in MBC-CML. Variants in the CHEK2, BRCA2 and ATM tumor suppressor genes, as well as the ASXL1 and EZH2 Epigenetic Regulators, may support the self-renewal of the proliferating progenitors. Variants in the GATA2, PHF6, WT1, IKZF1 and CUX1 transcription factors may operate as differentiation blockers. Finally, the variant in the PRPF8 as spliceosome components may act as differentiation blocker or may cause self-renewal of the proliferating progenitor.

Finally, a recent integrated model of blast crisis41 showed that one of the most important mechanism associated to the BC progression is the epigenetic reprogramming related to the abnormal Polycomb Repressive Complex (PRC) function. The components of the PRC complex as an epigenetic complex, repress the expression of certain genes including leukemogenic HOXA cluster. On the other hand, specific gene products interact with PRC components to add/remove repressive marks or recruitment of PRC components to distinct loci. Based on recent studies on BC genome and transcriptome, it has been shown that the components of the PRC complex have mutated or altered in expression respectively. These changes overally affect the gene expression signature of BC cells. ASXL1 gene products interact with PRC2 components including EZH2. IKZF1 also recruits PRC2 to the target gene loci. According to this model, in the first and second patients the frameshift mutations in the ASXL1 may have a critical role in BC progression. In the third patient the large deletions in the chromosome 7 which affect IKZF1 (7p12.2) and EZH2 (7q35-q36) may accelerate BC.

A suggested model of leukemogenesis for the hierarchy of events is presented in the Fig. 5. In this model the initial step is the presence of mildly-deleterious variants (VUS-P) mainly in the tumor suppressor genes (TSG)/DNA damage repair (DDR) genes, transcription factors (TF) and/or components of splisosome. These variants may increase the individual’s susceptibility and predisposition to malignancies. The BCR-ABL1 fusion gene and mutations in the epigenetic regulator (ER)/transcription factor (TF) genes, which mainly influence the function of PRC, are the key genetic discoveries in the second and third steps. The final step in the accelerated and blast crisis phase may be started by the mutations in signaling molecules and/or additional chromosomal abnormalities (ACAs). ABL1 mutations specifically in the kinase domain lead to TKI resistance and delay in TKI switching may affect prognosis and survival by blast crisis. ACAs which involve leukemogenic genes (such as ER, TF or TSG) also may exacerbate the clinical prognosis and enter the disease to the blastic phase.

Figure 5
figure 5

A suggested model of leukemogenesis. In this model the initial step is the presence of mildly-deleterious variants which may affect the tumor suppressor genes (TSG), transcription factors (TF) and/or components of splisosome and increase susceptibility to cancer. In the second and third steps, the major genetic findings are BCR-ABL1 fusion gene and mutation in the epigenetic regulator (ER)/transcription factor (TF) genes. The final step in the accelerated and blast crisis phase may be started by the mutations in the signaling molecules and/or additional chromosomal abnormalities (ACAs).

Although mutations in ASXL1 or IKZF1 genes may play a key role in BC, they are more often found at the time of diagnosis in the chronic phase, in individuals with poor prognosis. Therefore, mutations in oncogenes such as ABL1 and PTPN11 may act as an accelerator.

In this study, samples related to the different phases of disease progression were not available. Future studies using genomic sequencing in different serial time points may allow detecting the sequence of occurrences related to the disease progression. Furthermore, recent advances in single cell sequencing may help to more accurately clarify the disease mechanisms.

Conclusions

The integrated genomic sequencing in this study effectively identified large spectrum of important and actionable variants from SNVs to CNVs. Mildly-deleterious passenger variants mainly in DNA damage repair (DDR) genes may increase the individual’s predisposition to cancer. BCR-ABL1 fusion, mutations in signaling and epigenetic regulator genes along with ACAs may act as cancer drivers. Epigenetic reprogramming caused by ASXL1 or IKZF1 mutations seems to be one of the most critical mechanisms linked with BC development. ABL1 domain mutations, mutations in other signaling genes and high risk ACAs may promote CML to BC.

Using this strategy we have identified several actionable findings which may improve targeted therapies. These findings may help better understanding the underlying pathophysiological mechanisms and may provide further insights into MBC-CML genetic basis in future. Further studies will shed light on the clinical usefulness of the integrated genomic analysis.