To the Editor:

Acute myeloid leukemia (AML) is a heterogeneous group of hematologic malignancies characterized by the proliferation of myeloid cells blocked in their ability to differentiate. Evaluation by G-banding and fluorescence in situ hybridization is an essential part of initial disease characterization and remains fundamental to identifying cytogenetic abnormalities that can inform disease diagnosis, prognosis, and treatment decisions [1, 2]. For instance, the identification of t(8;21)(q22;q22) or t(15;17)(q22;q12), which generate RUNX1-RUNX1T1 or PML-RARA gene fusions, respectively, confers a favorable prognosis when treated accordingly [1]. In recent years, advancements in next-generation sequencing and efforts by large genomics studies have led to a classification of 11 AML subgroups based on cytogenetic abnormalities as well as mutations in genes, such as NPM1 or CEBPA [1].

AML with a complex karyotype, defined by the presence of three or more unrelated chromosomal aberrations and the absence of favorable cytogenetic rearrangements, is associated with TP53 mutations and strikingly poor outcome [1, 3]. While conventional cytogenetics is still a powerful technique, complex chromosomal aberrations test the limits of cytogenetic resolution. Moreover, detection of cryptic genomic lesions, especially gene fusions, by orthogonal methods can lead to adjustment of treatment regimens and/or identification of biomarkers [4]. The frequency at which cryptic gene fusions are present within complex karyotypes is currently unknown [5]. In this study, we investigated whether whole-genome sequencing (WGS) and whole-transcriptome sequencing (RNA-seq) can resolve these complex aberrations in a series of patients in whom conventional cytogenetics could not resolve the driver event. Sequencing was performed on nine patients with adverse-risk AML; seven cases had complex karyotypes and the other two had unusual translocation events (Table 1 and Supplementary Table 1). The study was approved by the Research Ethics Board of the University Health Network (REB# 01–0573) and written informed consent was obtained from all patients.

Table 1 Comparison of cytogenetic and genomic findings from nine patients with adverse-risk AML

WGS and RNA-seq libraries were generated using NEBNext DNA Library Prep (New England Biolabs) and TruSeq Stranded Total RNA Sample Prep (Illumina), respectively, and were sequenced on the Illumina HiSeq 2000 platform. Bioinformatic tools used here are listed in Supplementary Table 2. Briefly, we used BWA-MEM for WGS alignment, CREST for structural variant (SV) calling, and HMMcopy for copy number variation (CNV) detection. In parallel, we used STAR for RNA-seq alignment, STAR-Fusion for fusion transcript detection, and GATK for SNV/indel calling from RNA-seq data. By performing low-coverage WGS (mean genome coverage of 11.3×; Supplementary Table 1) of leukemia DNA without matched germline DNA, we aimed to detect SVs and CNVs in a manner comparable to conventional cytogenetics. Furthermore, we compared gene fusions and rearrangements detected by CREST with fusion transcripts detected by STAR-Fusion to cross validate. DNA breakpoints and fusion transcripts were validated by PCR and RT-PCR, respectively, followed by Sanger sequencing.
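
The cross-validation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function name `cross_validate`, the tuple-based inputs, and the hand-written toy gene pairs (gene names borrowed from case 6) are hypothetical and do not reflect the actual output formats of CREST or STAR-Fusion.

```python
from typing import Iterable, List, Tuple

GenePair = Tuple[str, str]


def cross_validate(sv_gene_pairs: Iterable[GenePair],
                   fusion_transcripts: Iterable[GenePair]) -> List[str]:
    """Return RNA-level fusions whose gene pair is also joined by a DNA-level SV.

    sv_gene_pairs: genes at the two breakpoints of each SV (order irrelevant).
    fusion_transcripts: (5' gene, 3' gene) for each fusion transcript call.
    Both input shapes are assumptions made for this sketch.
    """
    # Orientation is not compared at the DNA level, so use unordered pairs.
    dna_supported = {frozenset(pair) for pair in sv_gene_pairs}
    confirmed = []
    for five_prime, three_prime in fusion_transcripts:
        if frozenset((five_prime, three_prime)) in dna_supported:
            confirmed.append(f"{five_prime}-{three_prime}")
    return confirmed


# Toy inputs (hypothetical, written by hand for illustration only).
sv_pairs = [("BPTF", "NUP98"), ("ETV6", "LRRC4C")]
rna_fusions = [("NUP98", "BPTF")]
confirmed = cross_validate(sv_pairs, rna_fusions)
print(confirmed)
```

Matching on unordered gene pairs keeps the sketch simple; a real comparison would presumably also check breakpoint coordinates and strand, which is why candidates were additionally validated by PCR and RT-PCR.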

Analysis of the leukemia genomes concordantly identified 39 out of 74 (53%) cytogenetic abnormalities annotated by visual inspection (Table 1 and Supplementary Table 3). The concordance rates for translocations/inversions, chromosomal gains/losses, and subchromosomal gains/losses were 74%, 54%, and 37%, respectively (Supplementary Table 1). For five cases with composite karyotypes, which were used to capture the karyotypic heterogeneity, their average concordance rate was lower than that of the other four cases (59% vs. 100%; Wilcoxon rank-sum test, p = 0.015). This is likely due to the challenge of detecting subclonal CNVs and SVs with low-coverage WGS. Nonetheless, many cryptic genomic lesions, including submicroscopic deletions of BCOR, TP53, and FOXO3/LACE1, that affect known leukemia-causing/modifying genes were detected. For instance, t(X;8)(p21.2;q24.1) that was reported following conventional cytogenetics in case 2 was revealed by genomic investigation to be linked to two other SVs that resulted in the deletion of BCOR (Supplementary Fig. 1). Complex and unbalanced rearrangement patterns, which can be more difficult to discern by G-banded karyotyping, were detected in five cases and could be linked to cytogenetic abnormalities resembling the actual events. Furthermore, in accordance with known association between TP53 mutation and complex karyotype [3], each of the three cases with the most cytogenetic abnormalities—cases 3, 7, and 8—had a point mutation and a copy number loss of TP53 (Table 1).
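
As a worked check of the overall figure quoted above, the concordance rate is the number of karyotype abnormalities recovered by sequencing divided by the number annotated by cytogenetics. The helper name `concordance_rate` is hypothetical and purely illustrative; only the counts 39 and 74 come from the text.

```python
def concordance_rate(n_concordant: int, n_total: int) -> float:
    """Percentage of cytogenetic abnormalities also found by sequencing."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_concordant / n_total


# Counts reported in the text: 39 of 74 abnormalities were concordant.
overall = concordance_rate(39, 74)
print(f"{overall:.0f}%")
```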

Notably, four out of nine leukemias in our cohort harbored gene fusion events that were not identified by cytogenetics: ETV6-MECOM (case 3), NUP98-KDM5A (case 4), PICALM-MLLT10 (case 5), and NUP98-BPTF (case 6) (Fig. 1, Supplementary Fig. 2a, and Supplementary Table 4). The latter three created in-frame fusion transcripts, but ETV6-MECOM, similar to previously reported cases, created an out-of-frame fusion between intron 4 of ETV6 (NM_001987) and intron 1 of MECOM (NM_001205194) that led to increased expression of EVI1/MECOM from an alternative translation start site in exon 3. All four cases were diagnosed with de novo AML and had adverse outcome despite receiving intensive induction therapy as per institutional protocol; cases 3, 5, and 6 were primary refractory and case 4 died during induction. These four fusions are known markers of poor prognosis, for which no targeted therapies currently exist [6,7,8]. While ETV6-MECOM and PICALM-MLLT10 are not uncommon in AML, NUP98 fusions are collectively found in only 2–4% of AML cases [6, 9]. To our knowledge, case 6 represents the third reported leukemia patient with a NUP98-BPTF fusion (Supplementary Table 5). The first report of NUP98-BPTF was in a young adult with T-cell acute lymphoblastic leukemia (ALL) [10] and the second was in an infant with acute megakaryoblastic leukemia [11]; both fusions were identified via RNA-seq. Clinical presentations of these three NUP98-BPTF cases are in line with the observation that NUP98 fusions can occur in both myeloid neoplasms and T-cell ALL. Interestingly, cells from the first two cases were also characterized by complex karyotypes. Based on the shared presence of complex karyotype and the proximity of NUP98 to the telomere, we speculate that the prevalence of NUP98-BPTF may be underestimated.

Fig. 1

Cryptic NUP98-KDM5A and NUP98-BPTF fusion events in case 4 (left panels) and case 6 (right panels), respectively. a, e Partial karyograms of chromosomes 11, 12, and 17 and relevant cytogenetic findings. b, f Schematic representations of SVs in aforementioned chromosomes. Rearranged chromosomes, predicted from SVs and cytogenetics, are shown below. Arrows represent genes in 5′ to 3′ direction, letters represent genomic segments demarcated by case-specific breakpoints, apostrophes represent inverted segments, dashed lines represent SVs, and red dashed lines represent SVs leading to gene fusions. c, g Reverse transcription (RT)-PCR and Sanger sequencing of fusion transcripts. Arrows represent fused exons in 5′ to 3′ direction and letters below nucleotide codons represent corresponding amino acids. d, h Predicted fusion proteins and their domain structures adapted from UniProt. Arrows represent fused proteins and numbers represent amino acid positions.

NUP98 encodes a component of the nuclear pore complex and forms fusions with at least 31 different partner genes, many of which contain recurrent protein domains, such as homeodomain (HD) and plant homeodomain (PHD) [6, 9]. In particular, the PHD domain, which is a specific binder of trimethylated histone 3 lysine 4, is found in six of the partner genes including KDM5A and BPTF [9, 12] (Fig. 1d, h and Supplementary Fig. 3). PHD domains of both NUP98-KDM5A and wild-type BPTF are essential in the activation of HOX family genes, which is associated with stemness and poor prognosis [12,13,14]. NUP98-BPTF fusion in case 6 retains the C-terminal PHD domains of BPTF, so we predict that it can activate HOX genes, as previously investigated by Roussy et al. [11]. In addition, PICALM-MLLT10 in case 5 is another known activator of HOX [8], suggesting that HOX activation may be a common leukemogenic mechanism in the context of complex karyotype.

A later timepoint sample was available from case 5 whose disease progressed 10 months after diagnosis with marked increases in CD33 and CD117 antigens (Supplementary Table 6). WGS and RNA-seq were performed on this later sample to identify genomic changes that might be associated with the progression. Upon progression, a large deletion in 6q16.1-q22.31 was lost and a cryptic 4q12 deletion, resulting in the FIP1L1-PDGFRA fusion, was acquired (Supplementary Fig. 2b). This fusion constitutively activates PDGFRA and is a target of imatinib, a tyrosine kinase inhibitor [15]. In keeping with the growth factor independent state provided by the fusion transcript, the blast count rose from 41 × 10⁹/L at presentation to 464 × 10⁹/L at progression. This is to our knowledge the first report of FIP1L1-PDGFRA, an oncogenic driver, arising as a cooperating event during AML progression. It is possible that the progressed leukemia could have been controlled with imatinib, but we cannot predict the response of the diagnostic clone, which only carries PICALM-MLLT10 and presumably is not driven by an activated receptor tyrosine kinase.

We next focused on the karyotypes and genomes of the four cases with fusions to infer the reasons that made them cytogenetically invisible. In case 3, while sequencing detected a balanced translocation between 12p13.2 (ETV6) and 3q26.2 (MECOM) that was missed by cytogenetics, karyotyping identified three abnormalities, add(3)(q27), del(3)(q27), and add(12)(p13), near the fusion breakpoints. Due to the complexity of this case’s karyotype, which contains 31 abnormalities, it was probably not possible to discern the ETV6-MECOM rearrangement. For case 5, the t(10;11)(p12.31;q14.2) translocation that creates the PICALM-MLLT10 fusion was reported as t(10;11)(p1?2;q21), highlighting the limits of cytogenetic resolution in nonstimulated bone marrow cultures. For case 4, cytogenetics reported t(12;17)(p13;q11.2) but genomic analysis revealed a three-way translocation t(11;12;17)(p15.4;p13.3;q11.2) with NUP98-KDM5A fusion arising from the t(11;12) (Fig. 1a, b). The breakpoints for NUP98 (11p15.4) and KDM5A (12p13.33) were cryptic because they both lie in terminal G-light material and the rearranged segments are very short (3.7 and 0.4 Mb, respectively), below the resolution limit of G-banding [2]. For case 6, genomic investigation revealed a complex rearrangement pattern involving five interchromosomal translocations between chromosomes 11, 12, and 17 and a deletion in chromosome 12 (Fig. 1e, f). Collectively, they led to the loss of five chromosomal segments and the joining of NUP98 at 11p15.4 to BPTF at 17q24.2. One of the five translocations joined ETV6 at 12p13.2 to LRRC4C at 11p12 and could be ascribed to t(11;12)(p13;p13) from the karyotype, but it did not produce a functional, in-frame fusion transcript.
Overall, three factors contributed to making a fusion event cytogenetically cryptic: the high number of cytogenetic abnormalities in a complex karyotype case, the proximity of a breakpoint to the telomere which tends to be G-light, and the complex rearrangement pattern, which conceals the fusion-causing translocation. An approach combining WGS and RNA-seq is advantageous in that it can bypass these factors in detecting gene fusions.

Our findings support the utility of integrated analysis of WGS and RNA-seq to identify genomic lesions of clinical importance that are undetected or incompletely revealed by cytogenetics. We provide further evidence that fusions resulting from the exchange of distal segments are difficult to ascertain by conventional cytogenetics. With continuing decrease in sequencing costs and increase in sequencing capacity, low-coverage WGS and RNA-seq are cost-effective methods to complement cytogenetics and provide a more complete picture of each leukemia. Furthermore, detecting gene fusions by sequencing can reveal a therapeutic target and/or provide markers for residual disease monitoring. The uncovering of known, and yet elusive, biomarkers in AML using genomics technologies argues for a more routine use of these established methods to help in understanding and accurately defining adverse-risk leukemias.