A novel targeted RNA-Seq panel identifies a subset of adult patients with acute lymphoblastic leukemia with BCR-ABL1-like characteristics.

BCR-ABL1-like B-cell precursor acute lymphoblastic leukemia (BCP-ALL) remains poorly characterized in adults. We sought to establish the frequency and outcome of adolescent and adult BCR-ABL1-like ALL using a novel RNA-Seq signature in a series of patients with BCP-ALL. To this end, we developed and tested a custom RNA-Seq panel of 42 genes related to a BCR-ABL1-like signature in a cohort of 100 patients with BCP-ALL treated in risk-adapted ALL trials. Mutations related to BCR-ABL1-like ALL were studied in a panel of 33 genes by next-generation sequencing (NGS). CRLF2 overexpression and IKZF1/CDKN2A/B deletions were also analyzed. Twenty out of 79 patients (12-84 years) were classified as BCR-ABL1-like (25%) based on heatmap clustering, with significant overexpression of ENAM, IGJ, and CRLF2 (P ≤ 0.001). The BCR-ABL1-like subgroup accounted for 29% of 15-60-year-old patients, with the following molecular characteristics: CRLF2 overexpression (75% of cases), IKZF1 deletions (64%), CDKN2A/B deletions (57%), and JAK2 mutations (57%). Among patients with postinduction negative minimal residual disease, those with the BCR-ABL1-like ALL signature had a higher rate of relapse and shorter complete response duration than non-BCR-ABL1-like patients (P = 0.007). Thus, we have identified a new molecular signature of BCR-ABL1-like ALL that correlates with adverse prognosis in adult patients with ALL.


Introduction
The 2016 World Health Organization (WHO) classification of acute leukemias recognizes nine different entities within B-cell precursor acute lymphoblastic leukemia/lymphoma (BCP-ALL) and two new provisional entities, including BCR-ABL1-like. These 11 subtypes are based on specific molecular alterations, mainly chromosome rearrangements, that promote the formation of aberrant chimeric proteins and aneuploidies 1. Next-generation sequencing (NGS) and array technologies have been instrumental in identifying new ALL subtypes, and have aided in the discovery of new leukemogenic mechanisms. It has been known for many years that a subset of patients with ALL (~25% of BCP-ALL) have no established abnormalities, commonly referred to as B-other ALL. The genomic landscape of B-other ALL is becoming increasingly clear, and the proportion of unclassifiable patients has declined significantly 2,3. In this context, BCR-ABL1-like B-ALL has emerged as one of the most relevant new subtypes due to its frequency and the potential benefit of targeted therapies (i.e., ABL and JAK inhibitors).
Philadelphia chromosome-positive ALL is defined by the t(9;22)(q34;q11) translocation that encodes the BCR-ABL1 oncogene, a constitutively active kinase. This aberration is present in >95% of patients with chronic myeloid leukemia, and in 3-5% and 25% of pediatric and adult ALL cases, respectively. In 2009, the DCOG/Erasmus and COG/St. Jude groups independently discovered a high-risk BCR-ABL1-negative subgroup in children with B-ALL, exhibiting a gene expression signature similar to that of BCR-ABL1-positive ALL but lacking the BCR-ABL1 rearrangement 4. This subtype is associated with downregulation of B-cell development genes and overexpression of stem- and progenitor-cell genes. Clinically, this ALL subtype presents with high-risk features such as high white blood cell (WBC) count, poor response to induction chemotherapy, higher measurable residual disease levels, and low probability of survival [4][5][6][7][8].
The frequency of BCR-ABL1-like ALL has been reported as 20-30% in adults, with a peak of incidence in the adolescent and young adult population (up to 42%) [9][10][11][12]. The BCR-ABL1-like ALL subtype shows deletions in several transcription factors involved in B-cell development, including IKZF1, E2A, EBF1, and PAX5 13. Additionally, the main molecular characteristics of BCR-ABL1-like ALL are the multiple translocations in different cytokine receptor and kinase signaling genes such as ABL1 (excluding BCR association), JAK2, ABL2, PDGFRB, TYK2, CSF1R, CRLF2, and EPOR. These alterations trigger the activation of growth-promoting kinase or cytokine signaling pathways 14. CRLF2 translocation and mutations in the JAK family genes are recurrent and result in the activation of JAK-STAT pathways in patients with BCR-ABL1-like B-ALL [15][16][17].
Several different approaches have been employed for the characterization of BCR-ABL1-like ALL. In early studies, patients were classified using hierarchical clustering on a gene expression array 6,8. A simpler approach was subsequently optimized based on Low Density Arrays carrying a small number of genes selected by microarray prediction analysis 15,18. In the present study, we designed a targeted NGS RNA-Seq panel of 42 genes to classify patients with BCR-ABL1-like ALL. We sought to identify the BCR-ABL1-like ALL signature by targeted expression in a series of adolescent and adult patients with BCP-ALL, but without recurrent genetic abnormalities defined by the WHO classification (henceforth B-other ALL), and to evaluate its clinical, prognostic and therapeutic relevance.

Patients and study design
We examined bone marrow (BM) or peripheral blood (PB) samples from adolescent and adult patients newly diagnosed with B-other ALL and treated between 2003 and 2017 in several Spanish hospitals. Patients received frontline chemotherapy according to PETHEMA (Programa Español de Tratamientos en Hematología) ALL risk and minimal residual disease (MRD)-oriented trials 19,20 . A first series of patients treated between 2002 and 2012 was evaluated for the identification of the BCR-ABL1-like signature (n = 49), and a second series of patients treated between 2012 and 2017 was used for validation (n = 100).
The study design is shown in Fig. 1. One hundred patients with BCR-ABL1-negative B-ALL were selected for molecular studies. Of those, 16 were discarded [Burkitt Lymphoma (n = 4), undifferentiated acute leukemia (n = 2), ALL with MLL rearrangements (n = 2), TCF3-PBX1 (n = 1), ETV6/RUNX1 (n = 1), hyperdiploidy (n = 3) and hypodiploidy (n = 1) not classified as BCR-ABL1-like based on the RNA-Seq signature, and two patients with low quality clinical data]. The remaining 84 patients, classified as B-other ALL, were considered valid for BCR-ABL1-like signature identification by RNA-Seq. Of these, 28 were excluded from survival analysis due to the heterogeneity of treatment protocols (including palliative therapy) and the absence of MRD assessment. Therefore, the prognostic relevance of the BCR-ABL1-like signature was investigated in a cohort of 56 homogeneously treated patients (median age 34 years; range 16-59 years). The main clinical characteristics and prognosis of patients with BCR-ABL1-like B-ALL are shown in Table 1.
Details for patient samples are provided in Supplementary Table S1. The study was conducted in accordance with the principles of the Declaration of Helsinki, and the protocols were approved by the appropriate institutional review boards. All patients provided written informed consent for the analysis of their biological specimens.
MRD assessment by multiparametric flow cytometry
BM MRD levels were centrally assessed by multiparameter flow cytometry at the end of induction (weeks 5-6) in complete remission (CR) patients and at the end of the third consolidation cycle (weeks 16-18). MRD levels at this latter time point were used to assign postconsolidation therapy (continuation chemotherapy or allogeneic hematopoietic stem cell transplantation). The detection limit of the method was 1 × 10⁻⁴. MRD was considered positive when it exceeded 0.01% (1 × 10⁻⁴) at the end of induction and after consolidation.

Molecular biology analyses
CRLF2 overexpression
One microgram of RNA was retrotranscribed to evaluate the expression levels of CRLF2 relative to GAPDH by real-time quantitative PCR (qPCR). The probes were acquired from Gene Expression Taqman Assays (Thermo Fisher, Palo Alto, CA): Hs00845692_m1 (CRLF2) and Hs02786624_g1 (GAPDH). Overexpression was determined by means of the 2^−ΔΔCt method 21, and was defined as positive when it was ≥0.1% relative to GAPDH gene expression.
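The threshold rule above can be illustrated with a minimal sketch. It assumes a simplified ΔCt form (target versus GAPDH, no calibrator sample) and roughly 100% amplification efficiency; the source cites the 2^−ΔΔCt method, and how the calibrator entered the authors' calculation is not stated. The function names and Ct values are hypothetical.

```python
def relative_expression_percent(ct_target: float, ct_reference: float) -> float:
    """Target expression as a percentage of the reference gene.

    Assumption: one PCR cycle corresponds to a twofold change
    (about 100% amplification efficiency), and no calibrator is used.
    """
    delta_ct = ct_target - ct_reference
    return 100.0 * 2.0 ** (-delta_ct)


def is_overexpressed(ct_target: float, ct_reference: float,
                     cutoff_percent: float = 0.1) -> bool:
    """Apply the >=0.1%-of-reference cutoff described in the text."""
    return relative_expression_percent(ct_target, ct_reference) >= cutoff_percent


if __name__ == "__main__":
    # Hypothetical Ct values: CRLF2 at cycle 28.0, GAPDH at cycle 20.0.
    print(relative_expression_percent(28.0, 20.0))  # percentage of GAPDH
    print(is_overexpressed(28.0, 20.0))
```

A ΔCt of 10 cycles corresponds to just under 0.1% of the reference signal, so under these assumptions the cutoff sits close to a ten-cycle difference.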
Targeted RNA-sequencing
cDNA was obtained after reverse transcription of 1 µg of RNA. cDNA integrity was checked by qPCR with a GUSB Taqman probe (Hs00939627_m1) (Thermo Fisher Scientific), discarding cDNA with a Ct > 25 at a threshold of 0.1.
The quality of the amplified cDNA libraries was evaluated using Bioanalyzer High Sensitivity chips (Agilent Technologies, Santa Clara, CA), and the libraries were quantified with Ion Library TaqMan™ Quantitation Kits (Thermo Fisher Scientific). Libraries were diluted to 100 pM and pooled equally, assigning 150,000 reads per sample. Pooled libraries were amplified using the Ion Chef System with the Ion 540 Sequencing Kit (Thermo Fisher Scientific). Enriched libraries on a chip were sequenced on the Ion GeneStudio S5 System using the Ion S5 Sequencing Kit (Thermo Fisher Scientific) with 500 flows. The absolute normalized Reads Per Million (RPM) matrix was obtained from the RNA-Seq Analysis plug-in (v5.4.0.1) within Torrent Suite software (v5.10) (Thermo Fisher Scientific). The matrix was subsequently normalized intra-patient relative to GUSB because it was the housekeeping gene with the highest Spearman correlation coefficient (ρ) of the reads between patients within the matrix data, compared to the other three genes. A final analysis was performed using the web-based tool Morpheus (https://software.broadinstitute.org/morpheus/), a matrix visualization and analysis platform, obtaining an unsupervised hierarchical cluster heatmap using one minus Pearson correlation as the metric and average linkage as the linkage method. The raw RNA-Seq sequencing data were uploaded to NCBI with BioProject ID: PRJNA613841.
[Table footnotes: Complex karyotype is defined as more than four chromosomal alterations. b Other: del12p (n = 1); del6q (n = 1); del7p plus del9p (n = 1); del9p (n = 2); del6q plus del12p (n = 1); other deletions (n = 3); other rearrangements (n = 4); other alterations (n = 4).]
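The two analysis steps described for the targeted RNA-Seq data (intra-patient normalization to GUSB, then hierarchical clustering with one minus Pearson correlation and average linkage) can be sketched in plain Python. This is an illustration under stated assumptions, not the Morpheus implementation: the RPM values, the patient labels, and the gene label `OTHER` are invented, and cutting the tree into two clusters is a choice made only for this example.

```python
import math
from itertools import combinations


def normalize_to_reference(rpm, reference="GUSB"):
    """Divide each gene's RPM by the same sample's reference-gene RPM."""
    ref = rpm[reference]
    return {gene: [v / r for v, r in zip(values, ref)]
            for gene, values in rpm.items() if gene != reference}


def sample_profiles(normalized, samples):
    """Turn a gene -> values mapping into sample -> expression profile."""
    return {s: [values[i] for values in normalized.values()]
            for i, s in enumerate(samples)}


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering with 1 - Pearson distance, average linkage."""
    names = list(profiles)
    dist = {frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Invented RPM values for four hypothetical patients (P1..P4).
SAMPLES = ["P1", "P2", "P3", "P4"]
RPM = {
    "GUSB":  [100, 200, 100, 200],
    "CRLF2": [800, 1800, 100, 400],
    "IGJ":   [600, 1000, 200, 200],
    "ENAM":  [700, 1200, 100, 400],
    "OTHER": [100, 400, 600, 1600],  # hypothetical gene label
}

if __name__ == "__main__":
    profiles = sample_profiles(normalize_to_reference(RPM), SAMPLES)
    print(average_linkage(profiles, n_clusters=2))
```

Because Pearson correlation ignores scale, samples group by the shape of their expression profile across genes rather than by absolute read depth, which is one reason the reference-gene normalization and the distance metric are worth reading together.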
Libraries were prepared following the Ampliseq® protocol using at least 10 ng of template DNA per reaction. Multiple indexed libraries were pooled and sequenced on the Ion GeneStudio S5 System using the Ion S5 Sequencing Kit, with 500 flows. Samples were sequenced to an average coverage of 1900×. Alignment and variant detection were performed using Ion Reporter v5.10 with the human reference genome (hg19). Variants were manually reviewed in Integrative Genomics Viewer v2.3.81 (Broad Institute, Cambridge, MA). Variants were classified as benign, of unknown significance, pathogenic or likely pathogenic according to VARSOME software 23. The raw DNA-Seq sequencing data were uploaded to NCBI with BioProject ID: PRJNA614523.

Statistical analyses
Baseline characteristics were reported as frequency and percentage for categorical variables and as median and range for quantitative variables. Comparisons of proportions and medians between groups were performed using the χ² test, Fisher's exact test, or the nonparametric median test as appropriate. Overall survival (OS) was measured from the time of diagnosis to the time of death from any cause. Disease-free survival (DFS) was measured from the date of achievement of the first remission until the date of relapse or death from any cause. Cumulative incidence of relapse (CIR) was calculated from the date of achievement of the first remission until the date of relapse. Patients who died without relapse were counted as a competing risk.
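The CIR definition above, with death without relapse as a competing risk, corresponds to the usual nonparametric cumulative incidence estimator. The sketch below is a plain-Python illustration on invented follow-up data; the status coding is an assumption of this example, and it is not the R or SPSS routine used in the study (Gray's test is not implemented here).

```python
def cumulative_incidence(times, status, cause=1):
    """Nonparametric cumulative incidence for one cause with competing risks.

    status coding (assumed for this sketch):
      0 = censored, 1 = relapse, 2 = death without relapse.
    Returns [(time, cumulative incidence)] at every time with any event.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before this time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [s for (tt, s) in data if tt == t]
        d_cause = sum(1 for s in tied if s == cause)
        d_any = sum(1 for s in tied if s != 0)
        if d_any:
            # Increment: still event-free just before t, then fail from `cause`.
            cif += event_free * d_cause / at_risk
            event_free *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= len(tied)
        i += len(tied)
    return curve


if __name__ == "__main__":
    # Invented follow-up times (months) and outcomes for five patients.
    months = [1, 2, 3, 4, 5]
    outcome = [1, 2, 0, 1, 0]
    for t, value in cumulative_incidence(months, outcome):
        print(t, round(value, 3))
```

Treating competing deaths as censored observations and using one minus Kaplan-Meier instead would overestimate relapse incidence, which is why the two event types are tracked separately.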
Patients not known to have relapsed or died at last follow-up were censored on the date they were last examined. OS and DFS curves were estimated using the Kaplan-Meier method, and the log-rank test was used for comparisons between groups. CIR curves were estimated using cumulative incidence rates and were compared by Gray's test. Two-sided P-values < 0.05 were considered statistically significant. Multivariable analyses were performed using the Cox proportional hazards model for OS and DFS, and the Fine and Gray model for CIR. The statistical package SPSS version 24.0 (Statistical Package for Social Sciences Inc., Chicago, IL) and R 3.4.2 software were used for all analyses.
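As an illustration of the Kaplan-Meier estimation mentioned above, here is a minimal plain-Python sketch on invented data. It shows only the product-limit estimate; the log-rank test, Cox model, and the software actually used by the authors (SPSS, R) are outside its scope.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    events: 1 = event observed (e.g., death), 0 = censored at that time.
    Returns [(time, S(time))] at each time where at least one event occurs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (tt, e) in data if tt == t]
        deaths = sum(tied)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= len(tied)
        i += len(tied)
    return curve


if __name__ == "__main__":
    # Invented follow-up (years) for four patients; the second is censored.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the denominator up to their last examination and then drop out, which matches the censoring rule stated in the text.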

BCR-ABL1-like signature
[Figure: color key, expression levels]
[…] ratio of mutations did not allow us to assess the possible prognostic significance of JAK/STAT pathway mutations, N/KRAS, IKZF1, or PAX5 point mutations. The univariable and multivariable models for the 56 patients are shown in Supplementary Tables S2 and S3.
CRLF2 overexpression: association with BCR-ABL1-like, JAK/STAT mutation pathway and prognosis
Among the 56 B-other ALL patients with clinical data, CRLF2 overexpression clustered in the BCR-ABL1-like subgroup [12/16 (75%) vs 9/40 (22%) non-BCR-ABL1-like patients, P < 0.001]. Samples for NGS molecular studies of DNA were available for 42/56 patients (14/16 BCR-ABL1-like and 24/40 non-BCR-ABL1-like). A total of 57 non-synonymous variants affecting 15 genes were identified among the 42 patients. At least one variant could be identified in 79% (33/42) of the patients, and at least one pathogenic mutation was identified in 57% (24/42). The distribution of the mutated genes is shown in Fig. 5a and in more detail in Supplementary Table S4. JAK2 mutations (more recurrently c.2047A>G and p.R683G) were enriched in BCR-ABL1-like ALL patients [9/14 (64%) vs non-BCR-ABL1-like 3/28 (11%), P = 0.001]. The distribution of JAK2 mutations across the protein domains is shown in Fig. 5b.
Significant differences in overexpression of CRLF2 were found between BCR-ABL1-like and non-BCR-ABL1-like groups (see Supplementary Table S5, P < 0.001), and additional mutations were found in JAK-STAT, RAS, and transcription factors such as IKZF1 or PAX5. Statistically significant differences were found in the rate of JAK/STAT mutations between BCR-ABL1-like patients and the remaining B-other ALL patients [BCR-ABL1-like 9/14 (64%) vs 3/28 (11%), P = 0.001], but not for RAS genes or lymphoid transcription factors.
Of note, no differences were observed in CR achievement, MRD clearance or outcome between CRLF2+/BCR-ABL1-like and CRLF2+/non-BCR-ABL1-like patients (data not shown).
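The CRLF2 comparison reported above (12/16 BCR-ABL1-like vs 9/40 non-BCR-ABL1-like patients with overexpression) can be checked with a two-sided Fisher's exact test. The source does not state which of its listed tests was applied to this particular comparison, so the sketch below is only one plausible reading; the function is written from the hypergeometric definition rather than taken from any statistics package.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: BCR-ABL1-like, non-BCR-ABL1-like; columns: CRLF2+, CRLF2-.
    print(fisher_exact_two_sided(12, 4, 9, 31))
```

With these counts the result should fall below 0.001, in line with the reported P < 0.001.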
Prognostic impact of other molecular alterations: IKZF1 and CDKN2A/B deletions
Analysis of copy number alterations by MLPA was available for 44/56 B-other ALL patients. Results showed that 75% of these patients (33/44) had at least one deletion in IKZF1 or CDKN2A/B, but no significant differences were found between BCR-ABL1-like and non-BCR-ABL1-like groups, or for individual deletions in the case of codeletion of both genes (Table 1).
As shown in univariable and multivariable analysis (Supplementary Tables S2 and S3), CDKN2A/B deletions showed statistical significance for outcome prediction in the 56 B-other ALL patients. However, we observed significant differences regarding the impact of these deletions within BCR-ABL1-like and non-BCR-ABL1-like subgroups individually. Although the number of patients was low, BCR-ABL1-like patients with CDKN2A/B deletion (n = 6) showed lower OS than equivalent patients without loss of CDKN2A/B (n = 8) [4-year OS 17% (95% CI: 0%; 46%) vs 83% (95% CI: 53%; 100%), P = 0.041]. However, these differences showed only a trend for DFS and were not statistically significant for CIR, in which patients with CDKN2A/B loss had higher relapse incidence in both subgroups individually. By contrast, the prognostic differences in OS and DFS observed for BCR-ABL1-like were not observed within the non-BCR-ABL1-like patients, although non-BCR-ABL1-like patients with CDKN2A/B deletion showed a trend towards higher CIR than those without CDKN2A/B loss (data not shown).
Subgroup analysis was repeated for patients with both deletions, to assess whether the concomitant loss of IKZF1 and CDKN2A/B had a different prognosis in BCR-ABL1-like and non-BCR-ABL1-like cohorts individually. Again, although the patient numbers were low, we observed that within the BCR-ABL1-like subset, patients harboring both deletions experienced significantly lower probability of survival than those without both alterations (4/5 vs 5/9 deaths, P = 0.029), mainly due to a higher relapse rate (3/4 vs 5/8 relapses, P = 0.043) (Fig. 6). Within the remaining B-other ALL subgroup, patients with deletion of both genes also experienced more relapses, although the results were not statistically significant (data not shown).

Discussion
We sought to identify an RNA-Seq signature for BCR-ABL1-like ALL in a series of homogeneously treated adolescent and adult patients with B-other ALL, and analyze its prognostic impact. We demonstrate the capacity of a simplified targeted RNA-Seq signature to segregate BCR-ABL1-like patients within a BCP-ALL adult population. This approach enabled us to confirm that BCR-ABL1-like is a high-risk ALL subtype also in young, adolescent, and older adults despite treatment with modern MRD-oriented protocols.
The prognostic impact of the BCR-ABL1-like subtype has been mainly reported in pediatric and adolescent populations, and studies in adult BCP-ALL are scarce 6,8,16. The DCOG/Erasmus and St. Jude groups used different gene profiles to define the BCR-ABL1-like subtype and, consequently, there is a lack of standardization and comparability regarding the best strategy to identify these patients. The frequency (29%) and clinical outcome of BCR-ABL1-like ALL found in this study are in accord with those of the MD Anderson and St. Jude reports 9,11: Roberts and coworkers 11 characterized 194/798 (24%) patients (age range 21-86 years) as BCR-ABL1-like, and Jain et al. 9 identified 49/148 (33%) patients (age range 15-71 years). Regarding IKZF1 deletions, the MD Anderson group classified 73% of patients within the BCR-ABL1-like population as positive, St. Jude reported 81% of IKZF1 losses, and we found 64% of mutated patients. In our study, we identified 75% of patients as CRLF2+ and 57% with mutated JAK2, similar to the published results. Indeed, 61% and 51% of CRLF2+ and 45% and 27% of mutated JAK2 have been reported within BCR-ABL1-like patients in the St. Jude and MD Anderson studies, respectively.
The prognostic impact of the BCR-ABL1-like ALL subtype in adults homogeneously treated within MRD-oriented trials remains unclear. In our series, these patients had higher WBC and blast counts than non-BCR-ABL1-like patients, suggesting a greater degree of cell proliferation with a strong capacity for dissemination, an essential characteristic related to a higher degree of clonal heterogeneity and treatment resistance. Interestingly, a high degree of cell cycle deregulation (57% CDKN2A/B deletion) and stem cell-like characteristics as a result of IKZF1 losses (64%) might play an important role in the aggressiveness and resistance seen in the present and other series of BCR-ABL1-like patients. As mentioned, 75% of BCR-ABL1-like patients overexpressed CRLF2 and 64% had mutations in the JAK-STAT pathway (and all CRLF2+ patients also had JAK2 mutations), supporting the idea that this is an essential pathway in this aggressive subtype. Most JAK2 mutations identified in the present study were pathogenic and involved the tyrosine-kinase domains 1 and 2, close to the nucleotide binding site for ATP-ADP exchange (residues 855-863), and structurally distant from the self-regulatory tyrosines 1007 and 1008. Currently, several clinical trials are evaluating the efficacy of JAK and mTOR inhibitors (in addition to tyrosine-kinase inhibitors), alone or in combination with other drugs, based on this genetic rationale [24][25][26].
IKZF1 and CDKN2A/B deletions are known to be associated with poor prognosis in pediatric and adult B-ALL populations [27][28][29] , mainly in the BCR-ABL1 B-ALL subpopulation. Our results suggest that, despite the low number of cases analyzed, CDKN2A/B deletions are markers of poor survival in B-other ALL, and the association between CDKN2A/B and IKZF1 deletions might also contribute to the dismal prognosis of BCR-ABL1-like. Finally, the survival analysis suggests that all CRLF2+ patients show poor response to standard treatment and bad prognosis whether or not they are BCR-ABL1-like. Unfortunately, we did not have enough samples to evaluate the prognostic significance of additional alterations (i.e., JAK2 mutations, IKZF1, and CDKN2A/B deletions) in the cohort of 56 B-other ALL patients or within the subgroups of patients with and without CRLF2 overexpression.
Due to the limited number of BCR-ABL1-like patients in our series, the difference in CR achievement between BCR-ABL1-like and non-BCR-ABL1-like patients did not reach statistical significance, although there were four times as many patients not achieving CR in the BCR-ABL1-like subgroup. Taken together with the low MRD clearance at the end of induction seen in BCR-ABL1-like patients (three-quarters of BCR-ABL1-like patients were MRD-positive at the end of induction compared with one-third of non-BCR-ABL1-like patients), this indicates higher treatment resistance. Given the small number of B-other patients (n = 10) who were MRD-positive at the end of consolidation, we could not evaluate the role of HSCT in BCR-ABL1-like and non-BCR-ABL1-like patients.
In addition to treatment resistance, the high relapse rate observed in BCR-ABL1-like patients, including, more importantly, those who were MRD-negative at the end of induction, suggests that standard treatments are less effective for this ALL subtype. Also, while MRD has become the most powerful outcome predictor in ALL, it is not fully predictive, and other factors beyond MRD (e.g., genetic alterations) also impact patients' prognosis, especially in the B-other subpopulation.
The BCR-ABL1-like signature methodology clearly distinguishes the outcome of patients in whom no recurrent genetic abnormalities could be identified by standard methods. Specifically, loss of CDKN2A/B identifies patients at high risk of disease progression among those with non-available genetic risk categorization. We also provide more detailed information on the prognostic impact of IKZF1 and CDKN2A/B deletions (together and separately), specifically within the BCR-ABL1-like subtype, where they seem to confer poor outcome. By contrast, the prognostic relevance of these alterations in the non-BCR-ABL1-like subgroup is less clear. Due to the paucity of samples, more studies are needed to address these issues.
Currently, methodical characterization of BCR-ABL1-like ALL is expensive and laborious. The multiple rearrangements present in this subtype and the constellation of other molecular alterations characteristic but not exclusive of this entity make NGS an attractive option, although this requires complex bioinformatic analysis. Interestingly, the approach shown here enables the identification of the BCR-ABL1-like signature by targeted RNA-Seq within 3 days of sample arrival with a simple and fast NGS library protocol and subsequent sequencing. It is also reproducible, as we have shown in the validation cohort. Finally, a simple computer plug-in will output a normalized matrix that will predict patient outcome in the hierarchical clustering heatmap.
In summary, our study demonstrates that targeted RNA-Seq correctly identifies BCR-ABL1-like ALL in BCP-ALL patients. This methodology is inexpensive, available, and feasible, and could be introduced in the routine clinical workup for ALL patients. We have also confirmed the poor prognosis of BCR-ABL1-like ALL in the adult setting. An early diagnosis of BCR-ABL1-like ALL could be critical to initiate appropriate treatment depending on the kinase profile (i.e., JAK inhibitor, tyrosine-kinase inhibitor, etc.). These results endorse the inclusion of these molecular approaches in further clinical protocols for adopting early clinical decisions with the goal of better managing patients with ALL.
Supplementary information is available at Blood Cancer Journal website.