Early T-cell precursor acute lymphoblastic leukaemia (ETP ALL) is an aggressive malignancy of unknown genetic basis. We performed whole-genome sequencing of 12 ETP ALL cases and assessed the frequency of the identified somatic mutations in 94 T-cell acute lymphoblastic leukaemia cases. ETP ALL was characterized by activating mutations in genes regulating cytokine receptor and RAS signalling (67% of cases; NRAS, KRAS, FLT3, IL7R, JAK3, JAK1, SH2B3 and BRAF), inactivating lesions disrupting haematopoietic development (58%; GATA3, ETV6, RUNX1, IKZF1 and EP300) and histone-modifying genes (48%; EZH2, EED, SUZ12, SETD2 and EP300). We also identified new targets of recurrent mutation including DNM2, ECT2L and RELN. The mutational spectrum is similar to myeloid tumours, and moreover, the global transcriptional profile of ETP ALL was similar to that of normal and myeloid leukaemia haematopoietic stem cells. These findings suggest that addition of myeloid-directed therapies might improve the poor outcome of ETP ALL.
Acute lymphoblastic leukaemia (ALL) is the most common malignancy of childhood, with 85% of cases being of B-cell lineage, and 15% T-cell lineage1. Recent studies have identified a subtype of T-cell acute lymphoblastic leukaemia (T-ALL) termed “early T-cell precursor” (ETP) ALL that comprises up to 15% of T-ALL, and is associated with a high risk of treatment failure2. ETP ALL is characterized by lack of expression of the T-lineage cell surface markers CD1a and CD8, weak or absent expression of CD5, aberrant expression of myeloid and haematopoietic stem cell markers (for example, CD13, CD33, CD34 and CD117), and a gene expression profile reminiscent of the murine early T-cell precursor3. The normal ETP, or double negative 1 (DN1) thymocyte retains the ability to differentiate into cells of both the T-cell and myeloid, but not B-cell, lineages4.
Somatic genetic alterations in ETP ALL
ETP ALL cases commonly exhibit a high burden of DNA copy number alterations, but lack a known unifying genetic alteration2. To define the landscape of genetic alterations in ETP ALL, The St Jude Children's Research Hospital–Washington University Pediatric Cancer Genome Project performed whole-genome sequencing (WGS) for matched leukaemic and normal DNA from 12 children with ETP ALL (Supplementary Tables 1–4 and 7, Supplementary Figs 2–4), and determined the frequency of mutations in a separate cohort of 52 ETP and 42 non-ETP childhood T-ALL cases, 82 of which had matched remission DNA. Transcriptome sequencing was performed for two WGS cases, and whole-exome sequencing for three ETP samples in the recurrence cohort (Supplementary Table 8). Putative somatic sequence mutations, and structural alterations identified using CREST5, were validated by PCR and sequencing. We identified an average of 1,140 sequence mutations (range 235–1,929) and 12 structural variations (range 0–25) per case (Supplementary Tables 9, 11 and 12, Supplementary Fig. 5), including 154 non-silent sequence variants. Fifty-four per cent of the missense mutations were predicted to be deleterious (Supplementary Table 12), indicating that many of these variants are involved in leukaemogenesis.
Structural rearrangements in ETP ALL
We detected 181 structural variations across the WGS cases (Supplementary Results and Supplementary Tables 13, 14, Fig. 1 and Supplementary Fig. 6). Most abnormalities identified by cytogenetics were also evident on analysis of WGS data (Supplementary Table 15). We also observed evidence of telomere shortening on analysis of WGS data (Supplementary Fig. 7). Three cases (SJTALL001, SJTALL002, SJTALL003) had multiple complex rearrangements with breakpoints suggestive of a single cellular catastrophe (“chromothripsis”6; Supplementary Figs 6 and 8). Case SJTALL001 had a nonsense mutation in MLH3, a DNA mismatch repair gene with a role in DNA double strand break repair, and SJTALL003 had a missense mutation in DCLRE1C, which encodes the non-homologous end-joining factor ARTEMIS, indicating a potential causal relationship between these mutations and the acquisition of structural variations. Case SJTALL007 had a deletion disrupting the mismatch repair gene MSH5, and also harboured multiple structural rearrangements.
Remarkably, 51% (77 out of 151) of the validated structural variations had breakpoints in coding genes, including genes with known roles in haematopoiesis and leukaemogenesis, or genes also targeted by sequence mutations (for example, MLH3, SUZ12 and RUNX1). A majority of these structural variations (65 out of 77, 84%) are predicted to result in loss-of-function of the involved genes, or occur as part of complex translocations that result in the formation of chimaeric fusion proteins. Ten chimaeric genes encoding six fusion proteins were detected in five cases (Supplementary Table 16). These resulted in the expression of novel in-frame fusion genes disrupting haematopoietic regulators, including ETV6–INO80D (case SJTALL002), NUP214–SQSTM1 (SJTALL009) and NAP1L1–MLLT10 (SJTALL013) (Supplementary Figs 9–12). Case SJTALL012 harboured a RUNX1–EVX1 rearrangement arising from trans-splicing (Supplementary Figs 10 and 11). No additional cases with these chimaeric fusions were identified upon testing 77 ETP and non-ETP ALL cases with available RNA by PCR with reverse transcription. However, exome sequencing identified ETV6–INO80D in case SJTALL208 (Supplementary Fig. 13). ETV6 encodes a transcription factor required for definitive haematopoiesis that is frequently altered in leukaemia7,8,9. Deletions and mutations of ETV6 were present in 33% of ETP and 10% of non-ETP T-ALL cases (Supplementary Fig. 14).
Sequence mutations in ETP ALL
In addition to genes known to be mutated in T-ALL, including NRAS10,11 (N = 3 out of 12 WGS cases), JAK1 (ref. 12, N = 2), NOTCH1 (ref. 13, N = 1), FLT3 (refs. 14–16, N = 1), PHF6 (ref. 17, N = 3) and WT1 (ref. 18, N = 1) (Supplementary Fig. 15), we identified multiple novel recurring targets of mutation. These included DNM2 (N = 2), ECT2L (N = 2), EP300 (N = 2), GATA3 (N = 2), IL7R (N = 2), JAK3 (N = 3), RELN (N = 2) and RUNX1 (N = 4) (Table 1, Fig. 2, Supplementary Tables 17 and 18, Supplementary Fig. 15). For the two cases also analysed by transcriptome sequencing (SJTALL002 and SJTALL012), 21 out of 38 mutations were expressed. We did not observe selective expression of mutant alleles, with the exception of those with a concomitant deletion of the wild-type allele (for example, KRAS in SJTALL002).
Of 42 genes analysed by Sanger sequencing and single-nucleotide polymorphism microarray analysis in the recurrence cohort, 27 were recurrently mutated (Supplementary Table 19, Figs 2 and 3a, and Supplementary Figs 15 and 16). Of 254 validated non-silent mutations (Supplementary Table 17), 40.7% were indel mutations and 9.4% were nonsense mutations. Eighty-two per cent of missense mutations were predicted to be deleterious, a marked enrichment compared with mutations identified in the WGS samples, consistent with the majority being driver mutations.
We observed a high frequency of mutations known or predicted to result in aberrant cytokine receptor and RAS signalling in ETP ALL. Forty-three out of 64 (67.2%) of ETP cases had mutations in these pathways, compared to 8 out of 42 (19%) non-ETP cases (P < 0.0001; Table 1, Fig. 3b and Supplementary Table 20). Known or predicted activating mutations were identified in BRAF, FLT3, IGFR1, JAK1, JAK3, KRAS and NRAS (Supplementary Results). Three cases harboured the JAK3 M511I mutation located adjacent to the pseudokinase domain that has been identified previously in acute myeloid leukaemia and is transforming when introduced into murine haematopoietic progenitor cells19. The pseudokinase domain mutation, A573V, has previously been identified in acute megakaryoblastic leukaemia and is transforming20. The mutations identified in JAK1 are novel, but are in close proximity to sites of activating mutations previously identified in ALL12.
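The comparison of mutation frequencies between ETP and non-ETP cases can be illustrated with a 2×2 contingency test. The following is a minimal sketch, assuming a two-sided Fisher's exact test; the text does not state which test produced the reported P value, so the choice of test and the function name `fisher_exact_two_sided` are assumptions made for illustration only. The counts (43 of 64 ETP and 8 of 42 non-ETP cases) are taken from the paragraph above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. ETP, non-ETP); columns are mutated / not mutated.
    Assumption: this test is used for illustration, not taken from the paper.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated cases in the first group
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# ETP: 43 mutated, 21 not; non-ETP: 8 mutated, 34 not (counts from the text)
p_value = fisher_exact_two_sided(43, 21, 8, 34)
print(f"P = {p_value:.2e}")
```

Under this assumed test, the resulting P value falls below the 0.0001 threshold quoted in the text.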
Seven cases (five ETP and two non-ETP) harboured mutations in IL7R encoding the IL7RA (interleukin 7 receptor alpha) chain (Fig. 4a). IL7RA forms a heterodimer with IL2RG (common gamma chain) to form the receptor for the cytokine IL7, and with CRLF2 (cytokine receptor like factor 2) to form the receptor for TSLP (thymic stromal lymphopoietin). IL7R and CRLF2 signalling are important in early lymphoid maturation21. Rearrangement of CRLF2 is observed in B-progenitor ALL22,23, and IL7R mutations have recently been identified in ALL24. All seven cases had an in-frame insertion or substitution at residues I241–V253 of the IL7R transmembrane domain. Consistent with prior data, expression of several of the IL7R mutant alleles in the cytokine-dependent murine haematopoietic Ba/F3 and MOHITO25 cell lines resulted in transformation to cytokine-independent cell growth (Fig. 4b, c). In six cases the mutations introduced a cysteine into the transmembrane domain that induces dimerization of the receptor in the absence of ligand (Fig. 4d). The mutations also induced Stat5 phosphorylation that was attenuated by Jak inhibition (Fig. 4e). Expression of mutant, but not wild-type, Il7r in primary murine haematopoietic progenitors resulted in enhanced colony replating in vitro (Fig. 4f, g), indicating that the IL7R alterations are transforming events in T-ALL.
We also identified a high frequency of alterations of genes with roles in haematopoietic and lymphoid development, including RUNX1, IKZF1, ETV6, GATA3 and EP300 (57.8% of ETP cases versus 16.7% of non-ETP T-ALL cases, P < 0.0001). Importantly, several of these genes were targeted by multiple mechanisms of alteration across the cohort: sequence mutation, deletion and chromosomal translocations. Six cases (all ETP) had inactivating mutations of GATA3, four of which were biallelic due to either biallelic sequence mutations (SJTALL179, R276Q and A310_T317>VRP; SJTALL010 N286T and S271_W275fs) (Fig. 2) or due to concomitant deletion of the second allele (Supplementary Table 18). GATA3 encodes GATA binding protein 3, a member of a family of highly conserved zinc-finger transcription factors that is required for the development of early T-lineage progenitors26, and is mutated in the hypoparathyroidism with sensorineural deafness and renal dysplasia syndrome (HDR)27. In four cases the mutation was at R276, a residue also mutated in HDR27. The R276P mutation results in impaired DNA-binding affinity of GATA3 for its DNA targets, indicating that the mutations observed in T-ALL are likely to be loss of function. An additional case, SJTALL011, harboured a somatic mutation in GATA2, R307W, which is also located in the highly conserved GATA zinc-finger domain and is homologous to the GATA3 R276W mutation.
Twelve cases (ten ETP, and two non-ETP) harboured alterations of RUNX1. Two cases had concomitant deletion of the non-mutated allele, and three had RUNX1 deletions but no sequence mutation. RUNX1 is required for definitive haematopoiesis28 and normal T-lymphoid development, and is commonly rearranged and mutated in myeloid and lymphoid malignancies (Supplementary Results)8,29,30,31,32. The mutations observed in T-ALL commonly involve the Runt domain, include frameshift and nonsense mutations, and are predicted to be deleterious. Nine cases (eight ETP) had deletions or sequence mutations of IKZF1 (IKAROS), which encodes a zinc-finger transcription factor required for the development of all lymphoid lineages that is commonly mutated in high-risk B-progenitor ALL and murine models of T-ALL (Supplementary Fig. 16).
A notable finding was a high frequency of somatic alterations targeting histone modification in ETP ALL. Six WGS cases had alterations in genes encoding components of the polycomb repressor complex 2 (PRC2), including deletions and sequence mutations of EED, EZH2 and SUZ12 (Table 1, Fig. 2 and Supplementary Fig. 17). EZH2 catalyses trimethylation of histone 3 lysine 27 (H3K27), resulting in transcriptional repression of genes involved in development, stem cell maintenance and differentiation33. Twenty-seven (42.2%) of ETP ALL cases harboured a deletion and/or sequence mutation in these genes, compared to five (11.9%) of non-ETP T-ALL cases (P = 0.001). Gain-of-function EZH2 Y641 mutations are common in lymphoma34. In contrast, structural modelling predicts that the mutations observed in T-ALL are likely to disrupt the catalytic SET domain and result in loss of function (Supplementary Results and Supplementary Figs 18 and 19). In addition, case SJTALL192 harboured a focal homozygous deletion of SETD2 which encodes a H3K36 trimethylase, and an additional four cases had loss-of-function mutations of this gene (Supplementary Fig. 20). Three cases had predicted loss-of-function mutations of the histone acetyltransferase gene EP300 (p300). Together, 31 ETP and 5 non-ETP cases had mutations affecting epigenetic regulation, which were biallelic or involved multiple genes in 10 cases (9 ETP and 1 non-ETP).
Novel recurrent somatic mutations
Recurring mutations were also identified in genes not previously known to be involved in lymphoid development or oncogenesis. DNM2 was mutated in 17 cases (13 ETP, 4 non-ETP), including two cases with biallelic mutations (Fig. 2). DNM2 encodes dynamin 2, a member of a family of large GTPases, and is involved in a wide range of cellular functions, including endocytosis, phagosome formation, intracellular trafficking, interaction with the actin and microtubule networks, and promotion of apoptosis35. Inherited DNM2 mutations result in the degenerative neurologic diseases Charcot–Marie–Tooth peripheral neuropathy and autosomal dominant centronuclear myopathy35. As in these diseases, the mutations in T-ALL are located throughout the gene in each functional domain, and include missense, nonsense, splice site and frameshift mutations, and are therefore likely to result in loss of DNM2 function. The role of DNM2 in lymphoid development and tumorigenesis is unknown, although it is expressed in leukaemic lymphoblasts (Supplementary Fig. 22).
Eight cases had missense, nonsense or splice site mutations in ECT2L (epithelial cell transforming sequence 2 oncogene-like). Four cases had non-synonymous mutations in RELN, which encodes reelin, a large secreted extracellular matrix protein involved in the regulation of neuronal migration, and which is mutated in the neurodevelopmental disorder autosomal recessive lissencephaly with cerebellar hypoplasia36. Notably, several cases had inherited mutations in these two genes that are predicted to be deleterious. Sequence mutations were also found in 12 regulatory RNA genes including one microRNA gene (MIR1297).
Mutations in multiple pathways in ETP ALL
Recurring mutations targeting genes regulating haematopoietic development (‘type II lesions’, for example, GATA3, RUNX1, ETV6, IKZF1 and EP300) and cytokine receptor and RAS signalling (‘type I lesions’) were present in 7 out of 12 WGS cases, with an additional three cases having either type I or type II lesions, indicating that these events are central to the pathogenesis of ETP ALL. Consistent with this, pathway analysis incorporating both sequence and structural mutations demonstrated enrichment for lesions in these pathways in the 12 WGS cases (Supplementary Table 23). Across the entire cohort, 52 out of 64 (81.3%) ETP cases harboured mutations in these pathways, compared to 13 out of 42 (31%) non-ETP T-ALL cases (P < 0.0001; Fig. 3b and Supplementary Table 20). Forty-eight per cent of ETP cases had mutations in the PRC2 genes sequenced, SETD2 and EP300, compared to 12% of non-ETP ALL (P = 0.0001). This is probably an underestimate of the frequency of mutations perturbing chromatin modification, as not all PRC2 and histone-modifying genes have been sequenced. Pathway analysis of the gene expression profile of ETP ALL (Supplementary Table 24 and Supplementary Figs 23–25) demonstrated significant positive enrichment for genes mediating JAK-STAT signalling, and negative enrichment for T-cell receptor signalling genes in ETP ALL. In addition, flow cytometric intracellular phosphosignalling analysis of primary leukaemic cells demonstrated activation of RAS and JAK-STAT signalling pathways in ETP ALL cases (Supplementary Fig. 26). Furthermore, reconstruction of the transcriptional network of ETP ALL using ARACNE37 identified 30 gene networks (‘regulons’) with RUNX1 and IKZF1 observed to be hub genes of several of these regulons. Thus, alterations of these haematopoietic transcription factors are key determinants of the transcriptional profile of ETP ALL (Supplementary Table 25).
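The size of the differences between ETP and non-ETP cases can be summarized as proportions and odds ratios. The sketch below is illustrative only: the odds ratio is not a statistic reported in the text, and the helper names `proportion` and `odds_ratio` and the pathway labels are assumptions. The counts are those stated in the text (43/64 versus 8/42 for cytokine receptor and RAS signalling, 27/64 versus 5/42 for PRC2 genes, and 52/64 versus 13/42 for the combined pathways).

```python
def proportion(k, n):
    """Fraction of n cases that carry an alteration."""
    return k / n


def odds_ratio(a, n1, c, n2):
    """Odds ratio for a of n1 altered cases versus c of n2 altered cases."""
    b, d = n1 - a, n2 - c  # unaltered cases in each group
    return (a * d) / (b * c)


# (ETP altered, ETP total, non-ETP altered, non-ETP total); counts from the text,
# labels are shorthand chosen for this sketch
comparisons = {
    "cytokine receptor / RAS signalling": (43, 64, 8, 42),
    "PRC2 genes (EED, EZH2, SUZ12)": (27, 64, 5, 42),
    "signalling or haematopoietic development": (52, 64, 13, 42),
}

for label, (a, n1, c, n2) in comparisons.items():
    print(
        f"{label}: ETP {proportion(a, n1):.0%}, "
        f"non-ETP {proportion(c, n2):.0%}, "
        f"odds ratio {odds_ratio(a, n1, c, n2):.1f}"
    )
```

An odds ratio well above 1 in each comparison is consistent with the enrichment of these lesions in ETP ALL described above; it says nothing about significance, which the text reports separately as P values.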
ETP ALL is a stem-cell leukaemia
The immunophenotype and gene expression of ETP ALL are similar to the murine early T-cell precursor3. However, detailed comparison of the gene expression profiles of ETP ALL and normal human haematopoietic progenitors has not been performed. Comparison of the gene expression profile of ETP ALL with those of purified normal38,39 and myeloid leukaemic40 haematopoietic stem cell and progenitor cell populations demonstrated marked negative enrichment of the gene expression profile of normal human early T-cell precursors (Supplementary Fig. 27). In contrast, the ETP ALL signature showed significant positive enrichment of the gene expression profile of normal human haematopoietic stem cells and granulocyte macrophage precursors. In addition, the ETP ALL signature demonstrated enrichment for a leukaemic stem-cell signature associated with poor outcome in acute myeloid leukaemia40, and a signature of poor outcome in IKZF1-mutated high-risk B-progenitor ALL41. Together, these data are compatible with the notion that the genetic alterations identified here result in gross maturational arrest and an aggressive poorly differentiated stem-cell-like leukaemia.
Although the striking uniformity of clinical features, immunophenotype and transcriptional profile suggests a common underlying genetic alteration in ETP ALL, we identified a remarkable diversity of novel recurrent genetic alterations. Despite this diversity, the prevalence of mutations in genes involving cytokine receptor and RAS signalling, haematopoietic development and histone modification suggests a common pathogenesis for the establishment of the ETP leukaemic clone. Mutations known or predicted to result in activated cytokine receptor and RAS signalling are present in two-thirds of ETP cases, but only 19% of non-ETP T-ALL. This includes mutations in genes with known roles in leukaemogenesis as well as novel targets of mutation (JAK3, IL7R, IFNR1 and BRAF). The ability of the identified IL7R activating mutations to induce factor-independent growth of haematopoietic cells coupled with the known function of the other identified signalling mutations strongly supports a direct role for these alterations in leukaemic cell transformation. The high frequency of deleterious mutations in PRC2 genes suggests that disruption of PRC2-mediated gene silencing is a key event in the pathogenesis of this primitive leukaemia, but not more differentiated T-ALL cases. Several of the genes recurrently mutated in ETP are also mutated in inherited disorders (DNM2, EP300, GATA3, NRAS, KRAS, PHF6, RELN and RUNX1), and the mutational spectrum in several of these genes is similar between the inherited disorders and T-ALL. Thus, sequencing of additional T-ALL cases and other leukaemia genomes will be of great interest to fully examine the relationship of inherited and acquired lesions in leukaemogenesis.
Mutation of genes regulating cytokine receptor and/or RAS signalling pathway and epigenetic modification is a common feature of acute myeloid leukaemia but is less common in T- or B-lineage ALL (Supplementary Table 28)42. Although the gene expression profile of ETP ALL is similar to that of the murine ETP, it shows strong similarity to that of normal and myeloid leukaemic haematopoietic stem cells. This indicates that ETP ALL is distinct from non-ETP T-ALL, and in fact represents a neoplasm of a less mature haematopoietic progenitor or stem cell, with arrest at a very early maturational stage that retains the capacity for myeloid differentiation. This observation raises the possibility that treatment regimens used to treat acute myeloid leukaemia, such as those incorporating high dose cytarabine, and/or targeted therapies that inhibit cytokine receptor and JAK signalling may be beneficial in ETP ALL.
Whole-genome sequencing was performed for tumour and normal DNA from 12 children with ETP ALL treated at St Jude Children’s Research Hospital. All cases fulfilled pathologic and immunophenotypic criteria for ETP ALL2. Tumour samples were obtained from diagnostic bone marrow aspirates or peripheral blood, and comprised at least 90% tumour cells. Matched non-tumour samples were obtained from remission blood or bone marrow aspirates with less than 1% leukaemic cells. Recurrence testing was performed using a cohort of 94 childhood T-ALL cases, comprising 52 ETP ALL cases from St Jude, the Children’s Oncology Group and the Associazione Italiana Ematologia de Oncologia Pediatrica (AIEOP), and 42 non-ETP T-ALL cases from St Jude. Whole-genome DNA sequencing was performed using a paired-end sequencing strategy as described in detail in the Supplementary Information. The frequency of the identified mutations in the recurrence cohort was determined using PCR amplification and Sanger sequencing and analysis of single-nucleotide polymorphism microarray data. The study was approved by the Institutional Review Boards of St Jude Children’s Research Hospital and Washington University.
The sequence data and single-nucleotide polymorphism microarray data have been deposited in the dbGaP database (http://www.ncbi.nlm.nih.gov/gap) under the accession number phs000340.v1.p1. Affymetrix U133A gene expression data have been deposited in the NCBI Gene Expression Omnibus under accession GSE33315, and Affymetrix U133 Plus 2.0 PM gene expression data under accession GSE28703. The nucleotide sequence for the full-length ETV6–INO80D transcript has been deposited in GenBank under accession JF736506. A public data portal for results from the St Jude – Washington University Pediatric Cancer Genome Project is available at http://www.explore.pediatriccancergenomeproject.org/.
We thank the many members of St Jude Children’s Research Hospital and The Genome Institute and Siteman Cancer Center at Washington University in St Louis for support. We thank H. Mulder for project sample management, M. Stine for assistance with data deposition, B. Pappas and S. Malone for information technology infrastructure, J. Morris, E. Walker, A. Merriman and G. Neale for performing single-nucleotide polymorphism and gene expression microarrays, W. Yang for assistance with analysis of genomic data, and J. Stokes for artwork. We thank the Tissue Resources Laboratory, the Flow Cytometry and Cell Sorting Core, and the Clinical Applications of Core Technology Laboratories of the Hartwell Center for Bioinformatics and Biotechnology of St Jude Children’s Research Hospital. We thank S. Kehoe of Beckman Coulter Genomics for assistance with Sanger sequencing. This work was funded by The St Jude Children’s Research Hospital – Washington University Pediatric Cancer Genome Project, ALSAC of St Jude Children’s Research Hospital, Cancer Center support grant P30 CA021765, NIH U01 GM 92666—PAAR4Kids, grants to R.K.W. from Washington University in St Louis and the National Human Genome Research Institute (NHGRI U54 HG003079), grants to the Children’s Oncology Group (NCI CA98543, CA98413, CA114766), and grants from Alex’s Lemonade Stand and St. Baldrick’s Foundation (to M.L.H.). S.L.H. was supported by a Haematology Society of Australasia and New Zealand New Investigator Scholarship. S.P.H. is the Ergen Family Chair in Pediatric Cancer. C.G.M. is a Pew Scholar in the Biomedical Sciences and a St. Baldrick’s Scholar.