PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia


Recent genomic studies have identified chromosomal rearrangements defining new subtypes of B-progenitor acute lymphoblastic leukemia (B-ALL); however, many cases lack a known initiating genetic alteration. Using integrated genomic analysis of 1,988 childhood and adult cases, we describe a revised taxonomy of B-ALL incorporating 23 subtypes defined by chromosomal rearrangements, sequence mutations or heterogeneous genomic alterations, many of which show marked variation in prevalence according to age. Two subtypes have frequent alterations of the B lymphoid transcription-factor gene PAX5. One, PAX5alt (7.4%), has diverse PAX5 alterations (rearrangements, intragenic amplifications or mutations); a second subtype is defined by PAX5 p.Pro80Arg and biallelic PAX5 alterations. We show that p.Pro80Arg impairs B lymphoid development and promotes the development of B-ALL with biallelic Pax5 alteration in vivo. These results demonstrate the utility of transcriptome sequencing to classify B-ALL and reinforce the central role of PAX5 as a checkpoint in B lymphoid maturation and leukemogenesis.

Fig. 1: Integrative B-ALL subtypes.
Fig. 2: Mutational profile of PAX5-altered (PAX5alt) B-ALL.
Fig. 3: Mutational profile of PAX5 P80R B-ALL.
Fig. 4: Distribution of signaling mutations in B-ALL subtypes.
Fig. 5: Gene expression signature of PAX5 P80R.
Fig. 6: Event-free and overall survival of the PAX5 p.Pro80Arg (P80R) subtype.
Fig. 7: PAX5 P80R impairs B cell differentiation and drives development of B-ALL.

Data availability

The raw and analyzed data are provided in a graphical, interactive platform (see URLs). Genomic data generated for this study have been deposited in the European Genome-phenome Archive (EGA) under accession number EGAS00001003266. Other legacy data used in this study have been deposited in the EGA in previous projects under accession numbers EGAS00001000654, EGAS00001001952, EGAS00001001923, EGAS00001002217 and EGAS00001000447. The TARGET genomic data used in this study are available through the TARGET website (see URLs) and also in dbGaP (see URLs) under accession number phs000218 (TARGET). The other data supporting this study are available from the corresponding author upon reasonable request.



We thank the Biorepository, the Genome Sequencing Facility of the Hartwell Center for Bioinformatics and Biotechnology, and the Cytogenetics core facility of SJCRH. This work was supported by the American Lebanese Syrian Associated Charities of SJCRH, American Society of Hematology Scholar Award (to Z.G. and K.G.R.), the Leukemia & Lymphoma Society’s Career Development Program Special Fellow Award (to Z.G.), St. Baldrick’s Foundation Robert J. Arceci Innovation Award (to C.G.M.), Amgen, Inc. to ECOG-ACRIN, NCI Outstanding Investigator Award R35 CA197695 (to C.G.M.), National Institute of General Medical Sciences grant P50 GM115279 (to C.G.M.), NCI grants P30 CA021765 (St. Jude Cancer Center Support Grant), ECOG-ACRIN Operations Center grants CA180820 (to P. O’Dwyer from University of Pennsylvania and the Abramson Cancer Center), CA189859 (to E.P.), CA180790 (to M.R.L.) and CA180791 (to M.S.T. and Y.Z.).

Author information




Z.G. and C.G.M. designed the study, analyzed the data and wrote the manuscript. M.L.C. performed experiments, analyzed the data and wrote the manuscript. K.G.R. performed sample preparation and data analysis. K.G.R., D.P. and C.-H.P. analyzed survival data. I.M., S. Pelletier, S.G., H.B., D.P.-T., A.H. and I.I. performed experiments. J.N. and J.D. provided in vitro modeling data. X.Z. developed the data portal webpage. K.H., L.S., S. Pounds, C.Q., S.N. and J.Z. analyzed genomic data. C.C., M.D. and Y.D. performed biostatistical analysis. S.R., J.G.-F., E.A.R., M.J.B., B.L.W., W.L.C., P.A.Z.-M., K.R.R., L.A.M., K.W.M., A.R., O.S., J.P.R., M.D.M., J.M.R., S.L., M.R.L., M.S.T., J.R., Y.Z., R.B., J.K., K.M., C.D.B., W.S., S.K., H.M.K., M.K., W.E., S.J., J.Y., E.P., J.D., M.V.R., M.L.L. and S.P.H. provided clinical samples and data.

Corresponding author

Correspondence to Charles G. Mullighan.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 B-ALL subtypes.

a, Heatmap of the gene expression profiles of 1,988 cases, clustered by Pearson correlation and Ward’s clustering method based on the 500 most variable genes (evaluated by median absolute deviation). B-ALL subtypes specified in Fig. 1a are annotated at the top of the heatmap. The newly identified subtypes, including PAX5 P80R, PAX5alt and IKZF1 N159Y, are highlighted in red; the subtypes not annotated in Fig. 1a are highlighted in blue. The PAX5alt group was defined by this hierarchical clustering of gene expression profiles. Within the cluster of 173 cases enriched for PAX5 alterations (highlighted by a red rectangle), 25 were classified as other subtypes, including NUTM1 (N = 1), BCL2/MYC (N = 1) and Ph-like (N = 23), and the remaining 148 cases were classified as PAX5alt. b, Distribution of B-ALL subtypes in different age groups. The definition of age groups is described in Table 1. The KMT2A-like (N = 5) and ZNF384-like (N = 4) subtypes are merged with KMT2A and ZNF384, respectively. The subtypes are grouped as gross chromosomal alteration, transcription factor (TF) rearrangement, other TF alteration, kinase driven and others. AYA, adolescent and young adult; SR, standard risk; HR, high risk.
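The gene-selection and distance steps named in panel a can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the gene names and expression values are made up, the function names are hypothetical, and the Ward linkage step that would follow is left to a clustering library.

```python
from math import sqrt
from statistics import median


def mad(values):
    """Median absolute deviation of one gene's expression values."""
    m = median(values)
    return median(abs(v - m) for v in values)


def top_variable_genes(expr, k):
    """expr maps gene -> per-sample values; return the k genes with the largest MAD."""
    return sorted(expr, key=lambda gene: mad(expr[gene]), reverse=True)[:k]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def sample_distance_matrix(expr, genes):
    """Pairwise sample distance, defined here as 1 - Pearson r over the selected genes."""
    n_samples = len(next(iter(expr.values())))
    profiles = [[expr[g][s] for g in genes] for s in range(n_samples)]
    return [[1 - pearson(p, q) for q in profiles] for p in profiles]


# Toy data (hypothetical): four genes measured in four samples.
expr = {
    "GENE_A": [1.0, 1.1, 5.0, 5.2],
    "GENE_B": [2.0, 2.0, 2.1, 2.0],
    "GENE_C": [0.5, 0.4, 3.0, 3.3],
    "GENE_D": [7.0, 7.5, 1.0, 1.2],
}
selected = top_variable_genes(expr, 3)   # the study used the top 500
dist = sample_distance_matrix(expr, selected)
```

In this toy input the nearly constant GENE_B is dropped, and samples 0 and 1 end up much closer to each other than to samples 2 and 3, which is the structure a hierarchical clustering would then recover.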

Supplementary Figure 2 Pipeline for defining B-ALL subtypes.

Details of cohort distribution across different age groups are shown at the top. Gene rearrangements identified by transcriptome sequencing (RNA-seq) were used to define 10 B-ALL subtypes. Available karyotypic information and chromosome-level copy number alterations called from RNA-seq were used to distinguish aneuploidy: high hyperdiploid, low hypodiploid and near haploid. B-ALL subtypes (N = 11, yellow box) showing distinct gene expression profiles and with a sufficient number of cases (n ≥ 10) were used as the training dataset for Prediction Analysis for Microarray (PAM; Tibshirani, R. et al., Proc Natl Acad Sci U S A. 99, 6567–6572, 2002) to predict “-like” subtypes. Detailed rules are provided in Table 1. chr no., chromosome number.
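The idea behind PAM, nearest shrunken centroids, can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy: it soft-thresholds each class centroid toward the overall centroid by a fixed amount `delta`, but omits the per-gene standard-deviation scaling and class priors of the published method. The subtype labels, expression values and function names are hypothetical.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within [-delta, delta] become 0."""
    if d > delta:
        return d - delta
    if d < -delta:
        return d + delta
    return 0.0


def fit_shrunken_centroids(samples, labels, delta):
    """Return {label: shrunken centroid} for per-sample expression vectors."""
    n_genes = len(samples[0])
    overall = [sum(s[j] for s in samples) / len(samples) for j in range(n_genes)]
    centroids = {}
    for label in set(labels):
        members = [s for s, lab in zip(samples, labels) if lab == label]
        raw = [sum(m[j] for m in members) / len(members) for j in range(n_genes)]
        # Genes whose class mean barely differs from the overall mean are
        # shrunk onto the overall mean and stop contributing to classification.
        centroids[label] = [
            overall[j] + soft_threshold(raw[j] - overall[j], delta)
            for j in range(n_genes)
        ]
    return centroids


def predict(centroids, sample):
    """Assign the label of the nearest shrunken centroid (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy training set (hypothetical): two subtypes, three genes; gene 3 is uninformative.
samples = [[5.0, 1.0, 2.0], [5.2, 0.8, 2.1], [1.0, 4.0, 2.0], [0.8, 4.2, 1.9]]
labels = ["subtype_X", "subtype_X", "subtype_Y", "subtype_Y"]
centroids = fit_shrunken_centroids(samples, labels, delta=0.5)
call = predict(centroids, [4.8, 1.2, 2.5])
```

A case without a defining rearrangement whose profile falls nearest a trained subtype centroid is the kind of case the caption describes as a "-like" prediction.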

Supplementary Figure 3 Distribution of PAX5 alterations.

a, Distribution of PAX5 rearrangements (PAX5r) and sequence mutations (PAX5mut). The same parameters as in Fig. 1a are used in this tSNE plot, and all samples in this study are included (N = 1,988). PAX5 rearrangements with JAK2 and ZCCHC7 are most commonly observed in Ph/Ph-like subtypes, while the other PAX5 rearrangements (PAX5r-other) are clustered in the PAX5alt group. Two frequently observed rearrangements, PAX5-ETV6 and PAX5-NOL4L, are highlighted. b, Distribution of all types of PAX5 alterations in 1,141 cases with both SNP array and RNA-seq data available. The copy number alterations (CNAs) were called from SNP array and divided into the following types: 1 copy gain/loss, a broad copy gain or loss, which could be a chromosomal or arm-level CNA; focal 1 copy gain/loss, in which no more than 10 canonical genes are encompassed in the CNA region; partial 1 copy loss/gain, in which the CNA breakpoint is in the PAX5 gene body; CN-LOH, copy-neutral loss of heterozygosity; del between PAX5 and ZCCHC7, a deletion commonly observed in B-ALL that could result in PAX5-ZCCHC7 fusion; partial 1 copy loss within PAX5, a type of focal deletion with both start and end breakpoints in the PAX5 gene body (intragenic). Focal intragenic amplifications of PAX5 (N = 10; PAX5amp) are specifically enriched in the PAX5alt group (N = 8). c, Distribution of PAX5 alterations in each B-ALL subtype.
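The CNA categories in panel b are defined by where the breakpoints fall relative to the gene and by how many genes the segment spans. A rough sketch of such a rule set is below. The coordinates are placeholders rather than real PAX5 positions, the function name is hypothetical, and treating every segment of more than 10 genes as "broad" is a simplifying assumption, since the caption defines broad events as chromosomal or arm-level.

```python
def classify_cna(cna_start, cna_end, gene_start, gene_end, n_genes_in_cna):
    """Classify one CNA segment relative to a gene of interest.

    Coordinates are positions on the same chromosome; n_genes_in_cna is the
    number of canonical genes the segment encompasses.
    """
    start_inside = gene_start < cna_start < gene_end
    end_inside = gene_start < cna_end < gene_end
    if start_inside and end_inside:
        return "intragenic"   # both breakpoints within the gene body
    if start_inside or end_inside:
        return "partial"      # exactly one breakpoint within the gene body
    if cna_start <= gene_start and cna_end >= gene_end:
        # Whole gene covered: focal if at most 10 genes are encompassed.
        return "focal" if n_genes_in_cna <= 10 else "broad"
    return "no overlap"


# Placeholder gene span (not real genomic coordinates).
GENE = (1000, 2000)
calls = [
    classify_cna(1200, 1500, *GENE, n_genes_in_cna=1),
    classify_cna(500, 1500, *GENE, n_genes_in_cna=3),
    classify_cna(900, 2100, *GENE, n_genes_in_cna=4),
    classify_cna(0, 100000, *GENE, n_genes_in_cna=250),
]
```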

Supplementary Figure 4 PAX5 internal tandem duplication (ITD) of exons 2–5.

a, PAX5-ITD (PAX5amp) detected by whole genome sequencing (WGS) in Integrative Genomics Viewer (Robinson, J.T. et al., Nat Biotechnol 29, 24–26, 2011). The upper scatter plot shows WGS coverage relative to the germline sample. The genomic region with elevated copy number is highlighted by a red bar. The red arc denotes a tandem duplication encompassing PAX5 exons 2–5. Transcriptome sequencing coverage from the same sample is shown as a blue histogram, which shows elevated expression of the gained exons. Below are the aligned WGS reads; the discordant pairs, shown in red, support the structural variation. b, Wild-type and mutant PAX5 with amplified exons (e) 2–5. Primers (shown as arrows) were designed to amplify the fragments with the 5ʹ end in exon 5 and the 3ʹ end in exon 2 (primer e5: GACACCAACAAGCGCAAGAGAGAC; e2: TGATGAGCAAGTTCCACTATCCTC). c, Representative electropherogram of Sanger sequencing showing the junction of exon boundaries characterizing the duplication of exons 2–5. d, Fluorescence in situ hybridization (FISH) confirming the presence of a PAX5 (exons 2–5) tandem duplication. Duplication is indicated by paired red signals (PAX5 exons 2–5 fosmid clone) associated with a green signal (chromosome 9 control probe). Sixty-three percent of analyzed cells were determined to be positive for the PAX5 duplication. A FISH validation in normal metaphase cells confirming the localization is shown in the left panel.

Supplementary Figure 5 Event-free (EFS) and overall survival (OS) of pediatric B-ALL cases.

Kaplan-Meier estimates for EFS and OS of children treated on St. Jude (SJ) Total protocols. P values are calculated by the two-sided time-stratified Cochran–Mantel–Haenszel test across all the subtypes in each panel. Detailed analysis results are provided in Supplementary Table 25. Patient numbers per group: favorable subtypes (high hyperdiploid, ETV6-RUNX1, TCF3-PBX1 and DUX4), 304; KMT2A, 33; PAX5 P80R, 6; PAX5alt, 31; Ph, 17; Ph-like, 27; other (BCL2/MYC, CRLF2 (non-Ph-like), ETV6-RUNX1-like, TCF3-HLF, iAMP21, MEF2D, NUTM1, ZNF384 and all other), 56.
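For readers unfamiliar with the estimator behind these curves, a minimal Kaplan-Meier sketch follows. The follow-up times and event flags are invented for illustration and are not patient data from this study; the function name is hypothetical, and the Cochran–Mantel–Haenszel comparison between subtypes is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. relapse or death) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data: five patients, two of them censored.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Censored patients never lower the curve directly; they only reduce the number at risk for later event times, which is why small groups such as the six PAX5 P80R cases yield wide steps.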

Supplementary Figure 6 CRISPR/Cas9 engineering of Pax5 P80R and G183S loci.

a, Schematic representation of the Pax5 gene and mutations introduced to generate Pax5P80R and Pax5G183S mouse lines. sgRNAs (blue) targeting exon 3 or exon 5 were used to introduce Pax5 P80R or Pax5 G183S mutations (red). Several silent mutations were also introduced to facilitate PCR genotyping and prevent Cas9-mediated cleavage of the loci after repair. These include mutations disrupting the protospacer adjacent motif (PAM) sequences (underlined) and protospacer elements (orange). Blue arrow, Pax5-e3-F1; green arrow, Pax5-e3-R1; yellow arrow, Pax5-e5-F1; red arrow, Pax5-e5-R1; arrowhead, Cas9 cleavage sites. b, Representative Sanger sequencing validation of Pax5 wild-type, P80R, and G183S alleles to assign genotypes.

Supplementary Figure 7 Inferring chromosomal copy number alterations (CNAs) from transcriptome sequencing data.

Gene expression level (rlog) evaluated by DESeq2 was normalized and shown for each chromosome to indicate whole-chromosome copy number gain or loss (upper). The boxplot for each chromosome shows the median value as the center line, and the 25% and 75% quantiles as the lower and upper hinges of each box, respectively. The lower whisker equals the smallest observation greater than or equal to lower hinge - 1.5 * IQR (interquartile range), and the upper whisker reaches the largest observation less than or equal to upper hinge + 1.5 * IQR. The skyblue line indicates the rlog from all the chromosomes, and the red line shows the median expression level of genes on chromosomes with 2 copies. With copy number changes, the mutant allele frequencies (MAFs) of SNVs change and the density peaks of MAF are skewed (lower, highlighted in red if the highest peak is not around 0.5). Homozygous duplication of a chromosome can be recognized by elevated gene expression level, but is not noticeable on the MAF density plot (for example, chromosomes 14 and 21). The example patient ID is SJALL040088 and the figure, with the exception of chromosome 15, was highly consistent with the karyotype: 61,XX,+X,+3,+4,+5,+6,+10,+11,+12,del(12)(p11.2),+14,+15,+16,+17,+18,+21,+21 (14/70%) 62,idem,+mar (3/15%) 46,XX (3/15%). As shown in Supplementary Table 34, we observed consistency in calling of aneuploidy of autosomes between SNP array CNA data and RNA-seq data; erroneous calling on karyotyping may arise from miscalling of suboptimal metaphase data.
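The hinge and whisker definitions in this caption translate directly into a few lines of code. The sketch below uses Python's standard `statistics.quantiles` with the inclusive method as an assumed quantile convention (plotting packages may compute hinges slightly differently), and the input values are arbitrary rather than rlog values from the study.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Compute hinges and whiskers as described in the caption."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers stop at the most extreme observations inside the 1.5 * IQR fences.
    lower_whisker = min(v for v in values if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Arbitrary example values; 30 lies beyond the upper fence and is an outlier.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

Here the IQR is 4, so the fences sit at -3 and 13: the whiskers reach 1 and 8, and the value 30 would be drawn as an individual point beyond the upper whisker.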

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7

Reporting Summary

Supplementary Tables

Supplementary Tables 1–34

About this article

Cite this article

Gu, Z., Churchman, M.L., Roberts, K.G. et al. PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nat Genet 51, 296–307 (2019).
