Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia

Ma, Xiaotu; Edmonson, Michael; Yergeau, Donald; Muzny, Donna M.; Hampton, Oliver A.; Rusch, Michael; Song, Guangchun; Easton, John; Harvey, Richard C.; Wheeler, David A.; Ma, Jing; Doddapaneni, HarshaVardhan; Vadodaria, Bhavin; Wu, Gang; Nagahawatte, Panduka; Carroll, William L.; Chen, I-Ming; Gastier-Foster, Julie M.; Relling, Mary V.; Smith, Malcolm A.; Devidas, Meenakshi; Auvil, Jaime M. Guidry; Downing, James R.; Loh, Mignon L.; Willman, Cheryl L.; Gerhard, Daniela S.; Mullighan, Charles G.; Hunger, Stephen P.; Zhang, Jinghui

doi:10.1038/ncomms7604

Download PDF

Article
Open access
Published: 19 March 2015

Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia

Xiaotu Ma¹,
Michael Edmonson¹,
Donald Yergeau²,
Donna M. Muzny³,
Oliver A. Hampton³,
Michael Rusch¹,
Guangchun Song⁴,
John Easton²,
Richard C. Harvey⁵,
David A. Wheeler³,
Jing Ma⁴,
HarshaVardhan Doddapaneni³,
Bhavin Vadodaria²,
Gang Wu ORCID: orcid.org/0000-0002-1678-5864¹,
Panduka Nagahawatte¹,
William L. Carroll⁶,
I-Ming Chen⁵,
Julie M. Gastier-Foster^7,8,
Mary V. Relling⁹,
Malcolm A. Smith¹⁰,
Meenakshi Devidas¹¹,
Jaime M. Guidry Auvil¹²,
James R. Downing⁴,
Mignon L. Loh¹³,
Cheryl L. Willman⁵,
Daniela S. Gerhard¹²,
Charles G. Mullighan ORCID: orcid.org/0000-0002-1871-1850⁴,
Stephen P. Hunger¹⁴ &
…
Jinghui Zhang¹

Nature Communications volume 6, Article number: 6604 (2015) Cite this article

15k Accesses
239 Citations
57 Altmetric
Metrics details

Subjects

Abstract

There is incomplete understanding of genetic heterogeneity and clonal evolution during cancer progression. Here we use deep whole-exome sequencing to describe the clonal architecture and evolution of 20 pediatric B-acute lymphoblastic leukaemias from diagnosis to relapse. We show that clonal diversity is comparable at diagnosis and relapse and clonal survival from diagnosis to relapse is not associated with mutation burden. Six pathways were frequently mutated, with NT5C2, CREBBP, WHSC1, TP53, USH2A, NRAS and IKZF1 mutations enriched at relapse. Half of the leukaemias had multiple subclonal mutations in a pathway or gene at diagnosis, but mostly with only one, usually minor clone, surviving therapy to acquire additional mutations and become the relapse founder clone. Relapse-specific mutations in NT5C2 were found in nine cases, with mutations in four cases being in descendants of the relapse founder clone. These results provide important insights into the genetic basis of treatment failure in ALL and have implications for the early detection of mutations driving relapse.

Mutational patterns and clonal evolution from diagnosis to relapse in pediatric acute lymphoblastic leukemia

Article Open access 06 August 2021

Whole genome sequencing provides comprehensive genetic testing in childhood B-cell acute lymphoblastic leukaemia

Article Open access 19 January 2023

Clonal evolution and clinical implications of genetic abnormalities in blastic transformation of chronic myeloid leukaemia

Article Open access 14 May 2021

Introduction

Despite event-free survival rates for pediatric acute lymphoblastic leukaemia (ALL) that now exceed 85%, about 15% of children with ALL experience disease recurrence, most of whom will die, and relapsed ALL remains a leading cause of cancer-related death in children^1,2,3,4. Recent genomic studies have identified relapse-specific mutations in pediatric ALL^5,6,7 and have constructed clonal lineage from diagnosis to relapse for several cancers^8,9,10. These studies have provided insight into tumour heterogeneity and the evolutionary trajectory from diagnosis to relapse. Most studies have focused primarily on genetic lesions present in the clones that survive therapy (that is, the rising clones) but not those eradicated by therapy (that is, the falling clones)^8,10. While the technology does not exist to prove complete eradication of a clone, in this study we use the term ‘eradication’ to refer to clones that are no longer detectable within the limits of the assays. In addition to discovering genetic lesions responsible for relapse, deep genome-wide sequencing of matched samples obtained at diagnosis, remission and relapse has the potential to characterize evolutionary lineages and to address key issues in tumour clonal evolution. These include the uniqueness of genetic lesions in rising clones that persist to relapse; the relative mutation burden of rising and falling clones; whether or not the rising clone at relapse always arises from a minor clone at diagnosis; and the chronology of clonal emergence and mutation acquisition at diagnosis and relapse.

To investigate how genetic lesions contribute to the rise and fall of subclones from diagnosis to relapse in childhood B-ALL, we analyse somatic sequence mutations, structural variations (SVs) and DNA copy-number alterations of samples obtained at diagnosis, remission and relapse from 20 children with B-progenitor ALL (B-ALL) studied as a part of a collaborative study from the Children’s Oncology Group (COG), the National Cancer Institute Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative and the St Jude–Washington University Pediatric Cancer Genomic Project. The median age of the patients at diagnosis was 7 years (range 2 to 19). Cases were selected for analysis based on the occurrence of an early bone marrow relapse (<36 months; median 19.2 months, range 3.8 to 35.7), which is associated with very poor survival¹¹. High-coverage whole-exome sequencing (WXS) followed by deep sequencing of somatic variants identified at each time point in the trio samples unveils the characteristics of rising and falling subclones from diagnosis to relapse in this group of high-risk pediatric B-ALL.

Results

Somatic mutation profile at diagnosis and relapse

We performed WXS at high coverage (~200-fold) of samples obtained at diagnosis, remission and relapse from 20 patients treated on recent COG B-ALL trials (Methods, Supplementary Data 1 and Supplementary Fig. 1). Eleven cases were found to harbour known oncogenic gene fusions and rearrangements including TCF3-PBX1 (n=5), TCF3-HLF (n=1), IGH-CRLF2 (n=3), P2RY8-CRLF2 (n=1) and ETV6-RUNX1 (n=1) at diagnosis, all of which were retained at relapse. In addition to sequence mutations (that is, single-nucleotide variation (SNV) and insertion/deletion (indel), somatic SV and copy-number variation (CNV) were identified by analysis of whole-genome sequencing (WGS) and single-nucleotide polymorphism (SNP) array data. A total of 2,278 SNVs, 106 indels and 68 SVs were experimentally verified (Supplementary Data 2 and 3) by capture sequencing of diagnosis, relapse and remission DNA at an average of 696-fold coverage (Methods). The mutant allele fraction (MAF) values obtained from WXS showed high concordance with those from capture sequencing (Pearson’s correlation coefficient was 0.973 and the mean difference in MAF between WXS and capture sequencing was 0.007 (Supplementary Fig. 2)). The combined coverage of WXS and capture was 883-fold, which had the power to detect over 99% of the mutations with a MAF of at least 0.01. This represented the lower limit used for determining the presence of a mutant allele throughout this study (Methods). Variants below this level are referred as relapse- or diagnosis specific, as appropriate, with the caveat that failure to detect some could potentially be caused by limitations of technology and/or read depth. Seventy-two per cent of the mutations had MAF of <0.3, and were thus likely subclonal.

Each case had genetic alterations shared at diagnosis and relapse, including the chimeric gene fusions described above, indicating a common origin from a preleukaemic ancestral clone (Supplementary Fig. 3), consistent with previous SNP array data in B-ALL⁹. On average, 74% (range 14–100%) of the coding sequence mutations present at diagnosis persisted to relapse. Four relapsed tumours were hypermutable with 156–740 coding mutations, two of which had loss-of-function mutations in the DNA mismatch repair genes PMS2 and MSH6 at relapse. Three of these hypermutable cases had substantial changes in their mutational spectra, with the prevalence of transition mutations increasing from 60–70% at diagnosis to over 95% at relapse (Supplementary Fig. 4). For the remaining 16 non-hypermutable cases, the number of coding somatic sequence mutations at relapse (median 31, range 5–59) was significantly higher (P=0.004, Wilcoxon signed-rank test) than that at diagnosis (median 18, range 6–48). No case had a significant change in the number of SVs and CNVs at relapse. Across the entire cohort, 78% of CNVs (Supplementary Fig. 3b) and 82% of SVs present at diagnosis persisted to relapse, while 40% of CNVs and 7% of SVs were relapse specific (Supplementary Data 3 and 4). Although chemotherapy may have directly induced mutations observed at relapse, the limited exposure to known mutagenic agents in the chemotherapeutic regimens used, and the lack of substantial increases in sequence mutation burden or shift in mutation spectrum suggests this was not an important mechanism in the acquisition of mutations in the majority of cases.

Recurrently mutated genes/pathways at diagnosis and relapse

Integrated analysis of sequence mutations, SVs and CNVs identified six pathways mutated at high frequency at diagnosis and/or relapse: Ras signalling (65%), JAK-STAT signalling (25%), transcriptional regulation of lymphoid development (85%), nucleoside metabolism (45%), epigenetic modification (65%) and cell cycle regulation (60%; Fig. 1). The frequency of Ras pathway mutations was similar at diagnosis (50%) and relapse (65%). However, a notable finding was the presence of multiple subclonal Ras mutations, either in the same gene (NRAS, KRAS) or in different Ras pathway genes (KRAS, NRAS, PTPN11, FLT3) in nine cases at diagnosis (Fig. 1 and Supplementary Data 2). In five out of the nine cases, we observed convergence from multiple subclonal mutations in NRAS, KRAS, PTPN11 and FLT3 at diagnosis to a single clonal NRAS mutation at relapse (PARPRW, PAPNNX, PAPAGK, PAPJIB, PARPNM; Fig. 1 and Supplementary Data 2). For example, PAPJIB had subclonal mutations of KRAS p.Ala146Thr, NRAS p.Gly13Asp and PTPN11 p.Ser502Pro with MAFs of 0.021, 0.025 and 0.233, respectively, at diagnosis, and persistence of only the NRAS p.Gly13Asp mutation with MAF increased 10-fold to 0.234 at relapse. SNVs and SVs resulting in activation of the JAK signalling pathway were present in 25% of the cases at diagnosis and relapse. Loss-of-function mutations, mostly SVs and CNVs, in genes encoding regulators of B-cell development, were present in 75 and 80% of the cases at diagnosis and relapse, respectively. Cell cycle pathway alterations were present in 60% of the tumours at diagnosis and 50% of the tumours at relapse. This included two cases with diagnosis-specific CDKN2A/B deletions (10 cases had deletions at both diagnosis and relapse) and two cases with relapse-specific TP53 sequence mutations.

**Figure 1: Recurrently mutated pathways and genes in 20 ALL trios.**

Sequence mutations in NT5C2, a gene encoding a 5′-nucleotidase involved in purine metabolism, were observed exclusively at relapse in 45% of the cases, including known activating mutations in relapsed ALL^6,7 and a novel recurrent mutation p.Arg39Gln (Supplementary Fig. 5a). The mutation rate of epigenetic regulators was 45 and 65% at diagnosis and relapse, respectively, with recurrent relapse-specific mutations found in WHSC1 (n=2) and CREBBP (n=2; Supplementary Fig. 5b,d). Two lesions in this pathway were detected at very low level at diagnosis but highly enriched in relapse. The WHSC1 p.Glu1099Lys mutation in case PARFTR exhibited an increase in MAF from 0.01 at diagnosis to 0.49 at relapse, while an intragenic deletion of CREBBP in PARBRK, detectable at diagnosis only by high-coverage capture sequencing, was found to be clonal at relapse by analysis of WGS and SNP array data. In addition to these six pathways, frequent non-silent sequence mutations were found in FCGBP (immunoglobulin Fc gamma-binding protein; n=4) and USH2A (mutated in Usher syndrome and retinitis pigmentosa; n=4; Supplementary Fig. 5c).

Multiclonal mutations were observed in six genes in 10 cases at diagnosis or relapse: NT5C2 (n=3), KRAS (n=3), NRAS (n=2), JAK2 (n=2), CREBBP (n=1) and PAX5 (n=1) (Supplementary Data 5). Multiclonal mutations in NT5C2 and CREBBP were relapse specific, while multiclonal mutations in KRAS, NRAS, JAK2 and PAX5 at diagnosis exhibited variable persistence. Case PAPSPN is illustrative of the dynamic population changes of multiclonal mutations. This case had multiclonal mutations in three genes (JAK2, KRAS and PAX5), all of which showed switching of the predominant mutation site from diagnosis to relapse. Three mutations affecting JAK2 p.Arg683, a known hotspot of mutation in B-ALL^12,13,14, were present at diagnosis (Fig. 2a). JAK2 p.Arg683Ser was the predominant clonal mutation (MAF=0.42), while p.Arg683Gly (MAF=0.007) and p.Ile682_Glu684>GlyGly (MAF=0.005) were minor subclonal mutations (Fig. 2a). Only JAK2 p.Arg683Gly persisted to relapse to become a clonal mutation with MAF of 0.44. Importantly, the MAFs detected by WXS were verified by deep sequencing (Fig. 2b). Similarly, the dominant clonal mutation in KRAS switched from p.Gly12Asp at diagnosis to p.Ala146Thr at relapse, while the frameshift mutation PAX5 p.Glu97fs at diagnosis was replaced by a relapse-specific in-frame insertion p.Gly48>ValMetIleIleLysValSer. In addition to sequence mutations, we also detected multiclonal SVs in PAX5 (n=2) and CDKN2A (n=1; Supplementary Figs 6,7 and 16).

**Figure 2: Three subclonal *JAK2* mutations in patient PAPSPN.**

Clonal evolution from diagnosis to relapse

In light of the observed turnover of predominant driver mutations in genes or pathways (Figs 1 and 2), we used deep sequencing data derived from high-coverage WXS and mutation verification data to define clonality. We constructed clonal lineages based on frequency of mutation clusters at diagnosis and relapse⁸. To improve the precision of the analysis, we developed a binomial mixture model that explicitly accounts for read coverage and determines mutation clusters in three categories: diagnosis specific, relapse specific and shared (see Methods). Clonal architecture for each case is presented in Figs 3, 4, 5 and Supplementary Figs 7–23. Dynamic changes of subclonal population sizes were observed in most of the cases. Here we present three diverse examples to illustrate how clonal evolution resulted in the turnover of the predominant mutations in signalling pathways, the reversal of clonal dominance in a dual-lineage evolution and changes of mutation spectrum after acquiring a relapse-specific mutation in a DNA mismatch repair gene. In these analyses, we refer to three complementary concepts that incorporate MAF, overall mutation load and biological processes at each time point: clusters that use relative MAF at diagnosis and relapse to aggregate multiple mutations; clones that contain multiple clusters constructed by mathematical modelling and lineages that trace clonal rise and fall from ancestral, prediagnosis clones through diagnosis and relapse.

**Figure 3: Clonal architecture of diagnosis and relapse samples for PAPSPN.**

**Figure 4: Clonal architecture of diagnosis and relapse samples for PARJZZ.**

**Figure 5: Clonal architecture of diagnosis and relapse samples for case PASLZM.**

Case PAPSPN illustrates how clonal evolution resulted in the turnover of the predominant mutations in genes involved in multiple cellular pathways, including JAK-STAT signalling (JAK2), Ras signalling (KRAS) and lymphoid development (PAX5) as well as acquisition of relapse-associated mutations in epigenetic regulators (MLL2) and drug-metabolizing genes (NT5C2). Combining data at diagnosis and relapse, a total of six mutation clusters were identified (a–f, Fig. 3a). Clusters a and b were diagnosis specific; clusters c and d were shared in diagnosis and relapse; and clusters e and f were relapse specific. At diagnosis, the mutation clusters a and c have high MAF suggesting that they are present in nearly all tumour cells at diagnosis, while the mutation clusters b and d had low MAFs (~0.01) suggesting that they were present in a small fraction of tumour cells. Clonal identity was established from MAF and tumour specificity of mutation cluster as MAF of a cluster is expected to match the combined population frequency of all clones that contain this mutation cluster (Methods). For PAPSPN, three clones (designated Clone 1–3 in Fig. 3b) were identified at diagnosis; all evolved from a founder clone that had cluster c mutations along with focal deletions in CDKN2A/B, IKZF1 and IGH-CRLF2 fusion. Mutations in a founder clone were expected to be present in all tumour cells at diagnosis and relapse, which matched the MAF profile of cluster c mutations. Cluster a had comparable MAF as cluster c but was diagnosis specific, suggesting the presence of a predominant clone (clone 1) that had both cluster a and cluster c mutations. The founder clone that was comprised exclusively of cluster c mutation was inferred as the current sequencing depth would not permit differentiating the MAF of cluster a from that of cluster c at diagnosis. The cluster a mutations defined the identity for the predominant clone 1 and they included JAK2 p.Arg683Ser, KRAS p.Gly12Asp and PAX5 p.Glu97fs (MAF 0.455). Clones 2 and 3 were minor clones, accounting for ~1 and 2% of the tumour cells at diagnosis, respectively. The lineage-specific mutations in clone 2 and clone 3 were contained in cluster b (which included the JAK2 p.IleI682_Glu684>GlyGly) and cluster d (which included JAK2 p.Arg683Gly), respectively. All JAK2 mutations at this residue result in constitutive JAK-STAT activation and are likely key driver mutations in ALL¹³. Under the assumption that activating mutations affecting the same pathway are mutually exclusive, we assigned each JAK2 mutation to a distinct subclone. Of these three clones present at diagnosis, only clone 3 survived chemotherapy to become the progenitor for the two subclones (designated clone 4 and clone 5 in Fig. 3b) detected at relapse. Clone 4 was the direct descendant of clone 3 as its lineage-specific mutations in cluster e (including NT5C2 p.Ser408Arg and USH2A p.Arg1549Gln along with 36 other SNVs) had the same MAF as the cluster c and d mutations ‘inherited’ from clone 3. Analysis of coverage and MAF showed that cluster e was likely distinct from cluster d despite the low MAF of cluster d at diagnosis (P=2 × 10⁻⁹, binomial test; Methods). Clone 5 evolved from clone 4 by acquiring additional subclonal mutations in cluster f which include MLL2 p.Trp1491* and PAX5 p.Gly48>ValMetIleIleLysValSer. The projected population sizes of clones 4 and 5 at relapse were 33 and 67%, respectively. As a result, the loss of clone 1 coupled with the expansion of clone 3 resulted in switching of the predominant JAK2 mutation from p.Arg683Ser in diagnosis to p.Arg683Gly in relapse.

PARJZZ is an illustrative case that has two rising clones from diagnosis to relapse, and was the only case exhibiting this pattern in this cohort. Of the five mutation clusters (a–e, Fig. 4a) identified in this case, the ancestral clone (clone 1) had mutations of cluster a and constituted 12.5% of the tumour cells at diagnosis. Clone 2 and clone 3, both of which were descendants of clone 1, contained mutations from cluster a along with their respective lineage-specific mutations in cluster b (represented by KRAS p.Gly12Asp mutation) and c (represented by KRAS p.Leu23Arg). In addition to having distinct sequence mutations in KRAS, these two clones also harboured distinct homozygous deletions of CDKN2A/B (Fig. 4b,c, Supplementary Fig. 6). Clone 2 had homozygous deletion of CDKN2A/B arising from two independent deletions of 1.1 and 0.1 Mb in size. Clone 3 had a 2.2-Mb homozygous deletion of CDKN2A/B caused by a deletion followed by copy-neutral loss-of-heterozygosity at chromosome 9 (hg19 coordinates 20,041,207–22,239,504; Supplementary Fig. 6). Both clone 2 and clone 3 persisted to relapse but with reversal of clonal dominance (Fig. 4d). Clone 2 was the predominant clone at diagnosis and accounted for 75% of the tumour cells, but represented a minor clone at relapse with only 8% of the tumour cells. On the other hand, the two descendants of clone 3, a minor subclone at diagnosis that accounted for 12.5% of the tumour cells, had a combined population frequency of 92% at relapse after acquiring a relapse-specific NT5C2 mutation p.Arg367Gln.

Case PASLZM was one of the four cases with hypermutation at relapse. A total of six mutation clusters were identified (Fig. 5a), four (c–f) of which were relapse specific. Clone 1 was the predominant ancestral clone, while clone 2 was its minor descendant that accounted for 8% of leukaemia cells at diagnosis (Fig. 5c). In addition to TCF3-PBX1 fusion, clone 1 harbours a clonal isochromosome 7q detectable only by manual analysis of SNP array probe intensity data due to low tumour purity of the diagnostic sample (Fig. 5d and Supplementary Data 1). The lineage-specific mutation cluster for clone 2 was cluster b, which included a RAD21 p.Ser56Pro mutation. Clone 2 persisted to relapse and its descendant, clone 3, acquired a WHSC1 SET domain mutation p.Thr1150Ala along with four other mutations in cluster c. Despite the low MAF of cluster b at diagnosis, statistical analyses established that cluster c was clearly distinct from cluster b (P=3 × 10⁻⁷, binomial test; Methods). Clone 4 descended from clone 3 after acquiring an additional 239 mutations in cluster d. This cluster included a somatic splice site mutation of PMS2, which encodes post-meiotic segregation increased 2, a protein involved in mismatch repair that is mutated in hereditary non-polyposis colorectal cancer (Lynch syndrome) and other tumours¹⁵. Notably, while the MAF of the PMS2 mutation was 0.43, we assigned this mutation to cluster d (whose MAF centres on 0.26) rather than cluster c because PMS2 is located at 7p, which is hemizygous at relapse due to clonal isochromosome 7q. Therefore, the PMS2 mutation is subclonal as the high MAF was caused by loss-of-heterozygosity at 7p. Interestingly, it is an exact match to the germline mutation c.2174+1G>A ( http://www.ncbi.nlm.nih.gov/clinvar/RCV000076844) known to be pathogenic in Lynch syndrome¹⁵. Transition events account for 99% of the SNVs in cluster d compared with 77 and 83% of SNVs observed at diagnosis and cluster c, consistent with the biallelic loss of PMS2 in clone 4. Ambiguity in the lineage of clone 5 and clone 6 was resolved by similarity of mutation spectrum (Fig. 5b). Together these observations suggest that the PMS2 mutation is directly responsible for the hypermutator phenotype of this relapse case, and represents a new mechanism of mutation acquisition in relapsed ALL.

Rise and fall of subclones from diagnosis to relapse

Using clonal lineage data for the 20 cases, we evaluated how clonal diversity, origin, mutation burden and population frequency contributed to clonal evolution from diagnosis to relapse (Fig. 6). There was no significant difference (P=0.6; Wilcoxon signed-rank test) between the median number of subclones identified at diagnosis (3; range 1–5) and at relapse (3; range 1–5). In general, the numbers of mutations in the rising clones (median 11, range 2–32) were comparable (P=0.7, Wilcoxon rank-sum test) to the numbers present in the falling clones (median 11, range 1–44) indicating that clonal survival is not dependent on mutation burden. In 15 of 20 cases (75%), the rising clone was a minor subclone at diagnosis (median population frequency 7%, range 2–20%). Relapse-specific mutations present in the founder clone at relapse were observed in the following seven genes: NT5C2 (n=5), USH2A (n=3), WHSC1 (n=2), TP53 (n=2), NRAS (n=3), IKZF1 (n=2) and CREBBP (n=2). They were observed to be relapse specific (NT5C2), enriched at relapse (CREBBP, USH2A, WHSC1, IKZF1, NRAS and TP53) or to have increased mutation burden at relapse (NRAS, WHSC1 and CREBBP). Mutations in these genes were never present in the lineage of falling clones except for NRAS mutations that exist in multiple subclones at diagnosis. Nineteen cases had mutations in at least one of these genes and nine cases had mutations in multiple of these genes. Except for CREBBP and WHSC1, both of which affect epigenetic regulation, the relapse-enriched genes were not mutually exclusive at the case or clonal level as they affect different biological processes. Of the five cases that had predominant clones at diagnosis that persisted to relapse (PAPEFH, PAPNNX, PAPZNK, PARPNM and PASFXA), three had rising clones that constituted ~50% of the tumour cells at diagnosis and the other two (PAPEFH and PASFXA) had major clones being the rising clones. Interestingly, these two cases harboured mutations in genes that would otherwise be considered relapse-specific: PASFXA has a homozygous TP53 p.Arg248Gln mutation and PAPEFH has WHSC1 p.Glu1099Lys in the predominant clone at diagnosis.

Discussion

We employed deep sequencing by WXS and by custom capture of 20 diagnosis–relapse–remission samples of pediatric B-ALL at an average coverage of 200- and 700-fold, respectively, to gain insight into the genetic and clonal architecture of relapsed ALL. At this sequencing depth, our analysis has the power for detecting 92% of the sequence mutations at MAF≥0.03 by WXS. Combining WXS and capture data enabled the identification of 99% of mutations at MAF≥0.01 if the mutations were detected in one tumour but missed in the other for the same case (Methods). Among the 116 subclones identified in this cohort, six from relapsed leukaemias (cases PANTSM, PARBRK, PAPNNX, PAPEFH, PASFXA, PASKAY) and one diagnosis ALL case (PAPZNK) had ambiguous clonal lineages. In each case, a diagnosis- or relapse-specific mutation cluster of the smallest frequency resulted in ambiguity in clonal lineage assignment and none involved genes in the six frequently mutated pathways (Fig. 1). As a result, this ambiguity does not affect clonal population size or mutation turnover analyses that describe the nature of the rising clones that survived therapy. Technology limitation may cause underestimation of clonal diversity due to failure in identifying mutations clusters with MAF below 0.01, coverage bias in WXS, non-coding mutations, low tumour purity (for example, PARPNM relapse and PASLZM diagnosis) or subclonal SVs and CNVs that are below the detection level of SNP array and CGI WGS. Nevertheless, deep sequence coverage coupled with CNV and SV derived from SNP array assay and WGS enabled a comprehensive analysis of tumour clonal lineage from diagnosis to relapse for pediatric B-ALL at the genome-wide scale.

Our study is the first that compares the rising and falling clones to gain insight into the genetic and population characteristics of these differentially behaving subclones. Seventy-five per cent of the relapsed tumours are descendants of minor subclones at diagnosis, suggesting that in most cases the predominant clones were eradicated by therapy. Our study included cases that relapsed within 3 years of initial diagnosis, and the elevated NT5C2 mutation rate in this cohort, 45% compared with the 3–10% by prior studies, is consistent with the signficant association of NT5C2 mutation status with early relapse reported previously^6,7. Of the 14 relapse-specific NT5C2 mutations identified in the nine cases, none were found in hypermutable tumours, indicating that hypermutation may represent an alternative mechanism for relapse. Notably, nine NT5C2 mutations from four cases (PAPJIB, PAPSPG, PAPZNK and PARFTR, Fig. 6) were in subclones descended from the relapse founder clones that had already acquired other relapse-specific mutations. For example, the two NT5C2 relapse-specific mutations (p.Arg367Gln and p.Ser360Pro) in PAPZNK were present in the two subclones evolved from a founder clone that has already gained eight relapse-specific mutations including mutations in two relapse-enriched genes NRAS, USH2A and a frameshift mutation in WT1, a known tumour suppressor gene (Supplementary Fig. 13). More strikingly, the PARFTR relapse specimen had four distinct NT5C2 mutations (p.Arg39Gln, p.Arg238Trp, p.Arg238Gln, and p.Arg367Gln) present in the four minor subclones that were descendants of the rising clone with WHSC1 p.Glu1099Lys mutation. The rising clone was present in 3% of the tumour cells at diagnosis (Supplementary Fig. 18). WHSC1, a gene enriched for relapse-specific mutations, had all four mutations in the SET domain including three with the recurrent p.Glu1099Lys oncogenic mutations known to promote leukaemic cell growth and self-renewal¹⁶. We previously reported ~40% of WHSC1 mutations in ALL to be subclonal¹⁶. Here we extend these findings by demonstrating WHSC1 mutated subclones at diagnosis to emerge as the founder clone at relapse.

The matched non-tumour samples used in this study were remission DNA obtained at day 28 following commencement of therapy, which may contain low levels of minimal residual disease¹⁷. Combining the high-coverage WXS and validation data, we were able to detect the presence of mutant alleles of 83 SNVs shared between diagnosis and relapse samples from 19 cases in these remission samples (Supplementary Data 6), raising the possibility of applying next-generation sequencing for disease monitoring which has been implemented for antigen-receptor genes¹⁸. For example, in remission DNA of PAPSPN, we found that JAK2 p.Arg683Gly, along with six other mutations in the founder clone of relapse, was present at MAFs of 0.001 to 0.005 (Supplementary Data 7). Further studies of patients with B-ALL using sensitive mutation testing at multiple time points during therapy are needed to determine if the persistence of low levels of the mutant clone at end induction might predict risk of relapse, and to examine the relationship between the nature of the therapeutic regimen used and the mutational spectrum.

These findings have major implications for understanding the genomic complexity of ALL. Construction of clonal lineage reveals the chronology of mutation acquisition in six major pathways, four of which were identified by our prior studies¹⁴. We have identified new targets of mutation at relapse, including USH2A, whose role in resistance to therapy is currently unknown and requires future mechanistic interrogation. This work highlights that the predominant clone present at diagnosis is eradicated by chemotherapy in the majority of cases, but with mutational turnover resulting in preservation of disruption in key pathways including Ras, JAK-STAT, lymphoid development and tumour suppressors throughout therapy. Thus, subclonal mutations affecting these key pathways should not be construed as being passenger mutations. In contrast, our findings of multiple subclonal mutations in the same gene or pathway suggest a different paradigm, in which such mutations are acquired in different cells after the earliest initiating event (often an oncogenic translocation or intrachromosomal rearrangement that creates a fusion gene)^19,20 but contribute to the establishment of multiple fully leukemogenic subclones at diagnosis. For example, our clonal lineage analysis shows that nearly all mutations in the Ras signalling pathway at diagnosis were acquired after emergence of founder clone. Additional mutations are acquired that determine fitness in the face of anti-leukaemic therapy and determine which of these mutations are extinguished or survive to seed relapse.

Our findings also have direct implications for the clinical management of ALL. Several mutations confer resistance to specific therapeutic agents, including nucleoside analogues (NT5C2) and glucocorticoids (CREBBP)⁵, thus alternative treatment approaches to avoid mutation selection or mitigate resistance may be needed. The high frequency of mutations in epigenetic regulators provides further support that agents targeting epigenetic modifications represent a new therapeutic opportunity in ALL²¹. Our findings also suggest that there may be clinical utility to assaying for specific mutations that might be a harbinger of relapse during the course of ALL treatment. While this strategy is widely applied using antigen receptor gene rearrangements, those are non-oncogenic events present in both fully leukaemic and ancestral preleukaemic cells. Sensitive mutation detection strategies focused on known oncogenic driver mutations could complement existing methods to detect low levels of minimal residual disease and warrant further investigation.

Methods

Patients and samples

All patients or their parent(s)/guardian(s) provided consent for the collection and banking of diagnostic and remission specimens. The study was approved by the St Jude Children’s Research Hospital Institutional Review Board.

Summary of data analysis

The diagnosis–remission–relapse DNA samples were obtained from 20 patients treated on recent COG B-ALL trials. Exome sequencing, WGS and SNP array data were performed using standard Illumina, Complete Genomics and Affymetrix protocols followed by data analysis, including WXS mapping, coverage and quality assessment, SNV/indel detection, tier annotation for sequence mutations, is described below. Clonal lineage was derived from mutation clusters constructed by a binomial mixture model (see below) that explicitly accounts for read coverage and determines mutation clusters that are diagnosis specific, relapse specific and common. CNV and SV were incorporated into the estimated lineages when appropriate.

For each sample, the MAFs of coding SNVs (including both silent and non-silent) in diploid regions were subjected to an unsupervised clustering analysis by R package mclust. The tumour purity was estimated by twice the highest cluster center value among all cluster centres ≤0.5. Diagnosis and relapse samples were examined separately.

Binomial mixture model

All experimentally verified somatic SNVs located in diploid chromosomal regions were included in this analysis. Diagnosis (D)-specific or relapse (R)-specific SNVs were analysed separately from those that were shared between D and R. D- or R-specific SNVs were represented as (X_i,N_i),i=1,2,…,n, where X_i and N_i denote the counts of mutant allele reads and total reads for i-th SNV, respectively. Here MAF (mutant allele frequency) p_i for i-th SNV is estimated to be and mutant allele count is modelled by binomial distribution X_i~Binom(p_i,N_i). To determine the clonal origin of each SNV, subclones are denoted as C_k, with corresponding population fraction p_k, for k=1,2,…,K and subclonality membership is determined by function Z_ik, which takes value 1 when i-th SNV belongs to subclone C_k and 0 otherwise. The probability distribution of X_i can be rewritten as

where and λ_k is the mixing parameter.

In turn, the joint probability distribution of data {(X_i,N_i)} can be written as

where X={X_i,i=1,2,…,n}, N={N_i,i=1,2,…,n} and Z={Z_ik,i=1,2,…,n; k=1,2,…,K}.

The likelihood of model parameters and given X and Z is defined as

The above likelihood is maximized using expectation maximization algorithm with Akaike information criterion (AIC) AIC=2k−2ln(L), where k=(n+2)−1.

For shared SNVs present in both D and R, the above binomial mixture model was modified to a multiplicative binomial mixture model as

where and λ_k is mixing parameter.

The joint probability distribution of data is written as

The likelihood of model parameters and given X, N and Z is defined as

where X, N and Z are defined similarly as above.

The above likelihood is maximized using an expectation maximization algorithm with the AIC=2k–2ln(L), where k=(n+3)K–1.

Each mutational cluster inferred from the binomial mixture models was labelled by genes involved in a significantly mutated pathway (Fig. 1) or a somatic variant with the highest coverage if there is no mutation in the selected pathway.

Clonal lineage inference

Clonal lineage was inferred from an approach applied in prior studies⁸ with several enhancements. Specifically, clonal lineages were constructed separatly for mutation clusters categorized as D-specific, R-specific and shared SNVs. The cluster of shared SNVs were analysed first and the cluster that had the highest MAF at diagnosis is comprised of genetic lesions of the founder clone, that is, the ancestral clone for all subclones identified at diagnosis and relapse. Starting from the founder clone at diagnosis, we first built lineages for the shared SNV clusters at diagnosis, in descending order of their population size estimated from MAF. Specifically, for each lineage J, denote the set of shared mutational clusters with established clonal lineages as {C_j,j=1,2,…J}, each with a population size {F_j,j=1,2,…J}, J=1 refers to the founder/ancestral clone. For a new shared mutational cluster C_J+1, we determined its parental clone as subclone C_j0 with the highest population size F_j0>F_k, for k≠j0. In turn, the population size for subclone C_j0 is adjusted to F_j0–F_J+1. This process was iterated until all shared mutational clusters at diagnosis were assigned to a parental clone. The D-specific clusters were subsequently added to this lineage tree with the same logic. The clonal lineage is determined similarly for the R-specific clusters. The founder clones at relapse are mapped to subclones at diagnosis based on shared mutational clusters. We allowed multiple progenitor clones in relapse in this mapping as the relapsed tumour could consist of mutations present in multiple subclones at diagnosis. An example can be found in case PARJZZ (Fig. 4).

Clonal lineages were manually reviewed to identify those with potential alternative parental lineages other than the default assigned by the algorithm (that is, the clone with the highest population size). For cases with alternative clonal lineage assignment, we attempted to resolve lineage ambiguity by incorporating biological knowledge. Specifically, a lineage that is consistent with mutual exclusivity of activating mutations in the same gene or the same pathway or a parental clone that exhibits consistent mutation spectrum will be selected preferentially. This was used to resolve clonal ambiguity of PAPSPN, PASLZM, PAPNNX (partial), PAPLDL, PAPZNK, PAPJIB, PARGHW, PARFTR and PARPNM. When biological knowledge is unavailable, unresolved ambiguous lineages are highlighted in the lineage graphs and alternative lineage graphs are provided.

Clonality of SVs and CNVs

SVs and CNVs affecting the genes involved in the six key pathways as well as the known oncogenic fusions were included for clonal lineage construction. Specifically, SVs with breakpoints disrupting the genes of interest and CNVs spanning ≤5 genes were included. Because of the limitation in detecting subclonal SVs (due to read-depth and mapping bias in WGS) and CNVs (due to resolution of SNP array), clonality of SVs/CNVs was assessed qualitatively, that is, whether they are shared, diagnosis specific or relapse specific. Since oncogenic fusions are likely the initiating events of childhood ALL^19,20, SVs/CNVs were considered to be present in the predominant clone (or the ancestral clone) by default with two exceptions: (1) SVs detected by deep sequencing of the trio samples by capture sequencing but not by WGS which indicates presence of subclonal SV at diagnosis (for example, the CREBBP SV identified in case PARBRK); and (2) CNVs classified as relapse specific by automated analysis but were found to be subclonal at diagnosis by manual review of raw probe intensity of SNP array. SVs and CNVs were placed into the clonal lineage constructed from MAF of sequence mutations unless insertion of a new branch was required to account for their presence (for example, PAPJIB).

Whole-exome sequencing

DNA samples were constructed into Illumina paired-end libraries and processed through exome capture according to the manufacturer’s specifications with modifications as described in the BCM-HGSC Illumina Non-Barcoded Paired-End Capture Library Preparation protocol ( https://www.hgsc.bcm.edu/sites/default/files/documents/BCM-HGSC_Illumina_Non-Barcoded_Paired-End_Capture_Library_Preparation.pdf).

‘Precapture libraries’ were prepared using robotic workstations (Biomek NXp and FXp models, Beckman Coulter Genomics). In brief, 1 or 5 μg of DNA was sheared into fragments of ~400 bp in a plate using a E210 system (Covaris Inc., Woburn, MA) followed by end-repair, A-tailing and ligation of non-barcoded PE adaptors (Illumina, San Diego, CA). Precapture Ligation Mediated-PCR (LM-PCR) was performed for seven cycles of amplification using the 2 × SOLiD Library High Fidelity Amplification Mix (a custom product manufactured by Invitrogen Inc.). Purification was performed with Agencourt AMPure XP (Beckman Coulter Inc.) beads after enzymatic reactions. Following the final XP beads purification, quantification and size distribution of the precapture LM-PCR product was determined using the LabChip GX electrophoresis system (PerkinElmer).

For exome capture, precapture libraries (~1 μg) were individually hybridized in solution to the SeqCap EZ Exome 2.0 design (44 Mb, NimbleGen). Human Cot1 DNA and 3′-ddC-modified hybridization enhancing oligonucleotides were added into the hybridization to block repetitive genomic sequences and the adaptor sequences. Post-capture LM-PCR amplification was performed using the 2 × SOLiD Library High Fidelity Amplification Mix with 14 cycles of amplification. After the final AMPure XP bead purification, quantity and size of the capture library was analysed using the Agilent Bioanalyzer 2100 DNA Chip 7500. The efficiency of the capture was evaluated by performing a quantitative PCR-based quality check on the four standard NimbleGen internal controls. Successful enrichment of the capture libraries was estimated to range from a 6 to 9 of ΔCt value over the non-enriched samples.

Library templates were prepared for sequencing using Illumina’s cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401–2001). In brief, these libraries were denatured with sodium hydroxide and diluted to 3–6 pM in hybridization buffer to achieve a load density of ~800 K clusters mm⁻². Each library was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-1001), sequencing-by-synthesis reactions were extended for 101 cycles from each end. Sequencing runs generated ~150–250 million successful reads on each lane of a flow cell, with an average of 20 Gb per sample. With these sequencing yields, samples achieved an average of 90% of the targeted exome bases covered to a depth of 20 × or greater.

Whole-genome sequencing

WGS for 16 cases was performed by Complete Genomics. Details on sequencing, data analysis and coverage are available at ftp://caftpd.nci.nih.gov/pub/dcc_target/ALL/Phase_II/sequence/WGS/CGI_TARGET_Pipeline_README.pdf.

Data analysis

Initial sequence analysis was performed using the HGSC Mercury analysis pipeline ( https://www.hgsc.bcm.edu/software/mercury). First, the primary analysis software on the instrument produced .bcl files that were transferred off-instrument into the HGSC analysis infrastructure by the HiSeq Real-time Analysis module. Once the run was complete and all .bcl files were transferred, Mercury ran Illumina’s primary analysis software (CASAVA), which demultiplexes pooled samples and generates sequence reads and base-call confidence values (qualities). This was followed by mapping of reads to the GRCh37 human reference genome ( http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/) using the Burrows-Wheeler aligner (BWA²², http://bio-bwa.sourceforge.net/) and producing a BAM²³ (binary alignment/map) file. The third step involves quality recalibration (using GATK²⁴, http://www.broadinstitute.org/gatk/), and where necessary the merging of separate sequence-event BAMs into a single sample-level BAM. BAM sorting, duplicate read marking and realignment to improve indel discovery all occur at this step.

A second phase of data analysis was performed at St Jude Children’s Research Hospital for detecting somatic SNVs and insertion/deletion (indels) with an established pipeline^4,5. In brief, the reference human genome assembly GRCh37-lite was used for mapping all samples. The coverage of each tumour was summarized in Supplementary Fig. 1. Putative SNVs/indels were called using Bambino⁶, and potential false positive variants caused by alignment artifacts, poor quality, and paralogs were filtered. SVs were predicted from WXS data using the CREST algorithm⁷. All SNVs, indels and SVs affecting gene coding regions were submitted for capture validation.

Affymetrix SNP 6.0 array data analysis was performed by incorporating optimal reference normalization⁹ and circular binary segmentation¹⁰. The results are shown in Supplementary Data 4. All putative CNVs identified by this pipeline were manually reviewed in dChip and only focal CNVs (≤5 genes) and CNVs involving one of CDKN2A, PAX5 and TCF3 were incorporated in the pathway analysis.

SVs predicted by analysis of complete genomics WGS data were submitted to the SV annotation pipeline FusionBuilder as previously described^4,5. SVs resulting in in-frame gene fusions or with support from CNVs determined by SNP array were submitted for capture validation.

Experimental verification

All sequence mutations (including SNVs and indels) and a subset of SVs selected from WXS and CGI WGS analysis (with selection criteria described in the data analysis section) were subjected to experimental verification. An exception was made for two hypermutators with >300 relapse-specific mutations, PAPNNX and PASFXA: ~100 SNVs were selected for verification for each case. The targeted regions were submitted for probe design to Nimblegen/Roche using Nimbledesign V1.2. Library construction was performed using the NextFlex DNA Sequencing library prep kit (Bioo Scientific, Austin, TX). Targeted enrichment was conducted according to manufacturer’s recommendations (Roche), NGS sequencing using the PE100 protocol with V3 reagents was performed on a HiSeq 2000. The overall verification rate was 98%. The verified variants and high-quality putative variants not subjected for verification are listed in Supplementary Data 2 and were used for clonal evolution analysis. The verification for 24 SNVs was supplemented by amplicon NGS using a MiSeq (PE 150 cycle protocol). The PCR amplification was accomplished using Amplitaq Gold 360 (Life Technologies). The Nextera XT library construction kit (Illumina) was used to generate NGS libraries from the amplicons.

Site-specific screening of low MAF variants

To account for low MAF variants that are below the threshold of our SNV/indel detection pipeline, we employed a site-specific screening of each validated SNV or indel for presence of mutant allele in any of the diagnosis–relapse–remission trio samples with a negative mutation call. NGS reads harbouring the mutant allele based on the alignment stored in bam file were extracted for further analysis to determine the validity of the mutant allele. Each read was realigned using the Smith–Waterman algorithm to the hg19 reference genome including 100-bp flanking sequence around the variant. If the mutant allele was absent after realignment, the read was considered to be wild-type, and the mutant allele a false positive resulting from alignment artefact generated by the mapping algorithm. For a read that survived this alignment check, we performed the following check to filter variant caused by sequencing error or PCR artefact based on the evaluation of verification assay: (a) location of the mutant allele represents the lowest point (‘valley’) of sequence quality in a 5-bp window; or (b) the mutant allele was present only in sequence fragments with overlapping forward and reverse reads. The final mutant allele count included only the NGS reads that passed both the alignment check and variant allele quality check. We required a minimum of three non-duplicate reads to determine the presence or absence of a mutant allele at diagnosis or relapse using the combined data from WXS and capture sequencing.

Power for detecting low-frequency mutations

The mean coverage for WXS and capture sequencing were 187- and 696-fold, respectively. As capture sequencing was performed on DNAs derived from diagnosis, relapse and remission trio, we were able to recover low-frequency mutations at diagnosis missed by WXS. The combined coverage of WXS and capture sequencing is 883-fold, which was used to calculate MAF for clonal evolution analysis.

Using a binomial distribution, the power for detecting mutation at a MAF of 0.01 was 28.8, 97, 99.3%, respectively, for WXS, capture-alone and combined data. The power for discovering mutations at MAF of 0.03 and 0.05 by WXS is 92.13 and 99.6%, respectively.

Clonal origin of mutations in relapsed tumour of cases PAPSPN and PASLZM

The rising subclone (clone 3 in Fig. 3b) had a mean MAF of 0.003 (considering JAK2 p.Arg683Gly and KRAS p.Ala146Thr), raising the possibility that relapse-specific mutations (cluster e in Fig. 3a) representing the founding clone of relapse (clone 4 in Fig. 3b) should be part of clone 3 but were not detected at diagnosis due to low frequency. To address this question, we used the MAF (denoted α=0.003) of clone 3 at diagnosis, the combined diagnosis tumour coverage (denoted N) to calculate the probability of misclassifying each of the 38 SNVs present in cluster e as P(X=0|N, α) using a binomial distribution. A SNV with probability ≤0.05 was unlikely to be misclassified. A compound P value is then calculated as P(X>=x|#SNV, 0.05) using binomial distribution where x is the number of SNVs that are called unlikely to be misclassified and #SNV is the total number of cluster e SNVs, and 0.05 is the cutoff to call an SNV unlikely to be misclassified. This analysis showed that 14 out of the 38 cluster e SNVs had P value<0.05, which resulted in a compound P value 2e−9, suggesting that the mutation cluster e (clone 4) as a whole was unlikely to be the same as mutation cluster d.

Using a similar approach, we evaluated the clonal original of mutation clusters present in the dominant clone in the relapsed tumour of case PASLZM. Clone 2, defined by its lineage-specific mutation cluster b, is the rising subclone that persisted from diagnosis to relapse. The MAF of cluster b is ~0.01 at diagnosis and 0.5 at relapse. Mutation cluster c is comprised of five SNVs and their mutant allele is clonal at relapse but absent at diagnosis. Combining the coverage of WXS and capture data, the minimum coverage for each SNV in cluster c is 520. Under the assumption that cluster c were part of cluster b with a MAF of 0.01 at diagnosis, the binomial P value for detecting 0 mutant allele for each SNV is 0.003, while the compound P value for detecting 0 read for all five SNVs is 3e−7. This result suggests that mutation cluster c is likely to be distinct from cluster b.

Additional information

Accession codes. The sequence data of B-progenitor ALL samples have been deposited in dbGAP ( http://www.ncbi.nlm.nih.gov/gap) under the accession code phs000218.

How to cite this article: Ma, X. et al. Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia. Nat. Commun. 6:6604 doi: 10.1038/ncomms7604 (2015).

References

Pui, C. H., Mullighan, C. G., Evans, W. E. & Relling, M. V. Pediatric acute lymphoblastic leukemia: where are we going and how do we get there? Blood 120, 1165–1174 (2012) .
Article CAS Google Scholar
Place, A. E., Frederick, N. N. & Sallan, S. E. Therapeutic approaches to haematological malignancies in adolescents and young adults. Br. J. Haematol. 164, 3–14 (2014) .
Article CAS Google Scholar
Schrappe, M. et al. Key treatment questions in childhood acute lymphoblastic leukemia: results in 5 consecutive trials performed by the ALL-BFM study group from 1981 to 2000. Klin. Padiatr. 225, (Suppl 1): S62–S72 (2013) .
Article Google Scholar
Vora, A. et al. Treatment reduction for children and young adults with low-risk acute lymphoblastic leukaemia defined by minimal residual disease (UKALL 2003): a randomised controlled trial. Lancet Oncol. 14, 199–209 (2013) .
Article CAS Google Scholar
Mullighan, C. G. et al. CREBBP mutations in relapsed acute lymphoblastic leukaemia. Nature 471, 235–239 (2011) .
Article CAS ADS Google Scholar
Meyer, J. A. et al. Relapse-specific mutations in NT5C2 in childhood acute lymphoblastic leukemia. Nat. Genet. 45, 290–294 (2013) .
Article CAS Google Scholar
Tzoneva, G. et al. Activating mutations in the NT5C2 nucleotidase gene drive chemotherapy resistance in relapsed ALL. Nat. Med. 19, 368–371 (2013) .
Article CAS Google Scholar
Ding, L. et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481, 506–510 (2012) .
Article CAS ADS Google Scholar
Mullighan, C. G. et al. Genomic analysis of the clonal origins of relapsed acute lymphoblastic leukemia. Science 322, 1377–1380 (2008) .
Article CAS ADS Google Scholar
Walter, M. J. et al. Clonal architecture of secondary acute myeloid leukemia. New Engl. J. Med. 366, 1090–1098 (2012) .
Article CAS Google Scholar
Nguyen, K. et al. Factors influencing survival after relapse from acute lymphoblastic leukemia: a Children's Oncology Group study. Leukaemia 22, 2142–2150 (2008) .
Article CAS Google Scholar
Kearney, L. et al. Specific JAK2 mutation (JAK2R683) and multiple gene deletions in Down syndrome acute lymphoblastic leukemia. Blood 113, 646–648 (2009) .
Article CAS Google Scholar
Mullighan, C. G. et al. JAK mutations in high-risk childhood acute lymphoblastic leukemia. Proc. Natl Acad. Sci. USA 106, 9414–9418 (2009) .
Article CAS ADS Google Scholar
Zhang, J. et al. Key pathways are frequently mutated in high-risk childhood acute lymphoblastic leukemia: a report from the Children’s Oncology Group. Blood 118, 3080–3087 (2011) .
Article CAS Google Scholar
Senter, L. et al. The clinical phenotype of Lynch syndrome due to germ-line PMS2 mutations. Gastroenterology 135, 419–428 (2008) .
Article CAS Google Scholar
Jaffe, J. D. et al. Global chromatin profiling reveals NSD2 mutations in pediatric acute lymphoblastic leukemia. Nat. Genet. 45, 1386–1391 (2013) .
Article CAS Google Scholar
Campana, D. Role of minimal residual disease monitoring in adult and pediatric acute lymphoblastic leukemia. Hematol. Oncol. Clin. North Am. 23, 1083–1098 vii (2009) .
Article Google Scholar
Faham, M. et al. Deep-sequencing approach for minimal residual disease detection in acute lymphoblastic leukemia. Blood 120, 5173–5180 (2012) .
Article CAS Google Scholar
Bateman, C. M. et al. Acquisition of genome-wide copy number alterations in monozygotic twins with acute lymphoblastic leukemia. Blood 115, 3553–3558 (2010) .
Article CAS Google Scholar
Greaves, M. F. & Wiemels, J. Origins of chromosome translocations in childhood leukaemia. Nat. Rev. Cancer 3, 639–649 (2003) .
Article CAS Google Scholar
Mar, B. G. et al. Mutations in epigenetic regulators including SETD2 are gained during relapse in paediatric acute lymphoblastic leukaemia. Nat. Commun. 5, 3469 (2014) .
Article Google Scholar
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler Transform. Bioinformatics (2010) .
Li, H. et al. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25, 2078–2079 (2009) .
Article Google Scholar
DePristo, M. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011) .
Article CAS Google Scholar

Download references

Acknowledgements

The Acute Lymphoblastic Leukemia (ALL) data set is part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative, supported by a supplement to NCI Grant U10 CA98543 to the Children’s Oncology Group (COG). This is a collaborative project of the COG, the University of New Mexico Cancer Center, the St Jude’s Children’s Research Hospital Pediatric Cancer Genome Project and the National Cancer Institute. This work was supported in part by the American Lebanese Syrian Associated Charities and National Institutes of Health Grants CA21765 and GM92666 to St Jude Children’s Research Hospital and a St Baldrick’s Foundation Scholar Award (to C.G.M.), and by grants to the Children’s Oncology Group U10 CA98543 (Chair’s grant and supplement to support the COG High-Risk ALL TARGET project), U10 CA98413 (Statistical Center) and U24 CA114766 (Specimen Banking). This project has been funded in part by the Federal ARRA funds from the National Cancer Institute, National Institutes of Health, under Contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views of policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the U.S. Government.

Author information

Authors and Affiliations

Computational Biology, St Jude Children’s Research Hospital, Memphis, 38105, Tennessee, USA
Xiaotu Ma, Michael Edmonson, Michael Rusch, Gang Wu, Panduka Nagahawatte & Jinghui Zhang
Pediatric Cancer Genome Project Validation Lab, St Jude Children’s Research Hospital, Memphis, 38105, Tennessee, USA
Donald Yergeau, John Easton & Bhavin Vadodaria
Cancer Genomics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, 77030, Texas, USA
Donna M. Muzny, Oliver A. Hampton, David A. Wheeler & HarshaVardhan Doddapaneni
Pathology, St Jude Children’s Research Hospital, Memphis, 38105, Tennessee, USA
Guangchun Song, Jing Ma, James R. Downing & Charles G. Mullighan
University of New Mexico Cancer Center, Albuquerque, 87131, New Mexico, USA
Richard C. Harvey, I-Ming Chen & Cheryl L. Willman
Perlmutter Cancer Center, NYU Langone Medical Center, New York, 10016, NY, USA
William L. Carroll
Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital, Columbus, 43205, Ohio, USA
Julie M. Gastier-Foster
Departments of Pathology and Pediatrics, The Ohio State University, Columbus, 43210, Ohio, USA
Julie M. Gastier-Foster
Department of Pharmaceutical Sciences, St Jude Children’s Research Hospital, Memphis, 38105, Tennessee, USA
Mary V. Relling
Cancer Therapy Evaluation Program, National Cancer Institute, Bethesda, 20892, Maryland, USA
Malcolm A. Smith
Department of Biostatistics, Colleges of Medicine, Public Health & Health Professions, University of Florida, Gainesville, 32607, Florida, USA
Meenakshi Devidas
Office of Cancer Genomics, National Cancer Institute, Bethesda, 20892, Maryland, USA
Jaime M. Guidry Auvil & Daniela S. Gerhard
Department of Pediatrics, Benioff Children’s Hospital and the Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, 94143, California, USA
Mignon L. Loh
Division of Oncology and The Center for Childhood Cancer Research, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Stephen P. Hunger

Authors

Xiaotu Ma
View author publications
You can also search for this author in PubMed Google Scholar
Michael Edmonson
View author publications
You can also search for this author in PubMed Google Scholar
Donald Yergeau
View author publications
You can also search for this author in PubMed Google Scholar
Donna M. Muzny
View author publications
You can also search for this author in PubMed Google Scholar
Oliver A. Hampton
View author publications
You can also search for this author in PubMed Google Scholar
Michael Rusch
View author publications
You can also search for this author in PubMed Google Scholar
Guangchun Song
View author publications
You can also search for this author in PubMed Google Scholar
John Easton
View author publications
You can also search for this author in PubMed Google Scholar
Richard C. Harvey
View author publications
You can also search for this author in PubMed Google Scholar
David A. Wheeler
View author publications
You can also search for this author in PubMed Google Scholar
Jing Ma
View author publications
You can also search for this author in PubMed Google Scholar
HarshaVardhan Doddapaneni
View author publications
You can also search for this author in PubMed Google Scholar
Bhavin Vadodaria
View author publications
You can also search for this author in PubMed Google Scholar
Gang Wu
View author publications
You can also search for this author in PubMed Google Scholar
Panduka Nagahawatte
View author publications
You can also search for this author in PubMed Google Scholar
William L. Carroll
View author publications
You can also search for this author in PubMed Google Scholar
I-Ming Chen
View author publications
You can also search for this author in PubMed Google Scholar
Julie M. Gastier-Foster
View author publications
You can also search for this author in PubMed Google Scholar
Mary V. Relling
View author publications
You can also search for this author in PubMed Google Scholar
Malcolm A. Smith
View author publications
You can also search for this author in PubMed Google Scholar
Meenakshi Devidas
View author publications
You can also search for this author in PubMed Google Scholar
Jaime M. Guidry Auvil
View author publications
You can also search for this author in PubMed Google Scholar
James R. Downing
View author publications
You can also search for this author in PubMed Google Scholar
Mignon L. Loh
View author publications
You can also search for this author in PubMed Google Scholar
Cheryl L. Willman
View author publications
You can also search for this author in PubMed Google Scholar
Daniela S. Gerhard
View author publications
You can also search for this author in PubMed Google Scholar
Charles G. Mullighan
View author publications
You can also search for this author in PubMed Google Scholar
Stephen P. Hunger
View author publications
You can also search for this author in PubMed Google Scholar
Jinghui Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

W.L.C., J.R.D., M.L.L., C.L.W., D.S.G., M.V.R., M.A.S., C.G.M., S.P.H. and J.Z. conceived the research. C.G.M., S.P.H. and J.Z. designed the study. X.M., J.Z., M.E., D.Y., D.M.M., O.A.H., M.R., G.S., J.E., R.C.H., D.A.W., J.M., H.V.D., B.V., G.W., P.N., I.-M.C., M.D., J.M.G.A., C.G.M. and S.P.H. performed the experiments, analysed the data or prepared tables and figures. X.M., C.G.M., S.P.H. and J.Z. wrote the manuscript. All authors critically reviewed the manuscript for important intellectual content and approved the final submitted manuscript.

Corresponding authors

Correspondence to Charles G. Mullighan, Stephen P. Hunger or Jinghui Zhang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures 1-23 and Supplementary References. (PDF 3278 kb)

Supplementary Data 1

Clinical information of all 20 trios in this study (XLS 24 kb)

Supplementary Data 2

SNVs and Indels of all 20 trios in this study. (XLS 889 kb)

Supplementary Data 3

Structural Variations in all 20 trios in this study (XLS 34 kb)

Supplementary Data 4

Copy number variations detected by SNP6.0 array. (XLS 104 kb)

Supplementary Data 5

Ten tumors that have multi-subclonal mutations in the same gene. (XLS 24 kb)

Supplementary Data 6

Mutant alleles in remission DNA. (XLS 21 kb)

Supplementary Data 7

Detection of mutant allele in remission DNA in case PAPSPN. (XLS 20 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Ma, X., Edmonson, M., Yergeau, D. et al. Rise and fall of subclones from diagnosis to relapse in pediatric B-acute lymphoblastic leukaemia. Nat Commun 6, 6604 (2015). https://doi.org/10.1038/ncomms7604

Download citation

Received: 04 December 2014
Accepted: 11 February 2015
Published: 19 March 2015
DOI: https://doi.org/10.1038/ncomms7604

This article is cited by

miR-126 identifies a quiescent and chemo-resistant human B-ALL cell subset that correlates with minimal residual disease
- Carolina Caserta
- Silvia Nucera
- Bernhard Gentner
Leukemia (2023)
Disruption to the FOXO-PRDM1 axis resulting from deletions of chromosome 6 in acute lymphoblastic leukaemia
- Paul B. Sinclair
- Ruth E. Cranston
- Christine J. Harrison
Leukemia (2023)
Targetable lesions and proteomes predict therapy sensitivity through disease evolution in pediatric acute lymphoblastic leukemia
- Amanda C. Lorentzian
- Jenna Rever
- Philipp F. Lange
Nature Communications (2023)
Etiology of oncogenic fusions in 5,190 childhood cancers and its clinical and therapeutic implication
- Yanling Liu
- Jonathon Klein
- Xiaotu Ma
Nature Communications (2023)
Convergent evolution and multi-wave clonal invasion in H3 K27-altered diffuse midline gliomas treated with a PDGFR inhibitor
- Sasi Arunachalam
- Karol Szlachta
- Jinghui Zhang
Acta Neuropathologica Communications (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.