Mutational patterns and clonal evolution from diagnosis to relapse in pediatric acute lymphoblastic leukemia

The mechanisms driving clonal heterogeneity and evolution in relapsed pediatric acute lymphoblastic leukemia (ALL) are not fully understood. We performed whole genome sequencing of samples collected at diagnosis, relapse(s) and remission from 29 Nordic patients. Somatic point mutations and large-scale structural variants were called using individually matched remission samples as controls, and allelic expression of the mutations was assessed in ALL cells using RNA-sequencing. We observed an increased burden of somatic mutations at relapse compared to diagnosis, and at second relapse compared to first relapse. In addition to 29 known ALL driver genes, of which nine genes carried recurrent protein-coding mutations in our sample set, we identified putative non-protein coding mutations in regulatory regions of seven additional genes that have not previously been described in ALL. Cluster analysis of hundreds of somatic mutations per sample revealed three distinct evolutionary trajectories during ALL progression from diagnosis to relapse. The evolutionary trajectories provide insight into the mutational mechanisms leading to relapse in ALL and could offer biomarkers for improved risk prediction in individual patients.

Clone 1 (grey) represents the founding clone. The height of the colored fields corresponds to the proportion of the clones in a sample. Section (ii) shows the clones and subclones identified at diagnosis (Di), first relapse (R1) and second relapse (R2). Each color-coded branch shows the known and putative driver genes identified at each time point, with the gene names highlighted in blue for fusion genes and in red for putative regulatory non-coding variants. Section (iii) shows the total number of somatic SNVs present in each clone using the same color code as in sections (i) and (ii).
kb heterozygous deletion (del) spanning the SH2B3 gene, which combined with somatic frameshift (fs) and missense mutations (ms) in SH2B3 cause biallelic loss of function, frameshift mutations in ETV6, and missense mutations in BIRC7. The minor subclone 7 (red) present at diagnosis disappeared at R1. At R1, two novel clones have appeared. Subclone 4 (purple) contains homozygous deletions of CDKN2A, a ~50 kb deletion of exons 3-7 of the IKZF1 gene, and heterozygous deletions of NF1 and PMS2. This clone persists and expands at R2.
Heterozygous deletion of the mismatch repair gene PMS2 on chr7p at R1, which becomes homozygous at R2, likely leads to the hypermutated descendant clone 3 (green).
Sayyab et al.
ETV6-RUNX1 is present in the founding clone 1 (grey). Two subclones present at Di (#10, green and #3, light green) disappear at first relapse, while the main clone 6 (orange) containing a missense mutation in HDAC2 persists at R1. At R1, a missense mutation in NT5C2, a stop-gain mutation in CREBBP and a putative mutation in the intron of GATAD2B appear in clone 4 (purple), but this clone disappears at R2. Subclone 6 with a missense mutation in HDAC2 gives rise to subclone 7 (red) with a missense mutation in PRPS1. Both subclones 6 and 7 expand at second relapse, giving rise to subclone 2 (blue) with a homozygous deletion (del) in BCORL1.
and ZCCHC7 expands at R1 and persists at R2. At R1 the expanded clone 6 forms the major clone, giving rise to three new subclones (#2, blue; #5, pink; #8, green). Of these, subclone 5 expands at R2, while subclones 2 and 8 disappear. Subclone 5 harbors a heterozygous deletion in the CDKN2A gene. At R2, subclones 5 and 6 give rise to a new subclone (#3, light green).
(c) Cosine similarities (vertical axis) between observed mutations and mutations reconstructed from different sets of COSMIC mutational signatures (horizontal axis) for the 67 ALL genomes.
The three de novo mutational signatures identified in our sample set and twelve COSMIC signatures that had a cosine similarity > 0.65 to one of the three de novo signatures reconstruct the observed mutations well (median cosine similarities of 0.95 and 0.94, respectively). The six signatures (SBS1, SBS2, SBS6, SBS13, SBS40 and SBS89) reconstruct the observed mutations with a median cosine similarity of 0.93, which was not significantly different from that of the three de novo signatures (t-test, p = 0.17). Therefore, we used these six signatures in our analyses.
Signature 3
(c) The consensus model tree for the clonal evolution, in which each branch shows the known and putative driver genes identified at diagnosis and at first and second relapse.
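The cosine similarity used in the signature comparison above measures how closely a mutation profile reconstructed from a set of signatures matches the observed profile. The following is a minimal sketch of that calculation, not the pipeline used in the study. The function names, the four-channel toy profiles (real SBS profiles have 96 channels), the two signatures and the exposures are all hypothetical values chosen for illustration.

```python
import math


def cosine_similarity(observed, reconstructed):
    """Cosine similarity between two equal-length mutation count vectors."""
    if len(observed) != len(reconstructed):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(o * r for o, r in zip(observed, reconstructed))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_r = math.sqrt(sum(r * r for r in reconstructed))
    if norm_o == 0 or norm_r == 0:
        raise ValueError("cosine similarity is undefined for an all-zero profile")
    return dot / (norm_o * norm_r)


def reconstruct(signatures, exposures):
    """Reconstructed profile: sum over signatures of exposure * signature."""
    n_channels = len(signatures[0])
    return [
        sum(e * sig[i] for e, sig in zip(exposures, signatures))
        for i in range(n_channels)
    ]


# Hypothetical toy data: 4 mutation channels instead of 96.
sig_a = [0.7, 0.1, 0.1, 0.1]   # hypothetical signature, sums to 1
sig_b = [0.1, 0.1, 0.1, 0.7]   # hypothetical signature, sums to 1
observed = [72, 11, 9, 38]     # hypothetical observed mutation counts
exposures = [100.0, 50.0]      # hypothetical mutations attributed to each signature

recon = reconstruct([sig_a, sig_b], exposures)
similarity = cosine_similarity(observed, recon)
print(f"cosine similarity: {similarity:.3f}")
```

A value close to 1 indicates that the chosen signature set explains the observed profile well; comparing the distribution of such values across samples for different signature sets is one way to arrive at a comparison like the one described above.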

Determination of clusters to exclude
For example, in Supplementary Fig. S4 for ALL_837, cluster 6 on the x-axis in panel a (Original AF) is excluded. The median VAF of the cluster 8 SNVs at Di is almost as high as that of the founding cluster 1. This means that the SNVs in cluster 6 must belong to a subclone having the SNVs in

Manual adjustment of cluster median allele frequency
For example, see Supplementary Fig. S4, panel a (Adjusted AF), in ALL_8. This panel shows data adjusted and modeled by PyClone before any manual cluster adjustment. Looking closely at the adjusted AF in panel R1, the median AF of cluster 2 is higher than the median AF of cluster 3.

This indicates that the clone with cluster 3 SNVs is a descendant of a clone with cluster 2 SNVs. However, because cluster 3 SNVs are present at diagnosis while cluster 2 SNVs are not, this is not possible, and model building will fail. We assume that the median AF estimate of cluster 2 is slightly wrong at R2, and we therefore adjust it downwards by 3% so that it fits as a descendant of cluster 3 and modeling becomes possible.
Figure S6
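The consistency rule behind this manual adjustment can be sketched in code: a descendant cluster should not have a higher median AF than its parent at any time point. The example below is an illustrative sketch only and is not part of the PyClone workflow described here. The function names and all AF values are hypothetical, the violating time point is chosen for illustration, and the 3% adjustment is interpreted as three absolute percentage points, which is an assumption.

```python
def find_violations(median_af, parent, child):
    """Return the time points where the child's median AF exceeds the parent's."""
    return [
        tp for tp in median_af[parent]
        if median_af[child][tp] > median_af[parent][tp]
    ]


def adjust_down(median_af, cluster, time_point, points):
    """Return a copy in which one cluster's median AF is lowered by `points`.

    `points` is an absolute AF difference (assumption); the result is floored at 0.
    """
    adjusted = {c: dict(values) for c, values in median_af.items()}
    adjusted[cluster][time_point] = max(0.0, adjusted[cluster][time_point] - points)
    return adjusted


# Hypothetical median AFs per cluster and time point (not taken from the study).
median_af = {
    "cluster_3": {"Di": 0.20, "R1": 0.40, "R2": 0.42},  # present at diagnosis
    "cluster_2": {"Di": 0.00, "R1": 0.35, "R2": 0.44},  # absent at diagnosis
}

# Cluster 2 is absent at Di, so it can only be a descendant of cluster 3.
before = find_violations(median_af, parent="cluster_3", child="cluster_2")
adjusted = adjust_down(median_af, "cluster_2", "R2", 0.03)
after = find_violations(adjusted, parent="cluster_3", child="cluster_2")

print("violations before adjustment:", before)
print("violations after adjustment:", after)
```

Returning an adjusted copy rather than modifying the input keeps the original estimates available, so each manual adjustment remains traceable.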