The acquisition of molecular drivers in pediatric therapy-related myeloid neoplasms

Schwartz, Jason R.; Ma, Jing; Kamens, Jennifer; Westover, Tamara; Walsh, Michael P.; Brady, Samuel W.; Robert Michael, J.; Chen, Xiaolong; Montefiori, Lindsey; Song, Guangchun; Wu, Gang; Wu, Huiyun; Branstetter, Cristyn; Hiltenbrand, Ryan; Walsh, Michael F.; Nichols, Kim E.; Maciaszek, Jamie L.; Liu, Yanling; Kumar, Priyadarshini; Easton, John; Newman, Scott; Rubnitz, Jeffrey E.; Mullighan, Charles G.; Pounds, Stanley; Zhang, Jinghui; Gruber, Tanja; Ma, Xiaotu; Klco, Jeffery M.

doi:10.1038/s41467-021-21255-8

Article
Open access
Published: 12 February 2021

Nature Communications volume 12, Article number: 985 (2021)

Abstract

Pediatric therapy-related myeloid neoplasms (tMN) occur in children after exposure to cytotoxic therapy and have a dismal prognosis. The somatic and germline genomic alterations that drive these myeloid neoplasms in children and how they arise have yet to be comprehensively described. We use whole exome, whole genome, and/or RNA sequencing to characterize the genomic profile of 84 pediatric tMN cases (tMDS: n = 28, tAML: n = 56). Our data show that Ras/MAPK pathway mutations, alterations in RUNX1 or TP53, and KMT2A rearrangements are frequent somatic drivers, and we identify cases with aberrant MECOM expression secondary to enhancer hijacking. Unlike adults with tMN, we find no evidence of pre-existing minor tMN clones (including those with TP53 mutations), but rather the majority of cases are unrelated clones arising as a consequence of cytotoxic therapy. These studies also uncover rare cases of lineage switch disease rather than true secondary neoplasms.

Introduction

Although the therapeutic regimens for pediatric cancer have improved with a resultant overall decrease in the incidence of tMN in children^1,2,3,4, approximately 0.5–1.0% of children continue to develop tMN after therapy for hematological, solid, and CNS malignancies². Children with tMN have a worse prognosis compared to de novo MDS/AML, with 5-year survival rates of 6–11% if not treated with hematopoietic cell transplant (HCT)^1,2. While much effort has focused on tMN in adults^5,6,7,8,9, a complete understanding of the pathogenesis of tMN in children is lacking despite well-described associations with alkylating agents (e.g., cyclophosphamide), topoisomerase II inhibitors (e.g., the epipodophyllotoxins etoposide and teniposide), radiation therapy, and HCT^{10,11,12,13,14}. Epipodophyllotoxin-associated tMN is strongly associated with KMT2Ar^10,15.

Here, using a comprehensive sequencing approach, we show that Ras/MAPK pathway mutations, alterations in RUNX1 or TP53, and KMT2A rearrangements are frequent somatic drivers in pediatric tMN, and we find that in some cases aberrant MECOM expression is secondary to enhancer hijacking. Additionally, using samples from serial timepoints, we find no evidence of the pre-existing minor tMN clones (including those with TP53 mutations) that have been described in adults with tMN^5,6,7; rather, the majority of cases are unrelated clones arising as a consequence of cytotoxic therapy.

Results

Sequencing of pediatric tMN samples

Eighty-four pediatric tMN cases, including tMDS (n = 28) and tAML (n = 56), were profiled, including both tumor and non-tumor tissue for 62 cases and only non-tumor material for 22 cases (Table 1 & Supplementary Data 1). Initial diagnoses included hematologic (70%), solid (27%), and brain (3%) neoplasms (Fig. 1a). The median age at tMN was 13.6 years (range: 1.2–24.6 yrs) (Supplementary Fig. 1a, b, & Supplementary Data 2), and the time to tMN after initial diagnosis varied widely (median: 2.9 yrs; range: 0.7–16.2 yrs) (Supplementary Fig. 1c–e, & Supplementary Data 3). Somatic variants identified from WGS (median coverage: 50x) or WES (112x) were validated by targeted resequencing (641x) (Supplementary Data 4–8).

Table 1 Sequencing Approach for the Pediatric tMN Cohort.

**Fig. 1: Clinical and genomic features of the pediatric tMN cohort.**

A mean of 28 (range: 1–188) somatic mutations per patient were identified, which is significantly greater than the mutational burden found in pediatric primary MDS (5 mutations/patient, p < 0.001) and pediatric de novo core-binding factor AML (13 mutations/patient, p < 0.001)(Fig. 1b)^16,17. Four patients had mutation burdens greater than 2 standard deviations above the mean, ranging from 115 to 188 mutations/patient (Supplementary Fig. 2a). We detected DNA repair pathway gene (PMS2; n = 2, MSH6; n = 1) alterations in 3 of these hypermutated cases (Supplementary Data 9). In the fourth case (SJ016473), the hypermutation status appears to be driven by variants with variant allele frequency (VAF) < 0.2 (Supplementary Fig. 2b), and the corresponding driver alteration could have escaped detection due to limited depth. Including multiple modes of somatic alterations (SNV, CNV, & fusions), we used the Genomic Random Interval (GRIN) model¹⁸ to identify 91 genes that were significantly altered in this cohort (Supplementary Data 10). The most common altered functional pathways were epigenomic (n = 57 of 62, 92%) and cell signaling (n = 46 of 62, 74%), with mutations in the Ras/MAPK pathway, including KRAS and NF1, and mutations or structural alterations involving RUNX1 and KMT2A being the most frequent (Fig. 1c,d, & Supplementary Data 11).

Putative germline variants in pediatric tMN

Fourteen pathogenic or likely pathogenic presumed germline sequence alterations were identified in 13 of 84 patients (15%, 95% exact binomial CI: 8.5–25.0%) (Table 2 & Supplementary Data 12–14), indicating that germline alterations may be more common in tMN than the published prevalence of 8.5–10% in other groups of children with cancer^19,20,21,22. This includes 4 patients with germline TP53 mutations. There was also evidence of TP53 mosaicism in the non-tumor tissue in 5 additional patients (Fig. 1e & Supplementary Data 15). Collectively, 15 patients (18%) had somatic (mutation and/or copy number alteration) or germline alterations in TP53 (Supplementary Fig. 3). There was a significant enrichment of complex cytogenetics in patients with TP53 alterations (11 of 13) versus wild-type TP53 patients when considering those with comprehensive sequencing (n = 62, 85% vs. 12%; Fisher’s p < 0.0001) (Supplementary Fig. 3e). Three other patients had low VAF somatic truncating mutations in exon 6 of PPM1D (Supplementary Fig. 4)^23,24. Despite the fact that deletions or CN-LOH involving chromosome 7 (del(7)) were the most common copy number alteration (22 of 62, 35%) (Fig. 1f, Supplementary Fig. 5, & Supplementary Data 16), germline mutations in SAMD9, SAMD9L, GATA2, or RUNX1 were not present^16,25,26,27. The comprehensive mutational profile of pediatric tMN is shown in Fig. 2a.
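The exact binomial interval quoted above for the germline prevalence (13 of 84 patients) can be reproduced approximately with a Clopper-Pearson calculation. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not taken from the study's own analysis code.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for a binomial proportion."""

    def solve(f, target: float) -> float:
        # f is monotonically decreasing in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k - 1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# 13 of 84 patients carried a pathogenic or likely pathogenic germline variant.
low, high = clopper_pearson(13, 84)
print(f"prevalence = {13 / 84:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Bisection is used here only to avoid a dependency on a statistics package; any routine that inverts the binomial (or beta) distribution would serve equally well.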

Table 2 Pathogenic and Likely Pathogenic Germline Variants Present in the Pediatric tMN Cohort.

**Fig. 2: Comprehensive mutational spectrum of pediatric tMN.**

Mutational signatures of pediatric tMN

C > T transitions were the predominant mutation type (Fig. 2b, c). Mutational signature analysis on the 16 WGS cases and 3 WES cases with a sufficient quantity of SNVs (>30) identified drug signatures in 9 cases, including 4 with the cisplatin signature (COSMIC 31 & 35), and 5 with the thiopurine signature²⁸, consistent with the prior treatment history (Supplementary Data 17). Eight cases did not have a detectable drug signature but rather clock-like signatures 1, 5, and 40 (Fig. 2d)^29,30, while 2 additional patients had a signature similar to one of unknown etiology recently reported in relapsed mismatch repair (MMR)-deficient ALL³¹ which we term the “relapse MMR” signature. Both had germline (SJ016519) or somatic (SJ016494) pathogenic PMS2 mutations. The relapse MMR signature bore similarities to the thiopurine signature (Supplementary Fig. 6), had similar strand bias to the thiopurine signature²⁸ (Supplementary Fig. 7), and occurred in patients with previous thiopurine exposure, thus suggesting it was a variant of the thiopurine signature that occurs under MMR-deficient conditions. We determined the probability that driver SNVs were caused by each signature as reported previously²⁸ (Fig. 2d, bottom), and found that 2 TP53 mutations were most likely (>50% probability) induced by cisplatin or thiopurines along with several Ras pathway and other variants. Example calculations showing the probability that specific driver mutations were caused by individual signatures are shown in Supplementary Fig. 8. These calculations are based on the signatures present in each sample and their mutation preference at specific trinucleotide contexts; thus, two KRAS G12D mutations in two different patients (SJ030799 and SJ016494) were likely caused by different mutational processes due to the presence of different signatures in the two samples.

Chromosomal rearrangements present in pediatric tMN

Chromosomal rearrangements encoding fusion oncoproteins were identified by RNA-seq in 70% of cases (39 of 56 with available RNA). KMT2A fusions were the most common (n = 28, 50%, GRIN p = 1.86 × 10⁻⁷⁴) (Fig. 3a, Supplementary Data 18–20, & Supplementary Fig. 9) and other in-frame fusions previously reported in myeloid malignancies involving NUP98 (n = 3) and ETV6 (n = 2) were also observed^32,33,34. Likewise, 3 in-frame RUNX1 fusions (RUNX1-MTAP, RUNX1-LYPD5, and RUNX1-MECOM) were identified (Supplementary Figs. 10 & 11). In addition to the RUNX1-MECOM fusion, we noted variable expression levels of MECOM across the cohort (FPKM range: 0.004–38.4), and 24 cases (43%) had an FPKM > 5 (MECOM^high) (Fig. 3b). Elevated MECOM expression has been associated with myeloid neoplasms, particularly tMN and those with KMT2Ar, and is associated with a poor prognosis in both adult and pediatric myeloid neoplasms^{34,35,36,37,38,39}. KMT2Ar was significantly enriched in the MECOM^high cases (KMT2Ar: 18 vs. no KMT2Ar: 6, Fisher’s p < 0.01) (Supplementary Fig. 12) while another MECOM^high patient had a NUP98 fusion (NUP98-HHEX) (Fig. 3b & Supplementary Fig. 10b), a previously reported association with high MECOM expression^40,41,42. WGS on 3 of the 4 remaining MECOM^high cases revealed structural variations (SV) involving the MECOM locus on chromosome 3 (Fig. 3c). Two cases involved noncoding regions of chromosome 2 adjacent to ZFP36L2, a gene encoding an RNA binding protein that is highly expressed in hematopoietic cells and is involved in hematopoiesis, and the other involved noncoding regions of chromosome 17 adjacent to MSI2, another gene encoding an RNA binding protein that has been found to be recurrently rearranged in hematological malignancies (Fig. 3d)^{43,44,45,46,47}.
The existing ENCODE data and similar studies in human CD34 cells support that these regions of the genome are super-enhancers in hematopoietic cells, suggesting a proximity effect in which these enhancers have been hijacked to drive high levels of MECOM expression (Supplementary Fig. 13)^48,49. Furthermore, despite the lack of in-frame fusions in the RNA-seq data, these cases demonstrate allele-specific MECOM expression⁵⁰, further suggesting a cis-regulatory element may be driving this aberrant expression (Fig. 3d). WGS also identified a MECOM SV in SJ030441 (SATB1@-MECOM), but elevated MECOM RNA levels were not present in this case (Fig. 3b); however, immunohistochemical studies on the patient material demonstrated high MECOM protein expression in the blasts (Fig. 3e). Similar MECOM protein expression was detected in the other MECOM altered cases⁵¹, but not in tMN cases without a MECOM SV (Fig. 3e). Contrary to pediatric de novo AML studies, there was not a statistically significant association between higher MECOM expression and disease-related deaths within this pediatric tMN cohort (Supplementary Fig. 14)³⁶. Rather, a multivariable analysis shows that the presence of complex cytogenetics does significantly impact disease-related mortality risk (Fine-Gray model HR = 2.17; p = 0.04).

**Fig. 3: Structural variations and *MECOM* dysregulation in pediatric tMN.**

Clonal evolution of pediatric tMN

Finally, using a combination of targeted capture resequencing and a bioinformatic error suppression approach⁵² we described the timing of acquisition and evolution of the somatic mutations for 37 cases using samples from interval time points prior to the development of tMN, including 26 cases in which material for the primary malignancy was available for analysis (Supplementary Data 21). We demonstrated that the somatic variants most commonly arose after the introduction of cytotoxic therapy (n = 23 of 26, 88%), and we could detect these acquired mutations up to 748 days (mean: 405 days; range: 118–748) prior to morphologic evidence of tMN (Fig. 4a & Supplementary Figs. 15 & 16). Three cases were found to be clonally related to the original malignancy. These included a tMDS that developed 8 months after AML and both were found to harbor a NUP98-NSD1 fusion (Fig. 4b) with multiple discrete WT1^mut subclones, and 2 cases where the initial lymphoid malignancy (ALL or NHL) and tMN developed from a common clone that subsequently underwent a lineage switch (Fig. 4c–f). Unlike adult tMN⁵, the somatic TP53 variants could not be detected with ultra-deep amplicon sequencing (72,000x) and bioinformatic error suppression in pre-treatment samples⁵² (Supplementary Data 22 & Supplementary Fig. 17).

Discussion

Here we present comprehensive sequencing of pediatric tMN, which reveals that KMT2Ar are the most common driver alterations in our cohort, along with Ras/MAPK pathway mutations. Somatic TP53 alterations were also frequent, but these mutations appeared to arise after chemotherapy, unlike adult tMN⁵. Additionally, we identified MECOM overexpression to be frequent, and in some of these cases the overexpression was driven by enhancer hijacking. Finally, we show that pediatric tMN-defining variants arise most commonly as a consequence of cytotoxic therapy, and that these malignant clones can be identified, on average, >1 year before morphologic evidence of neoplasm. While these studies reflect the experience of a single institution, the findings highlight the diverse nature of genomic alterations in pediatric tMN and suggest that genomic screening approaches may be able to identify at-risk patients prior to tMN development.

Methods

Patient sample details

Patient material was obtained with written informed consent using a protocol approved by the St. Jude Children’s Research Hospital Institutional Review Board. All patients with a diagnosis of tMN (either tMDS or tAML) with appropriate consent for genomic studies and available tumor or normal samples banked in the St. Jude Tissue Biorepository were included. Diagnoses were reviewed by a hematopathologist (J.M.K.) and classified according to the WHO 2016 classification of myeloid neoplasms and acute leukemia⁵³. Supplementary Data 1 contains clinicopathological information for all samples included in our analyses. Samples were de-identified before nucleic acid extraction and analysis. The study cohort comprises 84 total patients (tMDS = 28, tAML = 56). Sixty-two patients had available tumor and normal tissue for characterization, while the remaining 22 lacked sufficient tumor material for comprehensive sequencing (Table 1). For the 62 tumor/normal pairs, flow sorted lymphocytes from the diagnostic tMN samples were used as the source of normal comparator genomic DNA in 53 cases, while bone marrow (n = 4) or peripheral blood (n = 5) from alternate timepoints was used for the remainder. Cryopreserved bulk bone marrow cells were thawed in a 37 °C water bath and transferred to 20% FBS in PBS to remove residual DMSO according to standard approaches⁵⁴. Cells were lysed with ACK lysing buffer (ThermoFisher A1049201) and washed with PBS prior to staining. The following antibodies were used to immunophenotype the cells and facilitate flow sorting of myeloid and lymphoid populations: CD15-FITC (eBioscience, clone HI98), CD71-BV711 (BD Biosciences, clone M-A712), CD34-PE (Beckman, clones QBEnd10, Immu133, Immu409), CD45R-PerCP-Cy5.5 (eBioscience, clone RA3-6B2), CD235a-PE-Cy7 (BD Biosciences, clone GA-R2), CD3-APC-Cy7 (BD Biosciences, clone SK7), CD33-APC (eBioscience, clone WM-53). For the 22 normal only cases, bulk sequencing was completed on interval remission samples.

WGS, WES, and RNA-Seq analysis

DNA and RNA material was isolated from bulk myeloid or isolated lymphocytes by standard phenol:chloroform extraction and ethanol precipitation. Whole genome sequencing libraries were constructed using the TruSeq DNA PCR-Free sample preparation kit (Illumina, Inc., CA) following the manufacturer’s instructions and whole-exome sequencing was completed using the Nextera Rapid Capture Expanded Exome reagent (Illumina). After library quality and quantity assessment, WGS, WES, or RNASeq samples were sequenced on various Illumina platforms (HiSeq 2500, HiSeq 4000, or NovaSeq 6000). Mapping, coverage, quality assessment, single-nucleotide variant (SNV) and indel detection, and tier annotation for sequence mutations (SNVs discovered by WGS were classified as tier 1, tier 2, tier 3, or tier 4) have been described previously^55,56,57 and briefly described here. DNA reads were mapped using BWA^58,59 (WGS: v0.7.15-r1140; WES: v0.5.9-r26-dev and v0.7.12-r1039 since data were generated over a period of time) to the GRCh37/hg19 human genome assembly. Aligned files were merged, sorted and de-duplicated using Picard tools 1.65 (broadinstitute.github.io/picard/). SNVs and Indels in WGS and WES were detected using Bambino⁶⁰. For WGS data, sequence variants were classified into the following four tiers: (i) tier 1: coding synonymous, nonsynonymous, splice-site and noncoding RNA variants; (ii) tier 2: conserved variants (conservation score cutoff of greater than or equal to 500, based on either the phastConsElements28way table or the phastConsElements17way table from the UCSC Genome Browser) and variants in regulatory regions annotated by UCSC (regulatory annotations included are targetScanS, ORegAnno, tfbsConsSites, vistaEnhancers, eponine, firstEF, L1 TAF1 Valid, Poly(A), switchDbTss, encodeUViennaRnaz, laminB1 and cpgIslandExt); (iii) tier 3: variants in non-repeat masked regions; and (iv) tier 4: the remaining SNVs. 
Structural variations in whole-genome sequencing data were analyzed using CREST⁶¹ (v1.0). RNA-sequencing was performed using TruSeq Stranded Total RNA library kit (Illumina) and analyzed as previously described^16,17. Briefly, RNA reads were mapped using our StrongARM pipeline (internal pipeline, described by Wu et al.⁶²). Paired-end reads from RNA-seq were aligned to the following four database files using BWA: (i) the human GRCh37-lite reference sequence, (ii) RefSeq, (iii) a sequence file representing all possible combinations of non-sequential pairs in RefSeq exons, and (iv) the AceView database flat file downloaded from UCSC representing transcripts constructed from human ESTs. Additionally, they were mapped to the human GRCh37-lite reference sequence using STAR. The mapping results from databases (ii)–(iv) were aligned to human reference genome coordinates. The final BAM file was constructed by selecting the best of the five alignments. Chimeric fusion detection was carried out using CICERO⁶³ (v0.3.0) and Chimerascan⁶⁴ (v0.4.5). All identified fusions were validated by either RT-PCR, cytogenetics, manual review of CREST data, or a combination of these methods (Supplementary Data 18, 20, & Supplementary Figs. 9 and 18). Mapping statistics and coverage data are described in Supplementary Data 6–8 & 15. Recurrent SNVs identified by WGS or WES were validated by custom capture resequencing (Supplementary Data 2, 3, and 19). Custom capture baits were designed (Twist Biosciences) to be 80 nucleotides long covering the provided hg19 target region consisting of 1,006,633 unique base pairs (bp). A total target region of 904,622 bp is directly covered by 11,455 probes. BWA^58,59 (v0.7.12) MEM algorithm was used to map the TWIST sequencing reads to the GRCh37/hg19 human genome assembly. Rsamtools⁶⁵ (v1.30.0) was used to retrieve read counts from BAM files for the SNV/Indels called in WES, requiring MAPQ ≥ 1 and base quality Phred score ≥ 20.
We also performed de novo mutation calling in an attempt to catch canonical low variant allele frequency (VAF) cancer gene mutations missed by WES using VarScan 2⁶⁶ (v2.3.5) on the TWIST data with the following criteria: MAPQ ≥ 1; base quality Phred score ≥ 20; VAF ≥ 0.01 and variant call p-value ≤ 0.05. Selected somatic variants (WES read count <5 and targeted capture read count <10) and all somatic TP53 variants identified via WES were validated by custom amplicon sequencing. PCR primers (Supplementary Data 22) were designed to flank the putative variants. Amplicon sizes were approximately 200 base pairs. PCR was performed using KAPA HiFi HotStart ReadyMix (Roche), 100 nM of each primer (IDT) and 20 ng of gDNA in a 40 µL reaction volume. Thermocycling was performed using the following parameters: 95 °C for 3 min; 98 °C for 20 s, 62 °C for 15 s, and 72 °C for 15 s for a total of 30 cycles; and 72 °C for 1 min. All amplicons were quality checked on a 2% agarose gel. Primers were designed to incorporate Illumina overhang adapter sequences which allowed for indexing using the Nextera XT Index kit (Illumina) following the manufacturer’s instructions. Libraries were normalized, pooled, and sequenced on an Illumina MiSeq instrument using a 2 × 150 paired-end version 2 sequencing kit. We used the CleanDeepSeq⁵² approach with default settings for error suppression in this ultra-deep amplicon sequencing.
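The four-tier scheme for WGS sequence variants described in this section amounts to a simple ordered decision rule. The sketch below illustrates that ordering only; the `Variant` fields and class labels are hypothetical stand-ins for whatever annotations a real pipeline provides, and this is not the authors' annotation code.

```python
from dataclasses import dataclass

# Hypothetical labels for the variant classes that the text assigns to tier 1.
TIER1_CLASSES = {"synonymous", "nonsynonymous", "splice_site", "noncoding_rna"}


@dataclass
class Variant:
    variant_class: str             # illustrative annotation label
    conservation_score: float      # phastCons element score (0 if none overlaps)
    in_regulatory_region: bool     # overlaps a UCSC regulatory annotation
    in_repeat_masked_region: bool  # falls inside a repeat-masked region


def assign_tier(v: Variant) -> int:
    """Return tier 1-4, checking the tiers in order of priority."""
    if v.variant_class in TIER1_CLASSES:
        return 1
    # Tier 2: conserved (score cutoff >= 500) or in an annotated regulatory region.
    if v.conservation_score >= 500 or v.in_regulatory_region:
        return 2
    # Tier 3: anything left that is outside repeat-masked sequence.
    if not v.in_repeat_masked_region:
        return 3
    return 4


example = Variant("intergenic", 620.0, False, True)
print(assign_tier(example))  # conserved element, so tier 2
```

Because the checks run in priority order, a coding variant inside a conserved element is still reported as tier 1.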

Copy number analysis using NGS data

Copy number analysis of the WGS (n = 4) cases was done using CONSERTING⁶⁷. Copy number analysis of the WES (n = 58) cases was done following these steps: Samtools⁶⁸ (v1.2) mpileup command was used to generate an mpileup file from matched normal and tumor BAM files with duplicates removed; VarScan2⁶⁶ (v2.3.5) was then used to take the mpileup file to call somatic CNAs after adjusting for normal/tumor sample read coverage depth and GC content; Circular Binary Segmentation algorithm⁶⁹ implemented in the DNAcopy R package⁷⁰ was used to identify the candidate CNAs for each sample; B-allele frequency info for all high quality dbSNPs heterozygous in the germline sample was also used to assess allele imbalance.

Germline analysis

Whole exome sequencing data were analyzed using internal workflows that were previously described¹⁹. Briefly, the sequencing data were analyzed for the presence of single-nucleotide variants and small insertions and deletions (Indels) and for evidence of germline mosaicism. Germline copy-number variations and structural variations were identified with the use of the Copy Number Segmentation by Regression Tree in Next Generation Sequencing (CONSERTING)⁶⁷ and Clipping Reveals Structure (CREST)⁶¹ algorithms. For all SNPs and Indels, functional prediction (e.g., SIFT, CADD, and Polyphen) scores and population minor allele frequency (MAF) were annotated. In this work, 3 databases were used for population MAF annotation: (i) NHLBI GO Exome Sequencing Project (http://evs.gs.washington.edu/EVS/); (ii) 1000 genomes (http://www.internationalgenome.org); and (iii) ExAC non-TCGA version (http://exac.broadinstitute.org/). For missense mutations, REVEL (rare exome variant ensemble learner) score was also determined to help predict pathogenicity⁷¹. A list of 631 genes was compiled from various resources: (i) a literature review of genes that are potentially involved in AML, MDS, inherited bone marrow failure syndromes, as well as other cancer types^{5,19,72,73,74}; and (ii) genes involved in splicing, drawn from predefined pathways in KEGG, GeneOntology, Reactome, Gene Set Enrichment Analysis (GSEA), and NCBI (Supplementary Data 14). The following filtering criteria were applied: VAF ≥ 0.2, coverage >20x, ExAC MAF < 0.001 (or not present in ExAC), REVEL score >0.5 (for missense mutations), NHLBI and 1000 genomes MAF < 0.001. One TP53 variant that was lost through this filtering was manually recovered because the patient was clinically diagnosed with Li Fraumeni syndrome. Given this finding, all germline TP53 mutations were manually reviewed and analyzed as described below for mosaicism.
Of note, the germline ETV6 p.N386fs in case SJ021960 was previously reported⁷⁵. All non-synonymous mutations were comprehensively reviewed and classified as pathogenic, likely pathogenic, of uncertain significance, likely benign, or benign based on recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology⁷⁶ by members of the Cancer Predisposition Division at St. Jude (J.L.M and K.E.N).
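The germline filtering criteria listed above can be expressed as a single predicate. The following is a minimal sketch, not the internal workflow itself: the `GermlineCall` record and its field names are hypothetical, and two choices are assumptions that the text does not specify, namely that a variant absent from the NHLBI or 1000 genomes databases passes (as stated for ExAC) and that a missense variant without a REVEL score fails.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.001


@dataclass
class GermlineCall:
    vaf: float                  # variant allele frequency in the normal sample
    coverage: int               # read depth at the site
    exac_maf: Optional[float]   # None when the variant is absent from the database
    esp_maf: Optional[float]    # NHLBI GO Exome Sequencing Project
    kg_maf: Optional[float]     # 1000 genomes
    is_missense: bool
    revel: Optional[float] = None


def is_rare(maf: Optional[float]) -> bool:
    """Absent from the database, or below the population MAF cutoff."""
    return maf is None or maf < MAF_CUTOFF


def passes_filters(call: GermlineCall) -> bool:
    if call.vaf < 0.2 or call.coverage <= 20:
        return False
    if not (is_rare(call.exac_maf) and is_rare(call.esp_maf) and is_rare(call.kg_maf)):
        return False
    # REVEL > 0.5 applies to missense variants only (missing score fails: assumption).
    if call.is_missense and (call.revel is None or call.revel <= 0.5):
        return False
    return True


call = GermlineCall(vaf=0.48, coverage=95, exac_maf=None, esp_maf=0.0002,
                    kg_maf=None, is_missense=True, revel=0.81)
print(passes_filters(call))
```

As the text notes for one TP53 variant, hard filters of this kind can discard clinically relevant calls, so the output is a candidate list for manual review and not a final classification.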

Determination of mosaicism versus tumor-in-normal contamination

Because the normal samples used were hematopoietic specimens (sorted lymphocytes or remission bulk marrow), the mosaic mutations can be a result of incomplete remission. To rule out this possibility, we performed a previously developed statistical analysis that can model residual disease burden¹⁹. Briefly, we first determined purity (denoted as f) of the tMN tumor sample by clustering allele fractions of somatic SNVs/Indels by using R package “Mclust,” where the cluster with the highest center (mean, denoted as u) under 0.5 was used to estimate tumor purity (multiplied by 2 to account for diploid status, f = 2*u). To account for clonal evolution, we also calculated tumor purity by using heterozygous loss and copy neutral loss of heterozygosity (CN-LOH) regions with the highest magnitude of scores. For heterozygous loss regions, the purity is estimated as f = 2 − 2^{(log.ratio + 1)}, while for CN-LOH regions the purity is estimated as f = 2*AI, where AI = |B-allele fraction − 0.5|. The maximum of the SNV/Indel and CNV/LOH-based purity estimate was used as the final purity estimate (f) for a given tumor. We then defined an SNV/Indel as diploid clonal if its allele fraction is > f*0.5*80% = u*80% and <0.6. The sum of mutant allele counts of these markers was denoted as M, and the sum of depth of these markers as T, thus the tumor-in-normal contamination level of the germline sample is then estimated as c = M/T. The expected allele fraction of TP53 mutation is estimated by considering its local ploidy and contamination level c. In our dataset, the TP53 mutations are either 1-copy loss-LOH or CN-LOH (Supplementary Data 1, 4, and 16). For 1-copy loss-LOH, the expected allele fraction of TP53 under contamination is e = c/(2 − c), while for CN-LOH the expected allele fraction of TP53 is simply e = c. We then tested the hypothesis that the observed TP53 allele counts in germline sample are due to contamination by using a binomial test.
A significant p value (<0.01), after Bonferroni correction, would indicate that the observed allele counts are unlikely to be explained by contamination. To rule out the possibility of germline inheritance, we also tested the allele counts against inheritance (i.e., e = 0.5). A TP53 mutation with significant p values (<0.01) for both the contamination test and the inheritance test is called a mosaic mutation. For normal only samples, variants with a VAF of ≥0.2 were classified as germline, but variants with a VAF of <0.2 and with a supportive clinical history were classified as mosaic. We are unable to distinguish germline versus somatic mosaicism.
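The purity, contamination, and expected-allele-fraction formulas in this section, followed by the two binomial tests, can be sketched as below. This is an illustration under stated assumptions and not the published implementation: the function names are hypothetical, the binomial test is taken to be a two-sided exact test (the text does not state the sidedness), and Bonferroni correction is applied by multiplying each p-value by a caller-supplied number of tests.

```python
from math import exp, lgamma, log


def purity_from_het_loss(log_ratio: float) -> float:
    """f = 2 - 2^(log.ratio + 1) for a heterozygous-loss region."""
    return 2 - 2 ** (log_ratio + 1)


def purity_from_cn_loh(b_allele_fraction: float) -> float:
    """f = 2 * AI, with AI = |B-allele fraction - 0.5|."""
    return 2 * abs(b_allele_fraction - 0.5)


def contamination(mutant_counts: list[int], depths: list[int]) -> float:
    """c = M / T over the diploid clonal marker variants, measured in the normal."""
    return sum(mutant_counts) / sum(depths)


def expected_af(c: float, state: str) -> float:
    """Expected allele fraction in the normal sample if contamination explains it."""
    if state == "one_copy_loss":
        return c / (2 - c)
    if state == "cn_loh":
        return c
    raise ValueError(f"unknown copy-number state: {state}")


def binom_pmf(k: int, n: int, p: float) -> float:
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def binom_test(k: int, n: int, p: float) -> float:
    """Two-sided exact test: total probability of outcomes no likelier than k."""
    observed = binom_pmf(k, n, p)
    tail = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1))
               if q <= observed * (1 + 1e-7))
    return min(1.0, tail)


def is_mosaic(alt: int, depth: int, c: float, state: str,
              n_tests: int = 1, alpha: float = 0.01) -> bool:
    """Mosaic if both contamination and inheritance (e = 0.5) are rejected."""
    p_contam = min(1.0, n_tests * binom_test(alt, depth, expected_af(c, state)))
    p_inherit = min(1.0, n_tests * binom_test(alt, depth, 0.5))
    return p_contam < alpha and p_inherit < alpha


# Hypothetical example: 30 of 100 reads support the variant; contamination c = 0.05.
print(is_mosaic(30, 100, 0.05, "cn_loh"))
```

In the hypothetical example, an allele fraction of 0.30 is far above what 5% contamination would produce and far below the 0.5 expected for an inherited heterozygous variant, so both hypotheses are rejected.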

Mutational signature analysis

The trinucleotide context of each somatic SNV was identified using an in-house script, and mutations were assigned to one of each of the 96 trinucleotide mutation types⁷⁷. To detect whether any novel signatures were present in the dataset, we ran SigProfiler version 2.3.1⁷⁸ on the SNV catalogs from the 16 WGS samples and extracted 3 signatures. One of the extracted signatures resembled the cisplatin signature (SBS-31); one represented a combination of clock-like signatures 1 and 5 (SBS-1, SBS-5)⁷⁷, and the third resembled a signature recently reported in relapsed ALL of unknown cause which was only present in patients with germline or somatic PMS2 alterations. This third signature (termed the “relapse MMR” signature) was also similar to the thiopurine signature we recently reported²⁸, with similar strand bias, and is potentially therefore a modified thiopurine signature in samples with MMR defects. We tested for the presence of the 60+ COSMIC v3 signatures in each WGS sample using SigProfilerSingleSample (version 1.3) and the COSMIC v3 signature definitions provided with that version of the software. From this analysis, signatures never exceeding 150 mutations in any one sample were identified and excluded from our final analysis in order to avoid likely spurious signatures. Based on these data, our finalized WGS signature data were obtained by testing for the presence of only the following signatures in each sample using SigProfilerSingleSample: COSMIC signatures 1, 5, and 40 (clock-like), COSMIC signature 26 (MMR deficiency), COSMIC signatures 31 and 35 (cisplatin), the experimental thiopurine signature we recently reported, generated by treating MCF10A cells with thioguanine²⁸, and the relapse MMR signature. We used a required cosine increase of 0.02 or more for a signature to be detected in a single sample, and default parameters otherwise. 
For exome samples, we likewise tested for these signatures using SigProfilerSingleSample, but excluded from our analysis exome samples that had cosine reconstruction scores of less than 0.9 (comparing the sample’s SNV catalog profile with the profile as reconstructed by signatures) or fewer than 30 SNVs total, or which already had WGS data, resulting in only 3 exome samples with usable signature data. We calculated the probability that individual SNVs were caused by a signature as done by others⁷⁹ and as we reported previously²⁸. The probability that a variant was caused by a specific signature was calculated as follows. Let s_k represent the signature strength vector for a given sample (measured in number of SNVs caused by the signature), where k = 1, 2, …, 8 is one of 8 signatures we identified, such that s₁ equals the number of specific SNVs caused by signature 1 in the sample, and ∑s_k equals the total number of SNVs in the sample. Let c = 1, 2, …, 96 represent each of the 96 possible trinucleotide mutation types. Each of the k signatures mutates each of these 96 trinucleotide mutation types c with a probability P_{c,k} (ranging from 0 to 1.0) where the sum of the probabilities for a given signature across all 96 trinucleotide mutation types is 1.0. The probability that a mutation of interest m (at trinucleotide mutation type c) was caused by a specific signature i is calculated as shown in Eq. 1:

$$P\left( i \mid m \right) = \frac{s_i \, P_{c,i}}{\sum\nolimits_{k = 1}^{8} \left( s_k \, P_{c,k} \right)}$$

(1)
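Eq. 1 is a normalization of each signature's strength in the sample, weighted by how often that signature produces the trinucleotide mutation type in question. The following minimal sketch shows the arithmetic; the function name and the numbers in the example are hypothetical and are not values from the cohort.

```python
def signature_posterior(strengths: dict[str, float],
                        context_probs: dict[str, float]) -> dict[str, float]:
    """Probability that a mutation at trinucleotide type c arose from each signature.

    strengths:     signature name -> s_k, the number of SNVs attributed to it in the sample
    context_probs: signature name -> P_{c,k}, that signature's probability at type c
    """
    weights = {k: strengths[k] * context_probs[k] for k in strengths}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no signature in this sample can produce this mutation type")
    return {k: w / total for k, w in weights.items()}


# Hypothetical sample: 100 clock-like SNVs and 300 cisplatin SNVs. At the
# mutation's trinucleotide type the clock-like signature has probability 0.03
# and the cisplatin signature 0.01, so both contribute equally (3 vs. 3).
posterior = signature_posterior(
    {"clock_like": 100, "cisplatin": 300},
    {"clock_like": 0.03, "cisplatin": 0.01},
)
print(posterior)
```

The example also shows why the same driver mutation can be attributed to different processes in different patients: the result depends on the signature strengths in each sample, not only on the mutation's trinucleotide context.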

GRIN analysis

The genomic random interval (GRIN) method¹⁸ was used to evaluate the statistical significance for the prevalence of SNVs, heterozygous deletions, fusion breakpoints, copy-neutral loss-of-heterozygosity, and amplification in each gene. For each gene, a p-value for each of these genomic alterations was computed. Also, for each gene, an overall p-value was computed by finding the minimum p-value across the five lesion types and comparing it to the beta distribution corresponding to the distribution of the minimum of five independent and identically distributed (i.i.d.) uniform(0,1) realizations. For each set of p-values (one for each lesion type and the overall p-value), a robust method⁸⁰ was used to compute false discovery rate estimates, which are reported with the symbol q. A total of 91 genes were identified as statistically significant with an overall q < 0.05. Additionally, MutSigCV⁸¹ analysis was used to determine driver status of SNVs and indels.
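The overall p-value step has a closed form: the minimum of n independent uniform(0,1) variables follows a Beta(1, n) distribution, whose cumulative distribution function is 1 − (1 − x)^n. The sketch below covers only that step and is not the GRIN software; the function name and example p-values are hypothetical.

```python
def overall_p_value(lesion_p_values: list[float]) -> float:
    """Combine per-lesion p-values for one gene via the minimum-p rule.

    Under the null hypothesis, the minimum of n i.i.d. uniform(0, 1) p-values
    is Beta(1, n) distributed, so P(min <= x) = 1 - (1 - x)^n.
    """
    n = len(lesion_p_values)
    p_min = min(lesion_p_values)
    return 1.0 - (1.0 - p_min) ** n


# Hypothetical gene: one lesion type is nominally significant, four are not.
print(overall_p_value([0.01, 0.5, 0.5, 0.5, 0.5]))
```

With five lesion types, a minimum p-value of 0.01 corresponds to an overall p-value of about 0.049, which is close to the Bonferroni value of 0.05 but slightly less conservative.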

Super enhancer analysis in CD34⁺ cells

H3K27ac ChIP-seq data were downloaded from GEO accession GSE104579⁸². Raw reads were adapter-trimmed and subjected to quality filtering using Trim Galore (v0.4.4), retaining reads with a quality score >20. Reads were mapped to the human genome (GRCh37) using BWA (v0.7.12)⁵⁸ and converted to bam format; duplicate reads were marked using biobambam2 (v2.0.87)⁸³ and removed using samtools (v1.10)⁶⁸. H3K27ac peaks were called using macs2 (v2.1.1)⁸⁴ in BEDPE mode with a p-value cutoff of 1 × 10⁻⁵. ROSE was run using the de-duplicated H3K27ac and input bam files and the macs2 peak file with default parameters. For additional visualization of the chromatin landscape in human CD34⁺ cells, three additional datasets were included in IGV snapshots. The CTCF bigwig file was downloaded from GEO accession GSE104579. The “CD34⁺ H3K27ac (Roadmap)” wiggle file was downloaded from GEO accession GSM772885⁸⁵ and converted to bigwig. CD34⁺ ATAC-seq data were downloaded from GEO accession GSE74912⁸⁶, and all biological replicates for CD34⁺ samples were merged into a single bedGraph file and converted to bigwig format for visualization. All RNA-seq tracks are normalized read coverage.

Statistical methods

The two-tailed Wilcoxon–Mann–Whitney non-parametric test was used to compare quantitative variables between two experimental or diagnostic groups. Fisher’s exact test was used to compare the frequency of complex karyotype between patients with and without TP53 mutations. Survival analysis of cause-specific death was performed with a Fine-Gray model⁸⁷ that accounts for different causes of death as competing events and adjusts for hematopoietic stem cell transplant as a time-dependent predictor variable.
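
For readers unfamiliar with Fisher’s exact test, a minimal pure-Python sketch of the two-sided test on a 2 × 2 table is given below. The function name `fisher_exact_two_sided` and the counts in the example are hypothetical and are not data from this study. The two-sided p-value is taken here as the summed probability of all tables, with the observed margins held fixed, that are no more likely than the observed table, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows could be, for example, TP53-mutated vs TP53 wild-type patients and
    columns complex vs non-complex karyotype.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c

    def ways(x: int) -> int:
        # Number of tables with x in the top-left cell (hypergeometric numerator)
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = ways(a)
    # Sum over tables that are as likely or less likely than the observed one
    extreme = sum(ways(x) for x in range(lo, hi + 1) if ways(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Hypothetical counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Comparing the integer numerators rather than floating-point probabilities avoids rounding problems when deciding which tables are at least as extreme as the observed one.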

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The genomic data generated in this study have been deposited in the European Genome-Phenome Archive (EGA), which is hosted by the European Bioinformatics Institute (EBI), under accession EGAS00001004850 and are also available through St. Jude Cloud [https://pecan.stjude.cloud/permalink/tMN]. All remaining data are available within the article and supplementary files or from the authors upon request. Other publicly available datasets used for CD34⁺ cell super-enhancer analysis are deposited in Gene Expression Omnibus (GEO): H3K27ac and CTCF ChIP-seq data are available under accession number GSE104579, CD34⁺ H3K27ac Roadmap ChIP-seq data are available under accession number GSM772885, and CD34⁺ ATAC-seq data are available under accession number GSE74912.

References

Tsurusawa, M. et al. Therapy-related myelodysplastic syndrome in childhood: a retrospective study of 36 patients in Japan. Leuk. Res. 29, 625–632 (2005).
Brown, C. A., Youlden, D. R., Aitken, J. F. & Moore, A. S. Therapy-related acute myeloid leukemia following treatment for cancer in childhood: a population-based registry study. Pediatr. Blood Cancer 65, e27410 (2018).
Imamura, T. et al. Nationwide survey of therapy-related leukemia in childhood in Japan. Int. J. Hematol. 108, 91–97 (2018).
Aguilera, D. G. et al. Pediatric therapy-related myelodysplastic syndrome/acute myeloid leukemia: the MD Anderson Cancer Center experience. J. Pediatr. Hematol. Oncol. 31, 803–811 (2009).
Wong, T. N. et al. Role of TP53 mutations in the origin and evolution of therapy-related acute myeloid leukaemia. Nature 518, 552–555 (2015).
Berger, G. et al. Early detection and evolution of preleukemic clones in therapy-related myeloid neoplasms following autologous SCT. Blood 131, 1846–1857 (2018).
Gibson, C. J. et al. Clonal hematopoiesis associated with adverse outcomes after autologous stem-cell transplantation for lymphoma. J. Clin. Oncol. 35, 1598–1605 (2017).
Renneville, A. et al. Genetic analysis of therapy-related myeloid neoplasms occurring after intensive treatment for acute promyelocytic leukemia. Leukemia 32, 2066–2069 (2018).
Ganser, A. & Heuser, M. Therapy-related myeloid neoplasms. Curr. Opin. Hematol. 24, 152–158 (2017).
Barnard, D. R. & Woods, W. G. Treatment-related myelodysplastic syndrome/acute myeloid leukemia in survivors of childhood cancer–an update. Leuk. Lymphoma 46, 651–663 (2005).
Pui, C. H. et al. Epipodophyllotoxin-related acute myeloid leukemia: a study of 35 cases. Leukemia 9, 1990–1996 (1995).
Pui, C. H. et al. Acute myeloid leukemia in children treated with epipodophyllotoxins for acute lymphoblastic leukemia. N. Engl. J. Med. 325, 1682–1687 (1991).
Winick, N. J. et al. Secondary acute myeloid leukemia in children with acute lymphoblastic leukemia treated with etoposide. J. Clin. Oncol. 11, 209–217 (1993).
Rodriguez-Galindo, C. et al. Hematologic abnormalities and acute myeloid leukemia in children and adolescents administered intensified chemotherapy for the Ewing sarcoma family of tumors. J. Pediatr. Hematol. Oncol. 22, 321–329 (2000).
Blanco, J. G. et al. Molecular emergence of acute myeloid leukemia during treatment for acute lymphoblastic leukemia. Proc. Natl Acad. Sci. USA 98, 10338–10343 (2001).
Schwartz, J. R. et al. The genomic landscape of pediatric myelodysplastic syndromes. Nat. Commun. 8, 1557 (2017).
Faber, Z. J. et al. The genomic landscape of core-binding factor acute myeloid leukemias. Nat. Genet. 48, 1551–1556 (2016).
Pounds, S. et al. A genomic random interval model for statistical analysis of genomic lesion data. Bioinformatics 29, 2088–2095 (2013).
Zhang, J. et al. Germline mutations in predisposition genes in pediatric cancer. N. Engl. J. Med. 373, 2336–2346 (2015).
Parsons, D. W. et al. Diagnostic yield of clinical tumor and germline whole-exome sequencing for children with solid tumors. JAMA Oncol. 2, 616–624 (2016).
Ripperger, T. et al. Childhood cancer predisposition syndromes-A concise review and recommendations by the Cancer Predisposition Working Group of the Society for Pediatric Oncology and Hematology. Am. J. Med. Genet. A 173, 1017–1037 (2017).
Mody, R. J. et al. Integrative clinical sequencing in the management of refractory or relapsed cancer in youth. JAMA 314, 913–925 (2015).
Hsu, J. I. et al. PPM1D mutations drive clonal hematopoiesis in response to cytotoxic chemotherapy. Cell Stem Cell 23, 700–713 e6 (2018).
Kahn, J. D. et al. PPM1D-truncating mutations confer resistance to chemotherapy and sensitivity to PPM1D inhibition in hematopoietic cells. Blood 132, 1095–1105 (2018).
Schwartz, J. R. et al. Germline SAMD9 mutation in siblings with monosomy 7 and myelodysplastic syndrome. Leukemia 31, 1827–1830 (2017).
Wong, J. C. et al. Germline SAMD9 and SAMD9L mutations are associated with extensive genetic evolution and diverse hematologic outcomes. JCI Insight 3, e121086 https://doi.org/10.1172/jci.insight.121086 (2018).
Wlodarski, M. W. et al. Prevalence, clinical characteristics, and prognosis of GATA2-related myelodysplastic syndromes in children and adolescents. Blood 127, 1387–1397 (2016). quiz 1518.
Li, B. et al. Therapy-induced mutations drive the genomic landscape of relapsed acute lymphoblastic leukemia. Blood 135, 41–55 (2020).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Alexandrov, L. B. et al. Clock-like mutational processes in human somatic cells. Nat. Genet. 47, 1402–1407 (2015).
Waanders, E. et al. Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia. Blood Cancer Discov. 1, 96–111 (2020).
Gough, S. M., Slape, C. I. & Aplan, P. D. NUP98 gene fusions and hematopoietic malignancies: common themes and new biologic insights. Blood 118, 6247–6257 (2011).
Stengel, A. et al. Detection of recurrent and of novel fusion transcripts in myeloid malignancies by targeted RNA sequencing. Leukemia 32, 1229–1238 (2018).
Rubin, C. M. et al. t(3;21)(q26;q22): a recurring chromosomal abnormality in therapy-related myelodysplastic syndrome and acute myeloid leukemia. Blood 76, 2594–2598 (1990).
Hinai, A. A. & Valk, P. J. Review: aberrant EVI1 expression in acute myeloid leukaemia. Br. J. Haematol. 172, 870–878 (2016).
Ho, P. A. et al. High EVI1 expression is associated with MLL rearrangements and predicts decreased survival in paediatric acute myeloid leukaemia: a report from the children’s oncology group. Br. J. Haematol. 162, 670–677 (2013).
Balgobind, B. V. et al. EVI1 overexpression in distinct subtypes of pediatric acute myeloid leukemia. Leukemia 24, 942–949 (2010).
Li, S. et al. Myelodysplastic syndrome/acute myeloid leukemia with t(3;21)(q26.2;q22) is commonly a therapy-related disease associated with poor outcome. Am. J. Clin. Pathol. 138, 146–152 (2012).
Ottema, S. et al. Atypical 3q26/MECOM rearrangements genocopy inv(3)/t(3;3) in acute myeloid leukemia. Blood 136, 224–234 (2020).
Eguchi-Ishimae, M., Eguchi, M., Ohyashiki, K., Yamagata, T. & Mitani, K. Enhanced expression of the EVI1 gene in NUP98/HOXA-expressing leukemia cells. Int. J. Hematol. 89, 253–256 (2009).
Burillo-Sanz, S. et al. NUP98-HOXA9 bearing therapy-related myeloid neoplasm involves myeloid-committed cell and induces HOXA5, EVI1, FLT3, and MEIS1 expression. Int. J. Lab. Hematol. 38, 64–71 (2016).
Takeda, A., Goolsby, C. & Yaseen, N. R. NUP98-HOXA9 induces long-term proliferation and blocks differentiation of primary human CD34+ hematopoietic cells. Cancer Res. 66, 6628–6637 (2006).
Stumpo, D. J. et al. Targeted disruption of Zfp36l2, encoding a CCCH tandem zinc finger RNA-binding protein, results in defective hematopoiesis. Blood 114, 2401–2410 (2009).
Barbouti, A. et al. A novel gene, MSI2, encoding a putative RNA-binding protein is recurrently rearranged at disease progression of chronic myeloid leukemia and forms a fusion gene with HOXA9 as a result of the cryptic t(7;17)(p15;q23). Cancer Res. 63, 1202–1206 (2003).
Saleki, R. et al. A novel TTC40-MSI2 fusion in de novo acute myeloid leukemia with an unbalanced 10;17 translocation. Leuk. Lymphoma 56, 1137–1139 (2015).
Aly, R. M. & Ghazy, H. F. Prognostic significance of MSI2 predicts unfavorable outcome in adult B-acute lymphoblastic leukemia. Int J. Lab. Hematol. 37, 272–278 (2015).
Duggimpudi, S. et al. Transcriptome-wide analysis uncovers the targets of the RNA-binding protein MSI2 and effects of MSI2’s RNA-binding activity on IL-6 signaling. J. Biol. Chem. 293, 15359–15369 (2018).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Davis, C. A. et al. The encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 46, D794–D801 (2018).
Liu, Y. et al. Discovery of regulatory noncoding variants in individual cancer genomes by using cis-X. Nat. Genet. 52, 811–818 (2020).
Lewen, M. et al. Pediatric chronic myeloid leukemia with inv(3)(q21q26.2) and T lymphoblastic transformation: a case report. Biomark. Res. 4, 14 (2016).
Ma, X. et al. Analysis of error profiles in deep next-generation sequencing data. Genome Biol. 20, 50 (2019).
Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
Klco, J. M. et al. Genomic impact of transient low-dose decitabine treatment on primary AML cells. Blood 121, 1633–1643 (2013).
Zhang, J. et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 481, 157–163 (2012).
Zhang, J. et al. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 481, 329–334 (2012).
Rusch, M. et al. Clinical cancer genomic profiling by three-platform sequencing of whole genome, whole exome and transcriptome. Nat. Commun. 9, 3962 (2018).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Edmonson, M. N. et al. Bambino: a variant detector and alignment viewer for next-generation sequencing data in the SAM/BAM format. Bioinformatics 27, 865–866 (2011).
Wang, J. et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat. Methods 8, 652–654 (2011).
Wu, G. et al. The genomic landscape of diffuse intrinsic pontine glioma and pediatric non-brainstem high-grade glioma. Nat. Genet. 46, 444–450 (2014).
Tian, L. et al. CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data. Genome Biol. 21, 126 (2020).
Iyer, M. K., Chinnaiyan, A. M. & Maher, C. A. ChimeraScan: a tool for identifying chimeric transcription in sequencing data. Bioinformatics 27, 2903–2904 (2011).
Morgan, M., Pagès, H., Obenchain, V. & Hayden, N. Rsamtools: binary alignment (BAM), FASTA, variant call (BCF), and tabix file import. R package version 1.30.0 edn (2020).
Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Chen, X. et al. CONSERTING: integrating copy-number analysis with structural-variation detection. Nat. Methods 12, 527–530 (2015).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Olshen, A. B., Venkatraman, E. S., Lucito, R. & Wigler, M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557–572 (2004).
Seshan, V. E. & Olshen, A. DNAcopy: DNA copy number data analysis. R package version 1.52.0 edn (2017).
Ioannidis, N. M. et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877–885 (2016).
Zhang, M. Y. et al. Genomic analysis of bone marrow failure and myelodysplastic syndromes reveals phenotypic and diagnostic complexity. Haematologica 100, 42–48 (2015).
Keel, S. B. et al. Genetic features of myelodysplastic syndrome and aplastic anemia in pediatric and young adult patients. Haematologica 101, 1343–1350 (2016).
Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N. Engl. J. Med. 368, 2059–2074 (2013).
Topka, S. et al. Germline ETV6 mutations confer susceptibility to acute lymphoblastic leukemia and thrombocytopenia. PLoS Genet. 11, e1005262 (2015).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Campbell, P. J. & Stratton, M. R. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 3, 246–259 (2013).
Morganella, S. et al. The topography of mutational processes in breast cancer genomes. Nat. Commun. 7, 11383 (2016).
Pounds, S. & Cheng, C. Robust estimation of the false discovery rate. Bioinformatics 22, 1979–1987 (2006).
Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).
Zhang, X. et al. Large DNA methylation nadirs anchor chromatin loops maintaining hematopoietic stem cell identity. Mol. Cell 78, 506–521 e6 (2020).
Tischler, G. & Leonard, S. biobambam: tools for read pair collation based algorithms on BAM files. Source Code Biol. Med. 9, 13 (2014).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).
Fine, J. P. & Gray, R. J. A proportional hazards model for the subdistribution of a competing risk. J. Am. Stat. Assoc. 94, 496–509 (1999).

Acknowledgements

We thank all the patients and their families at St. Jude Children’s Research Hospital (SJCRH) for their contribution of biological specimens used in this study. We also thank the Biorepository, the Flow Cytometry and Cell Sorting Core, and the Hartwell Center for Bioinformatics and Biotechnology at SJCRH for their essential services. Julie Justice in the Anatomic Pathology lab established the immunohistochemistry for MECOM. J.R.S. is supported by the NHLBI (1K08HL150282-01) and Alex’s Lemonade Stand Foundation Young Investigator Award. This work was funded by the American Lebanese and Syrian Associated Charities of St. Jude Children’s Research Hospital and grants from the US National Institutes of Health (P30 CA021765, Cancer Center Support Grant; R01 HL144653 to J.M.K.). J.M.K. holds a Career Award for Medical Scientists from the Burroughs Wellcome Fund. Support was also provided by the Edward P. Evans Foundation (J.M.K). This research content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

These authors contributed equally: Jason R. Schwartz, Jing Ma, Jennifer Kamens.

Authors and Affiliations

Vanderbilt University Medical Center, Department of Pediatrics, Nashville, TN, US
Jason R. Schwartz
St. Jude Children’s Research Hospital, Department of Pathology, Memphis, TN, US
Jing Ma, Tamara Westover, Michael P. Walsh, Lindsey Montefiori, Guangchun Song, Ryan Hiltenbrand, Priyadarshini Kumar, Charles G. Mullighan & Jeffery M. Klco
Stanford University School of Medicine, Department of Pediatrics, Stanford, CA, US
Jennifer Kamens & Tanja Gruber
St. Jude Children’s Research Hospital, Department of Computational Biology, Memphis, TN, US
Samuel W. Brady, J. Robert Michael, Xiaolong Chen, Gang Wu, Yanling Liu, John Easton, Scott Newman, Jinghui Zhang & Xiaotu Ma
St. Jude Children’s Research Hospital, Department of Biostatistics, Memphis, TN, US
Huiyun Wu & Stanley Pounds
Arkansas Children’s Northwest Hospital, Department of Hematology/Oncology, Springdale, AR, US
Cristyn Branstetter
Memorial Sloan Kettering Cancer Center, Department of Pediatrics, New York, NY, US
Michael F. Walsh
St. Jude Children’s Research Hospital, Department of Oncology, Memphis, TN, US
Kim E. Nichols, Jamie L. Maciaszek & Jeffrey E. Rubnitz
Stanford University School of Medicine, Stanford Cancer Institute, Stanford, CA, US
Tanja Gruber

Contributions

J.R.S., J.M., J.K., S.W.B., L.M., X.M., and J.M.K. prepared the manuscript. J.R.S., T.W., R.H., J.K., T.G., X.M., and J.M.K. were responsible for experimental design and analysis. T.W. prepared DNA and RNA from all patient samples. J.M., M.P.W., J.R.M., X.C., G.S., G.W., Y.L., J.E., S.N., and J.Z. were responsible for bioinformatic data analysis. L.M. performed the super-enhancer analysis of CD34⁺ cells. K.E.N., M.F.W., J.L.M., J.K., J.R.S., T.G., and J.M.K. analyzed germline variants and determined their likely pathogenicity. S.W.B. performed the analysis of the mutational signatures present in the tMN cohort. J.R.S., J.K., and C.B. assembled all clinical data for the tMN cohort. P.K. performed MECOM immunohistochemistry. S.P. and H.W. were responsible for all statistical analyses. C.G.M. and J.E.R. assisted with data analysis and acquisition of patient cases.

Corresponding authors

Correspondence to Tanja Gruber, Xiaotu Ma or Jeffery M. Klco.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Tomas Radivoyevitch, Goro Sashida and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1–22

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Schwartz, J.R., Ma, J., Kamens, J. et al. The acquisition of molecular drivers in pediatric therapy-related myeloid neoplasms. Nat Commun 12, 985 (2021). https://doi.org/10.1038/s41467-021-21255-8

Received: 05 June 2020
Accepted: 15 January 2021
Published: 12 February 2021
DOI: https://doi.org/10.1038/s41467-021-21255-8
