
The landscape of genomic alterations across childhood cancers

  • Nature volume 555, pages 321–327 (15 March 2018)
  • doi:10.1038/nature25480


Pan-cancer analyses that examine commonalities and differences among various cancer types have emerged as a powerful way to obtain novel insights into cancer biology. Here we present a comprehensive analysis of genetic alterations in a pan-cancer cohort including 961 tumours from children, adolescents, and young adults, comprising 24 distinct molecular types of cancer. Using a standardized workflow, we identified marked differences in terms of mutation frequency and significantly mutated genes in comparison to previously analysed adult cancers. Genetic alterations in 149 putative cancer driver genes separate the tumours into two classes: small mutation and structural/copy-number variant (correlating with germline variants). Structural variants, hyperdiploidy, and chromothripsis are linked to TP53 mutation status and mutational signatures. Our data suggest that 7–8% of the children in this cohort carry an unambiguous predisposing germline variant and that nearly 50% of paediatric neoplasms harbour a potentially druggable event, which is highly relevant for the design of future clinical trials.


Cure rates for childhood cancers have increased to about 80% in recent decades, but cancer is still the leading cause of death by disease in the developed world among children over one year of age1,2. Furthermore, many children who survive cancer suffer from long-term sequelae of surgery, cytotoxic chemotherapy, and radiotherapy, including mental disabilities, organ toxicities, and secondary cancers3. A crucial step in developing more specific and less damaging therapies is the unravelling of the complete genetic repertoire of paediatric malignancies, which differ from adult malignancies in terms of their histopathological entities and molecular subtypes4. Over the past few years, many entity-specific sequencing efforts have been launched, but the few paediatric pan-cancer studies thus far have focused only on mutation frequencies, germline predisposition, and alterations in epigenetic regulators4,5,6.

We have carried out a broad exploration of cancers in children, adolescents, and young adults, by incorporating small mutations and copy-number or structural variants on somatic and germline levels, and by identifying putative cancer genes and comparing them to those previously reported in adult cancers by The Cancer Genome Atlas (TCGA)7. We have also examined mutational signatures and potential drug targets. The compendium of genetic alterations presented here is available to the scientific community at

This integrative analysis includes 24 types of cancer and covers all major childhood cancer entities, many of which occur exclusively in children8 (Fig. 1, Supplementary Table 1). Ninety-five per cent of the patients in this study were diagnosed during childhood or adolescence (aged 18 years or younger) and 5% as young adults (up to 25 years) (Extended Data Fig. 1a). This study is biased towards central nervous system tumours, and is complemented by an additional study of a non-overlapping paediatric cohort with mainly leukaemias and extracranial solid tumours9.

Figure 1: Somatic mutations in the paediatric pan-cancer cohort.

Somatic coding mutation frequencies in 24 paediatric (n = 879 primary tumours) and 11 adult (n = 3,281) cancer types (TCGA)7. Hypermutated and highly mutated samples are separated by dashed grey lines and highlighted with black squares. Median mutation loads are shown as solid lines (black, cancer types; purple, all paediatric; green, all adult).

We compiled paired-end Illumina-based sequencing data for 961 tumours (914 individual patients) from previous cancer-type specific studies (see Methods and Supplementary Note 1) including 547 whole-genome sequences (WGS, median coverage 37×) and 414 whole-exome sequences (WES, 121×) partially complemented by low-coverage whole genomes (Supplementary Tables 1, 2). Tumour and matched germline samples were processed with standardized pipelines to detect single nucleotide variants (SNVs), short insertions and deletions (indels), copy-number variants (CNVs) and other structural variants. Secondary (relapse) tumours (n = 82, including 47 matched to primaries) were analysed separately from the main primary cohort (n = 879).

Mutation frequencies across cancer types

Coding somatic SNV (93%) and indel (7%) counts correlated across all samples (n = 879) (R = 0.27, P = 9.1 × 10⁻⁵; Extended Data Fig. 1b, c). Mutation frequencies varied between cancer types (0.02–0.49 mutations per Mb) and were overall 14 times lower than in adult cancers7 (0.13 versus 1.8 mutations per Mb, TCGA data; Fig. 1, Extended Data Fig. 1c, Supplementary Table 3). Relapse tumours harboured significantly more mutations than primary tumours (P = 0.0015, excluding highly mutated tumours; Extended Data Fig. 1d).

Tumours with more than 10 mutations per Mb have been referred to as ‘hypermutators’, and are often related to deficiencies in mismatch repair (MMR)10,11. In this cohort, hypermutation occurred exclusively in H3.3 or H3.1 K27-wildtype (K27wt) high-grade gliomas with biallelic germline mutations in MSH6 or PMS2, with an extremely high mutational burden similar to the highest among adult tumours (in POLE- or POLQ-mutated carcinomas)7,12 (Fig. 1). Some paediatric tumours had a mutational burden below this threshold, but markedly above average (2–10 mutations per Mb, referred to as ‘paediatric highly mutated’), including several K27wt high-grade gliomas with monoallelic germline variants in MSH2, MSH6 or PMS2 (Fig. 1). Whether these highly mutated tumours respond to immune checkpoint inhibitors, as described for paediatric glioblastoma, should be of clinical interest13.
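The thresholds described above can be expressed as a small classifier. This is a minimal sketch: the function name and the label for tumours below 2 mutations per Mb ("typical") are illustrative assumptions, and only the numeric cut-offs come from the text.

```python
def classify_mutation_burden(mutations_per_mb: float) -> str:
    """Assign a burden category from coding mutations per Mb.

    Cut-offs follow the text: more than 10 per Mb is 'hypermutated',
    2-10 per Mb is 'paediatric highly mutated'. The label 'typical'
    for everything below 2 per Mb is an assumed placeholder name.
    """
    if mutations_per_mb < 0:
        raise ValueError("mutation frequency cannot be negative")
    if mutations_per_mb > 10:
        return "hypermutated"
    if mutations_per_mb >= 2:
        return "paediatric highly mutated"
    return "typical"
```

For example, the paediatric median of 0.13 mutations per Mb falls in the lowest category, whereas a tumour with 5 mutations per Mb would be flagged as paediatric highly mutated.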

As in previous reports, the somatic mutation burden increased with patient age (R = 0.39, P = 2.9 × 10⁻⁶), except in Burkitt’s lymphoma (immunoglobulin hypermutation) and tumours with ‘kataegis’ events of localized hypermutation at double-stranded breakpoints14,15 (Extended Data Fig. 1e, f). Both SNVs (R = 0.37, P = 1.0 × 10⁻⁵) and indels (R = 0.27, P = 5.4 × 10⁻⁴) correlated with patient age overall, although within some cancers (for example, acute lymphoblastic leukaemia (ALL), Ewing’s sarcoma, and rhabdomyosarcoma), we observed almost random mutational loads (R < 0.2). Rhabdomyosarcomas were largely dominated by embryonal tumours with more mutations than the few alveolar cases (median 0.27 versus 0.12 mutations per Mb, P = 0.002).

Mutational processes in childhood cancers

Most cancer types predominantly harboured C > T transitions (≥30% of SNVs in two-thirds of cancer types) linked to mutational signature 1, whose previously described age-association occurred in some paediatric brain tumours15,16 (P < 0.05; Extended Data Figs 1g, 2a–c). Mutational signatures, possibly reflecting biochemical cellular processes, have previously been investigated for many, mainly adult, cancers15. In this paediatric cohort (WGS, n = 503), we found evidence for major contributions of 16 out of 30 published signatures and also identified one new signature15 (Fig. 2, Extended Data Fig. 2a, Supplementary Table 4). This ‘signature P1’, which is distinct from any previously documented signatures and harbours elevated C > T mutations in a CCC/CCT context, occurred in several atypical teratoid rhabdoid tumours (ATRTs) and one ependymoma (Fig. 2, Extended Data Fig. 2d, Supplementary Table 5). Its activity correlated with ‘multiple nucleotide variants’ (MNVs; R = 0.87, P = 1.1 × 10⁻¹²), but no particular loci or genes were mutually altered in the affected tumours (Extended Data Fig. 2d). Notably, all ATRTs with signature P1 were in the recently defined subgroup ‘SHH’, and even within one proposed methylation subset of these17 (P = 0.003, Wilcoxon rank-sum test; Extended Data Fig. 2d). Signatures 16 and 18 were heterogeneously represented within several cancer types, with signature 16 being most prominent in pilocytic astrocytomas, and signature 18, previously proposed to be associated with oxidative DNA damage and related to C > A transversions, in neuroblastomas, rhabdomyosarcomas, and other tumours with multiple structural variants15,18 (Extended Data Figs 1g, 2a, c, 3a).

Figure 2: Mutational processes active in paediatric cancers.

Contributions of thirty known and one novel mutational signature to the somatic mutations for the ten most frequently mutated samples per cancer type; each bar represents one individual tumour.

Signature 3, the ‘canonical’ double-stranded break signature linked to mutations in BRCA1 or BRCA2 or to a ‘BRCAness’ phenotype, and signatures 8 (recently linked to BRCA2 or PALB2 germline mutations in medulloblastomas; S. M. Waszak et al., personal communication) and 13 were linked to chromothripsis and TP53 mutations. This was particularly true for TP53 germline-mutated SHH medulloblastomas, and similarly for adrenocortical carcinomas and rhabdomyosarcomas (Extended Data Fig. 3b, c). Overall, signatures 3, 8, and 13 were more pronounced in cancer types with higher genomic instability (that is, structural variants; Extended Data Fig. 2e).

Germline variants in cancer predisposition genes

A recent study of more than 1,000 patients estimated that about 8% of children with cancer harbour a hereditary predisposition5. Accordingly, in our cohort (n = 914 individual patients, about 25% of samples overlapping with the previous study), 7.6% of samples were determined as being likely to be associated with a pathogenic germline variant5,19 (162 genes investigated; Supplementary Tables 6, 7). No general age-of-onset bias was observed in patients with a predisposition; however, onset was later in germline MMR-deficient patients (P = 0.0001), even within the high-grade glioma sub-cohort (P = 0.001).

Hereditary predisposition was most common in adrenocortical carcinomas (50%) and hypodiploid B-ALL (28%), followed by K27wt high-grade gliomas, ATRTs, SHH medulloblastomas, and retinoblastomas (15–25% each; Fig. 3a). Compared to the previous study, LZTR1, TSC2, and CHEK2 emerged as new putative predisposition genes, and possible new associations, such as SDHA with medulloblastoma, were detected5 (Fig. 3b).

Figure 3: Germline mutations in cancer predisposition genes.

a, Frequency of patients with a pathogenic germline mutation per cancer type (n = 914 tumours). b, Mutated genes sorted by number of affected samples (del, copy-number alterations; others, SNVs/indels). c, Cellular processes associated with cancer predisposition genes. d, Frequency of germline mutations adjusted for incidence and estimated total proportion of childhood cancers likely to be linked to hereditary predisposition.

Most germline variants were related to DNA repair genes from mismatch (MSH2, MSH6, PMS2) and double-stranded break (TP53, BRCA2, CHEK2) repair (Fig. 3b, c). Both groups are clinically relevant: patients with constitutional MMR deficiency could be candidates for immune checkpoint inhibition13 (Figs 1, 3b, c). Carriers of TP53 germline mutations (Li–Fraumeni syndrome), here most common in adrenocortical carcinomas, hypodiploid B-ALL, SHH medulloblastomas, and K27wt high-grade gliomas, are at a 50% risk for early-onset cancer compared to 1% overall, and are susceptible to treatment-induced secondary oncogenesis2,20,21,22 (Fig. 3b). Correcting the predisposition frequency of 7.6% in this cohort for the relative incidence of cancer types as a whole, we find that approximately 6% of all childhood cancer patients may carry a causative germline variant (Fig. 3d).
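The incidence correction can be read as a weighted average: each cancer type's predisposition frequency is weighted by that type's share of all childhood cancers rather than by its share of this cohort. The sketch below illustrates that reading only. The function name and the toy numbers are hypothetical, and the exact procedure used for Fig. 3d may differ.

```python
def incidence_adjusted_frequency(frequency_by_type, incidence_by_type):
    """Weighted mean of per-type frequencies, weighted by population incidence.

    frequency_by_type: cancer type -> fraction of patients carrying the event.
    incidence_by_type: cancer type -> relative incidence (any positive scale).
    Both arguments are illustrative inputs, not data from the study.
    """
    missing = set(frequency_by_type) - set(incidence_by_type)
    if missing:
        raise KeyError(f"no incidence given for: {sorted(missing)}")
    total = sum(incidence_by_type[t] for t in frequency_by_type)
    if total <= 0:
        raise ValueError("total incidence must be positive")
    return sum(
        frequency_by_type[t] * incidence_by_type[t] / total
        for t in frequency_by_type
    )


# Toy example: a rare type with a high carrier rate contributes little
# once weighted by how seldom it occurs in the population.
toy_frequency = {"rare_type": 0.50, "common_type": 0.10}
toy_incidence = {"rare_type": 1.0, "common_type": 9.0}
adjusted = incidence_adjusted_frequency(toy_frequency, toy_incidence)
```

This shows why an adjusted estimate can fall below the raw cohort frequency when high-predisposition entities are over-represented in the cohort relative to their population incidence.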

Significance analysis identifies cancer driver genes

Genome-wide analysis for significant mutation clusters (n = 538, WGS excluding hypermutators) identified non-coding mutations in the TERT promoter in 2.5% of tumours (Extended Data Fig. 4a, b, Supplementary Table 8). Further high-confidence clusters corresponded to coding mutations in frequently mutated genes (TP53, H3F3A, CTNNB1), and to localized hypermutation at the rearranged MYC locus in Burkitt’s lymphoma, while the bulk were classified as likely technical artefacts23 (Extended Data Fig. 4b).

MuSiC identified 77 significantly mutated genes (SMGs), which were ranked according to their pan-cancer mutation frequency24 (Fig. 4, Supplementary Tables 9, 10). Most SMGs were mutually exclusively mutated across cancer types, demonstrating specificity of single putative driver genes in childhood cancers as compared to more frequent co-mutation in adult cancers in the TCGA study7 (Extended Data Fig. 4c–e). None of the SMGs showed a bias towards samples with higher mutation frequencies. The allele frequencies of mutations in SMGs were higher than in non-SMGs, and ranked higher in individual tumours, suggesting an early clonal occurrence of these likely driver events (Extended Data Fig. 4f). Two additional SMGs emerged from analysis of the relapse tumours (n = 82): PRPS1 and NT5C2, both of which have been previously implicated in disease progression and chemotherapy resistance25,26 (Extended Data Fig. 4g).

Figure 4: Significantly mutated genes in paediatric compared to adult cancer types.

Percentage of tumours with non-silent mutations in 77 SMGs for 24 paediatric tumour types (n = 879 tumours) and the pan-cancer cohort.

Genes linked to epigenetic modification emerged as the most common (25% of tumours, 23 of 24 cancer types) and the largest (20%) group of SMGs (Extended Data Fig. 5a). Compared to a previous study6, for example, we also detected ARID1A and BCOR. Transcriptional regulators and MAP-kinase-associated genes accounted for 12–15% of SMGs. TP53 was the only DNA repair gene among somatic SMGs, in contrast to the multiple DNA repair-related germline mutations, and also in contrast to adult cancers (9% of SMGs, TCGA)7. PI3K-associated SMGs are the most commonly altered (31%) genes in adult cancers, compared to only 3% in paediatric cancers, which could be related to their often late occurrence in the evolution of multi-hit adult cancers27 (Extended Data Fig. 5a).

Forty-seven per cent of paediatric tumours harboured at least one SMG mutation, with most tumours (57%) having only one. SMG mutations were rare (<15%) in ependymomas, hepatoblastomas, Ewing’s sarcomas (driven by EWSR1 fusions instead of by point mutations28), and pilocytic astrocytomas, and common (>90%) in K27M high-grade gliomas, WNT medulloblastomas, and Burkitt’s lymphomas. By contrast, 93% of adult cancers harbour at least one mutation in an (adult cancer-related) SMG and 76% in multiple SMGs7 (Extended Data Fig. 5b). In line with the accompanying paediatric pan-cancer study9, only around 30% of paediatric SMGs overlapped with adult SMGs (Extended Data Fig. 5c). On the basis of incidence-normalized mutation frequencies, TP53 is predicted to be the most common somatically mutated gene (4% of childhood tumours), followed by KRAS, ATRX, NF1, and RB1 (1–2% of tumours); in adult cancers, with similarly normalized data, TP53 is also the most commonly mutated gene, albeit ten times more frequently (Extended Data Fig. 5d).

Assessment of high functional impact mutations (OncodriveFM)29 revealed well-known tumour suppressor genes (TSGs) such as TP53, ATRX, SMARCA4, and RB1, and further putative TSGs, including FMR1 in SHH/WNT medulloblastomas and MALRD1 (also known as C10orf112) in rhabdomyosarcomas (Extended Data Fig. 6a). Locally clustered ‘hotspot mutations’ (OncodriveClust)29,30 identified known oncogenes, such as CTNNB1, PIK3CA, KRAS, and BRAF, proposed oncogenes (ACVR1, KBTBD4, TBR1), and possible new candidates, such as SF3B1, in Group 4 medulloblastomas (Extended Data Fig. 6b).

Recurrent structural and copy-number variants

The degree of genomic instability (that is, the number of structural variants, including insertions, deletions, translocations, and inversions) varied substantially (median 1–434 structural variants) across cancer types (WGS, n = 539), with more than 1,000 structural variants in individual samples of adrenocortical carcinoma and osteosarcoma (Fig. 5a, Supplementary Table 11). Genomic instability correlated with germline (P = 3 × 10⁻¹⁵) and somatic (P = 2 × 10⁻⁴) TP53 mutations across all samples, but differed markedly between cancer types, again suggesting cancer type-specific effects of DNA repair (Fig. 5b, Extended Data Figs 3b, 7a).

Figure 5: Genomic instability and recurrent copy-number alterations.

a, Frequency of structural variants (SVs) across cancer types (n = 539 tumours). b, Structural variant load from a across all tumours in relation to TP53 mutations (generalized linear model, confidence interval 0.95). a, b, Quartiles, range of whiskers: 1.5 × interquartile range. c, Genomic regions with significant copy-number changes (red, gains or amplifications; blue, deletions; n = 516 tumours).

Genomically unstable cancers were also more often hyperdiploid31 (Supplementary Table 12). Twelve per cent of tumours had a ploidy of four or more, 72% retained a near-diploid state (ploidy 1.5–2.5), and hypodiploidy was observed mainly in hypodiploid B-ALLs (Extended Data Fig. 7b). Hyperdiploidy was associated with somatic (P = 0.005) and germline (P = 0.003) TP53 mutations, in line with a role for mutant TP53 in the bypassing of the G1 tetraploidy checkpoint32 (Extended Data Fig. 7c–e). Chromothripsis was also often observed in hyperdiploid cancers and co-occurred with somatic (P = 2.3 × 10⁻¹⁰) and germline TP53 (P = 5 × 10⁻⁸) mutations in 50% and 66% of these tumours, compared to 8% in TP53 wild-type tumours33,34,35 (Extended Data Fig. 7f–h, Supplementary Table 13).

Thirty-four regions recurrently altered by copy-number changes (17 amplified, 17 deleted) were identified using GISTIC2.0 (WGS, n = 516)36; candidate driver genes were assigned to each based on known cancer genes and literature review (Fig. 5c, Extended Data Fig. 8a, b, Supplementary Tables 14–17). Alterations per cancer type are summarized in Extended Data Fig. 9.

Recurrently amplified regions contained known oncogenes, including MYC, MYCN, or GLI2, with 11 regions involving high-level amplifications (at least 5-fold gain) (Extended Data Fig. 8b). Further interesting regions included 17q11.2 with 61 genes, containing NCOR1 as a potential candidate, and a region on 12q24.31 near (~0.1 Mb) the proposed oncogene KDM2B37,38. Recurrently deleted regions were predominantly associated with epigenetic or cell cycle regulators, most commonly TP53, PTEN, SETD2, and CDKN2A or CDKN2B. Further potential tumour suppressors included RAD51D on 17q12 and FOXF1 on 16q24.1, with significant loss across the cohort39.

As evidenced by recurrent structural variation outside genes (based on breakpoint clusters in 10-kb windows), rearrangements linked to enhancer hijacking were also found, involving GFI1B and DDX31 in medulloblastomas and TERT in neuroblastomas40,41. Together with genes directly affected by breakpoints, in total 70 structural variant-related putative cancer genes were found, many associated with cell cycle or growth (for example, the tumour suppressor PTPRD) or epigenetic regulators (such as SUZ12)42,43 (Extended Data Fig. 8c, Supplementary Tables 18, 19). Cancer type-specific events that occurred together with high expression (data derived from Northcott et al.44) included alterations of RIMS245.

The analysed genomic alterations were combined into 166 ‘likely functional events’ (LFEs) affecting 149 genes, classified as M-(mutation)-type or as SC-(structural/copy-number variant)-type (Extended Data Fig. 10a, Supplementary Table 20). Along the ‘cancer genome hyperbola’, individual tumours (WGS, n = 539) differentiated between an M-class (more M-type LFEs) and an SC-class (more SC-type LFEs)46 (Extended Data Fig. 10b, Supplementary Table 21). Fifty-five per cent of tumours were exclusive to one class, 27% were mixed but dominated by one type of LFE, 8% were ambiguous, and 10% had no LFEs (which may be of particular interest in assessing other tumour-driving events at the epigenetic or transcriptomic level). Germline MMR mutations were enriched in the M-class, and germline TP53 mutations in the SC-class (P = 0.0003 and P = 0.05, respectively, Fisher’s exact test; Extended Data Fig. 10c). Individual cancer types displayed varying relative distributions of mutation classes (Extended Data Fig. 10d).

Drug targets in childhood cancers

To assess the status of druggability of childhood cancers, the cohort (n = 675 with full genomic information; WES-only, n = 39; see Methods) was screened for potentially druggable events19 (PDEs, that is, alterations in 179 genes with a directly or indirectly targeted treatment currently available or under development; Supplementary Table 22). This analysis revealed 453 PDEs in 59 genes, including 3% germline events (Supplementary Table 23). Most cancer types had tumours with PDEs related to both M- and SC-type (Fig. 6a). Most commonly, PDEs occurred in Burkitt’s lymphomas and pilocytic astrocytomas, while none were detected in ependymomas or hepatoblastomas (although the latter lacked information regarding CNVs or structural variants). Associated pathways included RTK/MAPK signalling, transcriptional regulation, cell cycle control, and DNA repair (Fig. 6a).

Figure 6: Potentially druggable events in paediatric cancers.

a, Proportion of primary tumours with potentially druggable events and associated biological pathways, per cancer type (n = 675 tumours with complete genomic information). NA, not available. b, Proportion of patients with potentially druggable events, projected after normalization for incidence.

When the data are normalized for relative cancer incidence, 52% of all primary paediatric tumours may harbour a PDE (Fig. 6b); this might be an underestimate, given that some structural variants may not have been detected by this approach (for example, the common MYC translocations in Burkitt’s lymphoma)23. After incidence adjustment, MAPK signalling and cell cycle control were most commonly affected. Notably, the PDEs often varied between primary and relapse tumours from one patient (n = 41): only 37% of primary tumours with PDEs retained these upon progression, while most of them partially or completely gained or lost events. This highlights the need for profiling of the current tumour when considering personalized therapy.


Our analysis of this pan-cancer compendium outlines the landscape of genomic alterations across multiple childhood cancer types. Although some alteration types and rarer entities are still under-represented and significance analyses are probably limited, this dataset of nearly 1,000 tumours, which can be explored online, provides an unprecedented data resource for paediatric cancer research, further complemented by the accompanying pan-cancer study9. The multiple differences found compared to previous studies of adult tumours emphasize the need to consider paediatric cancers separately, further demonstrating a need for mechanism-of-action driven drug development for paediatric indications47.

The predicted frequency of pathogenic germline variants in 6% of patients, together with previous findings, demonstrates the relevance of genetic predisposition in childhood cancer5. Germline TP53 variants, which are clinically highly important, are estimated for 1.5% of children with cancer, and for more than 10% within individual cancer types. Genetic counselling should thus be systematically considered, particularly for patients with indicated high-risk entities.

Although stratified targeted treatment is currently incorporated only rarely into first-line therapy for paediatric cancer patients, our finding that nearly 50% of primary childhood tumours harbour a potentially targetable genetic event is encouraging. It also highlights the need for personalized profiling for each patient, both to increase diagnostic accuracy and to exploit the potential for more effective and less harmful precision therapies. This may also transcend the direct targeting of genes or pathways, for example, through immune checkpoint inhibition in hypermutated tumours13 or through PARP inhibition in genomically unstable (‘BRCAness’) tumours48. It is hoped that ongoing personalized medicine approaches for patients at relapse will give initial information on the use and effectiveness of such targeted drugs (for example, in the clinical trials pedMATCH-NCT03155620; eSMART-NCT02813135; INFORM19). Additional longitudinal monitoring, for example using serial liquid biopsies, may further improve our understanding of tumour biology and the development of resistance mechanisms, and shed light on therapeutic challenges such as tumour heterogeneity.

In summary, this multi-faceted pan-cancer analysis provides a valuable resource for assessing genomic alterations across the spectrum of paediatric tumours. While there are undoubtedly more discoveries to come in terms of expanded cohorts and whole-genome and transcriptome analysis, we believe that this study provides a strong basis for functional follow-up and investigation of potential therapeutic targets in this specific patient population.



The cohort analysed in this study is a compilation of individual sequencing datasets from various sources: the International Cancer Genome Consortium (ICGC) – Pedbrain Tumor and MMML-seq, the German Cancer Consortium (DKTK), the Pediatric Cancer Genome Project (PCGP), the Heidelberg Institute for Personalized Oncology (HIPO), the Individualized Therapy For Relapsed Malignancies in Childhood (INFORM) registry, and other previously published datasets (listed below). For all included tumours, matched germline control tissue was available. Ninety-five per cent of the patients were under 18 years of age (or age unspecified but confirmed age group paediatric), but available data were included for patients up to 25 years, as these were considered relevant for cancer types that typically peak at a young age. All centres approved data access, and informed consent was obtained from all patients.

External data were downloaded from the European Genome-Phenome Archive (EGA) using the accession numbers EGAD00001000085, EGAD00001000135, EGAD00001000159, EGAD00001000160, EGAD00001000161, EGAD00001000162, EGAD00001000163, EGAD00001000164, EGAD00001000165, EGAD00001000259, EGAD00001000260, EGAD00001000261, EGAD00001000268, and EGAD00001000269 (refs 49–62); internal datasets are related to previous PMIDs 27748748, 27479119, 26923874, 25670083, 25253770, 24972766, 24553142, 25135868, 26632267, 26179511, 24651015, 28726821, 23817572, 25962120, 26294725 (refs 17, 19, 44, 63–74) (Supplementary Note 1).

The final cohort included 914 individual patients of no more than 25 years of age including primary tumours for 879 patients with 47 matched relapsed tumours, and an additional 35 independent relapsed tumours (Supplementary Tables 1, 2). Deep-sequencing (~30×) whole-genome data (WGS) were available for 547 samples with matched control, whole-exome sequencing (WES) for 414, and low-coverage whole-genome sequencing (lcWGS) for an additional 54 germline and 186 tumour samples. Depending on the requirements of each sub-analysis, we used WES and WGS, WGS only (excluding Ewing’s sarcoma, Wilms tumour, hepatoblastoma, and T-ALL), or WES, WGS and lcWGS (germline excluding Ewing’s sarcoma, Wilms tumour and hepatoblastoma; tumours excluding Ewing’s sarcoma and hepatoblastoma) (Supplementary Table 24). ‘Subgroups’ of cancer types were considered as separate entities if there was considerable evidence of differences in terms of clinical and molecular behaviours, if sub-cohort sizes were substantial, and if full annotation of all samples was available. All samples had been sequenced using Illumina technology and 99% of samples were paired-end sequences with 100 bp read length. Ninety-eight per cent of exome sequences are covered with at least 30×, 94% with at least 60×, and the total median exome coverage is 121×. The whole-genome sequenced samples have a median coverage of 37× and 94% of samples are covered with at least 30×. Information on coverage and other metrics for all samples is provided in Supplementary Table 2.

Cancer type incidence

Information on incidence of cancer types in the population was derived from the SEER database (Surveillance, Epidemiology, and End Results program)8; further detailed information on different subgroups of cancer types (central nervous system tumours and subgroups of medulloblastoma, ependymoma, and ALL) was transferred from cancer type-specific publications75,76,77,78,79. Survival data are based on information from the German Childhood Cancer Registry80. Incidence rates of adult cancers were taken from information in the German GEKID database (2003–2012).

Data preprocessing

All data were processed using a standardized alignment and variant calling pipeline, which was developed in the context of the ICGC Pan-Cancer project.


Datasets were available in either raw FASTQ or aligned BAM format. To allow standardized processing for all included samples, BAM files were sorted by read name using sambamba (v.0.4.6) and converted to a raw-like FASTQ format using SamToFastq (v.1.61). Reads were then aligned to the phase II reference human genome assembly of the 1000 Genomes Project including decoy sequences, using BWA-MEM (v.0.7.8, with default settings except ‘-T 0’). Matching genotypes of tumour and control samples were confirmed by calculating pairwise DNA sequence similarities at 1,000 reference SNPs (dbSNP v.138)82.
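The tumour–control identity check can be pictured as a genotype concordance over the reference SNPs. The sketch below is an illustration only: the data layout (SNP identifier mapped to a genotype string) and the function name are assumptions, and no acceptance threshold is implied because none is stated in the text.

```python
def genotype_similarity(tumour_genotypes, control_genotypes):
    """Fraction of shared reference SNPs with identical genotype calls.

    Both arguments map a SNP identifier to a genotype string such as 'AG'.
    SNPs called in only one of the two samples are ignored (an assumption).
    """
    shared = [snp for snp in tumour_genotypes if snp in control_genotypes]
    if not shared:
        raise ValueError("no SNP positions are called in both samples")
    matches = sum(
        1 for snp in shared if tumour_genotypes[snp] == control_genotypes[snp]
    )
    return matches / len(shared)


# Toy example with three hypothetical SNP identifiers.
tumour = {"snp_1": "AA", "snp_2": "AG", "snp_3": "GG"}
control = {"snp_1": "AA", "snp_2": "AG", "snp_3": "AG"}
similarity = genotype_similarity(tumour, control)
```

A correctly paired tumour and control are expected to agree at most positions, with disagreements largely confined to regions of somatic loss of heterozygosity, whereas a sample swap would lower the similarity substantially.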

Mutation calling

SNVs were called with the previously described samtools-based DKFZ pipeline adjusted for ICGC Pan-Cancer settings, and short indels were called using Platypus (v.0.7.4)74,83. Variants were first identified in the tumour sample and germline or somatic origin was determined based on their presence or absence in the matched control tissue. Functional effects were annotated using ANNOVAR and GENCODE19.

Somatic structural variant discovery

Somatic structural variant discovery was pursued across all whole-genome sequenced samples (high-quality structural variants available for n = 539 primary tumours) using the DELLY ICGC Pan-Cancer analysis workflow. A high-stringency structural variant set was obtained by additionally filtering somatic structural variants detected in 1% or more of a set of 1,105 germline samples from healthy individuals belonging to phase I of the 1000 Genomes Project and by removing somatic structural variants present in any of the paediatric germline samples of this study86. High-stringency structural variants were further required to have at least four supporting read pairs with a minimum mapping quality of 20 and were restricted to somatic structural variant sizes from 300 bp to 500 Mb.
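The high-stringency criteria listed above amount to a conjunction of simple filters. The following sketch encodes them on a hypothetical record type; the class, its field names, and the function are illustrative assumptions and do not correspond to DELLY's actual output format.

```python
from dataclasses import dataclass


@dataclass
class StructuralVariant:
    """Hypothetical summary record for one somatic SV call."""
    supporting_read_pairs: int
    mapping_quality: int
    size_bp: int
    germline_panel_fraction: float  # fraction of the healthy germline panel carrying the SV
    in_cohort_germline: bool        # seen in any paediatric germline sample of the study


def is_high_stringency(sv: StructuralVariant) -> bool:
    """Apply the thresholds stated in the text to a single call."""
    return (
        sv.supporting_read_pairs >= 4
        and sv.mapping_quality >= 20
        and 300 <= sv.size_bp <= 500_000_000
        and sv.germline_panel_fraction < 0.01  # drop SVs in 1% or more of the panel
        and not sv.in_cohort_germline
    )
```

How events without a defined size are handled is not specified in the text, so this sketch assumes that every call carries a size.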

Copy-number calling

Copy numbers were estimated using ACEseq (allele-specific copy-number estimation from sequencing) (K. Kleinheinz et al., unpublished data), using a binned tumour–control coverage ratio and B-allele frequency (BAF). Allele frequencies were obtained for all single nucleotide polymorphism (SNP) positions recorded in dbSNP version 13582. To improve sensitivity with regard to imbalanced and balanced regions, SNP positions in the control were phased with impute287. Additionally, the coverage for 10-kb windows with sufficient mapping quality and read density was recorded and subsequently corrected for GC content and replication timing.

The genome was segmented using the PSCBS package incorporating structural variant breakpoints defined by DELLY88,89. Segments were clustered based on coverage ratio and BAF using k-means and neighbouring segments in the same cluster were joined; focal segments (<9 Mb) were stitched to the more similar neighbour. Tumour cell content and ploidy were estimated by testing how well different combinations of both explain the data. Segments with balanced BAF were assigned to even-numbered copy-number states, whereas unbalanced segments were allowed to match with uneven numbers as well. Finally, estimated tumour cell content and ploidy were used to compute the total and allele-specific copy-number for each segment. High-quality copy-number calls were available for n = 516 of the WGS samples.
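The final step, converting a segment's coverage ratio into a total copy number once tumour cell content and ploidy are known, can be sketched with a commonly used mixture model. This model is an assumption made for illustration: it treats normal cells as diploid and the coverage ratio as normalized to the sample average, and it is not taken from the ACEseq implementation.

```python
def total_copy_number(coverage_ratio, purity, ploidy):
    """Estimate a segment's total copy number in tumour cells.

    Assumed model: the expected tumour/control coverage ratio of a segment
    with copy number c is
        (purity * c + 2 * (1 - purity)) / (purity * ploidy + 2 * (1 - purity)),
    which is solved here for c.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    normal_part = 2 * (1 - purity)              # contribution of diploid normal cells
    average_dna = purity * ploidy + normal_part  # mean DNA content of the mixed sample
    return (coverage_ratio * average_dna - normal_part) / purity
```

With a tumour cell content of 0.8 and a ploidy of 2, a coverage ratio of 1.0 corresponds to two copies and a ratio of 1.4 to three copies under this model.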

Mutation statistics

The frequency of somatic mutations in coding regions was determined for each sample individually by normalizing the total number of coding mutations to the number of sufficiently covered (≥6×) coding bases (determined using MuSiC-bmr) to account for different data types (WGS/WES) and for different exome target enrichment kits24. Mutation spectra were obtained by categorizing observed SNVs into base substitution types in pyrimidine context. Spearman’s rank correlation test was applied to infer correlations between different types of mutation counts or between mutation counts and age. Generalized linear models were used to fit regression lines. Clusters of localized hypermutation were identified using a previously presented approach adjusted for mutation rates in human paediatric cancers90.
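The normalization described above amounts to a simple ratio. The following minimal sketch assumes that the number of covered coding bases has already been determined per sample; reporting the result per megabase is an assumption made here for readability, and the function name is hypothetical.

```python
def coding_mutations_per_mb(n_coding_mutations: int, n_covered_coding_bases: int) -> float:
    """Coding mutation frequency, normalized to the sufficiently covered coding bases.

    The per-megabase unit is an assumption of this sketch; the text only states
    that counts were normalized to the number of covered coding bases.
    """
    if n_covered_coding_bases <= 0:
        raise ValueError("at least one covered coding base is required")
    return n_coding_mutations * 1e6 / n_covered_coding_bases
```

With this normalization, a WES sample with 15 coding mutations over 30 Mb of covered coding sequence and a WGS sample with 17 mutations over 34 Mb both yield 0.5 mutations per Mb and are therefore directly comparable.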

Deciphering mutation signatures

Exome-sequenced tumours, except for hypermutator cases, were excluded from signature analysis owing to their low numbers of mutations. In brief, signatures are represented as probability distributions of substitution types of SNVs in pyrimidine context. Considering the immediate sequence context of each SNV results in 96 possible mutation types, which are counted per tumour to compile its mutational profile; directly adjacent mutations (multiple nucleotide variants, MNVs) were excluded.
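The count of 96 mutation types follows from six substitution classes in pyrimidine context (C>A, C>G, C>T, T>A, T>C, T>G) multiplied by four possible 5′ and four possible 3′ neighbouring bases. The sketch below enumerates these types and maps an SNV to its pyrimidine-context label; the label format `A[C>T]G` and the function names are conventions assumed here, not taken from the text.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mutation_types() -> list:
    """Enumerate all 96 trinucleotide mutation types in pyrimidine context."""
    types = []
    for ref in "CT":                       # pyrimidine reference bases only
        for alt in "ACGT":
            if alt == ref:
                continue
            for five, three in product("ACGT", repeat=2):
                types.append(f"{five}[{ref}>{alt}]{three}")
    return types


def classify(five: str, ref: str, alt: str, three: str) -> str:
    """Label an SNV (with its 5' and 3' neighbours) in pyrimidine context."""
    if ref in "AG":
        # Purine reference: describe the change on the reverse-complement strand,
        # which swaps and complements the two flanking bases.
        five, ref, alt, three = (
            COMPLEMENT[three], COMPLEMENT[ref], COMPLEMENT[alt], COMPLEMENT[five],
        )
    return f"{five}[{ref}>{alt}]{three}"
```

A G>A change flanked by A (5′) and T (3′) is thus counted as `A[C>T]T`, the same type as the equivalent change read from the opposite strand.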

As proposed by Alexandrov et al.91, the mutational profile of a tumour is expected to reflect a superposition of mutational processes (signatures) acting on its genome, where each mutational process has a different intensity (exposure). For a cohort of tumour genomes, this is modelled as a system of matrices for signatures (P) and exposures (E) defining the observed mutational catalogue (M)91: M ≈ P × E.

De novo deciphering of signatures was done as described91 based on the mutational catalogues of all cancer types and of the pan-cancer cohort. All resulting signatures were compared to published signatures (available in the COSMIC database) based on their cosine similarity15. Signatures that did not correspond to any of the previously known signatures (cosine similarity <0.85) were further analysed to examine their relevance for modelling the cancer genomes. First, linear independence from the known set of signatures was confirmed. Second, for each potentially novel signature, we examined whether the modelling of mutation profiles improved when compared to having used the set of known signatures: for each sample, the observed mutational profile was compared to the theoretical profiles calculated using the set of known signatures only, and using the extended set including the new candidate signature. Here, only samples with a total number of mutations over 200 were considered. Reconstruction was calculated as the difference between cosine similarity of the modelled profile and the observed profile. On the basis of the resulting distribution of similarities in both alternatives, a signature was considered to have a relevant contribution to the model, and thus a potential new signature, if both of the following conditions were fulfilled: the reconstruction (measured as the difference of similarities) of at least one sample increased by 0.02 and that sample had a reconstruction accuracy of <0.9 based on the known set of signatures only.
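The two thresholds used above can be sketched as follows. This is one possible reading of the criteria, not the authors' code: reconstruction accuracy is taken to be the cosine similarity between the modelled and the observed profile of a sample, 'increased by 0.02' is interpreted as a gain of at least 0.02, and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equally long numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def matches_known_signature(candidate, known_signatures, threshold=0.85) -> bool:
    """A candidate corresponds to a known signature if any similarity reaches the threshold."""
    return any(cosine_similarity(candidate, k) >= threshold for k in known_signatures)


def is_relevant_new_signature(acc_known, acc_extended, min_gain=0.02, max_known=0.9) -> bool:
    """Check the two conditions on per-sample reconstruction accuracies.

    acc_known[i] and acc_extended[i] are the accuracies of sample i when modelled
    with the known signature set and with the set extended by the candidate.
    """
    return any(
        ext - known >= min_gain and known < max_known
        for known, ext in zip(acc_known, acc_extended)
    )
```

Under this reading, a candidate is retained if at least one sample that was poorly explained by the known signatures (accuracy below 0.9) is explained noticeably better once the candidate is added.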

This procedure resulted in one new candidate signature, signature P1, which was added to the set of reference signatures. In order to achieve maximum resolution per sample, a sample-wise re-extraction of exposures from the mutational profiles was performed using quadratic programming with the reference signature set used for P and the exposures in E as unknown variables. Samples with a reconstruction accuracy below 0.5 were excluded (resulting in n = 503 tumours with high-quality signature information), as these samples would not be correctly accounted for by the model, which might be due to quality issues or to contributions of unknown signatures that are not present at intensities sufficient to be identified by a de novo approach. The resulting exposures were used for further downstream analyses and visualization. Previously published signatures without validation were first included to model the mutational catalogues as precisely as possible, but then summarized as ‘other’ for representation.
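To make the sample-wise re-extraction concrete, the sketch below estimates non-negative exposures for one tumour by minimizing the squared difference between the observed profile and the profile modelled from fixed reference signatures. It is a simplified stand-in rather than the procedure used in the study: instead of a dedicated quadratic programming solver it uses projected gradient descent, which addresses the same non-negative least-squares problem, and the function name and toy inputs are hypothetical.

```python
def fit_exposures(signatures, profile, n_iter=2000):
    """Estimate non-negative exposures e minimizing ||P e - m||^2.

    signatures: list of k reference signatures (columns of P), each a list of
                probabilities over the mutation types.
    profile:    observed mutation counts m of one tumour, same length as a signature.
    """
    k, n = len(signatures), len(profile)
    # Step size 1 / trace(P^T P); the trace bounds the largest eigenvalue of
    # P^T P from above, so the iteration cannot diverge.
    step = 1.0 / sum(p * p for sig in signatures for p in sig)
    exposures = [0.0] * k
    for _ in range(n_iter):
        modelled = [sum(exposures[j] * signatures[j][i] for j in range(k)) for i in range(n)]
        residual = [modelled[i] - profile[i] for i in range(n)]
        gradient = [sum(signatures[j][i] * residual[i] for i in range(n)) for j in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        exposures = [max(0.0, exposures[j] - step * gradient[j]) for j in range(k)]
    return exposures
```

In a toy case with two signatures over three mutation types, `[0.8, 0.2, 0.0]` and `[0.0, 0.3, 0.7]`, a profile of `[8, 11, 21]` is recovered as exposures of approximately 10 and 30 mutations. The modelled profile can then be compared with the observed one by cosine similarity to obtain the reconstruction accuracy used for the 0.5 cut-off.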

Spearman’s rank correlation and two-sided Kolmogorov–Smirnov tests were used to associate exposure of signatures with numerical and categorical variables, respectively. Exposures to signatures across multiple groups were compared using ANOVA and the post hoc Tukey’s test.

Identifying mutations in genes predisposing to cancers

To identify germline variants with a high likelihood of being implicated in cancer development, we investigated 162 candidate genes adapted from ref. 19 (110 genes regarded as following a dominant inheritance pattern and 52 genes with recessive inheritance) (Supplementary Table 6).

Germline SNVs and indels were subjected to a stepwise filtering approach to eventually classify them into five categories: benign, likely benign, uncertain significance, likely pathogenic, and pathogenic. First, variants reported in both the 1000 Genomes (release November 2010) and dbSNP (v.141) databases were excluded. High-quality variant calls were selected by including only positions with ≥15× coverage, a germline allele frequency of ≥0.2, and a phred-based quality score of ≥10. Variants with a population frequency ≥0.01 reported in additional common databases (esp6500siv2, X1000g2015, and exac03, included in ANNOVAR) or with ClinVar annotations of ‘benign’, ‘likely benign’ or ‘uncertain significance’ were removed.

Furthermore, variants with a phred-scaled CADD score ≥15 and with Mutation Assessor categories ‘medium’ and ‘high’, or no available annotation, were included. Variants with a dbSNP classification of ‘precious’ were not subject to these two filtering steps. As indel calling is more prone to alignment and calling errors, potentially deleterious indels were manually investigated for artefacts. For recessive tumour genes, variants were included only with an allele frequency of one or with two compound heterozygous mutations of the same gene in the same patient. In total, the filtering steps narrowed down the number of potentially pathogenic mutations to n = 433. Every variant was then manually checked and scored by the use of varied, mainly gene-specific online databases. Only likely pathogenic and pathogenic mutations were considered as cancer-relevant and used for representation in Fig. 3. Additionally, whole-genome sequenced samples were manually screened for copy-number losses in 13 tumour suppressor genes of the candidate list, which are known to occasionally harbour germline focal deletions (MLH1, MSH2, MSH6, NF1, PMS2, PRKAR1A, PTCH1, PTEN, RB1, SMARCA4, SMARCB1, SUFU, TP53).
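The automated part of this stepwise filtering can be summarized as a chain of checks, sketched below. The record type and its field names are hypothetical, and one interpretive assumption is made: the 'two filtering steps' from which ‘precious’ variants are exempt are read here as the CADD and Mutation Assessor criteria. The recessive-gene rule, the manual indel review, and the manual scoring are not modelled.

```python
from dataclasses import dataclass
from typing import Optional

CLINVAR_REMOVED = {"benign", "likely benign", "uncertain significance"}
MUTATION_ASSESSOR_KEPT = {"medium", "high", None}  # None stands for 'no available annotation'


@dataclass
class GermlineVariant:
    # All field names are hypothetical and chosen for this sketch only.
    in_1000g_and_dbsnp: bool       # reported in both 1000 Genomes and dbSNP
    coverage: int                  # read depth at the position
    allele_frequency: float        # germline variant allele frequency
    quality: float                 # phred-based quality score
    population_frequency: float    # highest frequency in the additional databases
    clinvar: Optional[str] = None
    cadd_phred: float = 0.0
    mutation_assessor: Optional[str] = None
    dbsnp_precious: bool = False


def passes_germline_filters(v: GermlineVariant) -> bool:
    """Apply the automated filters in the order given in the text."""
    if v.in_1000g_and_dbsnp:
        return False
    if v.coverage < 15 or v.allele_frequency < 0.2 or v.quality < 10:
        return False
    if v.population_frequency >= 0.01 or v.clinvar in CLINVAR_REMOVED:
        return False
    if v.dbsnp_precious:
        # Assumed reading: 'precious' variants skip the two impact-score filters.
        return True
    return v.cadd_phred >= 15 and v.mutation_assessor in MUTATION_ASSESSOR_KEPT
```

Variants passing these checks would correspond to the candidate set that was subsequently reviewed manually.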

Detecting genome-wide mutation clusters

To identify genomic regions with single or clusters of recurrent mutations, the human genome was binned into non-overlapping windows of various sizes (50–500 bp) and the observed mutations were compared to a background model (V. A. Rudneva et al., unpublished data), which was estimated using the ‘global’ model: the genome was stratified into 25 evenly sized groups of genomic windows based on the combined vector of five genetic and epigenetic features (replication timing, gene expression level, GC content, H3K9me3, and open versus closed chromatin conformation). For each region an enrichment score, binomial P value, and negative binomial test P value were computed.

Cross-validations were used to determine the significance cut-off that would provide reproducible results (with samples segregated by subgroup). A combination of the window size (500 bp), test statistics (enrichment score, mutational recurrence, binomial test P value, and gamma Poisson test P value), and a cut-off value that ensured high precision and recall values based on the precision-recall analysis (P = 10⁻²⁰) were chosen (Extended Data Fig. 4a). Recall was calculated as the number of regions that satisfied the cut-off in results obtained on both halves of the dataset; precision was calculated as a fraction of the recalled regions to the total number of regions that satisfied the cut-off in each of the datasets. The chosen parameters were then used to run the pipeline on the complete dataset and then the mutations in the resulting regions were further examined manually for potential false positives in order to identify high-confidence candidate regions (Extended Data Fig. 4b).
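The recall and precision definitions used in this cross-validation can be illustrated with set operations. The sketch below assumes that each half of the dataset yields a set of region identifiers passing the cut-off and that precision is reported separately for each half; both the function name and that per-half reading are assumptions.

```python
def split_half_reproducibility(hits_a: set, hits_b: set) -> dict:
    """Recall and precision for one cut-off, given the regions passing it in each half."""
    recalled = hits_a & hits_b  # regions that satisfy the cut-off in both halves
    return {
        "recall": len(recalled),
        "precision_a": len(recalled) / len(hits_a) if hits_a else 0.0,
        "precision_b": len(recalled) / len(hits_b) if hits_b else 0.0,
    }
```

For instance, if four regions pass in the first half and three in the second, with two shared between them, recall is 2 and precision is 0.5 and 0.67 for the two halves. Repeating this over a range of cut-offs gives the precision-recall curve from which a threshold can be chosen.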

Significantly mutated genes

Significantly mutated genes based on somatic SNVs and indels were identified with the SMG module of the MuSiC tools suite24 separately for each cancer type and for the pan-cancer cohort, and the results were then merged.

This kind of significance analysis often produces false positive hits (for example, very large genes), despite normalization procedures, and thus several filters were applied to the raw output30. First, all genes of >30,000 bp exonic length or >10,000 bp with additional replication timing >800 were excluded (Cancer Cell Line Encyclopedia; CCLE)92. Genes that scored significant in three or more cancer types, or that were recurrently mutated at the same position, were manually inspected for artefacts from ambiguous alignments (for example, repetitive sequence regions). Also, genes that are probably not associated with tumour development but rather represent non-neoplastic somatic hypermutation processes in the context of immune function were removed. Furthermore, genes mutated in <2% of the cohort were included only if they had a secondary signal from either functional impact or from localized clustering bias (Intogen modules OncodriveFM and OncodriveClust v. 3.0 beta) or from being among known cancer genes29,93. Mutation needle plots were generated using MutationMapper94. Biological processes were assigned to the significantly mutated genes mostly exclusively, except for a few genes with high relevance for multiple processes, as specified in Supplementary Table 9.
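Two of the automated filters above lend themselves to a short sketch: the exclusion of very long or late-replicating genes and the requirement of a secondary signal for rarely mutated genes. The function names are hypothetical, and the replication timing value is assumed to be on the same scale as the threshold of 800 quoted in the text; the manual inspection steps are not modelled.

```python
def passes_size_filter(exonic_length_bp: int, replication_timing: float) -> bool:
    """Exclude genes that are very long, or long and late-replicating."""
    if exonic_length_bp > 30_000:
        return False
    if exonic_length_bp > 10_000 and replication_timing > 800:
        return False
    return True


def keep_rarely_mutated_gene(
    mutated_fraction: float,
    functional_impact_signal: bool,
    clustering_signal: bool,
    known_cancer_gene: bool,
) -> bool:
    """Genes mutated in <2% of the cohort need at least one secondary signal."""
    if mutated_fraction >= 0.02:
        return True
    return functional_impact_signal or clustering_signal or known_cancer_gene
```

A gene with 12 kb of exonic sequence is therefore kept at a replication timing value of 500 but excluded at 900, and a gene mutated in 1% of tumours is kept only if one of the three secondary signals is present.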

Genome instability

Occurrence of chromothripsis was determined by manual inspection of coverage ratio plots (tumour/control) for WGS samples based on previously proposed guidelines95: at least ten copy-number switches on one chromosome, oscillating copy-number variation (usually with changes of +1 or −1, but also between other levels where additional large-scale copy-number changes interfere), and many more of such copy-number variations in one chromosome or chromosome arm compared to the remaining genome. In samples with an exceptionally high degree of structural variation, several chromosomes could be affected, and some samples showed an ‘amplifier’ type of chromothripsis, which was classified as several high-level focal amplifications on exactly the same copy-number level that are thus likely to be connected to one single event.

Generation of copy-number profiles

Copy-number calls reported by ACEseq were converted to the ‘SEG’ segmentation format, similar to the output of the circular binary segmentation algorithm based on chromosomal segment borders as pseudo marker positions96. All possible marker positions were determined from the whole cohort before assessing sample-wise copy-number profiles per marker in order to achieve identical resolution for all samples. Owing to sparse and highly oscillating sequencing coverage at centromeres, centromeric coordinates (±3 Mb around the centre of annotated centromeres) were excluded from whole-genome segmentation, as were two likely artefact regions on chromosomes 7 and 14 with nonspecific occurrences of relative copy-number gains and losses in 28% and 30% of all analysed samples in 17 of 19 entities (14q11.2, 7p14.1), which were identified using GISTIC2.0 (as described below) with ±1 Mb.

Identifying recurrent copy-number/structural variations

GISTIC2.0 (v.2.0.22, gene-gistic default parameter settings) was applied to the segmented copy-number data (per cancer type and pan-cancer) to identify significant copy-number alterations36. The resulting peaks were filtered for significance (q < 0.1) and size (≤10 Mb). Compared to array-based data, which commonly serve as inputs for copy-number significance analysis, sequencing-based copy-number profiles are more prone to artefact copy-number variations, for example, due to repetitive regions leading to ambiguous alignments. Thus, several filtering steps were used to eliminate false-positive GISTIC peak calls and to discover potentially cancer-relevant copy-number alterations: first, peaks overlapping with common fragile genomic sites were excluded, as these are likely to be consequences of genomic instability rather than cancer-driving events97; next, peaks within 1 Mb of chromosomal ends were removed, as here sequencing coverage tends to vary frequently; and last, peaks overlapping with copy-number variable regions98 (regions ranked 1–100) were excluded. Additionally, some of the resulting peaks were classified as ‘passengers’ of variable regions that were called as separated peaks from most likely one event, for example, a peak with MYCNOS as passenger peak of MYCN amplification. For overlapping peaks called in multiple entities and/or pan-cancer, the final region was determined based on the analysis with highest significance for each peak, respectively.

Genes with a breakpoint inside the gene borders were assumed to be altered by structural variation and considered as recurrently altered if they had breakpoints in ≥5 samples in total or in ≥2 samples of one cancer type (for samples without chromothripsis). For other samples, genes with breakpoints in ≥5 samples were included as candidates, but these were not used for further downstream analyses. Additionally, recurrent sites of structural variation outside of gene bodies were determined by clustering breakpoints in 10-kb windows.
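The recurrence rule for genes in samples without chromothripsis can be sketched as a simple counting step. The input format (one tuple per breakpoint falling inside a gene) and the function name are assumptions made for illustration; each sample is counted once per gene regardless of how many breakpoints it contributes.

```python
from collections import defaultdict


def recurrently_altered_genes(breakpoints) -> set:
    """Return genes hit in >=5 samples overall or in >=2 samples of one cancer type.

    breakpoints: iterable of (sample_id, cancer_type, gene) tuples, restricted
                 to samples without chromothripsis.
    """
    samples_per_gene = defaultdict(set)
    samples_per_gene_and_type = defaultdict(set)
    for sample_id, cancer_type, gene in breakpoints:
        samples_per_gene[gene].add(sample_id)
        samples_per_gene_and_type[(gene, cancer_type)].add(sample_id)

    recurrent = {g for g, samples in samples_per_gene.items() if len(samples) >= 5}
    recurrent |= {
        g for (g, _), samples in samples_per_gene_and_type.items() if len(samples) >= 2
    }
    return recurrent
```

Using sets of sample identifiers rather than raw breakpoint counts prevents a single heavily rearranged tumour from making a gene appear recurrent.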

Scoring of druggable mutations

To identify candidates for targeted therapy, somatic and germline mutations (SNV and indels) were screened for variants in genes that are directly or indirectly involved in pathways with matched drugs either approved or currently being investigated in clinical trials (Supplementary Table 22a, adapted from ref. 19). The mutations were then manually assessed by experts in translational oncology and prioritized according to an internal algorithm taking into account the type of alteration, the mechanism of action of potential drugs within the pathway, the level of evidence for the specific alteration, and its role in the present cancer type (Supplementary Table 22b, adapted from ref. 19). Only alterations scored ‘intermediate’ or ‘high’ were regarded as being relevant in terms of druggability. A clonality analysis was not performed owing to limited sequencing depth in whole-genome-sequenced tumours.

Additionally, copy-number plots of whole-genome-sequenced data (including low-coverage WGS) were used to manually screen 52 druggable genes for amplifications or deletions (Supplementary Table 22a). Only focal CNVs (<10 Mb) with at least 5 copies (log2 ≥ 1.3) in the case of amplifications or the loss of ≥ 1 copy (log2 ≤ −1) for deletions were included and subsequently prioritized as described for the SNVs/indels. The data representation includes all tumours with full genomic information (WES + lcWGS or WGS; n = 675) and, additionally, tumours analysed by WES only for cancer types without any whole-genome-sequenced tumours (T-ALL, Ewing’s sarcoma, HB; n = 39), but the latter were excluded from downstream analyses.
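The log2 thresholds quoted above are consistent with copy numbers expressed relative to a diploid baseline: five copies correspond to log2(5/2) ≈ 1.32 and a single remaining copy to log2(1/2) = −1. The sketch below reproduces this arithmetic and the focal-size criterion; the diploid baseline and the function names are assumptions, and the subsequent manual prioritization is not modelled.

```python
import math


def log2_ratio(copies: float, baseline: float = 2.0) -> float:
    """log2 copy ratio relative to an assumed diploid baseline."""
    if copies <= 0:
        return float("-inf")  # homozygous loss
    return math.log2(copies / baseline)


def classify_focal_cnv(log2_value: float, size_bp: int):
    """Return 'amplification', 'deletion', or None for a copy-number segment."""
    if size_bp >= 10_000_000:   # only focal events (<10 Mb) are considered
        return None
    if log2_value >= 1.3:       # at least 5 copies
        return "amplification"
    if log2_value <= -1:        # loss of at least one copy
        return "deletion"
    return None
```

A 2-Mb segment at five copies is classified as an amplification, whereas a four-copy segment (log2 = 1) of the same size is not, and any event of 10 Mb or more is ignored regardless of amplitude.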

Data availability

Mutation data have been deposited into commonly used public data portals. They can be explored in and downloaded from the R2 Analysis and Genomics Platform, the PedcBio Portal for Cancer Visualization, and the TARGET Data Matrix. Sequencing data were obtained from previous studies as listed in Supplementary Note 1 and include the following accession codes: RP012816, PRJEB11430 (European Nucleotide Archive); EGAS00001001139, EGAS00001001953, EGAS00001000607, EGAS00001000381, EGAS00001000906, EGAS00001001297, EGAS00001000443, EGAS00001000213, EGAS00001000263, EGAS00001000192, EGAS00001000255, EGAS00001000254, EGAS00001000253, EGAS00001000256, EGAS00001000246, EGAS00001000379, EGAS00001000380, EGAS00001000346, EGAS00001000349, EGAS00001000347, EGAS00001000192 (European Genome-Phenome Archive).


  1. Challenging issues in pediatric oncology. Nat. Rev. Clin. Oncol. 8, 540–549 (2011)
  2. Cancer statistics, 2016. CA Cancer J. Clin. 66, 7–30 (2016)
  3. Late effects in adult survivors of pediatric cancer: a guide for the primary care physician. Am. J. Med. 125, 636–641 (2012)
  4. The Pediatric Cancer Genome Project. Nat. Genet. 44, 619–622 (2012)
  5. Germline mutations in predisposition genes in pediatric cancer. N. Engl. J. Med. 373, 2336–2346 (2015)
  6. The landscape of somatic mutations in epigenetic regulators across 1,000 paediatric cancer genomes. Nat. Commun. 5, 3630 (2014)
  7. Mutational landscape and significance across 12 major cancer types. Nature 502, 333–339 (2013)
  8. SEER Cancer Statistics Review, 1975–2012, National Cancer Institute (National Cancer Institute, SEER Program, NIH, 2014)
  9. Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature (2018)
  10. Assessing the clinical utility of cancer genomic and proteomic data across tumor types. Nat. Biotechnol. 32, 644–652 (2014)
  11. Comprehensive analysis of hypermutation in human cancer. Cell 171, 1042–1056 (2017)
  12. Integrated genomic characterization of endometrial carcinoma. Nature 497, 67–73 (2013)
  13. Immune checkpoint inhibition for hypermutant glioblastoma multiforme resulting from germline biallelic mismatch repair deficiency. J. Clin. Oncol. 34, 2206–2211 (2016)
  14. Age-related somatic mutations in the cancer genome. Oncotarget 6, 24627–24635 (2015)
  15. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013)
  16. Clock-like mutational processes in human somatic cells. Nat. Genet. 47, 1402–1407 (2015)
  17. Atypical teratoid/rhabdoid tumors are comprised of three epigenetic subgroups with distinct enhancer landscapes. Cancer Cell 29, 379–393 (2016)
  18. Mutational signature analysis identifies MUTYH deficiency in colorectal cancers and adrenocortical carcinomas. J. Pathol. 242, 10–15 (2017)
  19. Next-generation personalised medicine for high-risk paediatric cancer patients – The INFORM pilot study. Eur. J. Cancer 65, 91–101 (2016)
  20. Tumor protein p53 (TP53) testing and Li–Fraumeni syndrome: current status of clinical applications and future directions. Mol. Diagn. Ther. 17, 31–47 (2013)
  21. TP53 germline mutation may affect response to anticancer treatments: analysis of an intensively treated Li–Fraumeni family. Breast Cancer Res. Treat. 151, 671–678 (2015)
  22. Radio-induced malignancies after breast cancer postoperative radiotherapy in patients with Li–Fraumeni syndrome. Radiat. Oncol. 5, 104 (2010)
  23. Advances in the understanding of MYC-induced lymphomagenesis. Br. J. Haematol. 149, 484–497 (2010)
  24. MuSiC: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012)
  25. Mutant PRPS1: a new therapeutic target in relapsed acute lymphoblastic leukemia. Nat. Med. 21, 553–554 (2015)
  26. Activating mutations in the NT5C2 nucleotidase gene drive chemotherapy resistance in relapsed ALL. Nat. Med. 19, 368–371 (2013)
  27. Somatic mutation in PIK3CA is a late event in cervical carcinogenesis. J. Pathol. Clin. Res. 1, 207–211 (2015)
  28. Gene fusion with an ETS DNA-binding domain caused by chromosome translocation in human tumours. Nature 359, 162–165 (1992)
  29. IntOGen-mutations identifies cancer drivers across tumor types. Nat. Methods 10, 1081–1082 (2013)
  30. Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci. Rep. 3, 2650 (2013)
  31. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013)
  32. G1 tetraploidy checkpoint and the suppression of tumorigenesis. J. Cell. Biochem. 88, 673–683 (2003)
  33. A cell-based model system links chromothripsis with hyperploidy. Mol. Syst. Biol. 11, 828 (2015)
  34. Chromothripsis and cancer: causes and consequences of chromosome shattering. Nat. Rev. Cancer 12, 663–670 (2012)
  35. Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71 (2012)
  36. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011)
  37. Polycomb group gene BMI1 controls invasion of medulloblastoma cells and inhibits BMP-regulated cell adhesion. Acta Neuropathol. Commun. 2, 10 (2014)
  38. The H3K36 demethylase Jhdm1b/Kdm2b regulates cell proliferation and senescence through p15Ink4b. Nat. Struct. Mol. Biol. 15, 1169–1175 (2008)
  39. Forkhead transcription factor FOXF1 is a novel target gene of the p53 family and regulates cancer cell migration and invasiveness. Oncogene 33, 4837–4846 (2014)
  40. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014)
  41. TERT rearrangements are frequent in neuroblastoma and identify aggressive tumors. Nat. Genet. 47, 1411–1414 (2015)
  42. The tyrosine phosphatase PTPRD is a tumor suppressor that is frequently inactivated and mutated in glioblastoma and other human cancers. Proc. Natl Acad. Sci. USA 106, 9435–9440 (2009)
  43. SUZ12 is required for both the histone methyltransferase activity and the silencing function of the EED–EZH2 complex. Mol. Cell 15, 57–67 (2004)
  44. The whole-genome landscape of medulloblastoma subtypes. Nature 547, 311–317 (2017)
  45. RIM genes differentially contribute to organizing presynaptic release sites. Proc. Natl Acad. Sci. USA 109, 11830–11835 (2012)
  46. Emerging landscape of oncogenic signatures across human cancers. Nat. Genet. 45, 1127–1133 (2013)
  47. Implementation of mechanism of action biology-driven early drug development for children with cancer. Eur. J. Cancer 62, 124–131 (2016)
  48. Use of poly ADP-ribose polymerase [PARP] inhibitors in cancer cells bearing DDR defects: the rationale for their inclusion in the clinic. J. Exp. Clin. Cancer Res. 35, 179 (2016)
  49. Somatic histone H3 alterations in pediatric diffuse intrinsic pontine gliomas and non-brainstem glioblastomas. Nat. Genet. 44, 251–253 (2012)
  50. The genomic landscape of diffuse intrinsic pontine glioma and pediatric non-brainstem high-grade glioma. Nat. Genet. 46, 444–450 (2014)
  51. Association of age at diagnosis and genetic mutations in patients with neuroblastoma. J. Am. Med. Assoc. 307, 1062–1071 (2012)
  52. Recurrent somatic structural variations contribute to tumorigenesis in pediatric osteosarcoma. Cell Rep. 7, 104–112 (2014)
  53. Genomic landscape of paediatric adrenocortical tumours. Nat. Commun. 6, 6302 (2015)
  54. Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas. Nat. Genet. 45, 602–612 (2013)
  55. C11orf95-RELA fusions drive oncogenic NF-κB signalling in ependymoma. Nature 506, 451–455 (2014)
  56. Targeting oxidative stress in embryonal rhabdomyosarcoma. Cancer Cell 24, 710–724 (2013)
  57. The landscape of somatic mutations in infant MLL-rearranged acute lymphoblastic leukemias. Nat. Genet. 47, 330–337 (2015)
  58. An Inv(16)(p13.3q24.3)-encoded CBFA2T3-GLIS2 fusion protein defines an aggressive subtype of pediatric acute megakaryoblastic leukemia. Cancer Cell 22, 683–697 (2012)
  59. The genomic landscape of hypodiploid acute lymphoblastic leukemia. Nat. Genet. 45, 242–252 (2013)
  60. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 481, 329–334 (2012)
  61. The genomic landscape of core-binding factor acute myeloid leukemias. Nat. Genet. 48, 1551–1556 (2016)
  62. Novel mutations target distinct subgroups of medulloblastoma. Nature 488, 43–48 (2012)
  63. International Cancer Genome Consortium PedBrain Tumor Project. Recurrent MET fusion genes represent a drug target in pediatric glioblastoma. Nat. Med. 22, 1314–1320 (2016)
  64. Mutations in the SIX1/2 pathway and the DROSHA/DGCR8 miRNA microprocessor complex underlie high-risk blastemal type Wilms tumors. Cancer Cell 27, 298–311 (2015)
  65. Ras pathway mutations are prevalent in relapsed childhood acute lymphoblastic leukemia and confer sensitivity to MEK inhibition. Blood 124, 3420–3430 (2014)
  66. The activating STAT5B N642H mutation is a common abnormality in pediatric T-cell acute lymphoblastic leukemia and confers a higher risk of relapse. Haematologica 99, e188–e192 (2014)
  67. Epigenomic alterations define lethal CIMP-positive ependymomas of infancy. Nature 506, 445–450 (2014)
  68. The genomic landscape of hepatoblastoma and their progenies with HCC-like features. J. Hepatol. 61, 1312–1320 (2014)
  69. Exome sequencing of osteosarcoma reveals mutation signatures reminiscent of BRCA deficiency. Nat. Commun. 6, 8940 (2015)
  70. Deep sequencing in conjunction with expression and functional analyses reveals activation of FGFR1 in Ewing sarcoma. Clin. Cancer Res. 21, 4935–4946 (2015)
  71. Genome sequencing of SHH medulloblastoma predicts genotype-related response to smoothened inhibition. Cancer Cell 25, 393–405 (2014)
  72. Negative feedback-defective PRPS1 mutants drive thiopurine resistance in relapsed childhood ALL. Nat. Med. 21, 563–571 (2015)
  73. Pediatric T-cell lymphoblastic leukemia evolves into relapse by clonal selection, acquisition of mutations and promoter hypomethylation. Haematologica 100, 1442–1450 (2015)
  74. Recurrent somatic alterations of FGFR1 and NTRK2 in pilocytic astrocytoma. Nat. Genet. 45, 927–932 (2013)
  75. Alex’s Lemonade Stand Foundation infant and childhood primary brain and central nervous system tumors diagnosed in the United States in 2007–2011. Neuro-oncol. 16 (Suppl 10), x1–x36 (2015)
  76. Molecular classification of ependymal tumors across all CNS compartments, histopathological grades, and age groups. Cancer Cell 27, 728–743 (2015)
  77. Medulloblastomics: the end of the beginning. Nat. Rev. Cancer 12, 818–834 (2012)
  78. Three distinct subgroups of hypodiploidy in acute lymphoblastic leukaemia. Br. J. Haematol. 125, 552–559 (2004)
  79. Acute lymphoblastic leukemia. N. Engl. J. Med. 350, 1535–1548 (2004)
  80. German Childhood Cancer Registry – Report 2013/14 (1980–2013) (Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Univ. Medical Center of Johannes Gutenberg Univ., 2014)
  81. Data analysis: Create a cloud commons. Nature 523, 149–151 (2015)
  82. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001)
  83. Dissecting the genomic complexity underlying medulloblastoma. Nature 488, 100–105 (2012)
  84. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010)
  85. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339 (2012)
  86. A global reference for human genetic variation. Nature 526, 68–74 (2015)
  87. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009)
  88. Parent-specific copy number in paired tumor–normal studies using circular binary segmentation. Bioinformatics 27, 2038–2046 (2011)
  89. xsample(): An R function for sampling linear inverse problems. J. Stat. Softw. 30, 1–15 (2009)
  90. Clustered mutations in yeast and in human cancers can arise from damaged long single-strand DNA regions. Mol. Cell 46, 424–435 (2012)
  91. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 3, 246–259 (2013)
  92. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012)
  93. A census of human cancer genes. Nat. Rev. Cancer 4, 177–183 (2004)
  94. Mutationmapper: a tool to aid the mapping of protein mutation data. PLoS ONE 8, e71711 (2013)
  95. Criteria for inference of chromothripsis in cancer genomes. Cell 152, 1226–1236 (2013)
  96. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557–572 (2004)
  97. Common fragile site profiling in epithelial and erythroid cells reveals that most recurrent cancer deletions lie in fragile sites hosting large genes. Cell Rep. 4, 420–428 (2013)
  98. An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (2015)



This project was mainly supported and funded by the German Cancer Consortium (DKTK Pediatric Malignancies Joint Funding Project) and German Cancer Aid (#108128) and Deutsche Kinderkrebsstiftung (German Cancer Childhood Foundation) for the INFORM project. Additional support came from the German Ministry for Education and Research (BMBF #01KU1201A) and the German Cancer Aid (#109252) for the ICGC (International Cancer Genome Consortium) PedBrain Tumor Project and the ICGC MMML-Seq Project (within Program for Medical Genome Research Grants #01KU1002A–#01KU1002J), and the BioTOP Project (#01EK1502B). This work was also supported by an ERC starting grant to J.O.K. (#336045), MMML-MYC-SYS (#0316166I), ICGC DE-Mining (#01KU1505G), the Heidelberg Center for Personalized Oncology (DKFZ-HIPO) and the BMBF-funded Heidelberg Center for Human Bioinformatics (HD-HuB) within the German Network for Bioinformatics Infrastructure (de.NBI) (#031A537A, #031A537C). For technical support and expertise we thank the DKFZ Genomics and Proteomics Core Facility, M. Hain from the Division of Molecular Genetics (DKFZ), N. Jaeger and R. Kabbe from the Department of Pediatric Neurooncology (DKFZ), and S. Oelmez from the Data Management Group (DKFZ). We further thank members and technical staff of the ICGC MMML-Seq (International Cancer Genome Consortium Molecular Mechanisms in Malignant Lymphoma by Sequencing) and the European Renal Tumor Study Group (SIOP-RTSG).

Author information

Author notes

    • Susanne N. Gröbner
    • Barbara C. Worst

    These authors contributed equally to this work.

    • Lukas Chavez
    • Marc Zapatka
    • Stefan M. Pfister

    These authors jointly supervised this work.


  1. Hopp-Children’s Cancer Center at the NCT Heidelberg (KiTZ), Heidelberg, Germany

    • Susanne N. Gröbner
    • , Barbara C. Worst
    • , Pascal D. Johann
    • , Gnana Prakash Balasubramanian
    • , Sebastian Brabetz
    • , Sebastian Bender
    • , Dominik Sturm
    • , Elke Pfaff
    • , Serap Erkek
    • , Sander Lambo
    • , Andreas E. Kulozik
    • , Hendrik Witt
    • , Olaf Witt
    • , Cornelis M. van Tilburg
    • , Kristian W. Pajtler
    • , Marcel Kool
    • , David T. W. Jones
    • , Lukas Chavez
    •  & Stefan M. Pfister
  2. Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Susanne N. Gröbner
    • , Barbara C. Worst
    • , Pascal D. Johann
    • , Gnana Prakash Balasubramanian
    • , Sebastian Brabetz
    • , Sebastian Bender
    • , Dominik Sturm
    • , Elke Pfaff
    • , Serap Erkek
    • , Sander Lambo
    • , Hendrik Witt
    • , Paul A. Northcott
    • , Kristian W. Pajtler
    • , Marcel Kool
    • , David T. W. Jones
    • , Lukas Chavez
    •  & Stefan M. Pfister
  3. German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Susanne N. Gröbner
    • , Barbara C. Worst
    • , Pascal D. Johann
    • , Sebastian Brabetz
    • , Barbara Hutter
    • , Dominik Sturm
    • , Elke Pfaff
    • , Serap Erkek
    • , Sander Lambo
    • , Claudia Blattmann
    • , Arndt Borkhardt
    • , Michaela Kuhlen
    • , Angelika Eggert
    • , Simone Fulda
    • , Roland Kappler
    • , Stefan Burdach
    • , Renate Kirschner-Schwabe
    • , Udo Kontny
    • , Andreas E. Kulozik
    • , Dietmar Lohmann
    • , Cornelia Eckert
    • , Michaela Nathrath
    • , Charlotte Niemeyer
    • , Günther H. Richter
    • , Johannes Schulte
    • , Frank Westermann
    • , Hendrik Witt
    • , Olaf Witt
    • , Cornelis M. van Tilburg
    • , Gudrun Fleischhack
    • , Thomas Klingebiel
    • , Benedikt Brors
    • , Ursula D. Weber
    • , Paul A. Northcott
    • , Kristian W. Pajtler
    • , Marcel Kool
    • , Rosario M. Piro
    • , David T. W. Jones
    • , Peter Lichter
    • , Lukas Chavez
    •  & Stefan M. Pfister
  4. Department of Pediatric Oncology, Hematology & Immunology, Heidelberg University Hospital, Heidelberg, Germany

    • Barbara C. Worst
    • , Pascal D. Johann
    • , Dominik Sturm
    • , Elke Pfaff
    • , Daniel Hübschmann
    • , Andreas E. Kulozik
    • , Hendrik Witt
    • , Olaf Witt
    • , Kristian W. Pajtler
    •  & Stefan M. Pfister
  5. European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany

    • Joachim Weischenfeldt
    • , Vasilisa A. Rudneva
    • , Maia Segura-Wang
    • , Serap Erkek
    • , Sebastian Waszak
    •  & Jan O. Korbel
  6. The Finsen Laboratory, Rigshospitalet, Biotech Research and Innovation Centre (BRIC), Copenhagen University, Copenhagen, Denmark

    • Joachim Weischenfeldt
  7. Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Ivo Buchhalter
    • , Kortine Kleinheinz
    • , Barbara Hutter
    • , Gideon Zipprich
    • , Michael Heinold
    • , Jürgen Eils
    • , Christian Lawerenz
    • , Benedikt Brors
    • , Matthias Schlesner
    •  & Roland Eils
  8. Department of Developmental Neurobiology, St Jude Children’s Research Hospital, Memphis, Tennessee, USA

    • Vasilisa A. Rudneva
    • , Benedikt Brors
    •  & Paul A. Northcott
  9. Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Gnana Prakash Balasubramanian
    • , Barbara Hutter
    •  & Daniel Hübschmann
  10. Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, Heidelberg University and BioQuant Center, 69120, Heidelberg, Germany

    • Daniel Hübschmann
    • , Michael Heinold
    •  & Roland Eils
  11. Klinikum Stuttgart - Olgahospital, Zentrum für Kinder-, Jugend- und Frauenmedizin, Pädiatrie, Stuttgart, Germany

    • Claudia Blattmann
    • , Stefan Bielack
    •  & Ewa Koscielniak
  12. Department of Pediatric Oncology, Hematology & Clinical Immunology, University Children’s Hospital, Heinrich Heine University, Düsseldorf, Germany

    • Arndt Borkhardt
    •  & Michaela Kuhlen
  13. Department of Pediatric Oncology/Hematology, Charité-Universitätsmedizin Berlin, Berlin, Germany

    • Angelika Eggert
    • , Renate Kirschner-Schwabe
    • , Cornelia Eckert
    •  & Johannes Schulte
  14. Institute for Experimental Cancer Research in Pediatrics, University Hospital Frankfurt, Frankfurt am Main, Germany

    • Simone Fulda
  15. Theodor-Boveri-Institute/Biocenter, Developmental Biochemistry, and Comprehensive Cancer Center Mainfranken, University of Würzburg, Würzburg, Germany

    • Manfred Gessler
    •  & Jenny Wegert
  16. Department of Pediatric Surgery, Research Laboratories, Dr von Hauner Children’s Hospital, Ludwig Maximilians University Munich, Munich, Germany

    • Roland Kappler
  17. Bone Tumor Reference Center at the Institute of Pathology, University Hospital Basel and University of Basel, Basel, Switzerland

    • Daniel Baumhoer
  18. Children’s Cancer Research Centre and Department of Pediatrics, Klinikum rechts der Isar, Technische Universität München, Munich, Germany

    • Stefan Burdach
    • , Michaela Nathrath
    •  & Günther H. Richter
  19. Division of Pediatric Hematology and Oncology, University Medical Center Aachen, Aachen, Germany

    • Udo Kontny
  20. Department of Human Genetics, University Hospital Essen, Essen, Germany

    • Dietmar Lohmann
  21. Division of Pediatric Hematology and Oncology, Department of Pediatrics, University Medical Center Freiburg, Freiburg, Germany

    • Simone Hettmer
    •  & Charlotte Niemeyer
  22. Department of Pediatric Oncology, Klinikum Kassel, Kassel, Germany

    • Michaela Nathrath
  23. Institute of Human Genetics, University of Ulm & University Hospital of Ulm, Ulm, Germany

    • Reiner Siebert
  24. Division of Neuroblastoma Genomics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Frank Westermann
  25. Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands

    • Jan J. Molenaar
  26. Innovative Therapies for Children with Cancer Consortium and Department of Clinical Research, Gustave Roussy, Université Paris-Saclay, Villejuif, France

    • Gilles Vassal
  27. Pediatric Hematology and Oncology, University Hospital Münster, Münster, Germany

    • Birgit Burkhardt
  28. Pediatric Hematology and Oncology, Hannover Medical School, Hannover, Germany

    • Christian P. Kratz
  29. Clinical Cooperation Unit Pediatric Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Olaf Witt
  30. Center for Individualized Pediatric Oncology (ZIPO) and Brain Tumors, University Hospital and German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Cornelis M. van Tilburg
  31. Division of Pediatric Hematology and Oncology, University Medical Center Göttingen, Göttingen, Germany

    • Christof M. Kramm
  32. Pediatric Oncology & Hematology, Pediatrics III, University Hospital of Essen, Essen, Germany

    • Gudrun Fleischhack
    •  & Uta Dirksen
  33. Department of Pediatric Hematology and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany

    • Stefan Rutkowski
    •  & Katja von Hoff
  34. Swabian Children’s Cancer Center, Children’s Hospital, Klinikum Augsburg, Augsburg, Germany

    • Michael Frühwald
  35. Genomics and Proteomics Core Facility, High Throughput Sequencing Unit, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Stephan Wolf
  36. Hospital for Children and Adolescents, University Hospital Frankfurt, Frankfurt, Germany

    • Thomas Klingebiel
  37. University Hospital Cologne, Klinik und Poliklinik für Kinder- und Jugendmedizin, Cologne, Germany

    • Pablo Landgraf
  38. Department of Oncogenomics, Academic Medical Center, Amsterdam, The Netherlands

    • Jan Koster
    •  & Danny A. Zwijnenburg
  39. Division of Neurosurgery, Center for Childhood Cancer Research, Department of Biomedical and Health Informatics and Center for Data-Driven Discovery in Biomedicine, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA

    • Adam C. Resnick
    •  & Pichai Raman
  40. Department of Computational Biology, St Jude Children’s Research Hospital, Memphis, Tennessee, USA

    • Jinghui Zhang
    • , Yanling Liu
    •  & Xin Zhou
  41. Division of Oncology, Center for Childhood Cancer Research, Department of Biomedical and Health Informatics and Center for Data-Driven Discovery in Biomedicine, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA

    • Angela J. Waanders
  42. Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Ursula D. Weber
    • , Rosario M. Piro
    • , Peter Lichter
    •  & Marc Zapatka
  43. Institute of Computer Science, Freie Universität Berlin, Berlin, Germany

    • Rosario M. Piro
    •  & Marc Zapatka
  44. Institute of Medical Genetics and Human Genetics, Charité University Hospital, Berlin, Germany

    • Rosario M. Piro
  45. Bioinformatics and Omics Data Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Matthias Schlesner
  46. Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Peter Lichter
    • , Ursula Weber
    • , Bernhard Radlwimmer
    •  & Marc Zapatka
  47. Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Roland Eils
    •  & Ivo Buchhalter
  48. Department of Neuropathology, Heidelberg University Hospital, Heidelberg, Germany

    • Andrey Korshunov
  49. Hopp-Children’s Cancer Center at the NCT Heidelberg (KiTZ), Heidelberg, Germany

    • Olaf Witt
    • , Stefan Pfister
    • , David Jones
    • , Peter Lichter
    •  & Natalie Jäger
  50. Clinical Cooperation Unit Pediatric Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Olaf Witt
  51. Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Stefan Pfister
    • , David Jones
    •  & Natalie Jäger
  52. Department of Neuropathology, Heinrich-Heine-University, Düsseldorf, Germany

    • Guido Reifenberger
    •  & Jörg Felsberg
  53. Division of Translational Oncology, German Cancer Research Center (DKFZ)/National Center for Tumor Diseases (NCT), Heidelberg, Germany

    • Christof von Kalle
  54. GeneWerk GmbH, Heidelberg, Germany

    • Manfred Schmidt
    •  & Cynthia Bartholomä
  55. Division of Neurosurgery, Hospital for Sick Children, Toronto, Ontario, Canada

    • Michael Taylor
  56. Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany

    • Jan Korbel
    • , Adrian Stütz
    •  & Tobias Rausch
  57. Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics (MPI-MG), Berlin, Germany

    • Marie-Laure Yaspo
    • , Hans Lehrach
    •  & Hans-Jörg Warnatz
  58. Hematology and Clinical Immunology, University Hospital, Düsseldorf, Germany

    • Pablo Landgraf
    •  & Arndt Borkhardt
  59. Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Benedikt Brors
  60. Data Management Group, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Jürgen Eils
    •  & Christian Lawerenz
  61. Institute of Human Genetics, University of Ulm and University Hospital of Ulm, Ulm, Germany

    • Reiner Siebert
    • , Ole Ammerpohl
    • , Cristina López
    •  & Rabea Wagener
  62. Institute of Human Genetics, Christian-Albrechts-University, Kiel, Germany

    • Reiner Siebert
    • , Susanne Wagner
    • , Andrea Haake
    • , Julia Richter
    • , Gesine Richter
    • , Anke K. Bergmann
    • , Alexander Claviez
    • , Ole Ammerpohl
    • , Sietse M. Aukema
    • , Cristina López
    • , Inga Nagel
    • , Inga Vater
    •  & Rabea Wagener
  63. Hematopathology Section, Institute of Pathology, Christian-Albrechts-University, Kiel, Germany

    • Julia Richter
    • , Wolfram Klapper
    • , Monika Szczepanowski
    •  & Sietse M. Aukema
  64. Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Roland Eils
    • , Chris Lawerenz
    • , Jürgen Eils
    • , Jules Kerssemakers
    • , Christina Jaeger-Schmidt
    • , Ingrid Scholz
    • , Daniel Hübschmann
    • , Kortine Kleinheinz
    •  & Matthias Schlesner
  65. Department for Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology and Bioquant, University of Heidelberg, Heidelberg, Germany

    • Roland Eils
    • , Daniel Hübschmann
    •  & Kortine Kleinheinz
  66. Department of Pediatrics, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany

  67. Department of Internal Medicine/Hematology, Friedrich-Ebert-Hospital, Neumünster, Germany

    • Christoph Borst
    •  & Siegfried Haas
  68. University Hospital Muenster - Pediatric Hematology and Oncology, Münster, Germany

    • Birgit Burkhardt
  69. University Hospital Giessen, Pediatric Hematology and Oncology, Giessen, Germany

    • Birgit Burkhardt
    • , Jasmin Lisfeld
    •  & Marius Rohde
  70. Department of Medicine III - Campus Grosshadern, University Hospital Munich, Munich, Germany

    • Martin Dreyling
  71. Department of Hematology and Oncology, Georg-August-University of Göttingen, Göttingen, Germany

    • Sonja Eberth
    • , Christina Stadler
    • , Lorenz Trümper
    •  & Dieter Kube
  72. University Hospital Würzburg, Department of Medicine and Poliklinik II, University of Würzburg, Würzburg, Germany

    • Hermann Einsele
  73. Department of Medicine III, Hematology and Oncology, Dr Horst-Schmidt-Kliniken of Wiesbaden, Wiesbaden, Germany

    • Norbert Frickhofen
  74. Senckenberg Institute of Pathology, University of Frankfurt Medical School, Frankfurt am Main, Germany

    • Martin-Leo Hansmann
  75. Department of Internal Medicine II: Hematology and Oncology, University Medical Centre, Campus Kiel, Kiel, Germany

    • Dennis Karsch
    •  & Michael Kneba
  76. Hospital of Internal Medicine II, Hematology and Oncology, St-Georg Hospital Leipzig, Leipzig, Germany

    • Luisa Mantovani-Löffler
  77. Department of Pathology, Robert-Bosch-Hospital, Stuttgart, Germany

    • German Ott
  78. Clinic for Hematology and Oncology, St-Antonius-Hospital, Eschweiler, Germany

    • Peter Staib
  79. Department for Internal Medicine III, University of Ulm and University Hospital of Ulm, Ulm, Germany

    • Stephan Stilgenbauer
  80. National Center for Tumor Diseases (NCT), Heidelberg, Germany

    • Thorsten Zenz
  81. Institute of Cell Biology (Cancer Research), University of Duisburg-Essen, Duisburg-Essen, Medical School, Essen, Germany

    • Ralf Küppers
    •  & Marc Weniger
  82. Institute of Pathology, Charité – University Medicine Berlin, Berlin, Germany

    • Michael Hummel
    •  & Dido Lenze
  83. Comprehensive Cancer Center Ulm (CCCU), University Hospital Ulm, Ulm, Germany

    • Ulrike Kostezka
  84. Institute of Pathology, University of Ulm and University Hospital of Ulm, Ulm, Germany

    • Peter Möller
  85. Institute of Pathology, University of Würzburg, Würzburg, Germany

    • Andreas Rosenwald
    • , Ellen Leich
    •  & Jordan Pischimariov
  86. Department of Pediatric Oncology, Hematology and Clinical Immunology, Heinrich-Heine-University, Düsseldorf, Germany

    • Vera Binder
    • , Arndt Borkhardt
    •  & Jessica I. Hoell
  87. German Cancer Research Center (DKFZ), Division of Molecular Genetics, Heidelberg, Germany

    • Peter Lichter
    •  & Bernhard Radlwimmer
  88. Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany

    • Philip Rosenstiel
    •  & Markus Schilhabel
  89. Department of General Internal Medicine, University Kiel, Kiel, Germany

    • Stefan Schreiber
  90. Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany

    • Stephan H. Bernhart
    • , Hans Binder
    • , Gero Doose
    • , Steve Hoffmann
    • , Lydia Hopp
    • , Helene Kretzmer
    • , David Langenberger
    •  & Peter F. Stadler
  91. Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany

    • Stephan H. Bernhart
    • , Hans Binder
    • , Gero Doose
    • , Steve Hoffmann
    • , Helene Kretzmer
    • , David Langenberger
    •  & Peter F. Stadler
  92. Transcriptome Bioinformatics, LIFE Research Center for Civilization Diseases, University of Leipzig, Leipzig, Germany

    • Stephan H. Bernhart
    • , Gero Doose
    • , Steve Hoffmann
    • , Helene Kretzmer
    • , David Langenberger
    •  & Peter F. Stadler
  93. Division of Applied Bioinformatics (G200), German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Benedikt Brors
  94. Department of Pediatric Immunology, Hematology and Oncology, University Hospital, Heidelberg, Germany

    • Daniel Hübschmann
  95. Institute for Medical Informatics Statistics and Epidemiology, University of Leipzig, Leipzig, Germany

    • Markus Kreuz
    • , Markus Loeffler
    •  & Maciej Rosolowski
  96. EMBL Heidelberg, Genome Biology, Heidelberg, Germany

    • Jan Korbel
    •  & Stephanie Sungalee
  97. Bioinformatics and Omics Data Analytics (B240), German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Matthias Schlesner
  98. RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology IZI, Leipzig, Germany

    • Peter F. Stadler
  99. Santa Fe Institute, Santa Fe, New Mexico, USA

    • Peter F. Stadler
  100. Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany

    • Peter F. Stadler


  1. ICGC PedBrain-Seq Project

  2. ICGC MMML-Seq Project



S.N.G. and B.C.W. performed data analysis and interpretation. S.N.G., J.W., I.B., K.K., V.A.R., G.P.B., M.S.-W., B.H., D.H., G.Z., M.H., J.E., C.L., and S.L. established workflows and performed data processing. P.D.J., S.Br., S.Be., D.S., E.P., S.E., S.W., U.K., J.J.M., G.V., C.P.K., M.Ko., D.T.W.J., L.C., and M.Z. contributed to design and interpretation of the analyses. P.D.J., D.H., C.B., A.B., M.Ku., S.F., J.W., R.K., D.B., A.E., S.Bu., R.K.-S., A.E.K., D.L., S.H., C.E., S.Bi., M.N., C.N., G.H.R., J.S., R.S., F.W., H.W., B.Bu., U.D., O.W., C.M.v.T., C.M.K., G.F., S.R., M.F., M.G., J.W., K.v.H., S.W., P.L., T.K., E.K., P.A.N., K.W.P., and M.Ko. provided data and patient materials. J.K., A.C.R., J.Z., Y.L., X.Z., A.J.W., D.A.Z., and P.R. established the databases. S.N.G., B.C.W., D.T.W.J. and S.M.P. prepared the manuscript and figures. B.Br., U.D.W., M.Ko., R.M.P., J.O.K., M.S., R.E., D.T.W.J., P.L., L.C., M.Z., and S.M.P. contributed to project management and provided leadership.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Stefan M. Pfister.

Reviewer Information Nature thanks S. Chanock and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Supplementary information

PDF files

  1.

    Life Sciences Reporting Summary

  2.

    Supplementary Information

    This file contains Supplementary Note 1.

Excel files

  1.

    Supplementary Tables

    This file contains Supplementary Tables 1–24.


This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons licence, users will need to obtain permission from the licence holder to reproduce the material. To view a copy of this licence, visit