As a first phase of the International Cancer Genome Consortium (ICGC) PedBrain Tumor Project, we have collected matched tumour and germline samples from 125 medulloblastoma patients aged from 0 to 17 years (Supplementary Table 1). Whole-genome sequencing (WGS, n = 39) and whole-exome sequencing (WES, n = 21) were applied to a ‘discovery’ set, with a custom-capture approach used to sequence 2,734 genes in an additional ‘replication’ set (n = 65). All tumour samples were obtained at primary diagnosis, before adjuvant therapy, and the distribution of molecular subgroups was similar across cohorts (Supplementary Fig. 1).

Investigation of genome-wide somatic mutation allele frequencies identified several cases with a clear peak at approximately 25%, rather than the approximately 50% allele frequency expected for early, heterozygous events (Fig. 1a). Analysis of coverage depth and allele frequencies in regions of copy-number change ruled out stromal contamination and instead indicated a tetraploid baseline in the tumour genome (Fig. 1b). Predicted ploidy status was confirmed by fluorescence in situ hybridization (FISH) using multiple centromeric probes in 17 out of 18 cases analysed (Fig. 1a). The extremely low fraction of mutations at approximately 50% allele frequency indicates that genome duplication occurred very early during tumorigenesis. Some cases probably went through even higher polyploidy states before reaching an approximately 4n baseline (for example ICGC_MB45, displaying 4n chromosomes with 4:0 or 3:1 allele ratios; Supplementary Fig. 2). Across the discovery set, tetraploidy was most commonly observed in Group 3 (7 out of 13, 54%) and Group 4 tumours (8 out of 20, 40%), followed by SHH (4 out of 14, 29%) and WNT tumours (1 out of 7, 14%). Interestingly, the four tetraploid SHH tumours all harboured TP53 mutations and also displayed chromothripsis6. Tetraploid Group 3 and 4 tumours showed significantly more large-scale copy number alterations compared with diploid cases (median 10 changes per tumour in tetraploid versus 4 per tumour in diploid cases, P = 0.008, two-tailed Mann–Whitney U-test; Supplementary Fig. 3). Thus, tetraploidy followed by genomic instability may be an early driving event in a large proportion of Group 3 and 4 medulloblastomas, which pose a significant clinical challenge due to their dismal prognosis and lack of targeted treatment options. Novel classes of drugs such as mitotic checkpoint kinase or kinesin inhibitors, which target the maintenance of tetraploidy through successive cell divisions, may therefore represent a rational therapeutic strategy in these cases7,8. The value of tetraploidy as a prognostic marker also requires further investigation.
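
The allele-frequency reasoning in this paragraph can be illustrated with a short calculation. The following is a minimal sketch and not the ploidy-prediction approach used in the study: the function name `expected_vaf` and the `purity` parameter are assumptions introduced here, and the model assumes a clonal mutation, a diploid non-tumour (stromal) compartment and unbiased read sampling.

```python
def expected_vaf(mutated_copies, tumour_copies, purity=1.0, normal_copies=2):
    """Expected fraction of reads supporting a clonal somatic mutation.

    mutated_copies: tumour chromosome copies carrying the mutation
    tumour_copies:  total copies of the locus in a tumour cell
    purity:         fraction of cells in the sample that are tumour cells
    normal_copies:  copies of the locus in contaminating normal cells
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if not 0 <= mutated_copies <= tumour_copies:
        raise ValueError("mutated_copies must be between 0 and tumour_copies")
    mutant_signal = purity * mutated_copies
    total_signal = purity * tumour_copies + (1.0 - purity) * normal_copies
    return mutant_signal / total_signal


# Heterozygous mutation in a pure diploid tumour: 1 of 2 copies.
print(expected_vaf(1, 2))              # 0.5
# Mutation acquired after genome duplication: 1 of 4 copies.
print(expected_vaf(1, 4))              # 0.25
# Mutation acquired before genome duplication: 2 of 4 copies.
print(expected_vaf(2, 4))              # 0.5
# Diploid tumour diluted by 50% normal cells: also 1 of 4 copies overall.
print(expected_vaf(1, 2, purity=0.5))  # 0.25
```

Under these assumptions, a diploid tumour at 50% purity and a pure tetraploid tumour both give a peak near 25%, so the allele-frequency peak alone cannot separate the two explanations. This is consistent with the use of coverage depth and allele frequencies at copy-number changes, and of FISH, to distinguish them.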

Figure 1: Tetraploidy is a frequent early event in medulloblastoma tumorigenesis, and mutation rates vary with age and subgroup.

a, Distributions of genome-wide somatic mutation allele frequencies (the proportion of sequence reads supporting a mutation) for diploid tumours (with a peak at 50% for heterozygous events, n = 7) and tetraploid cases (with a peak at 25%, n = 7). Insets show centromeric FISH for chromosomes 1 (red) and 11 (green), confirming the predicted ploidy status. b, Top left, rescaled tumour:germline coverage ratio, indicating copy-number gains (red) or losses (green). Bottom left, B-allele frequency (BAF) in the tumour at SNP positions which are heterozygous in the germ line. Right, genome alteration print (GAP) of segmented copy number and allele frequency profiles. Chromosomes with predicted 3:0/2:1/3:2 allele ratios show a BAF of approximately 0/0.33/0.4 and coverage ratios of approximately 0.75/0.75/1.25. Owing to random sampling, the observed BAF for the 2:2 allele ratio is slightly below 0.5. c, Genome-wide somatic mutation rates are positively correlated with patient age (n = 39). Grp, Group. d, Distribution of somatic mutation rates by tumour subgroup (n = 39). P values are according to a Wilcoxon rank-sum test with Bonferroni correction. SHH-p53, SHH-subgroup tumours harbouring a somatic or germline TP53 mutation.
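
The BAF and coverage values quoted in panel b follow directly from the allele ratios once a tetraploid baseline is assumed. The snippet below is a hedged illustration of that arithmetic only; the helper name `allele_profile` is introduced here, BAF is taken to be the minor-allele fraction, and a pure tumour sample without sampling noise is assumed.

```python
def allele_profile(major, minor, baseline_ploidy=4):
    """Return (BAF, coverage ratio) for a segment with major:minor copies.

    BAF is taken as the minor-allele fraction; the coverage ratio is the
    total copy number relative to the assumed baseline ploidy.
    """
    total = major + minor
    if total <= 0 or minor < 0 or minor > major:
        raise ValueError("expected major >= minor >= 0 and a non-zero total")
    return minor / total, total / baseline_ploidy


for major, minor in [(3, 0), (2, 1), (3, 2), (2, 2)]:
    baf, cov = allele_profile(major, minor)
    print(f"{major}:{minor}  BAF = {baf:.2f}  coverage ratio = {cov:.2f}")
```

With a baseline of four copies, this reproduces the values in the legend: 3:0, 2:1 and 3:2 segments give BAFs of 0, 0.33 and 0.4 and coverage ratios of 0.75, 0.75 and 1.25, while a balanced 2:2 segment gives 0.5 and 1.0 before sampling effects.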

The average somatic mutation rate in the WGS cohort was 0.52 per megabase (Mb), with an average of 10.3 non-synonymous coding single-nucleotide variants (SNVs) per case in the discovery cohort (Supplementary Table 2). This is slightly higher than previously reported for medulloblastoma9, possibly due to improved coverage and technical sensitivity, but considerably lower than in deep-sequenced adult tumours (for example, refs 10, 11). There were significantly fewer transitions among the somatic alterations than in germline variation (P = 4.6 × 10⁻⁷, Wilcoxon rank-sum test; Supplementary Fig. 4). All coding somatic SNVs identified in the combined cohort are listed in Supplementary Table 3.

We identified a positive correlation between genome-wide mutation rate and patient age, as previously reported for coding mutations9 (r² = 0.35, P = 7.8 × 10⁻⁵, Pearson’s product–moment correlation; Fig. 1c). Intriguingly, this association was more pronounced in diploid tumours (r² = 0.52, P = 3 × 10⁻⁵), and virtually absent in tetraploid cases (r² = 0.04, P = 0.5) (Supplementary Fig. 5a, b). A similar trend was observed for non-synonymous mutations across the discovery cohort (Supplementary Fig. 5c). Coverage level did not correlate with mutation rate (Supplementary Fig. 5d). One explanation may be that all medulloblastomas originate during embryogenesis, with some tumours needing to accumulate more genetic ‘hits’ before becoming symptomatic. Alternatively, tumours arising in older patients may derive from more differentiated cells that require a greater number of alterations to undergo malignant transformation. Investigation of additional tumours from older patients may help to clarify this.
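
For readers less familiar with the statistic reported here, the sketch below shows how a Pearson product-moment correlation coefficient and its square can be computed. It is illustrative only: the function name `pearson_r` is introduced here, and the age and mutation-rate values are invented toy numbers, not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Invented toy values for illustration only (NOT study data):
toy_age_years = [1, 3, 5, 8, 12, 16]
toy_rate_per_mb = [0.2, 0.4, 0.3, 0.6, 0.5, 0.9]

r = pearson_r(toy_age_years, toy_rate_per_mb)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

The squared coefficient is the proportion of variance in one variable explained by a linear relationship with the other, which is why the values quoted above are reported as r² rather than r.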

Five SHH tumours harbouring TP53 mutations, including three previously described Li–Fraumeni syndrome (LFS)-associated tumours with germline mutations6, one newly identified LFS case (ICGC_MB23), and one somatically mutated tumour (ICGC_MB34), had significantly more mutations than the remaining cases, both genome wide (mean 1.1 per Mb versus 0.43 per Mb, P = 4.5 × 10⁻⁶; two-tailed t-test) and for non-synonymous changes (mean 23 versus 8.8, P = 2.6 × 10⁻⁶). Interestingly, the WNT subgroup, which typically shows a good prognosis and few copy-number changes, had the next highest mutation rate (Fig. 1d).

Forty-one somatic, coding, small insertions/deletions (Indels) were identified across the cohort, with an average of 0.4 coding Indels per case in the discovery set (range 0–2; Supplementary Table 4). Some genes, however, were more commonly affected by Indels than SNVs. For example, frameshift Indels in PTCH1 were detected in 6 out of 125 cases, whereas only 2 SNVs were observed. Recurrent Indels were also seen in the chromatin modifiers MLL2, KDM6A (3 cases each) and BCOR (2 cases).

In contrast to another paediatric brain tumour, glioblastoma, in which we recently identified frequently recurrent hotspot mutations12, the majority of mutated genes in this study were unique to a single case (587 out of 760 non-synonymous SNVs in the 125 cases, 77%), demonstrating the pronounced genetic heterogeneity of medulloblastoma. Twenty-five of these singleton mutations, and 53 SNVs in total, were at positions listed in the COSMIC database of somatic alterations in tumours, suggesting a rare but important contribution of many known cancer genes in medulloblastoma (Supplementary Table 5). Only eight genes were somatically altered in more than 3% of the whole series: CTNNB1 (15 cases, 12%); DDX3X (10 cases, 8%); PTCH1 (8 cases, 6%); SMARCA4 (6 cases, 5%); MLL2 (6 cases, 5%); TP53 (somatically mutated in 5 cases, 4%); KDM6A (5 cases, 4%); and CTDNEP1 (4 cases, 3%) (Fig. 2). These were also the only genes found to be significantly altered upon analysis of the combined cohort with MutSig, an algorithm that tests whether the mutations observed in a gene exceed what would be expected from random background mutation processes alone. It takes into account gene length and composition, silent to non-silent mutation ratios, and other factors (Supplementary Table 6). Large-scale copy-number changes known to be associated with medulloblastoma, such as formation of an isodicentric 17q and losses of 10q/9q/X13,14,15, were more frequently recurrent than SNVs (Supplementary Fig. 6a–e).
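
The recurrence figures in this paragraph can be re-derived from the case counts. The snippet below is a simple arithmetic check using only the numbers quoted in the text; the variable and function names are introduced here for illustration, and it does not reproduce the MutSig analysis.

```python
COHORT_SIZE = 125  # combined discovery and replication cohorts

# Number of cases with a somatic alteration, as quoted in the text.
altered_cases = {
    "CTNNB1": 15,
    "DDX3X": 10,
    "PTCH1": 8,
    "SMARCA4": 6,
    "MLL2": 6,
    "TP53": 5,
    "KDM6A": 5,
    "CTDNEP1": 4,
}


def percent(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


for gene, count in altered_cases.items():
    print(f"{gene:8s} {count:2d} cases  {percent(count, COHORT_SIZE):.1f}%")

# Share of non-synonymous SNVs falling in genes mutated in a single case.
singleton_share = percent(587, 760)
print(f"singleton SNVs: {singleton_share:.1f}%")
```

Rounded to whole numbers, these values match the percentages given above, and every listed gene lies above the 3% threshold (the lowest, CTDNEP1, is at 3.2%).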

Figure 2: Subgroup specificity of common genetic alterations.

Summary of clinical data and recurrent alterations in the combined cohort (n = 125). Genes which were found to be significantly mutated by MutSig analysis were included. UPD, uniparental disomy; ND, no material available for conclusive molecular subgroup assignment.

Many alterations were enriched in specific medulloblastoma subgroups. For example, all of the WNT tumours (15 out of 15) harboured a mutation in CTNNB1, and 13 out of 15 displayed loss of one copy of chromosome 6 (or acquired uniparental disomy in one case), alterations which have previously been associated with this subgroup4,13,15. Mutations in DDX3X were also clearly enriched in WNT tumours (adjusted P = 7.06 × 10⁻⁶, two-tailed Fisher’s exact test with a Bonferroni correction), and these mutations were clustered within the helicase domain (Supplementary Fig. 7a). Three were localized at the RNA-binding surface of the protein and three were predicted to disrupt the closed (RNA-binding) conformation (Supplementary Fig. 7b). The remainder were predicted to disrupt indirectly either the positive charge on the RNA-binding surface (n = 2) or the folding of the closed form (n = 2). No truncating mutations were found, indicating an alteration rather than simply a loss of function. DDX3X has recently been proposed to have an oncogenic role10,11, although its exact function in tumorigenesis remains to be determined.

As anticipated from previous studies13,16, SHH tumours frequently showed loss of the whole of chromosome arm 9q, as well as alterations in key hedgehog-pathway signalling molecules (for example, PTCH1, altered in 8 cases; MYCN, amplified in 5 cases; and SMO, mutated in ICGC_MB12).

The most frequently mutated gene in Group 3 tumours was SMARCA4 (3 out of 26 cases). As with DDX3X, these mutations were clustered in the helicase domain (Supplementary Fig. 7a). As noted above, tetraploidy was also a common event in this subgroup and in Group 4 tumours. Recurrent truncating mutations in KDM6A (on chromosome X, which frequently shows copy-number loss in female Group 3 and 4 medulloblastoma patients; also known as UTX), encoding a histone 3 lysine 27 (H3K27) demethylase, were also seen in Group 4 (4 out of 40, 10%), indicating a tumour-suppressive role in this subgroup, as previously described for other cancers17. CTDNEP1 (a homologue of the Xenopus gene dullard), was also affected by truncating alterations in four tumours. In three of these cases, the mutation was accompanied by loss of the wild-type allele through isodicentric 17q formation. This gene, encoding a nuclear envelope phosphatase, was shown in Xenopus to have roles in BMP signalling and neural development18. In mammalian cells it is involved in the lipin activation pathway, regulating nuclear membrane biogenesis and production of diacylglycerol19,20. Given the high frequency of isodicentric 17q in medulloblastoma, genetic targets on this chromosome have long been sought after. CTDNEP1 may be a good candidate for one of the medulloblastoma tumour suppressors on 17p.

Aside from these subgroup-enriched events, a commonly recurring theme across all medulloblastomas is the alteration of genes involved in chromatin modification. Some point mutations and DNA copy number alterations in this pathway have previously been implicated in medulloblastoma9,21. Overall, 45 out of 125 cases (36%) harboured a mutation in a gene categorized under the Gene Ontology term ‘Chromatin Modification’ (GO:0016568, Supplementary Fig. 6f, g).

We recently described an enrichment of catastrophic DNA rearrangements (‘chromothripsis’) in TP53-mutated SHH medulloblastomas6. Three new TP53-mutant SHH tumours were identified in this study: ICGC_MB23 (germline mutation), MBRep_T29 and MBRep_T53 (somatic mutations). Two of these, ICGC_MB23 and MBRep_T53, showed complex genomic rearrangements indicative of the chromothripsis model (Supplementary Fig. 8)22.

Deep sequencing also allowed fine mapping of two amplicons on chromosome 7 in ICGC_MB34 (a SHH tumour with a somatic TP53 mutation, relating to MB2034 in ref. 6). One amplicon included the entire SHH gene, whereas the second disrupted DNAJB6, such that its first exon was juxtaposed to SHH (Fig. 3a, b). RNA sequencing further revealed a novel fusion transcript, as expected from the DNA data, containing the first exon of DNAJB6 and exons 2 and 3 of SHH. The first exon of SHH was skipped, resulting in a predicted amino-terminally truncated SHH protein (Fig. 3c). Expression of SHH was extremely high in this case, whereas it was virtually absent in 301 other medulloblastomas (Supplementary Fig. 9a). Predicted DNA and RNA junctions were validated by PCR (Supplementary Fig. 9b).

Figure 3: Identification of novel fusion genes in medulloblastoma.

a, Read-depth plot with log2 tumour:germline coverage ratio showing alterations on chromosome 7 in ICGC_MB34. Lines indicate connected segments. b, Schematic of the rearrangement. c, Details of the SHH fusion gene structure and support for its expression, derived from RNA sequencing data. aa, amino acids.

Several additional in-frame gene fusions were identified by large insert mate-pair sequencing, which gives better resolution for structural variant detection. ICGC_MB18, for example, carried an intrachromosomal translocation resulting in a fusion between LCLAT1 and ERBB4, the latter of which has previously been associated with medulloblastoma oncogenesis23 (Supplementary Fig. 9c–f). In ICGC_MB6, a complex rearrangement of fragments from chromosomes 1 and 17 produced a fusion between MLLT6 and MRPL45, a mitochondrial ribosomal protein, resulting in strong overexpression of the latter (Supplementary Fig. 10a–c). These findings indicate that gene fusions involving well-established medulloblastoma oncogenes may have a more important role in medulloblastoma than previously recognized, and warrant further investigation.

High-coverage, strand-specific RNA sequencing of 28 cases allowed us to determine the proportion of DNA SNVs that were observable in the transcriptome (Supplementary Tables 3 and 4). Overall, 129 out of 268 (48%) non-synonymous mutations in the DNA were also detectable at the RNA level. A further 38% (101 out of 268) resided in genes expressed at extremely low abundance (reads per kilobase of exon model per million mapped reads (RPKM) < 1). Thus, the number of expressed mutations is even smaller than the already low number of DNA alterations, supporting the hypothesis that very few driving hits are needed to generate this paediatric tumour. It may also be the case that some mutations required for tumour initiation are not essential for later tumour cell maintenance.
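
The RPKM threshold used above follows from the definition spelled out in the text (reads per kilobase of exon model per million mapped reads). The sketch below applies that definition; the function names and the example read counts are hypothetical and are not taken from the study's pipeline or data.

```python
LOW_ABUNDANCE_RPKM = 1.0  # threshold quoted in the text (RPKM < 1)


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # reads / (length in kb) / (library size in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def is_low_abundance(read_count, exon_length_bp, total_mapped_reads):
    """True if the gene falls below the low-abundance threshold."""
    return rpkm(read_count, exon_length_bp, total_mapped_reads) < LOW_ABUNDANCE_RPKM


# Hypothetical gene: 50 reads on a 2 kb exon model in a 40-million-read library.
print(rpkm(50, 2_000, 40_000_000))              # 0.625
print(is_low_abundance(50, 2_000, 40_000_000))  # True

# Fractions quoted in the text, recomputed from the counts.
print(f"detected in RNA: {100 * 129 / 268:.1f}%")
print(f"in low-abundance genes: {100 * 101 / 268:.1f}%")
```

Because RPKM normalizes for both transcript length and sequencing depth, a fixed cut-off such as RPKM < 1 can be applied across genes and samples of different sizes.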

RNA sequencing further revealed monoallelic expression of a heterozygous mutation in TBR1, producing a p.G275C change, which was also seen in a previous study9 (Supplementary Fig. 11a). TBR1 encodes a T-box transcription factor involved in brain development24. This gene, and a second family member, EOMES (or TBR2), clearly showed subgroup-specific differential expression (Fig. 4a). Sequencing of TBR1 exon 2 in a further 85 medulloblastomas revealed one additional case with an identical mutation. All three mutated tumours were in Group 4. Gene expression was also strongly correlated with DNA methylation for both TBR1 and EOMES (Fig. 4b, c and Supplementary Fig. 11b, c), and expression of TBR1 and EOMES was inversely correlated in Group 4 tumours (Fig. 4d), giving subsets that are either TBR1-methylated and EOMES-high or EOMES-methylated and TBR1-high (Supplementary Fig. 11d, e). These two genes are markers for different stages of neuronal lineage commitment, suggesting possible differences in cell-of-origin or differentiation within Group 4 subpopulations25.

Figure 4: Integration of mutation, expression and methylation data shows differential regulation of TBR1 and EOMES in medulloblastoma.

a, Microarray data showing clear differences in TBR1 and EOMES expression between medulloblastoma subgroups (n = 301). b, DNA methylation of TBR1 (n = 54), ranging from low (blue) to high (red). Horizontal red bar indicates the region used for correlation analysis in c. c, Expression of TBR1 is tightly correlated with gene methylation (n = 54; Pearson’s correlation values, r). SHH tumours show high methylation and virtually no expression, whereas WNT, Group 3 and Group 4 tumours display a more varied pattern. d, Expression levels of TBR1 (diamonds) and EOMES (circles) are inversely related in Group 4 tumours (n = 104).

This large, integrative genomics study has provided detailed insight into new mechanisms contributing to medulloblastoma tumorigenesis and has disclosed novel targets for therapeutic approaches, especially for Group 3 and 4 patients. The molecular subgroup-related enrichment of many alterations highlights the importance of considering this distinguishing factor in research, trial design and clinical practice.

Methods Summary

All patient material was collected after receiving informed consent according to ICGC guidelines and as approved by the institutional review boards of the contributing centres. Tumour subgrouping was based on gene expression profiling or immunohistochemical analysis as described in ref. 5.

Next generation sequencing was performed using Illumina technologies. Mean DNA sequence coverage was 35-fold for whole-genome cases (range 26–56×), whereas mean on-target coverage in the whole-exome and replication cohorts was 68-fold (74% of targets above 20× for whole exome, 66% for the replication cohort). Exome capture was carried out with Agilent SureSelect (Human All Exon 50 Mb and XT Custom Library) in-solution reagents. Sequence data were aligned to the hg19 human reference genome assembly; duplicate and non-uniquely mapping reads were excluded. Tumour ploidy was predicted from sequencing data by a novel approach integrating copy number aberrations with allele frequencies. A subset of sequence variants were validated using PCR and Sanger sequencing. Verification rates were 95% (128 out of 135) for SNVs and 100% (14 out of 14) for Indels (Supplementary Tables 3 and 4). A complete description of the materials and methods is provided in the Supplementary Information.