Recurrent noncoding U1 snRNA mutations drive cryptic splicing in SHH medulloblastoma

Abstract

In cancer, recurrent somatic single-nucleotide variants, which are rare in most paediatric cancers, are confined largely to protein-coding genes (refs. 1–3). Here we report highly recurrent hotspot mutations (r.3A>G) of U1 spliceosomal small nuclear RNAs (snRNAs) in about 50% of Sonic hedgehog (SHH) medulloblastomas. These mutations were not present across other subgroups of medulloblastoma, and we identified these hotspot mutations in U1 snRNA in fewer than 0.1% of 2,442 cancers, across 36 other tumour types. The mutations occur in 97% of adults (subtype SHHδ) and 25% of adolescents (subtype SHHα) with SHH medulloblastoma, but are largely absent from SHH medulloblastoma in infants. The U1 snRNA mutations occur in the 5′ splice-site binding region, and snRNA-mutant tumours have significantly disrupted RNA splicing and an excess of 5′ cryptic splicing events. Alternative splicing mediated by mutant U1 snRNA inactivates tumour-suppressor genes (PTCH1) and activates oncogenes (GLI2 and CCND2), and represents a target for therapy. These U1 snRNA mutations provide an example of highly recurrent and tissue-specific mutations of a non-protein-coding gene in cancer.

Fig. 1: Highly recurrent mutations of the U1 snRNAs in SHH medulloblastoma.
Fig. 2: Mutational repertoire of snRNA-mutant SHH medulloblastomas.
Fig. 3: Clinical and cytogenetic features of SHH medulloblastomas with mutant U1 snRNA.
Fig. 4: Aberrant splicing of Hedgehog signalling genes in SHH medulloblastoma with mutant U1 snRNA.

Data availability

Sequencing data have been deposited in the European Genome–Phenome Archive (EGA) and Gene Expression Omnibus (GEO): RNA-seq (EGAD00001001899 and EGAD00001004958), whole-genome sequences (EGAD00001003125 and EGAD00001004347) and RNA-seq of exogenous expression analyses (GSE128005). Materials used in this study are available from the following: GENCODE (ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz), ICGC (https://icgc.org/), hs37d5 reference (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence), Burrows–Wheeler aligner (bwa) (http://bio-bwa.sourceforge.net/), Mutect2 (https://software.broadinstitute.org/gatk/), EBCall (https://github.com/friend1ws/EBCall), Varscan2 (http://dkoboldt.github.io/varscan/), Strelka (https://github.com/Illumina/strelka), SomaticSniper (http://gmt.genome.wustl.edu/packages/somatic-sniper/), Virmid (https://sourceforge.net/p/virmid/wiki/Home/), Platypus (http://www.well.ox.ac.uk/platypus), Seurat (https://sites.google.com/site/seuratsomatic/home), ENCODE (https://www.encodeproject.org/), PennCNV (http://penncnv.openbioinformatics.org/en/latest/), Database of Genomic Variants (http://dgv.tcag.ca/dgv/app/home), Genomon Project (https://github.com/Genomon-Project), SpliceRack (http://katahdin.mssm.edu/splice/index.cgi?database=spliceNew) and GEO (https://www.ncbi.nlm.nih.gov/geo/)

References

1. Pugh, T. J. et al. Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature 488, 106–110 (2012).
2. Jones, D. T. et al. Dissecting the genomic complexity underlying medulloblastoma. Nature 488, 100–105 (2012).
3. Northcott, P. A. et al. The whole-genome landscape of medulloblastoma subtypes. Nature 547, 311–317 (2017).
4. Northcott, P. A., Korshunov, A., Pfister, S. M. & Taylor, M. D. The clinical implications of medulloblastoma subgroups. Nat. Rev. Neurol. 8, 340–351 (2012).
5. Northcott, P. A. et al. Medulloblastoma comprises four distinct molecular variants. J. Clin. Oncol. 29, 1408–1414 (2011).
6. Taylor, M. D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 123, 465–472 (2012).
7. Cavalli, F. M. G. et al. Intertumoral heterogeneity within medulloblastoma subgroups. Cancer Cell 31, 737–754 (2017).
8. Huang, F. W. et al. Highly recurrent TERT promoter mutations in human melanoma. Science 339, 957–959 (2013).
9. Gan, K. A., Carrasco Pro, S., Sewell, J. A. & Fuxman Bass, J. I. Identification of single nucleotide non-coding driver mutations in cancer. Front. Genet. 9, 16 (2018).
10. Manser, T. & Gesteland, R. F. Human U1 loci: genes for human U1 RNA have dramatically similar genomic environments. Cell 29, 257–264 (1982).
11. Li, Y. I. et al. Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).
12. Shen, S. et al. rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-seq data. Proc. Natl Acad. Sci. USA 111, E5593–E5601 (2014).
13. Lee, J. H., You, J., Dobrota, E. & Skalnik, D. G. Identification and characterization of a novel human PP1 phosphatase complex. J. Biol. Chem. 285, 24466–24476 (2010).
14. Tessema, M. et al. Differential epigenetic regulation of TOX subfamily high mobility group box genes in lung and breast cancers. PLoS ONE 7, e34850 (2012).
15. Kogerman, P. et al. Alternative first exons of PTCH1 are differentially regulated in vivo and may confer different functions to the PTCH1 protein. Oncogene 21, 6007–6016 (2002).
16. Sasaki, H., Nishizaki, Y., Hui, C., Nakafuku, M. & Kondoh, H. Regulation of Gli2 and Gli3 activities by an amino-terminal repression domain: implication of Gli2 and Gli3 as primary mediators of Shh signaling. Development 126, 3915–3924 (1999).
17. Huard, J. M., Forster, C. C., Carter, M. L., Sicinski, P. & Ross, M. E. Cerebellar histogenesis is disturbed in mice lacking cyclin D2. Development 126, 1927–1935 (1999).
18. Kenney, A. M. & Rowitch, D. H. Sonic hedgehog promotes G1 cyclin expression and sustained cell cycle progression in mammalian neuronal precursors. Mol. Cell. Biol. 20, 9055–9067 (2000).
19. Mirzaa, G. et al. De novo CCND2 mutations leading to stabilization of cyclin D2 cause megalencephaly–polymicrogyria–polydactyly–hydrocephalus syndrome. Nat. Genet. 46, 510–515 (2014).
20. Dvinge, H., Kim, E., Abdel-Wahab, O. & Bradley, R. K. RNA splicing factors as oncoproteins and tumour suppressors. Nat. Rev. Cancer 16, 413–430 (2016).
21. Kim, E. et al. SRSF2 mutations contribute to myelodysplasia by mutant-specific effects on exon recognition. Cancer Cell 27, 617–630 (2015).
22. Mullighan, C. G. et al. Genome-wide analysis of genetic alterations in acute lymphoblastic leukaemia. Nature 446, 758–764 (2007).
23. Seiler, M. et al. H3B-8800, an orally available small-molecule splicing modulator, induces lethality in spliceosome-mutant cancers. Nat. Med. 24, 497–504 (2018).
24. Morrissy, A. S. et al. Divergent clonal selection dominates medulloblastoma at recurrence. Nature 529, 351–357 (2016).
25. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
26. Shiraishi, Y. et al. An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data. Nucleic Acids Res. 41, e89 (2013).
27. Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
28. Saunders, C. T. et al. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 28, 1811–1817 (2012).
29. Larson, D. E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2012).
30. Kim, S. et al. Virmid: accurate detection of somatic mutations with sample impurity inference. Genome Biol. 14, R90 (2013).
31. Rimmer, A. et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat. Genet. 46, 912–918 (2014).
32. Christoforides, A. et al. Identification of somatic mutations in cancer through Bayesian-based analysis of sequenced genome pairs. BMC Genomics 14, 302 (2013).
33. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
34. Boeva, V. et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics 28, 423–425 (2012).
35. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
36. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
37. Kalvari, I. et al. Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Res. 46, D335–D342 (2018).
38. Darty, K., Denise, A. & Ponty, Y. VARNA: interactive drawing and editing of the RNA secondary structure. Bioinformatics 25, 1974–1975 (2009).
39. Sheth, N. et al. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res. 34, 3955–3967 (2006).
40. Wagih, O. ggseqlogo: a versatile R package for drawing sequence logos. Bioinformatics 33, 3645–3647 (2017).
41. Northcott, P. A. et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49–56 (2012).
42. Pei, Y. et al. HDAC and PI3K antagonists cooperate to inhibit growth of MYC-driven medulloblastoma. Cancer Cell 29, 311–323 (2016).
43. Katz, Y., Wang, E. T., Airoldi, E. M. & Burge, C. B. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010).
44. Zhukova, N. et al. Subgroup-specific prognostic implications of TP53 mutation in medulloblastoma. J. Clin. Oncol. 31, 2927–2935 (2013).
45. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
46. Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 17, 1665–1674 (2007).
47. Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).
48. Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).

Acknowledgements

M.D.T. is supported by the NIH (R01CA148699 and R01CA159859), The Pediatric Brain Tumour Foundation, The Terry Fox Research Institute, The Canadian Institutes of Health Research, The Cure Search Foundation, b.r.a.i.n.child, Meagan’s Walk, Genome Canada, Genome BC, Genome Quebec, the Ontario Research Fund, Worldwide Cancer Research, V-Foundation for Cancer Research and the Ontario Institute for Cancer Research, through funding provided by the Government of Ontario. M.D.T. is also supported by a Canadian Cancer Society Research Institute Impact grant and by a Stand Up To Cancer (SU2C) St. Baldrick’s Pediatric Dream Team Translational Research Grant (SU2C-AACR-DT1113) and SU2C Canada Cancer Stem Cell Dream Team Research Funding (SU2C-AACR-DT-19-15) provided by the Government of Canada through Genome Canada and the Canadian Institutes of Health Research, with supplementary support from the Ontario Institute for Cancer Research through funding provided by the Government of Ontario. SU2C is a programme of the Entertainment Industry Foundation administered by the American Association for Cancer Research. M.D.T. is also supported by the Garron Family Chair in Childhood Cancer Research at the Hospital for Sick Children and the University of Toronto. E.G.V.M. is supported by the NIH (R01-NS096236 and R01CA235162) and the CURE Childhood Cancer Foundation. X.S.P. is supported by Ministerio de Economía y Competitividad (MINECO) (SAF2013-45836-R). A.K. was supported by 2017-1.2.1-NKP-2017-00002 National Brain Research Program NAP 2.0. M.L.G. is supported by AIRC (Italian Association for Cancer Research) and by Fondazione Berlucchi. H.S. is a recipient of a Research Fellowship (Astellas Foundation for Research on Metabolic Disorders). S.A.K. is a recipient of funding from the Restracomp Research Fellowship (SickKids Research Institute) and the MD/PhD Studentship Award (Canadian Institute of Health Research). A.D.-N. is a recipient of the Department of Education of the Basque Government (PRE_2017_1_0100). J.R. is supported by Genome Canada Genome Technology Platform Grant 12505, Canada Foundation for Innovation Project 33408. Computations were partially performed on the NIG supercomputer at ROIS National Institute of Genetics and on the Niagara supercomputer at the SciNet HPC Consortium. SciNet is funded by the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund - Research Excellence; and the University of Toronto. The authors would also like to thank M. Kamal and J. Loukides, and recognize the Labatt Brain Tumour Research Centre.

Author information

Contributions

M.D.T. led the study. H.S., S.S. and S.D.B. performed whole-genome sequencing analysis (Figs. 1, 2, Extended Data Figs. 1–3). F.M.G.C., N.G., J.R. and A.S.M. contributed to the pre-processing of RNA-seq data. H.S. and H.F. contributed to SNP6 copy-number analyses (Extended Data Figs. 1, 4c, d). H.S., S.S., F.M.G.C., I.S. and J.Z. contributed to RNA expression analyses (Extended Data Fig. 7a). H.S., S.S., I.S., A.F., S.D.B. and O.A. contributed to alternative splicing analyses (Fig. 4a–d, Extended Data Figs. 5–10). H.S. and V.R. performed clinical analysis (Fig. 3, Extended Data Fig. 4e–i). A.G. and M.A.M. helped with bioinformatics analyses and provided expert advice. S.A.K., P.D.A., K.J. and M.C.V. performed real-time PCR and qPCR analyses (Fig. 4e, f, Extended Data Figs. 9d, 10e, l). S.A.K., A.D.-N., A.G.-F., P.D.A., K.J., I.S., N.A., D.P., A.M., J.W., W.D., R.J.W.-R. and X.S.P. contributed to exogenous expression experiments (Extended Data Fig. 7b–d). S.A.K., K.J. and I.S. performed rhAMP SNP experiments (Figs. 1, 3a–c, Extended Data Figs. 1, 4a, b). P.S. and B. Luu contributed to the collection and processing of human tissue samples. C.D., X.W., R.J.W.-R., L.G., X.H., X.S.P., J. A. Chan and L.S. provided expert advice for experiments. S-K.K., W.A.G., A.J., M.F-M., M.L.G., A.A.N.R., C.G., J.M.K., P.J.F., N.J., H.-K.N., W.S.P., C.G.E., I.F.P., J.M.O., W.A.W., T.K., E.L.-A., B. Lach, M.M., E.G.V.M., J.B.R., R.V., L.B.C., N.K., A.K., L.B., J. A. Calarco, C.C.F., S.M.P., L.G. and D.M. provided patient material and helped design the study. H.S., S.A.K., S.S., J. A. Calarco, L.S. and M.D.T. prepared the manuscript and figures.

Corresponding author

Correspondence to Michael D. Taylor.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Peer review information Nature thanks Rotem Karni, Brandon Wainwright and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Fig. 1 Overview of analysed cohorts and methods.

a, The detection methods for U1 snRNA mutations by each cohort, and comparison methods for alternative splicing analysis. b, Cohort specification. c, Subgroup distribution of whole-genome sequencing (WGS) cohorts.

Extended Data Fig. 2 U11 snRNA mutations, and conservation of U1 and U11 snRNA genes across evolution.

a, Seed sequences of the U1 snRNA obtained from the Rfam database demonstrate high-level conservation across a variety of eukaryotic species, particularly at the site of the SHH medulloblastoma mutation. The consensus sequence and first 50 nucleotides of reference sequences are included for comparison. Grey indicates nucleotide differences, and red identifies the SHH medulloblastoma hotspot mutation. b, Cartoon illustrating the number of somatic mutations in the U11 snRNA genes. Sequence conservation scores for U11 snRNA, as determined using the Rfam database. c, Secondary structure of the mutant U11 snRNA. The red circle identifies the location of the hotspot mutation. The yellow and green rectangles indicate the 5′ splice-site recognition site and the Sm protein-binding site, respectively. Numerals I to IV indicate stem loops. d, Seed sequences of the U11 snRNA obtained from the Rfam database demonstrate high-level conservation across a variety of eukaryotic species, particularly at the site of the SHH medulloblastoma mutation. The consensus sequence and first 30 nucleotides of reference sequences are included for comparison. Grey indicates nucleotide differences, and red identifies the SHH medulloblastoma hotspot mutation.

Extended Data Fig. 3 High levels of genomic conservation surrounding human U1 snRNAs complicate the specific PCR amplification of any individual locus.

a, Genomic locations of the 4 expressed U1 snRNA genes (on chromosome 1p, red) and 136 pseudogenes across the Homo sapiens genome, as indicated. Three pseudogenes with sequences that are identical (hg19) to the expressed U1 snRNA genes are indicated in orange. b, Average mapping quality of bwa-mem and coverage of each of the expressed U1 and U11 snRNA genes from whole-genome sequencing of germline samples from patients with medulloblastoma are illustrated (n = 341 patients). Blue bars represent the alignability of 100-mers using the GEnome Multitool (GEM) mapper from ENCODE and Centre for Genomic Regulation (CRG). Regions that are >1,000 bases upstream and downstream are shown on a log10 scale. Red bar indicates the gene body. c, Average number of multimapped reads overlapped for each gene pair using STAR aligner. The heat map shows the average number of mapped reads across the whole-genome sequencing of germline samples from patients with medulloblastoma (n = 341 patients). d, Sequence similarity of U1 snRNA genes, U1 snRNA pseudogenes with 164 identical base pairs and the U11 snRNA gene. The numbers in each square and heat map indicate identity scores and bit scores calculated using blast software. A blank square indicates that no hit was found.

Extended Data Fig. 4 Allele-specific rhAmp SNP PCR of RNU1 loci, copy-number changes in SHH medulloblastoma with mutant U1 snRNA versus wild-type U1 snRNA, and prognostic analysis.

a, The frequency of any U1 snRNA mutation in the RNU1_batch primer set (RNU1-1, RNU1-2, RNU1-3, RNU1-4 and RNVU1-18) (left) and the RNU1_pseudo primer set (RNU1-27P and RNU1-28P) (right). b, Hotspot mutations of RNU1-27P or RNU1-28P U1 snRNA pseudogenes, as confirmed by Sanger sequencing. c, Broad copy-number aberrations in SHHα with wild-type U1 snRNA (n = 25), SHHα with mutant U1 snRNA (n = 8) and SHHδ with mutant U1 snRNA (n = 41). Dark blue and dark red bars, as well as asterisks, identify statistically significant regions, comparing SHHα with mutant versus wild-type U1 snRNA (P < 0.05, two-sided Fisher’s exact test). d, Significant focal copy-number aberrations in SHHα with wild-type U1 snRNA (n = 25), SHHα with mutant U1 snRNA (n = 8) and SHHδ with mutant U1 snRNA (n = 41) illustrate significant genomic differences between cases with wild-type and mutant U1 snRNA. Candidate target genes within the corresponding loci are indicated. q values were calculated using GISTIC (Methods). e–g, Overall survival of patients with SHHα, stratified by mutational status of U1 snRNA (n = 10 for mutant, n = 27 for wild type) (e), TP53 (n = 15 for mutant, n = 22 for wild type) (f) or both (n = 9 for both mutant, n = 1 for U1 snRNA mutation only, n = 6 for TP53 mutation only, n = 21 for both wild type) (g). P values were determined using the two-sided log-rank test. h, i, Progression-free survival (h) and overall survival (i) stratified by U1 snRNA mutation and SHH subtype (n = 10 for SHHα with mutant U1 snRNA, n = 27 for SHHα with wild-type U1 snRNA, n = 23 for SHHβ with wild-type U1 snRNA, n = 24 for SHHγ with wild-type U1 snRNA, n = 46 for SHHδ with mutant U1 snRNA). P values were determined using the two-sided log-rank test. +, censored case.

Extended Data Fig. 5 Intron-centric analysis of SHHδ medulloblastomas.

a, Quantification of alternative splicing events by SHH subtype, as detected by intron-centric alternative splicing analysis (n = 30 of each subtype). Bar plot shows adjusted standardized residual of included alternative splicing events. Positive values indicate a relatively higher number, and negative values indicate a relatively lower number among subtypes. b, Volcano plots of alternative splicing events (n = 30 of each subtype). Significant events (FDR < 0.01 and absolute log effect size > 1.5 calculated using LeafCutter (Methods)) are illustrated by colour. Alternative splicing events of PTCH1 and GLI2 with the highest effect size are annotated. c, Splice-site sequences of included alternative splicing events by subtype (n = 30 of each subtype). Asterisk denotes nucleotide sites with q value < 10−2 (χ2 test and Benjamini–Hochberg method).
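The q values marked by asterisks in panel c come from applying the Benjamini–Hochberg method to per-position χ2 test P values. As a rough illustration only, the adjustment step can be sketched in plain Python as below; the function name and the example P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical P values, one per splice-site nucleotide position.
position_p = [0.0001, 0.04, 0.03, 0.5]
position_q = benjamini_hochberg(position_p)
# Positions that would receive an asterisk at q < 10^-2.
flagged = [q < 1e-2 for q in position_q]
print(position_q, flagged)
```

Adjusting from the largest P value downwards keeps the q values non-decreasing in the rank of the P values, which is the usual step-up form of the procedure.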

Extended Data Fig. 6 Intron-centric analysis of SHHα medulloblastomas.

a, b, Quantification (a) and proportion (b) of alternative splicing events between SHHα medulloblastoma with mutant U1 snRNA (n = 13), and SHHα medulloblastoma with wild-type U1 snRNA (n = 39) as detected by intron-centric alternative splicing analysis. P value was calculated by χ2 test. c, Volcano plots of alternative splicing events (n = 13 for SHHα with mutant U1 snRNA, n = 39 for SHHα with wild-type U1 snRNA). The x axis shows the difference of per cent spliced in calculated using LeafCutter. Significant events (FDR < 0.01 and absolute log effect size > 1.5, calculated by LeafCutter (Methods)) are illustrated by colour. d, Splice-site sequences of included alternative splicing events in SHHα with mutant U1 snRNA (n = 13 for SHHα with mutant U1 snRNA, n = 39 for SHHα with wild-type U1 snRNA). Size and colour for each circle indicate the q values and Cramer’s V values for each nucleotide position (q values were calculated by χ2 test and Benjamini–Hochberg method; the precise values are given in Supplementary Table 11). e, Residual analysis of 5′ splice-site sequences of annotated and cryptic 5′ alternative splicing (n = 13 for SHHα with mutant U1 snRNA, n = 39 for SHHα with wild-type U1 snRNA). The size and colour of each circle denote the two-sided P value, and adjusted standardized residual calculated by Haberman’s method. The precise values are given in Supplementary Table 11.
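Panel e reports adjusted standardized residuals calculated by Haberman's method. A minimal sketch of that calculation for a contingency table is given below, assuming the standard formula (observed minus expected, divided by the square root of expected × (1 − row fraction) × (1 − column fraction)) and a normal approximation for the two-sided P value. The counts are hypothetical and the function names are illustrative.

```python
import math


def adjusted_standardized_residuals(table):
    """Haberman's adjusted standardized residual for each cell of a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out.append((observed - expected) / math.sqrt(variance))
        residuals.append(out)
    return residuals


def two_sided_p(z):
    """Two-sided P value for a residual under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: rows are annotated vs cryptic 5' splice sites,
# columns are two nucleotides observed at one position.
counts = [[30, 10], [20, 40]]
residuals = adjusted_standardized_residuals(counts)
p_top_left = two_sided_p(residuals[0][0])
print(residuals, p_top_left)
```

Positive residuals mark cells with more events than expected under independence, and negative residuals mark cells with fewer.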

Extended Data Fig. 7 Nonsense-mediated decay pathway in SHH medulloblastoma with mutant U1 snRNA, and exogenous expression analyses.

a, Enrichment plots of ‘GO nuclear transcribed mRNA catabolic process nonsense mediated decay’ by GSEA between SHHδ with mutant U1 snRNA (n = 30) and other subtypes of SHH medulloblastoma with wild-type U1 snRNA (n = 90), and SHHα with mutant U1 snRNA (n = 13) and SHHα with wild-type U1 snRNA (n = 39). P values were calculated using gsea v.3.0 (Methods). b, Quantification of alternative splicing events between HEK-293T cells with mutant U1 snRNA and HEK-293T cells with wild-type U1 snRNA, as detected by intron-centric alternative splicing analysis. c, Splice-site sequences of included alternative splicing events in HEK-293T cells with mutant U1 snRNA. Asterisk denotes nucleotide sites with q value < 10−2 (χ2 and Benjamini–Hochberg method). d, Comparison of the extent of overlap between detected alternative splicing events by SHH medulloblastoma (either of SHHα or of SHHδ) with mutant U1 snRNA, SHH medulloblastoma (either of SHHα, SHHβ or SHHγ) with wild-type U1 snRNA and HEK-293T cells with mutant U1 snRNA exogenous expression. Left, alternatively spliced events with cryptic 5′ sites. Right, alternatively spliced events with cryptic 5′ sites and a C base at the 6th position of the intron. e, Alternative splicing signatures by t-SNE analysis. Left, the per cent spliced-in values of detected cryptic 5′ alternative splicing events, with a ‘C’ nucleotide at the 6th base in the intron from the 5′ splice site. Top right, per cent spliced-in values of all cryptic 5′ alternative-splicing events. Bottom right, per cent spliced-in values of all alternative splicing events.
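The emphasis on a C at the 6th intronic base can be pictured with a small base-pairing sketch. It assumes the textbook 5′ end of human U1 snRNA (5′-AUACUUACCUG-3′, pseudouridines written as U) and the usual antiparallel pairing of U1 nucleotides 3–11 with 5′ splice-site positions −3 to +6; neither detail is stated in the text above. The splice-site strings and function names are illustrative, and G·U wobble pairs are deliberately ignored.

```python
# Assumption: textbook 5' end of human U1 snRNA, pseudouridines shown as U.
U1_WILD_TYPE = "AUACUUACCUG"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mutate(rna, position, new_base):
    """Return rna with the 1-based position replaced by new_base."""
    return rna[:position - 1] + new_base + rna[position:]


def paired_positions(u1_five_prime, splice_site):
    """Count Watson-Crick pairs between U1 nucleotides 3-11 and a 9-nt site.

    splice_site is written 5'->3' as exon positions -3..-1 followed by
    intron positions +1..+6. The duplex is antiparallel, so intron
    position +6 faces U1 nucleotide 3.
    """
    u1_region = u1_five_prime[2:11]      # nucleotides 3..11
    site_reversed = splice_site[::-1]    # +6, +5, ..., -3
    return sum((u, s) in WATSON_CRICK for u, s in zip(u1_region, site_reversed))


U1_MUTANT = mutate(U1_WILD_TYPE, 3, "G")   # r.3A>G
CONSENSUS_SITE = "CAGGUAAGU"               # U at intron position +6
C6_SITE = "CAGGUAAGC"                      # hypothetical site with C at +6

for name, u1 in (("wild type", U1_WILD_TYPE), ("r.3A>G", U1_MUTANT)):
    print(name, paired_positions(u1, CONSENSUS_SITE), paired_positions(u1, C6_SITE))
```

Under these assumptions the mutant loses one Watson–Crick pair at the consensus site and gains one at a site carrying C at position +6, which is the direction of the sequence preference described in panels d and e.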

Extended Data Fig. 8 Retained introns inactivate tumour-suppressor genes in tumours with U1 snRNA r.3A>G mutation.

a, Illustration of the different types of alternative splicing events analysed using rMATS (n = 30 of each subtype). Red arrows indicate expected 5′ splice sites recognized by the mutant U1 snRNA. b, Quantification of alternative splicing events by subtype of SHH medulloblastoma, as detected by exon-centric alternative splicing analysis. c, Scatter plots of alternative splicing events (n = 30 of each subtype). The x axis shows the difference of per cent spliced in (psi) calculated using rMATS. Different types of significant event (FDR < 0.01 and absolute differential psi > 0.05 calculated using rMATS (Methods)) are illustrated by different colours, as annotated. d, Splice-site sequences of alternative 5′ splice site (A5SS), included cassette exon (CE) and included retained intron (RI) events in SHHδ with mutant U1 snRNA (n = 30). Each event corresponds to a red arrow cartoon in a. Asterisk denotes nucleotide sites with q value < 10−2 (χ2 test and Benjamini–Hochberg method). e, Distribution of per cent spliced in for PAX6 based on U1 snRNA mutation status (n = 13 for SHHα with mutant U1 snRNA, n = 30 for SHHδ with mutant U1 snRNA, n = 99 for SHH medulloblastoma with wild-type U1 snRNA (n = 90) and normal brain tissue (n = 9)). Dashed line defines threshold that divides the dataset into two groups (k-means method). The table displays the number of samples above the threshold (high) or below (low) based on mutational status. P value was calculated using two-sided Fisher’s exact test compared to samples with wild-type U1 snRNA. Samples with mutant U1 snRNA are indicated in pink, and wild-type samples in blue. f, Sashimi plot of splicing of PAX6 based on mutational status determined by exon-centric alternative splicing analysis (rMATS). The bar plot shows modified FPKM. Numbers refer to average junctional reads across all samples. Annotated exon tracks are shown below, with genomic positions marked. g, Distribution of per cent spliced in for TOX4 based on U1 snRNA mutation status (n = 13 for SHHα with mutant U1 snRNA, n = 30 for SHHδ with mutant U1 snRNA, n = 99 for SHH medulloblastoma with wild-type U1 snRNA (n = 90) and normal brain tissue (n = 9)). Dashed line defines threshold that divides the dataset into two groups (k-means method). The table displays the number of samples above the threshold (high) or below (low) based on mutational status. P value was calculated using two-sided Fisher’s exact test compared to samples with wild-type U1 snRNA. Samples with mutant U1 snRNA are indicated in pink, and wild-type samples in blue. h, Sashimi plot of splicing of TOX4 based on mutational status determined by exon-centric alternative splicing analysis (rMATS). The bar plot shows modified FPKM. Numbers refer to average junctional reads across all samples. Annotated exon tracks are shown below, with genomic positions marked.
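Panels e and g compare high and low groups between mutant and wild-type samples with a two-sided Fisher’s exact test. The sketch below shows one common way to compute that P value for a 2 × 2 table, by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. The counts are hypothetical, and the function is illustrative rather than the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical table: rows are mutant and wild-type samples,
# columns are 'high' and 'low' per cent spliced in.
p_separated = fisher_exact_two_sided(3, 0, 0, 3)
p_balanced = fisher_exact_two_sided(1, 1, 1, 1)
print(p_separated, p_balanced)
```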

Extended Data Fig. 9 Exon-centric analysis of SHHα medulloblastomas and overlapped splicing events.

a, Quantification of alternative splicing events between SHHα medulloblastoma with mutant U1 snRNA and SHHα medulloblastoma with wild-type U1 snRNA, as detected by exon-centric alternative splicing analysis. b, Scatter plots of alternative splicing events (n = 13 for SHHα with mutant U1 snRNA, n = 39 for SHHα with wild-type U1 snRNA). The x axis shows the difference of per cent spliced in calculated using rMATS. Different types of significant event (FDR < 0.01 and absolute differential psi > 0.05 calculated using rMATS (Methods)) are illustrated by different colours, as annotated. c, Splice-site sequences of alternative 5′ splice sites, included cassette exon and included retained intron events in SHHα medulloblastoma with mutant U1 snRNA and SHHα medulloblastoma with wild-type U1 snRNA. Each event corresponds to a red arrow cartoon in Extended Data Fig. 8a. Asterisk denotes nucleotide sites with q value < 10−2 (χ2 test and Benjamini–Hochberg method). d, Box plot of fold changes in expression of the alternatively spliced isoform as compared to the wild-type isoform in subsets of SHH medulloblastoma, as determined by real-time qPCR. In the box plots, the centre lines show data median; box limits indicate the IQR from the 25th and 75th percentiles; lower and upper whiskers extend 1.5× the IQR. Outliers are represented by individual points. P values were calculated using two-sided Wilcoxon rank-sum test. e, Comparison of the extent of overlap between splicing events by subtype of SHH medulloblastoma and U1 snRNA mutational status. Effect sizes are calculated by LeafCutter with an absolute effect-size threshold of 1.5.
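The box-plot convention in panel d (median, IQR box, whiskers at 1.5× the IQR, outliers as points) can be sketched as follows. Two details are assumptions, because the caption does not specify them: quartiles use the 'inclusive' interpolation of Python's `statistics.quantiles`, and whiskers are drawn to the most extreme data points that lie within 1.5× the IQR of the box. The fold-change values are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Median, quartiles, whisker ends and outliers for one box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


# Hypothetical fold-change values for one group of samples.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```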

Extended Data Fig. 10 Aberrant splicing of oncogenes and tumour-suppressor genes in SHH medulloblastoma with mutant U1 snRNA.

a, Overview of cryptic alternative splicing of GLI2, demonstrating the position of a cryptic cassette exon with the 5′ splice-site sequence. b, Sashimi plot of splicing of GLI2 in representative cases. The bar plot shows counts per million reads. Numbers are of junctional reads; reads for the mutant U1 snRNA isoform are in red. c, Scatter plot comparing detected alternatively spliced read and total junction reads that share a 3′ splice site. Jittering was performed for both values. d, Per cent spliced in values for SHHα with mutant U1 snRNA, SHHδ with mutant U1 snRNA and SHH medulloblastoma with wild-type U1 snRNA (all subtypes of SHH medulloblastoma). e, Box plot of fold changes in expression of the alternatively spliced isoform as compared to the wild-type isoform of GLI2 in subsets of SHH medulloblastoma, determined by real-time qPCR. f, Illustration of canonical and cryptic isoforms of GLI2. Translation start sites are indicated by an ATG arrow. Resulting proteins (and sizes) are displayed for each isoform. Repression and activation domains are indicated in blue and orange, respectively. g, Overview of cryptic alternative splicing of CCND2, illustrating the position of a cryptic cassette exon with the 5′ splice-site sequence. h, Sashimi plot of representative cases demonstrates alternative splicing at the CCND2 locus. Numbers illustrate junctional reads. Junctional reads specific to U1 snRNA mutants are in red. i, The canonical isoform and the cryptic isoform of CCND2. j, Scatter plot comparing detected alternatively spliced read and total junction reads that shared a 3′ splice site. Jittering was performed for both values. k, Per cent spliced in values for SHHα with mutant U1 snRNA (n = 13), SHHδ with mutant U1 snRNA (n = 58) and SHH medulloblastoma with wild-type U1 snRNA (all subtypes of SHH medulloblastoma, n = 104). l, Real-time qPCR comparing the expression of the cryptic isoform of CCND2 demonstrates high levels of expression of CCND2 restricted to SHHδ cases (n = 6 for SHHα with mutant U1 snRNA, n = 6 for SHHδ with mutant U1 snRNA, n = 6 for SHHα with wild-type U1 snRNA). m, Overview of cryptic alternative splicing of PAX5, illustrating the position of a cryptic cassette exon with the 5′ splice-site sequence. n, Sashimi plot of representative cases demonstrates alternative splicing at the PAX5 locus. Numbers denote junctional reads. Junctional reads specific to U1 snRNA mutants are in red. o, The canonical isoform and the cryptic isoform of PAX5. p, Scatter plot comparing detected alternatively spliced read and total junction reads that shared a 3′ splice site. Jittering was performed for both values. q, Per cent spliced in values by SHHα with mutant U1 snRNA (n = 5), SHHδ with mutant U1 snRNA (n = 27) and SHH medulloblastoma with wild-type U1 snRNA (all subtypes of SHH medulloblastoma, n = 7). In all box plots, centre lines show data median; box limits indicate the IQR from the 25th and 75th percentiles; lower and upper whiskers extend 1.5× the IQR. Outliers are represented by individual points. P values were calculated using two-sided Wilcoxon rank-sum tests.
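The scatter plots in panels c, j and p relate alternatively spliced reads to the total junction reads sharing a 3′ splice site, and the neighbouring panels report per cent spliced in. One simple reading of that quantity, offered here as an assumption and not as the exact definition used by LeafCutter or rMATS, is the share of junction reads at a 3′ splice site that use the cryptic junction. The sample names and read counts are hypothetical.

```python
def percent_spliced_in(cryptic_reads, total_junction_reads):
    """Per cent of junction reads at a shared 3' splice site using the cryptic junction.

    Returns None when no junction reads cover the site, because the
    ratio is undefined in that case.
    """
    if total_junction_reads == 0:
        return None
    return 100.0 * cryptic_reads / total_junction_reads


# Hypothetical (cryptic, total) junction read counts per sample.
read_counts = {
    "mutant_sample": (12, 48),
    "wild_type_sample": (0, 50),
    "uncovered_sample": (0, 0),
}
psi_by_sample = {
    sample: percent_spliced_in(cryptic, total)
    for sample, (cryptic, total) in read_counts.items()
}
print(psi_by_sample)
```

Returning None for uncovered sites keeps samples without informative reads from being counted as 0% spliced in.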

Supplementary information

Supplementary Information

This file contains a guide for Supplementary Tables 1-21 (Supplementary Table files supplied separately).

Reporting Summary

Supplementary Tables

This file contains Supplementary Tables 1-21 – see Supplementary Information document for full guide.

About this article

Cite this article

Suzuki, H., Kumar, S.A., Shuai, S. et al. Recurrent noncoding U1 snRNA mutations drive cryptic splicing in SHH medulloblastoma. Nature 574, 707–711 (2019). https://doi.org/10.1038/s41586-019-1650-0
