The high background tumor mutation burden in cutaneous melanoma limits the ability to identify significantly mutated genes (SMGs) that drive this cancer. To address this, we performed a mutation significance study of over 1,000 melanoma exomes, combined with a multi-omic analysis of 470 cases from The Cancer Genome Atlas. We discovered several SMGs with co-occurring loss-of-heterozygosity and loss-of-function mutations, including PBRM1, PLXNC1 and PRKAR1A, which encodes a protein kinase A holoenzyme subunit. Deconvolution of bulk tumor transcriptomes into cancer, immune and stromal components revealed a melanoma-intrinsic oxidative phosphorylation signature associated with protein kinase A pathway alterations. We also identified SMGs on the X chromosome, including the RNA helicase DDX3X, whose loss-of-function mutations were exclusively observed in males. Finally, we found that tumor mutation burden and immune infiltration contain complementary information on survival of patients with melanoma. In summary, our multi-omic analysis provides insights into melanoma etiology and supports contribution of specific mutations to the sex bias observed in this cancer.
Previously published melanoma somatic variants that were reanalyzed in this study are available from the associated publications: Hodis et al. (ref. 2), Krauthammer et al. (ref. 19) and Van Allen et al. (ref. 20). The human melanoma data generated by the TCGA Research Network (http://cancergenome.nih.gov/) can be accessed from the GDC Data Portal (https://portal.gdc.cancer.gov/) after approval for dbGaP Study Accession phs000178 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000178.v10.p8); approval is required because of the presence of personally identifiable information, such as patient germline DNA variants. The following mutation annotation format (MAF) files were used: TCGA.SKCM.muse.4cd49f89-d7e2-4333-9872-0bff5327c896.protected.maf; TCGA.SKCM.mutect.bd022199-d399-45db-8474-6dc1f3aad457.protected.maf; and TCGA.SKCM.somaticsniper.4ff8ab0f-1a75-44f6-af48-2b30fc6d5a08.protected.maf. The somatic variants from 183 human melanoma whole genomes (ref. 18) can be accessed from the International Cancer Genome Consortium data portal (https://dcc.icgc.org/releases/release_23/Projects/MELA-AU), without restriction. RNA-seq data from DDX3X knockdown in HT144 cell lines can be accessed from the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra), using accession identifiers provided in Supplementary Table 14. eCLIP data and expression data from DDX3X knockdown in K562 and HepG2 human cell lines can be downloaded from the ENCODE portal (https://www.encodeproject.org/), using accession identifiers provided in Supplementary Tables 16 and 15, respectively. Regions considered for neutral mutation rate estimation were defined using the following files available from Ensembl or the UCSC Genome Browser (ref. 66) website:
ENCODE’s ETS transcription factor binding sites (ref. 65) were downloaded from the UCSC Genome Browser website: https://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeRegTfbsClustered/wgEncodeRegTfbsClusteredWithCellsV3.bed.gz. CCLE cell line gene expression data were obtained from: https://data.broadinstitute.org/ccle/CCLE_DepMap_18q3_RNAseq_reads_20180718.gct. Cell line annotations were obtained from DepMap (DepMap-2018q4-celllines.csv): https://depmap.org/portal/download/all/?release=DepMap+Public+18Q4&file=CCLE_depMap_18Q4_TPM_transcripts_v3.csv. Gene lengths used for RPKM calculations were obtained from: ftp://ftp.ensembl.org/pub/release-86/gtf/homo_sapiens/Homo_sapiens.GRCh38.86.gtf.gz. The mutated genes pathway enrichment analysis was based on the EpiFactors (ref. 68) database (downloaded on 21 January 2018, http://epifactors.autosome.ru/) and the Reactome (ref. 67) database (downloaded on 20 January 2018, https://reactome.org/, ENSEMBL-to-pathways). The mRNA subgroups pathway enrichment analysis was based on MSigDB (v.6.2): https://www.gsea-msigdb.org/gsea/msigdb/index.jsp. We obtained transcript-level expression (in TPM) for TCGA-SKCM from ref. 80: https://osf.io/gqrz9. Gene set enrichment analyses for DDX3X differential expression were based on the Reactome database (downloaded on 6 October 2019): https://reactome.org/. For the GISTIC2 analysis of recurrent focal copy-number alteration, we used the following reference file provided by the GDC: snp6.na35.liftoverhg38.txt.zip (https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files/). The COSMIC mutation signature definitions (ref. 22) were downloaded from the DeconstructSigs website: https://github.com/raerose01/deconstructSigs/blob/master/data/signatures.exome.cosmic.v3.may2019.rda. The combined set of reannotated variants, excluding those protected by the TCGA, can be accessed at our GitHub repository: https://github.com/ianwatsonlab/multiomic_melanoma_study_2019.
Code related to the main findings of the study is available on GitHub at: https://github.com/ianwatsonlab/multiomic_melanoma_study_2019.
Bastian, B. C. The molecular pathology of melanoma: an integrated taxonomy of melanocytic neoplasia. Annu. Rev. Pathol. 9, 239–271 (2014).
Hodis, E. et al. A landscape of driver mutations in melanoma. Cell 150, 251–263 (2012).
Krauthammer, M. et al. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Nat. Genet. 44, 1006–1014 (2012).
Cancer Genome Atlas Network. Genomic classification of cutaneous melanoma. Cell 161, 1681–1696 (2015).
Brash, D. E. UV signature mutations. Photochem. Photobiol. 91, 15–26 (2015).
Joosse, A. et al. Superior outcome of women with stage I/II cutaneous melanoma: pooled analysis of four European Organisation for Research and Treatment of Cancer phase III trials. J. Clin. Oncol. 30, 2240–2247 (2012).
Joosse, A. et al. Sex is an independent prognostic indicator for survival and relapse/progression-free survival in metastasized stage III to IV melanoma: a pooled analysis of five European Organisation for Research and Treatment of Cancer randomized controlled trials. J. Clin. Oncol. 31, 2337–2346 (2013).
van Kempen, L. C. et al. The protein phosphatase 2A regulatory subunit PR70 is a gonosomal melanoma tumor suppressor gene. Sci. Transl. Med. 8, 369ra177 (2016).
Dees, N. D. et al. MuSiC: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012).
Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).
Mularoni, L., Sabarinathan, R., Deu-Pons, J., Gonzalez-Perez, A. & Lopez-Bigas, N. OncodriveFML: a general framework to identify coding and non-coding regions with cancer driver mutations. Genome Biol. 17, 128 (2016).
Martincorena, I. et al. Universal patterns of selection in cancer and somatic tissues. Cell 171, 1029–1041 (2017).
Fredriksson, N. J. et al. Recurrent promoter mutations in melanoma are defined by an extended context-specific mutational signature. PLoS Genet. 13, e1006773 (2017).
Mao, P. et al. ETS transcription factors induce a unique UV damage signature that drives recurrent mutagenesis in melanoma. Nat. Commun. 9, 2626 (2018).
Perera, D. et al. Differential DNA repair underlies mutation hotspots at active promoters in cancer genomes. Nature 532, 259 (2016).
Sabarinathan, R., Mularoni, L., Deu-Pons, J., Gonzalez-Perez, A. & Lopez-Bigas, N. Nucleotide excision repair is impaired by binding of transcription factors to DNA. Nature 532, 264 (2016).
Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Hayward, N. K. et al. Whole-genome landscapes of major melanoma subtypes. Nature 545, 175–180 (2017).
Krauthammer, M. et al. Exome sequencing identifies recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas. Nat. Genet. 47, 996–1002 (2015).
Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).
Trucco, L. D. et al. Ultraviolet radiation-induced DNA damage is prognostic for outcome in melanoma. Nat. Med. 25, 350–350 (2019).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
Ng, P. C. & Henikoff, S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814 (2003).
Curtin, J. A., Busam, K., Pinkel, D. & Bastian, B. C. Somatic activation of KIT in distinct subtypes of melanoma. J. Clin. Oncol. 24, 4340–4346 (2006).
Newell, F. et al. Whole-genome landscape of mucosal melanoma reveals diverse drivers and therapeutic targets. Nat. Commun. 10, 3163 (2019).
Wong, S. Q. et al. Whole exome sequencing identifies a recurrent RQCD1 P131L mutation in cutaneous melanoma. Oncotarget 6, 1115–1127 (2015).
Rodriguez, C. I. & Setaluri, V. Cyclic AMP (cAMP) signaling in melanocytes and melanoma. Arch. Biochem. Biophys. 563, 22–27 (2014).
Stratakis, C. A., Kirschner, L. S. & Carney, J. A. Clinical and molecular features of the Carney complex: diagnostic criteria and recommendations for patient evaluation. J. Clin. Endocrinol. Metab. 86, 4041–4046 (2001).
Arafeh, R. et al. Recurrent inactivating RASA2 mutations in melanoma. Nat. Genet. 47, 1408–1410 (2015).
Dunford, A. et al. Tumor-suppressor genes that escape from X-inactivation contribute to cancer sex bias. Nat. Genet. 49, 10–16 (2017).
Gupta, S., Artomov, M., Goggins, W., Daly, M. & Tsao, H. Gender disparity and mutation burden in metastatic melanoma. J. Natl Cancer Inst. 107, djv221 (2015).
Cruciat, C. M. et al. RNA helicase DDX3 is a regulatory subunit of casein kinase 1 in Wnt-β-catenin signaling. Science 339, 1436–1441 (2013).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Phung, B. et al. The X-linked DDX3X RNA helicase dictates translation reprogramming and metastasis in melanoma. Cell Rep. 27, 3573–3586.e7 (2019).
Soto-Rifo, R. & Ohlmann, T. The role of the DEAD-box RNA helicase DDX3 in mRNA metabolism. Wiley Interdiscip. Rev. RNA 4, 369–385 (2013).
Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA binding proteins. Preprint at bioRxiv https://doi.org/10.1101/179648 (2018).
Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012).
Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).
Lauss, M., Nsengimana, J., Staaf, J., Newton-Bishop, J. & Jonsson, G. Consensus of melanoma gene expression subtypes converges on biological entities. J. Invest. Dermatol. 136, 2502–2505 (2016).
Moffitt, R. A. et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat. Genet. 47, 1168–1178 (2015).
Aran, D., Hu, Z. & Butte, A. J. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, 220 (2017).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Mootha, V. K. et al. PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267–273 (2003).
Snyder, A. et al. Genetic basis for clinical response to CTLA-4 blockade in melanoma. N. Engl. J. Med. 371, 2189–2199 (2014).
Klebanov, N. et al. Burden of unique and low prevalence somatic mutations correlates with cancer survival. Sci. Rep. 9, 4848 (2019).
Miao, D. et al. Genomic correlates of response to immune checkpoint blockade in microsatellite-stable solid tumors. Nat. Genet. 50, 1271–1281 (2018).
Liu, D. et al. Integrative molecular and clinical modeling of clinical outcomes to PD1 blockade in patients with metastatic melanoma. Nat. Med. 25, 1916–1927 (2019).
Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830 (2018).
Ott, P. A. et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221 (2017).
Tsao, H., Bevona, C., Goggins, W. & Quinn, T. The transformation rate of moles (melanocytic nevi) into cutaneous melanoma: a population-based estimate. Arch. Dermatol. 139, 282–288 (2003).
Snijders Blok, L. et al. Mutations in DDX3X are a common cause of unexplained intellectual disability with gender-specific effects on wnt signaling. Am. J. Hum. Genet. 97, 343–352 (2015).
Ditton, H. J., Zimmer, J., Kamp, C., Rajpert-De Meyts, E. & Vogt, P. H. The AZFa gene DBY (DDX3Y) is widely transcribed but the protein is limited to the male germ cells by translation control. Hum. Mol. Genet. 13, 2333–2341 (2004).
Wang, T. et al. Identification and characterization of essential genes in the human genome. Science 350, 1096–1101 (2015).
Conforti, F. et al. Cancer immunotherapy efficacy and patients' sex: a systematic review and meta-analysis. Lancet Oncol. 19, 737–746 (2018).
Lazova, R. et al. Spitz nevi and Spitzoid melanomas: exome sequencing and comparison with conventional melanocytic nevi and melanomas. Mod. Pathol. 30, 640–649 (2017).
Smith, L. K., Rao, A. D. & McArthur, G. A. Targeting metabolic reprogramming as a potential therapeutic strategy in melanoma. Pharmacol. Res. 107, 42–47 (2016).
Johannessen, C. M. et al. A melanocyte lineage program confers resistance to MAP kinase pathway inhibition. Nature 504, 138–142 (2013).
Van Allen, E. M. et al. The genetic landscape of clinical resistance to RAF inhibition in metastatic melanoma. Cancer Discov. 4, 94–109 (2014).
Grossman, R. L. et al. Toward a shared vision for cancer genomic data. N. Engl. J. Med. 375, 1109–1112 (2016).
Zhang, J. et al. International cancer genome consortium data portal—a one-stop shop for cancer genomics data. Database (Oxford) 2011, bar026 (2011).
Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).
Gerstein, M. B. et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100 (2012).
Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
Fabregat, A. et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 46, D649–D655 (2018).
Medvedeva, Y. A. et al. EpiFactors: a comprehensive database of human epigenetic factors and complexes. Database (Oxford) 2015, bav067 (2015).
Carter, S., Meyerson, M. & Getz, G. Accurate estimation of homologue-specific DNA concentration-ratios in cancer samples allows long-range haplotyping. Nat. Preced. https://doi.org/10.1038/npre.2011.6494.1 (2011).
Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013).
Gaujoux, R. & Seoighe, C. A flexible R package for non-negative matrix factorization. BMC Bioinformatics 11, 367 (2010).
Senbabaoglu, Y., Michailidis, G. & Li, J. Z. Critical limitations of consensus clustering in class discovery. Sci. Rep. 4, 6207 (2014).
Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Morgan, M., Pages, H., Obenchain, V. & Hayden, N. Rsamtools: binary alignment (BAM), FASTA, variant call (BCF) and tabix file import. R package version 1.28.0 (2017).
Ritchie, M. E. et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).
Jurtz, V. et al. NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).
Tatlow, P. J. & Piccolo, S. R. A cloud-based workflow to quantify transcript-expression levels in public cancer compendia. Sci. Rep. 6, 39259 (2016).
We thank J. Pelletier, N. Beauchemin, A. Lissouba and the Watson laboratory for their critical comments on the manuscript. We thank and acknowledge the Analysis Working Group of the SKCM TCGA project and the authors of Hodis et al. (ref. 2), Van Allen et al. (ref. 20), Krauthammer et al. (ref. 19) and Hayward et al. (ref. 18), whose past work enabled this study. We especially thank N. Hayward, M. Krauthammer and R. Halaban for answering specific questions related to these studies and R. Marais and P. Mundra for sharing their curated list of nonacral cutaneous melanoma cases from TCGA (ref. 21). This work was supported by the V Foundation (I.R.W. V Scholar grant ID no.: V2016-023). I.R.W. is a Canada Research Chair II and is funded by grants from the Melanoma Research Alliance (grant no. 412429), the Canadian Institutes of Health Research (grant no. PJT-152975) and the Terry Fox Research Institute and Genome Québec (grant no. 1084). R.A. is a recipient of the Canderel Graduate Studentship, the Fonds de Recherche du Québec – Santé Doctoral Training Award and the Canadian Institutes of Health Research Doctoral Award, Frederick Banting and Charles Best Canada Graduate Scholarships (CGS-D).
The authors declare no competing interests.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, Determination of the optimal NMF decomposition rank (k) based on the average of the mean squared error (MSE) between observed trinucleotide mutation counts and predictions of masked values (y-axis) imputed by NMF. The average, calculated across three repetitions of 5-fold cross validation, is plotted against the decomposition ranks (x-axis). Error bars represent the standard error of the mean (SEM). b, Sample-wise Spearman’s correlation between the observed and NMF’s fitted trinucleotide mutation counts (n = 96 trinucleotide mutations, n = 1,014 tumors). The color gradient represents the number of mutated trinucleotides in each tumor sample and is meant to highlight that lower correlations result simply from the low sparsity of NMF’s fit. c, Percentage contribution of trinucleotide mutations for each mutational signature. d, Percentage contribution of each mutational signature to the total number of mutations per tumor. e, The proportion of UVR-signature mutations per tumor. Melanoma subtypes are distinguished by different colours. f, Comparisons of our trinucleotide mutational signatures to the Catalogue of Somatic Mutations in Cancer (COSMIC) set of signatures. Left panels show the Pearson’s correlation (y-axis) between the percentage contribution of trinucleotide mutations to our signatures (the values in c) and each of 65 signatures in COSMIC (x-axis) (n = 96 trinucleotide mutations). Right panels show the mean squared difference (y-axis) between the percentage contributions (n = 96 trinucleotide mutations). g, Heatmap showing the column-sum normalized weights of COSMIC mutational signatures (rows) in our set of 1,014 tumor samples (columns), estimated using non-negative linear regression (via the nnlm() function implemented in the NNLM R package). Our unsupervised estimates of mutation signature contributions are shown at the top. There is strong agreement between our estimates and those based on the COSMIC signatures.
a, Quantile-quantile (Q-Q) plots of OFML (right-tailed permutation test) p-values (y-axis) plotted against uniformly distributed p-values (x-axis) (n = 177 UVR-low tumor samples, n = 824 UVR-high tumors). Each point represents one gene. Genes with an FDR adjusted p-value of <10% are labelled. b-c, Venn diagrams showing the overlap between genes identified in each UVR group (b) and using each scoring function (c). d, Detailed breakdown of OFML subset analyses. Each row of the matrix contains the genes identified as significantly mutated (FDR < 1%) in an analysis using the labelled score (leftmost row label), the UVR group (centre) and melanoma subset (right). Sample sizes are as follows: (UVR-low, UVR-high, all subtypes n = 1,001 tumors); (UVR-low, non-acral cutaneous n = 77 tumors); (UVR-low, all subtypes n = 177 tumors); (UVR-high, non-acral cutaneous n = 690 tumors); (UVR-high, all subtypes n = 824 tumors). e, Mutation frequency of SMGs stratified by cohort, f, melanoma subtype and g, UVR-group. The number of mutated tumors is indicated above each bar for panels (f) and (g). All FDR adjusted p-values were obtained using the Benjamini-Hochberg procedure.
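The Benjamini-Hochberg adjustment named in this legend can be illustrated with a minimal sketch. This is not the authors' code: the gene names and p-values below are hypothetical placeholders, and only the 10% labelling threshold is taken from the legend.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values, for illustration only.
example_p = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.6}
fdr = dict(zip(example_p, benjamini_hochberg(list(example_p.values()))))

# Genes that would be labelled at an FDR threshold of 10%.
significant = [gene for gene, q in fdr.items() if q < 0.10]
```

Processing the p-values from largest to smallest keeps the adjusted values non-decreasing in the raw p-value, which is the usual step-up form of the procedure.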
a, Distribution of mutation frequency near transcription start sites (TSS) of expressed transcripts (> 1 TPM) that overlap (red) or do not overlap (black) ETS transcription factor peaks. b, Number and proportion of gene mutations that fall within an ETS transcription factor peak. c, Recurrent mutations at ETS binding sites overlapping SMGs (STK19, SLC27A5, SUCO). For each gene, the top to bottom panels show (1) the gene locus; (2) the various transcripts at that locus; (3) the cumulative median expression of all transcripts, per DNA strand, in units of transcripts per million (TPM); (4) the cumulative median expression of all transcripts, per DNA strand, restricted to their coding regions (CDS), in TPM; (5) the number of mutations at each position in the region; and (6) the locations of ETS transcription factor ChIP-seq peaks. d, Neutral mutation frequency, per tumor sample per nucleotide, for SMGs and other potential drivers (see Methods). e, mRNA expression of genes in melanoma cell lines from the Cancer Cell Line Encyclopedia (CCLE) (n = 55 cell lines) (top) and melanoma TCGA tumors (second to fourth panels). Each point on the plot represents one tumor or cancer cell line. Expression levels are in log transformed units of reads per kilobase per million (RPKM). For each gene, TCGA tumors were stratified by mutation status. The colour of TCGA data points corresponds to the Spearman’s correlation between gene mRNA expression and tumor purity. The number of tumors used to compute the correlation coefficient is denoted for each gene and mutation type at the top of each panel.
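For readers unfamiliar with the RPKM unit used in panel e, the following sketch shows the standard definition. The log base and the pseudocount are assumptions made here for illustration; the legend states only that values are log transformed.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """Log2-transformed RPKM; base 2 and the pseudocount are assumed, not from the source."""
    return math.log2(rpkm(read_count, gene_length_bp, total_mapped_reads) + pseudocount)


# Hypothetical gene: 500 reads, 2,000 bp long, library of 20 million mapped reads.
value = rpkm(500, 2_000, 20_000_000)
```

With these illustrative numbers, 500 reads over 2 kb in a library of 20 million reads gives 12.5 RPKM.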
Extended Data Fig. 4 Enrichment of biological processes and protein complexes in the top OncodriveFML (OFML) hits.
a and b, Left panel: gene set membership for 66 genes which had an OFML FDR < 10%. Right panel: P-values from a one-tailed Fisher’s exact test of the overlap of Reactome (a) or EpiFactors (b) gene sets with genes that passed an OncodriveFML FDR of <10%. Only gene sets that passed an FDR threshold of <20% are shown. FDRs were computed from p-values using the Benjamini-Hochberg procedure. c, Bar plot showing the frequency of somatic mutations in (SWI/SNF)/BAF complex subunits (ARID2, ARID1A, ARID1B, BRD7, and PBRM1) and MLL complex subunits (KMT2A, KMT2B, KANSL1, and MEN1) that passed an OncodriveFML FDR of <10%. Number of samples = 1,014 tumors.
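A one-tailed Fisher's exact test for gene set overlap, as named in this legend, is equivalent to an upper-tail hypergeometric probability. The sketch below is illustrative only: the 66 hit genes come from the legend, whereas the background size, gene set size and overlap are hypothetical values chosen for the example.

```python
from math import comb


def fisher_enrichment_p(overlap, n_hits, set_size, universe):
    """Upper-tail probability of seeing at least `overlap` gene set members
    among `n_hits` genes drawn from a background of `universe` genes."""
    upper = min(n_hits, set_size)
    total = comb(universe, n_hits)
    tail = sum(
        comb(set_size, i) * comb(universe - set_size, n_hits - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# 66 hits (from the legend); the remaining numbers are hypothetical.
p_value = fisher_enrichment_p(overlap=5, n_hits=66, set_size=30, universe=18_000)
```

Python integers have arbitrary precision, so the binomial coefficients are computed exactly and only the final ratio is rounded to a float.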
a, Volcano plots showing the relationship between patient sex and mutation status of SMGs. Each panel corresponds to a logistic regression model, where the probability of mutations in each SMG is modeled as a function of sex and potential confounders such as tumor mutation burden (TMB) and age. Each point corresponds to one gene. The x-axis corresponds to the value of the sex coefficient in the model (log2); the y-axis corresponds to its associated FDR-adjusted p-value (derived from a two-tailed z Wald test and adjusted for the number of hypotheses tested using the Benjamini-Hochberg procedure). These models show that the imbalance of DDX3X mutations between the sexes is not confounded by factors such as tumor mutation burden and age. Age was available for only a limited number of samples. Therefore, the FDR value corresponding to DDX3X increases when age is included in the model, as do the FDR values of other SMGs. Models not including age at diagnosis were fitted to 1,013 tumors (59 DDX3X mutant tumors). The model including age at diagnosis was fitted to 841 tumors (50 DDX3X mutant tumors). Data are shown only for genes with at least five mutations. b, Scatter plots showing DDX3X associated differences in gene expression in tumors (x-axis; n = 22 mutant DDX3X tumors vs n = 167 wildtype tumors) compared against gene expression differences in cancer cell lines (y-axis; two biologically independent replicates of DDX3X knockdown per cell line vs two biologically independent controls per cell line). In each panel, the (log2) odds ratio (OR) between the sign of expression differences in the corresponding cell line vs the sign of expression differences in the tumors is shown. Only genes that had a differential expression p-value < 0.05 in tumors were considered in this analysis (p-values were estimated using the Limma R package; parameterized to perform a two-tailed t-test on linear model coefficients). 
c, Expression differences of individual genes between DDX3X mutant and wildtype samples (TCGA, left panel) or DDX3X knockdown and control samples (HT144 cell line, right panel). See (b) for sample sizes and statistical test used with TCGA data. P-values for TCGA were adjusted for multiple testing using the Benjamini-Hochberg procedure. For HT144 data, a two-tailed z Wald test was performed on negative binomial model coefficients fitted using DESeq2. Genes are ordered according to differential expression p-values in HT144. d, Heatmaps showing the difference in density of differentially expressed DDX3X targets relative to the targets of other RBPs. Each panel corresponds to a different combination of datasets used to determine differential expression (x and y-axis) and DDX3X or other RBP targets (indicated at top of panel). Target genes were identified based on the overlap of their 5′ UTR with eCLIP peaks in K562 (left column), HepG2 (middle column) or the union of peaks in both cell lines (right column). e, mRNA expression of DDX3X and DDX3Y in TCGA tumors from female (left) and male (right) patients (n = 289 male tumors, n = 179 female tumors). Each point represents one tumor, with the color representing DDX3X mutation status. Expression levels are in log transformed units of reads per kilobase per million (RPKM). f, Mutation matrix of loss-of-function and missense mutations of DDX3X and DDX3Y across all samples analyzed in this study.
a, Kaplan-Meier survival curves for 216 patients from TCGA with metastatic tumor samples from a regional lymph node, stratified by mRNA expression percentiles of lymphocytic markers. Each panel corresponds to one lymphocytic marker. b-d, Determining the optimal NMF decomposition for RNA-seq data (n = 468 tumors). b, Average of tumor sample connectivity matrices across 100 randomly initialized NMF runs. c, Cumulative distribution function (CDF) of averaged tumor connectivity matrices. (d) Proportion of ambiguous clustering (PAC) by NMF, used to evaluate the stability of NMF’s solution at each rank (k). PAC measurements using five different definitions of ambiguity are shown. e-h, Distribution of NMF’s expression signatures and their relationship with non-melanocyte skin cells. (e) Distribution of NMF signature weights in TCGA tumors. (f and g) Scatter plots of each tumor’s NMF keratin weights (y-axis) and xCell keratinocyte and sebocyte signature scores (x-axis) (n = 468 tumors). Correlation p-value was computed using a two-sided Spearman's test. (h) Distribution of keratin weights across TCGA tumor tissue sites (n = 101 primary and n = 362 other tumor samples). Each point corresponds to one tumor sample. P-value is from a two-tailed Wilcoxon rank sum test. Boxes indicate first, second, and third quartiles. Whiskers extend to the minimum and maximum data points, no further than 1.5 times the inter-quartile range from the hinges. i, Classical clustering recapitulates melanoma cell intrinsic expression signatures when the effect of varying tumor purity is subtracted from gene expression data. See Methods for additional details. j, Agreement between NMF’s proposed cancer intrinsic signatures and the groups uncovered in (i). Each panel includes samples from a single mRNA subgroup identified in (i), indicated on the right side of the panel (n = 299 Common, n = 76 MITF-low and n = 72 OxPhos tumors). 
Shown on the y-axis are the NMF weights corresponding to each NMF signature indicated on the x-axis. Boxes indicate first, second, and third quartiles. Whiskers extend to the minimum and maximum data points, no further than 1.5 times the inter-quartile range from the hinges.
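The proportion of ambiguous clustering (PAC) mentioned in panel d can be sketched as the fraction of sample pairs whose consensus (connectivity) value falls strictly between two thresholds. The thresholds and the toy matrix below are assumptions for illustration; the legend does not list the five definitions of ambiguity that were used.

```python
def proportion_ambiguous(consensus, lower=0.1, upper=0.9):
    """Fraction of distinct sample pairs with lower < consensus value < upper."""
    n = len(consensus)
    # Use each unordered pair once (upper triangle, diagonal excluded).
    pairs = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    ambiguous = sum(1 for value in pairs if lower < value < upper)
    return ambiguous / len(pairs)


# Hypothetical averaged connectivity matrix for four samples.
toy_consensus = [
    [1.00, 1.00, 0.00, 0.50],
    [1.00, 1.00, 0.05, 0.40],
    [0.00, 0.05, 1.00, 0.95],
    [0.50, 0.40, 0.95, 1.00],
]

# Evaluate several (assumed) definitions of the ambiguous interval.
pac_by_interval = {
    (lo, hi): proportion_ambiguous(toy_consensus, lo, hi)
    for lo, hi in [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)]
}
```

A lower PAC indicates that most pairs of samples are either almost always or almost never clustered together, that is, a more stable solution at that rank.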
a-c, mRNA expression of MITF and hypoxia markers HIF1A and VEGFA in melanoma mRNA subgroups (n = 299 Common, n = 76 MITF-low and n = 72 OxPhos tumors). The y-axis corresponds to mRNA expression in log transformed counts per million (CPM). P-values are from a two-tailed Wilcoxon rank sum test. Boxes indicate first, second, and third quartiles. Whiskers extend to the minimum and maximum data points, no further than 1.5 times the inter-quartile range from the hinges. d-e, Gene set enrichment analysis (GSEA), performed on genes differentially expressed in each mRNA subgroup compared to all out-of-group samples. Genes were first ranked by fold-difference in RNA expression (estimated using DESeq2). The ranked log transformed fold-differences were provided to the Broad Institute’s GSEA tool, which performs a one-tailed permutation test of a Kolmogorov-Smirnov-like statistic (number of genes = 17,481). The MSigDB hallmarks (d) and curated gene sets (C2) (e) databases were used. For each gene set (y-axes) an enrichment score (x-axes) and a corresponding FDR value (colour gradient) are assigned. Positive and negative enrichment scores indicate that a gene set is enriched in upregulated and downregulated genes, respectively. The gene sets shown here are the top seven per group that passed an FDR cut-off <1%. GSEA computes FDR values using a permutation approach. f, Distribution of TMB in non-acral cutaneous samples, stratified by their dominant intrinsic mRNA signature. See panel (a) legend for sample sizes. g, Distribution of UVR-mutation proportions in non-acral cutaneous samples stratified by their dominant intrinsic mRNA signature. See panel (a) legend for sample sizes. P-values for (a) and (b) are based on a Kruskal-Wallis test. h, Bootstrap estimates (10,000 iterations) of the Spearman correlation between immune signature and TMB (left) or proportion of UVR mutations (right) in each tumor (n = 394 non-acral cutaneous tumors). 
In panels (f-h), boxes indicate first, second, and third quartiles. Whiskers extend to the minimum and maximum data points, no further than 1.5 times the inter-quartile range from the hinges.
Extended Data Fig. 8 Single predictors evaluation and relative quality of multivariable post-accession survival models in non-acral cutaneous melanomas.
a, Univariable Cox model hazard ratio and unadjusted p-value of single predictors (n = 347 tumor samples). b, Relative quality of univariable and multivariable Cox survival models including all subsets of predictors. Models are ordered from left to right by increasing Akaike information criterion (AIC, top). The bottom panel (binary matrix) indicates which predictors were included in each model (n = 347 tumor samples). c and e, Relative quality of multivariable Cox regression models including different subsets of predictors for unstratified (n = 347) and UVR-high (n = 301) tumor samples. All models include age at procurement and tumor tissue site, in addition to the predictors specified on the x-axis (indicated by ellipses). The y-axis shows the Akaike information criterion (AIC, top subpanel) and concordance index (lower subpanel). d and f, Coefficients of the multivariable Cox regression models shown in (c) and (e). For each subpanel, the left part shows the coefficients, expressed in log2 hazard-ratios with 95% confidence intervals, and the right part shows the coefficient p-values. Cox model coefficient p-values in all panels were computed using a two-tailed z Wald test.
Extended Data Fig. 9 Single predictors evaluation and relative quality of multivariable overall survival models in non-acral cutaneous melanomas.
a, Univariable Cox model hazard ratio and unadjusted p-value of single predictors (n = 347 tumor samples). b, Relative quality of univariable and multivariable Cox survival models including all subsets of predictors. Models are ordered from left to right by increasing Akaike information criterion (AIC, top). The bottom panel (binary matrix) indicates which predictors were included in each model (n = 347 tumor samples). c and e, Relative quality of multivariable Cox regression models including different subsets of predictors for unstratified (n = 347) and UVR-high (n = 301) tumor samples. All models include age at initial diagnosis, pathologic stage, and mRNA subgroup in addition to the predictors specified on the x-axis (indicated by ellipses). The y-axis shows the Akaike information criterion (AIC, top subpanel) and concordance index (lower subpanel). d and f, Coefficients of the multivariable Cox regression models shown in (c) and (e). For each subpanel, the left part shows the coefficients, expressed in log2 hazard-ratios with 95% confidence intervals, and the right part shows the coefficient p-values. Cox model coefficient p-values in all panels were computed using a two-tailed z Wald test.
Extended Data Fig. 10 Relative quality of multivariable Cox regression models including TMB or neoantigen load.
a, Akaike information criterion. b, Concordance index. Models were restricted to non-acral cutaneous melanomas with no missing data for all predictors (n = 336 tumors, of which 293 tumors were UVR-high).
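The two model-quality measures in this legend can be illustrated with a short sketch. It assumes the usual definitions: AIC computed from a fitted model's (partial) log-likelihood, and Harrell's concordance index over comparable patient pairs. The survival times, event indicators and risk scores are hypothetical, and the study's own models were not fitted with this code.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion; lower values indicate a better trade-off
    between fit and model complexity."""
    return 2 * n_parameters - 2 * log_likelihood


def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the sample
    with the earlier observed event has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Hypothetical data: follow-up time, event indicator (1 = death), model risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.2, 0.4]
c_index = concordance_index(times, events, risk)
```

A concordance index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk; AIC values are only meaningful when compared between models fitted to the same samples, which is consistent with the restriction to tumors with no missing data.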
Cite this article
Alkallas, R., Lajoie, M., Moldoveanu, D. et al. Multi-omic analysis reveals significantly mutated genes and DDX3X as a sex-specific tumor suppressor in cutaneous melanoma. Nat Cancer 1, 635–652 (2020). https://doi.org/10.1038/s43018-020-0077-8