Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2–20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600–5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics.
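To give a feel for why the required sample size depends on both the mutation frequency of a gene and the background mutation frequency, the following is a minimal power-calculation sketch. It is not the MutSig analysis or the down-sampling procedure used in the study; it assumes a simple one-sided exact binomial test per gene. The gene length of 1,500 bases, the genome-wide significance level of 0.1/20,000, the 90% power target and the two background rates in the usage lines are all illustrative assumptions, and the function names are hypothetical.

```python
import math


def binom_pmf(n, p):
    """Return [P(X = k) for k in 0..n] for X ~ Binomial(n, p), computed in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    return [
        math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * log_p + (n - k) * log_q)
        for k in range(n + 1)
    ]


def detection_power(n, p0, p1, alpha):
    """Power of a one-sided exact binomial test on n patients.

    p0: per-patient probability of a mutation in the gene under background only.
    p1: per-patient probability when the gene is a true cancer gene.
    """
    null_pmf = binom_pmf(n, p0)
    # Find the smallest mutated-patient count k_crit with P(X >= k_crit | p0) <= alpha.
    tail, k_crit = 0.0, n + 1
    for k in range(n, -1, -1):
        if tail + null_pmf[k] > alpha:
            break
        tail += null_pmf[k]
        k_crit = k
    alt_pmf = binom_pmf(n, p1)
    return min(1.0, sum(alt_pmf[k_crit:]))


def patient_mutation_prob(rate_per_base, gene_length, driver_fraction=0.0):
    """Probability that a patient carries at least one mutation in the gene.

    Assumption: background mutations hit each base independently, and a
    fraction `driver_fraction` of patients carry a mutation above background.
    """
    background = 1.0 - (1.0 - rate_per_base) ** gene_length
    return background + driver_fraction * (1.0 - background)


def min_samples(rate_per_base, gene_length=1500, driver_fraction=0.02,
                alpha=0.1 / 20000, target_power=0.9, step=50, max_n=10000):
    """Smallest n (on a grid of `step`) reaching the target power, or None."""
    p0 = patient_mutation_prob(rate_per_base, gene_length)
    p1 = patient_mutation_prob(rate_per_base, gene_length, driver_fraction)
    for n in range(step, max_n + 1, step):
        if detection_power(n, p0, p1, alpha) >= target_power:
            return n
    return None


# Illustrative background rates (assumed values): 0.5 and 10 mutations per megabase.
low_background_n = min_samples(0.5e-6)
high_background_n = min_samples(10e-6)
```

Under these assumptions, a gene mutated in 2% of patients above background needs noticeably more samples in a tumour type with a high background rate than in one with a low rate, which is the qualitative dependence described above. The exact numbers this toy model produces should not be read as the study's estimates.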
The data analysed in this manuscript have been deposited in Synapse (http://www.synapse.org), accession number syn1729383, and in dbGaP (http://www.ncbi.nlm.nih.gov/gap), accession numbers phs000330.v1.p1, phs000348.v1.p1, phs000369.v1.p1, phs000370.v1.p1, phs000374.v1.p1, phs000435.v2.p1, phs000447.v1.p1, phs000450.v1.p1, phs000452.v1.p1, phs000467.v6.p1, phs000488.v1.p1, phs000504.v1.p1, phs000508.v1.p1, phs000579.v1.p1, phs000598.v1.p1.
This work was conducted as part of TCGA, a project of the National Cancer Institute and the National Human Genome Research Institute. We are grateful to T. I. Zack, S. E. Schumacher, and R. Beroukhim for sharing their copy-number analyses before publication.
This file contains a list of the source datasets analysed in this work and references to the corresponding publications.
This file contains the 260 significantly mutated cancer genes found by analysis with the MutSig suite (see Supplementary Information file for full legend).
This file contains a list of the 21 tumour types studied and the significantly mutated genes found by the MutSig suite in each tumour type (see Supplementary Information file for full legend).
This file contains a list of references reporting the identification of candidate cancer genes (see Supplementary Information file for full legend).
This file contains a list of references to biological literature supporting the 33 novel candidate cancer genes with clear and compelling connections to cancer biology.
This file contains a summary of the analysis comparing the performance of each of the three MutSig metrics separately, in pairwise combinations, and all three combined as in the main analysis (see Supplementary Information file for full legend).