
Discovery and saturation analysis of cancer genes across 21 tumour types

Nature volume 505, pages 495–501 (23 January 2014)



Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2–20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600–5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics.
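The dependence of the required sample size on background mutation frequency can be illustrated with a simplified binomial power calculation. The sketch below is not the MutSig analysis or the down-sampling procedure used in the paper. It assumes a single gene of fixed coding length, a uniform per-base background mutation rate, and a fixed significance threshold; the function names, the gene length of 1,500 bases, the threshold of 5e-6 and the 90% power target are all illustrative assumptions.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def upper_tails(n, p):
    """Return a list t where t[k] = P(X >= k) for X ~ Binomial(n, p)."""
    tails = [0.0] * (n + 2)
    # Accumulate from the far tail inwards so tiny probabilities are not lost.
    for k in range(n, -1, -1):
        tails[k] = tails[k + 1] + binom_pmf(k, n, p)
    return tails


def detection_power(n_samples, freq_above_background, bg_rate_per_base,
                    gene_length=1500, alpha=5e-6):
    """Power to call one gene significant (all defaults are assumptions)."""
    # Chance that a patient carries at least one background mutation in the gene.
    p0 = 1.0 - (1.0 - bg_rate_per_base) ** gene_length
    # Chance that a patient is mutated when the gene is a driver in a
    # fraction `freq_above_background` of patients on top of background.
    p1 = p0 + freq_above_background * (1.0 - p0)
    null_tails = upper_tails(n_samples, p0)
    # Smallest mutated-patient count whose null tail probability is <= alpha.
    k_star = next(k for k, t in enumerate(null_tails) if t <= alpha)
    return min(1.0, upper_tails(n_samples, p1)[k_star])


def samples_needed(freq_above_background, bg_rate_per_base,
                   target_power=0.9, grid=range(100, 5001, 100)):
    """First sample size on the grid reaching the target power, else None."""
    for n in grid:
        if detection_power(n, freq_above_background, bg_rate_per_base) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # Compare a low and a tenfold higher background rate for a gene
    # mutated in 2% of patients above background.
    for rate in (1e-6, 1e-5):
        print(rate, samples_needed(0.02, rate))
```

Under these assumptions the higher background rate requires more samples to reach the same power, which is the qualitative pattern the abstract describes. The absolute numbers depend entirely on the assumed gene length and threshold and should not be read as the paper's estimates.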


Data deposits

The data analysed in this manuscript have been deposited in Synapse (accession number syn1729383) and in dbGaP (accession numbers phs000330.v1.p1, phs000348.v1.p1, phs000369.v1.p1, phs000370.v1.p1, phs000374.v1.p1, phs000435.v2.p1, phs000447.v1.p1, phs000450.v1.p1, phs000452.v1.p1, phs000467.v6.p1, phs000488.v1.p1, phs000504.v1.p1, phs000508.v1.p1, phs000579.v1.p1, phs000598.v1.p1).




This work was conducted as part of TCGA, a project of the National Cancer Institute and the National Human Genome Research Institute. We are grateful to T. I. Zack, S. E. Schumacher, and R. Beroukhim for sharing their copy-number analyses before publication.

Author information

Author notes

    • Eric S. Lander & Gad Getz

    These authors contributed equally to this work.


  1. Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA

    • Michael S. Lawrence, Petar Stojanov, Craig H. Mermel, James T. Robinson, Levi A. Garraway, Todd R. Golub, Matthew Meyerson, Stacey B. Gabriel, Eric S. Lander & Gad Getz

  2. Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA

    • Petar Stojanov, Levi A. Garraway, Todd R. Golub & Matthew Meyerson

  3. Massachusetts General Hospital, Cancer Center and Department of Pathology, 55 Fruit Street, Boston, Massachusetts 02114, USA

    • Craig H. Mermel & Gad Getz

  4. Harvard Medical School, 25 Shattuck Street, Boston, Massachusetts 02115, USA

    • Levi A. Garraway, Todd R. Golub, Matthew Meyerson, Eric S. Lander & Gad Getz

  5. Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, Maryland 20815, USA

    • Todd R. Golub

  6. Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA

    • Eric S. Lander



G.G., E.S.L., T.R.G., M.M., L.A.G. and S.B.G. conceived the project and provided leadership. M.S.L., G.G., E.S.L., P.S. and C.H.M. analysed the data and contributed to scientific discussions. M.S.L., E.S.L. and G.G. wrote the paper. J.T.R., M.S.L., E.S.L. and G.G. created the website for visualizing this data set.

Competing interests

A patent related to this work has been filed.

Corresponding authors

Correspondence to Eric S. Lander or Gad Getz.

Supplementary information

PDF files

  1. Supplementary Information

    This file contains Supplementary Figures 1-9 and legends for Supplementary Tables 1-6 (see separate files for tables).

Excel files

  1. Supplementary Table 1

    This file contains a list of source datasets analyzed in this work, and references to the corresponding publications.

  2. Supplementary Table 2

    This file contains the 260 significantly mutated cancer genes found by analysis with the MutSig suite (see Supplementary Information file for full legend).

  3. Supplementary Table 3

    This file contains a list of the 21 tumor types studied, and the significantly mutated genes found by the MutSig suite in each tumor type (see Supplementary Information file for full legend).

  4. Supplementary Table 4

    The file contains a list of references reporting the identification of candidate cancer genes (see Supplementary Information file for full legend).

  5. Supplementary Table 5

    This file contains a list of references to biological literature supporting the 33 novel candidate cancer genes with clear and compelling connections to cancer biology.

  6. Supplementary Table 6

    This file contains a summary of the analysis comparing the performance of each of the three MutSig metrics separately, in pairwise combinations, and all three combined as in the main analysis (see Supplementary Information file for full legend).
