  • Perspective

Computational approaches to identify functional genetic variants in cancer genomes


The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.


Figure 1: Scheme depicting the three main approaches routinely used in the analysis of cancer somatic mutations.


Author information

Corresponding authors

Correspondence to Lincoln D Stein or Nuria Lopez-Bigas.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 2–4 (PDF 178 kb)

Supplementary Table 1

Sequence Ontology (SO) terms used to describe the effect of mutations. (XLSX 10 kb)

About this article

Cite this article

The International Cancer Genome Consortium Mutation Pathways and Consequences Subgroup of the Bioinformatics Analyses Working Group. Computational approaches to identify functional genetic variants in cancer genomes. Nat. Methods 10, 723–729 (2013).
