
Protein-structure-guided discovery of functional mutations across 19 cancer types

A Corrigendum to this article was published on 27 July 2017

This article has been updated


Local concentrations of mutations are well known in human cancers. However, their three-dimensional spatial relationships in the encoded protein have yet to be systematically explored. We developed a computational tool, HotSpot3D, to identify such spatial hotspots (clusters) and to interpret the potential function of variants within them. We applied HotSpot3D to >4,400 TCGA tumors across 19 cancer types, discovering >6,000 intra- and intermolecular clusters, some of which showed tumor and/or tissue specificity. In addition, we identified 369 rare mutations in genes including TP53, PTEN, VHL, EGFR, and FBXW7 and 99 medium-recurrence mutations in genes such as RUNX1, MTOR, CA3, PI3, and PTPN11, all mapping within clusters having potential functional implications. As a proof of concept, we validated our predictions in EGFR using high-throughput phosphorylation data and cell-line-based experimental evaluation. Finally, mutation–drug cluster and network analysis predicted over 800 promising candidates for druggable mutations, raising new possibilities for designing personalized treatments for patients carrying specific mutations.


Figure 1: HotSpot3D workflow, robustness simulations, and comparison to SpacePAC.
Figure 2: Significant spatial clusters.
Figure 3: Cancer type specificity of intramolecular and intermolecular clusters.
Figure 4: Intramolecular and intermolecular clusters with unique hotspot variants and new variants.
Figure 5: Polar plots showing discovery of rare and medium-recurrence functional variants in intramolecular and intermolecular clusters.
Figure 6: Functional assessment using phosphorylation data and experimental validation.
Figure 7: Variant–drug interaction heat maps and structures.

Change history

  • 20 March 2017

    In the version of this article initially published, residue R384 was incorrectly highlighted in the protein model depicted in Figure 7e. The correct residue is R394. The error has been corrected in the HTML and PDF versions of the article.




We thank R. Friesel (Maine Medical Center Research Institute) for NIH3T3 clone 2.2 cells. We thank K. Johnson, C. Kandoth, M. Bharadwaj, and K. Huang for suggestions on data analysis. We also thank M. Xie, R. Jayasinghe, and H. Greulich for suggestions on experiments. This work was supported by National Cancer Institute grants R01CA180006 and R01CA178383 and National Human Genome Research Institute grant U01HG006517 to L.D. F.C. is supported in part by US Department of Defense grant PC130118 (W81XWH-14-1-0458) and National Institute of Diabetes and Digestive and Kidney Diseases grant R01DK087960. M.H.B. is in part supported by the Precision Medicine Pathway at the Washington University School of Medicine.

Author information




L.D. and F.C. designed and supervised research. B.N., A.D.S., S.S., J.N., M.H.B., P.B., J.W., M.D.M., P.T., C.L., K.Y., S.Q.S., W.-W.L., F.C., and L.D. analyzed the data. M.C.W., B.N., A.D.S., and Q.Z. performed statistical analysis. M.A.W., B.N., A.D.S., S.S., and M.H.B. prepared figures and tables. B.N., A.D.S., S.S., J.W., and R.J.M. contributed to HotSpot3D code. L.D., F.C., B.N., A.D.S., S.S., and M.H.B. wrote the manuscript. F.C., M.C.W., A.D.S., S.S., and L.D. revised the manuscript.

Corresponding authors

Correspondence to Feng Chen or Li Ding.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Clustered mutations of oncogenes and tumor suppressors.

For each oncogene (red dots) and tumor suppressor (blue dots), the number of mutations found in an intramolecular cluster is shown with the number of total mutations. Labeled genes correspond to those from either category that had at least half of all mutations found in a cluster.

Supplementary Figure 2 Cancer type specificities of clusters and three-dimensional protein structures.

(a) Intramolecular plot of the top three MTOR clusters anchored around C1483 (purple), F1888 (blue), and T1977 (green). Connections link the significant interacting pairs in lung (LUSC and LUAD), endometrial (UCEC), and kidney (KIRC and KIRP) cancers lying within a geodesic cluster radius of 10 Å from the centroid. Bubble position in the individual tracks indicates mutations along the primary (linear) protein sequence, and bubble size corresponds to sample count for each mutation. Gray shading of MTOR indicates sections currently lacking structure information in PDB. All residues in the three clusters (including cancer types not shown above) are highlighted on the three-dimensional structure model. (b) Intermolecular plot for two PIK3CA–PIK3R1 clusters centered at PIK3CA N345 (green) and PIK3CA E545 (yellow). Residues in the cluster (including cancer types not shown above) are depicted in spatial arrangement on the ribbon structure model (below).
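The caption's "geodesic cluster radius of 10 Å from the centroid" can be read as a shortest-path cutoff on a graph of proximal residue pairs. The following is a minimal sketch under that reading, not the HotSpot3D implementation: the function name `geodesic_cluster`, the residue labels, and the pair distances are all hypothetical, and only the 10 Å radius comes from the caption.

```python
import heapq


def geodesic_cluster(pairs, centroid, radius=10.0):
    """Return {residue: geodesic distance} for residues within `radius` of `centroid`.

    `pairs` is an iterable of (residue_a, residue_b, distance_in_angstroms)
    describing proximal residue pairs (hypothetical input format).
    Geodesic distance here means the shortest path along those pairs.
    """
    # Build an undirected weighted graph from the residue pairs.
    graph = {}
    for a, b, d in pairs:
        graph.setdefault(a, []).append((b, d))
        graph.setdefault(b, []).append((a, d))

    # Dijkstra's algorithm, pruned at the cluster radius.
    best = {centroid: 0.0}
    heap = [(0.0, centroid)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, d in graph.get(node, []):
            new_dist = dist + d
            if new_dist <= radius and new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return best


# Hypothetical residues and distances, for illustration only.
pairs = [
    ("A", "B", 4.0),
    ("B", "C", 5.0),
    ("C", "D", 3.0),
    ("A", "D", 11.0),
]
cluster = geodesic_cluster(pairs, "A")
print(cluster)
```

In this toy input, residue D is excluded because both of its paths to the centroid (11 Å directly, 12 Å through B and C) exceed the radius.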

Supplementary Figure 3 Drug–mutation clusters involving one drug.

All dasatinib drug–mutation clusters are shown that involve ABL1, BMX, and BTK. Dasatinib (green) is the centroid of each cluster, and distances to each residue where mutations occurred are shown in the radial plot (left) for ABL1 (red), BMX (orange), and BTK (purple). The outer ring segments show the linear protein sequence of each gene, and the inner ring segments show the regions containing mutations. Structures are shown (from left to right) for ABL1, BTK, and BMX.

Supplementary Figure 4 HotSpot3D online visualization portal.

(a) A screenshot of the online visualization portal with a pair of mutations between VHL and TCEB1 shown. (b) Mutations from the ASB9, SOCS4, TCEB1, and VHL intermolecular cluster are shown for a structure of TCEB1 (purple) and VHL (green). Y79 in TCEB1 was recently validated as disrupting interaction with VHL in KIRC cases.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Note. (PDF 932 kb)

Supplementary Table 1

List of curated cancer-related genes. (XLSX 41 kb)

Supplementary Table 2

Gene, pair, and cluster counts at varying proximity-pair significance levels and by cancer gene category. (XLSX 40 kb)

Supplementary Table 3

Intramolecular clusters with annotations. (XLSX 1082 kb)

Supplementary Table 4

Intermolecular clusters with annotations. (XLSX 158 kb)

Supplementary Table 5

Cluster closeness and number of mutations from oncogene clusters. (XLSX 43 kb)

Supplementary Table 6

Cluster closeness and number of mutations from tumor-suppressor clusters. (XLSX 47 kb)

Supplementary Table 7

Oncogene intramolecular clusters with protein domain annotation. (XLSX 71 kb)

Supplementary Table 8

Tumor-suppressor intramolecular clusters with protein domain annotation. (XLSX 80 kb)

Supplementary Table 9

Intramolecular clusters with at least 50% specificity to one cancer type. (XLSX 47 kb)

Supplementary Table 10

Intermolecular clusters with at least 50% specificity to one cancer type. (XLSX 48 kb)

Supplementary Table 11

Experimental validation for predicted functional rare driver mutations. (XLSX 26 kb)

Supplementary Table 12

Intramolecular clusters with hotspot residues and potential novel functional mutations. (XLSX 27 kb)

Supplementary Table 13

Intermolecular clusters with hotspot residues and potential novel functional mutations. (XLSX 49 kb)

Supplementary Table 14

Intramolecular clusters with medium-recurrence novel functional mutations. (XLSX 26 kb)

Supplementary Table 15

Intermolecular clusters with medium-recurrence novel functional mutations. (XLSX 50 kb)

Supplementary Table 16

Singletons in RBX1–CUL1–GLMN cluster. (XLSX 38 kb)

Supplementary Table 17

Quantitative image analysis of intensities from immunoblots. (XLSX 47 kb)

Supplementary Table 18

Drug–mutation clusters with annotations (Cc > 2.5 in blue). (XLS 124 kb)

Supplementary Table 19

HGNC gene families found in drug–mutation clusters. (XLSX 42 kb)

Supplementary Table 20

NIH drug classes observed in drug–mutation clusters. (XLSX 24 kb)

Supplementary Table 21

DrugBank drug classes observed in drug–mutation clusters. (XLSX 34 kb)

Supplementary Table 22

Prioritized putative functional variants in rank order by closeness centrality. (XLSX 44 kb)

Supplementary Table 23

Sample association table of three HUGO genes and UniProt IDs, PDB IDs, and transcript IDs. (XLSX 25 kb)

Supplementary Table 24

Somatic mutations from 19 TCGA cancer types annotated for all transcripts with the minimum columns necessary from a Mutation Annotation Format (MAF) file to run HotSpot3D. (XLS 223472 kb)
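Several of the tables above report a cluster closeness (Cc) or rank variants by closeness centrality, but this page does not give the formula. As a rough illustration only, the sketch below assumes a Dangalchev-style closeness in which every other residue in a cluster contributes its mutation count divided by 2 raised to its geodesic distance. The function name `closeness`, the residue labels, the distances, and the counts are hypothetical, and the weighting by mutation count is an assumption rather than a confirmed detail of the method.

```python
def closeness(distances, counts):
    """Score each residue by sum(counts[u] / 2**d(u, v)) over the other residues u.

    `distances[v][u]` is the geodesic distance between residues v and u;
    `counts[u]` is the number of samples carrying a mutation at u.
    Both structures are hypothetical inputs for this sketch.
    """
    scores = {}
    for v, neighbors in distances.items():
        scores[v] = sum(counts[u] / 2 ** d for u, d in neighbors.items() if u != v)
    return scores


# Hypothetical three-residue cluster.
distances = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 2.0, "B": 1.0},
}
counts = {"A": 1, "B": 2, "C": 4}

scores = closeness(distances, counts)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under this assumed definition, a rare mutation sitting close to highly recurrent residues scores well, which is one way such a measure could prioritize rare variants inside a cluster.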


About this article


Cite this article

Niu, B., Scott, A., Sengupta, S. et al. Protein-structure-guided discovery of functional mutations across 19 cancer types. Nat Genet 48, 827–837 (2016).
