Abstract
Local concentrations of mutations are well known in human cancers. However, their three-dimensional spatial relationships in the encoded protein have yet to be systematically explored. We developed a computational tool, HotSpot3D, to identify such spatial hotspots (clusters) and to interpret the potential function of variants within them. We applied HotSpot3D to >4,400 TCGA tumors across 19 cancer types, discovering >6,000 intra- and intermolecular clusters, some of which showed tumor and/or tissue specificity. In addition, we identified 369 rare mutations in genes including TP53, PTEN, VHL, EGFR, and FBXW7 and 99 medium-recurrence mutations in genes such as RUNX1, MTOR, CA3, PI3, and PTPN11, all mapping within clusters having potential functional implications. As a proof of concept, we validated our predictions in EGFR using high-throughput phosphorylation data and cell-line-based experimental evaluation. Finally, mutation–drug cluster and network analysis predicted over 800 promising candidates for druggable mutations, raising new possibilities for designing personalized treatments for patients carrying specific mutations.
Change history
20 March 2017
In the version of this article initially published, residue R384 was incorrectly highlighted in the protein model depicted in Figure 7e. The correct residue is R394. The error has been corrected in the HTML and PDF versions of the article.
Acknowledgements
We thank R. Friesel (Maine Medical Center Research Institute) for NIH3T3 clone 2.2 cells. We thank K. Johnson, C. Kandoth, M. Bharadwaj, and K. Huang for suggestions on data analysis. We also thank M. Xie, R. Jayasinghe, and H. Greulich for suggestions on experiments. This work was supported by National Cancer Institute grants R01CA180006 and R01CA178383 and National Human Genome Research Institute grant U01HG006517 to L.D. F.C. is supported in part by US Department of Defense grant PC130118 (W81XWH-14-1-0458) and National Institute of Diabetes and Digestive and Kidney Diseases grant R01DK087960. M.H.B. is in part supported by the Precision Medicine Pathway at the Washington University School of Medicine.
Author information
Contributions
L.D. and F.C. designed and supervised research. B.N., A.D.S., S.S., J.N., M.H.B., P.B., J.W., M.D.M., P.T., C.L., K.Y., S.Q.S., W.-W.L., F.C., and L.D. analyzed the data. M.C.W., B.N., A.D.S., and Q.Z. performed statistical analysis. M.A.W., B.N., A.D.S., S.S., and M.H.B. prepared figures and tables. B.N., A.D.S., S.S., J.W., and R.J.M. contributed to HotSpot3D code. L.D., F.C., B.N., A.D.S., S.S., and M.H.B. wrote the manuscript. F.C., M.C.W., A.D.S., S.S., and L.D. revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Clustered mutations of oncogenes and tumor suppressors.
For each oncogene (red dots) and tumor suppressor (blue dots), the number of mutations found in an intramolecular cluster is shown with the number of total mutations. Labeled genes correspond to those from either category that had at least half of all mutations found in a cluster.
Supplementary Figure 2 Cancer type specificities of clusters and three-dimensional protein structures.
(a) Intramolecular plot of the top three MTOR clusters anchored around C1483 (purple), F1888 (blue), and T1977 (green). Connections link the significant interacting pairs in lung (LUSC and LUAD), endometrial (UCEC), and kidney (KIRC and KIRP) cancers lying within a geodesic cluster radius of 10 Å from the centroid. Bubble position in the individual tracks indicates mutations along the primary (linear) protein sequence, and bubble size corresponds to sample count for each mutation. Gray shading of MTOR indicates sections currently lacking structure information in PDB. All residues in the three clusters (including cancer types not shown above) are highlighted on the three-dimensional structure model. (b) Intermolecular plot for two PIK3CA–PIK3R1 clusters centered at PIK3CA N345 (green) and PIK3CA E545 (yellow). Residues in the cluster (including cancer types not shown above) are depicted in spatial arrangement on the ribbon structure model (below).
Supplementary Figure 3 Drug–mutation clusters involving one drug.
All dasatinib drug–mutation clusters involving ABL1, BMX, and BTK are shown. Dasatinib (green) is the centroid of each cluster, and the radial plot (left) shows the distance from the drug to each mutated residue in ABL1 (red), BMX (orange), and BTK (purple). The outer ring segments show the linear protein sequence of each gene, and the inner ring segments show the regions containing mutations. Structures are shown (from left to right) for ABL1, BTK, and BMX.
Supplementary Figure 4 HotSpot3D online visualization portal.
(a) A screenshot of the online visualization portal with a pair of mutations between VHL and TCEB1 shown. (b) Mutations from the ASB9, SOCS4, TCEB1, and VHL intermolecular cluster are shown for a structure of TCEB1 (purple) and VHL (green). Y79 in TCEB1 was recently validated as disrupting interaction with VHL in KIRC cases.
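The caption of Supplementary Figure 2 describes clusters as sets of residues lying within a geodesic radius of 10 Å from a centroid, with connections drawn between significant interacting pairs. The following is a minimal sketch of that idea, not the HotSpot3D implementation: it assumes the proximity pairs are already available as (residue, residue, distance in Å) tuples, takes the geodesic distance to be the shortest path along those pairs, and picks the centroid with a Dangalchev-style closeness score (a sum of 1/2^d terms, optionally weighted by a per-residue mutation count). The function names, the optional `counts` weighting, and the toy residues R1 to R4 with their distances are all hypothetical and chosen only for illustration.

```python
import heapq


def build_graph(pairs):
    """Build an undirected weighted graph from (residue_a, residue_b, distance) tuples."""
    graph = {}
    for a, b, d in pairs:
        graph.setdefault(a, {})[b] = d
        graph.setdefault(b, {})[a] = d
    return graph


def geodesic_distances(graph, source):
    """Shortest-path distances from `source` along the proximity pairs (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def closeness(graph, node, counts=None):
    """Dangalchev-style closeness: sum of weight / 2**distance over reachable residues.

    `counts` (hypothetical) maps a residue to its mutation count; missing residues weigh 1.
    """
    weights = counts or {}
    dist = geodesic_distances(graph, node)
    return sum(weights.get(other, 1) / 2 ** d for other, d in dist.items() if other != node)


def cluster_around_centroid(pairs, counts=None, radius=10.0):
    """Pick the residue with the highest closeness and keep residues within `radius` of it."""
    graph = build_graph(pairs)
    centroid = max(sorted(graph), key=lambda n: closeness(graph, n, counts))
    dist = geodesic_distances(graph, centroid)
    members = sorted(n for n, d in dist.items() if d <= radius)
    return centroid, members


# Toy input: hypothetical residues and distances, not taken from any structure.
toy_pairs = [("R1", "R2", 4.0), ("R2", "R3", 5.0), ("R3", "R4", 8.0), ("R1", "R3", 6.0)]
centroid, members = cluster_around_centroid(toy_pairs)
print(centroid, members)
```

In this toy graph R4 is only reachable through R3, so its geodesic distance from the chosen centroid exceeds the 10 Å radius and it is left out of the cluster. Passing a `counts` mapping shifts the centroid toward residues whose neighbors are mutated more often, which is one plausible way to let recurrence influence the score.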
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–4 and Supplementary Note. (PDF 932 kb)
Supplementary Table 1
List of curated cancer-related genes. (XLSX 41 kb)
Supplementary Table 2
Gene, pair, and cluster counts at varying proximity-pair significance levels and by cancer gene category. (XLSX 40 kb)
Supplementary Table 3
Intramolecular clusters with annotations. (XLSX 1082 kb)
Supplementary Table 4
Intermolecular clusters with annotations. (XLSX 158 kb)
Supplementary Table 5
Cluster closeness and number of mutations from oncogene clusters. (XLSX 43 kb)
Supplementary Table 6
Cluster closeness and number of mutations from tumor-suppressor clusters. (XLSX 47 kb)
Supplementary Table 7
Oncogene intramolecular clusters with protein domain annotation. (XLSX 71 kb)
Supplementary Table 8
Tumor-suppressor intramolecular clusters with protein domain annotation. (XLSX 80 kb)
Supplementary Table 9
Intramolecular clusters with at least 50% specificity to one cancer type. (XLSX 47 kb)
Supplementary Table 10
Intermolecular clusters with at least 50% specificity to one cancer type. (XLSX 48 kb)
Supplementary Table 11
Experimental validation for predicted functional rare driver mutations. (XLSX 26 kb)
Supplementary Table 12
Intramolecular clusters with hotspot residues and potential novel functional mutations. (XLSX 27 kb)
Supplementary Table 13
Intermolecular clusters with hotspot residues and potential novel functional mutations. (XLSX 49 kb)
Supplementary Table 14
Intramolecular clusters with medium-recurrence novel functional mutations. (XLSX 26 kb)
Supplementary Table 15
Intermolecular clusters with medium-recurrence novel functional mutations. (XLSX 50 kb)
Supplementary Table 16
Singletons in RBX1–CUL1–GLMN cluster. (XLSX 38 kb)
Supplementary Table 17
Quantitative image analysis of intensities from immunoblots. (XLSX 47 kb)
Supplementary Table 18
Drug–mutation clusters with annotations (Cc > 2.5 in blue). (XLS 124 kb)
Supplementary Table 19
HGNC gene families found in drug–mutation clusters. (XLSX 42 kb)
Supplementary Table 20
NIH drug classes observed in drug–mutation clusters. (XLSX 24 kb)
Supplementary Table 21
DrugBank drug classes observed in drug–mutation clusters. (XLSX 34 kb)
Supplementary Table 22
Prioritized putative functional variants in rank order by closeness centrality. (XLSX 44 kb)
Supplementary Table 23
Sample association table of three HUGO genes and UniProt IDs, PDB IDs, and transcript IDs. (XLSX 25 kb)
Supplementary Table 24
Somatic mutations from 19 TCGA cancer types annotated for all transcripts with the minimum columns necessary from a mutation allele format (MAF) to run HotSpot3D. (XLS 223472 kb)
About this article
Cite this article
Niu, B., Scott, A., Sengupta, S. et al. Protein-structure-guided discovery of functional mutations across 19 cancer types. Nat Genet 48, 827–837 (2016). https://doi.org/10.1038/ng.3586
DOI: https://doi.org/10.1038/ng.3586
This article is cited by
- Actionability classification of variants of unknown significance correlates with functional effect. npj Precision Oncology (2023)
- Pan-cancer clinical impact of latent drivers from double mutations. Communications Biology (2023)
- Computational analysis of cancer genome sequencing data. Nature Reviews Genetics (2022)
- The 3D mutational constraint on amino acid sites in the human proteome. Nature Communications (2022)
- Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Scientific Reports (2021)