Phase and context shape the function of composite oncogenic mutations


Cancers develop as a result of driver mutations [1,2] that lead to clonal outgrowth and the evolution of disease [3,4]. The discovery and functional characterization of individual driver mutations are central aims of cancer research, and have elucidated myriad phenotypes [5] and therapeutic vulnerabilities [6]. However, the serial genetic evolution of mutant cancer genes [7,8] and the allelic context in which they arise are poorly understood in both common and rare cancer genes and tumour types. Here we find that nearly one in four human tumours contains a composite mutation of a cancer-associated gene, defined as two or more nonsynonymous somatic mutations in the same gene and tumour. Composite mutations are enriched in specific genes, have an elevated rate of use of less-common hotspot mutations acquired in a chronology driven in part by oncogenic fitness, and arise in an allelic configuration that reflects context-specific selective pressures. cis-acting composite mutations are hypermorphic in some genes in which dosage effects predominate (such as TERT), whereas they lead to selection of function in other genes (such as TP53). Collectively, composite mutations are driver alterations that arise from context- and allele-specific selective pressures that are dependent in part on gene and mutation function, and which lead to complex, often neomorphic, functions of biological and therapeutic importance.
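The definition of a composite mutation given above (two or more nonsynonymous somatic mutations in the same gene and tumour) can be illustrated with a minimal Python sketch. The record layout, the consequence labels in `NONSYNONYMOUS` and the toy calls are assumptions made for illustration only; they are not the authors' pipeline or data, and the sketch does not restrict the analysis to a curated list of cancer-associated genes.

```python
from collections import defaultdict

# Hypothetical consequence labels treated as nonsynonymous in this sketch.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}

def find_composite_mutations(mutations):
    """Return {(tumour, gene): [changes]} for every tumour-gene pair that
    carries two or more nonsynonymous mutations."""
    by_pair = defaultdict(list)
    for tumour, gene, change, consequence in mutations:
        if consequence in NONSYNONYMOUS:
            by_pair[(tumour, gene)].append(change)
    return {pair: changes for pair, changes in by_pair.items() if len(changes) >= 2}

def composite_rate(mutations, tumours):
    """Fraction of tumours that carry at least one composite mutation."""
    affected = {tumour for tumour, _gene in find_composite_mutations(mutations)}
    return len(affected) / len(tumours)

# Toy calls (invented): only tumour T1 has two nonsynonymous hits in one gene.
toy_calls = [
    ("T1", "PIK3CA", "E545K", "missense"),
    ("T1", "PIK3CA", "E726K", "missense"),
    ("T1", "TP53", "R273H", "missense"),
    ("T2", "EGFR", "L858R", "missense"),
    ("T2", "EGFR", "silent_1", "synonymous"),
    ("T3", "KRAS", "G12D", "missense"),
]
composites = find_composite_mutations(toy_calls)
rate = composite_rate(toy_calls, ["T1", "T2", "T3", "T4"])
```

In the toy data the synonymous EGFR change does not count towards a composite, so one of four tumours is affected.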

Fig. 1: Composite mutations in human cancers.
Fig. 2: Gene- and residue-specific selective pressure for composite mutations.
Fig. 3: cis- and trans-acting composite mutants.
Fig. 4: Mutant-allele-specific enrichment for composite mutations.

Data availability

All mutational data from the prospective sequencing cohort are available at [URL not preserved]. Mutational data from The Cancer Genome Atlas were acquired from [URL not preserved]. RNA sequencing data have been deposited in the Gene Expression Omnibus with accession number GSE136295. All other genomic and clinical data accompany the Article, and are available in the Extended Data and Supplementary Information. All other materials are available upon request from the corresponding authors.

Code availability

Source code for these analyses is available at [URL not preserved].


References

1. Vogelstein, B. et al. Cancer genome landscapes. Science 339, 1546–1558 (2013).
2. Garraway, L. A. & Lander, E. S. Lessons from the cancer genome. Cell 153, 17–37 (2013).
3. Cairns, J. Mutation selection and the natural history of cancer. Nature 255, 197–200 (1975).
4. Nowell, P. C. The clonal evolution of tumor cell populations. Science 194, 23–28 (1976).
5. Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
6. Hyman, D. M., Taylor, B. S. & Baselga, J. Implementing genome-driven oncology. Cell 168, 584–599 (2017).
7. Knudson, A. G., Jr. Mutation and cancer: statistical study of retinoblastoma. Proc. Natl Acad. Sci. USA 68, 820–823 (1971).
8. Bielski, C. M. et al. Widespread selection for oncogenic mutant allele imbalance in cancer. Cancer Cell 34, 852–862.e4 (2018).
9. Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).
10. Jin, G. et al. Disruption of wild-type IDH1 suppresses d-2-hydroxyglutarate production in IDH1-mutated gliomas. Cancer Res. 73, 496–501 (2013).
11. Mueller, S. et al. Evolutionary routes and KRAS dosage define pancreatic cancer phenotypes. Nature 554, 62–68 (2018).
12. Chang, M. T. et al. Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat. Biotechnol. 34, 155–163 (2016).
13. Chang, M. T. et al. Accelerating discovery of functional mutant alleles in cancer. Cancer Discov. 8, 174–183 (2018).
14. Intlekofer, A. M. et al. Acquired resistance to IDH inhibition through trans or cis dimer-interface mutations. Nature 559, 125–129 (2018).
15. Hidaka, N. et al. Most T790M mutations are present on the same EGFR allele as activating mutations in patients with non-small cell lung cancer. Lung Cancer 108, 75–82 (2017).
16. Gainor, J. F. et al. Molecular mechanisms of resistance to first- and second-generation ALK inhibitors in ALK-rearranged lung cancer. Cancer Discov. 6, 1118–1133 (2016).
17. Kobayashi, S. et al. EGFR mutation and resistance of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 352, 786–792 (2005).
18. Vasan, N. et al. Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kα inhibitors. Science 366, 714–723 (2019).
19. Chen, Z. et al. EGFR somatic doublets in lung cancer are frequent and generally arise from a pair of driver mutations uncommonly seen as singlet mutations: one-third of doublets occur at five pairs of amino acids. Oncogene 27, 4336–4343 (2008).
20. Huang, F. W. et al. Highly recurrent TERT promoter mutations in human melanoma. Science 339, 957–959 (2013).
21. Bell, R. J. A. et al. The transcription factor GABP selectively binds and activates the mutant TERT promoter in cancer. Science 348, 1036–1039 (2015).
22. Berenjeno, I. M. et al. Oncogenic PIK3CA induces centrosome amplification and tolerance to genome doubling. Nat. Commun. 8, 1773 (2017).
23. Kinross, K. M. et al. An activating Pik3ca mutation coupled with Pten loss is sufficient to initiate ovarian tumorigenesis in mice. J. Clin. Invest. 122, 553–557 (2012).
24. Madsen, R. R. et al. Oncogenic PIK3CA promotes cellular stemness in an allele dose-dependent manner. Proc. Natl Acad. Sci. USA 116, 8380–8389 (2019).
25. Hyman, D. M. et al. Precision medicine at Memorial Sloan Kettering Cancer Center: clinical next-generation sequencing enabling next-generation targeted therapy trials. Drug Discov. Today 20, 1422–1428 (2015).
26. Cheng, D. T. et al. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J. Mol. Diagn. 17, 251–264 (2015).
27. Zehir, A. et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat. Med. 23, 703–713 (2017).
28. Chakravarty, D. et al. OncoKB: a precision oncology knowledge base. JCO Precis. Oncol. 1, 1–16 (2017).
29. Campbell, B. B. et al. Comprehensive analysis of hypermutation in human cancer. Cell 171, 1042–1056.e10 (2017).
30. Niu, B. et al. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics 30, 1015–1016 (2014).
31. Middha, S. et al. Reliable pan-cancer microsatellite instability assessment by using targeted next-generation sequencing data. JCO Precis. Oncol. 1, 1–17 (2017).
32. Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Campbell, P. J. & Stratton, M. R. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 3, 246–259 (2013).
33. Dixon, P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 14, 927–930 (2003).
34. Smedley, D. et al. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43, W589–W598 (2015).
35. Forbes, S. A. et al. COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015).
36. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
37. Alexandrov, L. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
38. Pich, O. et al. Somatic and germline mutation periodicity follow the orientation of the DNA minor groove around nucleosomes. Cell 175, 1074–1087.e18 (2018).
39. Sabarinathan, R., Mularoni, L., Deu-Pons, J., Gonzalez-Perez, A. & López-Bigas, N. Nucleotide excision repair is impaired by binding of transcription factors to DNA. Nature 532, 264–267 (2016).
40. Buisson, R. et al. Passenger hotspot mutations in cancer driven by APOBEC3A and mesoscale genomic features. Science 364, eaaw2872 (2019).
41. Hess, J. M. et al. Passenger hotspot mutations in cancer. Cancer Cell 36, 288–301.e14 (2019).
42. Needleman, S. B. & Wunsch, C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453 (1970).
43. McGranahan, N. et al. Clonal status of actionable driver events and the timing of mutational processes in cancer evolution. Sci. Transl. Med. 7, 283ra54 (2015).
44. Dimitrova, N. et al. Stromal expression of miR-143/145 promotes neoangiogenesis in lung cancer development. Cancer Discov. 6, 188–201 (2016).
45. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
46. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
47. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
48. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
49. Bult, C. J., Blake, J. A., Smith, C. L., Kadin, J. A. & Richardson, J. E. Mouse genome database (MGD) 2019. Nucleic Acids Res. 47, D801–D806 (2019).
50. Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 46, D260–D266 (2018).
51. Tan, G. & Lenhard, B. TFBSTools: an R/bioconductor package for transcription factor binding site analysis. Bioinformatics 32, 1555–1556 (2016).
52. Touzet, H. & Varré, J.-S. Efficient and accurate P-value computation for position weight matrices. Algorithms Mol. Biol. 2, 15 (2007).
53. Supek, F. & Lehner, B. Clustered mutation signatures reveal that error-prone DNA repair targets mutations to active genes. Cell 170, 534–547.e23 (2017).
54. Nik-Zainal, S. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).


Acknowledgements

We thank the members of the E.R. and B.S.T. laboratories for discussion and support. This work was supported by National Institutes of Health awards P30 CA008748, P01 CA087497 (S.W.L.), U54 OD020355 (S.W.L. and B.S.T.), R01 CA207244 (B.S.T.), R01 CA204749 (B.S.T.), R01 CA245069 (B.S.T.); Brown Performance Group ICI Fund (N.V. and E.R.), Society of MSK (N.V. and E.R.), American Cancer Society, Anna Fuller Fund and the Josie Robertson Foundation (B.S.T.). F.J.S.-R. is an HHMI Hanna Gray Fellow supported in part by an MSKCC Translational Research Oncology Training Fellowship (T32-CA160001). S.W.L. is an investigator of the Howard Hughes Medical Institute.

Author information




Contributions

A.N.G., E.R. and B.S.T. conceived the study. C.M.B., E.B., P.J., A.V.P., A.L.R., N.D.F., C.B., N.S., E.R. and B.S.T. assisted with genomic data collection and analytical methodology development. F.J.S.-R., Y.C., N.V., M.S. and S.W.L. designed and performed the experiments. Y.J.H. and T.B. assisted with RNA sequencing. A.N.G., E.R. and B.S.T. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Ed Reznik or Barry S. Taylor.

Ethics declarations

Competing interests

N.V. reports advisory board activities for Novartis and consulting activities for Petra Pharmaceuticals. M.S. has received research funding from Puma Biotechnology, Daiichi-Sankyo, Immunomedics, Targimmune and Menarini Ricerche; is a cofounder of, and is on the advisory boards of the Bioscience Institute and Menarini Ricerche. S.W.L. is a founder and scientific advisory board member of Oric Pharmaceuticals, Mirimus, Inc. and Blueprint Medicines; and is on the scientific advisory boards of Constellation Pharmaceuticals, Petra Pharmaceuticals and PMV Pharmaceuticals. B.S.T. reports receiving honoraria and research funding from Genentech and Illumina, and advisory board activities for Boehringer Ingelheim and Loxo Oncology, a wholly owned subsidiary of Eli Lilly, Inc. All stated activities were outside of the work described here. The other authors declare no competing interests.

Additional information

Peer review information Nature thanks Moritz Gerstung, Mark Lackner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Study cohort and rates of composite mutations.

a, Distribution of cancer types in the study cohort. b, The rate of composite mutations (22.7% of all tumours) compared to a simulated background rate (black, P = 10^−5 from one-sided permutation test for enrichment with 100,000 random permutation-based simulations (no permutation exceeded observed value)). c, The observed rate of composite mutations in the primary untreated cancers of the TCGA cohort (n = 10,908 solid tumours) when controlling for gene content for consistency with the targeted sequencing panel of the prospective cohort studied here. The null distribution from sampling (Methods) is shown in black. d, The observed and expected rate of composite mutations in tumours of the indicated tumour mutational burden (as in Fig. 1b, n = 30,505 biologically independent tumour samples with tumour mutational burden ≤ 40, P = 1 × 10^−9 from two-sided Wilcoxon signed-rank test).
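The P value quoted for panel b is consistent with the resolution limit of a permutation test: with 100,000 simulations and none exceeding the observed rate, the smallest reportable value is about 10^−5. A minimal Python sketch of that arithmetic follows. The +1 correction is one common convention and is an assumption here, as is the placeholder null distribution; the actual null model is described in the Methods and is not reproduced.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical P value for enrichment.

    Counts null draws at least as large as the observed statistic and applies
    the +1 correction, so the estimate is bounded below by 1 / (N + 1).
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# Placeholder null: 100,000 simulated rates, all below the observed 22.7%.
null_rates = [0.10] * 100_000
p_floor = empirical_p_value(0.227, null_rates)  # 1 / 100,001, i.e. about 1e-5
```

Under this convention the reported value is an upper bound set by the number of permutations rather than a point estimate of the true tail probability.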

Extended Data Fig. 2 Sources of local hypermutation.

a, The number of composite mutations comprising two or more constituent variants (top) and the distribution of likely causative mutational signatures among them (bottom). Composite mutants comprising greater than three mutations were increasingly produced by APOBEC-associated mutagenesis, indicative of localized hypermutation [53,54], but accounted for a minority of events cohort-wide. b, Left, the somatic mutational data in the study cohort reflect the elevated mutation rates previously observed at both the positions closest to the nucleosome dyad as well as DNA bound to active transcription-factor binding sites [38,39]. However, mutations arising in composite events were proportionally less often proximal to such sites (defined here as within the full width at half maximum of the peak of mutation rate (red)) than were singleton mutations (right, P = 10^−27 and 10^−47, respectively; two-sided two-sample Z-test, n = 323,883 single-nucleotide substitutions arising in 471 biologically distinct melanoma samples).

Extended Data Fig. 3 Number and distribution of composite events across genes.

a, The number and percentage of cases in the study cohort containing composite mutations in the indicated genes (right) juxtaposed to their overall mutation rate (left). Genes with a significant enrichment of composite mutations are shown (Q < 0.01, FDR-adjusted P values from one-sided binomial test for enrichment, n = 26,997 as in Fig. 2b), limited to the top 10 genes by significance in each category of gene function, unless fewer. b, The significance of enrichment for composite mutations (n and statistical tests as described in a and Fig. 2b) limited to 168 oncogenes.
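The gene-level enrichment test named in this legend (a one-sided binomial test followed by FDR adjustment) can be sketched in Python as below. This is an illustration under stated assumptions: the Benjamini-Hochberg procedure is assumed because the legend does not name the FDR method, and the gene labels, counts and background probability are invented.

```python
from math import exp, lgamma, log

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space (requires 0 < p < 1)."""
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted values (Q), returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Invented counts: (cases with a composite mutation, mutated cases) per gene.
toy_counts = {"GENE_A": (40, 200), "GENE_B": (12, 200), "GENE_C": (25, 200)}
background = 0.05  # hypothetical expected composite rate per mutated case
p_values = {g: binom_sf(k, n, background) for g, (k, n) in toy_counts.items()}
q_values = dict(zip(p_values, benjamini_hochberg(list(p_values.values()))))
enriched = [g for g, q in q_values.items() if q < 0.01]
```

The log-space sum keeps the tail probability computable at cohort sizes where binomial coefficients would overflow a float.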

Extended Data Fig. 4 cis composite secondary-resistance mutations.

The cis composite mutations classified as arising in post-treatment specimens due to acquired resistance to one of several molecularly targeted therapies in the study cohort.

Extended Data Fig. 5 Phenotypic characterization of TP53 composite mutants.

a, TP53^R280T/E287D mutant lung adenocarcinoma. Left, mutant allele fractions of clonal TP53 mutations consistent with loss of wild-type TP53 (error bars, 95% binomial confidence intervals). Expected mutant allele fractions of different copy number states are shown as horizontal lines. Mutant KEAP1 in the same tumour (with LOH) is shown for reference. Right, spanning reads indicating cis mutations. b, Right and left, Trp53 and Cdkn1a mRNA expression in Kras^G12D/+ Trp53^Mut mouse lung cancer cells expressing distinct Trp53 genotypes. Bars, average of three replicates, error bars are 95% confidence intervals. c, The aggregate Z-score per replicate for the mRNA expression of canonical p53-target genes (n = 3 replicates per allele; box centre is median, edges are 25% and 75% quartiles, whiskers are minimum and maximum of the most extreme values). d, Principal component analysis of the transcriptomes of Trp53 genotypes (n = 3 replicates shown per condition). e, Dendrogram as in Fig. 3f, indicating the genes of interest (effectors of the AP-1 transcription factor network (PID_AP1_PATHWAY; Q = 1.4 × 10^−7 based on computed overlap (using MSigDB) with n = 5,501 gene sets from the curated C2 collection)). f, The prevalence of TP53^R280T and TP53^E287D mutations (top), and the fraction arising as composite mutants (bottom). The corresponding mouse alleles are given in parentheses. g, Principal component analysis of the transcriptomes of the Trp53^R277K/E282K composite mutation genotypes (as in d). n = 3 replicates per allele. h, The percentage of GFP+ FACS-purified Kras^G12D/+ Trp53^−/− lung adenocarcinoma cells stably transduced with pMIG-empty or pMIG-p53-R277T-E284D, and cultured in vitro for 10 days in a 60:40 mixture with untransduced parental cells. Bar indicates mean, error bars are s.d., n = 3 independent infections. i, Overall survival of immunocompromised mice bearing lung tumours of the indicated Trp53 genotypes generated by tail vein injection of stably transduced and FACS-purified Kras^G12D/+ Trp53^−/− lung adenocarcinoma cells (n = 100,000 cells).
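Panel a compares observed mutant allele fractions, with binomial confidence intervals, against the fractions expected under different copy number states. A minimal Python sketch of both quantities follows, using the standard purity and copy number relationship. The Wilson score interval is an assumption, since the legend does not state which binomial interval was used, and the purity, read counts and copy number states are invented.

```python
from math import sqrt

def expected_allele_fraction(purity, mutant_copies, total_copies, normal_copies=2):
    """Expected mutant allele fraction for a clonal mutation, given tumour
    purity and the tumour's mutant and total copy number at the locus."""
    tumour_dna = purity * total_copies
    normal_dna = (1 - purity) * normal_copies
    return purity * mutant_copies / (tumour_dna + normal_dna)

def wilson_interval(alt_reads, depth, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = alt_reads / depth
    denom = 1 + z**2 / depth
    centre = (p + z**2 / (2 * depth)) / denom
    half = z * sqrt(p * (1 - p) / depth + z**2 / (4 * depth**2)) / denom
    return centre - half, centre + half

purity = 0.6  # hypothetical tumour purity
heterozygous = expected_allele_fraction(purity, 1, 2)    # one mutant of two copies
copy_neutral_loh = expected_allele_fraction(purity, 2, 2)  # wild-type copy replaced
hemizygous_loss = expected_allele_fraction(purity, 1, 1)   # wild-type copy deleted
low, high = wilson_interval(alt_reads=50, depth=100)
```

An observed fraction whose interval overlaps the copy-neutral LOH or hemizygous line, but not the heterozygous one, is the pattern the legend describes as consistent with loss of the wild-type allele.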

Extended Data Fig. 6 Saturation analysis of genes for composite mutation detection.

Down-sampling indicates the number of residues identified as enriched for arising in composite mutations in each of four genes (Q < 0.1, FDR-adjusted one-sided Fisher’s exact tests as in Fig. 4a; n = 1,000–26,997 patients per down-sample) as a function of the number of tumours sequenced (LOESS fit is shown with 95% confidence interval). Four genes that accounted for the greatest proportion of all enriched residues detected are shown (Fig. 4a). EGFR appears to reach saturation for discovery of residues enriched for arising in composite mutations, whereas the other genes have not yet reached saturation for discovery at the current cohort size.

Extended Data Fig. 7 Mutational signature attribution among composite mutations.

a, The fraction of all composite mutations identified here in which one or both individual mutations could be unambiguously attributed to an established mutational signature. The majority of composite variants could not be directly attributed to APOBEC, ultraviolet, smoking or other known mutational signatures. b, The fraction of composite mutations per gene in which one or both variants could be attributed to an established mutational signature.

Extended Data Fig. 8 Conditional mutant alleles.

a, The number of affected cases containing each of the indicated somatic mutations in TERT, EGFR or PIK3CA as either individual mutations (top) or as part of composite mutants (bottom). Conditional mutations were defined as those statistically enriched for arising as part of composite mutations, but seldom as individual hotspot mutations in cancer (predominantly accompanied by a second somatic mutation). b, The incidence of TERT promoter mutations and the fraction arising as composite mutations (orange). Bottom, the co-occurrence and mutual exclusivity of composite mutations in the TERT promoter (the P values for n = 5 and 6 co-occurring mutations are 0.002 and 3 × 10^−7, respectively, and for 0 mutually exclusive mutations is 1 × 10^−25; two-sided Fisher’s exact test, n = 29,507 patients). c, Transcription factor GABPA binding affinity for mutant and wild-type TERT promoter sequences at the 228G>A, 250G>A and the conditional 205G>A allele.
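The co-occurrence and mutual-exclusivity statistics in panel b rest on a two-sided Fisher's exact test over a 2 × 2 table of patients with and without each of two mutations. A minimal pure-Python sketch of that test is given below. The two-sided rule used (summing all tables no more probable than the observed one) is the common convention and is assumed here, and the example table is a textbook-sized illustration, not the TERT data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    a: both mutations, b: first only, c: second only, d: neither.
    Sums the hypergeometric probability of every table with the same margins
    that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    tail = sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))  # tolerance for float ties
    return min(1.0, tail)

# Illustrative table only (not study data).
p_example = fisher_exact_two_sided(8, 2, 1, 5)
```

A table with zero co-occurring cases and large margins yields a very small P value in the direction of mutual exclusivity, which is how a count of 0 can be highly significant.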

Supplementary information

Supplementary Information

This file contains a guide for Supplementary Tables 1-5.

Reporting Summary

Supplementary Tables

This file contains Supplementary Tables 1-5 – see Supplementary Information document for full guide.

About this article

Cite this article

Gorelick, A.N., Sánchez-Rivera, F.J., Cai, Y. et al. Phase and context shape the function of composite oncogenic mutations. Nature 582, 100–103 (2020).
