Abstract
Missense driver mutations in cancer are concentrated in a few hotspots1. Various mechanisms have been proposed to explain this skew, including biased mutational processes2, phenotypic differences3,4,5,6 and immunoediting of neoantigens7,8; however, to our knowledge, no existing model weighs the relative contribution of these features to tumour evolution. We propose a unified theoretical ‘free fitness’ framework that parsimoniously integrates multimodal genomic, epigenetic, transcriptomic and proteomic data into a biophysical model of the rate-limiting processes underlying the fitness advantage conferred on cancer cells by driver gene mutations. Focusing on TP53, the most mutated gene in cancer1, we present an inference of mutant p53 concentration and demonstrate that TP53 hotspot mutations optimally solve an evolutionary trade-off between oncogenic potential and neoantigen immunogenicity. Our model anticipates patient survival in The Cancer Genome Atlas and patients with lung cancer treated with immunotherapy as well as the age of tumour onset in germline carriers of TP53 variants. The predicted differential immunogenicity between hotspot mutations was validated experimentally in patients with cancer and in a unique large dataset of healthy individuals. Our data indicate that immune selective pressure on TP53 mutations has a smaller role in non-cancerous lesions than in tumours, suggesting that targeted immunotherapy may offer an early prophylactic opportunity for the former. Determining the relative contribution of immunogenicity and oncogenic function to the selective advantage of hotspot mutations thus has important implications for both precision immunotherapies and our understanding of tumour evolution.
Main
The distribution of mutations in cancer is highly non-uniform. Mutations in oncogenes and tumour suppressors are enriched across cancers, and specific sites known as hotspots are more frequently mutated, leading to the hypothesis that hotspot mutations offer a selective advantage1. A paradigmatic example is the tumour suppressor p53. Although TP53 is mutated in more than 50% of cancers, only eight hotspot mutations make up approximately one-third of all missense TP53 mutations3. Several hypotheses have been offered to explain the mechanisms behind this skewed distribution, including biased generative mutational processes during tumour evolution2,3, degree of functional alteration3,4,5, structural stability3,6 and immune editing7,8. However, these hypotheses are not mutually exclusive. Mutations and subsequent selection can lead to substantial alterations in the concentration of oncogenic proteins9,10,11, a factor that has not been quantified as a contributor to the predominance of hotspot mutations. Generally, mutant p53 is present at a higher concentration than wild-type protein, depending on the tissue, copy-number alteration and mutation12,13,14. Yet, divergence from self and overexpression can contribute to mutant p53 neoantigen immunogenicity, constraining the ability of mutant p53 to avoid immune surveillance. Because neoantigens from mutations in tumour driver genes that are shared across patients and tumour types represent attractive immunotherapeutic targets15,16, understanding this issue is of critical importance. Here we examine the relationship between oncogenicity and immunogenicity for tumour driver mutations, using p53 as a primary example, to develop a model for predicting therapeutic targeting strategies, such as for neoantigen-based immune therapies.
We found that mutation frequency distributions for commonly mutated driver genes were conserved across multiple cancer mutation databases (Fig. 1a, b) and that innate mutation rates based on trinucleotide context significantly correlated with mutation frequencies for several genes (Supplementary Information). We next quantified amino acid conservation over homologous proteins, a proxy for functional phenotype (Fig. 1c), and in silico-predicted reduced neoantigen presentation by major histocompatibility complex class I (MHC-I) molecules (Fig. 1d) across driver genes7. Several genes have hotspots that lie at conserved sites and are poorly presented (Fig. 1e), implying that the fitness advantages the mutations confer may be driven by both features. We focused on TP53 because it is widely mutated in tumours, with well-established, order-conserved pan-cancer hotspots (Fig. 1b and Supplementary Table 1) and broadly available functional phenotypic data5. We quantified the altered transcription factor function of mutant p53 across eight principal transcriptional targets with a quantitative yeast assay5 (Fig. 1f and Extended Data Fig. 1). We found that, although loss of transactivation was present for hotspot mutations, many non-hotspot mutations had comparatively low transactivation capacity. Moreover, we predicted MHC-I molecule presentation for the set of nonamer neopeptides surrounding p53 hotspot mutations to be worse than for non-hotspot peptides in The Cancer Genome Atlas (TCGA; P = 4.748 × 10⁻⁷, two-sided Welch’s t-test; Fig. 1g). Mutant p53 loss of transcriptional activity and neoantigen presentation of derived neopeptides showed only weak rank correlation (Fig. 1h), leading us to conclude that all of the mechanisms proposed to underlie mutant p53 fitness are likely to provide some predictive information.
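The hotspot versus non-hotspot comparison is reported with a two-sided Welch’s t-test. As a minimal sketch of that statistic, the following computes the t value and the Welch–Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t` and the score lists are hypothetical illustrations, not the TCGA presentation scores; converting the result to a P value needs a t-distribution CDF, which is left out here.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard errors of each sample mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical presentation scores (illustrative values only).
hotspot_scores = [0.2, 0.3, 0.1, 0.4]
other_scores = [0.6, 0.9, 0.5, 0.8, 0.7]
t_stat, dof = welch_t(hotspot_scores, other_scores)
```

Unlike Student’s t-test, this form does not assume equal variances in the two groups, which is the reason to prefer it when group sizes and spreads differ.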
We therefore sought to harmonize this proposed feature set within a mechanistic mathematical model of mutant p53 fitness17,18,19,20,21. A model based on background mutation rates alone was insufficient to separate the hotspots from other mutations (Fig. 2a). We further looked to capture variation in mutant p53 concentration, which affects both the transcription factor function and neoantigen presentation. We assigned TCGA samples a normalized p53 protein concentration and effective MDM2 promoter affinity to infer typical per-allele mutant-specific concentrations22,23. We consistently found a significant inverse relationship between these two variables across tumour types (Fig. 2b and Extended Data Fig. 2a) and a significant correlation between our concentration estimates and immunohistochemistry data (Extended Data Fig. 2b, c). We constructed a nonlinear, two-parameter model that separates mutant p53 fitness into a positive pro-oncogenic probability and a negative immunogenic probability (Supplementary Methods) coupled to mutant p53 concentration. Each component is given an appropriate weight by maximum-likelihood fitting with respect to TCGA mutation frequencies. Our fitness model successfully predicts the distribution of mutation frequencies, both per mutation and per codon (Fig. 2c and Supplementary Information), and accurately predicts the increase or decrease in each mutant frequency with respect to background frequency (Extended Data Fig. 3a, b). By comparing the relative likelihoods of the models, we found that predicting the distribution of TP53 mutations requires both functional and immune components (Supplementary Table 2 and Supplementary Methods). Model optimization depended strongly on the sampled MHC-I haplotype and all mutant phenotypes (Extended Data Fig. 3c, d and Supplementary Information). We optimized and applied similar models to other driver genes, with conservation used as a proxy for function (Extended Data Fig. 4a and Supplementary Methods). Combined models were more predictive for mutation distributions with larger frequency variance across all database mutations, which implies that increased mutation frequency variance relates to increased selection, as expected from Fisher’s theorem24 (Extended Data Fig. 4b), such as for PTEN (Extended Data Fig. 4c). To build a predictive model for KRAS, we were able to include measured binding affinities to the downstream Raf effector protein for a limited set of hotspot mutations25 (Supplementary Methods), in addition to inferences in conservation and immunogenicity (Extended Data Fig. 4d).
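To make the idea of weighting two fitness components by maximum likelihood concrete, here is a minimal sketch under strong simplifying assumptions. It replaces the nonlinear, concentration-coupled model of the Supplementary Methods with a log-linear stand-in, in which the predicted frequency is proportional to the background frequency times exp(w_f × functional − w_i × immune), and it fits the two weights by grid search against a multinomial likelihood. All function names and input values are hypothetical; the counts are synthetic.

```python
import math
from itertools import product


def predicted_freqs(background, functional, immune, w_f, w_i):
    """Normalised frequencies: background * exp(w_f * functional - w_i * immune)."""
    raw = [b * math.exp(w_f * f - w_i * i)
           for b, f, i in zip(background, functional, immune)]
    total = sum(raw)
    return [r / total for r in raw]


def log_likelihood(counts, freqs):
    """Multinomial log-likelihood of the counts, up to an additive constant."""
    return sum(c * math.log(p) for c, p in zip(counts, freqs))


def fit_weights(counts, background, functional, immune, grid):
    """Grid-search the (w_f, w_i) pair that maximises the likelihood."""
    return max(
        product(grid, repeat=2),
        key=lambda w: log_likelihood(
            counts, predicted_freqs(background, functional, immune, *w)),
    )


# Hypothetical inputs for four mutations (illustrative, not TCGA data).
background = [0.4, 0.3, 0.2, 0.1]
functional = [0.0, 1.0, 2.0, 1.5]
immune = [1.0, 0.5, 2.0, 0.0]
# Synthetic counts generated from known weights so the fit can be checked.
counts = [10000 * p
          for p in predicted_freqs(background, functional, immune, 1.0, 0.5)]
grid = [0.25 * k for k in range(9)]  # candidate weights from 0.0 to 2.0
best = fit_weights(counts, background, functional, immune, grid)
```

Because the synthetic counts are generated from weights that lie on the grid, the search recovers them; with real data a continuous optimizer and a comparison against functional-only and immune-only models would be the natural next step.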
To represent the landscape of mutant p53 fitness, we defined a ‘free fitness’ function of each mutation as the sum of the positive functional fitness, the negative immune fitness and the logarithm of the background frequency (Supplementary Methods), analogous to a free energy in statistical physics with the multiplicity of states derived from the background mutation rate. We plotted the free fitness landscape (Fig. 2d) and observed a general trade-off between intrinsic fitness (logarithm of the background frequency and functional fitness; Supplementary Methods) and extrinsic immune fitness. The trade-off observed in TP53 is reminiscent of other evolutionary trade-offs, and we theorized that TP53 hotspots were Pareto optimal26,27. We computed the Pareto front and identified the optimal fitness coordinate constrained by the front when using our model (Fig. 2d and Supplementary Methods). We found that hotspots had statistically higher free fitness (Fig. 2e) and occupied an optimal regime in which they successfully trade off between the pro-tumorigenic benefit of functional loss and the cost of presenting immunogenic neoantigens. However, there was substantial variation among the hotspot mutations. For instance, R175H is functionally the most wild-type-like hotspot but typically has the poorest MHC-I binding capacity. By contrast, the R248Q and R248W (R248Q/W) mutations have nearly complete loss of transcriptional function and therefore can more often afford to generate potentially immunogenic neoantigens, because the proliferative competitive advantage induced by mutation would offset the cost of immunogenicity. For KRAS, under more restrictive assumptions, we observed evidence for a trade-off between functional and immune fitness for hotspot mutations in pancreatic adenocarcinoma, where KRAS is typically mutated (Extended Data Fig. 4e and Supplementary Methods).
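The free fitness definition and the Pareto front described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the Supplementary Methods: free fitness is taken literally as functional fitness plus (negative) immune fitness plus the logarithm of the background frequency, and a mutation is on the front if no other mutation is at least as good on both intrinsic and immune axes and strictly better on one. The mutation labels and numbers are hypothetical.

```python
import math


def free_fitness(background_freq, functional, immune):
    """Sum of functional fitness, (negative) immune fitness and log background."""
    return functional + immune + math.log(background_freq)


def pareto_front(points):
    """Names of points not dominated on (intrinsic, immune), both maximised."""
    front = []
    for name, (x, y) in points.items():
        dominated = any(
            ox >= x and oy >= y and (ox > x or oy > y)
            for other, (ox, oy) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical mutations: (background frequency, functional, immune <= 0).
mutations = {
    "mut_A": (0.05, 1.0, -0.2),
    "mut_B": (0.02, 3.0, -1.5),
    "mut_C": (0.01, 0.5, -1.0),
    "mut_D": (0.03, 2.0, -0.8),
}
# Intrinsic fitness = log background + functional; extrinsic = immune.
coords = {name: (math.log(b) + f, i) for name, (b, f, i) in mutations.items()}
front = pareto_front(coords)
scores = {name: free_fitness(*vals) for name, vals in mutations.items()}
```

In this toy example `mut_C` is dominated by `mut_A` on both axes and therefore falls off the front, while the other three each trade one axis against the other.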
One possible explanation for the inverse relationship is that mutations that alter protein function are generally more likely to generate differentially immunogenic peptides. We therefore compared non-pathogenic and pathogenic mutations in a curated set of non-cancerous disease driver genes and found that both types of mutation generated comparably predicted immunogenic peptides (Extended Data Fig. 5), implying that the trade-off observed is not to be expected a priori. Moreover, because our functional predictions for mutant TP53 are based on precision yeast assays, we checked for evidence of an oncogenic–immunogenic trade-off using independent TCGA data from assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA sequencing to develop a score for the lack of mutant p53 binding site occupancy (Supplementary Methods). We found that the functional component of our fitness model correlated significantly with lack of binding (Extended Data Fig. 6a) and that samples with increased lack of p53 binding consistently showed decreases in p53 target gene RNA expression (Extended Data Fig. 6b). We independently re-derived the oncogenicity–immunogenicity trade-off by comparing the inferred immunogenicity to our scores for lack of binding (Extended Data Fig. 6c). Finally, as a further control, we found a correlation between the yeast assay-derived probability of DNA binding and median target gene RNA expression conditioned on chromatin accessibility (Extended Data Fig. 6d).
We tested our immunogenicity predictions for mutant p53 using peptides from hotspot mutations predicted to be presented on human leukocyte antigen (HLA)-A*02:01 (Supplementary Table 3 and Supplementary Methods), which is the most frequent MHC-I allele in TCGA. First, we asked whether these peptides had differential ability to bind and stabilize HLA on the cell surface, using the TAP2-deficient human lymphoblastoid T2 cell line (Supplementary Methods). We found that R248Q/W peptides but not R175H peptide could significantly stabilize HLA-A*02:01 expression on T2 cells in a dose-dependent manner in comparison with the respective wild-type peptide sequence (Extended Data Fig. 7a and Supplementary Table 3). We next asked whether R175H and R248Q/W TP53 hotspot mutations elicit differential immune responses in vivo in patients with cancer. We identified seven HLA-A*02:01-positive patients with either bladder or ovarian tumours with these mutations and available peripheral blood mononuclear cell (PBMC) samples at Memorial Sloan Kettering Cancer Center (MSKCC). In total, three samples were from patients with R175H-mutant tumours (07E, 38A and 72J) and five samples were from patients with R248Q-mutant tumours (72J, 01A, 39A, 82A and 105A) (Supplementary Table 4). One patient’s tumour (72J) had both mutations, although the R175H clonal fraction was far lower (Supplementary Table 4). All but two patients (72J and 07E) were immunotherapy naive at the time of sample collection. Patient 72J, who had a tumour with both hotspot mutations, had an ongoing complete response to nivolumab (anti-programmed death (PD)-1) treatment with no disease detectable at the time of PBMC collection. Patient 07E, who harboured the R175H mutation, was on atezolizumab (anti-PD-L1) treatment at the time of PBMC collection. All other samples were collected before treatment initiation. 
We stimulated the PBMCs with peptides harbouring the R175H or R248Q mutations or with a CEF (cytomegalovirus, Epstein–Barr virus, and influenza virus) peptide pool or DMSO as positive and negative controls, respectively (Supplementary Table 3). We then measured the interferon-γ (IFNγ) and tumour necrosis factor-α (TNFα) production in CD8+ T cells by flow cytometry (Fig. 3a, b and Extended Data Fig. 7b). We found responses in three of the five R248Q samples, with the response proportional to the size of the CD8+ T cell population (Fig. 3a, b and Extended Data Fig. 7c, d). This indicates that responses might correlate with the frequency of CD8+ T cell precursors recognizing the neopeptides. By contrast, only one of the three patients with R175H-mutant tumours had neopeptide reactivity; this patient (07E) had one of the largest expansions for the mutant TP53 allele and a concomitant increase in protein abundance as well as a positive response to anti-PD-L1 treatment (Fig. 3a and Extended Data Fig. 7e). This finding, in combination with the lack of T cell reactivity in the immunotherapy-naive patient (38A) with four mutant R175H alleles, indicates that, despite expansion of the mutant allele, R175H tends to be less immunogenic than R248Q/W, but that anti-R175H T cell responses may be unleashed by immune checkpoint blockade therapy. Consistent with this, we found no reactivity in patient 72J, who harboured both hotspot mutations at lower abundance (Extended Data Fig. 7e) and had a complete response to immune checkpoint blockade therapy. This indicates that, in cancer, expansion and/or persistence of cognate T cell pools depends on the levels of the mutant protein.
We next asked whether differential immunogenicity of TP53 hotspots was a broad phenomenon in the healthy population and therefore potentially linked to the frequency of T cell precursors recognizing a mutant peptide. We compared the capacity of R175H and R248Q/W peptides when loaded onto autologous antigen-presenting cells to prime and expand specific T cells in two healthy donors with the HLA-A*02:01 allele (Extended Data Fig. 7b, Supplementary Table 3 and Supplementary Methods). We consistently noted greater IFNγ and Ki67 expression in T cells stimulated with R248Q/W peptides than in those stimulated with R175H peptides in both donors (Fig. 3c, d and Extended Data Fig. 7f). Furthermore, we assessed the yield of TP53 hotspot-specific T cell clones by multiplex identification of T cell receptor (TCR) antigen specificity (MIRA) assay (Adaptive Biotechnologies) in PBMC samples from 107 healthy donors representing a set of distinct HLA alleles, including 25 HLA-A, 46 HLA-B and 20 HLA-C alleles (Supplementary Methods). Forty mutant epitopes from R175, R282, R273 and R248 loci covering the top six p53 hotspots were screened for multiple peptide lengths. The distribution of normalized TCR yield per antigen peptide per donor, indicative of specific clonal expansion, was plotted for each hotspot position (Fig. 3e). Notably, we found that the R175 hotspot yielded statistically lower TCR reactivity per peptide as compared with all other hotspots, having a median value of zero reacting TCRs per peptide. Moreover, we found that hotspot reactivity corresponded to fitness model predictions (Fig. 3f). These results indicate that the MHC-I haplotype and TCR repertoire distributions of the healthy population may be more likely to react to the R248 locus than the R175 locus.
Validating the link between increased immunogenicity and immune response to mutant p53, we found that the protein abundance of the CTLA-4, PD-1 and PD-L1 immune checkpoint proteins was higher in TCGA samples with TP53 mutations that were predicted to be more immunogenic (Extended Data Fig. 8). Our results suggest increased immune activation and concurrent establishment of adaptive immune resistance. When we segregated survival on the basis of functional, immune and combined fitness in TCGA and a cohort of patients with non-small-cell lung cancer (NSCLC) treated with anti-PD-1 at MSKCC (Extended Data Fig. 9), we found that functional and immune fitness components were required to achieve significant survival separation in TCGA, whereas immune fitness on its own significantly separated immunotherapy-treated patients with NSCLC by survival. For robustness, we retrained our models across a range of relative weights between functional and immune fitness (Supplementary Methods). We demonstrated that both components contributed to a model optimized for survival separation across TCGA, with the functional component carrying greater weight, whereas the immune component was the main determinant for an equivalent model in the immunotherapy-treated NSCLC cohort (Fig. 4e).
Because germline TP53 mutations are the primary cause of Li–Fraumeni syndrome (LFS), which is a highly cancer-prone autosomal dominant disorder28, we theorized that mutant p53 fitness relates to the time to first tumour formation in patients with LFS. We plotted Kaplan–Meier curves showing the age of tumour onset for persons with germline missense TP53 mutations in the International Agency for Research on Cancer (IARC) R20 germline dataset and for an independent LFS cohort coordinated by the National Cancer Institute (NCI)29, stratified on the basis of mutant p53 fitness (Supplementary Methods). We found that functional and immune components were required for significant separation of patients based on time to onset, with the immune component required across a range of relative weights (Fig. 4a, b and Extended Data Fig. 10). These results may seem counterintuitive in that mutant p53 may be interpreted as ‘self’ by the adaptive immune system in patients with LFS. However, increased mutant p53 abundance, compounded by additional somatic mutations, may increase tumour immune surveillance and mutant p53 antigenicity during tumorigenesis. These findings suggest a possible role for immune surveillance and the potential for immune intervention in germline TP53-mutant tumours.
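For readers unfamiliar with the Kaplan–Meier curves used here, the following is a minimal sketch of the estimator applied to age of tumour onset. It uses only the Python standard library; the function name and the ages are hypothetical and are not drawn from the IARC or NCI cohorts. An event flag of 1 means a first tumour was observed at that age, and 0 means the person was censored (tumour-free at last follow-up).

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  age at tumour onset or at censoring
    events: 1 if onset was observed at that age, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        onsets = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if onsets:
            # Multiply by the conditional probability of remaining tumour-free.
            survival *= 1 - onsets / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
        i += leaving
    return curve


# Hypothetical ages (years) for five carriers; two are censored.
ages = [20, 25, 25, 30, 40]
observed = [1, 1, 0, 1, 0]
curve = kaplan_meier(ages, observed)
```

Stratifying a cohort by a fitness score then amounts to computing one such curve per stratum and comparing them, for example with a log-rank test.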
Finally, non-cancerous cells in diverse tissues harbour somatic TP53 mutations that confer a competitive advantage, predisposing the clones containing such mutations to develop into cancer30. We collated mutation data from multiple published works across many mutated tissues (Supplementary Information) and found the same cancer hotspots in non-neoplastic cells (Fig. 4c). Unexpectedly, however, the frequency of the hotspot mutations was different. R175H was markedly under-represented in non-neoplastic cells compared with tumours (P < 0.0001, two-sided binomial test), whereas the potentially more immunogenic R248Q/W mutations were among the most frequent. The addition of an immune component in the non-neoplastic setting improved predictions to a substantially lower degree than in the neoplastic setting (Fig. 4d and Supplementary Table 5), supporting the hypothesis that the difference in hotspot frequency between non-cancerous and cancerous datasets is driven by the hotspot mutation’s immune fitness. We then split the non-neoplastic TP53 mutation dataset into the largest tissue-specific subgroups and found that immune weight depended on the tissue type (Fig. 4d), although the weight was always weaker than the optimal value for fitting the TCGA mutation distribution. Overall, these findings suggest that more functionally fit mutations probably predominate in non-cancerous and precancerous lesions owing to their selective replicative advantage; for cancer to form, however, immune escape becomes critical (Fig. 4f).
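The under-representation of a hotspot in one dataset relative to its frequency in another can be assessed with an exact two-sided binomial test, as reported above. The sketch below is one common formulation, summing the probabilities of all outcomes that are no more likely than the observed one; it uses only the Python standard library, and the function names and the example counts are hypothetical, not the values behind the reported P value.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_test_two_sided(k, n, p):
    """Exact two-sided P value: total probability of outcomes no more
    likely than the observed count k under the null rate p."""
    observed = binom_pmf(k, n, p)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(
        pmf for pmf in (binom_pmf(i, n, p) for i in range(n + 1))
        if pmf <= observed * tolerance
    )
    return min(1.0, total)


# Hypothetical example: a mutation expected at 5% of cases (null rate)
# is seen only twice among 200 non-neoplastic mutations.
p_value = binom_test_two_sided(2, 200, 0.05)
```

Here the null rate plays the role of the hotspot's share among tumour mutations and the observed count is its occurrence among non-neoplastic mutations.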
We present a general mathematical framework for predicting the fitness of tumour driver mutations. For p53, we used a free fitness model that integrates the background mutation rate, protein concentration, functional fitness advantage and immune fitness cost. Hotspots were predicted to fall on a near-optimal Pareto front, with trade-offs constraining driver mutations from completely evading immune selection, as has been shown for specific hotspot mutations31,32,33. Immune fitness has less of a role in predicting the distribution of non-cancerous TP53 mutations, which is consistent with recent observations that immune editing is less relevant in precancerous lesions34. Our insights therefore help define a window of opportunity for prophylactic immune intervention against mutant p53. Additionally, our model shows that mutant p53 fitness may have a role in determining the age of tumour onset in LFS, implying a benefit in targeting germline TP53 mutations immunotherapeutically. Our in vitro data suggest that inducing prophylactic immunity against mutant p53 is possible: anti-mutant p53 T cell responses could be induced in healthy individuals, even against poorly immunogenic mutations, when sufficient antigen concentration and proper immune co-stimulation were delivered. Our approach captures critical mechanistic determinants of mutant p53 fitness and is amenable to extensions as data become available. For instance, although we considered only functional alterations for a set of canonical p53-regulated genes in this study, future models can include additional new measures for describing mutant gain of function, such as novel binding interactions between mutant p53 and other molecules due to changes in protein conformation or concentration.
Similarly, other functions reflecting the vital role of p53 as a central transcription factor may be incorporated with additional data, such as induction of apoptosis at the mitochondria, immune regulation and surveillance of transposons and other genome parasites. The latter evolutionary role of p53 in preserving genome integrity may be responsible for p53’s centrality as a bottleneck across transcriptional networks35,36,37. Finally, our free fitness framework lends itself naturally to interpretable, free energy-based machine learning models38, which broadens the applicability of our approach to additional topics and modalities. By quantifying the underlying mechanisms of driver mutation fitness, we can therefore uncover both fundamental knowledge about tumour evolution and new opportunities for precision therapies.
Methods
All research involving human participants was approved by the authors’ institutional review board (MSKCC IRB), and all clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki. Written informed consent was obtained from the participants.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
Data availability
Original data required for running the fitness model are available at https://github.com/dfhoyosg/p53_fitness_tradeoff.
Code availability
Original code required for running the fitness model is available at https://github.com/dfhoyosg/p53_fitness_tradeoff.
Change history
31 May 2022
A Correction to this paper has been published: https://doi.org/10.1038/s41586-022-04879-8
References
Martínez-Jiménez, F. et al. A compendium of mutational cancer driver genes. Nat. Rev. Cancer 20, 555–572 (2020).
Giacomelli, A. O. et al. Mutational processes shape the landscape of TP53 mutations in human cancer. Nat. Genet. 50, 1381–1387 (2018).
Baugh, E. H., Ke, H., Levine, A. J., Bonneau, R. A. & Chan, C. S. Why are there hotspot mutations in the TP53 gene in human cancers? Cell Death Differ. 25, 154–160 (2018).
Petitjean, A. et al. Impact of mutant p53 functional properties on TP53 mutation patterns and tumor phenotype: lessons from recent developments in the IARC TP53 database. Hum. Mutat. 28, 622–629 (2007).
Kato, S. et al. Understanding the function–structure and function–mutation relationships of p53 tumor suppressor protein by high-resolution missense mutation analysis. Proc. Natl Acad. Sci. USA 100, 8424–8429 (2003).
Kotler, E. et al. A systematic p53 mutation library links differential functional impact to cancer mutation pattern and evolutionary conservation. Mol. Cell 71, 178–190 (2018).
Marty, R. et al. MHC-I genotype restricts the oncogenic mutational landscape. Cell 171, 1272–1283 (2017).
Pyke, R. M. et al. Evolutionary pressure against MHC class II binding cancer mutations. Cell 175, 416–428 (2018).
Ding, J. et al. Systematic analysis of somatic mutations impacting gene expression in 12 tumour types. Nat. Commun. 6, 8554 (2015).
Huang, N., Shah, P. K. & Li, C. Lessons from a decade of integrating cancer copy number alterations with gene expression profiles. Brief. Bioinform. 13, 305–316 (2012).
Fehrmann, R. S. et al. Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat. Genet. 47, 115–125 (2015).
Köbel, M. et al. Optimized p53 immunohistochemistry is an accurate predictor of TP53 mutation in ovarian carcinoma. J. Pathol. Clin. Res. 2, 247–258 (2016).
Murnyák, B. & Hortobágyi, T. Immunohistochemical correlates of TP53 somatic mutations in cancer. Oncotarget 7, 64910 (2016).
Cole, A. J. et al. Assessing mutant p53 in primary high-grade serous ovarian cancer using immunohistochemistry and massively parallel sequencing. Sci. Rep. 6, 26191 (2016).
Tran, E. et al. T-cell transfer therapy targeting mutant KRAS in cancer. N. Engl. J. Med. 375, 2255–2262 (2016).
Hsiue, E. H. et al. Targeting a neoantigen derived from a common TP53 mutation. Science 371, eabc8697 (2021).
Eigen, M. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58, 465–523 (1971).
Gerland, U. & Hwa, T. On the selection and evolution of regulatory DNA motifs. J. Mol. Evol. 55, 386–400 (2002).
Łuksza, M. & Lässig, M. A predictive fitness model for influenza. Nature 507, 57–61 (2014).
Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 551, 512–516 (2017).
Łuksza, M. et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature 551, 517–520 (2017).
Ma, L. et al. A plausible model for the digital response of p53 to DNA damage. Proc. Natl Acad. Sci. USA 102, 14266–14271 (2005).
Gaglia, G., Guan, Y., Shah, J. V. & Lahav, G. Activation and control of p53 tetramerization in individual living cells. Proc. Natl Acad. Sci. USA 110, 15497–15501 (2013).
Price, G. R. Fisher’s ‘fundamental theorem’ made clear. Ann. Hum. Genet. 36, 129–140 (1972).
Hunter, J. C. et al. Biochemical and structural analysis of common cancer-associated KRAS mutations. Mol. Cancer Res. 13, 1325–1335 (2015).
Shoval, O. et al. Evolutionary trade-offs, pareto optimality, and the geometry of phenotype space. Science 336, 1157–1160 (2012).
Pinheiro, F., Warsi, O., Andersson, D. I. & Lässig, M. Metabolic fitness landscapes predict the evolution of antibiotic resistance. Nat. Ecol. Evol. 5, 677–687 (2021).
Kratz, C. P. et al. Analysis of the Li–Fraumeni spectrum based on an international germline TP53 variant data set: an International Agency for Research on Cancer TP53 database analysis. JAMA Oncol. 7, 1800–1805 (2021).
De Andrade, K. C. et al. Cancer incidence, patterns, and genotype–phenotype associations in individuals with pathogenic or likely pathogenic germline TP53 variants: an observational cohort study. Lancet Oncol. 22, 1787–1798 (2021).
Martincorena, I. & Campbell, P. J. Somatic mutation in cancer and normal cells. Science 349, 1483–1489 (2015).
Caushi, J. X. et al. Transcriptional programs of neoantigen-specific TIL in anti-PD-1-treated lung cancers. Nature 596, 126–132 (2021).
Bear, A. S. et al. Biochemical and functional characterization of mutant KRAS epitopes validates this oncoprotein for immunological targeting. Nat. Commun. 12, 4365 (2021).
Malekzadeh, P. et al. Antigen experienced T cells from peripheral blood recognize p53 neoantigens. Clin. Cancer Res. 26, 1267–1276 (2020).
Colom, B. et al. Mutant clones in normal epithelium outcompete and eliminate emerging tumours. Nature 598, 510–514 (2021).
Wylie, A. et al. p53 genes function to restrain mobile elements. Genes Dev. 30, 64–77 (2016).
Levine, A. J., Ting, D. T. & Greenbaum, B. D. p53 and the defenses against genome instability caused by transposons and repetitive elements. Bioessays 38, 508–513 (2016).
McKerrow, W. et al. LINE-1 expression in cancer correlates with p53 mutation, copy number alteration, and S phase checkpoint. Proc. Natl Acad. Sci. USA 119, e2115999119 (2022).
Dayan, P., Hinton, G. E., Neal, R. M. & Zemel, R. S. The Helmholtz machine. Neural Comput. 7, 889–904 (1995).
Acknowledgements
We thank U. Alon, C. Chan, N. Copeland, P. Hainaut, N. Jenkins, S. Lowe, D. Pardoll, A. Snyder, D. Ting and the staff of the Balachandran, Greenbaum, Savage and Wolchok laboratories for offering conversations. We also thank N. Rusk for comments and editing. This research was funded in part through National Institutes of Health (NIH)/NCI Cancer Center Support Grant P30CA008748, Swim Across America, the Ludwig Institute for Cancer Research, the Ludwig Center for Cancer Immunotherapy at Memorial Sloan Kettering, the Cancer Research Institute, the Parker Institute for Cancer Immunotherapy, NIH grants R01AI081848, U01CA224175, R01CA240924 and U01CA228963, a collaboration of Stand Up To Cancer, the Society for Immunotherapy of Cancer and the Lustgarten Foundation, the V Foundation for Cancer Research, Sephora and the Pershing Square Sohn Foundation. C.B. was supported by NIH grant R01CA227534-03. D.F.B., S.A.F. and J.E.R. were supported by NIH grant P50CA221745 and the Ludwig Institute for Cancer Research. S.A.F. is supported by NIH grant K12CA184746-01A1. B.D.G. was supported by the Pershing Square Sohn Prize–Mark Foundation Fellowship supported by funding from the Mark Foundation for Cancer Research. M.Ł. is a Pew Biomedical Scholar. The work of K.C.A., P.P.K. and S.A.S. is supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, NCI. S.P.S. was supported by the LesLois Shaw Foundation and holds the Nicholls-Biondi Chair in Computational Oncology at MSKCC. B.W. is funded in part by Breast Cancer Research Foundation and Cycle for Survival grants. R.Z. was supported by the Parker Bridge Fellow Award. D.Z. is supported by the Ovarian Cancer Research Foundation Liz Tilberis Award and the Department of Defense Ovarian Cancer Research Academy (OC150111).
Author information
Contributions
B.D.G. conceptualized the study. T.M., A.J.L., M.Ł. and B.D.G. developed the research plan. D.H., Z.S., M.Ł. and B.D.G. developed the computational methodology. R.Z., I.S., A.J.L. and T.M. developed the experimental methodology. D.H., Z.S. and K.C.A. performed the computational analysis. R.Z. and I.S. performed the experimental analysis. P.P.K. and S.A.S. performed the LFS data collection and analysis. S.A.F., J.S.H., J.D.W., M.D.H. and T.M. collected and analysed the human samples. C.B., D.F.B., M.K.C., S.A.F., J.S.H., S.R.H., J.E.R., S.P.S., I.V.-G., B.W., M.W. and D.Z. collected samples from patients with cancer. L.F.C., E.J.O., M.K. and H.S.R. conducted MIRA assays and analysis. D.H., R.Z., I.S., Z.S., T.M., M.Ł. and B.D.G. wrote the manuscript. K.C.A., V.P.B., M.D.H., P.P.K., S.A.S., S.P.S., B.W. and J.D.W. reviewed and edited the manuscript. T.M., A.J.L., M.Ł. and B.D.G. are credited with senior authorship.
Corresponding authors
Ethics declarations
Competing interests
D.F.B. is a consultant for Bristol Myers Squibb, Merck, Genentech–Roche, AstraZeneca and Pfizer and has received research support from Merck, Genentech–Roche, AstraZeneca, Novartis and Bristol Myers Squibb. M.K.C. has received consulting fees from Bristol Myers Squibb, Merck, Incyte, Moderna, Immunocore and AstraZeneca and research funding from Bristol Myers Squibb. L.F.C., E.J.O., M.K. and H.S.R. are employees of Adaptive Biotechnologies. S.A.F. has received research support from AstraZeneca and Genentech–Roche; is a consultant and advisory board member for Merck; and owns stock in UroGen, Allogene Therapeutics, Neogene Therapeutics, Kronos Bio and IconOVir. B.D.G. has received honoraria for speaking engagements from Merck, Bristol Myers Squibb and Chugai Pharmaceutical; has received research funding from Bristol Myers Squibb; has been a compensated consultant for PMV Pharma, DarwinHealth and ROME Therapeutics; and is a cofounder of ROME Therapeutics. M.D.H. reports personal fees from Achilles, Adagene, Adicet, Arcus, AstraZeneca, Blueprint, Bristol Myers Squibb, Da Volterra, Eli Lilly, Genentech–Roche, Genzyme–Sanofi, Janssen, Immunai, Instil Bio, Mana Therapeutics, Merck, Mirati, Natera, PACT Pharma, Shattuck Labs and Regeneron and has equity options with Factorial, Immunai, Shattuck Labs and Arcus. M.D.H. also reports that a patent filed by Memorial Sloan Kettering related to the use of tumour mutational burden to predict response to immunotherapy (PCT/US2015/062208) is pending and licensed by Personal Genome Diagnostics and that, subsequent to completing this work, he became an employee of AstraZeneca. A.J.L. is a founder, director and shareholder of PMV Pharma and is the chair of the Janssen scientific advisory board. T.M. is a cofounder and holds equity in Imvaq Therapeutics; is a consultant for ImmunOs Therapeutics, ImmunoGenesis and Pfizer; has received research support from Bristol Myers Squibb, Surface Oncology, Kyn Therapeutics, Infinity Pharmaceuticals, Peregrine Pharmaceuticals, Adaptive Biotechnologies, Leap Therapeutics and Aprea; and has patent applications related to work on oncolytic viral therapy, alpha virus-based vaccines, neoantigen modelling, CD40, GITR, OX40, PD-1 and CTLA-4. I.S. is an inventor on a patent application related to work on CD40. J.E.R. has received consulting fees and trial funding from Bayer, Seagen, AstraZeneca, Roche, Astellas Pharma and QED Therapeutics; consulting fees from Bristol Myers Squibb, Merck, Pfizer, Pharmacyclics, Boehringer Ingelheim, GlaxoSmithKline, Infinity, Janssen, Mirati, EMD Serono, Gilead, BioClin, Eli Lilly and Company, Tyra Biosciences and Pharmacyclics; honoraria for continuing medical education from Research to Practice, MJH Life Sciences, Medscape, Clinical Care Options, OncLive and EMD Serono; and royalties from UpToDate. B.W. reports ad hoc membership of the scientific advisory board of Repare Therapeutics, outside the submitted work. J.D.W. is a consultant for Adaptive Biotechnologies, Amgen, Apricity, Ascentage Pharma, ArsenalBio, Astellas, AstraZeneca, Bayer, BeiGene, Boehringer Ingelheim, Bristol Myers Squibb, Celgene, Chugai, Eli Lilly, Elucida, F-Star, Georgiamune, Imvaq, Kyowa Kirin, Linneaus, Merck, Neon Therapeutics, Polynoma, PsiOxus, Recepta, Takara Bio, Trieza, Truvax, SELLAS, Serametrix, Surface Oncology, Syndax, Syntalogic and Werewolf Therapeutics; receives grant and research support from Bristol Myers Squibb and Sephora; and has equity in Tizona Pharmaceuticals, Adaptive Biotechnologies, Imvaq, BeiGene, Linneaus, Apricity, ArsenalBio and Georgiamune. R.Z. is an inventor on patent applications related to work on GITR, PD-1 and CTLA-4; is a scientific advisory board member of iTEOS Therapeutics; has consulted for Leap Therapeutics; and receives grant support from AstraZeneca and Bristol Myers Squibb. D.Z. is a consultant for Merck, Agenus, Hookipa Biotech, AstraZeneca, Western Oncolytics, Synthekine, MANA Therapeutics, Xencor, Memgen and Takeda; receives grant and research support from AstraZeneca, Roche and Plexxikon; holds stock options with ImmunOs Therapeutics, Calidi Biotherapeutics and Accurius; and has a patent related to use of Newcastle disease virus for cancer therapy with royalties paid by Merck.
Peer review
Peer review information
Nature thanks Alexander Anderson, Paul Thomas and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Inferred relationships between relative transactivation and apparent dimer dissociation constant.
Relationship between the relative transactivation and the inferred apparent dimer dissociation constant for mutant homodimer p53. Blue dotted lines correspond to wild-type p53, which has a relative transactivation of 1 (Methods). The hotspots’ inferred values are annotated in red.
Extended Data Fig. 2 Relationship between mutant p53 concentration and predicted MDM2 binding affinities.
a, Variation in normalized concentration across mutant p53 versus predicted affinity to MDM2 DNA in common TP53-mutated tissues within TCGA. Protein concentration is expressed as log2 of inferred protein concentration in nanomolar (nM) units. b, Fraction positive immunohistochemistry (IHC) assay from the IARC R20 dataset plotted against predicted per-allele mutant p53 concentration averaged across tissues. Correlations are for mutations with at least 10 IHC data entries (Pearson p-value 0.00848, Spearman p-value 0.00967). c, Fraction positive IHC assay plotted against predicted per-allele mutant p53 concentration averaged across tissues only for mutant TP53 hotspots (Pearson p-value 0.0207, Spearman p-value 0.00503).
Extended Data Fig. 3 Fitness model prediction analysis.
a, Predicted ratio from combined fitness model plotted against posterior ratio for each TP53 mutation. Mutations are colored by their observed frequency. Ratios > 1 are predicted to be fixed in the cancer population. Diagonal line corresponds to ratios being equal. b, Prediction accuracy plotted as the proportion of observed mutation frequency for true positive (TP), false positive (FP), true negative (TN) and false negative (FN) model predictions. c, Kullback-Leibler divergence versus number of simulated HLA-I haplotypes shows improved model predictions according to the haplotype sample size. d, Internal validation by shuffling background mutation frequencies, functional phenotypes and immune phenotypes of TP53 mutations for 1,000 iterations and computing the Kullback-Leibler divergence for each iteration. The histogram is of the distribution of Kullback-Leibler divergences from all iterations. Permutation-mean Kullback-Leibler divergence is plotted as a vertical black dotted line and the true Kullback-Leibler divergence is plotted as a vertical red dotted line.
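The permutation check in panel d can be illustrated with a minimal sketch. It assumes a toy predictor, `toy_predict`, in which predicted frequency is proportional to background rate times the exponential of a fitness score; that function, the helper names and the numbers are hypothetical stand-ins and not the authors' fitness model or data. The sketch shuffles the model inputs across mutations, recomputes the Kullback-Leibler divergence each time, and compares the unshuffled value with the permutation mean.

```python
import math
import random


def kl_divergence(observed, predicted, eps=1e-12):
    """Kullback-Leibler divergence D(observed || predicted) in nats."""
    p_total, q_total = sum(observed), sum(predicted)
    kl = 0.0
    for p, q in zip(observed, predicted):
        p, q = p / p_total, max(q / q_total, eps)
        if p > 0:  # terms with p = 0 contribute nothing
            kl += p * math.log(p / q)
    return kl


def toy_predict(background, fitness):
    """Hypothetical stand-in: frequency proportional to background * exp(fitness)."""
    weights = [b * math.exp(f) for b, f in zip(background, fitness)]
    total = sum(weights)
    return [w / total for w in weights]


def permutation_check(observed, background, fitness, n_iter=1000, seed=0):
    """Shuffle model inputs across mutations and recompute the KL divergence."""
    rng = random.Random(seed)
    true_kl = kl_divergence(observed, toy_predict(background, fitness))
    bg, fit = list(background), list(fitness)
    null = []
    for _ in range(n_iter):
        rng.shuffle(bg)
        rng.shuffle(fit)
        null.append(kl_divergence(observed, toy_predict(bg, fit)))
    return true_kl, null


# Illustrative numbers only, not data from the study.
background = [0.1, 0.2, 0.3, 0.4]
fitness = [2.0, 0.0, 1.0, -1.0]
observed = toy_predict(background, fitness)
true_kl, null = permutation_check(observed, background, fitness)
null_mean = sum(null) / len(null)
```

A true divergence well below the permutation mean indicates that the pairing of inputs to mutations carries information, which is what the red and black dotted lines in the panel compare.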
Extended Data Fig. 4 Fitness model predicts mutation frequencies in commonly mutated cancer driver genes.
a, Degree to which models of varying complexity account for mutation distributions from TCGA and COSMIC, excluding TCGA samples, across 27 commonly mutated cancer driver genes. Models are ranked by Bayesian Information Criterion (BIC) in descending order (models with the lowest BIC value are deemed the most explanatory). b, Boxplots of observed mutation frequency variances of driver genes best explained by a particular model, ranked by complexity in ascending order. c, Fitness model results for PTEN per protein position in TCGA, using both conservation and immunogenicity over background mutation rates. The full model is justified by the BIC value (KL divergence = 0.269; Pearson r = 0.701, p-value = 2.013e-24; Spearman r = 0.701, p-value = 2.386e-24). d, Fitness model results for KRAS per protein position in TCGA, using a full model with conservation, function and immunogenicity over background mutation rates with functional information available for seven frequent KRAS cancer mutations (G12A/C/D/R/V, G13D and Q61L). All components are justified by the BIC value (KL divergence = 0.256; Pearson r = 0.981, p-value = 2.095e-24; Spearman r = 0.616, p-value = 0.000104). e, Trade-off between gain-of-function and avoidance of neoantigen presentation, defined as \(1-{I}_{m}\left(H\right)\), in TCGA pancreatic cancer for KRAS hotspots (Pearson r = −0.750, p-value = 2.599e-23; Spearman r = −0.774, p-value = 1.507e-25). Each point corresponds to an individual pancreatic cancer sample with a hotspot KRAS mutation.
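The BIC-based model comparison in panel a can be sketched as follows. This is a minimal illustration assuming a multinomial likelihood over mutation counts and the standard definition BIC = k ln(n) − 2 ln(L); the function names, the two candidate models and the counts are hypothetical and are not taken from the study.

```python
import math


def multinomial_log_likelihood(counts, probs, eps=1e-12):
    """Log-likelihood of counts under predicted probabilities.

    The multinomial coefficient is omitted because it is identical for
    every model fitted to the same counts.
    """
    return sum(c * math.log(max(p, eps)) for c, p in zip(counts, probs))


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def rank_models(counts, models):
    """Rank models from lowest (best) to highest BIC.

    models maps a name to (predicted probabilities, number of free parameters).
    """
    n_obs = sum(counts)
    scored = [
        (name, bic(multinomial_log_likelihood(counts, probs), k, n_obs))
        for name, (probs, k) in models.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Illustrative counts and models only.
counts = [50, 30, 20]
models = {
    "background_only": ([1 / 3, 1 / 3, 1 / 3], 0),
    "full": ([0.5, 0.3, 0.2], 2),
}
ranking = rank_models(counts, models)
```

Here `rank_models` lists the best model first, which is the reverse of the descending order used in the panel; the penalty term means an extra component is retained only when it improves the likelihood enough to offset the added parameters.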
Extended Data Fig. 5 Inferred mutant immunogenicity is not related to pathogenicity in non-cancer driver genes.
a–f, Comparison of inferred immunogenicity across not-pathogenic and pathogenic missense mutations in nine non-cancerous disease driver genes (HBA, HBB, HBD, HG1, HG2, F8, PAH, PHEX and POGZ) using the Mann-Whitney U-test. Six out of nine genes had sufficient data for comparison between not-pathogenic and pathogenic mutations (HBA, HBB, F8, PAH, PHEX and POGZ). g, Data corresponding to all hemoglobin subunits (HBA, HBB, HBD, HG1 and HG2) were combined and compared (Hemoglobin). Mutations and their “Not-pathogenic” and “Pathogenic” status were determined using the NCBI’s dbSNP and ClinVar systems, respectively.
Extended Data Fig. 6 Fitness trade-offs inferred from ATAC- and RNA-seq.
a, Lack of binding score plotted versus predicted functional fitness. Most TCGA ATAC-seq samples were breast cancers (BRCA), therefore we only plot matched BRCA samples to normalize on tissue-specific protein abundance (Pearson r = 0.46, p-value = 0.063; Spearman r = 0.55, p-value = 0.023; N = 17). b, log2 of median TCGA RNA expression (TPM) of eight p53 target genes utilized in fitness model split on median TCGA ATAC-seq lack of DNA binding score (Mann-Whitney p-value = 0.006). c, Immune fitness plotted versus ATAC-seq-based lack of DNA binding footprinting score for each TCGA sample (Pearson r = −0.45, p-value < 0.0001; Spearman r = −0.49, p-value < 0.0001). d, Median TCGA RNA expression (TPM) of the target genes with available ATAC-seq data (WAF1, BAX, h1433s, AIP1, GADD45 and NOXA) plotted versus median probability of mutant p53 binding DNA, conditioned on target DNA chromatin accessibility (Pearson r = 0.25, p-value = 0.0459; Spearman r = 0.088, p-value = 0.480).
Extended Data Fig. 7 Differential T-cell reactivity to p53 neopeptides.
a, Flow cytometry quantification of HLA-A*02:01 expression on the surface of live T2 cells as a measure of peptide:MHC stabilization via binding to specific peptides. T2 cells were incubated overnight in serum-free media with recombinant human B2M and the indicated peptides at the indicated concentrations, or DMSO as vehicle control. Blue, negative controls (DMSO and unrelated HLA-B*35-restricted NY-ESO-1-derived peptide); red, positive controls (HLA-A*02:01-restricted peptides from flu and HIV viral antigens and Mart1/Melan-A melanoma-associated antigen); gray, experimental peptides containing the indicated mutation in comparison with the corresponding wild-type (wt) sequence. Data are mean ± SD of 2-3 replicates. P values are calculated with a two-sided unpaired t-test. b, Model illustrating the molecular basis of the T-cell stimulation assay and stimulation conditions (APC, antigen presenting cell; TCR, T-cell receptor). c, Representative plots of IFN-γ ± TNF-α expressing cells among CD8+CD3+ live T cells in PBMCs from patients with mutant p53 tumors as in Fig. 3a. d, Correlation analyses between indicated parameters in PBMC samples from R248Q mutant patients with presence of disease (N = 4) at the time of PBMC collection as in Fig. 3b. e, Estimate of mutant p53 amount per tumor cell before treatment in the same patients. Samples with R175H mutations are colored in blue. The sample which reacted, corresponding to the patient who received immune checkpoint blockade (ICB) therapy, is in solid blue, and the sample which did not react, and did not receive ICB, has filled-in lines. f, Flow cytometry gating strategy for total CD8 and non-naïve memory CD8 T-cells analyzed in Fig. 3c, d. TN: naïve T-cells, TCM: central memory T-cells, TEM: effector memory T-cells, TEMRA: effector memory T-cells re-expressing CD45RA.
Extended Data Fig. 8 Relationships between immune fitness and immune checkpoint protein expression in TCGA.
a, b, Continuous and categorical relationships between CTLA-4 (a) and PD-1 (b) protein expression available from TCGA RPPA proteomics assay and immune fitness. For the CTLA-4 scatterplot, Pearson p-value < 0.0001, Spearman p-value < 0.0001. For the PD-1 scatterplot, Pearson p-value = 0.00153, Spearman p-value < 0.0001. Categorical differences were measured with Welch’s t-test. c, Continuous and categorical relationships between PD-L1 protein expression available from TCGA RPPA proteomics assay and immune fitness in commonly TP53-mutated tissues. Correlation p-values: Ovarian - Pearson p-value = 0.2, Spearman p-value = 0.0829; Colorectal - Pearson p-value = 0.157, Spearman p-value = 0.003; NSCLC - Pearson p-value = 0.0812, Spearman p-value = 0.00793; Breast - Pearson p-value = 0.00671, Spearman p-value = 0.000140. Categorical differences were measured with Welch’s t-test.
Extended Data Fig. 9 p53 fitness predicts survival and immune relevance in diverse p53-mutated groups.
Kaplan-Meier curves separated by median functional, immune and total fitness in TCGA and MSKCC non-small cell lung cancer (NSCLC) ICB-treated samples. For NSCLC samples, matched HLA-TP53 mutation pairs with lung-specific and allele-specific concentrations were used to determine functional, immune and combined fitness. ns p > 0.05, * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001, **** p ≤ 0.0001.
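The median split behind these curves can be sketched as below. This is a minimal illustration assuming right-censored follow-up times, an event indicator (1 for death, 0 for censored) and one fitness score per patient; the function names and the numbers are hypothetical and are not data from the study. It uses the standard Kaplan-Meier product-limit estimator and does not compute the significance levels annotated in the figure, which would need a separate test such as the log-rank test.

```python
import statistics


def kaplan_meier(times, events):
    """Product-limit survival estimate as (time, survival) at each event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored subjects leave the risk set
    return curve


def split_by_median(scores):
    """Return indices of samples at or below the median and above it."""
    cutoff = statistics.median(scores)
    low = [i for i, s in enumerate(scores) if s <= cutoff]
    high = [i for i, s in enumerate(scores) if s > cutoff]
    return low, high


# Illustrative numbers only.
times = [5, 8, 12, 3, 9, 14]
events = [1, 0, 1, 1, 1, 0]
fitness = [0.9, 0.2, 0.1, 0.8, 0.7, 0.3]

low, high = split_by_median(fitness)
low_curve = kaplan_meier([times[i] for i in low], [events[i] for i in low])
high_curve = kaplan_meier([times[i] for i in high], [events[i] for i in high])
```

Samples tied with the median are assigned to the lower group here; that choice is an assumption of the sketch.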
Extended Data Fig. 10 Relationships of germline mutant p53 fitness and age of tumour onset.
Kaplan-Meier curves separated by median functional and immune mutant p53 fitness for first-cancer age of onset in the LFS IARC R20 germline dataset (N = 998) and the NCI LFS cohort (N = 82). Mutant p53 fitness was determined using TCGA-derived tissue-specific mutant p53 concentrations for both datasets, with individual HLA-I types for the NCI cohort and averages taken over TCGA haplotypes for the IARC dataset, which lacked individual HLA-I types.
Supplementary information
Supplementary Information
This file contains Table 1, Supplementary Figs. 1–5 and Supplementary References.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hoyos, D., Zappasodi, R., Schulze, I. et al. Fundamental immune–oncogenicity trade-offs define driver mutation fitness. Nature 606, 172–179 (2022). https://doi.org/10.1038/s41586-022-04696-z
DOI: https://doi.org/10.1038/s41586-022-04696-z