Cancers accumulate mutations that lead to neoantigens, novel peptides that elicit an immune response, and consequently undergo evolutionary selection. Here we establish how negative selection shapes the clonality of neoantigens in a growing cancer by constructing a mathematical model of neoantigen evolution. The model predicts that, without immune escape, tumor neoantigens are either clonal or at low frequency; hypermutated tumors can only establish after the evolution of immune escape. Moreover, the site frequency spectrum of somatic variants under negative selection appears more neutral as the strength of negative selection increases, which is consistent with classical neutral theory. These predictions are corroborated by the analysis of neoantigen frequencies and immune escape in exome and RNA sequencing data from 879 colon, stomach and endometrial cancers.
The datasets analyzed during the current study are available from the National Cancer Institute Genomics Data Commons Data Portal (https://portal.gdc.cancer.gov) colon adenocarcinoma, rectum adenocarcinoma, STAD and UCEC domains and from the European Genome-phenome Archive (https://ega-archive.org/) under accession no. EGAS00001003066.
This work was supported by the Wellcome Trust (grant 202778/B/16/Z to A.S.; grant 202778/Z/16/Z to T.A.G.; grant 105104/Z/14/Z to the Centre for Evolution and Cancer, Institute of Cancer Research; grant 108861/7/15/7 to R.O.S.; and grant 097319/Z/11/Z to C.P.B.) and Cancer Research UK (grant A22909 to A.S.; grant A19771 to T.A.G. supporting E.L.). A.R.A.A. and C.G. received support from the National Institutes of Health, National Cancer Institute under grant U54CA143970, and A.S. and T.A.G. under grant U54 CA217376. R.O.S. was also supported by the Wellcome Centre for Human Genetics (grant 203141/7/16/7). B.W. is funded by a Barts Charity Lectureship (grant MGU045). L.Z. is supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Research Fellowship scheme (846614).
The authors declare no competing interests.
a, Distribution of neoantigen scores of tumour cell populations when reaching 100,000 cells, for increasing selection strengths (left column, top to bottom) and cell-antigenicity threshold (Tc) used for analysis (right column). n = 100 tumours were simulated for each parameter combination. b, The distribution of antigenicity values of all neoantigens present in at least 10 cells in the final tumour population simulated with exponential, uniform and normal prior neoantigen distributions. The thick line shows the mean density of 50 simulated distributions at no, moderate and high selection pressure (s = 0 (yellow), s = -0.8 (teal), s = -1.6 (blue), respectively); the shaded regions represent ±1 standard deviation around this mean. Priors are shown with a grey dashed line.
Extended Data Fig. 2 Dynamics of the growing population compared to a constant size population model.
Tumours simulated using our tumour growth model (growing population, (a)–(c)) and a death-birth Moran process (constant size population, (d)–(f)) at different selection strength and mutation rate regimes, as indicated in each row and detailed in the main text. Simulations were run for a final population of 10,000 cells in the growing population model and for 50,000 steps in the constant size model. (a)&(d) The mean antigenicity of the tumour for 6 individual simulations. Tumours that reached detectable size are shown in blue, eradicated tumours (cell count reaches 0) are in red, the constant size tumours are invariably in grey. (b)&(e) Distribution of CCF values of the most common neoantigen computed from 20 tumours. For the growing population, only tumours that did not go extinct are shown, and consequently no graph is included for the last two rows. (c)&(f) Cumulative VAF distribution of all mutations (grey) and neoantigens (red). The thick line shows the mean of 20 simulated cumulative distributions, the shaded regions represent ±1 standard deviation around this mean. Note that in the first row, there are no neoantigens in the studied frequency range (no red line), while in the last row the grey and red curves overlap.
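The constant-size comparison in panels (d)–(f) can be illustrated with a minimal death-birth Moran sketch. This is not the authors' simulation code: the function name, the representation of each cell by a single "death weight", and the rule that the vacancy is filled by a uniformly chosen other cell are all assumptions made for illustration.

```python
import random


def moran_death_birth(death_weights, steps, seed=0):
    """Constant-size death-birth process (illustrative assumption).

    Each cell is represented only by its death weight; a higher weight
    stands in for stronger negative selection acting on that cell.
    """
    rng = random.Random(seed)
    pop = list(death_weights)
    n = len(pop)
    for _ in range(steps):
        # Death: pick one cell with probability proportional to its death weight.
        dead = rng.choices(range(n), weights=pop)[0]
        # Birth: a uniformly chosen *other* cell divides and fills the vacancy.
        parent = rng.choice([i for i in range(n) if i != dead])
        pop[dead] = pop[parent]
    return pop


if __name__ == "__main__":
    start = [1.0] * 5 + [3.0] * 5   # hypothetical mix of low/high death weights
    print(moran_death_birth(start, steps=200))
```

Because every death is immediately matched by one birth, the population size never changes, which is the property that distinguishes this setting from the growing-population model in panels (a)–(c).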
a, Prevalence of the individual immune escape mechanisms considered in the combined cohort of CRC, STAD and UCEC samples. P-values shown on top of each bar indicate the result of chi-squared test for that mechanism, corrected for multiple comparisons using the Holm-Bonferroni method. An additional test comparing the presence/absence of any immune checkpoint escape is also indicated above the checkpoint columns. b–e, Antigen landscape and immune escape characteristics of a combined cohort from TCGA. Figures correspond to Fig. 2d–g. Two-sided chi-squared test is indicated on top of (b) and (d). Mann-Whitney tests (c: two-sided, e: one-sided) are reported above (c) and (e). No adjustment for multiple comparisons was made. Violin widths in (c) & (e) represent raw data density with binned individual data-points overlaid on top.
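The Holm-Bonferroni correction mentioned for panel a can be sketched as follows. This is a generic standard-library illustration of the step-down procedure, not the authors' analysis code; the function name and the example p-values are hypothetical.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The p-value at 0-based rank k is multiplied by (m - k), capped at 1.
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(holm_bonferroni([0.01, 0.04, 0.03]))  # hypothetical raw p-values
```

A mechanism would then be called significant at level alpha when its adjusted p-value is at most alpha.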
Extended Data Fig. 4 VAF distribution of neoantigens under different selection strengths and mutation-antigenicity thresholds.
Cumulative VAF distribution as a function of the inverse of the frequency, for all mutations (grey) and antigenic mutations (red) at increasing selection strengths (top to bottom) and mutation-antigenicity threshold (Tm) applied to label mutations as antigenic (left to right). The thick line shows the mean of 100 simulated cumulative distributions, the shaded regions represent ±1 standard deviation around this mean.
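A cumulative VAF curve of this kind can be computed with a few lines of code. The sketch below is an illustration only: the function name, the frequency window (f_min, f_max) and the grid size are assumptions, not values taken from the study. For each grid point evenly spaced in 1/f, it counts the mutations whose VAF lies between f and f_max.

```python
def cumulative_vaf(vafs, f_min=0.05, f_max=0.5, n_points=10):
    """Count mutations with f <= VAF <= f_max on a grid evenly spaced in 1/f.

    Returns a list of (1/f, count) pairs, ordered by increasing 1/f.
    """
    inv_lo, inv_hi = 1.0 / f_max, 1.0 / f_min
    step = (inv_hi - inv_lo) / (n_points - 1)
    curve = []
    for k in range(n_points):
        inv_f = inv_lo + k * step
        f = 1.0 / inv_f
        # Cumulative count of mutations at or above frequency f.
        count = sum(1 for v in vafs if f <= v <= f_max)
        curve.append((inv_f, count))
    return curve


if __name__ == "__main__":
    example_vafs = [0.4, 0.3, 0.2, 0.1, 0.06]  # hypothetical frequencies
    for inv_f, count in cumulative_vaf(example_vafs, n_points=3):
        print(f"1/f = {inv_f:.1f}  cumulative count = {count}")
```

Plotting against 1/f is convenient because, under neutral exponential growth, the cumulative count is expected to rise approximately linearly in 1/f, so departures of the antigenic (red) curve from the all-mutation (grey) curve are easy to see.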
a, Number of detected (true) neoantigens in n=100 simulated tumours for each selection strength between s=0 and s=-3. The mean number detected at each selection value is shown in red. b–d, Power (detection rate) to identify negative selection using a two-sided Kolmogorov-Smirnov test (b, c) and number of detected neoantigens (d) as a function of read depth, false positive neoantigens amongst antigenic mutations and selection strength, when only mutations above the mutation-antigenicity threshold (Tm) of 0.35 are analysed as antigenic, instead of 0.2 (cf. Fig. 4c, d, Extended Data Figs. 4 & 5a). n=100 simulated tumours are used in the computation. e, Cumulative VAF distribution of mutations detected in low and medium immune infiltrated CRC (upper panel) and UCEC (lower panel) MSS cancers without immune escape. VAF distributions of STAD samples could not be established due to low sample and mutation numbers. f, Cumulative VAF distributions of mutations detected in essential genes in all TCGA MSS cancers with good tumour cellularity (above 70%). The curves show synonymous (purple), frameshift and nonsense (green), missense (red) and hemizygous (located in haploid regions of the genome, yellow) mutations found in essential gene exons.
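The two-sample Kolmogorov-Smirnov comparison used in panels b and c rests on a single statistic: the largest vertical gap between two empirical cumulative distributions. A minimal sketch of that statistic is given below; the function name and the example frequencies are hypothetical, and this is not the authors' pipeline.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every observation equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        # i/len(a) and j/len(b) are the two empirical CDFs evaluated at x.
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


if __name__ == "__main__":
    all_mutations = [0.10, 0.20, 0.30, 0.40]   # hypothetical VAFs
    antigenic = [0.25, 0.35, 0.45, 0.55]       # hypothetical VAFs
    print(ks_statistic(all_mutations, antigenic))
```

In practice a library routine such as `scipy.stats.ks_2samp` also returns a p-value; the power reported in the caption would then correspond to the fraction of simulated tumours in which that p-value falls below the chosen significance level.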
a, The number of detectable neoantigen-associated mutations (at simulated sequencing depth of ~50x) in n=50 simulated tumours with increasing base (non-immunogenic) death rate. The bottom panel shows the ratio of tumours with different levels of immune escape. Violin widths represent raw data density. b, Cumulative VAF distribution as a function of the inverse of the frequency, for all mutations (grey) and antigenic mutations (red) at increasing base death rate, db. The thick line shows the mean of n=100 simulated cumulative distributions, the shaded regions represent ±1 standard deviation around this mean. At very high base death (last panel), the VAF distribution of neoantigens and neutral mutations overlaps as tumours are exclusively immune-escaped and evolve neutrally.
a, b, The number of subclonal (CCF < 0.6) missense mutations in MSS (a) and MMR (b) CRC samples, shown together with the subclonal mutation count of ‘normal’ (a) and hyper-mutated (b) simulated tumours sequenced at a depth of 30-60x (sequencing depth sampled randomly in the range). Violin widths represent raw data density with individual data-points (of exact y values) scattered on top. c, The distribution of the proportion of antigenic mutations in a randomised TCGA MSS colon dataset, where patient mutation load and HLA types were extracted from the data and the proportion of antigenic mutations calculated by sampling randomly from missense mutations found in TCGA CRCs. The thick solid black line shows pa=0.075, the value used for simulations presented in main figures. Dashed red lines show pa=0.025 and pa=0.15 used in Extended Data Fig. 8.
Extended Data Fig. 8 VAF distribution of neoantigens under different selection strengths and antigen-generation probabilities.
Cumulative VAF distribution as a function of the inverse of the frequency, for all mutations (grey) and antigenic mutations (red) at increasing selection strengths (top to bottom) and antigen-generation probability, pa (left to right). The thick line shows the mean of n=100 independently simulated cumulative distributions, the shaded regions represent ±1 standard deviation around this mean.
a, Inter-tumour distribution of the antigenic proportion of missense mutations across CRC, STAD and UCEC cancers with >30 missense mutations. b, Inter-tumour distribution of proportional burden in MSS and MMR cancers of the meta-cohort combining CRC, STAD and UCEC cancers with >30 missense mutations. c, Inter-tumour distribution of proportional burden in escaped and non-escaped cancers of the meta-cohort. Two-sided Wilcoxon test p-values are reported on (a–c). d, Proportional burden computed for all (red) and subclonal (salmon) mutations in immune-escaped and non-escaped samples of each TCGA cohort. Lines connect values computed from the same cancer. e, Inter-tumour distribution of proportional burden in real CRC samples stratified by MSS/MMR status (left) and synthetic samples matching the mutational composition of real samples (right), with two-sided Wilcoxon-test reported on top. f, Total and subclonal proportional antigen burden computed on a matched synthetic cohort of n=100 tumours (cf. Fig. 5c). The p-value of paired two-sided Wilcoxon test is reported on (d) & (f). g, Normalised proportional subclonal burden computed by dividing subclonal burden of the meta-cohort by average subclonal burden of the synthetic cohort of n=100. P-value of a one-sample two-sided t-test against null-hypothesis of mean=1 is reported above each violin. Violin widths represent raw data density with binned individual data-points overlaid on top in (a–d) & (g) and indicated by the end-points of connecting lines in (e, f). Visual elements of boxplots in (d) correspond to the following summary statistics: centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x inter-quartile range; additional points, outliers outside of 1.5x inter-quartile range.
Colorectal, stomach and endometrial cancer samples in The Cancer Genome Atlas included in the bioinformatic analysis. The following information is listed for each sample in a tab-separated format: patient identifier, subtype (microsatellite stable (MSS), mismatch repair mutated (MMR) or polymerase-ε mutated (POLE)), cancer type (CRC, STAD or UCEC), number of unique mutations associated with neoantigens, number of somatic missense mutations, number of clonal neoantigens, T-cell infiltration category (low, medium or high), predicted escape type (as shown in Extended Data Fig. 3), average ploidy and tumour purity (as determined using ASCAT), number of subclonal antigen-associated mutations, and number of subclonal somatic missense mutations. Fields that could not be determined for a sample (subtype, immune escape mechanism and T-cell score) are denoted by NA.
Lakatos, E., Williams, M.J., Schenck, R.O. et al. Evolutionary dynamics of neoantigens in growing tumors. Nat Genet 52, 1057–1066 (2020). https://doi.org/10.1038/s41588-020-0687-1