Neoantigen quality predicts immunoediting in survivors of pancreatic cancer


Cancer immunoediting1 is a hallmark of cancer2 that predicts that lymphocytes kill more immunogenic cancer cells to cause less immunogenic clones to dominate a population. Although proven in mice1,3, whether immunoediting occurs naturally in human cancers remains unclear. Here, to address this, we investigate how 70 human pancreatic cancers evolved over 10 years. We find that, despite having more time to accumulate mutations, rare long-term survivors of pancreatic cancer who have stronger T cell activity in primary tumours develop genetically less heterogeneous recurrent tumours with fewer immunogenic mutations (neoantigens). To quantify whether immunoediting underlies these observations, we infer that a neoantigen is immunogenic (high-quality) by two features: ‘non-selfness’, based on neoantigen similarity to known antigens4,5, and ‘selfness’, based on the antigenic distance required for a neoantigen to differentially bind to the MHC or activate a T cell compared with its wild-type peptide. Using these features, we estimate cancer clone fitness as the aggregate cost of T cells recognizing high-quality neoantigens offset by gains from oncogenic mutations. With this model, we predict the clonal evolution of tumours to reveal that long-term survivors of pancreatic cancer develop recurrent tumours with fewer high-quality neoantigens. Thus, we submit evidence that the human immune system naturally edits neoantigens. Furthermore, we present a model to predict how immune pressure induces cancer cell populations to evolve over time. More broadly, our results argue that the immune system fundamentally surveils host genetic changes to suppress cancer.


In 1957, Burnet and Thomas proposed that the immune system in multicellular organisms must eliminate transformed cells as an evolutionary necessity to maintain tissue homeostasis. This theory of ‘cancer immunosurveillance’ was later redefined more broadly as ‘cancer immunoediting’6—as a consequence of the immune system protecting the host from cancer, the immune system must also sculpt developing cancers1,7. When cancers develop, they accumulate mutations, some of which generate new protein sequences (neoantigens)8. As neoantigens are mostly absent from the human proteome, they can escape T cell central tolerance in the thymus to become antigens in cancers8. However, neoantigens typically arise in passenger mutations, and therefore distribute heterogeneously in cancer cell clones with variable immunogenicity. Thus, T cells selectively ‘edit’ clones1 with more immunogenic neoantigens3, inducing less immunogenic clones to outgrow in cancers.

Although cancer immunoediting has been demonstrated through longitudinal studies in immune-proficient and immune-deficient mice1,3,8, whether it is a general principle of how human cancers evolve remains uncertain. Despite suggestive evidence9,10,11, definitive evidence requires longitudinal tracking of large numbers of patients and cancers over time. As this is logistically challenging, whether the human immune system naturally edits cancers and whether edited clones can be predicted a priori remain unclear.

Quantifying selection pressures on neoantigens

To address this, we examined how 70 pancreatic ductal adenocarcinomas (PDACs) from 15 patients evolved longitudinally over 10 years (Fig. 1a). We reasoned that PDAC is an ideal cancer to test the immunoediting hypothesis. First, human PDACs have fewer neoantigens (35 on average)5,12 compared with more immunogenic cancers (112 in non-small-cell lung cancer13, 370 in melanoma14 on average). This theoretically maximizes our ability to both distinguish true neoantigen selection from neutral genomic changes over time and isolate effects of individual neoantigens on clonal selection. Second, T cell infiltrates in PDACs range from nearly zero to 1,000-fold higher5. Thus, PDACs have subsets that approximate immune-deficient and immune-proficient cancers, enabling us to theoretically observe how differential immune selection pressures modulate cancer cell clones. Finally, mutations in oncogenes occur early in PDAC carcinogenesis and are clonal15—this largely equalizes the cell-intrinsic oncogenic pressures among clones, maximizing our ability to detect how cell-extrinsic immune pressures affect clonal evolution.

Fig. 1: LTSs of PDAC develop tumours with distinct recurrence time, multiplicity and tissue tropism.

a, The experimental design. b, c, Overall survival (b) and disease-free survival (c) of patients with PDAC. d–g, The number (d), correlation with overall survival (e), patterns (f) and sites (g) of recurrent PDACs. In g, other indicates omentum, aorta, diaphragm and perirectum (STS); and pericardium, inferior vena cava, adrenal, kidney and liver (LTS). n indicates the number of individual patients (b–f) or recurrent tumours (g). The horizontal bars show the median values. P values were determined using two-tailed log-rank tests (Mantel–Cox; b and c), two-tailed Mann–Whitney U-tests (d), two-tailed Pearson correlation (e) and two-tailed χ2 tests (f).

To model how immune-proficient and immune-deficient human cancers evolve, we compared how primary PDACs evolve to recurrence in a cohort of long-term survivors (LTSs) and short-term survivors (STSs) (Fig. 1a, b and Supplementary Table 1). We previously demonstrated that, compared with STSs, LTSs have primary tumours with around a 12-fold greater number of activated CD8+ T cells5,16,17 that are predicted to target immunogenic neoantigens5, therefore phenocopying relatively greater immune pressure. Furthermore, in the current cohort we find that the largest T cell clones of LTS tumours have more similar CDR3β sequences18 compared with the largest T cell clones in STS tumours (Extended Data Fig. 1a, b), suggesting T cell clonal expansion and therefore greater immune activity in LTSs. We therefore hypothesized that this higher immune pressure in LTSs would induce tumours to preferentially lose tumour clones with immunogenic neoantigens over time (Fig. 1a). To test this hypothesis, we compared how tumours evolved from primary to recurrent tumours. We found that compared with STSs, LTSs had later (Fig. 1c) and fewer recurrent tumours (Fig. 1d) that inversely correlated with longer survival times (Fig. 1e). Moreover, 75% of LTSs versus 0% of STSs had recurrent tumours that were only metastatic (Fig. 1f), with distinct tissue-tropic recurrence patterns (Fig. 1g). Thus, LTS tumours recur with distinct latency, multiplicity and tissue-dependent evolutionary trajectories.

To examine whether differential selection pressure could explain these unique recurrence patterns, we performed whole-exome sequencing (Extended Data Fig. 2a) and inferred the clonal structures of matched primary and recurrent tumours. We reasoned that greater immune selection pressure in LTS tumours should limit the diversity of tumour clones over time, due to immunoediting of neoantigens. Consistently, we found that, although primary tumours in LTSs were only slightly more homogeneous than in STSs, recurrent tumours in LTSs were much more homogeneous (Fig. 2a (left)), indicating that LTSs probably evolved fewer clones (Fig. 2a (right) and Extended Data Fig. 3a, b). To examine whether this could be explained by greater selection pressure on neoantigens, we compared the total number of non-synonymous mutations (tumour mutational burden (TMB)) and computationally predicted MHC-I restricted neoantigens4,5. Consistently, although primary LTS tumours had a similar TMB with a comparable number of neoantigens as STS tumours (Fig. 2b), recurrent LTS tumours had a lower TMB with fewer neoantigens (Fig. 2b). Despite these differences, LTS and STS tumours had comparable numbers of synonymous mutations and mutations in driver oncogenes (Extended Data Fig. 2b, c). Although recurrent tumours of LTSs had fewer co-occurring mutations in oncogenes compared with recurrent tumours of STSs (Extended Data Fig. 2d), the number of mutations in oncogenes did not correlate with TMB (Extended Data Fig. 2e). Furthermore, LTS recurrent tumours gained significantly fewer mutations and neoantigens compared with STS recurrent tumours (Fig. 2c), remaining largely neutral over time19. LTS tumours also gained fewer mutations that generate neoantigens than STS tumours (Fig. 2d), indicating that LTS tumours preferentially depleted neoantigenic mutations. These data support the hypothesis that greater immune selection in LTS tumours edited tumour clones and neoantigens.
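
The heterogeneity comparison in Fig. 2a rests on the Shannon entropy of clone frequencies. As a minimal sketch, assuming clone frequencies are normalized to sum to 1 and that the natural logarithm is used (the log base, the function name and the frequency values are illustrative assumptions, not data from this study):

```python
import math

def shannon_entropy(clone_freqs):
    """Shannon entropy S = -sum(x * ln x) over clone frequencies x."""
    total = sum(clone_freqs)
    if total <= 0:
        raise ValueError("clone frequencies must have a positive sum")
    # Normalize so the frequencies form a probability distribution,
    # skipping empty clones because x * ln(x) tends to 0 as x tends to 0.
    probs = [f / total for f in clone_freqs if f > 0]
    return -sum(p * math.log(p) for p in probs)

# Hypothetical tumours: one dominated by a single clone, one evenly mixed.
s_homogeneous = shannon_entropy([0.97, 0.01, 0.01, 0.01])
s_heterogeneous = shannon_entropy([0.25, 0.25, 0.25, 0.25])
```

Under this definition a tumour dominated by one clone has an entropy near zero, whereas an evenly mixed tumour of four clones reaches the maximum, ln(4). A negative difference between recurrent and primary entropy would therefore indicate a loss of clonal diversity.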

Fig. 2: LTSs of PDAC develop tumours with fewer neoantigens.

a, Shannon entropy (S, left), and the difference in Shannon entropy between recurrent (\({S}_{{\rm{rec}}}\)) and primary (\({S}_{{\rm{prim}}}\)) PDACs (right). b, TMB and neoantigen number (NA) in primary and recurrent PDACs. c, d, The difference in TMB and NA (c), and the number of mutations that generate neoantigens (NA Mut) (d) between recurrent and primary PDACs. n indicates the number of individual tumours. The horizontal bars show the median values. For a–d, P values were determined using two-tailed Mann–Whitney U-tests.

The neoantigen quality model

To identify the edited neoantigens, we extended our previous neoantigen quality model4,5 that quantifies the immunogenic features of a neoantigen to propose that two competing outcomes determine whether a neoantigen is high-quality—whether the immune system recognizes or tolerates a neoantigenic mutation (Fig. 3a). To estimate the likelihood the immune system recognizes a neoantigen, we measure the sequence similarity of the mutant neopeptide (pMT) to known immunogenic antigens. This infers the ‘non-self’ recognition potential R of pMT, a proxy for peptides within the recognition space of the T cell receptor (TCR) repertoire.

Fig. 3: High-quality neoantigens are immunoedited in LTS PDACs.

a, Neoantigen quality model. b, The model and experimental approach to estimate cross-reactivity distance C. c, d, Measured (top) and fitted (bottom) pMT–TCR activation curves (c, amino acid (AA) position 4), and activation heat maps (d, all amino acid positions) for stronger and weaker pWT–TCR pairs. e, Composite pMT–TCR EC50 values of all stronger and weaker pWT–TCR pairs. f, pMT–TCR activation heat map and observed versus modelled C(pWT, pMT) for the HLA-B*27:05-restricted pWT–TCR pair. n indicates the number of single-amino-acid-substituted pWT, pMT and pMT, pMT pairs. g, Cross-reactivity distance model C and dendrogram of agglomerative clustering of substitution matrix M. h, Observed amino acid substitution frequency versus matrix M-defined substitution distance in primary and recurrent STS and LTS PDACs. M distance is the matrix M-defined amino acid distance from g. Circles indicate substituted residues. n indicates the number of substitutions. i, Cumulative probability distributions of log(C) and D. n indicates the number of neoantigens. The red rectangles in the heat maps indicate amino acids in pWT. The green line is a linear regression fit. Heat maps are ordered according to the amino acid order in the dendrogram in g. P values were determined using two-tailed Pearson correlation (f and h) and two-sided Kolmogorov–Smirnov tests (i).

By contrast, we posit that the immune system can also fail to discriminate pMT from its wild-type (WT) peptide (pWT), and therefore tolerate it as ‘self’. The immune system must therefore exert greater self discrimination D (Fig. 3a) in tumours to overcome the principles of negative T cell selection, the adaptation that limits autoreactivity to host tissues. We approximate the D between pWT and pMT by two features—differential MHC presentation and differential T cell reactivity. Differential MHC presentation of pWT and pMT (\({K}_{{\rm{d}}}^{\text{WT}}/{K}_{{\rm{d}}}^{\text{MT}}\)), previously introduced as the MHC amplitude A (refs. 4,5), estimates the availability of T cells to recognize pMT. If pWT is not presented to T cells in the thymus or the periphery (as with a high \({K}_{{\rm{d}}}^{\text{WT}}\), which implies poor pWT–MHC binding), pWT-specific T cells escape negative selection to expand the peripheral T cell precursor pool available to recognize a pMT presented on MHC (low \({K}_{{\rm{d}}}^{\text{MT}}\))20. Here we extend this concept and introduce cross-reactivity distance C, a new model term that estimates the antigenic distance required for T cells to discriminate between pMT and pWT. Thus, self discrimination D = log(A) + log(C) is a proxy for peptides outside the toleration space of the TCR repertoire. In summary, we define neoantigen quality as Q = R × D (Fig. 3a), now with components that estimate whether a neoantigen can be recognized as non-self and discriminated from self.

To model C, we leveraged recent findings that conserved structural features underlie TCR–peptide recognition. Specifically, the binding domains of peptide-degenerate TCRs21,22 and TCR-degenerate peptides23 share common amino acid motifs, suggesting that T cell cross-reactivity between pMT and pWT could estimate the relative C of different neoantigenic substitutions (Fig. 3b). We selected an HLA-A*02:01-restricted strong epitope (NLVPMVATV (NLV)) from human cytomegalovirus24 that was previously used to model TCR–peptide degeneracy21,22 as a model pWT, and three NLV-specific TCRs (Extended Data Fig. 4a–c). We then varied the NLV peptide by every amino acid at each position to model pMT substitutions, and compared how TCRs cross-react between each pMT and its pWT across a 10,000-fold concentration range where pWT changes maximally altered T cell activation (Fig. 3b). We observed that substitutions were either highly, moderately or poorly cross-reactive (Fig. 3c, d), and the cross-reactivity pattern depended on the substituted position and residue (Extended Data Fig. 5a). Interestingly, we found similar patterns of cross-reactivity between a model HLA-A*02:01-restricted weaker pWT epitope in the melanoma self-antigen gp100 (refs. 25,26; Extended Data Figs. 4d and 5b), three pWT-specific TCRs and single-amino-acid-substituted pMTs, suggesting that conserved substitution patterns define C (Fig. 3e and Extended Data Fig. 5b). Thus, we quantified the cross-reactivity distance C between a pWT and its corresponding pMT as \(C\left({{\bf{p}}}^{{\rm{WT}}},{{\bf{p}}}^{{\rm{MT}}}\right)={{\rm{EC}}}_{50}^{{\rm{MT}}}/{{\rm{EC}}}_{50}^{{\rm{WT}}}\). We chose the half maximal effective concentration (EC50) to model C, as T cell activation to pWT was consistently a sigmoidal function (Extended Data Figs. 4c, d and 6a, b) described by a Hill equation, where EC50 determines how a ligand activates a receptor.
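
The definition of C as a ratio of EC50 values can be illustrated with a short sketch. It assumes a standard Hill curve with a Hill coefficient of 1 and a maximal response of 1; the function names, parameter defaults and EC50 values are hypothetical and are not taken from the measurements reported here.

```python
def hill_response(conc, ec50, hill_coeff=1.0, max_response=1.0):
    """Sigmoidal Hill curve: T cell activation at peptide concentration conc."""
    if conc <= 0:
        return 0.0
    scaled = conc ** hill_coeff
    return max_response * scaled / (ec50 ** hill_coeff + scaled)

def cross_reactivity_distance(ec50_mt, ec50_wt):
    """C(pWT, pMT) = EC50_MT / EC50_WT, following the definition in the text."""
    if ec50_mt <= 0 or ec50_wt <= 0:
        raise ValueError("EC50 values must be positive")
    return ec50_mt / ec50_wt

# Hypothetical EC50 values in arbitrary concentration units.
ec50_wt = 0.01
ec50_cross_reactive = 0.02   # activates the TCR almost as well as pWT
ec50_poorly_reactive = 10.0  # needs about 1,000-fold more peptide than pWT

c_small = cross_reactivity_distance(ec50_cross_reactive, ec50_wt)
c_large = cross_reactivity_distance(ec50_poorly_reactive, ec50_wt)

# By construction, the response at conc == EC50 is half of the maximum.
half_max = hill_response(ec50_wt, ec50_wt)
```

In this sketch a substitution that barely shifts the activation curve gives a C close to 1 (highly cross-reactive, and therefore more likely to be tolerated), whereas a substitution that requires far more peptide for the same activation gives a large C.
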
We next estimated the EC50 of all 1,026 TCR–pMT pairs to infer a model for C that estimates whether a neoantigenic substitution is cross-reactive (and therefore tolerated) based on the substituted amino acid position and residue (Extended Data Figs. 6a, b and 7a, b). We then tested whether C predicted cross-reactive substitutions in an HLA-B*27:05-restricted neopeptide–TCR pair from an LTS (Extended Data Fig. 4e). Notably, C predicted cross-reactive pWT, pMT and pMT, pMT substitutions in this neopeptide–TCR pair (Fig. 3f and Extended Data Fig. 5c, 6c). Thus, we combined all 1,197 TCR–pMT pairs to derive a composite C—the antigenic distance for a TCR to cross-react between amino-acid-substitution pairs (Fig. 3g and Extended Data Fig. 7c). Broadly, two factors promote cross-reactivity: substitutions at peptide termini27 and within amino acid biochemical families (driven by amino acids of similar size and hydrophobicity; Fig. 3g). With this composite C, we now define self-discrimination D between a pWT and its corresponding pMT (Fig. 3a) as

$$D({{\bf{p}}}^{{\rm{W}}{\rm{T}}}\to {{\bf{p}}}^{{\rm{M}}{\rm{T}}})=(1-w)\log \,\left(\frac{{K}_{{\rm{d}}}^{{\rm{W}}{\rm{T}}}}{{K}_{{\rm{d}}}^{{\rm{M}}{\rm{T}}}}\right)+w\,\log \,\left(\frac{{{\rm{E}}{\rm{C}}}_{50}^{{\rm{M}}{\rm{T}}}}{{{\rm{E}}{\rm{C}}}_{50}^{{\rm{W}}{\rm{T}}}}\right),$$

where \(w\) sets the relative weight between the two terms. We chose the parameters of the neoantigen quality model to maximize the log-rank test score of survival analysis on an independent cohort of 58 patients with PDAC5 (Supplementary Methods and Extended Data Table 1a).
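
To make the two-term structure of D concrete, the following sketch evaluates the equation above and the quality Q = R × D for illustrative inputs. The weight w = 0.5, the natural logarithm, the function names and all numerical inputs are assumptions for demonstration only; the fitted parameters are those described in the Supplementary Methods and Extended Data Table 1a.

```python
import math

def self_discrimination(kd_wt, kd_mt, ec50_mt, ec50_wt, w=0.5):
    """D = (1 - w) * log(Kd_WT / Kd_MT) + w * log(EC50_MT / EC50_WT)."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    amplitude = kd_wt / kd_mt              # MHC amplitude A
    cross_reactivity = ec50_mt / ec50_wt   # cross-reactivity distance C
    return (1.0 - w) * math.log(amplitude) + w * math.log(cross_reactivity)

def neoantigen_quality(recognition, discrimination):
    """Q = R * D, with R the non-self recognition potential."""
    return recognition * discrimination

# Hypothetical neoantigen: pMT binds MHC 100-fold better than pWT (A = 100)
# and needs 100-fold more peptide to activate a pWT-specific TCR (C = 100).
d_value = self_discrimination(kd_wt=5000.0, kd_mt=50.0,
                              ec50_mt=1.0, ec50_wt=0.01, w=0.5)
q_value = neoantigen_quality(recognition=0.8, discrimination=d_value)
```

Setting w = 0 in this sketch reduces D to the MHC amplitude term alone, and w = 1 reduces it to the cross-reactivity term alone, which is one way to see how w trades off the two features.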

Immunoediting of neoantigens

We applied our model to PDAC, positing that immunoediting will differentially deplete neoantigens with higher D (less self) in LTS versus STS PDACs. First, we stratified the frequency of mutations by the antigenic distance as defined by C (Fig. 3g and Supplementary Methods). Compared with mutations with a lower antigenic distance, mutations with a greater antigenic distance from self were more significantly depleted in both LTS and STS PDACs (Fig. 3h (left and middle)) and, interestingly, preferentially more depleted in LTS compared with STS PDACs (Fig. 3h (right)). To further examine these observations, we applied the full D model to find that neoantigens with both a higher C and D were strikingly more depleted in LTS versus STS PDACs (Fig. 3i). Interestingly, genes in the HLA class-I pathway were not differentially mutated, deleted, expressed or localized in STS versus LTS PDACs, indicating that neoantigen depletion was not accompanied by acquired resistance in the HLA class-I pathway in LTSs (Extended Data Fig. 8a–c). Thus, tumours in LTSs selectively lose high-quality neoantigens.

Predicting recurrent tumour composition

We next incorporated neoantigen quality parameters into a fitness model4,5 to test whether our model that predicts clonal tumour evolution can identify immunoedited clones. We reconstructed joint multisample phylogenies28 for all tumours from each patient to provide a common clonal structure and track clone frequencies between the tumours of the same patient. To describe selective pressures acting on tumour clones, we accounted for positive selection due to cumulative mutations in driver oncogenes. We quantify this effect in a minimal model \({F}_{P}^{\alpha }\), which counts the number of missense mutations in canonical PDAC driver genes (KRAS, TP53, CDKN2A and SMAD4) in each clone α. The composite fitness model (Fig. 4a) defines the fitness function \({F}^{\alpha }\) of clone α as the sum of a negative fitness cost due to immune recognition of high-quality neoantigens and a positive fitness gain due to the accumulation of mutations in driver oncogenes,

$${F}^{\alpha }=-{\sigma }_{I}\mathop{max}\limits_{{{\bf{p}}}^{{\rm{M}}{\rm{T}}}\in \text{clone}\,\alpha }Q({{\bf{p}}}^{{\rm{M}}{\rm{T}}})+{\sigma }_{P}{F}_{P}^{\alpha }$$

Fig. 4: The neoantigen quality fitness model identifies edited clones to predict the clonal composition of recurrent tumours.

a, Recurrent tumour clone composition prediction based on the primary tumour composition and the fitness model. b, Model-fitted \({\hat{X}}_{{\rm{rec}}}^{\alpha }/{X}_{{\rm{prim}}}^{\alpha }\) and observed \({X}_{{\rm{rec}}}^{\alpha }/{X}_{{\rm{prim}}}^{\alpha }\) clone frequency changes for the STS (left) and LTS (right) cohorts. Frequency ratios below the sampling threshold were evaluated with pseudocounts. c–e, The immune fitness cost \({\bar{F}}_{I}\) of recurrent tumours (c), new clones (e), and the percentage of new neoantigens in recurrent tumours (d). f, TCR dissimilarity index and immune fitness cost \({\bar{F}}_{I}\) in tumours. n indicates the number of tumours. The green line is a linear regression fit. The horizontal bars show the median values. P values were determined using two-tailed Spearman correlation (b), two-tailed Pearson correlation (f) and two-tailed Mann–Whitney U-tests (c–e).

with the free parameters \({\sigma }_{I}\) and \({\sigma }_{P}\) setting the amplitude of the fitness components (Supplementary Methods). We use the model to predict the frequencies of clones propagated to recurrent tumours as

$${\hat{x}}_{{\rm{rec}}}^{\alpha }=\frac{1}{Z}{x}_{{\rm{prim}}}^{\alpha }\,\exp ({F}^{\alpha }),$$

where \({x}_{{\rm{prim}}}^{\alpha }\) is the frequency of clone α in the primary tumour, \({\hat{x}}_{{\rm{rec}}}^{\alpha }\) is its predicted frequency in the recurrent tumour and constant Z ensures correct normalization. We evaluated how closely the fitness model predicted clonal evolution in the recurrent tumours. To do this, for each recurrent tumour in the LTS and STS cohorts, we performed maximum-likelihood fitting of the model parameters \({\sigma }_{I}\) and \({\sigma }_{P}\) in equation (3).
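
The fitness function and the frequency prediction can be combined in a short sketch. It assumes that a clone without neoantigens carries zero immune cost and uses arbitrary values for the two amplitude parameters; the function names and the two-clone example are hypothetical and do not reproduce the fitted parameters or the released code.

```python
import math

def clone_fitness(neoantigen_qualities, n_driver_mutations, sigma_i, sigma_p):
    """F = -sigma_I * max(Q over the clone's neoantigens) + sigma_P * F_P."""
    # Assumption: a clone with no neoantigens has zero immune cost.
    immune_cost = max(neoantigen_qualities, default=0.0)
    return -sigma_i * immune_cost + sigma_p * n_driver_mutations

def predict_recurrent_frequencies(primary_freqs, fitnesses):
    """x_rec = x_prim * exp(F) / Z, with Z chosen so the result sums to 1."""
    weights = [x * math.exp(f) for x, f in zip(primary_freqs, fitnesses)]
    z = sum(weights)
    return [wt / z for wt in weights]

# Hypothetical primary tumour with two equally frequent clones that carry
# the same number of driver mutations but neoantigens of different quality.
sigma_i, sigma_p = 1.0, 0.5
fitness_a = clone_fitness([2.0, 0.5], 2, sigma_i, sigma_p)  # high-quality neoantigen
fitness_b = clone_fitness([0.1], 2, sigma_i, sigma_p)       # low-quality neoantigen
predicted = predict_recurrent_frequencies([0.5, 0.5], [fitness_a, fitness_b])
```

Because the driver term is identical for both clones in this example, the predicted shift in frequency is driven entirely by the immune term: the clone carrying the higher-quality neoantigen is predicted to shrink in the recurrent tumour.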

We found that our model provided a better fit of the observed evolution of LTS compared to STS tumour clones, predicting observed evolution in 86% of LTS tumours versus 52% of STS tumours (Extended Data Table 1b) when compared with a neutral model (no selection pressure on clones; differences were quantified with a Bayesian information criterion; Supplementary Methods). Notably, a partial fitness model that incorporates only the oncogenicity component, \({F}^{\alpha }={\sigma }_{P}{F}_{P}^{\alpha }\), showed reduced performance for the LTS tumours but not STS tumours (Extended Data Table 1b and Extended Data Fig. 9). To illustrate this further, we compared observed and model-fitted clone frequency changes between the primary and recurrent tumours, \({X}_{{\rm{rec}}}^{\alpha }/{X}_{{\rm{prim}}}^{\alpha }\) and \({\hat{X}}_{{\rm{rec}}}^{\alpha }/{X}_{{\rm{prim}}}^{\alpha }\) (Fig. 4b), for all reliably predictable clones in the primary tumour (above 3% frequency; Supplementary Methods). The direction of frequency changes was correctly predicted for 71% of LTS and 58% of STS tumour clones (rank correlation ρ of 0.65 and 0.28, respectively; Fig. 4b and Extended Data Table 1b). We attribute the model’s better predictions in LTS tumours to the presence of immune selection in these tumours.
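
The comparison against the neutral model can be sketched with the standard Bayesian information criterion, BIC = k ln(n) − 2 ln(L), where a lower value indicates the preferred model. The sketch assumes that the fitness model has two free parameters (\({\sigma }_{I}\) and \({\sigma }_{P}\)) and the neutral model has none. The log-likelihood values and the number of observations are hypothetical; the actual likelihood is defined in the Supplementary Methods.

```python
import math

def bic(log_likelihood, n_params, n_observations):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_observations) - 2.0 * log_likelihood

def prefers_fitness_model(loglik_fitness, loglik_neutral, n_observations,
                          n_fitness_params=2):
    """True if the fitness model has the lower BIC (neutral model: 0 parameters)."""
    bic_fitness = bic(loglik_fitness, n_fitness_params, n_observations)
    bic_neutral = bic(loglik_neutral, 0, n_observations)
    return bic_fitness < bic_neutral

# Hypothetical maximized log-likelihoods for one recurrent tumour.
clear_gain = prefers_fitness_model(-10.0, -20.0, n_observations=50)
marginal_gain = prefers_fitness_model(-19.5, -20.0, n_observations=50)
```

The penalty term means that a small improvement in likelihood is not enough for the fitness model to be preferred; the gain must outweigh the cost of its two extra parameters.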

Next, we computed the overall tumour immune cost (averaging the immune component, \({F}_{I}^{\alpha }=\mathop{max}\limits_{{{\bf{p}}}^{{\rm{MT}}}\in \mathrm{clone}\,\alpha }Q({{\bf{p}}}^{{\rm{M}}{\rm{T}}})\) over all tumour clones). Consistently, the immune fitness cost was lower in recurrent LTS tumours than in recurrent STS tumours (Fig. 4c). Furthermore, we considered the immune cost only of clones that are new in recurrent tumours, but not present in primary tumours. Recurrent LTS tumours contained both fewer new neoantigens (1% versus 18%; Fig. 4d) and new clones with markedly lower immune fitness cost (Fig. 4e) compared with recurrent STS tumours. These observations again suggest that the LTS recurrent tumours had been subject to immunoediting.
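
The tumour-level immune cost can be sketched as an average of the per-clone immune component. The sketch assumes a frequency-weighted mean and zero cost for clones without neoantigens; both choices, the function name and the example tumours are assumptions, and the exact averaging is specified in the Supplementary Methods.

```python
def tumour_immune_cost(clones):
    """Frequency-weighted mean of max(Q) per clone.

    clones: list of (frequency, [Q values of the clone's neoantigens]) pairs.
    """
    total_freq = sum(freq for freq, _ in clones)
    if total_freq <= 0:
        raise ValueError("clone frequencies must have a positive sum")
    # Assumption: a clone with no neoantigens contributes zero immune cost.
    weighted = sum(freq * max(qs, default=0.0) for freq, qs in clones)
    return weighted / total_freq

# Hypothetical tumours: one that retains a high-quality neoantigen in a
# sizeable clone, and one in which most cells carry no neoantigen at all.
cost_unedited = tumour_immune_cost([(0.6, [0.2]), (0.4, [1.5, 0.3])])
cost_edited = tumour_immune_cost([(0.9, []), (0.1, [0.4])])
```

In this illustration the tumour whose dominant clone has lost its neoantigens has the lower immune cost, which is the pattern described above for recurrent LTS tumours.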

Finally, we confirmed these results by analysing TCR sequencing data in the available recurrent tumour samples. We quantified the specificity of T cell clonal expansion using the TCR dissimilarity index18 (Supplementary Methods and Extended Data Fig. 1a, b) and correlated this index to immune fitness cost. We found greater T cell clonal expansion in tumours (lower TCR dissimilarity index) correlated with more highly edited tumours (lower immune fitness cost) (Fig. 4f and Extended Data Fig. 1c). In summary, these results strongly suggest that neoantigens are immunoedited in PDAC, and that our fitness model captures the selective pressures by T cells acting on tumour clones.


Here we clarify several questions on how the immune system interacts with cancer. First, does cancer immunoediting occur in humans? As the theory of cancer immunoediting was developed by studying carcinogen-induced highly mutated murine sarcomas1,3, it has remained uncertain whether these principles apply to human cancers29,30,31. We postulated that spontaneous immunoediting of a human cancer should manifest when the immune system recognizes an immunogenic antigen in a primary tumour, as this should induce the antigen to be subsequently eliminated in the recurrent tumour. Indeed, this is what we found—tumours that evolve under stronger immune pressure lose more immunogenic neoantigens. Although we did not assess the changes in non-mutated antigens or address how different cellular compositions and tissue environments may modulate editing, it is notable that the proof for immunoediting is revealed in PDAC, a low-mutated cancer that is considered to be resistant to endogenous immunity. This strengthens the claim that immunoediting is a broadly conserved principle of carcinogenesis.

Second, does immunoediting manifest as loss of immunogenic antigens, or do cancers also acquire genetic resistance? Interestingly, we observed the former but not the latter. We postulate that such phenotypes are governed by the magnitude of the selective pressure. Although LTSs exhibit higher immune pressures in tumours than STSs, this is ostensibly still lower than pharmacologically boosted immune pressure in a tumour32. Thus, in LTSs, as pressure is moderate, tumours lose immunogenic antigens; by contrast, where pressure is maximal, such as perhaps when under therapy, tumours acquire resistance32. This distils cancer evolution under immune selection to a simpler concept—selection determines clonal composition, and pressure determines adaptive change. Further studies will test these concepts.

Third, can we quantify how the immune system recognizes mutations? We combined experimental techniques and machine learning to present a new metric that captures how T cells cross-react between peptides. We use C to quantify the antigenic distance of mutated peptides in the TCR-recognition space and the qualities that render individual mutations immunogenic, building on our previous efforts4,5 to formalize antigen quality. Although we used our quality model to identify immunogenic neoantigens, we propose that it captures common immunogenic features in antigens. Thus, we anticipate that our model can further illuminate the biology of antigens beyond cancer, including T cell cross-reactivity between antigens, pathologies of cross-reactivity (such as autoimmunity) and therapies that require rational antigen selection (such as vaccines).

Finally, it is notable that quantifying the ability of the immune system to discriminate changes in mere single amino acids can predict how cancers evolve. This undoubtedly reflects that a fundamental function of the immune system is to maintain integrity of the host genome. We therefore speculate that our model in essence captures the mechanisms through which the immune system preserves genomic integrity.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

All raw sequencing data obtained through the Johns Hopkins Hospital medical donation programme have been previously described19 and are available at the European Genome–Phenome Archive under accession number EGAS00001004097. All other raw sequencing data are available at the NCBI Sequence Read Archive under accession number PRJNA648923. The ICGC data used in this study are available at the ICGC under the identifier PACA-AU. The TCGA data used in this study are from the TCGA-PAAD dataset available at the NCI Genomic Data Commons. Source data are provided with this paper.

Code availability

Code used to construct and apply the model is available at GitHub.


  1. Shankaran, V. et al. IFNγ and lymphocytes prevent primary tumour development and shape tumour immunogenicity. Nature 410, 1107–1111 (2001).

  2. Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).

  3. Matsushita, H. et al. Cancer exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting. Nature 482, 400–404 (2012).

  4. Łuksza, M. et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature 551, 517–520 (2017).

  5. Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 551, 512–516 (2017).

  6. Burnet, F. M. The concept of immunological surveillance. Prog. Exp. Tumor Res. 13, 1–27 (1970).

  7. Dunn, G. P., Bruce, A. T., Ikeda, H., Old, L. J. & Schreiber, R. D. Cancer immunoediting: from immunosurveillance to tumor escape. Nat. Immunol. 3, 991–998 (2002).

  8. Schumacher, T. N. & Schreiber, R. D. Neoantigens in cancer immunotherapy. Science 348, 69–74 (2015).

  9. Rosenthal, R. et al. Neoantigen-directed immune escape in lung cancer evolution. Nature 567, 479–485 (2019).

  10. Zhang, A. W. et al. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. Cell 173, 1755–1769 (2018).

  11. Jiménez-Sánchez, A. et al. Heterogeneous tumor-immune microenvironments among differentially growing metastases in an ovarian cancer patient. Cell 170, 927–938 (2017).

  12. Balli, D., Rech, A. J., Stanger, B. Z. & Vonderheide, R. H. Immune cytolytic activity stratifies molecular subsets of human pancreatic cancer. Clin. Cancer Res. 23, 3129–3138 (2017).

  13. Rizvi, N. A. et al. Mutational landscape determines sensitivity to PD-1 blockade in non–small cell lung cancer. Science 348, 124–128 (2015).

  14. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).

  15. Yachida, S. et al. Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature 467, 1114–1117 (2010).

  16. Ino, Y. et al. Immune cell infiltration as an indicator of the immune microenvironment of pancreatic cancer. Brit. J. Cancer 108, 914–923 (2013).

  17. Riquelme, E. et al. Tumor microbiome diversity and composition influence pancreatic cancer outcomes. Cell 178, 795–806 (2019).

  18. Bravi, B. et al. Probing T-cell response by sequence-based probabilistic modeling. PLoS Comput. Biol. 17, e1009297 (2021).

  19. Sakamoto, H. et al. The evolutionary origins of recurrent pancreatic cancer. Cancer Discov. 10, 792–805 (2020).

  20. Dyall, R. et al. Heteroclitic immunization induces tumor immunity. J. Exp. Med. 188, 1553–1561 (1998).

  21. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).

  22. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).

  23. Birnbaum, M. E. et al. Deconstructing the peptide-MHC specificity of T cell recognition. Cell 157, 1073–1087 (2014).

  24. Solache, A. et al. Identification of three HLA-A*0201-restricted cytotoxic T cell epitopes in the cytomegalovirus protein pp65 that are conserved between eight strains of the virus. J. Immunol. 163, 5512–5518 (1999).

  25. Kawakami, Y. et al. Recognition of multiple epitopes in the human melanoma antigen gp100 by tumor-infiltrating T lymphocytes associated with in vivo tumor regression. J. Immunol. 154, 3961–3968 (1995).

  26. Parkhurst, M. R. et al. Improved induction of melanoma-reactive CTL with peptides from the melanoma antigen gp100 modified at HLA-A*0201-binding residues. J. Immunol. 157, 2539–2548 (1996).

  27. Capietto, A.-H. et al. Mutation position is an important determinant for predicting cancer neoantigens. J. Exp. Med. 217, e20190179 (2020).

  28. Deshwar, A. G. et al. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 16, 35 (2015).

  29. Evans, R. A. et al. Lack of immunoediting in murine pancreatic cancer reversed with neoantigen. JCI Insight 1, e88328 (2016).

  30. Barthel, F. P. et al. Longitudinal molecular trajectories of diffuse glioma in adults. Nature 576, 112–120 (2019).

  31. Freed-Pastor, W. A. et al. The CD155/TIGIT axis promotes and maintains immune evasion in neoantigen-expressing pancreatic cancer. Cancer Cell 39, 1342–1360 (2021).

  32. Zaretsky, J. M. et al. Mutations associated with acquired resistance to PD-1 blockade in melanoma. N. Engl. J. Med. 375, 819–829 (2016).


This work was supported by NIH U01 CA224175 (to V.P.B.), a Stand Up to Cancer Convergence Award (to B.D.G., V.P.B. and M.Ł.), a Damon Runyon Clinical Investigator Award (to V.P.B.), and the Avner Pancreatic Cancer Foundation (to A.J. and A.G.). M.Ł. is a Pew Biomedical Scholar. Services of the Integrated Genomics Core were funded by the NCI Cancer Center Support Grant (P30 CA08748), Cycle for Survival, and the Marie-Josée and Henry R. Kravis Center for Molecular Oncology.

Author information

M.Ł., B.D.G. and V.P.B. conceived the study. Z.M.S., L.A.R. and V.P.B. conceived, L.A.R. experimentally performed and Z.M.S. constructed the cross-reactivity distance C model. M.Ł., Z.M.S., L.A.R., B.D.G. and V.P.B. conceived the neoantigen quality model. M.Ł. and B.D.G. conceived, and M.Ł. and Z.M.S. constructed, the fitness model. B.B., T. Mora, R.M., A.M.W. and S.C. conceived and constructed the TCR dissimilarity index. M.Ł., Z.M.S., L.A.R., K.S., J. Leung, J. Lihm, D.H., R.K., A.M.-M., A.J., A.G., M.A., P.G., A.Z., R.Y., A.K.C., Z.A., M.G., T. Merghoub, J.W., E.P., C.I.-D., B.D.G. and V.P.B. acquired, analysed, and interpreted the data. A.D. and M.S. assisted with T cell transductions. M.Ł., Z.M.S., L.A.R., E.P., B.D.G. and V.P.B. drafted the manuscript with input from all of the authors.

Corresponding authors

Correspondence to Marta Łuksza, Benjamin D. Greenbaum or Vinod P. Balachandran.

Ethics declarations

Competing interests

L.A.R. is listed as an inventor of a patent related to oncolytic viral therapy (US20170051022A1). L.A.R., Z.M.S. and V.P.B. are listed as inventors on a patent application related to work on antigen cross-reactivity. M.Ł., B.D.G. and V.P.B. are listed as inventors on a patent application related to work on neoantigen quality modelling (63/303,500). C.I.-D. has received research support from Bristol-Myers Squibb. B.D.G. has received honoraria for speaking engagements from Merck, Bristol-Myers Squibb and Chugai Pharmaceuticals; has received research funding from Bristol-Myers Squibb; and has been a compensated consultant for PMV Pharma and Rome Therapeutics of which he is a co-founder. V.P.B. has received research support from Bristol-Myers Squibb and Genentech. J.W. is a consultant for Adaptive Biotech, Amgen, Apricity, Ascentage Pharma, Arsenal IO, Astellas, AstraZeneca, Bayer, Beigene, Boehringer Ingelheim, Bristol Myers Squibb, Celgene, Chugai, Eli Lilly, Elucida, F Star, Georgiamune, Imvaq, Kyowa Hakko Kirin, Linneaus, Merck, Neon Therapeutics, Polynoma, Psioxus, Recepta, Takara Bio, Trieza, Truvax, Sellas, Serametrix, Surface Oncology, Syndax, Syntalogic and Werewolf Therapeutics. J.W. receives grant/research support from Bristol Myers Squibb and Sephora. J.W. has equity in Tizona Pharmaceuticals, Adaptive Biotechnologies, Imvaq, Beigene, Linneaus, Apricity, Arsenal IO and Georgiamune. T. Merghoub is a co-founder and holds equity in IMVAQ Therapeutics; he is a consultant for Immunos Therapeutics, ImmunoGenesis and Pfizer; he has research support from Bristol-Myers Squibb, Surface Oncology, Kyn Therapeutics, Infinity Pharmaceuticals, Peregrine Pharmaceuticals, Adaptive Biotechnologies, Leap Therapeutics and Aprea; he has patents on applications related to work on oncolytic viral therapy, alphavirus-based vaccine, neoantigen modelling, CD40, GITR, OX40, PD-1 and CTLA-4. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Paul Thomas and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Top ranked T cells in LTS tumours have more similar CDR3β sequences.

(a) T cell receptor (TCR) CDR3β sequence dissimilarity (TCR dissimilarity index) in STS and LTS primary and recurrent PDACs. TCR dissimilarity index calculated using the Restricted Boltzmann Machine model18. n = individual tumours. Horizontal bars = median. (b) Trend of P value of TCR dissimilarity index between STS and LTS PDACs (as in left panel) with number of clones in the sample. n = 17 tumours. Blue line indicates a P value of 0.05; circle = mean P value; error bars = standard error of the mean. (c) TCR dissimilarity index based on T cell clone size (Supplementary Methods) and immune fitness cost \({\bar{F}}_{I}\). Green line = linear regression fit. P value by two-tailed Mann-Whitney U-test (a) and two-tailed Pearson correlation (c).

Source data

Extended Data Fig. 2 Tumour mutational features in STSs and LTSs of PDAC.

(a) Whole-exome sequencing depth and (b) number of synonymous mutations in primary and recurrent PDACs from STSs and LTSs. (c) Oncoprints of driver mutation frequencies in primary and recurrent PDACs. Frequencies = percentage of patients in each cohort that harbour corresponding driver gene mutations. (d) Frequency of primary (left) and recurrent (right) PDACs with mutations in ≥ 3 oncogenes. (e) Number of nonsynonymous mutations (TMB) versus number of mutations in oncogenes in primary and recurrent PDACs. n = individual tumours. Horizontal bars = median. P value by two-tailed Mann-Whitney U-test.

Source data

Extended Data Fig. 3 Tumour evolutionary trees in STSs and LTSs of PDAC.

(a, b) Tumour clone phylogenies in primary and recurrent PDACs from STSs (a, n = 6) and LTSs (b, n = 9).

Extended Data Fig. 4 TCR transduction and antigen specificity.

(a) Experimental schema to transduce and measure \({{\bf{p}}}^{\text{WT}}\)-specific T cell receptor (TCR) activation. hVα,β = human α and β variable regions; mCα,β = mouse α and β constant regions. (b) Representative gating strategy to detect transduced TCR activation and specificity. (c–e) Sequences of model \({{\bf{p}}}^{\text{WT}}\)s and \({{\bf{p}}}^{\text{WT}}\)-specific TCRs, and TCR activation across varied \({{\bf{p}}}^{\text{WT}}\) concentrations.

Source data

Extended Data Fig. 5 T cell activation is variably degenerate to single amino acid substitutions.

(a–c) T cell activation to model \({{\bf{p}}}^{\text{WT}}\)s (black curves) and single amino acid substituted \({{\bf{p}}}^{\text{MT}}\)s (color curves).

Source data

Extended Data Fig. 6 T cell activation to degenerate substitutions follows a sigmoidal function.

(a–c) Fitted T cell activation curves to model \({{\bf{p}}}^{\text{WT}}\)s (black curves) and single amino acid substituted \({{\bf{p}}}^{\text{MT}}\)s (color curves).

Source data
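As a rough illustration of the sigmoidal relationship named in Extended Data Fig. 6, the sketch below evaluates a Hill-type curve over peptide concentration and recovers EC50 as the half-maximal concentration by log-linear interpolation. The functional form, the names `activation` and `estimate_ec50`, and the parameters `ec50`, `hill` and `top` are assumptions for illustration only; they are not the authors' fitting procedure or data.

```python
import math

def activation(conc, ec50, hill=1.0, top=100.0):
    """Hill-type sigmoid (assumed form): response at peptide concentration `conc`."""
    return top / (1.0 + (ec50 / conc) ** hill)

def estimate_ec50(concs, responses, top):
    """Interpolate, on a log10 concentration axis, where the response reaches top/2."""
    half = top / 2.0
    points = sorted(zip(concs, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1:
            if r1 == r0:  # flat segment exactly at half-maximum
                return c0
            frac = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10.0 ** log_c
    raise ValueError("half-maximal response is not bracketed by the data")
```

Under this form, a substituted peptide that needs more peptide to reach the same activation has a larger `ec50`, which shifts the whole curve to the right without changing its shape.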

Extended Data Fig. 7 Cross-reactivity distance C model.

Amino acid position-dependent factor (a) and substitution matrix (b) of cross-reactivity model based on T cell receptor (TCR) cross-reactivity to strong (CMV) and weak (gp100) \({{\bf{p}}}^{\text{WT}}\)s and single amino acid substituted \({{\bf{p}}}^{\text{MT}}\)s (Fig. 3d, e). (c) Correlation of substitution-induced differential MHC-I binding (\({\rm{\log }}\left(A\right)\), where \(A={K}_{{\rm{d}}}^{{\rm{WT}}}/{K}_{{\rm{d}}}^{{\rm{MT}}}\)) and substitution-induced differential TCR activation (\({\rm{\log }}\left(C\right)\), where \(C={{\rm{EC}}}_{50}^{{\rm{MT}}}/{{\rm{EC}}}_{50}^{{\rm{WT}}}\)) for all model \({{\bf{p}}}^{\text{WT}}\)-TCR pairs and single amino acid substituted \({{\bf{p}}}^{\text{MT}}\)s. \({K}_{{\rm{d}}}^{{\rm{WT}}}\) and \({K}_{{\rm{d}}}^{{\rm{MT}}}\) determined through computational predictions of \({{\bf{p}}}^{\text{WT}}\) and \({{\bf{p}}}^{\text{MT}}\) binding to HLA-A*02:01 (CMV, gp100 peptides) and HLA-B*27:05 (tumour neopeptide) with NetMHC 3.4. \({{\rm{EC}}}_{50}^{{\rm{MT}}}\) and \({{\rm{EC}}}_{50}^{{\rm{WT}}}\) measured experimentally through \({{\bf{p}}}^{\text{WT}}\) and \({{\bf{p}}}^{\text{MT}}\) reactivity to TCRs. n = individual peptide-TCR measurements. P values by two-tailed Pearson correlation (c).

Source data
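The two quantities compared in Extended Data Fig. 7c can be sketched as follows, reading the legend's ratios as A and C and taking their logarithms. The function names are hypothetical, the natural logarithm is an assumption (the legend does not state a base), and the example values are made up rather than taken from the study.

```python
import math

def log_relative_mhc_affinity(kd_wt, kd_mt):
    """log(A) with A = Kd_WT / Kd_MT; positive when the mutant peptide binds MHC more tightly."""
    return math.log(kd_wt / kd_mt)

def log_cross_reactivity_distance(ec50_mt, ec50_wt):
    """log(C) with C = EC50_MT / EC50_WT; positive when the mutant needs more peptide to activate the TCR."""
    return math.log(ec50_mt / ec50_wt)

# Illustrative values only (arbitrary units).
log_a = log_relative_mhc_affinity(kd_wt=50.0, kd_mt=500.0)          # weaker mutant binding
log_c = log_cross_reactivity_distance(ec50_mt=100.0, ec50_wt=1.0)   # weaker mutant activation
```

Working with the logarithm makes a tenfold gain and a tenfold loss symmetric around zero, so an unchanged peptide sits at 0 on both axes.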

Extended Data Fig. 8 LTS and STS PDACs have equivalent genetic changes in HLA class-I pathway genes.

(a) Number of mutations (synonymous and non-synonymous), homozygous deletions, heterozygous deletions and copy number neutral loss of heterozygosity (LOH) changes in HLA class-I pathway genes (B2M, CANX, CALR, HLA-A, HLA-B, HLA-C, HLA-E, HLA-F, HLA-G, TAP1, TAP2, TAPBP, ERAP1, ERAP2, HSPA5, PDIA3, SAR1B, SEC13, SEC23A, SEC24A, SEC24B, SEC24C, SEC24D, SEC31A) in primary and recurrent PDACs. (b) mRNA expression in HLA class-I pathway genes by bulk RNA sequencing (ICGC, TCGA cohorts) and transcriptional analysis (Affymetrix, Memorial Sloan Kettering Cancer Center (MSKCC) cohort) in primary PDAC tumours. (c) Representative multiplexed immunohistochemical images (left) and ratio (right) of MHC-I+ tumour cells (CK19+) and MHC-I+ non-tumour cells (CK19-) in STS and LTS primary PDACs. n = individual tumours. Horizontal bars = median. Horizontal bars on violin plots show median and quartiles. P value by Wald’s test adjusted for multiple comparison testing.

Source data

Extended Data Fig. 9 Evaluation of clone fitness model predictions.

The log-likelihood score (Supplementary Methods, eq. (31)) is shown for the STS and LTS cohorts to estimate the statistical information gain of fitness models and the amount of evidence of the selective pressures captured by each of the models. The orange bars show the aggregated log-likelihood scores, \(\Delta {{\mathscr{L}}}^{\text{STS}}\left(F,{F}_{N}\right)\) and \(\Delta {{\mathscr{L}}}^{\text{LTS}}\left(F,{F}_{N}\right),\) of the two-component fitness model, \(F\), with parameters \({\sigma }_{I},{\sigma }_{P}\) optimized for each recurrent tumour sample, as compared to the null model, \({F}_{N}\), standing for neutral clone evolution, with zero fitness and parameters \({\sigma }_{I}=0,{\sigma }_{P}=0\). The red bars present the corresponding aggregated log-likelihood scores \(\Delta {{\mathscr{L}}}^{\text{STS}}\left({F}_{P},{F}_{N}\right)\) and \(\Delta {{\mathscr{L}}}^{\text{LTS}}\left({F}_{P},{F}_{N}\right)\) for the driver-gene only fitness model, \({F}_{P}\), which accounts for positive selection on driver genes but disregards the effect of immune selection, with parameter \({\sigma }_{I}=0,\) and \({\sigma }_{P}\) optimized for each recurrent tumour sample. Finally, the blue bars present the corresponding aggregated log-likelihood scores \(\Delta {{\mathscr{L}}}^{\text{STS}}\left({F}_{I},{F}_{N}\right)\) and \(\Delta {{\mathscr{L}}}^{\text{LTS}}\left({F}_{I},{F}_{N}\right)\) for the immune-only fitness model, \({F}_{I}\), with parameter \({\sigma }_{P}=0,\) and \({\sigma }_{I}\) optimized for each recurrent tumour sample.

Source data
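The nesting of the four models compared in Extended Data Fig. 9 can be sketched as below. This is a minimal illustration under stated assumptions: the linear form of `clone_fitness` (immune cost subtracted, driver gain added) is inferred from the abstract's description and is not the paper's equation, and `aggregated_gain` assumes that the aggregated score is a plain sum of per-sample log-likelihood differences; eq. (31) of the Supplementary Methods is not reproduced here. All names and numbers are hypothetical.

```python
from typing import Sequence

def clone_fitness(immune_cost: float, driver_gain: float,
                  sigma_i: float, sigma_p: float) -> float:
    """Assumed two-component form: immune recognition lowers fitness, driver mutations raise it."""
    return -sigma_i * immune_cost + sigma_p * driver_gain

# The reduced models follow from the full model F by fixing parameters at zero:
#   F_N (neutral):      sigma_i = 0, sigma_p = 0
#   F_P (driver-only):  sigma_i = 0, sigma_p free
#   F_I (immune-only):  sigma_p = 0, sigma_i free

def aggregated_gain(loglik_model: Sequence[float],
                    loglik_null: Sequence[float]) -> float:
    """Sum, over recurrent tumour samples, of log-likelihood(model) minus log-likelihood(null)."""
    if len(loglik_model) != len(loglik_null):
        raise ValueError("one log-likelihood per sample is required for each model")
    return sum(m - n for m, n in zip(loglik_model, loglik_null))
```

Because each reduced model is a special case of the full one, a full model optimized per sample can fit no worse than the reduced ones, so the useful comparison is how much of the gain over the neutral model each component accounts for in each cohort.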

Extended Data Table 1 Neoantigen quality fitness models

Supplementary information

Supplementary Information

Supplementary Table 1, legends for Supplementary Tables 2 and 3, and Supplementary Methods.

Reporting Summary

Supplementary Table 2

Supplementary Table 3

Source data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

About this article

Cite this article

Łuksza, M., Sethna, Z.M., Rojas, L.A. et al. Neoantigen quality predicts immunoediting in survivors of pancreatic cancer. Nature 606, 389–395 (2022).
