Abstract
Tumor neoepitopes presented by major histocompatibility complex (MHC) class I are recognized by tumor-infiltrating lymphocytes (TIL) and are targeted by adoptive T-cell therapies. Identifying which mutant neoepitopes from tumor cells are capable of recognition by T cells can assist in the development of tumor-specific, cell-based therapies and can shed light on antitumor responses. Here, we generate a ranking algorithm for class I candidate neoepitopes by using next-generation sequencing data and a dataset of 185 neoepitopes that are recognized by HLA class I–restricted TIL from individuals with metastatic cancer. Random forest model analysis showed that the inclusion of multiple factors impacting epitope presentation and recognition increased output sensitivity and specificity compared to the use of predicted HLA binding alone. The ranking score output provides a set of class I candidate neoantigens that may serve as therapeutic targets and provides a tool to facilitate in vitro and in vivo studies aimed at the development of more effective immunotherapies.
This is a preview of subscription content, access via your institution
Access options
Access Nature and 54 other Nature Portfolio journals
Get Nature+, our best-value online-access subscription
$29.99 / 30 days
cancel any time
Subscribe to this journal
Receive 12 digital issues and online access to articles
$119.00 per year
only $9.92 per issue
Buy this article
- Purchase on Springer Link
- Instant access to full article PDF
Prices may be subject to local taxes which are calculated during checkout
Similar content being viewed by others
Data availability
All next-generation sequencing data are available on dbGap under accession number phs001003.v1.p1. Source data are available from the NIH figshare repository at https://doi.org/10.35092/yhjc.c.4792338.v2 (ref. 56).
Code availability
The models developed and presented in this paper are available at https://github.com/JaredJGartner/SB_neoantigen_Models.
References
Huang, J. et al. T cells associated with tumor regression recognize frameshifted products of the CDKN2A tumor suppressor gene locus and a mutated HLA class I gene product. J. Immunol. 172, 6057–6064 (2004).
Zhou, J., Dudley, M. E., Rosenberg, S. A. & Robbins, P. F. Persistence of multiple tumor-specific T-cell clones is associated with complete tumor regression in a melanoma patient receiving adoptive cell transfer therapy. J. Immunother. 28, 53–62 (2005).
Robbins, P. F. et al. Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat. Med. 19, 747–752 (2013).
Lu, Y. C. et al. Mutated PPP1R3B is recognized by T cells used to treat a melanoma patient who experienced a durable complete tumor regression. J. Immunol. 190, 6034–6042 (2013).
Lu, Y. C. et al. Efficient identification of mutated cancer antigens recognized by T cells associated with durable tumor regressions. Clin. Cancer Res. 20, 3401–3410 (2014).
Prickett, T. D. et al. Durable complete response from metastatic melanoma after transfer of autologous T cells recognizing 10 mutated tumor antigens. Cancer Immunol. Res. 4, 669–678 (2016).
Tran, E. et al. Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer. Science 344, 641–645 (2014).
Tran, E. et al. T-cell transfer therapy targeting mutant KRAS in cancer. N. Engl. J. Med. 375, 2255–2262 (2016).
Zacharakis, N. et al. Immune recognition of somatic mutations leading to complete durable regression in metastatic breast cancer. Nat. Med. 24, 724–730 (2018).
Rizvi, N. A. et al. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124–128 (2015).
McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).
Hellmann, M. D. et al. Genomic features of response to combination immunotherapy in patients with advanced non-small-cell lung cancer. Cancer Cell 33, 843–852 (2018).
Le, D. T. et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science 357, 409–413 (2017).
Le, D. T. et al. PD-1 blockade in tumors with mismatch-repair deficiency. N. Engl. J. Med. 372, 2509–2520 (2015).
Peltomaki, P. DNA mismatch repair and cancer. Mutat. Res. 488, 77–85 (2001).
Peters, B. & Sette, A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinf. 6, 132 (2005).
Alvarez, B. et al. NNAlign_MA; MHC peptidome deconvolution for accurate MHC binding motif characterization and improved T-cell epitope predictions. Mol. Cell. Proteomics 18, 2459–2477 (2019).
O’Donnell, T. J. et al. MHCflurry: open-source class I MHC binding affinity prediction. Cell Syst. 7, 129–132 (2018).
Duan, F. et al. Genomic and bioinformatic profiling of mutational neoepitopes reveals new rules to predict anticancer immunogenicity. J. Exp. Med. 211, 2231–2248 (2014).
Bulik-Sullivan, B. et al. Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. Nat. Biotechnol. 37, 55–63 (2019).
Hundal, J. et al. pVACtools: a computational toolkit to identify and visualize cancer neoantigens. Cancer Immunol. Res. 8, 409–420 (2020).
Bjerregaard, A. M., Nielsen, M., Hadrup, S. R., Szallasi, Z. & Eklund, A. C. MuPeXI: prediction of neo-epitopes from tumor sequencing data. Cancer Immunol. Immunother. 66, 1123–1130 (2017).
Kim, S. et al. Neopepsee: accurate genome-level prediction of neoantigens by harnessing sequence and amino acid immunogenicity information. Ann. Oncol. 29, 1030–1036 (2018).
Kosaloglu-Yalcin, Z. et al. Predicting T cell recognition of MHC class I restricted neoepitopes. Oncoimmunology 7, e1492508 (2018).
Brown, S. D. et al. Neo-antigens predicted by tumor genome meta-analysis correlate with increased patient survival. Genome Res. 24, 743–750 (2014).
Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 551, 512–516 (2017).
Parkhurst, M. R. et al. Unique neoantigens arise from somatic mutations in patients with gastrointestinal cancers. Cancer Discov. 9, 1022–1035 (2019).
Tran, E. et al. Immunogenicity of somatic mutations in human gastrointestinal cancers. Science 350, 1387–1390 (2015).
Lo, W. et al. Immunologic recognition of a shared p53 mutated neoantigen in a patient with metastatic colorectal cancer. Cancer Immunol. Res. 7, 534–543 (2019).
Jurtz, V. et al. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).
Gfeller, D. et al. The length distribution and multiple specificity of naturally presented HLA-I ligands. J. Immunol. 201, 3705–3716 (2018).
Sarkizova, S. et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. Nat. Biotechnol. 38, 199–209 (2020).
Paul, S. et al. HLA class I alleles are associated with peptide-binding repertoires of different size, affinity, and immunogenicity. J. Immunol. 191, 5831–5839 (2013).
Chen, W., Yewdell, J. W., Levine, R. L. & Bennink, J. R. Modification of cysteine residues in vitro and in vivo affects the immunogenicity and antigenicity of major histocompatibility complex class I-restricted viral determinants. J. Exp. Med. 189, 1757–1764 (1999).
Chen, J. L. et al. Structural and kinetic basis for heightened immunogenicity of T cell vaccines. J. Exp. Med. 201, 1243–1255 (2005).
Sachs, A., et al. Impact of cysteine residues on MHC binding predictions and recognition by tumor-reactive T cells. J. Immunol. 205, 539–549 (2020).
Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226 (2017).
Horton, P. et al. WoLF PSORT: protein localization predictor. Nucleic Acids Res. 35, W585–W587 (2007).
Abelin, J. G. et al. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. Immunity 46, 315–326 (2017).
Rasmussen, M. et al. Pan-specific prediction of peptide–MHC class I complex stability, a correlate of T cell immunogenicity. J. Immunol. 197, 1517–1524 (2016).
Jorgensen, K. W., Rasmussen, M., Buus, S. & Nielsen, M. NetMHCstab—predicting stability of peptide–MHC-I complexes; impacts for cytotoxic T lymphocyte epitope discovery. Immunology 141, 18–26 (2014).
Groettrup, M., Kirk, C. J. & Basler, M. Proteasomes in immune cells: more than peptide producers? Nat. Rev. Immunol. 10, 73–78 (2010).
Larsen, M. V. et al. An integrative approach to CTL epitope prediction: a combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions. Eur. J. Immunol. 35, 2295–2303 (2005).
Capietto, A. H. et al. Mutation position is an important determinant for predicting cancer neoantigens. J. Exp. Med. 217, e20190179 (2020).
Calis, J. J. et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput. Biol. 9, e1003266 (2013).
Chowell, D. et al. TCR contact residue hydrophobicity is a hallmark of immunogenic CD8+ T cell epitopes. Proc. Natl Acad. Sci. USA 112, E1754–E1762 (2015).
Cohen, C. J. et al. Isolation of neoantigen-specific T cells from tumor and peripheral lymphocytes. J. Clin. Invest. 125, 3981–3991 (2015).
Gros, A. et al. PD-1 identifies the patient-specific CD8+ tumor-reactive repertoire infiltrating human tumors. J. Clin. Invest. 124, 2246–2259 (2014).
Gros, A. et al. Prospective identification of neoantigen-specific lymphocytes in the peripheral blood of melanoma patients. Nat. Med. 22, 433–438 (2016).
Parkhurst, M. et al. Isolation of T-cell receptors specifically reactive with mutated tumor-associated antigens from tumor-infiltrating lymphocytes based on CD137 expression. Clin. Cancer Res. 23, 2491–2505 (2017).
Stevanovic, S. et al. Landscape of immunogenic tumor antigens in successful immunotherapy of virally induced epithelial cancer. Science 356, 200–205 (2017).
Deniger, D. C. et al. T-cell responses to TP53 “Hotspot” mutations and unique neoantigens expressed by human ovarian cancers. Clin. Cancer Res. 24, 5562–5573 (2018).
Yossef, R. et al. Enhanced detection of neoantigen-reactive T cells targeting unique and shared oncogenes for personalized cancer immunotherapy. JCI Insight 3, e122467 (2018).
Gros, A. et al. Recognition of human gastrointestinal cancer neoantigens by circulating PD-1+ lymphocytes. J. Clin. Invest. 129, 4992–5004 (2019).
Larsen, M. V. et al. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinf. 8, 424 (2007).
Gartner, J. Datasets for ‘Development of a model for ranking candidate HLA class I neoantigens based upon datasets of known neoepitopes’. figshare https://doi.org/10.35092/yhjc.c.4792338.v2 (2020).
Acknowledgements
We thank members of the NIH High Performance Computing (HPC) group for all of their support, assistance and technical advice. This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov). We also thank all members of the tissue procurement team for all of their efforts in acquiring and maintaining the specimens used in this study.
Author information
Authors and Affiliations
Contributions
J.J.G., P.F.R. and S.A.R. designed the study and drafted the manuscript. J.J.G. trained models and evaluated all nmers and mmps. T.D.P. and S.C.C. generated exomes and RNA-seq libraries. N.Z., K.H., Y.F.L. and P.F.R. designed minigene constructs encoding candidate neoantigens and generated in vitro-transcribed RNA used to perform screening assays. M.R.P., M.F. and S. Kivitz synthesized peptides used for T-cell screening assays. M.R.P., A.G., E.T., M.S.J., A.C., K.H., N.Z., A.L., S. Krishna and A.S. evaluated T cells for their ability to recognize nmers/mmps in the context of the appropriate HLA class I restriction elements.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Cancer thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Percentile Rank Comparisons between NetMHCpan4.0 EL and MHCFlurry1.6 Percentile Rank.
Percentile rank of positive mmps were mapped by their MHCflurry1.6 rank on the x-axis and the NetMHCpan4.0 EL model rank on the y-axis. Red Triangles correspond to mmps containing cysteine residues at positions 2,3 or C-terminus (n=12) while orange dots correspond to peptides containing cysteine residues at position 1 or between positions 3 and the C-terminus (n=107).
Extended Data Fig. 2 Nmer localization predictions.
WoLF Psort algorithm was used on all nmer proteins (n=9541) to predicted for localization. Blue bars are CD8 + Positive nmers, Orange bars are negative nmers. Y-axis represents frequency of each group predicted to localize. X axis are the WoLF Psort prediction abbreviations. chlo = chloroplast, cyto = cytosol, cysk = cytoskeleton, E.R. = endoplasmic reticulum, extr = extracellular, golg = Golgi apparatus, lyso = lysosome, mito = mitochondria, nucl = nuclear, pero = peroxisome, plas = plasma membrane, vacu = vacuolar membrane . Individual totals for each groups positive and negative can be found in Supplementary Table 12. Hyphenated values denote compound prediction. P-values comparing positive to negative nmers displayed over each prediction. P-values calculated using a two-sided Fisher’s exact test and corrected using Bonferroni correction for multiple comparisons.
Extended Data Fig. 3 Gene expression Decile of mmps.
Gene expression deciles of positive (n=119) and negative mmps (n=2681162). Box indicates quartiles 2 & 3 and inter quartile range, median indicated by line in box plot, whiskers represent quartile 1 and 4 ± 1.5X IQR or minimum/maximum value if within the whisker values. Significance calculated with Mann-Whitney U test.
Extended Data Fig. 4 IEDB Immunogenicity scores of mmps.
IEDB Immunogenicity scores were generated for each mmp using the IEDB immunogenicity tool. The panels are split into all mmps (positive n=119, negative n=2681162), comparison of just those with a mutation anchor in position 2,3 or C-terminus (positive n=55, negative n= 1167363) and those without mutations in position 2,3, or C-terminus (positive n= 64, negative n= 1513799). Box indicates quartiles 2 & 3 and inter quartile range, median indicated by line in box plot, whiskers represent quartile 1 and 4 ± 1.5X IQR or minimum/maximum value if within the whisker values. Significance was calculated using the Mann-Whitney U test.
Extended Data Fig. 5 Hydrophobicity scores of T-cell contact regions.
Hydrophobicity scores were calculated summing the Kyte-Doolittle hydrophobicity score of positions 4 through n-1. The panels are split into all mmps (positive n=119, negative n=2681162), comparison of just those with a anchor in position 2,3 or C-terminus (positive n=55, negative n= 1167363) and those without mutations in position 2,3, or C-terminus (positive n= 64, negative n= 1513799). Box indicates quartiles 2 & 3 and inter quartile range, median indicated by line in box plot, whiskers represent quartile 1 and 4 ± 1.5X IQR or minimum/maximum value if within the whisker values. Significance calculated with Mann-Whitney U test.
Extended Data Fig. 6 Top NMER models using either MMP score of MHCflurry Score as input.
ROC curve showing the mean performance of the top models using either MMP model scores or MHCflurry scores as input. Solid line represents mean for each model across n=5 folds, shaded area is the standard deviation at each point along the x-axis.
Supplementary information
Rights and permissions
About this article
Cite this article
Gartner, J.J., Parkhurst, M.R., Gros, A. et al. A machine learning model for ranking candidate HLA class I neoantigens based on known neoepitopes from multiple human tumor types. Nat Cancer 2, 563–574 (2021). https://doi.org/10.1038/s43018-021-00197-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s43018-021-00197-6
This article is cited by
-
Challenges in developing personalized neoantigen cancer vaccines
Nature Reviews Immunology (2024)
-
Informing immunotherapy with multi-omics driven machine learning
npj Digital Medicine (2024)
-
Genetic immune escape landscape in primary and metastatic cancer
Nature Genetics (2023)
-
The immunopeptidome landscape associated with T cell infiltration, inflammation and immune editing in lung cancer
Nature Cancer (2023)
-
Breaking the performance ceiling for neoantigen immunogenicity prediction
Nature Cancer (2023)