This article has been updated


Neoantigens, which are expressed on tumor cells, are one of the main targets of an effective antitumor T-cell response. Cancer immunotherapies to target neoantigens are of growing interest and are in early human trials, but methods to identify neoantigens either require invasive or difficult-to-obtain clinical specimens, require the screening of hundreds to thousands of synthetic peptides or tandem minigenes, or are only relevant to specific human leukocyte antigen (HLA) alleles. We apply deep learning to a large (N = 74 patients) HLA peptide and genomic dataset from various human tumors to create a computational model of antigen presentation for neoantigen prediction. We show that our model, named EDGE, increases the positive predictive value of HLA antigen prediction by up to ninefold. We apply EDGE to enable identification of neoantigens and neoantigen-reactive T cells using routine clinical specimens and small numbers of synthetic peptides for most common HLA alleles. EDGE could enable an improved ability to develop neoantigen-targeted immunotherapies for cancer patients.
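The ninefold figure above is a ratio of positive predictive values (PPV), the fraction of peptides predicted to be presented that are actually presented. As a minimal sketch of that arithmetic, assuming purely hypothetical counts (none of the numbers, and neither helper name, come from the paper or the EDGE software):

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    return true_positives / predicted_positive


def fold_improvement(ppv_new: float, ppv_baseline: float) -> float:
    """Ratio of two PPVs, e.g. a new model against a baseline predictor."""
    if ppv_baseline <= 0:
        raise ValueError("baseline PPV must be positive")
    return ppv_new / ppv_baseline


# Hypothetical counts, for illustration only: each predictor nominates
# 100 peptides, and we count how many are confirmed as presented.
baseline_ppv = ppv(true_positives=5, false_positives=95)
model_ppv = ppv(true_positives=45, false_positives=55)
improvement = fold_improvement(model_ppv, baseline_ppv)

print(f"baseline PPV = {baseline_ppv:.2f}, model PPV = {model_ppv:.2f}")
print(f"fold improvement = {improvement:.1f}")
```

A fair comparison would evaluate both predictors at the same recall or the same number of nominated peptides; the sketch fixes the latter at 100.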


Change history

  • 18 December 2018

    Supplementary Data 6 as originally posted was actually Supplementary Data 5, Supplementary Data 7 as originally posted was actually Supplementary Data 6, Supplementary Data 8 as originally posted was actually Supplementary Data 7, Supplementary Data 9 as originally posted was actually Supplementary Data 8, and Supplementary Data 5 as originally posted was actually a corrupted version of Supplementary Data 9. The error has been corrected online as of 18 December 2018.



We would like to thank C.J. Couter for her assistance with general laboratory tasks and establishment of the in vitro stimulation assays. T.A.C. acknowledges funding in part through the NIH/NCI Cancer Center Support Grant P30 CA008748, Pershing Square Sohn Cancer Research grant, the PaineWebber Chair, Stand Up 2 Cancer, NIH R01 CA205426, NIH R35 CA232097, and the STARR Cancer Consortium. V.T.D.M., O.M., G.S., P.B., S.N., N.K., R. Rosell, I.A., N.G., J.H., C.L., K. Choquette, A.S., E.F. and M.F. received research funding support for this study from Gritstone Oncology, Inc.

Author information


  1. Gritstone Oncology, Inc., Emeryville, California and Cambridge, Massachusetts, USA.

    • Brendan Bulik-Sullivan
    • Jennifer Busby
    • Christine D Palmer
    • Matthew J Davis
    • Tyler Murphy
    • Andrew Clark
    • Michele Busby
    • Fujiko Duke
    • Aaron Yang
    • Lauren Young
    • Noelle C Ojo
    • Kamilah Caldwell
    • Jesse Abhyankar
    • Thomas Boucher
    • Meghan G Hart
    • Raphael Rousseau
    • Cynthia Voong
    • Karin Jooss
    • Mojca Skoberne
    • Joshua Francis
    • Roman Yelensky
  2. Memorial Sloan Kettering Cancer Center, New York, New York, USA.

    • Vladimir Makarov
    • Timothy A Chan
  3. Centre Chirurgical Marie Lannelongue, Le Plessis-Robinson, France.

    • Vincent Thomas De Montpreville
    • Olaf Mercier
    • Elie Fadel
  4. University of Turin, Department of Oncology at San Luigi Hospital, Orbassano (Turin), Italy.

    • Giorgio Scagliotti
    • Paolo Bironzo
    • Silvia Novello
  5. Instituto Oncologico Dr. Rosell – Hospital Universitari Quiron Dexeus Location, Barcelona, Spain.

    • Niki Karachaliou
  6. Catalan Institute of Oncology, Barcelona, Spain.

    • Rafael Rosell
  7. St Joseph Heritage Healthcare, Santa Rosa, California, USA.

    • Ian Anderson
  8. Gabrail Cancer Center, Canton, Ohio, USA.

    • Nashat Gabrail
  9. Hattiesburg Clinic/Forrest General Cancer Center, Hattiesburg, Mississippi, USA.

    • John Hrom
  10. Solano Hematology Oncology, Vallejo, California, USA.

    • Chainarong Limvarapuss
  11. Virginia Cancer Specialists, Fairfax, Virginia, USA.

    • Karin Choquette
    • Alexander Spira
  12. New York Presbyterian/Columbia University Medical Center, New York, New York, USA.

    • Naiyer A Rizvi
    • Mark Frattini




Conception and design: B.B.-S., J.B., J.F., M.S., R.Y. Development of methodology: B.B.-S., J.B., J.F., M.S., R.Y., C.D.P., M.J.D., A.C., M.B., L.Y., T.B., V.M. Provided patient material and clinical input: V.T.D.M., O.M., G.S., P.B., S.N., N.K., R. Rosell, I.A., N.G., J.H., C.L., A.S., E.F., M.F. Operational support and data management for patient material: K.C., J.A., C.V., K.C. Performed experiments: J.B., C.D.P., M.J.D., T.M., F.D., A.Y., N.C.O., M.G.H., M.S., J.F. Analysis and interpretation of data: B.B.-S., J.B., C.D.P., M.J.D., A.C., M.B., L.Y., T.B., K.J., M.S., J.F., R.Y., N.A.R., T.A.C. Writing, review and/or revision of the manuscript: B.B.-S., J.B., J.F., M.S., R.Y., C.D.P., N.A.R. Study supervision: R.Y., K.J., R. Rousseau.

Competing interests

B.B.-S., J.B., C.D.P., M.J.D., T.M., A.C., M.B., F.D., A.Y., L.Y., N.C.O., K. Caldwell, J.A., T.B., M.G.H., R. Rousseau, C.V., K.J., M.S., J.F. and R.Y. are employees and shareholders of Gritstone Oncology, Inc., a company developing neoantigen immunotherapies. T.A.C. and N.A.R. are founders and shareholders of Gritstone Oncology and serve on its scientific advisory board. B.B.-S., J.B., C.D.P., T.B., M.S., J.F. and R.Y. are inventors on patents and patent applications relating to this work. T.A.C. holds equity in An2H. T.A.C. acknowledges grant funding from Bristol-Myers Squibb, AstraZeneca, Illumina, Pfizer, An2H and Eisai. T.A.C. has served as an advisor for Bristol-Myers Squibb, AstraZeneca, Illumina, Eisai and An2H. T.A.C., N.A.R. and Memorial Sloan Kettering Cancer Center have a patent filing (PCT/US2015/062208) for the use of tumor mutation burden and HLA for prediction of immunotherapy efficacy, which is licensed to Personal Genome Diagnostics. S.N. is on speaker bureaus for Eli Lilly; Bristol-Myers Squibb; Takeda; Merck, Sharp & Dohme; Boehringer Ingelheim; AstraZeneca; and AbbVie.

Corresponding author

Correspondence to Roman Yelensky.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1, 2 and 5–13, Supplementary Table 1 and Supplementary Notes 1–3

  2. Life Sciences Reporting Summary

  3. Supplementary Figure 3a

    Motifs for HLA-A alleles

  4. Supplementary Figure 3b

    Motifs for HLA-B alleles

  5. Supplementary Figure 3c

    Motifs for HLA-C alleles

  6. Supplementary Figure 4

    Precision-recall curves for all test samples

Zip files

  1. Supplementary Software

    EDGE model code

Excel files

  1. Supplementary Data 1

    Specimen characteristics and MS + NGS metrics

  2. Supplementary Data 5

    Demographics of NSCLC patients

  3. Supplementary Data 6

    Neoantigen and infectious disease epitopes in IVS control

  4. Supplementary Data 7

    Neoantigen peptides tested in healthy donors

  5. Supplementary Data 8

    MSD cytokine multiplex and ELISA assays on ELISpot supernatants from NSCLC neoantigen peptides

CSV files

  1. Supplementary Data 2

    Model predicts HLA peptide stability

  2. Supplementary Data 3a

    T-cell epitope dataset from studies A, B and D

  3. Supplementary Data 3b

    T-cell epitope dataset from study C

  4. Supplementary Data 3c

    Predicted ranks of mutations with pre-existing CD8 response

  5. Supplementary Data 4

    Peptides tested for T-cell recognition in NSCLC patients

  6. Supplementary Data 9

    RNA expression dataset used for model training and testing

About this article

Publication history