Personalized deep learning of individual immunopeptidomes to identify neoantigens for cancer vaccines


Tumour-specific neoantigens play a major role in developing personal vaccines for cancer immunotherapy. We propose a personalized de novo peptide sequencing workflow to identify HLA-I and HLA-II neoantigens directly and solely from mass spectrometry data. Our workflow trains a personal deep learning model on the immunopeptidome of an individual patient and then uses it to predict mutated neoantigens of that patient. This personalized learning and mass spectrometry-based approach enables comprehensive and accurate identification of neoantigens. We applied the workflow to datasets of five patients with melanoma and expanded their predicted immunopeptidomes by 5–15%. Subsequently, we discovered neoantigens of both HLA-I and HLA-II, including those with validated T-cell responses and those that had not been reported in previous studies.

Fig. 1: Personalized de novo peptide sequencing workflow for neoantigen discovery.
Fig. 2: Performance of personalized and generic models on five patients.
Fig. 3: Immune characteristics of de novo HLA-I peptides from patient Mel-15.

Data availability

RAW files from ref. 14 were downloaded from the ProteomeXchange repository, accession no. PXD004894. RAW files from ref. 22 were downloaded from the MassIVE repository, accession nos. MSV000084172 and MSV000080527.

Code availability

DeepNovo and the workflow are implemented in Python. The latest version is open source and available on GitHub.


  1. Hu, Z., Ott, P. A. & Wu, C. J. Towards personalized, tumour-specific, therapeutic vaccines for cancer. Nat. Rev. Immunol. 18, 168–182 (2018).

  2. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).

  3. Rizvi, N. A. et al. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124–128 (2015).

  4. Ott, P. A. et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221 (2017).

  5. Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226 (2017).

  6. Carreno, B. M. et al. A dendritic cell vaccine increases the breadth and diversity of melanoma neoantigen-specific T cells. Science 348, 803–808 (2015).

  7. Andreatta, M. & Nielsen, M. Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics 32, 511–517 (2016).

  8. Jurtz, V. et al. NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).

  9. Abelin, J. G. et al. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. Immunity 46, 315–326 (2017).

  10. Bulik-Sullivan, B. et al. Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. Nat. Biotechnol. 37, 55–63 (2019).

  11. Bassani-Sternberg, M. & Gfeller, D. Unsupervised HLA peptidome deconvolution improves ligand prediction accuracy and predicts cooperative effects in peptide–HLA interactions. J. Immunol. 197, 2492–2499 (2016).

  12. Bassani-Sternberg, M. et al. Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity. PLoS Comput. Biol. 13, e1005725 (2017).

  13. Gfeller, D. et al. The length distribution and multiple specificity of naturally presented HLA-I ligands. J. Immunol. 201, 3705–3716 (2018).

  14. Bassani-Sternberg, M. et al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat. Commun. 7, 13404 (2016).

  15. Laumont, C. M. et al. Noncoding regions are the main source of targetable tumor-specific antigens. Sci. Transl. Med. 10, eaau5516 (2018).

  16. Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008).

  17. Zhang, J. et al. PEAKS DB: de novo sequencing assisted database search for sensitive and accurate peptide identification. Mol. Cell. Proteomics 11, M111.010587 (2012).

  18. Faridi, P. et al. A subset of HLA-I peptides are not genomically templated: evidence for cis- and trans-spliced peptide ligands. Sci. Immunol. 3, eaar3947 (2018).

  19. Tran, N. H., Zhang, X., Xin, L., Shan, B. & Li, M. De novo peptide sequencing by deep learning. Proc. Natl Acad. Sci. USA 114, 8247–8252 (2017).

  20. Tran, N. H. et al. Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry. Nat. Methods 16, 63–66 (2019).

  21. Sutskever, I., Vinyals, O. & Le, Q. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 27, 3104–3112 (2014).

  22. Sarkizova, S. et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. Nat. Biotechnol. 38, 199–209 (2020).

  23. Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2018).

  24. Andreatta, M., Alvarez, B. & Nielsen, M. GibbsCluster: unsupervised clustering and alignment of peptide sequences. Nucleic Acids Res. 45, W458–W463 (2017).

  25. Calis, J. J. A. et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput. Biol. 9, e1003266 (2013).

  26. Bassani-Sternberg, M., Pletscher-Frankild, S., Jensen, L. J. & Mann, M. Mass spectrometry of human leukocyte antigen class I peptidomes reveals strong effects of protein abundance and turnover on antigen presentation. Mol. Cell. Proteomics 14, 658–673 (2015).

  27. Keskin, D. B. et al. Neoantigen vaccine generates intratumoral T-cell responses in phase Ib glioblastoma trial. Nature 565, 234–239 (2019).


This work was funded in part by the NSERC grant OGP0046506, the Canada Research Chair programme and the National Key R&D Program of China 2018YFB1003202. N.H.T. was supported by the Mitacs Elevate Fellowship. The authors thank K. Pui Choi and J. Xu for critical reading of the manuscript.

Author information

M.L. and B.S. conceived the research idea. N.H.T. designed the neoantigen discovery workflow. N.H.T. and R.Q. implemented the software and analysed the results. X.C. and L.X. contributed to model design, software development and data analysis. N.H.T., M.L. and R.Q. wrote the manuscript. M.L., B.S. and L.X. supervised the research project.

Corresponding authors

Correspondence to Baozhen Shan or Ming Li.

Ethics declarations

Competing interests

The workflow in Fig. 1 is the subject of an application for a patent (as a USPTO provisional application by Bioinformatics Solutions Inc., Waterloo, Canada). The authors are named inventors in the patent application. L.X., X.C. and B.S. are employees of Bioinformatics Solutions.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Length distributions of HLA de novo and database peptides.

a, Mel-5 HLA-I; b, Mel-8 HLA-I; c, Mel-12 HLA-I; d, Mel-16 HLA-I; e, Mel-15 HLA-II; f, Mel-16 HLA-II.

Extended Data Fig. 2 Binding affinity of de novo and database HLA-I peptides.

Dashed lines indicate default thresholds of weak-binding (rank 2.0%) and strong-binding (rank 0.5%) of NetMHCpan.
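The two thresholds in this caption can be turned into a simple labelling rule. The sketch below is only illustrative: the function name `classify_binder` is hypothetical, the input is assumed to be a NetMHCpan-style percentile rank, and treating the thresholds as strict upper bounds is an assumption rather than something stated in the caption.

```python
# Thresholds taken from the caption (percentile rank, lower is better).
STRONG_RANK = 0.5  # strong-binding threshold (rank 0.5%)
WEAK_RANK = 2.0    # weak-binding threshold (rank 2.0%)


def classify_binder(rank_percent: float) -> str:
    """Label a peptide by its percentile rank (hypothetical helper).

    Assumption: a rank strictly below a threshold counts as passing it.
    """
    if rank_percent < 0:
        raise ValueError("rank must be non-negative")
    if rank_percent < STRONG_RANK:
        return "strong"
    if rank_percent < WEAK_RANK:
        return "weak"
    return "non-binder"


if __name__ == "__main__":
    # Made-up ranks, for illustration only.
    for rank in (0.1, 1.2, 7.5):
        print(rank, classify_binder(rank))
```

Applied to de novo and database peptides separately, a rule of this kind would give the fraction of each group falling below the dashed lines.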

Extended Data Fig. 3 Immunogenicity of de novo and database HLA-I peptides.

Supplementary information

Supplementary Information

Supplementary Figs. 1–6.

Reporting Summary

Supplementary Table 1

Step-by-step results of our personalized workflow for neoantigen discovery on five patients Mel-5, Mel-8, Mel-12, Mel-15 and Mel-16 from ref. 14. S2: Number of de novo and database HLA peptides identified at 1% FDR. S3: Peptide-spectrum matches of de novo HLA peptides at 1% FDR. S4: Performance of the personalized models versus the generic model that was trained on a dataset of 95 HLA-I alleles by Sarkizova et al. (ref. 22). S5: Criteria to select candidate neoantigens from de novo HLA peptides.
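For readers unfamiliar with the 1% FDR cut-off mentioned above, the following sketch shows one common way such a filter can be applied, namely target-decoy estimation. This page does not say how FDR was estimated in the workflow, so the target-decoy approach, the function name `accept_at_fdr` and the tuple layout are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Each identification: (peptide, score, is_decoy). Higher score = better match.
Psm = Tuple[str, float, bool]


def accept_at_fdr(psms: Iterable[Psm], max_fdr: float = 0.01) -> List[str]:
    """Return target peptides kept at the largest cutoff whose estimated
    FDR (decoys / targets above the cutoff) does not exceed max_fdr.

    Assumption: simple target-decoy estimate; score ties are not handled.
    """
    ranked = sorted(psms, key=lambda p: p[1], reverse=True)
    targets = 0
    decoys = 0
    best_cut = 0  # number of top-ranked entries to keep
    for i, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [pep for pep, _, is_decoy in ranked[:best_cut] if not is_decoy]


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    demo = [("PEPA", 10.0, False), ("PEPB", 9.0, False),
            ("DECOY", 8.0, True), ("PEPC", 7.0, False)]
    print(accept_at_fdr(demo, max_fdr=0.01))
```

With the made-up input above, only the identifications ranked ahead of the first decoy survive the 1% threshold; loosening `max_fdr` admits lower-scoring peptides.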

About this article

Cite this article

Tran, N.H., Qiao, R., Xin, L. et al. Personalized deep learning of individual immunopeptidomes to identify neoantigens for cancer vaccines. Nat Mach Intell 2, 764–771 (2020).
