Predicting HLA class II antigen presentation through integrated deep learning


Accurate prediction of antigen presentation by human leukocyte antigen (HLA) class II molecules would be valuable for vaccine development and cancer immunotherapies. Current computational methods trained on in vitro binding data are limited by insufficient training data and algorithmic constraints. Here we describe MARIA (major histocompatibility complex analysis with recurrent integrated architecture), a multimodal recurrent neural network for predicting the likelihood of antigen presentation from a gene of interest in the context of specific HLA class II alleles. In addition to in vitro binding measurements, MARIA is trained on peptide HLA ligand sequences identified by mass spectrometry, expression levels of antigen genes and protease cleavage signatures. Because it leverages these diverse training data and our improved machine learning framework, MARIA (area under the curve = 0.89–0.92) outperformed existing methods in validation datasets. Across independent cancer neoantigen studies, peptides with high MARIA scores are more likely to elicit strong CD4+ T cell responses. MARIA allows identification of immunogenic epitopes in diverse cancers and autoimmune disease.


Fig. 1: Rationale and framework for the development of a new method for prediction of HLA-II ligands.
Fig. 2: Features, model architecture and validation performance of MARIA.
Fig. 3: Benchmarking MARIA performance against existing binding-based methods with independent HLA-DR test sets.
Fig. 4: MARIA trained on human HLA-DQ ligand peptides identified celiac-related gluten antigens.
Fig. 5: MARIA identifies lymphoma immunoglobulin HLA-DR presentation hotspots in patients with MCL.
Fig. 6: MARIA scores predict melanoma HLA-II-presented antigens and are associated with post-vaccine CD4+ T cell responses.

Data availability

Ligandomes are available from the PRIDE Archive under accession numbers PXD004746 and PXD005704. Data from two K562 ligandomes (Fig. 3) are provided in Supplementary Tables 5 and 6. The remaining HLA ligand datasets are publicly available from the provided references. All other data are available from the corresponding authors upon reasonable request.

Code availability

Researchers can run MARIA online. Custom software code described in this work is available for academic research upon request from the authors. Commercial entities with interest in the software should contact Stanford University’s Office of Technology Licensing and reference docket S19-020.


1. Neefjes, J., Jongsma, M. L., Paul, P. & Bakke, O. Towards a systems understanding of MHC class I and MHC class II antigen presentation. Nat. Rev. Immunol. 11, 823–836 (2011).
2. Khodadoust, M. S. et al. Antigen presentation profiling reveals recognition of lymphoma immunoglobulin neoantigens. Nature 543, 723–727 (2017).
3. Bassani-Sternberg, M. et al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat. Commun. 7, 13404 (2016).
4. Toes, R. E., Ossendorp, F., Offringa, R. & Melief, C. J. CD4 T cells and their role in antitumor immune responses. J. Exp. Med. 189, 753–756 (1999).
5. Schreiber, R. D., Old, L. J. & Smyth, M. J. Cancer immunoediting: integrating immunity’s roles in cancer suppression and promotion. Science 331, 1565–1570 (2011).
6. Linnemann, C. et al. High-throughput epitope discovery reveals frequent recognition of neo-antigens by CD4+ T cells in human melanoma. Nat. Med. 21, 81 (2015).
7. Tran, E. et al. Immunogenicity of somatic mutations in human gastrointestinal cancers. Science 350, 1387–1390 (2015).
8. Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226 (2017).
9. Ott, P. A. et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221 (2017).
10. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).
11. Kreiter, S. et al. Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature 520, 692–696 (2015).
12. The problem with neoantigen prediction. Nat. Biotechnol. 35, 97 (2017).
13. Khodadoust, M. S. & Alizadeh, A. A. Tumor antigen discovery through translation of the cancer genome. Immunol. Res. 58, 292–299 (2014).
14. Moss, D. L., Park, H. W., Mettu, R. R. & Landry, S. J. Deimmunizing substitutions in Pseudomonas exotoxin domain III perturb antigen processing without eliminating T-cell epitopes. J. Biol. Chem. 294, 4667–4681 (2019).
15. Andreatta, M. et al. An automated benchmarking platform for MHC class II binding prediction methods. Bioinformatics 34, 1522–1528 (2017).
16. Marty, R., Thompson, W. K., Salem, R. M., Zanetti, M. & Carter, H. Evolutionary pressure against MHC class II binding cancer mutations. Cell 175, 416–428 (2018).
17. Zhao, W. & Sher, X. Systematically benchmarking peptide-MHC binding predictors: from synthetic to naturally processed epitopes. PLoS Comput. Biol. 14, e1006457 (2018).
18. Yadav, M. et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature 515, 572–576 (2014).
19. Caron, E. et al. Analysis of major histocompatibility complex (MHC) immunopeptidomes using mass spectrometry. Mol. Cell. Proteomics 14, 3105–3117 (2015).
20. Abelin, J. G. et al. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. Immunity 46, 315–326 (2017).
21. Vita, R. et al. The immune epitope database (IEDB) 3.0. Nucleic Acids Res. 43, D405–D412 (2015).
22. Andreatta, M. et al. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification. Immunogenetics 67, 641–650 (2015).
23. Paul, S. et al. Determination of a predictive cleavage motif for eluted major histocompatibility complex class II ligands. Front. Immunol. 9, 1795 (2018).
24. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Computation 9, 1735–1780 (1997).
25. Liu, H., Han, F., Zhou, H., Yan, X. & Kosik, K. S. Fast motif discovery in short sequences. In Proc. 32nd IEEE International Conference on Data Engineering 1158–1169 (IEEE, 2016).
26. Nielsen, M., Lundegaard, C. & Lund, O. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method. BMC Bioinformatics 8, 238 (2007).
27. Nielsen, M. & Lund, O. NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC Bioinformatics 10, 296 (2009).
28. Sturniolo, T. et al. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat. Biotechnol. 17, 555–561 (1999).
29. Sidney, J. et al. Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries. Immunome Res. 4, 2 (2008).
30. Wang, P. et al. Peptide binding predictions for HLA DR, DP and DQ molecules. BMC Bioinformatics 11, 568 (2010).
31. Ciccocioppo, R., Di Sabatino, A. & Corazza, G. R. The immune recognition of gluten in coeliac disease. Clin. Exp. Immunol. 140, 408–416 (2005).
32. Bergseng, E. et al. Different binding motifs of the celiac disease-associated HLA molecules DQ2.5, DQ2.2, and DQ7.5 revealed by relative quantitative proteomics of endogenous peptide repertoires. Immunogenetics 67, 73–84 (2015).
33. Dorum, S. et al. HLA-DQ molecules as affinity matrix for identification of gluten T cell epitopes. J. Immunol. 193, 4497–4506 (2014).
34. Falk, K., Rotzschke, O., Stevanovic, S., Jung, G. & Rammensee, H. G. Pool sequencing of natural HLA-DR, DQ, and DP ligands reveals detailed peptide motifs, constraints of processing, and general rules. Immunogenetics 39, 230–242 (1994).
35. Chicz, R. M., Graziano, D. F., Trucco, M., Strominger, J. L. & Gorga, J. C. HLA-DP2: self peptide sequences and binding properties. J. Immunol. 159, 4935–4942 (1997).
36. Lorente, E. et al. Proteomics analysis reveals that structural proteins of the virion core and involved in gene expression are the main source for HLA class II ligands in vaccinia virus-infected cells. J. Proteome Res. 18, 900–911 (2019).
37. Chong, C. et al. High-throughput and sensitive immunopeptidomics platform reveals profound interferon-gamma-mediated remodeling of the human leukocyte antigen (HLA) ligandome. Mol. Cell. Proteomics 17, 533–548 (2018).
38. Butterfield, L. H. State of the art review: cancer vaccines. BMJ 350, h988 (2015).
39. O'Donnell, T. J. et al. MHCflurry: open-source class I MHC binding affinity prediction. Cell Syst. 7, 129–132 (2018).
40. Rubinsteyn, A., Hodes, I., Kodysh, J. & Hammerbacher, J. Vaxrank: a computational tool for designing personalized cancer vaccines. Preprint at bioRxiv (2017).
41. Parkhurst, M. et al. Isolation of T-cell receptors specifically reactive with mutated tumor-associated antigens from tumor-infiltrating lymphocytes based on CD137 expression. Clin. Cancer Res. 23, 2491–2505 (2017).
42. Zacharakis, N. et al. Immune recognition of somatic mutations leading to complete durable regression in metastatic breast cancer. Nat. Med. 24, 724–730 (2018).
43. Stevanovic, S. et al. Landscape of immunogenic tumor antigens in successful immunotherapy of virally induced epithelial cancer. Science 356, 200–205 (2017).
44. Iiizumi, S. et al. Identification of novel HLA class II-restricted neoantigens derived from driver mutations. Cancers (Basel) 11, 266 (2019).
45. Keskin, D. B. et al. Neoantigen vaccine generates intratumoral T cell responses in phase Ib glioblastoma trial. Nature 565, 234–239 (2019).
46. Muller, M., Gfeller, D., Coukos, G. & Bassani-Sternberg, M. ‘Hotspots’ of antigen presentation revealed by human leukocyte antigen ligandomics for neoantigen prioritization. Front. Immunol. 8, 1367 (2017).
47. Luo, H. et al. Machine learning methods for predicting HLA-peptide binding activity. Bioinform. Biol. Insights 9, 21–29 (2015).
48. Bassani-Sternberg, M. et al. Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity. PLoS Comput. Biol. 13, e1005725 (2017).
49. Jurtz, V. et al. NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).
50. Bassani-Sternberg, M., Pletscher-Frankild, S., Jensen, L. J. & Mann, M. Mass spectrometry of human leukocyte antigen class I peptidomes reveals strong effects of protein abundance and turnover on antigen presentation. Mol. Cell. Proteomics 14, 658–673 (2015).
51. Shao, W. et al. The SysteMHC Atlas project. Nucleic Acids Res. 46, D1237–D1247 (2018).
52. Racle, J. et al. Deep motif deconvolution of HLA-II peptidomes for robust class II epitope predictions. Preprint at bioRxiv (2019).
53. Bhattacharya, R. et al. Prediction of peptide binding to MHC class I proteins in the age of deep learning. Preprint at bioRxiv (2017).
54. Mommen, G. P. et al. Sampling from the proteome to the human leukocyte antigen-DR (HLA-DR) ligandome proceeds via high specificity. Mol. Cell. Proteomics 15, 1412–1423 (2016).
55. Graham, D. B. et al. Antigen discovery and specification of immunodominance hierarchies for MHCII-restricted epitopes. Nat. Med. 24, 1762–1772 (2018).
56. Forsstrom, B. et al. Proteome-wide epitope mapping of antibodies using ultra-dense peptide arrays. Mol. Cell. Proteomics 13, 1585–1597 (2014).
57. Jorgensen, K. W., Rasmussen, M., Buus, S. & Nielsen, M. NetMHCstab—predicting stability of peptide–MHC-I complexes; impacts for cytotoxic T lymphocyte epitope discovery. Immunology 141, 18–26 (2014).
58. Boelen, L. et al. BIITE: a tool to determine HLA class II epitopes from T cell ELISpot data. PLoS Comput. Biol. 12, e1004796 (2016).
59. Esteva, A. et al. A guide to deep learning in healthcare. Nat. Med. 25, 24–29 (2019).
60. Maddelein, D. et al. The IceLogo web server and SOAP service for determining protein consensus sequences. Nucleic Acids Res. 43, W543–W546 (2015).
61. Tiscornia, G., Singer, O. & Verma, I. M. Production and purification of lentiviral vectors. Nat. Protoc. 1, 241–245 (2006).
62. Fujita, H. et al. Human Langerhans cells induce distinct IL-22-producing CD4+ T cells lacking IL-17 production. Proc. Natl Acad. Sci. USA 106, 21795–21800 (2009).
63. Hunt, D. F. et al. Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectrometry. Science 255, 1261–1263 (1992).
64. Bai, Y., Ni, M., Cooper, B., Wei, Y. & Fury, W. Inference of high resolution HLA types using genome-wide RNA or DNA sequencing reads. BMC Genomics 15, 325 (2014).
65. Nariai, N. et al. HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data. BMC Genomics 16, S7 (2015).
66. Verdegaal, E. M. et al. Neoantigen landscape dynamics during human melanoma–T cell interactions. Nature 536, 91–95 (2016).
67. The Cancer Genome Atlas Network. Genomic classification of cutaneous melanoma. Cell 161, 1681–1696 (2015).
68. Rahal, R. et al. Pharmacological and genomic profiling identifies NF-κB-targeted treatment strategies for mantle cell lymphoma. Nat. Med. 20, 87–92 (2014).
69. The ENCODE Project Consortium. A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9, e1001046 (2011).
70. Chen, J., Aronow, B. J. & Jegga, A. G. Disease candidate gene identification and prioritization using protein interaction networks. BMC Bioinformatics 10, 73 (2009).
71. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 46, 2699 (2018).
72. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).
73. Asgari, E. & Mofrad, M. R. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS One 10, e0141287 (2015).
74. Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Proc. 27th International Conference on Machine Learning (Eds. Fuernkranz, J. & Joachims, T.) 807–814 (Omnipress, 2010).
75. Karosiene, E. et al. NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics 65, 711–724 (2013).
76. Nielsen, M., Justesen, S., Lund, O., Lundegaard, C. & Buus, S. NetMHCIIpan-2.0—improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure. Immunome Res. 6, 9 (2010).
77. Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
78. Lefranc, M. P. et al. IMGT, the international ImMunoGeneTics information system 25 years on. Nucleic Acids Res. 43, D413–D422 (2015).
79. Tran, E. et al. Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer. Science 344, 641–645 (2014).
80. Dhanda, S. K. et al. Predicting HLA CD4 immunogenicity in human populations. Front. Immunol. 9, 1369 (2018).
81. Buitinck, L. et al. API design for machine learning software: experiences from the scikit-learn project. Preprint at arXiv (2013).
82. Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
83. Sun, X. & Xu, W. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process. Lett. 21, 1389–1393 (2014).



This work was supported by National Institutes of Health (NIH) grant U01 CA194389 (to A.A.A.), NIH grant K08 CA207882 (to M.S.K.), NIH grant GM 102365 (to R.B.A.), NIH/Stanford MSTP training grant (to B.C.), an NSF GSF (to E.F.), an American Society of Hematology Scholar Award (to A.A.A.), the V-Foundation (to A.A.A.), a Damon Runyon-Rachleff Innovation Award (to J.E.E.), a W.M. Keck Foundation Medical Research Grant (to J.E.E.), a Conquer Cancer Foundation Young Investigator Award (to M.S.K.), the Leukemia and Lymphoma Society (to A.A.A. and M.S.K.), a Knut and Alice Wallenberg Foundation Postdoctoral Fellowship (to N.O.), a PD Soros New American Fellowship (to B.C.), the Stanford Bio-X Fellowship (to B.C.), the Virginia and D.K. Ludwig Fund for Cancer Research (to A.A.A.), the Bakewell Foundation (to M.D. and A.A.A.) and the SDW/DT and Shanahan Family Foundations (to A.A.A.). A.A.A. is a scholar of the Leukemia and Lymphoma Society. This work used the XStream computational resource, which is supported by the National Science Foundation Major Research Instrumentation program (ACI-1429830). This work used the shared FACS facility, which is supported by NIH S10 Shared Instrument Grant (S10RR027431-01). We thank the NIH Tetramer Facility for providing recombinant HLA-DR monomers for the peptide binding experiment. We thank M. Nielsen for providing insights regarding implementation of the NetMHCIIpan algorithm. We thank C. Linnemann and T. Schumacher for providing detailed peptide sequences. We thank Maria Birukova for discussions; MARIA is named in honor of and dedicated to the memory of the late Maria Birukova (1990–2016).

Author information

B.C., A.A.A., M.S.K., J.E.E., M.M.D., R.L. and R.B.A. conceived the main algorithm design and validation of MARIA. B.C. and E.F. implemented and tested MARIA. B.C. performed statistical analysis. M.S.K., N.O., B.S. and L.E.W. generated training and validation data for MARIA. B.C. and C.L.L. implemented web infrastructure. Y.M. and B.C. conducted structural studies of HLA-DR complexes. B.C., M.S.K. and A.A.A. wrote the paper with input from M.S.K., J.E.E., R.L., M.D. and R.B.A. All authors commented on the final manuscript.

Correspondence to Ash A. Alizadeh.

Ethics declarations

Competing interests

A.A.A. declares the following competing interests: stock or other ownership (CiberMed and Forty Seven); honoraria (Janssen Oncology); consulting or advisory roles (Celgene, Roche/Genentech and Gilead Sciences); research funding (Celgene); patents, royalties or other intellectual property (patent filings on immune deconvolution and circulating tumor DNA detection assigned to Stanford University); and travel, accommodations or expenses (Roche and Gilead Sciences). R.B.A. declares the following competing interests: stock or other ownership (Personalis); consulting or advisory role (Pfizer, Youscript, 23andme and WithHealth); patents, royalties or other intellectual property (royalties for patents related to genome sequencing).

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 In vitro HLA-DR peptide binding assay for experimental validation and associated results.

(a) Workflow of the flow cytometry-based HLA-DR peptide binding assay. Dinitrophenyl-tagged peptides were exchanged into biotin-tagged HLA-DR molecules (alpha and beta chains arranged with a leucine zipper) and loaded onto streptavidin microspheres. Bound HLA-DR peptide complexes were visualized by flow cytometry using fluorescence read-outs from anti-DR antibodies (APC) and anti-dinitrophenyl antibodies (FITC). HLA-DR peptide binding strength was estimated as the percentage of the microsphere population that is high in both fluorescence markers. Peptide-HLA microspheres with >50% of the population above background FITC and APC signals are considered positive binders. (b) MS-identified JeKo-1 cell HLA-DR ligands and negative controls were tested for binding to HLA-DR recombinant proteins (HLA-DRB1*07:01 or HLA-DRB1*04:03) at 37°C. All ligand sequences have >10% NetMHCIIpan ranks (predicted non-binders). (c) Two non-binders in the 37°C experiment were tested for binding to HLA-DR recombinant proteins (HLA-DRB1*07:01 or HLA-DRB1*04:03) at 32°C. KEFYLFPTVFEDN bound HLA-DRB1*07:01 at 32°C. (d) Two technical controls, five positive controls, five negative controls and five JeKo HLA-DR ligands were incubated with HLA-DRB1*04:03 or HLA-DRB1*07:01 protein at 37°C overnight. HLA-DR-APC and anti-DNP-FITC double-positive populations are considered positive binders. Three additional JeKo ligands bound to at least one HLA-DR allele in spite of their poor NetMHCIIpan scores. References for the control peptides are listed in Supplementary Table 11. Each peptide in (b)-(d) was tested in two independent flow cytometry experiments to confirm the result.

Supplementary Figure 2 Peptide sequence encoding and detailed individual neural network architectures.

(a) An example (GCSADQACN) of how variable-length amino acid sequences (8-26 AA) are one-hot encoded for machine learning purposes. A peptide is represented by a 21x26 matrix. Each of the 21 rows corresponds to one possible amino acid, and each column marks the amino acid actually present at that position (1 = true). Any positions not encoding an amino acid because the peptide is short are encoded as an all-zero vector, which is ignored by the neural network masking layer. (b) The model architecture for peptide sequence cleavage scores for HLA-DR presentation. The algorithm takes in a pair of query gene and short peptide sequence, and looks up the human proteome sequence database to determine the upstream and downstream six amino acid sequences (flanking sequences). A two-layer conventional neural network takes in these 12 amino acids and outputs a 0-1 cleavage score indicating the likelihood of HLA-DR presentation given the flanking sequences only. (c) The deep RNN model for predicting HLA-DR peptide presentation based on peptide sequences only. The deep RNN model consists of one masking layer, one RNN layer and two conventional dense layers. The deep RNN model takes in one-hot encoded peptide sequences and outputs presentation scores indicating the likelihood of HLA-DR presentation given peptide sequences only. This model was trained on naturally presented MCL HLA-DR peptide ligands. (d) The deep RNN model for predicting HLA-DR peptide in vitro binding affinities based on IEDB binding data. A pair of query HLA-DR and peptide is encoded as a single sequence consisting of the HLA-DRB1 pseudosequence, a spacer (-), and the query peptide sequence. A deep RNN model takes in one-hot encoded sequences and outputs estimated in vitro binding affinities (1 - log50k(nM)). This model was trained on the IEDB quantitative HLA-DR to peptide binding data, identical to the data used by NetMHCIIpan3.1. (e) Selecting training and validation data for HLA-DR presentation prediction models. ~35k naturally presented HLA-DR peptide ligands and ~105k length-matched random human peptides were randomly assigned to training (85%) and validation (15%) sets. Peptides in the validation set that were identical to, or a substring of, any peptide in the training set were moved to the training set to avoid overfitting. The training and cross-validation were repeated 10 times to determine regularization parameters and to estimate the predictive power of the various models. The final performance of MARIA was determined with independent test sets.
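The encoding and data-selection steps in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 21-symbol alphabet (20 standard amino acids plus "-"), the use of "-" to pad flanks at protein termini, the clipping of the affinity target to [0, 1], and all function names are hypothetical choices made here.

```python
import math

# Assumption: the 21 rows are the 20 standard amino acids plus "-"
# (spacer / protein terminus). The caption does not name the 21st symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
MAX_LEN = 26  # longest peptide considered (8-26 AA)


def one_hot_encode(peptide, max_len=MAX_LEN):
    """Return a 21 x max_len matrix; columns past the peptide end stay all-zero."""
    if len(peptide) > max_len:
        raise ValueError("peptide longer than max_len")
    matrix = [[0] * max_len for _ in ALPHABET]
    for pos, aa in enumerate(peptide):
        matrix[ALPHABET.index(aa)][pos] = 1
    return matrix


def flanking_sequences(protein, peptide, n=6, pad="-"):
    """Return the n residues upstream and downstream of the peptide's first
    occurrence in the protein, padded with `pad` at protein termini."""
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    upstream = protein[max(0, start - n):start].rjust(n, pad)
    downstream = protein[end:end + n].ljust(n, pad)
    return upstream, downstream


def affinity_to_target(ic50_nm, cap=50_000.0):
    """Transform an affinity in nM to 1 - log_cap(nM), clipped to [0, 1]."""
    value = 1.0 - math.log(ic50_nm) / math.log(cap)
    return min(1.0, max(0.0, value))


def move_leaky_peptides(train, validation):
    """Move validation peptides that equal, or are substrings of, a training
    peptide into the training set; return (new_train, new_validation)."""
    leaky = {v for v in validation if any(v in t for t in train)}
    new_train = train + [v for v in validation if v in leaky]
    new_validation = [v for v in validation if v not in leaky]
    return new_train, new_validation
```

For example, `one_hot_encode("GCSADQACN")` sets exactly one entry in each of the first nine columns and leaves the remaining 17 columns as zeros, which is the padding a masking layer would skip.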

Supplementary Figure 3 Relationship between gene expression levels and HLA ligand presentation.

(a) Comparison of gene expression levels of HLA-DR ligands and non-ligands in the JeKo-1 cell line. Distributions of RNA-Seq estimated gene expression levels (TPM) from JeKo-1 cell line HLA-DR ligands (red, n=5720), whole transcriptome (green, n=23165), and non-ligand genes (blue, n=22000) were plotted in logarithmic space. HLA-DR ligands have significantly higher gene expression levels than the whole transcriptome (Mann-Whitney U test, **=P<1e-5); non-ligands have significantly lower gene expression levels than the whole transcriptome (Mann-Whitney U test, **=P<1e-5). In (a)-(c), violin curves represent the probability distribution function of gene expression, black boxes represent the middle two quartiles, and the white dot represents the median. (b) Effects of correcting blood and ECM genes in estimating ligand gene expression levels in MCL patients. We reassigned gene expression values of blood particle or extracellular matrix (ECM) associated genes to 50 TPM. Significantly fewer MCL patient HLA-DR ligand genes have low expression levels (0.1 TPM) after correction of blood and ECM genes (green, Mann-Whitney U test, **=P<1e-5, n=34049). (c) Comparison of gene expression levels of HLA-I ligands and non-ligands in the JeKo-1 cell line and MCL patients. Distributions of RNA-Seq estimated gene expression levels (TPM) from HLA-I ligands (red, n=60169 and 5555) and random protein-coding genes (green, n=23165 and 23165) were plotted in logarithmic space. HLA-I ligands have significantly higher gene expression levels than the random protein-coding genes in both the JeKo-1 cell line (Mann-Whitney U test, **=P<1e-5) and MCL patients (Mann-Whitney U test, **=P<1e-5). Lowly expressed HLA-I ligands (<0.1 TPM) are weakly enriched for blood micro-particles (FDR-corrected hypergeometric test, q-value < 0.05). (d) Influence of filtering decoy peptide gene expression values on the predictive power of gene expression. Gene expression alone can differentiate between presented peptides and random decoys with 0.81 AUC. However, the predictive power decreases as we remove lowly expressed genes. Gene expression does not differentiate MCL HLA-DR ligands from decoy peptides with >25 TPM gene expression values (0.51 AUC, n=3300 for ligands and n=10000 for decoy peptides). (e) Performance of MARIA on the MCL validation set using various gene expression profiles. For 6 MCL patients, MARIA was run with three different gene expression profile dictionaries: patient-matched RNA-Seq, external MCL RNA-Seq, and shuffled external MCL RNA-Seq. MARIA predictive power does not differ between patient-matched and external MCL RNA-Seq dictionaries (P=0.3, n=6). MARIA predictive power decreases when using a shuffled RNA-Seq dictionary (P=0.0002, n=6). P-values were determined with a two-tailed paired t-test. (f) Performance of MARIA using tissue-matched or tissue-mismatched gene expression profiles. For predicting ligands presented by melanoma HLA-II (n=10513), using tissue-mismatched RNA-Seq dictionaries decreases MARIA performance by less than 1% AUC. However, using a shuffled RNA-Seq dictionary decreased MARIA performance by 7% AUC. SKCM: Skin cutaneous melanoma, GBM: Glioblastoma multiforme, BRCA: Breast invasive carcinoma, LUSC: Lung squamous cell carcinoma. (g) Influence of expression level thresholds for correction of extracellular matrix (ECM) genes on model performance. Depicted is the relationship between cross-validation AUC at various correction levels (b, TPM thresholds) for genes associated with extracellular matrix (Gene Ontology Cellular Compartment accession GO:0031012). Asterisks mark significant differences in cross-validation AUCs when comparing TPM thresholds lower than the mean level (TPM~50) that we had originally selected (** indicates Mann-Whitney U test P=0.0002 for TPM0 vs. TPM50, n=10; * indicates Mann-Whitney U test P=0.001 for TPM10 vs. TPM50, n=10). Models using thresholds between 20-100 TPM yielded highly similar validation AUC scores.
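Panels (b) and (d) combine two simple operations: reassigning selected genes to a fixed TPM, and scoring how well expression alone separates ligands from decoys. The sketch below illustrates both under stated assumptions; the function names and the example gene names are hypothetical, and the AUC is computed with the plain rank identity (AUC equals the Mann-Whitney U statistic divided by the number of ligand-decoy pairs) rather than with any particular statistics package.

```python
def correct_ecm_expression(tpm_by_gene, corrected_genes, value=50.0):
    """Return a copy of the expression dictionary in which every gene listed
    in `corrected_genes` is reassigned to a fixed TPM value."""
    return {
        gene: (value if gene in corrected_genes else tpm)
        for gene, tpm in tpm_by_gene.items()
    }


def rank_auc(positives, negatives):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all pairs.

    This equals Mann-Whitney U / (n_pos * n_neg). The pairwise loop is
    quadratic, which is fine for a sketch but slow for large inputs."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def auc_above_threshold(ligand_tpm, decoy_tpm, min_tpm):
    """AUC of expression after dropping peptides whose gene TPM is <= min_tpm."""
    ligands = [x for x in ligand_tpm if x > min_tpm]
    decoys = [x for x in decoy_tpm if x > min_tpm]
    return rank_auc(ligands, decoys)
```

Raising `min_tpm` in `auc_above_threshold` mirrors the filtering described in panel (d): once only well-expressed genes remain, expression carries little information and the AUC is expected to approach 0.5.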

Supplementary Figure 4 Analysis of HLA ligand cleavage signatures.

(a, b) Cleavage signature analysis for MCL patient HLA-I ligands. Frequencies of 20 amino acids plus the protein terminal position (-) in the 6 amino acids upstream of the C-terminus of HLA-I ligands (-6 to -1) and the 6 amino acids downstream of HLA-I ligands (+1 to +6) are compared to the background distribution (n=22100 and n=42906) to determine amino acid enrichment and depletion surrounding HLA-DR ligands. Colors of the heat-map (a) and sizes of the IceLogo plot (b) letters indicate fold-change. The logo plot only includes statistically significant enrichment (P<0.0001). Compared to HLA-DR, HLA-I showed a preference for tryptophan in both upstream and downstream sequences. (c) Comparison of cleavage signatures distinguishing flanking sequences of HLA-I and HLA-DR ligands. The IceLogo plot only includes statistically significant enrichments/depletions (P<0.001, two-tailed independent t-test by IceLogo (ref. 60)). Methionines, lysines, glutamines, and histidines are notably enriched in sequences flanking HLA-I ligands (n=22100) compared to the same regions of HLA-DR ligands (n=12150). In contrast, tryptophans, glycines, and tyrosines appear depleted from HLA-I flanks relative to HLA-DR. Cleavage signatures of (d) JeKo-1 mantle cell lymphoma cell lines, (e) L128 mantle cell lymphoma cell lines, (f) patient melanoma tissues, and (g) MUTZ3 dendritic cell lines. Each cell or tissue type exhibits variable cleavage signatures, but their HLA-DR ligands consistently show enrichment for peptides at the tails of proteins (-) and depletion of proline (P) and histidine (H) in flanking regions. Cleavage signatures of our lymphoma HLA-I and HLA-II ligands compared to two previous studies. (h) HLA-II ligands from the dendritic cell line MUTZ3 (HLA-DRB1*10, HLA-DRB1*11, n=11419) were profiled by Mommen et al. 2016. Amino acid frequencies at six amino acids upstream and downstream of the presented ligands were compared to their counterparts from our MCL HLA-II ligands. IceLogo plots showed statistically significant enrichment or depletion at each position (P-value cut-off = 0.001). Methionine showed enrichment in dendritic cell cleavage signatures, and prolines showed stronger depletions at the majority of positions. (i) Similarly, HLA-I ligands from B-lymphoblastoid cells profiled by Abelin et al. 2017 (HLA-A*01:01, HLA-A*02:01, HLA-B*35:01, n=4857) were compared to our MCL HLA-I ligands. Fewer significant differences were observed. Phenylalanines were enriched in B-lymphoblastoid cleavage signatures.
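The fold-change part of a cleavage signature comes down to per-position residue frequencies in ligand flanks divided by the same frequencies in a background set. The following sketch shows that calculation only; it is an assumption-laden illustration, not IceLogo itself. The pseudocount, the log2 scale and the function names are choices made here, and the significance testing mentioned in the caption is not reproduced.

```python
import math
from collections import Counter


def position_frequencies(flanks):
    """For equal-length flank strings, return one {residue: frequency}
    dictionary per position."""
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("all flanks must have the same length")
    frequencies = []
    for pos in range(length):
        counts = Counter(f[pos] for f in flanks)
        frequencies.append({aa: c / len(flanks) for aa, c in counts.items()})
    return frequencies


def log2_fold_change(ligand_flanks, background_flanks, pseudocount=1e-3):
    """Per-position log2 ratio of ligand to background residue frequency.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for residues absent from one of the two sets."""
    ligand = position_frequencies(ligand_flanks)
    background = position_frequencies(background_flanks)
    result = []
    for lig, bg in zip(ligand, background):
        residues = set(lig) | set(bg)
        result.append({
            aa: math.log2((lig.get(aa, 0.0) + pseudocount)
                          / (bg.get(aa, 0.0) + pseudocount))
            for aa in residues
        })
    return result
```

Positive values indicate residues enriched around ligands (for example "-" when ligands sit at protein termini), and negative values indicate depletion.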

Supplementary Figure 5 Training and validation data sources and feature importance for presentation models.

(a) Detailed HLA-II ligand data and gene expression data used in training and validation of MARIA models. (b) Distributions of minimum additive distances of validation peptide sequences to training peptide sequences. The median minimum additive distance is around 7, which indicates these validation peptides need to undergo at least seven amino acid changes to become a peptide in the training set. No identical peptides were present in both training and validation sets (minimum additive distance > 0). (c) Performance of RNN-based binding models compared to NetMHCIIpan3.1. RNN-based HLA-DR in vitro binding model was trained on the identical IEDB HLA-DR data of NetMHCIIpan3.1 and validated on naturally presented MCL HLA-DR ligands (18 MCL samples). RNN-based binding models and NetMHCIIpan3.1 got about the same predictive performance (ROC-AUC=0.64, Mann-Whitney U test P=0.34, n=18). (d) Detailed 10-fold cross validation performance on identifying naturally presented with different predictors. MARIA models considering all relevant features (peptide sequence, gene expression, predicted in vitro binding, and cleavage scores) have higher average AUC scores than the second best model (RNN with sequence only, Mann-Whitney U test P<1e-5, n=10). (e) Validation performance of logistical regression models combining gene expression, binding scores and cleavage scores. Logistical regression models were trained on training MCL HLA-DR ligand data, and the validation performance was reported as average AUCs of 10-fold cross validation. Combining gene expression, binding scores and cleavage scores moderately increases the AUC compared to gene expression alone or combined with one additional feature (AUC=0.82, DeLong test p<0.0001, n=3300 for ligand peptides and n=10,000 for decoy peptides). (f) Comparing deep RNN models and shallow neural network (NN) models on predicting HLA-DR ligands based on peptide sequences only. 
Trained and validated on identical sequence data, deep RNN models achieved higher validation AUC than shallow NN models after the 6th epoch. The solid lines indicate the average validation AUC of 5 independent training experiments, and the shaded areas indicate the 95% confidence interval (n=3300 for ligand peptides and n=10,000 for decoy peptides). (g) Impact of training dataset size on prediction performance for pan-HLA-II MARIA models. We trained new MARIA models using varying numbers (x-axis) of randomly sampled training peptide ligand examples from a pan-HLA-II dataset profiling diverse cell types, combined with the data we originally used to train MARIA (Khodadoust et al. 2017). Validation AUCs (y-axis) were then calculated using two monoallelic HLA-DR datasets (top panel: DRB1*01:01, bottom panel: DRB1*04:04) originally shown in Fig. 3. Models with more training examples show stronger performance, but AUC gains plateau after ~20k peptides. Surprisingly, models trained using pan-HLA-II data from diverse cell types did not significantly outperform the original MARIA model trained only on HLA-DR ligands from a single tumor type (two-tailed independent t-test P=0.35, n=10). The shaded area depicts the 95% confidence interval around the mean, based on 10 independently trained models, with the mean performance depicted by the solid line. (h) Performance of pan-HLA-II models for differentiating HLA-DP ligands from random human peptides. A recurrent neural network model was trained on presented HLA-II ligands identified with MS and used to score 20 reported HLA-DP ligands (Supplementary Table 12) and 100 random human peptides. Presentation scores for HLA-DP ligands were significantly higher than those for random human peptides (Mann-Whitney U test P=3e-6), and this difference corresponded to an AUC of 0.82.
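Several panels above report both a Mann-Whitney U test and an AUC for the same comparison. The two are directly related: the ROC-AUC equals the U statistic divided by the product of the two group sizes, that is, the probability that a randomly chosen positive scores above a randomly chosen negative. The following is a minimal Python sketch of that relationship. The function name and the toy scores are illustrative assumptions and are not taken from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win, which makes the result equal to the
    Mann-Whitney U statistic divided by (n_pos * n_neg).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores invented for illustration only (not data from the study).
ligand_scores = [0.9, 0.8, 0.7, 0.4]
decoy_scores = [0.6, 0.5, 0.3, 0.2, 0.1]
auc = auc_from_scores(ligand_scores, decoy_scores)
```

The pairwise loop is quadratic in the number of peptides, which is fine for a comparison such as 20 ligands against 100 random peptides; a rank-based implementation would be preferable for larger sets.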

Supplementary Figure 6

(a) Comparison of interacting residues in the HLA-DRB1*01:01 and HLA-DRB1*04:04 alleles. The two alleles differ in 7 of the 19 amino acid positions that potentially interact with peptide ligands. (b) Surface HLA-DR, HLA-I, and immunoglobulin M (IgM) densities of K562 cell lines after lentiviral transduction. Transduced K562 cell lines are HLA-DR positive and HLA-I negative. K562 cell HLA-DR densities are substantially lower than those of B-cell lines (JeKo-1 and HBL-1). (c) HLA-DR densities after sorting and antibiotic selection. Transduced K562 cell lines were sorted for the top 1% of HLA-DR expression and grown in selective media (2 µg/ml puromycin). The sorted mono-allelic K562 cells for DRB1*01:01 and DRB1*04:04 have higher HLA-DR densities (~10-fold increase compared to the unsorted populations). Two flow cytometry profiling experiments were conducted for each K562 cell line in (b) and (c). (d) Overlap of K562 DRB1*01:01 and DRB1*04:04 peptide ligand sequences. Ligands from these two alleles overlap by 15% when counting identical peptide sequences only and by 31% when including peptides that are substrings of each other (Fig. 3a).
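The two overlap figures in (d) differ only in how a match is defined: identical sequences versus sequences where one peptide is contained in the other. A minimal Python sketch of both counts follows. The caption does not state which denominator was used, so dividing by the size of the first set is an assumption, and the function name and toy peptides are hypothetical.

```python
def overlap_fractions(ligands_a, ligands_b):
    """Return (exact, exact_or_nested) overlap fractions of set A against set B.

    exact:           peptides of A that also occur verbatim in B
    exact_or_nested: additionally counts peptides of A that contain, or are
                     contained in, some peptide of B
    Both fractions use len(set(A)) as denominator (an assumption).
    """
    set_a, set_b = set(ligands_a), set(ligands_b)
    if not set_a:
        raise ValueError("ligands_a must be non-empty")
    exact = {p for p in set_a if p in set_b}
    nested = {
        p for p in set_a
        if p not in exact and any(p in q or q in p for q in set_b)
    }
    n = len(set_a)
    return len(exact) / n, (len(exact) + len(nested)) / n


# Toy peptides invented for illustration only.
allele_1 = ["ACDEFGHIKLMNPQR", "GGGSSSTTTVVV", "KLMNPQRSTVWY", "YYYWWWFFFHHH"]
allele_2 = ["GGGSSSTTTVVV", "LMNPQRSTV", "DDDEEENNNQQQ"]
exact_frac, nested_frac = overlap_fractions(allele_1, allele_2)
```

Here one of four peptides matches exactly and a second one contains a shorter peptide from the other set, so the nested definition doubles the apparent overlap.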

Supplementary Figure 7 Training of MARIA on HLA-DQ ligands and gluten peptide deamidation effects on HLA-DQ presentation.

(a) Overlap of HLA-DQ2.2 and HLA-DQ2.5 peptide ligands. Ligands from these two alleles overlap by 29% when counting identical peptide sequences only. (b) Training, validation, and testing of MARIA models for HLA-DQ2.2 presentation. To train the MARIA DQ2.2 model, 5845 peptides shared between HLA-DQ2.2 and HLA-DQ2.5 and 2529 peptides unique to HLA-DQ2.2 were used as positive examples; 8374 length-matched peptides were used as negative examples. Peptide sequences were assigned to training, validation, and test sets. No peptide in the validation or test set was a substring of a training peptide, and vice versa. (c) MARIA-predicted HLA-DQ2.2 presentation scores of five known celiac disease-related gluten peptides upon all possible Q->E or Q->K mutations. Based on MARIA-DQ ranks, deamidated forms (Q->E) of gluten peptides are presented better than unmodified forms or Q->K forms (* indicates P=3e-4, ** indicates P<1e-5, Mann-Whitney U test, n=15, 255, 7, 31, 63).
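Panel (c) scores every possible Q->E or Q->K variant of each peptide. The reported group sizes (15, 255, 7, 31, 63) are each of the form 2^k - 1, which would be consistent with enumerating every non-empty subset of glutamine positions for peptides with k = 4, 8, 3, 5 and 6 glutamines. That reading is an inference from the numbers, not something the caption states. A minimal Python sketch of such an enumeration, using a hypothetical function name and a toy peptide:

```python
from itertools import combinations


def substitution_variants(peptide, source="Q", target="E"):
    """List every variant in which a non-empty subset of `source` residues
    is replaced by `target` (2**k - 1 variants for k source residues)."""
    positions = [i for i, aa in enumerate(peptide) if aa == source]
    variants = []
    for subset_size in range(1, len(positions) + 1):
        for chosen in combinations(positions, subset_size):
            residues = list(peptide)
            for i in chosen:
                residues[i] = target
            variants.append("".join(residues))
    return variants


# Toy peptide with three glutamines, invented for illustration only.
toy_peptide = "PQLPYPQPQ"
qe_variants = substitution_variants(toy_peptide, "Q", "E")
qk_variants = substitution_variants(toy_peptide, "Q", "K")
```

Each variant would then be scored by the presentation model and the Q->E and Q->K score distributions compared against the unmodified peptide.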

Supplementary Figure 8 Validating MARIA performance for predicting patient IgH HLA-DR presentation and immune response.

(a) NetMHCIIpan-predicted HLA-DR presentation of lymphoma immunoglobulin correlated with experimentally identified HLA-DR immunoglobulin ligands. 18 MCL immunoglobulin sequences were analyzed by NetMHCIIpan (left, blue). The same 18 MCL samples were profiled with LC-MS/MS to determine the regions of immunoglobulin presented by HLA-DR. Predicted and observed presentation hot spots were significantly correlated on light chains (Spearman rho 0.48, P=3.8e-14, n=311). NetMHCIIpan predictions correlated only modestly with observed heavy chain presentation (Spearman rho 0.10, P=0.02, n=1015). NetMHCIIpan-predicted ligand numbers were normalized to the MS-identified maximum ligand numbers for visualization purposes. (b) Precision-recall curves of different models for identifying immunoglobulin (Ig) HLA-DR ligands. Curves depict the precision/PPV (y-axis) for MARIA (blue curves) versus NetMHCIIpan (green curves) across a range of recall/sensitivity thresholds (x-axis). At 20% recall, MARIA achieved 56% and 31% precision for predicting Ig heavy chain (left panel) and light chain (right panel) presentation, respectively. In comparison, NetMHCIIpan achieved 13% and 16% at the same recall. A prevalence of 1% was used for all calculations. (c) Gating strategy for identifying live CD4 T cells after peptide stimulation. The analysis in (d) was based on singlet lymphocyte populations with low 7-AAD and high CD4 levels. (d) Experimental validation of CD4 immunogenicity for candidate peptide neoantigens identified by MARIA. Peripheral blood mononuclear cells (PBMCs) were isolated from 3 MCL patients after immunization with autologous tumor vaccines. Patient PBMCs were re-stimulated with MARIA-identified IgH neoantigens (>99.5th percentile) or control peptides for 30 h. Antigen-specific T-cell activation was evaluated by cell surface CD137 induction. 
For 2 of 3 patients (MCL005 and MCL052), neoantigens induced specific CD4 T-cell activation with CD137+ levels comparable to positive controls (Pathogen Peptide Pool). These experiments were independently repeated in three patients with no technical replicates due to limited patient samples.
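The precision values in (b) depend on the stated 1% prevalence. Precision at a fixed prevalence follows from recall and the false positive rate by Bayes' rule, and the relation can be inverted to see what false positive rate a reported precision implies. The Python sketch below shows that arithmetic only. How the authors carried out the calculation is not described in this caption, and the function names are hypothetical.

```python
def precision_at_prevalence(recall, false_positive_rate, prevalence):
    """Precision (PPV) for a classifier operating point at a given prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = recall * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no predicted positives at this operating point")
    return true_pos / (true_pos + false_pos)


def implied_false_positive_rate(recall, precision, prevalence):
    """Invert the formula above: the FPR that yields `precision` at `recall`."""
    if not 0.0 < precision <= 1.0:
        raise ValueError("precision must lie in (0, 1]")
    return recall * prevalence * (1.0 - precision) / (precision * (1.0 - prevalence))


# Operating point quoted in the caption: 20% recall, 56% precision, 1% prevalence.
fpr = implied_false_positive_rate(recall=0.20, precision=0.56, prevalence=0.01)
check = precision_at_prevalence(recall=0.20, false_positive_rate=fpr, prevalence=0.01)
```

At low prevalence the implied false positive rate is small, since even a modest rate of false calls among the 99% non-presented peptides would swamp the true positives.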

Supplementary Figure 9 Comparison and structural analysis of MARIA and NetMHCIIpan for mutated CLIP peptides.

(a) Key amino acid residues on the CLIP peptide (PVSKMRMATPLLMQALP) interacting with the HLA-DRB1*01:01 complex. Based on published crystal structures (PDB ID 3PDO), seven amino acids in the natural ligand of HLA-DR (CLIP) form hydrogen bonds with either HLA-DRA1 or HLA-DRB1*01:01. (b) MARIA scores change consistently with the influence of CLIP amino acid mutations. Seven mutated CLIP HLA-DR complexes, each with a single amino acid substitution, were created in silico according to the key residues defined in (a). Mutated peptide atom positions in the HLA-DR environment were optimized with FlexPepDock. M107W, L113R, and M115R have higher MARIA percentiles than CLIP WT (91.60th percentile), which is consistent with their gain of hydrogen bonds or enhanced van der Waals interactions resulting from the mutations. R108 has a lower MARIA percentile than CLIP WT, which is consistent with its loss of one hydrogen bond. K106R and K106D have about the same MARIA scores as WT (91.42% and 90.48%) despite the opposite charges of these two mutants. Structural analysis showed that the amino acid side chain at position 106 does not contribute to hydrogen bond formation. Six out of seven mutants have about the same NetMHCIIpan percentiles as WT (99.85%-99.99%). NetMHCIIpan percentiles were calculated as 100% - NetMHCIIpan rank.

Supplementary Figure 10 Performance of MARIA for predicting melanoma antigen presentation and vaccine T-cell responses.

Performance of MARIA in predicting CD4 T-cell responses to personalized vaccines. Plots depict results for two melanoma clinical trials of personalized cancer vaccines (Sahin et al. 2017 (a, top) or Ott et al. 2017 (b, bottom)), where a range of MARIA score cut-offs (x-axis) is related to the positive predictive value (PPV), negative predictive value (NPV) and sensitivity (y-axis) for predicting post-vaccination CD4 T-cell responses. MARIA scores of 95% and 99.5% were used as cut-offs for the ‘medium’ and ‘high’ confidence categories depicted in Fig. 5. (c) Potential CD4 T-cell epitopes in the Ott et al. cohort based on MARIA scores. Numbers of neoantigens in melanoma above the MARIA-high cut-off. Each nonsynonymous mutation in 6 melanoma patients (Ott et al. 2017) was scored with MARIA using 15mer sliding windows. The best MARIA score of all potential 15mer windows was used to represent the neoantigen. ~7% of nonsynonymous mutations reached the 99.5% MARIA-high cut-off. Except for patient 1, all patients had at least 20 neoantigens in the MARIA-high category (MARIA percentile >99.5th). (d) Weak association of NetMHCIIpan scores and post-vaccination CD4 T-cell responses. Each vaccine peptide sequence in Ott et al. was scored with NetMHCIIpan and stratified into three categories based on the same cut-offs used for MARIA (Fig. 6d): low (<95th), medium (95-99.5th) and high (>99.5th). NetMHCIIpan score categories were weakly associated with CD4 T-cell responses, but the association did not reach statistical significance (chi-square test, P=0.3). Dashed red lines indicate average response rates of the whole cohort. (e) Precision-recall curves of MARIA and NetMHCIIpan for identifying melanoma HLA-II ligands. Curves depict the precision (y-axis) of each of three methods (full MARIA model, NetMHCIIpan 3.1, and a ‘random’ MARIA model trained on shuffled data) across a range of recall/sensitivity thresholds (x-axis). 
At 20% recall, MARIA achieved 38% precision (PPV), assuming a 1% prevalence of true antigen presentation.
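The scoring procedure in (c) and the stratification in (d) can be sketched in Python as follows: enumerate every 15mer window that covers the mutated residue, keep the best score, and map the resulting percentile to the low/medium/high categories. The scoring function here is a placeholder with no biological meaning and is not MARIA. All function names and the toy sequence are hypothetical, and treating the boundary values as 'medium' is an assumption.

```python
def windows_covering(sequence, mutation_index, width=15):
    """All `width`-mers of `sequence` that contain the 0-based `mutation_index`."""
    if not 0 <= mutation_index < len(sequence):
        raise ValueError("mutation_index is outside the sequence")
    first_start = max(0, mutation_index - width + 1)
    last_start = min(mutation_index, len(sequence) - width)
    return [sequence[s:s + width] for s in range(first_start, last_start + 1)]


def best_window_score(sequence, mutation_index, score_fn, width=15):
    """Represent a mutation by the best score over all covering windows."""
    windows = windows_covering(sequence, mutation_index, width)
    if not windows:
        raise ValueError("sequence is shorter than the window width")
    return max(score_fn(w) for w in windows)


def confidence_category(percentile):
    """Map a score percentile to the categories used in the caption."""
    if percentile > 99.5:
        return "high"
    if percentile >= 95.0:
        return "medium"
    return "low"


def toy_score(window):
    # Placeholder scorer (fraction of tryptophan); stands in for a real model.
    return window.count("W") / len(window)


# Toy 29-residue sequence with the mutated residue at index 14 (illustration only).
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDEFGHIK"
windows = windows_covering(toy_sequence, 14)
best = best_window_score(toy_sequence, 14, toy_score)
```

With 14 flanking residues on each side, the mutated position can occupy any of the 15 positions in a window, which gives 15 candidate windows per mutation.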

Supplementary Figure 11 Abundance of HLA-II gene expression in various tumors.

Bulk RNA-Seq values of (a) HLA-DRA, (b) HLA-DQA1, and (c) HLA-DPA1 of 5077 TCGA tumor samples and 6 MCL tumor samples plotted as box plots. All tumor types have a median TPM above 100 for HLA-DRA and HLA-DPA1. Most tumors have lower HLA-II gene expression than diffuse large B-cell lymphoma (DLBC, ~1000 median TPM). Few tumor samples have less than 10 TPM (grey dashed line) for HLA-DRA and HLA-DPA1. Top and bottom lines indicate the 95% confidence interval, and the box indicates the first and third quartiles. MCL: mantle cell lymphoma, n=8; PRAD: prostate adenocarcinoma, n=558; HNSC: head and neck squamous cell carcinoma, n=566; OV: ovarian serous cystadenocarcinoma, n=430; SKCM: skin cutaneous melanoma, n=473; SARC: sarcoma, n=265; BRCA: breast invasive carcinoma, n=1256; MESO: mesothelioma, n=87; GBM: glioblastoma multiforme, n=175; KIRC: kidney renal clear cell carcinoma, n=618; LUAD: lung adenocarcinoma, n=601; DLBC: diffuse large B-cell lymphoma, n=48.

Supplementary information

Supplementary Figs. 1–11 and Supplementary Notes 1–6.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–15.

Cite this article

Chen, B., Khodadoust, M.S., Olsson, N. et al. Predicting HLA class II antigen presentation through integrated deep learning. Nat Biotechnol 37, 1332–1343 (2019) doi:10.1038/s41587-019-0280-2
