Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning

Abstract

Computational modelling of the interactions between T-cell receptors (TCRs) and epitopes is of great importance for immunotherapy and antigen discovery. However, current TCR–epitope interaction prediction tools are still at a relatively primitive stage and have limited capacity to decipher the underlying binding mechanisms, for example, to characterize the pairwise residue interactions between TCRs and epitopes. Here we designed a new deep-learning-based framework for modelling TCR–epitope interactions, called TCR–Epitope Interaction Modelling at Residue Level (TEIM-Res), which takes the sequences of TCRs and epitopes as input and predicts both the pairwise residue distances and the contact sites involved in the interactions. To tackle the current bottleneck of data deficiency, we applied a few-shot learning strategy that incorporates sequence-level binding information into residue-level interaction prediction. Validation experiments and analyses indicated that the model predicts well and that its design is effective. We demonstrated three potential applications: revealing the subtle conformation changes of mutant TCR–epitope pairs, uncovering the key contacts based on epitope-specific TCR pools, and mining the intrinsic binding rules and patterns. In summary, our model can serve as a useful tool for comprehensively characterizing TCR–epitope interactions and understanding the molecular basis of binding mechanisms.
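
To make the two residue-level outputs named in the abstract concrete, the following minimal Python sketch derives a binary contact map from a pairwise residue distance matrix by thresholding. It is an illustration under stated assumptions, not the TEIM-Res implementation: the function name `contact_map`, the toy distances and the 5.0 Å cutoff are all hypothetical and are not taken from this page.

```python
from typing import List

# Hypothetical cutoff in angstroms, chosen only for illustration;
# the cutoff used by the authors is not stated in this section.
CONTACT_CUTOFF = 5.0


def contact_map(distances: List[List[float]],
                cutoff: float = CONTACT_CUTOFF) -> List[List[int]]:
    """Mark residue pair (i, j) as a contact (1) when its distance is below the cutoff."""
    return [[1 if d < cutoff else 0 for d in row] for row in distances]


# Toy distance matrix with made-up values:
# rows index TCR residues, columns index epitope residues.
toy_distances = [
    [12.3, 8.1, 4.2],
    [9.7, 3.9, 6.5],
]
toy_contacts = contact_map(toy_distances)
```

In this toy case only the pairs at 4.2 Å and 3.9 Å fall below the assumed cutoff, so each row of `toy_contacts` contains a single contact.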

Fig. 1: Model architectures and performance evaluation.
Fig. 2: Detailed analyses of model performance.
Fig. 3: The performance of TEIM-Res on the mutation analysis on the interactions between A6 TCR and Tax epitope.
Fig. 4: The analyses of three epitope-specific TCR pools.
Fig. 5: TEIM-Res can be used to discover the residue-level binding patterns of TCR–epitope interactions.
Fig. 6: Validation and application of TEIM-Seq in sequence-level binding prediction.

Data availability

We provide the processed data for our model training and evaluation in the GitHub repository at https://github.com/pengxingang/TEIM. The raw data were all downloaded from public websites. The sequence-level binding datasets were downloaded from VDJdb (https://vdjdb.cdr3.net/search), McPAS-TCR (complete database at http://friedmanlab.weizmann.ac.il/McPAS-TCR/) and ImmuneCODE (https://clients.adaptivebiotech.com/pub/covid-2020). The structures of TCR–epitope complexes were downloaded from STCRDab (https://opig.stats.ox.ac.uk/webapps/stcrdab/Browser?all=true#downloads). The epitope sequence dataset was retrieved from https://www.iedb.org/ by setting three filters (‘Epitope Structure: Linear Sequence’, ‘No B cell assays’ and ‘MHC Restriction Type I’) and pressing ‘Export Results’ for epitopes. The processed data for training the models (contact maps, sequence-level pairs and all epitope sequences) are available at https://github.com/pengxingang/TEIM/tree/main/data. The affinity changes and sequences of the A6-Tax mutants were retrieved from http://atlas.wenglab.org/web/search.php by searching for the TCR name A6 and were also validated against their original papers (Supplementary Table 1). The TCR repertoire data for our analyses were retrieved from Supplementary Table 1 of https://www.nature.com/articles/nature22976. The crystal structures with the mentioned PDB IDs (5SWS, 5SWZ, 6UZI, 1AO7, 2VLJ, 3O4L and 3GSN) were downloaded from the STCRDab dataset (https://opig.stats.ox.ac.uk/webapps/stcrdab/Browser?all=true#dbsearch).

Code availability

The source code and model weights of TEIM-Res and TEIM-Seq are available on GitHub (https://github.com/pengxingang/TEIM) and Zenodo (https://zenodo.org/record/7604787; ref. 62).

References

  1. Peters, B., Nielsen, M. & Sette, A. T cell epitope predictions. Annu. Rev. Immunol. 38, 123–145 (2020).

  2. He, Q., Jiang, X., Zhou, X. & Weng, J. Targeting cancers through TCR-peptide/MHC interactions. J. Hematol. Oncol. 12, 1–17 (2019).

  3. Huppa, J. B. et al. TCR–peptide–MHC interactions in situ show accelerated kinetics and increased affinity. Nature 463, 963–967 (2010).

  4. Yamamoto, T., Kishton, R. & Restifo, N. Developing neoantigen-targeted T cell–based treatments for solid tumors. Nat. Med. 25, 1488–1499 (2019).

  5. Candia, M., Kratzer, B. & Pickl, W. F. On peptides and altered peptide ligands: from origin, mode of action and design to clinical application (immunotherapy). Int. Arch. Allergy Immunol. 170, 211–233 (2016).

  6. Joglekar, A. & Li, G. T cell antigen discovery. Nat. Methods 18, 873–880 (2021).

  7. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).

  8. Lu, T. et al. Deep learning-based prediction of the T cell receptor–antigen binding specificity. Nat. Mach. Intell. 3, 864–875 (2021).

  9. Gielis, S. et al. Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires. Front. Immunol. 10, 2820 (2019).

  10. Jurtz, V. A. et al. NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks. Preprint at bioRxiv https://doi.org/10.1101/433706 (2018).

  11. Springer, I., Besser, H., Tickotsky-Moskovitz, N., Dvorkin, S. & Louzoun, Y. Prediction of specific TCR-peptide binding from large dictionaries of TCR-peptide pairs. Front. Immunol. 11, 1803 (2020).

  12. Moris, P. et al. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief. Bioinform. 12, bbaa318 (2020).

  13. Kjærgaard, J. K. et al. TCRpMHCmodels: Structural modelling of TCR-pMHC class I complexes. Sci. Rep. 9, 14530 (2019).

  14. Lanzarotti, E., Marcatili, P. & Nielsen, M. Identification of the cognate peptide-MHC target of T cell receptors using molecular modeling and force field scoring. Mol. Immunol. 94, 91–97 (2018).

  15. Jumper, J. & Hassabis, D. Protein structure predictions to atomic accuracy with AlphaFold. Nat. Methods 19, 11–12 (2022).

  16. Yin, R., Feng, B. Y., Varshney, A. & Pierce, B. G. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci. 31, e4379 (2022).

  17. Lin, X. et al. Rapid assessment of T-cell receptor specificity of the immune repertoire. Nat. Comput. Sci. 1, 362–373 (2021).

  18. Lee, H., Heo, L., Lee, M. S. & Seok, C. GalaxyPepDock: a protein-peptide docking tool based on interaction similarity and energy optimization. Nucleic Acids Res. 43, W431–W435 (2015).

  19. Ciemny, M. et al. Protein–peptide docking: opportunities and challenges. Drug Discov. Today 23, 1530–1537 (2018).

  20. Antunes, D. A. et al. DINC 2.0: a new protein–peptide docking webserver using an incremental approach. Cancer Res. 77, e55–e57 (2017).

  21. Blaszczyk, M., Ciemny, M. P., Kolinski, A., Kurcinski, M. & Kmiecik, S. Protein–peptide docking using CABS-dock and contact information. Brief. Bioinform. 20, 2299–2305 (2019).

  22. Abdin, O., Nim, S., Wen, H. & Kim, P. M. PepNN: a deep attention model for the identification of peptide binding sites. Commun. Biol. 5, 503 (2022).

  23. Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J. & Koes, D. R. Protein–ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57, 942–957 (2017).

  24. Yan, C. & Zou, X. Predicting peptide binding sites on protein surfaces by clustering chemical interactions. J. Comput. Chem. 36, 49–61 (2015).

  25. Zhao, Z., Peng, Z. & Yang, J. Improving sequence-based prediction of protein–peptide binding residues by introducing intrinsic disorder and a consensus method. J. Chem. Inf. Model. 58, 1459–1468 (2018).

  26. Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53, 1–34 (2020).

  27. Donahue, J. et al. DeCAF: a deep convolutional activation feature for generic visual recognition. In Int. Conf. Machine Learning 647–655 (PMLR, 2014).

  28. Gras, S. et al. Reversed T cell receptor docking on a major histocompatibility class I complex limits involvement in the immune response. Immunity 45, 749–760 (2016).

  29. Adasme, M. F. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 5, gkab294 (2021).

  30. Garboczi, D. N. et al. Structure of the complex between human T-cell receptor, viral peptide and HLA-A2. Nature 384, 134–141 (1996).

  31. Borrman, T. et al. ATLAS: a database linking binding affinities with structures for wild-type and mutant TCR-pMHC complexes. Proteins 85, 908–916 (2017).

  32. Scott, D. R., Borbulevych, O. Y., Piepenbrink, K. H., Corcelli, S. A. & Baker, B. M. Disparate degrees of hypervariable loop flexibility control T-cell receptor cross-reactivity, specificity, and binding mechanism. J. Mol. Biol. 414, 385–400 (2011).

  33. Borbulevych, O. Y. et al. T cell receptor cross-reactivity directed by antigen-dependent tuning of peptide-MHC molecular flexibility. Immunity 31, 885–896 (2009).

  34. Haidar, J. N. et al. Structure-based design of a T-cell receptor leads to nearly 100-fold improvement in binding affinity for pepMHC. Proteins 74, 948–960 (2009).

  35. Li, Y. et al. Directed evolution of human T-cell receptors with picomolar affinities by phage display. Nat. Biotechnol. 23, 349–354 (2005).

  36. Pierce, B. G., Haidar, J. N., Yu, Y. & Weng, Z. Combinations of affinity-enhancing mutations in a T cell receptor reveal highly nonadditive effects within and between complementarity determining regions and chains. Biochemistry 49, 7050–7059 (2010).

  37. Borg, N. A. et al. The CDR3 regions of an immunodominant T cell receptor dictate the ‘energetic landscape’ of peptide-MHC recognition. Nat. Immunol. 6, 171–180 (2005).

  38. Cole, D. K. Increased peptide contacts govern high affinity binding of a modified TCR whilst maintaining a native pMHC docking mode. Front. Immunol. 4, 168 (2013).

  39. Piepenbrink, K. H., Blevins, S. J., Scott, D. R. & Baker, B. M. The basis for limited specificity and MHC restriction in a T cell receptor interface. Nat. Commun. 4, 1948 (2013).

  40. Ding, Y.-H., Baker, B. M., Garboczi, D. N., Biddison, W. E. & Wiley, D. C. Four A6-TCR/peptide/HLA-A2 structures that generate very different T cell signals are nearly identical. Immunity 11, 45–56 (1999).

  41. Shang, X. et al. Rational optimization of tumor epitopes using in silico analysis-assisted substitution of TCR contact residues: molecular immunology. Eur. J. Immunol. 39, 2248–2258 (2009).

  42. Ochi, T. et al. Optimization of T-cell reactivity by exploiting TCR chain centricity for the purpose of safe and effective antitumor TCR gene therapy. Cancer Immunol. Res. 3, 1070–1081 (2015).

  43. Bassan, D. et al. Avidity optimization of a MAGE-A1-specific TCR with somatic hypermutation. Eur. J. Immunol. 51, 1505–1518 (2021).

  44. Gutierrez, L., Beckford, J. & Alachkar, H. Deciphering the TCR repertoire to solve the COVID-19 mystery. Trends Pharmacol. Sci. 41, 518–530 (2020).

  45. Leem, J., de Oliveira, S. H. P., Krawczyk, K. & Deane, C. M. STCRDab: the Structural T-cell Receptor Database. Nucleic Acids Res. 46, D406–D412 (2017).

  46. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).

  47. Bagaev, D. V. et al. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 48, D1057–D1062 (2020).

  48. Sewell, A. K. Why must T cells be cross-reactive? Nat. Rev. Immunol. 12, 669–677 (2012).

  49. Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).

  50. Klinger, M. et al. Multiplex identification of antigen-specific T cell receptors using a combination of immune assays and immune receptor sequencing. PLoS ONE 10, e0141561 (2015).

  51. Sidhom, J.-W. & Baras, A. S. Analysis of SARS-CoV-2 specific T-cell receptors in ImmuneCODE reveals cross-reactivity to immunodominant influenza M1 epitope. Preprint at bioRxiv https://doi.org/10.1101/2020.06.20.160499 (2020).

  52. Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2018).

  53. Lefranc, M.-P. et al. IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains. Dev. Comp. Immunol. 29, 185–203 (2005).

  54. Dunbar, J. & Deane, C. M. ANARCI: antigen receptor numbering and receptor classification. Bioinformatics 32, 298–300 (2015).

  55. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Proc. of the 3rd International Conference on Learning Representations, ICLR 2015 (eds Bengio, Y. & LeCun, Y.) (2015).

  56. Li, S. et al. MONN: a multi-objective neural network for predicting compound-protein interactions and affinities. Cell Syst. 10, 308–322.e11 (2020).

  57. Lei, Y. et al. A deep-learning framework for multi-level peptide–protein interaction prediction. Nat. Commun. 12, 5465 (2021).

  58. Zhao, M., Lee, W.-P., Garrison, E. P. & Marth, G. T. SSW Library: an SIMD Smith–Waterman C/C++ library for use in genomic applications. PLoS ONE 8, e82138 (2013).

  59. Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vision 128, 336–359 (2019).

  60. Qin, Z., Yu, F., Liu, C. & Chen, X. How convolutional neural networks see the world—a survey of convolutional neural network visualization methods. Math. Found. Comput. 1, 149–180 (2018).

  61. Tareen, A. & Kinney, J. B. Logomaker: beautiful sequence logos in Python. Bioinformatics 36, 2272–2274 (2019).

  62. Xingang, P. pengxingang/TEIM: TEIM. Zenodo https://zenodo.org/record/7604787 (2023).

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (T2125007 and 61872216 to J.Z.; 31900862 to D.Z.), the National Key Research and Development Program of China (2021YFF1201300), the Turing AI Institute of Nanjing, and the Tsinghua-Toyota Joint Research Fund.

Author information

Contributions

X.P., Y.L., D.Z. and J.Z. conceived the concept. X.P. and Y.L. implemented the model and performed computational experiments. Y.L. and P.F. prepared and processed all data. X.P., Y.L., L.J., J.M., D.Z. and J.Z. analysed the results. X.P., Y.L., D.Z. and J.Z. wrote the paper with help from all the authors.

Corresponding authors

Correspondence to Dan Zhao or Jianyang Zeng.

Ethics declarations

Competing interests

J.Z. is a founder of Silexon AI Technology and has an equity interest. All other authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Geir Kjetil Sandve, Pieter Meysman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Extended data

Extended Data Fig. 1 Comparison among GalaxyPepDock, TEIM-Res and the average baseline.

a, Comparison between GalaxyPepDock and TEIM-Res in terms of the mean squared errors and the mean relative errors. b, Comparison between GalaxyPepDock and the average baseline in terms of the correlation coefficients, the mean squared errors, the mean relative errors, and MCCs.

Extended Data Fig. 2 The distributions of the individual evaluation metrics per sample under the new-epitope splitting setting.

The five subfigures show the distributions of the correlation coefficient, mean squared error, mean relative error, AUPR, and MCC, respectively.
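
The per-sample metrics named in this caption can be sketched in plain Python as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, the correlation coefficient is assumed to be Pearson's, the mean relative error is assumed to be |predicted − true| / true averaged over residue pairs, and AUPR is omitted for brevity.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_squared_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Average squared difference between predicted and true distances."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def mean_relative_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Assumed definition: mean of |pred - true| / true (true distances must be > 0)."""
    return sum(abs(p - t) / t for p, t in zip(pred, true)) / len(true)


def mcc(pred_labels: Sequence[int], true_labels: Sequence[int]) -> float:
    """Matthews correlation coefficient for binary contact labels (0 or 1)."""
    pairs = list(zip(pred_labels, true_labels))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For a per-sample evaluation, the distance matrix and contact map of one TCR–epitope pair would be flattened into sequences before being passed to these functions.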

Extended Data Fig. 3 The true and predicted distances/contacts of the wild-type sample A6-Tax.

a, The true and predicted pairwise distances of the A6-Tax sample. b, The true contact and predicted contact scores of the A6-Tax sample. c, The distance errors of the A6-Tax sample. The errors are defined as the predicted distances minus the corresponding true distances.
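
The signed error defined in this caption (predicted distance minus true distance) can be illustrated with a short Python sketch. The function name `distance_errors` and the toy matrices are hypothetical and carry no values from the paper; a positive entry means the distance was overestimated, a negative entry that it was underestimated.

```python
from typing import List


def distance_errors(pred: List[List[float]],
                    true: List[List[float]]) -> List[List[float]]:
    """Element-wise signed error: predicted distance minus true distance."""
    return [[p - t for p, t in zip(pred_row, true_row)]
            for pred_row, true_row in zip(pred, true)]


# Toy 2 x 2 matrices with made-up distances in angstroms.
toy_pred = [[5.0, 7.5], [10.0, 4.0]]
toy_true = [[4.0, 8.0], [10.0, 6.0]]
toy_errors = distance_errors(toy_pred, toy_true)
```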

Extended Data Fig. 4 The crystal structures of the three epitopes interacting with the CDR3βs.

a, Different views of the epitope GILGFVFTL interacting with the CDR3β (PDB ID: 2VLJ). b, Different views of the epitope GLCTLVAML interacting with the CDR3β (PDB ID: 3O4L). c, Different views of the epitope NLVPMVATV interacting with the CDR3β (PDB ID: 3GSN). The CDR3βs are shown in cyan and the MHCs are shown in grey. The epitopes are shown in red, with darker shades marking the important binding residues.

Supplementary information

Supplementary Information

Supplementary Figs. 1–22, Tables 1–4 and text.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, X., Lei, Y., Feng, P. et al. Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning. Nat Mach Intell 5, 395–407 (2023). https://doi.org/10.1038/s42256-023-00634-4

  • DOI: https://doi.org/10.1038/s42256-023-00634-4
