Abstract
Computational modelling of the interactions between T-cell receptors (TCRs) and epitopes is of great importance for immunotherapy and antigen discovery. However, current TCR–epitope interaction prediction tools are still in a relatively primitive stage and have limited capacity in deciphering the underlying binding mechanisms, for example, characterizing the pairwise residue interactions between TCRs and epitopes. Here we designed a new deep-learning-based framework for modelling TCR–epitope interactions, called TCR–Epitope Interaction Modelling at Residue Level (TEIM-Res), which took the sequences of TCRs and epitopes as input and predicted both pairwise residue distances and contact sites involved in the interactions. To tackle the current bottleneck of data deficiency, we applied a few-shot learning strategy by incorporating sequence-level binding information into residue-level interaction prediction. The validation experiments and analyses indicated its good prediction performance and the effectiveness of its design. We demonstrated three potential applications: revealing the subtle conformation changes of mutant TCR–epitope pairs, uncovering the key contacts based on epitope-specific TCR pools, and mining the intrinsic binding rules and patterns. In summary, our model can serve as a useful tool for comprehensively characterizing TCR–epitope interactions and understanding the molecular basis of binding mechanisms.
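As a minimal illustration of the two residue-level outputs named above (pairwise residue distances and contact sites), the following Python sketch derives a binary contact map from a distance matrix. The function name `contact_map`, the 5 Å cutoff and the toy values are assumptions made for illustration only; they are not taken from the TEIM code base.

```python
import numpy as np


def contact_map(distances: np.ndarray, cutoff: float = 5.0) -> np.ndarray:
    """Binarize a (TCR residues x epitope residues) distance matrix.

    A residue pair is labelled as a contact (1) when its distance is at or
    below `cutoff` (in angstroms); the default cutoff is an assumed value.
    """
    return (distances <= cutoff).astype(int)


# Toy 3 x 2 distance matrix in angstroms (made-up values, not real data).
toy_distances = np.array([[3.2, 7.5],
                          [4.9, 5.1],
                          [9.0, 2.8]])
toy_contacts = contact_map(toy_distances)
print(toy_contacts)
```

Treating contacts as thresholded distances keeps the two outputs consistent with each other, although a model may also predict contact scores directly rather than deriving them from distances.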
Data availability
We provide the processed data for model training and evaluation in the GitHub repository at https://github.com/pengxingang/TEIM. The raw data were all downloaded from public websites. The sequence-level binding datasets were downloaded from VDJdb (https://vdjdb.cdr3.net/search), McPAS-TCR (complete database at http://friedmanlab.weizmann.ac.il/McPAS-TCR/) and ImmuneCODE (https://clients.adaptivebiotech.com/pub/covid-2020). The structures of TCR–epitope complexes were downloaded from STCRDab (https://opig.stats.ox.ac.uk/webapps/stcrdab/Browser?all=true#downloads). The epitope sequence dataset was retrieved from https://www.iedb.org/ by setting three filters (‘Epitope Structure: Linear Sequence’, ‘No B cell assays’ and ‘MHC Restriction Type I’) and pressing ‘Export Results’ for epitopes. The processed data for training the models (contact maps, sequence-level pairs and all epitope sequences) are available at https://github.com/pengxingang/TEIM/tree/main/data. The affinity changes and sequences of the A6-Tax mutants were retrieved from http://atlas.wenglab.org/web/search.php by searching for the TCR name A6 and were also validated against their original papers (Supplementary Table 1). The TCR repertoire data for our analyses were retrieved from Supplementary Table 1 of https://www.nature.com/articles/nature22976. The crystal structures with the mentioned PDB IDs (5SWS, 5SWZ, 6UZI, 1AO7, 2VLJ, 3O4L and 3GSN) were downloaded from STCRDab (https://opig.stats.ox.ac.uk/webapps/stcrdab/Browser?all=true#dbsearch).
Code availability
The source code and model weights of TEIM-Res and TEIM-Seq are available on GitHub (https://github.com/pengxingang/TEIM) and Zenodo (https://zenodo.org/record/7604787; ref. 62).
References
Peters, B., Nielsen, M. & Sette, A. T cell epitope predictions. Annu. Rev. Immunol. 38, 123–145 (2020).
He, Q., Jiang, X., Zhou, X. & Weng, J. Targeting cancers through TCR-peptide/MHC interactions. J. Hematol. Oncol. 12, 1–17 (2019).
Huppa, J. B. et al. TCR–peptide–MHC interactions in situ show accelerated kinetics and increased affinity. Nature 463, 963–967 (2010).
Yamamoto, T., Kishton, R. & Restifo, N. Developing neoantigen-targeted T cell–based treatments for solid tumors. Nat. Med. 25, 1488–1499 (2019).
Candia, M., Kratzer, B. & Pickl, W. F. On peptides and altered peptide ligands: from origin, mode of action and design to clinical application (immunotherapy). Int. Arch. Allergy Immunol. 170, 211–233 (2016).
Joglekar, A. & Li, G. T cell antigen discovery. Nat. Methods 18, 873–880 (2021).
Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
Lu, T. et al. Deep learning-based prediction of the T cell receptor–antigen binding specificity. Nat. Mach. Intell. 3, 864–875 (2021).
Gielis, S. et al. Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires. Front. Immunol. 10, 2820 (2019).
Jurtz, V. A. et al. NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks. Preprint at bioRxiv https://doi.org/10.1101/433706 (2018).
Springer, I., Besser, H., Tickotsky-Moskovitz, N., Dvorkin, S. & Louzoun, Y. Prediction of specific TCR-peptide binding from large dictionaries of TCR-peptide pairs. Front. Immunol. 11, 1803 (2020).
Moris, P. et al. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief. Bioinform. 12, bbaa318 (2020).
Kjærgaard, J. K. et al. TCRpMHCmodels: Structural modelling of TCR-pMHC class I complexes. Sci. Rep. 9, 14530 (2019).
Lanzarotti, E., Marcatili, P. & Nielsen, M. Identification of the cognate peptide-MHC target of T cell receptors using molecular modeling and force field scoring. Mol. Immunol. 94, 91–97 (2018).
Jumper, J. & Hassabis, D. Protein structure predictions to atomic accuracy with AlphaFold. Nat. Methods 19, 11–12 (2022).
Yin, R., Feng, B. Y., Varshney, A. & Pierce, B. G. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci. 31, e4379 (2022).
Lin, X. et al. Rapid assessment of T-cell receptor specificity of the immune repertoire. Nat. Comput. Sci. 1, 362–373 (2021).
Lee, H., Heo, L., Lee, M. & Seok, C. GalaxyPepDock: a protein-peptide docking tool based on interaction similarity and energy optimization. Nucleic Acids Res. 43, W431–W435 (2015).
Ciemny, M. et al. Protein–peptide docking: opportunities and challenges. Drug Discov. Today 23, 1530–1537 (2018).
Antunes, D. A. et al. DINC 2.0: a new protein–peptide docking webserver using an incremental approach. Cancer Res. 77, e55–e57 (2017).
Blaszczyk, M., Ciemny, M. P., Kolinski, A., Kurcinski, M. & Kmiecik, S. Protein–peptide docking using CABS-dock and contact information. Brief. Bioinform. 20, 2299–2305 (2019).
Abdin, O., Nim, S., Wen, H. & Kim, P. M. PepNN: a deep attention model for the identification of peptide binding sites. Commun. Biol. 5, 503 (2022).
Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J. & Koes, D. R. Protein–ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57, 942–957 (2017).
Yan, C. & Zou, X. Predicting peptide binding sites on protein surfaces by clustering chemical interactions. J. Comput. Chem. 36, 49–61 (2015).
Zhao, Z., Peng, Z. & Yang, J. Improving sequence-based prediction of protein–peptide binding residues by introducing intrinsic disorder and a consensus method. J. Chem. Inf. Model. 58, 1459–1468 (2018).
Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53, 1–34 (2020).
Donahue, J. et al. DeCAF: a deep convolutional activation feature for generic visual recognition. In Int. Conf. Machine Learning 647–655 (PMLR, 2014).
Gras, S. et al. Reversed T cell receptor docking on a major histocompatibility class I complex limits involvement in the immune response. Immunity 45, 749–760 (2016).
Adasme, M. F. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 5, gkab294 (2021).
Garboczi, D. N. et al. Structure of the complex between human T-cell receptor, viral peptide and HLA-A2. Nature 384, 134–141 (1996).
Borrman, T. et al. ATLAS: a database linking binding affinities with structures for wild-type and mutant TCR-pMHC complexes. Proteins 85, 908–916 (2017).
Scott, D. R., Borbulevych, O. Y., Piepenbrink, K. H., Corcelli, S. A. & Baker, B. M. Disparate degrees of hypervariable loop flexibility control T-cell receptor cross-reactivity, specificity, and binding mechanism. J. Mol. Biol. 414, 385–400 (2011).
Borbulevych, O. Y. et al. T cell receptor cross-reactivity directed by antigen-dependent tuning of peptide-MHC molecular flexibility. Immunity 31, 885–896 (2009).
Haidar, J. N. et al. Structure-based design of a T-cell receptor leads to nearly 100-fold improvement in binding affinity for pepMHC. Proteins 74, 948–960 (2009).
Li, Y. et al. Directed evolution of human T-cell receptors with picomolar affinities by phage display. Nat. Biotechnol. 23, 349–354 (2005).
Pierce, B. G., Haidar, J. N., Yu, Y. & Weng, Z. Combinations of affinity-enhancing mutations in a T cell receptor reveal highly nonadditive effects within and between complementarity determining regions and chains. Biochemistry 49, 7050–7059 (2010).
Borg, N. A. et al. The CDR3 regions of an immunodominant T cell receptor dictate the ‘energetic landscape’ of peptide-MHC recognition. Nat. Immunol. 6, 171–180 (2005).
Cole, D. K. Increased peptide contacts govern high affinity binding of a modified TCR whilst maintaining a native pMHC docking mode. Front. Immunol. 4, 168 (2013).
Piepenbrink, K. H., Blevins, S. J., Scott, D. R. & Baker, B. M. The basis for limited specificity and MHC restriction in a T cell receptor interface. Nat. Commun. 4, 1948 (2013).
Ding, Y.-H., Baker, B. M., Garboczi, D. N., Biddison, W. E. & Wiley, D. C. Four A6-TCR/peptide/HLA-A2 structures that generate very different T cell signals are nearly identical. Immunity 11, 45–56 (1999).
Shang, X. et al. Rational optimization of tumor epitopes using in silico analysis-assisted substitution of TCR contact residues: molecular immunology. Eur. J. Immunol. 39, 2248–2258 (2009).
Ochi, T. et al. Optimization of T-cell reactivity by exploiting TCR chain centricity for the purpose of safe and effective antitumor TCR gene therapy. Cancer Immunol. Res. 3, 1070–1081 (2015).
Bassan, D. et al. Avidity optimization of a MAGE-A1-specific TCR with somatic hypermutation. Eur. J. Immunol. 51, 1505–1518 (2021).
Gutierrez, L., Beckford, J. & Alachkar, H. Deciphering the TCR repertoire to solve the COVID-19 mystery. Trends Pharmacol. Sci. 41, 518–530 (2020).
Leem, J., de Oliveira, S. H. P., Krawczyk, K. & Deane, C. M. STCRDab: the Structural T-cell Receptor Database. Nucleic Acids Res. 46, D406–D412 (2017).
Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).
Bagaev, D. V. et al. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 48, D1057–D1062 (2020).
Sewell, A. K. Why must T cells be cross-reactive? Nat. Rev. Immunol. 12, 669–677 (2012).
Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).
Klinger, M. et al. Multiplex identification of antigen-specific T cell receptors using a combination of immune assays and immune receptor sequencing. PLoS ONE 10, e0141561 (2015).
Sidhom, J.-W. & Baras, A. S. Analysis of SARS-CoV-2 specific T-cell receptors in ImmuneCODE reveals cross-reactivity to immunodominant influenza M1 epitope. Preprint at bioRxiv https://doi.org/10.1101/2020.06.20.160499 (2020).
Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2018).
Lefranc, M.-P. et al. IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains. Dev. Comp. Immunol. 29, 185–203 (2005).
Dunbar, J. & Deane, C. M. ANARCI: antigen receptor numbering and receptor classification. Bioinformatics 32, 298–300 (2015).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations, ICLR 2015 (eds Bengio, Y. & LeCun, Y.) (2015).
Li, S. et al. MONN: a multi-objective neural network for predicting compound-protein interactions and affinities. Cell Syst. 10, 308–322.e11 (2020).
Lei, Y. et al. A deep-learning framework for multi-level peptide–protein interaction prediction. Nat. Commun. 12, 5465 (2021).
Zhao, M., Lee, W.-P., Garrison, E. P. & Marth, G. T. SSW Library: an SIMD Smith–Waterman C/C++ library for use in genomic applications. PLoS ONE 8, e82138 (2013).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vision 128, 336–359 (2019).
Qin, Z., Yu, F., Liu, C. & Chen, X. How convolutional neural networks see the world—a survey of convolutional neural network visualization methods. Math. Found. Comput. 1, 149–180 (2018).
Tareen, A. & Kinney, J. B. Logomaker: beautiful sequence logos in Python. Bioinformatics 36, 2272–2274 (2019).
Xingang, P. pengxingang/TEIM: TEIM. Zenodo https://zenodo.org/record/7604787 (2023).
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (T2125007 and 61872216 to J.Z.; 31900862 to D.Z.), the National Key Research and Development Program of China (2021YFF1201300), the Turing AI Institute of Nanjing, and the Tsinghua-Toyota Joint Research Fund.
Author information
Contributions
X.P., Y.L., D.Z. and J.Z. conceived the concept. X.P. and Y.L. implemented the model and performed computational experiments. Y.L. and P.F. prepared and processed all data. X.P., Y.L., L.J., J.M., D.Z. and J.Z. analysed the results. X.P., Y.L., D.Z. and J.Z. wrote the paper with help from all the authors.
Ethics declarations
Competing interests
J.Z. is a founder of Silexon AI Technology and has an equity interest. All other authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Geir Kjetil Sandve, Pieter Meysman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Comparison among GalaxyPepDock, TEIM-Res and the average baseline.
a, Comparison between GalaxyPepDock and TEIM-Res in terms of the mean squared errors and the mean relative errors. b, Comparison between GalaxyPepDock and the average baseline in terms of the correlation coefficients, the mean squared errors, the mean relative errors, and MCCs.
Extended Data Fig. 2 The distributions of the individual evaluation metrics per sample under the new-epitope splitting setting.
The five subfigures show the distributions of the correlation coefficient, mean squared error, mean relative error, AUPR, and MCC, respectively.
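The per-sample metrics named in these captions can be sketched in Python as follows. This is an illustrative sketch rather than the authors' evaluation code: the function names, the use of Pearson correlation, and the definition of relative error as |predicted − true| / true are assumptions, and AUPR is omitted to keep the example dependent on NumPy only.

```python
import numpy as np


def distance_metrics(true_d, pred_d):
    """Correlation, mean squared error and mean relative error for one sample."""
    true_d = np.asarray(true_d, dtype=float).ravel()
    pred_d = np.asarray(pred_d, dtype=float).ravel()
    corr = float(np.corrcoef(true_d, pred_d)[0, 1])  # Pearson correlation (assumed)
    mse = float(np.mean((pred_d - true_d) ** 2))
    # Relative error is assumed to be |pred - true| / true.
    mre = float(np.mean(np.abs(pred_d - true_d) / true_d))
    return {"corr": corr, "mse": mse, "mre": mre}


def mcc(true_c, pred_c):
    """Matthews correlation coefficient for binary contact labels."""
    true_c = np.asarray(true_c, dtype=bool).ravel()
    pred_c = np.asarray(pred_c, dtype=bool).ravel()
    tp = float(np.sum(true_c & pred_c))
    tn = float(np.sum(~true_c & ~pred_c))
    fp = float(np.sum(~true_c & pred_c))
    fn = float(np.sum(true_c & ~pred_c))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0 when any marginal count is zero, a common convention.
    return float((tp * tn - fp * fn) / denom) if denom > 0 else 0.0


# Toy values (made up for illustration, in angstroms for the distances).
toy_metrics = distance_metrics([4.0, 8.0, 10.0, 5.0], [5.0, 7.0, 10.0, 6.0])
toy_mcc = mcc([1, 0, 0, 1], [1, 0, 1, 1])
print(toy_metrics, toy_mcc)
```

Computing each metric per sample, as here, is what allows the distributions across TCR–epitope pairs to be plotted rather than a single pooled score.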
Extended Data Fig. 3 The true and predicted distances/contacts of the wild-type sample A6-Tax.
a, The true and predicted pairwise distances of the A6-Tax sample. b, The true contact and predicted contact scores of the A6-Tax sample. c, The distance errors of the A6-Tax sample. The errors are defined as the predicted distances minus the corresponding true distances.
Extended Data Fig. 4 The crystal structures of the three epitopes interacting with the CDR3βs.
a, Different views of the epitope GILGFVFTL interacting with the CDR3β (PDB ID: 2VLJ). b, Different views of the epitope GLCTLVAML interacting with the CDR3β (PDB ID: 3O4L). c, Different views of the epitope NLVPMVATV interacting with the CDR3β (PDB ID: 3GSN). The CDR3βs are shown in cyan and the MHCs are shown in grey. The epitopes are shown in red, with darker shading marking the important binding residues.
Supplementary information
Supplementary Information
Supplementary Figs. 1–22, Tables 1–4 and text.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Peng, X., Lei, Y., Feng, P. et al. Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning. Nat Mach Intell 5, 395–407 (2023). https://doi.org/10.1038/s42256-023-00634-4
DOI: https://doi.org/10.1038/s42256-023-00634-4