Each T cell receptor (TCR) gene is created without regard for which substances (antigens) the receptor can recognize. T cell selection culls developing T cells when their TCRs (i) fail to recognize major histocompatibility complexes (MHCs), which act as antigen-presenting platforms, or (ii) recognize self-antigens derived from healthy cells and tissue with high affinity. While T cell selection has been thoroughly studied, little is known about which TCRs are retained or removed by this process. Therefore, we develop an approach using TCR gene sequencing and machine learning to identify patterns in TCR protein sequences that influence the outcome of T cell receptor selection. We verify that the trained models classify TCRs from developing T cells as being before selection and TCRs from mature T cells as being after selection. Our approach may provide future avenues for studying the relationship between T cell selection and conditions like autoimmune diseases.
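The classification task described above can be illustrated with a minimal sketch. It is not the authors' model: it swaps in plain logistic regression on amino acid composition, written in NumPy only, in place of the sequence models used in the study. The function names, the feature choice, and the toy sequences are all hypothetical and chosen only to show the shape of the problem (sequence in, probability of "after selection" out).

```python
import numpy as np

# The 20 standard amino acids, used to index composition features.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def composition_features(sequence):
    """Return the fraction of each amino acid in a protein sequence."""
    counts = np.zeros(len(AMINO_ACIDS))
    for aa in sequence:
        counts[AA_INDEX[aa]] += 1
    return counts / max(len(sequence), 1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic regression weights by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad = p - y  # derivative of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict_post_selection(sequence, w, b):
    """Probability that a sequence is labelled 'after selection'."""
    return sigmoid(composition_features(sequence) @ w + b)


# Made-up toy sequences, NOT real data: label 0 = before, 1 = after selection.
pre_selection = ["CASSWWWF", "CASWWGWF", "CAWWSWYF", "CASSWWEF"]
post_selection = ["CASSLGQF", "CASSLAGF", "CASSPGQF", "CASSLGEF"]

X = np.array([composition_features(s) for s in pre_selection + post_selection])
y = np.array([0.0] * len(pre_selection) + [1.0] * len(post_selection))
w, b = train_logistic(X, y)

for seq in pre_selection + post_selection:
    print(seq, round(float(predict_post_selection(seq, w, b)), 3))
```

Composition features discard residue order, so this sketch cannot capture positional motifs; it only makes concrete the idea of labelling a TCR sequence as before or after selection.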
All computer code is written in Python 3 using NumPy, TensorFlow v1.14, and Keras. Parts of the code can be found at https://github.com/jostmey/dkm. A full copy of the source code, detailed instructions for running it, and the trained models are available upon request under a signed confidentiality agreement. Email the corresponding author if interested.
JO is grateful to the Department of Population & Data Sciences and the University of Texas Southwestern Medical Center for the salary support he received for this study.
JO used his protected time from the Department of Population & Data Sciences to conduct this study. LC and SC may have been supported in part by the US National Institute of Allergy and Infectious Diseases (NIAID) (R01AI097403) and the EU Framework Programme for Research and Innovation (825821).
A provisional patent has been filed based on this study.
Cite this article
Ostmeyer, J., Cowell, L., Greenberg, B. et al. Reconstituting T cell receptor selection in-silico. Genes Immun 22, 187–193 (2021). https://doi.org/10.1038/s41435-021-00141-9