The binding of T-cell receptors to peptide molecules not normally present in the body can trigger an immune response. Predicting which peptide a T-cell receptor will bind to — a difficult feat — has now been achieved. See Letters p.89 & p.94
The T cells of the immune system are covered in protein complexes called T-cell receptors (TCRs). If a TCR binds a peptide fragment of a protein known as an antigen that is not usually present in the body, such as an antigen from a pathogen, this can trigger an immune response. Target recognition by TCRs is often essential in providing immunological protection against infectious diseases and cancer. However, trying to determine or predict the antigen specificity of a TCR on the basis of TCR amino-acid sequence alone is extremely challenging. On pages 89 and 94, respectively, Dash et al.1 and Glanville et al.2 report studies that investigated the relationship between TCR sequence and TCR antigen specificity.
A large collection of gene segments encodes the variable regions of TCR sequences. These genetic fragments undergo a rearrangement process during T-cell development that results in each T cell in the body possessing a unique TCR. The entire assembly of TCRs in an individual is referred to as the TCR repertoire, the scale of which is enormous: it is estimated3 to be about 10¹⁸ different TCRs (a number similar to the predicted number of grains of sand on Earth). This immense diversity enables the immune system to both recognize and respond to a wide range of potential pathogenic threats. Yet this diversity also makes it a challenge to determine the antigen specificity of a TCR from only its sequence, because numerous possible TCRs can bind to the same antigen4.
Advances in high-throughput DNA sequencing5 have made it possible to investigate TCR repertoires in much greater detail than ever before. These advances include single-cell sequencing techniques that provide information on the natural pairing of parts of the TCR complex known as the α- and β-variable chains6,7. The combination of these two chains provides the TCR structure that binds to an antigen on the surface of a cell, which is presented as a complex with a protein of the major histocompatibility complex (MHC) family. Complexes of MHC proteins and peptides can be produced from genetically engineered cells and used to search for T cells that specifically bind to a given antigen8.
Both Dash et al. and Glanville et al. followed an approach in which MHC proteins and antigens that are associated with common diseases such as influenza or tuberculosis are complexed together and fluorescently labelled, and the resulting structures used to isolate T cells that bind to the specific antigen. The authors performed high-throughput, single-cell DNA sequencing of the TCR genes from these T cells, enabling them to generate a large database of antigen-specific TCR sequences. They then performed a detailed analysis of these protein sequences to identify the patterns of sequence motifs that correlated with antigen specificity (Fig. 1).
Structural analysis has revealed9 that small portions of the TCR, most notably, sections known as complementarity-determining regions (CDRs), are the main sites of direct interaction with MHC–peptide complexes. Dash and colleagues developed a metric known as a distance measure to capture the similarity of any two TCR sequences on the basis of their common amino acids. They used it to calculate similarities between CDRs and identified clustered, highly similar, antigen-specific groups of TCRs that were present in many different samples of T cells from humans or mice. Statistical analysis of these TCR clusters revealed the presence of short sequence motifs in CDRs that were specifically enriched when compared to repertoires that had not been selected for binding to the antigen of interest. Glanville and colleagues performed a similar analysis, focusing exclusively on TCRs from humans that bind to specific antigens that are associated with viral or bacterial infections. Their work also led to the discovery of sequence-enrichment motifs in CDRs, which were not observed in the human TCR repertoires that did not bind to the given antigen.
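The two ideas in this paragraph, a pairwise distance between CDR sequences and the enrichment of short motifs relative to an unselected background, can be illustrated with a toy Python sketch. It is not the method of either paper: the distance here is a plain mismatch count with a length penalty, which is an assumption made for simplicity, and the function names (`cdr_distance`, `kmer_enrichment`) and all sequences are invented for the example.

```python
from collections import Counter


def cdr_distance(a: str, b: str, gap_penalty: int = 2) -> int:
    """Toy distance: mismatches over the shared prefix length, plus a
    penalty per residue of length difference. An assumption for this
    sketch, not the published distance measure."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + gap_penalty * abs(len(a) - len(b))


def kmer_counts(seqs, k):
    """Count how many sequences contain each k-mer (once per sequence)."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def kmer_enrichment(selected, background, k=3, pseudocount=1.0):
    """Ratio of the fraction of antigen-selected sequences containing a
    k-mer to the fraction of background sequences containing it.
    The pseudocount avoids division by zero for unseen k-mers."""
    sel = kmer_counts(selected, k)
    bg = kmer_counts(background, k)
    return {
        kmer: ((sel[kmer] + pseudocount) / (len(selected) + pseudocount))
        / ((bg[kmer] + pseudocount) / (len(background) + pseudocount))
        for kmer in sel
    }


# Hypothetical CDR3-like strings, made up for illustration only.
selected = ["CASSIRSSYEQYF", "CASSIRSAYEQYF", "CASSIRSTDTQYF"]
background = ["CASSLGQGNTEAFF", "CASSPDRGYEQYF",
              "CASRTGGNQPQHF", "CASSLAGGYNEQFF"]

scores = kmer_enrichment(selected, background)

# List the most enriched 3-mers; ties are broken alphabetically.
for kmer, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:3]:
    print(kmer, round(score, 2))
```

In this toy data set, motifs that occur in every selected sequence but in no background sequence score highest, whereas the shared leading residues, present in both sets, score close to 1. A real analysis would need a statistical test and a far larger background repertoire.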
Both groups then used the enriched sequence motifs they had identified to develop a classification system for predicting antigen specificity on the basis of TCR sequence. In the system developed by Dash and colleagues, the specificity of any given TCR is predicted by identifying the cluster of antigen-specific TCRs that shares the highest sequence similarity with the TCR in question. This approach enabled them to correctly assign the antigen-binding specificity of human and mouse TCRs. When testing 10 different antigens, the authors achieved a success rate of about 80%.
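The general idea of assigning a TCR to the most similar antigen-specific group can be sketched as follows. This is a minimal illustration under stated assumptions, not the classifier of Dash and colleagues: it uses a toy mismatch distance on single CDR3-like strings, picks the group with the smallest mean distance to the query, and the labels `antigen_A` and `antigen_B` and every sequence are hypothetical.

```python
def toy_distance(a: str, b: str, gap_penalty: int = 2) -> int:
    """Toy distance (an assumption for this sketch): mismatches over the
    shared prefix length plus a penalty for any length difference."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + gap_penalty * abs(len(a) - len(b))


def predict_specificity(query: str, groups: dict) -> str:
    """Return the label of the group whose members are, on average,
    closest to the query sequence."""
    def mean_distance(members):
        return sum(toy_distance(query, m) for m in members) / len(members)

    return min(groups, key=lambda label: mean_distance(groups[label]))


# Hypothetical antigen-specific groups of CDR3-like strings.
groups = {
    "antigen_A": ["CASSIRSSYEQYF", "CASSIRSAYEQYF"],
    "antigen_B": ["CASSLGQGNTEAFF", "CASSLAGGYNEQFF"],
}

# A query sequence that is not a member of either group.
print(predict_specificity("CASSIRSTYEQYF", groups))
```

The query differs from each member of the first group at a single position, so it is assigned to that group even though the exact sequence was never seen. That is the sense in which a similarity-based classifier can generalize to new receptors.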
To determine the limits of their prediction method, Dash et al. generated an independent TCR repertoire data set from mouse T cells that bound to four of their tested antigens. They then assigned each TCR of the new data set to an antigen group using their previously developed sequence-similarity classifier. Their prediction of TCR antigen-binding specificity was 90% accurate for three of the four antigens tested. Importantly, 85% of the correctly assigned TCRs had not been observed in the previous data set, demonstrating that this classification system was able to predict the antigen-binding specificity of TCRs that it had not encountered before.
The antigen-prediction system developed by Glanville et al. also enabled the assignment of TCRs to antigen-specific binding groups. When using an independent set of TCR repertoires from humans, the authors could assign TCRs to the correct antigen-binding group, and also identify additional TCR sequences that could bind to the previously tested antigens. Using the enriched sequence motifs present in a previously analysed TCR repertoire that was specific for an antigen from Mycobacterium tuberculosis (the bacterium that causes tuberculosis), the authors designed a set of ten synthetic TCRs not found in the biological samples they had tested, and that they predicted would be specific for the M. tuberculosis antigen. By genetically modifying T cells to express these TCRs, the authors observed during in vitro analysis that eight of the ten synthetic TCRs specifically bound to the antigen. And when they assessed the level of T-cell activation, two of their synthetic TCRs had higher activity than a naturally generated antigen-specific TCR.
An exciting potential application that might emerge from the studies by these two groups is the development of new types of TCR-based clinical diagnostic tool10,11. For example, it might be possible to determine whether having a greater number of antigen-specific TCRs in an individual's repertoire correlates with better immunological protection. However, a limitation of the approach used by Dash et al. and Glanville et al. is that it requires pre-existing knowledge and access to antigenic MHC–peptide complexes to identify the sequence patterns that can be used to predict the antigen-binding specificity of TCRs.
In some clinical situations, including the treatment of cancer, it would be useful to know the extent of the tumour-specific T-cell responses that are present in an individual. However, it is difficult to gain prior knowledge of the tumour antigens that are being targeted by T cells, given the high number of genetic mutations present in tumour cells12. Therefore, methods for identifying TCR antigen-specificity groups without the need to isolate antigen-specific T cells would be highly valuable. Enriched TCR sequence motifs can be determined in mice that are treated with a specific antigen13, which shows that this might be a feasible approach to overcoming the problem of isolating antigen-specific T cells.
Although Glanville et al. demonstrated that they could design synthetic TCRs that have antigen specificity, they used a relatively simple approach of mixing TCR sequences based on those of natural TCRs. In the future, more-advanced methods that integrate computational biology and structural modelling14 might be used to design highly specific and potent TCRs for use in T-cell therapies15.
1. Dash, P. et al. Nature 547, 89–93 (2017).
2. Glanville, J. et al. Nature 547, 94–98 (2017).
3. Murphy, K., Travers, P., Walport, M. & Janeway, C. Janeway's Immunobiology 8th edn (Garland Science, 2012).
4. Sewell, A. K. Nature Rev. Immunol. 12, 669–677 (2012).
5. Friedensohn, S., Khan, T. A. & Reddy, S. T. Trends Biotechnol. 35, 203–214 (2017).
6. Dash, P. et al. J. Clin. Invest. 121, 288–295 (2011).
7. Han, A., Glanville, J., Hansmann, L. & Davis, M. M. Nature Biotechnol. 32, 684–692 (2014).
8. Altman, J. D. et al. Science 274, 94–96 (1996).
9. Garcia, K. C. & Adams, E. J. Cell 122, 333–336 (2005).
10. Emerson, R. O. et al. Nature Genet. 49, 659–665 (2017).
11. Attaf, M., Huseby, E. & Sewell, A. K. Cell Mol. Immunol. 12, 391–399 (2015).
12. Liu, X. S. & Mardis, E. R. Cell 168, 600–612 (2017).
13. Sun, Y. et al. Front. Immunol. 8, 430 (2017).
14. Pierce, B. G. et al. PLoS Comput. Biol. 10, e1003478 (2014).
15. Sadelain, M., Rivière, I. & Riddell, S. Nature 545, 423–431 (2017).