Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics

Zhang, Ze; Xiong, Danyi; Wang, Xinlei; Liu, Hongyu; Wang, Tao

doi:10.1038/s41592-020-01020-3

Article
Published: 06 January 2021

Nature Methods volume 18, pages 92–99 (2021)

Abstract

Many experimental and bioinformatics approaches have been developed to characterize the human T cell receptor (TCR) repertoire. However, the unknown functional relevance of TCR profiling hinders unbiased interpretation of the biology of T cells. To address this inadequacy, we developed tessa, a tool to integrate TCRs with gene expression of T cells to estimate the effect that TCRs confer on the phenotypes of T cells. Tessa leveraged techniques combining single-cell RNA-sequencing with TCR sequencing. We validated tessa and showed its superiority over existing approaches that investigate only the TCR sequences. With tessa, we demonstrated that TCR similarity constrains the phenotypes of T cells to be similar and dictates a gradient in antigen targeting efficiency of T cell clonotypes with convergent TCRs. We showed this constraint could predict a functional dichotomization of T cells postimmunotherapy treatment and is weakened in tumor contexts.

**Fig. 2: TCR networks demonstrate a gradient of targeting efficiency.**

**Fig. 3: TCR similarity determines fate of T cells postimmunotherapy treatment.**

**Fig. 4: CD8⁺ T cells are functionally constrained by TCRs differently in healthy donors and tumor patients.**

Data availability

The bulk RNA-seq datasets used for deriving TCRs and then for the auto-encoder training are publicly available at https://gdc.cancer.gov/about-data/publications/panimmune (TCGA²³), https://www.iedb.org/database_export_v3.php (IEDB) and http://friedmanlab.weizmann.ac.il/McPAS-TCR/ (McPAS²⁵). We made the Kidney-bulkRNA²⁴ dataset available in csv format at https://github.com/jcao89757/TESSA/tree/master/Tessa_released_data. All scRNA-seq/TCR-seq datasets are publicly available. The NSCLC-1 and healthy PBMC-1 datasets are available on the 10x website https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0. The healthy-CD8 1–4 datasets are available on https://www.10xgenomics.com/resources/application-notes/a-new-way-of-exploring-immunity-linking-highly-multiplexed-antigen-recognition-to-immune-repertoire-and-phenotype/. The healthy PBMC-2 dataset is also available on the 10x Genomics website https://support.10xgenomics.com/single-cell-vdj/datasets/3.0.0. The NSCLC-2 (ref. ²⁶), CRC²⁷ and HCC²⁸ datasets are downloaded from the European Genome-Phenome Archive (EGA) under accession numbers EGAS00001002430, EGAS00001002791 and EGAS00001002072, respectively. The Breast-1–5 (ref. ²⁹) datasets are available on the Gene Expression Omnibus (GEO) under accession numbers GSE114727 and GSE114724. The Melanoma³⁰, BCC³¹ and ECCITE-Seq¹⁶ datasets are also on the GEO database under study numbers GSE123139, GSE113590 and GSE126310. The Glanville¹⁰ dataset is downloaded from https://doi.org/10.1038/nature22976. The Dash¹¹ dataset is available in the National Center for Biotechnology Information Sequence Read Archive under accession number SRP101659. The details of the data used, including sample size, role in the analysis and references, are shown in Supplementary Table 1. All scRNA-seq data were involved in Fig. 2 (directly or indirectly mentioned), the BCC scRNA-seq data were used in Fig. 3 and all scRNA-seq data were used in Fig. 4. Source data are provided with this paper.

Code availability

The tessa model is available at https://github.com/jcao89757/tessa (https://doi.org/10.5281/zenodo.4161819)⁴⁶. The SCINA model is available at https://github.com/jcao89757/SCINA (https://doi.org/10.3390/genes10070531)⁴⁵.

References

1. Oettinger, M. A. V(D)J recombination: on the cutting edge. Curr. Opin. Cell Biol. 11, 325–329 (1999).
2. Jung, D. & Alt, F. W. Unraveling V(D)J recombination: insights into gene regulation. Cell 116, 299–311 (2004).
3. Kappler, J. et al. The major histocompatibility complex-restricted antigen receptor on T cells in mouse and man: identification of constant and variable peptides. Cell 35, 295–302 (1983).
4. Haskins, K. et al. The major histocompatibility complex-restricted antigen receptor on T cells. I. Isolation with a monoclonal antibody. J. Exp. Med. 157, 1149–1169 (1983).
5. Staveley-O’Carroll, K. et al. Induction of antigen-specific T cell anergy: an early event in the course of tumor progression. Proc. Natl Acad. Sci. USA 95, 1178–1183 (1998).
6. Skapenko, A., Leipe, J., Lipsky, P. E. & Schulze-Koops, H. The role of the T cell in autoimmune inflammation. Arthritis Res. Ther. 7, S4–S14 (2005).
7. Stubbington, M. J. T. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).
8. Bolotin, D. A. et al. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 35, 908–911 (2017).
9. Eltahla, A. A. et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol. Cell Biol. 94, 604–611 (2016).
10. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
11. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).
12. Tubo, N. J. et al. Single naive CD4⁺ T cells from a diverse repertoire produce different effector cell types during infection. Cell 153, 785–796 (2013).
13. Buchholz, V. R. et al. Disparate individual fates compose robust CD8⁺ T cell immunity. Science 340, 630–635 (2013).
14. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
15. Sheng, K., Cao, W., Niu, Y., Deng, Q. & Zong, C. Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat. Methods 14, 267–270 (2017).
16. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).
17. Atchley, W. R., Zhao, J., Fernandes, A. D. & Drüke, T. Solving the protein sequence metric problem. Proc. Natl Acad. Sci. USA 102, 6395–6400 (2005).
18. Ballard, D. Modular learning in neural networks. In Proc. Sixth National Conference on Artificial Intelligence Vol. 1, 279–284 (ACM, 1987).
19. Ostmeyer, J. et al. Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis. BMC Bioinf. 18, 401 (2017).
20. Ostmeyer, J., Christley, S., Toby, I. T. & Cowell, L. G. Biophysicochemical motifs in T-cell receptor sequences distinguish repertoires from tumor-infiltrating lymphocyte and adjacent healthy tissue. Cancer Res. 79, 1671–1680 (2019).
21. Thomas, N. et al. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. Bioinformatics 30, 3181–3188 (2014).
22. Zhang, A. W. et al. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. Cell 173, 1755–1769.e22 (2018).
23. Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830.e14 (2018).
24. Wang, T. et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. Cancer Discov. 8, 1142–1155 (2018).
25. Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).
26. Guo, X. et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985 (2018).
27. Zhang, L. et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564, 268–272 (2018).
28. Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell 169, 1342–1356.e16 (2017).
29. Azizi, E. et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. Cell 174, 1293–1308.e36 (2018).
30. Li, H. et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. Cell 176, 775–789.e18 (2019).
31. Yost, K. E. et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. Nat. Med. 25, 1251–1259 (2019).
32. Eduati, F. et al. Prediction of human population responses to toxic compounds by a collaborative competition. Nat. Biotechnol. 33, 933–940 (2015).
33. Bansal, M. et al. A community computational challenge to predict the activity of pairs of compounds. Nat. Biotechnol. 32, 1213–1222 (2014).
34. Costello, J. C. & Stolovitzky, G. Seeking the wisdom of crowds through challenge-based competitions in biomedical research. Clin. Pharmacol. Ther. 93, 396–398 (2013).
35. Waugh, K. A. et al. Molecular profile of tumor-specific CD8⁺ T cell hypofunction in a transplantable murine cancer model. J. Immunol. 197, 1477–1488 (2016).
36. Wu, A. A., Drake, V., Huang, H.-S., Chiu, S. & Zheng, L. Reprogramming the tumor microenvironment: tumor-induced immunosuppressive factors paralyze T cells. Oncoimmunology 4, e1016700 (2015).
37. Burkholder, B. et al. Tumor-induced perturbations of cytokines and immune cell networks. Biochim. Biophys. Acta 1845, 182–201 (2014).
38. Conley, J. M., Gallagher, M. P. & Berg, L. J. T cells and gene regulation: the switching on and turning up of genes after T cell receptor stimulation in CD8 T cells. Front. Immunol. https://doi.org/10.3389/fimmu.2016.00076 (2016).
39. Cho, J.-H. et al. Unique features of naive CD8⁺ T cell activation by IL-2. J. Immunol. 191, 5559–5573 (2013).
40. Iezzi, G., Karjalainen, K. & Lanzavecchia, A. The duration of antigenic stimulation determines the fate of naive and effector T cells. Immunity 8, 89–95 (1998).
41. Moskophidis, D., Lechner, F., Pircher, H. & Zinkernagel, R. M. Virus persistence in acutely infected immunocompetent mice by exhaustion of antiviral cytotoxic effector T cells. Nature 362, 758–761 (1993).
42. Kalergis, A. M. et al. Efficient T cell activation requires an optimal dwell-time of interaction between the TCR and the pMHC complex. Nat. Immunol. 2, 229–234 (2001).
43. Corse, E., Gottschalk, R. A., Krogsgaard, M. & Allison, J. P. Attenuated T cell responses to a high-potency ligand in vivo. PLoS Biol. https://doi.org/10.1371/journal.pbio.1000481 (2010).
44. Mikolov, T., Chen, K., Corrado, G. S. & Dean, J. Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013).
45. Zhang, Z. et al. SCINA: a semi-supervised subtyping algorithm of single cells and bulk samples. Genes https://doi.org/10.3390/genes10070531 (2019).
46. Zhang, Z. jcao89757/TESSA: mapping the functional landscape of T cell receptor repertoire by single T cell transcriptomics. Zenodo https://doi.org/10.5281/zenodo.4161819 (2020).

Acknowledgements

We thank L.H.R. Xu for his valuable input on the manuscript writing. This study was supported by the National Institutes of Health (NIH) (grant nos. CCSG 5P30CA142543 to T.W. and R15GM131390 to X.W.) and Cancer Prevention Research Institute of Texas (grant no. CPRIT RP190208 to T.W.).

Author information

Authors and Affiliations

Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
Ze Zhang, Hongyu Liu & Tao Wang
Department of Statistical Science, Southern Methodist University, Dallas, TX, USA
Danyi Xiong & Xinlei Wang
Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, USA
Tao Wang

Contributions

Z.Z. contributed to the computational analyses and manuscript writing. D.X. and X.W. contributed to the design and write-up of the statistical methodologies. H.L. provided valuable suggestions on the direction of the project, and contributed to manuscript writing. T.W. contributed to the overall supervision of the project, study design and manuscript writing.

Corresponding author

Correspondence to Tao Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Madhura Mukhopadhyay was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Details of the stacked auto-encoder for TCR embedding.

a, The structure of the auto-encoder, with the configurations of each layer shown. b, Typical examples of TCR CDR3b sequences, heatmaps of the initially embedded ‘Atchley’ matrices of TCRs, and heatmaps of the auto-encoder-reconstructed ‘Atchley’ matrices. The TCR sequence examples were not used in the training step of the auto-encoder. c, Scatterplots showing the consistency between the ‘Atchley factor’ values of the original and reconstructed TCRs. Green points represent tiles in the heatmaps in (b).

Extended Data Fig. 2 Scatterplots showing the relationships between the distances of TCRs and the distances of RNA expression levels for several more datasets.

Both distances are calculated in a pair-wise manner between all the T cell clonotypes of each dataset. Four example datasets are shown: Healthy-CD8-3 (a), Healthy-CD8-4 (b), Breast-1 (c), and Breast-2 (d) (Supplementary Table 1). The P values indicate the significance of the Pearson correlation coefficients. The shaded areas denote the 95% confidence intervals for linear regressions.
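The comparison in this figure can be pictured with a small sketch. This is an illustration only, not the authors' code: the Euclidean distance, the function names and the toy vectors below are all assumptions made for the example. It computes the pairwise distance between clonotypes once in a TCR embedding space and once in an expression space, then takes the Pearson correlation of the two distance lists.

```python
from itertools import combinations
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(vectors):
    """Distances for every unordered pair of rows, in a fixed (i < j) order."""
    return [euclidean(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): one row per clonotype.
tcr_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
expression = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [6.0, 6.0]]

tcr_dist = pairwise_distances(tcr_embeddings)
rna_dist = pairwise_distances(expression)
r = pearson(tcr_dist, rna_dist)
```

Because both distance lists are built in the same pair order, each entry of `tcr_dist` lines up with the entry of `rna_dist` for the same pair of clonotypes, which is what makes the correlation meaningful.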

Extended Data Fig. 3 The weights of the TCR embeddings learned from tessa.

The X axis shows the digits of the 30-dimensional embeddings, and the Y axis shows the weights learned for all datasets. Each bar represents one digit of the weights and shows the values of that digit obtained from all the 19 scRNA datasets in the Supplementary Table 1.

Extended Data Fig. 4 Benchmarking results using GLIPH.

a, Clustering rates of the four Healthy-CD8 datasets from 10x Genomics, the Glanville dataset, and the Dash dataset under different global convergence distance cutoff (‘gccutoff’) values (Supplementary Table 1). The dashed lines represent the tessa clustering rates of the corresponding datasets. b, Clustering purities of GLIPH when ‘gccutoff’ equals 3. The cutoff value was selected so that the GLIPH clusters achieved clustering rates most similar to those of the tessa networks. The clustering purities were calculated with the same method as in Fig. 2. c, d, The GLIPH network purities (c) and number of networks (d) with different ‘gccutoff’ values, compared with the tessa network purities and the number of networks.

Extended Data Fig. 5 Clustering of TCR clonotypes informed by tessa is reflective of antigen binding specificity.

The antigen binding specificity of 207 human TCRβ chains from 704 T cells was profiled against two epitopes in the Dash dataset, and that of 276 TCRs from 415 T cells against three epitopes in the Glanville dataset. a, b, t-SNE plots showing the TCR clonotypes in the space of the TCR embeddings, with the embeddings adjusted by the tessa-inferred weights. The hierarchical clustering tree cutoff used in the two plots is shown as green dashed lines in c–f. Each point in the plots represents one TCR clonotype, and the size of the point refers to the clone size. Points are colored by the true antigens that the corresponding TCRs target according to the original report. Points are connected if they are clustered into the same network based on hierarchical clustering of the TCR embeddings. T cell clones with only one cell were deemed as having low confidence and unclustered clones, which does not affect the calculation of the purities, were excluded from visualization. c, d, The numbers of TCR networks and the clustering rates with different hierarchical tree cutoffs in the Dash dataset (c) and in the Glanville dataset (d). Clustering rates were calculated as the number of TCR clonotypes that are clustered with at least one other TCR clonotype, divided by the total number of TCR clonotypes. e, f, The network purities and P values testing the significance of the purities with different hierarchical tree cutoffs in the Dash dataset (e) and the Glanville dataset (f). The network purity and P value calculations are described in the Methods section.
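The clustering rate defined in this caption is simple enough to sketch directly. The snippet below is a minimal illustration under stated assumptions, not part of tessa: the function name `clustering_rate` and the input format (one network label per clonotype) are hypothetical. It counts the clonotypes that share a network with at least one other clonotype and divides by the total number of clonotypes.

```python
from collections import Counter


def clustering_rate(network_labels):
    """Fraction of clonotypes that share a network with another clonotype.

    network_labels: one label per clonotype; clonotypes carrying the same
    label belong to the same network (hypothetical input format).
    """
    if not network_labels:
        raise ValueError("need at least one clonotype")
    sizes = Counter(network_labels)
    # A clonotype counts as clustered when its network has two or more members.
    clustered = sum(1 for label in network_labels if sizes[label] >= 2)
    return clustered / len(network_labels)


# Toy example: networks n1 (2 clonotypes) and n3 (3 clonotypes) are clusters;
# n2 and n4 are singletons, so 5 of 7 clonotypes are clustered.
labels = ["n1", "n1", "n2", "n3", "n3", "n3", "n4"]
rate = clustering_rate(labels)
```

The network purity and its P value are defined in the Methods section, which is not reproduced here, so they are deliberately left out of this sketch.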

Extended Data Fig. 6 T cell pathway activity scores of the different T cell subsets in the BCC dataset.

The naive and activated pathways are shown, to be compared against the inhibition, memory and exhausted pathways shown in Fig. 3. The T cell subsets were the same as those in Fig. 3e-g.

Extended Data Fig. 7 Pseudotime analysis of the different T cell subsets in the BCC dataset.

The T cell subsets were the same as those in Fig. 3e–g.

Extended Data Fig. 8 A cartoon sketch showing how the unexplained variance in gene expression of the TCR networks was determined.

Details were described in the Materials and Methods section.

Supplementary information

Supplementary Information

Supplementary Tables 1 and 2 and Notes 1 and 2.

Reporting Summary

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

About this article

Cite this article

Zhang, Z., Xiong, D., Wang, X. et al. Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. Nat Methods 18, 92–99 (2021). https://doi.org/10.1038/s41592-020-01020-3

Received: 08 April 2020
Accepted: 12 November 2020
Published: 06 January 2021
Issue Date: January 2021
DOI: https://doi.org/10.1038/s41592-020-01020-3