
Lineage tracking reveals dynamic relationships of T cells in colorectal cancer


T cells are key elements of cancer immunotherapy [1] but certain fundamental properties, such as the development and migration of T cells within tumours, remain unknown. The enormous T cell receptor (TCR) repertoire, which is required for the recognition of foreign and self-antigens [2], could serve as lineage tags to track these T cells in tumours [3]. Here we obtained transcriptomes of 11,138 single T cells from 12 patients with colorectal cancer, and developed single T cell analysis by RNA sequencing and TCR tracking (STARTRAC) indices to quantitatively analyse the dynamic relationships among 20 identified T cell subsets with distinct functions and clonalities. Although both CD8+ effector and ‘exhausted’ T cells exhibited high clonal expansion, they were independently connected with tumour-resident CD8+ effector memory cells, implicating a TCR-based fate decision. Of the CD4+ T cells, most tumour-infiltrating T regulatory (Treg) cells showed clonal exclusivity, whereas certain Treg cell clones were developmentally linked to several T helper (TH) cell clones. Notably, we identified two IFNG+ TH1-like cell clusters in tumours that were associated with distinct IFNγ-regulating transcription factors: the GZMK+ effector memory T cells, which were associated with EOMES and RUNX3, and CXCL13+BHLHE40+ TH1-like cell clusters, which were associated with BHLHE40. Only CXCL13+BHLHE40+ TH1-like cells were preferentially enriched in patients with microsatellite-instable tumours, and this might explain their favourable responses to immune-checkpoint blockade. Furthermore, IGFLR1 was highly expressed in both CXCL13+BHLHE40+ TH1-like cells and CD8+ exhausted T cells and possessed co-stimulatory functions. Our integrated STARTRAC analyses provide a powerful approach to dissect the T cell properties in colorectal cancer comprehensively, and could provide insights into the dynamic relationships of T cells in other cancers.


Fig. 1: Properties of CD8+ T cell clonal expansion, migration and developmental transition.
Fig. 2: Properties of CD4+ T cell clonal expansion, migration and developmental transition.
Fig. 3: Clonal TH1-like T cells are enriched in MSI tumours.
Fig. 4: IGFLR1 functions as a co-stimulatory receptor in T cells.

Data availability

The data that support the findings of this study are available from the corresponding author upon request. Sequencing data are available at EGA (accession number EGAS00001002791), and processed gene expression data can be obtained from Gene Expression Omnibus (GEO) (accession number GSE108989).


References

1. Chen, D. S. & Mellman, I. Elements of cancer immunity and the cancer-immune set point. Nature 541, 321–330 (2017).
2. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
3. Stubbington, M. J. T. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).
4. Le, D. T. et al. PD-1 Blockade in tumors with mismatch-repair deficiency. N. Engl. J. Med. 372, 2509–2520 (2015).
5. Le, D. T. et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science 357, 409–413 (2017).
6. Llosa, N. J. et al. The vigorous immune microenvironment of microsatellite instable colon cancer is balanced by multiple counter-inhibitory checkpoints. Cancer Discov. 5, 43–51 (2015).
7. Mlecnik, B. et al. Integrative analyses of colorectal cancer show immunoscore is a stronger predictor of patient survival than microsatellite instability. Immunity 44, 698–711 (2016).
8. Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
9. Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell 169, 1342–1356.e16 (2017).
10. Guo, X. et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985 (2018).
11. Han, A., Glanville, J., Hansmann, L. & Davis, M. M. Linking T-cell receptor sequence to functional phenotype at the single-cell level. Nat. Biotechnol. 32, 684–692 (2014).
12. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
13. Schenkel, J. M. & Masopust, D. Tissue-resident memory T cells. Immunity 41, 886–897 (2014).
14. Cheroutre, H., Lambolez, F. & Mucida, D. The light and dark sides of intestinal intraepithelial lymphocytes. Nat. Rev. Immunol. 11, 445–456 (2011).
15. Paley, M. A. et al. Progenitor and terminal subsets of CD8+ T cells cooperate to contain chronic viral infection. Science 338, 1220–1225 (2012).
16. Thommen, D. S. et al. A transcriptionally and functionally distinct PD-1+ CD8+ T cell pool with predictive potential in non-small-cell lung cancer treated with PD-1 blockade. Nat. Med. 24, 994–1004 (2018).
17. Chihara, N. et al. Induction and transcriptional regulation of the co-inhibitory gene module in T cells. Nature 558, 454–459 (2018).
18. Griffith, J. W., Sokol, C. L. & Luster, A. D. Chemokines and chemokine receptors: positioning cells for host defense and immunity. Annu. Rev. Immunol. 32, 659–702 (2014).
19. Adeegbe, D. O. & Nishikawa, H. Natural and induced T regulatory cells in cancer. Front. Immunol. 4, 190 (2013).
20. Sefik, E. et al. Individual intestinal symbionts induce a distinct population of RORγ+ regulatory T cells. Science 349, 993–997 (2015).
21. Beyer, M. et al. Repression of the genome organizer SATB1 in regulatory T cells is required for suppressive function and inhibition of effector differentiation. Nat. Immunol. 12, 898–907 (2011).
22. Roychoudhuri, R. et al. BACH2 represses effector programs to stabilize Treg-mediated immune homeostasis. Nature 498, 506–510 (2013).
23. Zhang, Y. et al. Genome-wide DNA methylation analysis identifies hypomethylated genes regulated by FOXP3 in human regulatory T cells. Blood 122, 2823–2836 (2013).
24. Djuretic, I. M. et al. Transcription factors T-bet and Runx3 cooperate to activate Ifng and silence Il4 in T helper type 1 cells. Nat. Immunol. 8, 145–153 (2007).
25. Yu, F. et al. The transcription factor Bhlhe40 is a switch of inflammatory versus antiinflammatory Th1 cell fate determination. J. Exp. Med. 215, 1813–1821 (2018).
26. Huynh, J. P. et al. Bhlhe40 is an essential repressor of IL-10 during Mycobacterium tuberculosis infection. J. Exp. Med. 215, 1823–1838 (2018).
27. Wei, S. C. et al. Distinct cellular mechanisms underlie anti-CTLA-4 and anti-PD-1 checkpoint blockade. Cell 170, 1120–1133.e17 (2017).
28. Lobito, A. A. et al. Murine insulin growth factor-like (IGFL) and human IGFL1 proteins are induced in inflammatory skin conditions and bind to a novel tumor necrosis factor receptor family member, IGFLR1. J. Biol. Chem. 286, 18969–18981 (2011).
29. Singer, M. et al. A distinct gene module for dysfunction uncoupled from activation in tumor-infiltrating T cells. Cell 166, 1500–1511.e9 (2016).
30. Schietinger, A. et al. Tumor-specific T cell dysfunction is a dynamic antigen-driven differentiation program initiated early during tumorigenesis. Immunity 45, 389–401 (2016).
31. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
32. Emtage, P. C. et al. Second-generation anti-carcinoembryonic antigen designer T cells resist activation-induced cell death, proliferate on tumor contact, secrete cytokines, and exhibit superior antitumor activity in vivo: a preclinical evaluation. Clin. Cancer Res. 14, 8112–8122 (2008).
33. Chodisetti, S. B. et al. Triggering through Toll-like receptor 2 limits chronically stimulated T-helper type 1 cells from undergoing exhaustion. J. Infect. Dis. 211, 486–496 (2015).
34. Wu, T. D. & Nacu, S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics 26, 873–881 (2010).
35. Lun, A. T., Bach, K. & Marioni, J. C. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 17, 75 (2016).
36. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
37. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
38. Saunders, C. T. et al. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 28, 1811–1817 (2012).
39. Amarasinghe, K. C. et al. Inferring copy number and genotype in tumour exome data. BMC Genomics 15, 732 (2014).
40. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
41. van Wilgenburg, B. et al. MAIT cells are activated during human viral infections. Nat. Commun. 7, 11653 (2016).
42. Godfrey, D. I., Stankovic, S. & Baxter, A. G. Raising the NKT cell family. Nat. Immunol. 11, 197–206 (2010).
43. Kiselev, V. Y. et al. SC3: consensus clustering of single-cell RNA-seq data. Nat. Methods 14, 483–486 (2017).
44. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
45. Rodriguez, A. & Laio, A. Clustering by fast search and find of density peaks. Science 344, 1492–1496 (2014).
46. Strehl, A. & Ghosh, J. Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002).
47. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
48. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
49. Whitfield, M. L., George, L. K., Grant, G. D. & Perou, C. M. Common markers of proliferation. Nat. Rev. Cancer 6, 99–106 (2006).
50. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nat. Methods 14, 309–315 (2017).
51. Kirsch, I., Vignali, M. & Robins, H. T-cell receptor profiling in cancer. Mol. Oncol. 9, 2063–2070 (2015).



We thank C. X. Ye for sample preparation and F. Wang, X. Zhang and J. S. Li for assistance with FACS. We thank the Computing Platform of the CLS (Peking University). This project was supported by Beijing Advanced Innovation Centre for Genomics at Peking University, Key Technologies R&D Program (2016YFC0900100), National Natural Science Foundation of China (81573022, 31530036, 91742203 and 81672375) and Amgen Corporation (USA). L.Z. was supported by the Postdoctoral Foundation of CLS.

Reviewer information

Nature thanks N. Haining, M. Suva and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information




Z.Z., W.O. and L.Z. designed experiments. L.Z., X.Y., Q.F., R.G., Q.Z., J.Y.H., H.K. and X.G. performed the experiments. L.Z., L.T.Z., Y.Z., X.R. and X.H. analysed sequencing data. Y.L., J.Y., S.W., Y.G. and Z.S. collected clinical samples. B.K. constructed the website. L.Z., W.O. and Z.Z. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Zhanlong Shen or Wenjun Ouyang or Zemin Zhang.

Ethics declarations

Competing interests

W.O., X.Y., H.K. and J.Y.H. are employees of Amgen Inc.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Study design and tracking T cell dynamics of patients with CRC by STARTRAC.

a, The experimental flowchart of this study. b, A cartoon illustrating four indices defined by STARTRAC to characterize T cell dynamics. STARTRAC-dist, tissue preference of a cluster estimated by ratios of observed cell numbers to random expectations (Ro/e); STARTRAC-expa, degree of clonal expansion of a cluster defined as ‘1 − evenness’, with evenness as the normalized Shannon entropy of its TCR distribution; STARTRAC-migr, migratory potential of a cluster estimated by the average entropy of its clonotypes across tissues; STARTRAC-tran, potentials of developmental transitions of a cluster, estimated by the average entropy of its clonotypes across two different functional clusters. The detailed definitions of STARTRAC indices are in Methods. c, Opal multi-colour IHC staining with anti-CD3, -CD4, -CD8 and -FOXP3 antibodies to validate the existence of T cells in CRC tumours (exemplified by patient P0215). Original magnification, ×20. TC, CD8+ cytotoxic T cells; TH, CD4+ T helper cells. d, Gating strategy for single T cell sorting in this study (exemplified by patient P0215). TC, TH and Treg cells were enriched by sorting 7AAD−CD3+CD8+, 7AAD−CD3+CD4+CD25−/int and 7AAD−CD3+CD4+CD25hi T cells, respectively.
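The verbal definitions in panel b can be illustrated with a minimal Python sketch. It follows only the wording of the caption, since the Methods are not reproduced here. The function names, the toy inputs and the choice to weight each clonotype by its cell count when averaging are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def ro_e(table):
    """Observed/expected ratio per (cluster, tissue), as for STARTRAC-dist.

    table maps cluster -> {tissue: observed cell count}. The expected count
    is row total * column total / grand total (independence assumption).
    """
    tissues = sorted({t for row in table.values() for t in row})
    row_tot = {c: sum(row.values()) for c, row in table.items()}
    col_tot = {t: sum(row.get(t, 0) for row in table.values()) for t in tissues}
    total = sum(row_tot.values())
    return {
        c: {t: row.get(t, 0) / (row_tot[c] * col_tot[t] / total) for t in tissues}
        for c, row in table.items()
    }


def expa_index(clonotypes):
    """'1 - evenness' for one cluster, as for STARTRAC-expa.

    clonotypes is one clonotype label per cell. Evenness is the Shannon
    entropy of clonotype sizes divided by log2(number of clonotypes).
    """
    sizes = list(Counter(clonotypes).values())
    if len(sizes) < 2:
        return 0.0  # convention chosen here; evenness is undefined (assumption)
    return 1.0 - shannon_entropy(sizes) / math.log2(len(sizes))


def mean_clonotype_entropy(cells):
    """Cell-weighted mean entropy of each clonotype across labels.

    cells is a list of (clonotype, label). With tissue labels this mirrors
    STARTRAC-migr; with two cluster labels it mirrors STARTRAC-tran.
    The cell-count weighting is an assumption.
    """
    per_clone = defaultdict(Counter)
    for clone, label in cells:
        per_clone[clone][label] += 1
    n = len(cells)
    return sum(
        sum(c.values()) / n * shannon_entropy(list(c.values()))
        for c in per_clone.values()
    )
```

In this sketch a cluster in which every cell carries a different TCR has an expansion index of 0, and a clonotype found only in one tissue contributes 0 to the migration average, which matches the intuition given in the caption.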

Extended Data Fig. 2 Pathological and genomic characteristics of CRC tumours in the study.

a, Deficiency of mismatch repair proteins including MLH1, MSH2, MSH6 and PMS2 in all MSI patients (P0413, P0825, P0909 and P0123) measured by IHC (n = 12 patients). +, proficiency; −, deficiency. Original magnification, ×200. b, Profiles of DNA copy numbers of two representative patients (MSI patient, top; MSS patient, bottom). The copy number information was obtained by ADTex and depicted in bin count plots across chromosomes. The read count ratios (‘1’ in y axis means baseline copy number) and B allele frequencies (BAF) are shown. Various coloured dots in the ratio graph represent different copy number status of each segment. ASCNA, allele-specific copy number alteration; HET, heterozygous; LOH, loss of heterozygosity.

Extended Data Fig. 3 Basic information of the single T cell RNA-seq data.

a, Saturation curves of the number of detected genes against sequencing depth (exemplified by cell NTC-53 from patient P1228). Each point on the curve is derived from calculations based on the random selection of a fraction of raw reads from each sample, representing the average of 100 replicate sub-samplings. Error bars denote s.d. Each line with a different colour shows how fast a gene can reach detection saturation at different expression levels, represented by a particular TPM value. b, Unbiased coverage of gene body from 5′ to 3′ between blood, tumours and adjacent normal tissues. c, Frequencies of the V and J segments of the TCR α chains. d, Frequencies of the V and J segments of the TCR β chains. e, Bar plots showing the number of clonotypes and clonal cells in each CD8+ and CD4+ T cell cluster. The clonotypes are categorized as unique (n = 1) and clonal (n = 2 and n ≥ 3) based on their cell numbers. Clonal cells are defined as those clonotypes containing at least two cells. f, t-SNE projection of 3,557 CD8+ T cells (CD8_C01-LEF1, n = 174; CD8_C02-GPR183, n = 169; CD8_C03-CX3CR1, n = 743; CD8_C04-GZMK, n = 773; CD8_C05-CD6, n = 487; CD8_C06-CD160, n = 351; CD8_C07-LAYN, n = 860) based on different clustering methods including SC3, Seurat and sscClust. Each point represents one single cell coloured by cluster label. g, Box plots showing the down-sampling analysis of clustering performed on the CD8+ and CD4+ T cell datasets. Each dot represents an individual clustering of a given number of T cells. The down-sampling and clustering were performed iteratively for each cell number (n = 10 times). Each down-sampled clustering was compared to the clustering performed on the entire dataset, using the NMI index. Higher NMI values indicate more accurate cluster assignment.
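The comparison in panel g relies on normalized mutual information (NMI) between two cluster assignments of the same cells. The caption does not state which normalization was used, so the sketch below assumes the geometric-mean form and uses hypothetical function names. It is meant only to show what the index measures.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same cells.

    Normalized by sqrt(H(A) * H(B)); this normalization is an assumption.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        (nij / n) * math.log(nij * n / (count_a[i] * count_b[j]))
        for (i, j), nij in joint.items()
    )
    h_a, h_b = label_entropy(labels_a), label_entropy(labels_b)
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if h_a == h_b else 0.0
    return mutual_info / math.sqrt(h_a * h_b)
```

Cluster labels only need to agree up to renaming: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` is 1 up to floating-point error, whereas two labelings that split the cells independently give 0.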

Extended Data Fig. 4 Expression levels of signature genes in each T cell cluster.

a, Gene expression heat map of 8 CD8+ T cell (n = 3,628) clusters. Rows represent signature genes and columns represent different clusters. b, Gene expression heat map of 12 CD4+ T cell clusters (n = 4,902). c, t-SNE plot of expression levels of selected genes in different clusters indicated by the coloured oval corresponding to Fig. 1a. Number of cells contained in each cluster: CD8_C01-LEF1, n = 174; CD8_C02-GPR183, n = 169; CD8_C03-CX3CR1, n = 743; CD8_C04-GZMK, n = 773; CD8_C05-CD6, n = 487; CD8_C06-CD160, n = 351; CD8_C07-LAYN, n = 860; CD8_C08-SLC4A10, n = 71; CD4_C01-CCR7, n = 462; CD4_C02-ANXA1, n = 472; CD4_C03-GNLY, n = 190; CD4_C04-TCF7, n = 388; CD4_C05-CXCR6, n = 568; CD4_C06-CXCR5, n = 262; CD4_C07-GZMK, n = 185; CD4_C08-IL23R, n = 244; CD4_C09-CXCL13, n = 319; CD4_C10-FOXP3, n = 389; CD4_C11-IL10, n = 103; CD4_C12-CTLA4, n = 1,320.

Extended Data Fig. 5 Summary of functional properties of various T cell clusters.

a, Functional subsets of CD4+ T cell (n = 4,902) clusters defined by a set of known marker genes. Number of cells contained in each CD4+ cluster: TN, n = 462; P.TCM, n = 472; TEMRA, n = 190; N.TCM, n = 388; TRM, n = 568; follicular T helper (TFH), n = 262; TEM, n = 185; TH17, n = 244; TH1-like cells, n = 319; P.Treg, n = 389; N.Treg, n = 103; T.Treg, n = 1,320. N, normal tissue; P, peripheral blood; T, tumour. b, Characteristics of the CD8+ IEL T cells as defined by the expression properties of a panel of functionally relevant genes in CD8+ T cells (n = 3,628). Number of cells contained in each CD8+ cluster: TN, n = 174; TCM, n = 169; TEMRA, n = 743; TEM, n = 773; TRM, n = 487; IEL, n = 351; TEX, n = 860; MAIT, n = 71. For violin plots in a and b, colours denote average expression levels; widths denote cell densities. c, t-SNE plot showing the presence of different T cell clusters in peripheral blood (n = 2,449; CD8+ T cells, n = 1,021; CD4+ T cells, n = 1,428), adjacent normal tissues (n = 1,962; CD8+ T cells, n = 961; CD4+ T cells, n = 1,001) and tumours (n = 4,119; CD8+ T cells, n = 1,646; CD4+ T cells, n = 2,473). d, Overview of T cell cluster characteristics. STARTRAC-dist: +++ indicates Ro/e > 1; ++, 0.8 < Ro/e ≤ 1; +, 0.2 < Ro/e ≤ 0.8; +/−, 0 < Ro/e ≤ 0.2; −, Ro/e = 0. STARTRAC-expa: +++ indicates \(I_{\mathrm{expa}}^{\mathrm{STARTRAC}}\) > 0.10; ++, 0.06 < \(I_{\mathrm{expa}}^{\mathrm{STARTRAC}}\) ≤ 0.10; +, 0.005 < \(I_{\mathrm{expa}}^{\mathrm{STARTRAC}}\) ≤ 0.05; −, \(I_{\mathrm{expa}}^{\mathrm{STARTRAC}}\) ≤ 0.005. STARTRAC-migr: +++ indicates \(I_{\mathrm{migr}}^{\mathrm{STARTRAC}}\) > 0.50; ++, 0.21 < \(I_{\mathrm{migr}}^{\mathrm{STARTRAC}}\) ≤ 0.50; +, 0.1 < \(I_{\mathrm{migr}}^{\mathrm{STARTRAC}}\) ≤ 0.20; −, \(I_{\mathrm{migr}}^{\mathrm{STARTRAC}}\) ≤ 0.1. STARTRAC-tran: +++ indicates \(I_{\mathrm{tran}}^{\mathrm{STARTRAC}}\) > 0.20; ++, 0.10 < \(I_{\mathrm{tran}}^{\mathrm{STARTRAC}}\) ≤ 0.20; +, 0.05 < \(I_{\mathrm{tran}}^{\mathrm{STARTRAC}}\) ≤ 0.10; −, \(I_{\mathrm{tran}}^{\mathrm{STARTRAC}}\) ≤ 0.05.

Extended Data Fig. 6 CD8+ TEX cells are characterized by high proliferation property and production of effector molecules.

a, A subpopulation of CD8+ TEX shows high expression of MKI67 among 8,530 T cells. b, Gene set enrichment analysis (GSEA) showing the enrichment of proliferation-related pathways in CD8+ TEX cells (n = 3,628; false discovery rate < 0.01; labelled in red). c, Representative example of a CRC tumour stained by multi-coloured IHC showing co-expression of Ki67, CD8, PD-1 and HAVCR2 in CD8+ TEX cells (exemplified by P0413; n = 2 patients). Original magnification, ×20. d, Volcano plot showing the differentially expressed genes between high-proliferative (n = 140) and low-proliferative (n = 720) TEX cells. Most of the highly expressed genes in high-proliferative TEX cells are related to cell proliferation. Adjusted P < 0.01; fold change ≥ 2; two-sided unpaired limma-moderated t-test; Benjamini–Hochberg adjusted P value. e, Violin plot showing the expression of TBX21, EOMES and PDCD1 in each CD8+ T cell (n = 3,628) cluster and the low-proliferative (n = 720) or high-proliferative (n = 140) TEX cell subsets. f, Most of the clonotypes of high-proliferative TEX cells were also found in low-proliferative TEX cells (top). Each row represents an individual clonotype from one patient. Venn diagram showing overlapped clonal clonotypes (≥2 cells) of high- and low-proliferative TEX cells (bottom). g, Characteristics of CD8+ TEX cells (n = 3,628) as defined by the gene expression of a series of transcription factors, checkpoint receptors, and effector molecules. For violin plots in e and g, colours denote average expression levels; widths denote cell densities.
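The Benjamini–Hochberg adjustment named in panel d is a standard step-up procedure. A minimal pure-Python sketch is given here for orientation only. The function name and the toy P values are hypothetical, and the limma-moderated t-test that produces the raw P values is not reproduced.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each raw P value at ascending rank r (1-based) is scaled by m / r, and a
    running minimum taken from the largest rank downwards keeps the adjusted
    values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy raw P values for four hypothetical genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Genes passing an adjusted-P cut-off; 0.03 is used for this toy example.
passing = [i for i, p in enumerate(adj) if p < 0.03]
```

In the caption the adjusted P < 0.01 cut-off is combined with a fold change ≥ 2 filter; the fold-change step would be a second list filter on the same gene indices.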

Extended Data Fig. 7 Distinct migration capabilities of different CD8+ T cell clusters.

a, Top, chord diagram showing the distribution of clonotypes in blood, normal mucosa and tumours for different CD8+ T cell clusters. CD8+ TEMRA cells show remarkable TCR sharing among different tissues. The shadows coloured in transparent yellow, green and orange represent blood, normal and tumour-specific clonotypes, respectively. The bridges coloured in dark green, dark red and dark blue represent clonotypes shared by blood–normal, blood–tumour and normal–tumour, respectively. Bottom, Venn diagram showing the distribution of expanded clonotypes in blood, tumour and normal TEMRA cells. b, Relative average expression patterns of migration-related genes across CD8+ T cell clusters (total n = 3,557 cells, excluding MAIT cells). Number of cells contained in each cluster: TN, n = 174; TCM, n = 169; TEMRA, n = 743; TEM, n = 773; TRM, n = 487; IEL, n = 351; TEX, n = 860.

Extended Data Fig. 8 TCR sharing and state transitions of CD8+ T cell clusters implicated by STARTRAC-tran indices.

a, Pie charts showing the fraction of shared clonotypes with CD8+ TEM cells within the other indicated clusters (left). P12 represents merged data of 12 patients with CRC. Bar plots showing the fraction of shared clonotypes of CD8+ TEM with other clusters within the CD8+ TEM. b, pSTARTRAC-tran indices of CD8+ TCM, TEMRA, TRM, IEL and TEX cells for each patient (depicted by dots). *P < 0.05, **P < 0.01, ***P < 0.001, Kruskal–Wallis test. c, Potential developmental trajectory of CD8+ T cells (n = 3,557, excluding MAIT cells) inferred by Monocle2 based on gene expressions. d, Frequency of shared clonotypes in CD8+ TEMRA cells with various TEM cell subsets in each patient (n = 12). e, Statistical analysis of tumour TEM shared TCRs with blood TEMRA and tumour TEX cells based on the number of clonotypes and clonal cells (related to Fig. 1h). ***P < 0.001, two-sided Fisher’s exact test. f, Clonotypes of tumour TEM cells crossing different clusters showing mutually exclusive TCR sharing of tumour TEM cells with blood TEMRA and tumour TEX cells. Each row represents an individual clonotype from one patient. *P < 0.05, **P < 0.01, ***P < 0.001, two-sided Fisher’s exact test (based on the number of clonal cells in each patient). Number of clonal cells analysed in each patient: P1212, n = 30; P1228, n = 27; P0411, n = 11; P0825, n = 10; P1012, n = 7; P0701, n = 9; P0123, n = 9; P0215, n = 17; P0309, n = 9; P0413, n = 2; P1207, n = 7; P0909, n = 2.

Extended Data Fig. 9 Characterization of CD4+ TEMRA and tumour Treg cells by STARTRAC analysis.

a, Violin plots showing normalized expression of cytotoxic related molecules in 12 CD4+ (n = 4,902 cells) and 8 CD8+ (n = 3,628) T cell clusters. Colours denote mean values; width denotes cell densities. b, Venn diagram highlighting common clonotypes (ncell ≥ 2) shared between tumour Treg and other CD4+ T cell clusters. c, Developmental transition of tumour Treg cells, TH17 cells and TH1-like cells with other CD4+ cells quantified by pSTARTRAC-tran indices for each patient (n = 11). d, Representative example of a CRC tumour stained by IHC, with white arrow showing co-expression of CD3, FOXP3 and RORγ (n = 2 patients). Original magnification, ×20.

Extended Data Fig. 10 Comparative analysis of T cells from different cancer indications based on integrated analyses.

a, t-SNE plot of 8,874 single CD8+ T cells from this CRC study (n = 3,632), and previous HCC [9] (n = 1,467) and NSCLC [10] (n = 3,775) studies. Nine CD8+ clusters were generated by sscClust based on the integrated dataset. The CRC-specific IEL cells (CD8_C06-CD160) are highlighted. b, t-SNE plot of 12,635 single CD4+ T cells from this CRC study (n = 4,929), and previous HCC [9] (n = 2,472) and NSCLC [10] (n = 5,234) studies. The CRC-enriched TH17 cells (CD4_C10-IL23R) are highlighted. Each dot represents one single cell coloured by clusters and shaped by tumour types in a and b. c, Composition of different CD8+ T cells in each tumour type by different tissue origins. CD8+ T cell clusters with frequencies below 3% are not labelled. d, Composition of different CD4+ T cells in each tumour type by different tissue origins. CD4+ T cell clusters with frequencies below 3% are not labelled. e, Comparison of the fractions of CD8+ IEL (CD8_C08-CD160) and MAIT (CD8_C09-SLC4A10) cells in tumours from patients with CRC (n = 12), HCC (n = 5) and NSCLC (n = 14). f, Comparison of the fractions of different CD8+ T cells and CD4+ T cells in control tissues from patients with CRC (n = 12), HCC (n = 5) and NSCLC (n = 14). g, Validation of the enrichments of CXCL13+BHLHE40+ TH1-like cells in patients with MSI-H CRC (n = 62) and TH17 cells in patients with MSS CRC (n = 286) in the TCGA COAD and READ cohorts by comparison of the indicated signature gene expression. Centre lines denote the median, top and bottom lines denote the 25th and 75th percentiles. *P < 0.05, **P < 0.01, ***P < 0.001; two-sided Wilcoxon test (e–g).

Extended Data Fig. 11 IGFLR1 expression in activated CD4+ T cells and exhausted CD8+ T cells.

a, Volcano plot showing differentially expressed genes between tumour CXCL13+BHLHE40+ TH1-like T cells (n = 203) and other TH cells in tumours (n = 723; Supplementary Table 10). Adjusted P < 0.01 (two-sided unpaired limma-moderated t-test; Benjamini–Hochberg adjusted P value) and fold change ≥ 2. b, Venn diagram showing the overlap of tumour CD8+ exhaustion-related genes identified in this study (n = 68, Supplementary Table 11) with those from previous melanoma [8] (n = 349), HCC [9] (n = 82) and NSCLC [10] (n = 90) studies. The detailed overlaps of CD8+ exhaustion-related genes in different cancer types are in Supplementary Table 11. P < 2.2 × 10^−16, hypergeometric test. c, CD4+ naive (TN) and memory (TMEM) T cells were gated as CD45RA+CCR7+ and CD45RA−CCR7+/− cells by FACS. d, FACS plots of IGFLR1 expression in activated CD4+ T cells (n = 6 donors, n = 3 independent experiments). e, Quantification of IGFLR1 expression levels from d as a percentage of IGFLR1+ TN or TMEM CD4+ subsets under suboptimal activation conditions (n = 7). Each symbol represents a donor with mean ± s.e.m. shown (e, l). f, Representative FACS plots for HAVCR2 and IFNγ expression levels in CD8+ Tconv (activated by anti-CD3 plus anti-CD28) and TCS cells (in vitro chronically stimulated exhausted CD8+ T cells from corresponding individuals). Numbers in quadrants indicate the percentage of positive cells (n = 5 donors, n = 2 independent experiments). g, Representative histograms of PD-1, HAVCR2 (n = 8 donors, n = 3 independent experiments), CD39 and LAG3 (n = 4 donors, n = 2 independent experiments) expression levels in CD8+ Tconv and TCS cells. h, Quantification of IFNγ levels produced by CD8+ Tconv and TEX cells from g of three donors. i, Representative histogram of IGFLR1 expression levels in CD8+ Tconv and TCS cells. j, Expression levels of IGFLR1 in activated CD8+ Tconv and TCS cells determined by FACS (MFI, mean fluorescent intensity; n = 6 donors, n = 4 independent experiments). k, Representative histograms of HAVCR2 expression in TCS cells subjected to re-stimulation with anti-CD3 alone (control) or together with recombinant human IGFL3 as well as indicated antibodies for 2 days (n = 5 donors, n = 3 independent experiments). l, Quantification of HAVCR2 levels from k. Two-sided paired Student’s t-test (e, j and l).
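The hypergeometric test in panel b asks how likely an overlap at least as large as the observed one would be if two gene lists were drawn independently from a common background. The sketch below shows the calculation in Python. The background size and overlap count in the example are hypothetical placeholders, because the caption reports only the list sizes and the resulting P value.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: number of background genes
    successes:  size of the first gene list
    draws:      size of the second gene list
    k:          observed overlap between the two lists
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


# Hypothetical inputs: 20,000 background genes, lists of 68 and 82 genes,
# and an assumed overlap of 20 genes (none of these come from the source
# except the two list sizes).
p_value = hypergeom_upper_tail(20, 20000, 68, 82)
```

With lists this small relative to the background, even a modest overlap yields an extremely small tail probability, which is consistent with the very low P value reported in the caption.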

Supplementary information

Reporting Summary

Supplementary Tables

This file contains Supplementary Tables 1-11 and a Supplementary Table Guide.

Cite this article

Zhang, L., Yu, X., Zheng, L. et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564, 268–272 (2018).

  • Treg Cells
  • MAIT Cells
  • Clonotypes
  • TEMRA Cells
  • Mucosal-associated Invariant T (MAIT)
