Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening


CD4+ T cells are critical to fighting pathogens, but a comprehensive analysis of human T-cell specificities is hindered by the diversity of HLA alleles (>20,000) and the complexity of many pathogen genomes. We previously described GLIPH, an algorithm to cluster T-cell receptors (TCRs) that recognize the same epitope and to predict their HLA restriction, but this method loses efficiency and accuracy when >10,000 TCRs are analyzed. Here we describe an improved algorithm, GLIPH2, that can process millions of TCR sequences. We used GLIPH2 to analyze 19,044 unique TCRβ sequences from 58 individuals latently infected with Mycobacterium tuberculosis (Mtb) and to group them according to their specificity. To identify the epitopes targeted by clusters of Mtb-specific T cells, we carried out a screen of 3,724 distinct proteins covering 95% of Mtb protein-coding genes using artificial antigen-presenting cells (aAPCs) and reporter T cells. We found that at least five PPE (Pro-Pro-Glu) proteins are targets for T-cell recognition in Mtb.
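To make the idea of grouping TCRs by sequence concrete, here is a minimal toy sketch. It is not the GLIPH2 algorithm, whose clustering criteria are not given in this abstract. The similarity rule used (CDR3 strings of equal length differing at no more than one position), the function names, and the example sequences are all assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_cdr3(seqs, max_mismatch=1):
    """Single-linkage clusters of CDR3 strings that share a length and differ
    at no more than `max_mismatch` positions (a toy criterion, assumed here)."""
    seqs = sorted(set(seqs))
    parent = {s: s for s in seqs}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Only sequences of the same length are compared.
    by_len = defaultdict(list)
    for s in seqs:
        by_len[len(s)].append(s)

    for group in by_len.values():
        for a, b in combinations(group, 2):
            if hamming(a, b) <= max_mismatch:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for s in seqs:
        clusters[find(s)].append(s)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical CDR3-like strings, invented for this example only.
toy = [
    "CASSLGQAYEQYF",
    "CASSLGQSYEQYF",
    "CASSLGQTYEQYF",
    "CASSPDRGNTEAFF",
    "CASSIRSSYEQYF",
]
clusters = cluster_cdr3(toy)
for members in clusters:
    print(members)
```

The all-pairs comparison in this sketch grows quadratically with the number of sequences, so it is only suitable for small inputs; it says nothing about how GLIPH2 itself reaches millions of sequences.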

Fig. 1: The workflow of Mtb-specific T-cell repertoire and GLIPH2 analysis.
Fig. 2: A reporter system to efficiently screen protein antigen.
Fig. 3: Antigen discovery using proteome screening.
Fig. 4: Antigen discovery for T-cell receptor specificity group III.
Fig. 5: Antigen discovery for T-cell receptor specificity group I.
Fig. 6: The discrepancy between peptide and protein stimulation.

Data availability

The data supporting the findings of this study are available within the paper and in its Supplementary Information files.

Code availability

Two compiled standalone versions of GLIPH2 (executables for macOS ≥ 10.14.14 and for Linux server CentOS 7) are provided as Supplementary Code. Also, a web tool for GLIPH2 analysis is available at



We would like to thank the Stanford Human Immune Monitoring Center for their high-throughput sequencing support for this project, M. Mindrinos and co-workers at Sirona Genomics for the HLA typing, S. Xue (Department of Immunology, University College London) for providing the Jurkat 76 T-cell line, J. Li for providing HLA-typed PBMCs, L. Chen and S. Chiou for valuable discussions regarding GLIPH2 optimization, H. Mahomed, W. Hanekom and members of the Adolescent Cohort Study (ACS) group for enrolment and follow-up of the Mtb-infected adolescents, R. DiFazio for help making the schematic overview and Y. Chien for constructive criticism of the manuscript, and J. Pavlovitch-Bedzyk for proofreading. This work was supported by the Bill and Melinda Gates Foundation (grant OPP1113682) and the Howard Hughes Medical Institute.

Author information




H.H., C.W. and M.M.D. conceptualized the study. H.H. performed the experiments with assistance from F.R. C.W. authored the codebase, upgraded the algorithm and performed its benchmark. T.J.S. provided PBMCs from Mtb-infected adolescents. F.R. provided bulk sequencing and bulk TCR analysis. H.H., C.W. and F.R. performed the analysis. H.H. and M.M.D. wrote the manuscript with input from all authors. M.M.D. supervised the study.

Corresponding author

Correspondence to Mark M. Davis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information

Supplementary Figures 1–7

Reporting Summary

Supplementary Table 1

Mtb-specific TCR sequences and summary

Supplementary Table 2

TCR sequences from VDJdb

Supplementary Table 3

TCR specificity groups from GLIPH2 analysis

Supplementary Table 4

Gene list of the whole Mtb ORF clone set

Supplementary Code

Two compiled standalone versions of GLIPH2

About this article

Cite this article

Huang, H., Wang, C., Rubelt, F. et al. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat Biotechnol 38, 1194–1202 (2020).
