Systematic identification of minor histocompatibility antigens predicts outcomes of allogeneic hematopoietic cell transplantation

Abstract

T cell alloreactivity against minor histocompatibility antigens (mHAgs)—polymorphic peptides resulting from donor–recipient (D–R) disparity at sites of genetic polymorphisms—is at the core of the therapeutic effect of allogeneic hematopoietic cell transplantation (allo-HCT). Despite the crucial role of mHAgs in graft-versus-leukemia (GvL) and graft-versus-host disease (GvHD) reactions, it remains challenging to consistently link patient-specific mHAg repertoires to clinical outcomes. Here we devise an analytic framework to systematically identify mHAgs, including their detection on HLA class I ligandomes and functional verification of their immunogenicity. The method relies on the integration of polymorphism detection by whole-exome sequencing of germline DNA from D–R pairs with organ-specific transcriptional- and proteome-level expression. Application of this pipeline to 220 HLA-matched allo-HCT D–R pairs demonstrated that total and organ-specific mHAg load could independently predict the occurrence of acute GvHD and chronic pulmonary GvHD, respectively, and defined promising GvL targets, confirmed in a validation cohort of 58 D–R pairs, for the prevention or treatment of post-transplant disease recurrence.
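
As a rough illustration of what "total" and "organ-specific" mHAg load mean here, the sketch below counts predicted mHAg peptides for one D–R pair, overall and restricted to genes expressed in each target organ. The input format, the function name `mhag_load` and the toy gene and peptide names are assumptions made for this example; they are not taken from the authors' pipeline.

```python
def mhag_load(predicted_mhags, organ_expressed_genes):
    """Count total and organ-specific mHAg load for one D-R pair.

    predicted_mhags: list of (peptide, source_gene) tuples (hypothetical format).
    organ_expressed_genes: dict mapping organ name -> set of genes expressed there.
    """
    # Total load: number of distinct predicted peptides.
    total = len({peptide for peptide, _ in predicted_mhags})
    # Organ-specific load: distinct peptides whose source gene is expressed in the organ.
    per_organ = {}
    for organ, genes in organ_expressed_genes.items():
        per_organ[organ] = len(
            {peptide for peptide, gene in predicted_mhags if gene in genes}
        )
    return total, per_organ


# Toy input with made-up peptides and genes.
mhags = [("PEPTIDEA", "GENE1"), ("PEPTIDEB", "GENE2"), ("PEPTIDEC", "GENE1")]
atlas = {"lung": {"GENE1"}, "liver": {"GENE2", "GENE3"}}
total, per_organ = mhag_load(mhags, atlas)
print(total, per_organ)
```

In an outcome analysis, counts of this kind would serve as covariates; the statistical modeling itself is outside the scope of this sketch.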

Fig. 1: Building a pipeline for systematic mHAg discovery.
Fig. 2: Antigenicity and immunogenicity of predicted mHAgs.
Fig. 3: mHAgs shape GvHD outcomes.
Fig. 4: mHAgs in acute liver GvHD.
Fig. 5: Predicted GvL mHAgs as targets for leukemia immunotherapy.

Data availability

WES and RNA-seq data from the training HCT, DFCI-MRD and HP-MRD cohorts are available through the dbGaP portal (accession phs003394.v1.p1). The original mass spectra, PSMs and protein sequence databases used for searches have been deposited in the public proteomics repository MassIVE (https://massive.ucsd.edu) and are accessible at ftp://massive.ucsd.edu/v08/MSV000095025/. Original mass spectrometry data for the previously published B721.221 monoallelic immunopeptidomes are accessible at ftp://massive.ucsd.edu/MSV000080527. The GI GvHD scRNA-seq dataset is available from the corresponding author upon reasonable request. All external datasets used in this study (identified by reference number) are summarized in Supplementary Fig. 18, and their accession numbers are as follows: GSE164403 (ref. 29); GSE124395 (ref. 30); GSE115469 (ref. 31); EGAS00001002649 (ref. 32); GSE123904 (ref. 33); GSE116222 (ref. 34); GSE125970 (ref. 35); GSE164241 (ref. 36); GSE164403 (ref. 37); GSE116256 (ref. 39); dbGaP study ID 30641 (ref. 40), accession ID phs001657.v1.p1; E-MTAB-8581 (ref. 62), accessed online through https://developmentcellatlas.ncl.ac.uk; www.proteinatlas.org/about/download (ref. 43); GSE109093 (ref. 44); GSE113046 (ref. 45). GTEx data were accessed from https://gtexportal.org/home/, IEDB data from https://www.iedb.org/database_export_v3.php and 1000 Genomes Project data from https://www.internationalgenome.org/data. Additional databases used include Interferome (www.interferome.org) and MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/). Source data are provided with this paper.

Code availability

The source code and documentation for the mHAg pipeline are available at https://github.com/nidhih2/mhags, with archived releases at https://doi.org/10.5281/zenodo.11658572 (ref. 97) for autosomal mHAg prediction and https://doi.org/10.5281/zenodo.11658599 (ref. 98) for Y mHAg prediction.

References

  1. Copelan, E. A. Hematopoietic stem-cell transplantation. N. Engl. J. Med. 354, 1813–1826 (2006).

  2. Griffioen, M., van Bergen, C. A. & Falkenburg, J. H. Autosomal minor histocompatibility antigens: how genetic variants create diversity in immune targets. Front. Immunol. 7, 100 (2016).

  3. Mutis, T., Xagara, A. & Spaapen, R. M. The connection between minor h antigens and neoantigens and the missing link in their prediction. Front. Immunol. 11, 1162 (2020).

  4. Zeiser, R. & Blazar, B. R. Acute graft-versus-host disease—biologic process, prevention, and therapy. N. Engl. J. Med. 377, 2167–2179 (2017).

  5. Zeiser, R. & Blazar, B. R. Pathophysiology of chronic graft-versus-host disease and therapeutic targets. N. Engl. J. Med. 377, 2565–2579 (2017).

  6. Aljurf, M. et al. Worldwide network for blood & marrow transplantation (WBMT) special article, challenges facing emerging alternate donor registries. Bone Marrow Transplant. 54, 1179–1188 (2019).

  7. Cieri, N., Maurer, K. & Wu, C. J. 60 years young: the evolving role of allogeneic hematopoietic stem cell transplantation in cancer immunotherapy. Cancer Res. 81, 4373–4384 (2021).

  8. Bolon, Y., Atshan, R., Allbee-Johnson, M., Estrada-Merly, N. & Lee, S. Current use and outcome of hematopoietic stem cell transplantation: CIBMTR summary slides. CIBMTR https://cibmtr.org/CIBMTR/Resources/Summary-Slides-Reports (2022).

  9. Spellman, S. R. Hematology 2022—what is complete HLA match in 2022? Hematology Am. Soc. Hematol. Educ. Program 2022, 83–89 (2022).

  10. Goulmy, E., Gratama, J. W., Blokland, E., Zwaan, F. E. & van Rood, J. J. A minor transplantation antigen detected by MHC-restricted cytotoxic T lymphocytes during graft-versus-host disease. Nature 302, 159–161 (1983).

  11. Wang, W. et al. Human H–Y: a male-specific histocompatibility antigen derived from the SMCY protein. Science 269, 1588–1590 (1995).

  12. Den Haan, J. M. et al. Identification of a graft versus host disease-associated human minor histocompatibility antigen. Science 268, 1476–1480 (1995).

  13. Goulmy, E., Termijtelen, A., Bradley, B. A. & van Rood, J. J. Y-antigen killing by T cells of women is restricted by HLA. Nature 266, 544–545 (1977).

  14. Goulmy, E. et al. Mismatches of minor histocompatibility antigens between HLA-identical donors and recipients and the development of graft-versus-host disease after bone marrow transplantation. N. Engl. J. Med. 334, 281–285 (1996).

  15. Spierings, E. et al. Multicenter analyses demonstrate significant clinical effects of minor histocompatibility antigens on GvHD and GvL after HLA-matched related and unrelated hematopoietic stem cell transplantation. Biol. Blood Marrow Transplant. 19, 1244–1253 (2013).

  16. Grumet, F. C. et al. CD31 mismatching affects marrow transplantation outcome. Biol. Blood Marrow Transplant. 7, 503–512 (2001).

  17. McCarroll, S. A. et al. Donor–recipient mismatch for common gene deletion polymorphisms in graft-versus-host disease. Nat. Genet. 41, 1341–1344 (2009).

  18. Spellman, S. et al. Effects of mismatching for minor histocompatibility antigens on clinical outcomes in HLA-matched, unrelated hematopoietic stem cell transplants. Biol. Blood Marrow Transplant. 15, 856–863 (2009).

  19. Kogler, G. et al. Recipient cytokine genotypes for TNF-α and IL-10 and the minor histocompatibility antigens HY and CD31 codon 125 are not associated with occurrence or severity of acute GVHD in unrelated cord blood transplantation: a retrospective analysis. Transplantation 74, 1167–1175 (2002).

  20. Martin, P. J. et al. A model of minor histocompatibility antigens in allogeneic hematopoietic cell transplantation. Front. Immunol. 12, 782152 (2021).

  21. Story, C. M. et al. Genetics of HLA peptide presentation and impact on outcomes in HLA-matched allogeneic hematopoietic cell transplantation. Transplant. Cell Ther. 27, 591–599 (2021).

  22. Warren, E. H. et al. Effect of MHC and non-MHC donor/recipient genetic disparity on the outcome of allogeneic HCT. Blood 120, 2796–2806 (2012).

  23. Bykova, N. A., Malko, D. B. & Efimov, G. A. In silico analysis of the minor histocompatibility antigen landscape based on the 1000 Genomes project. Front. Immunol. 9, 1819 (2018).

  24. Jadi, O. et al. Associations of minor histocompatibility antigens with outcomes following allogeneic hematopoietic cell transplantation. Am. J. Hematol. 98, 940–950 (2023).

  25. Lang, F., Schrors, B., Lower, M., Tureci, O. & Sahin, U. Identification of neoantigens for individualized therapeutic cancer vaccines. Nat. Rev. Drug Discov. 21, 261–282 (2022).

  26. Fotakis, G., Trajanoski, Z. & Rieder, D. Computational cancer neoantigen prediction: current status and recent advances. Immunooncol. Technol. 12, 100052 (2021).

  27. Peters, B., Nielsen, M. & Sette, A. T cell epitope predictions. Annu. Rev. Immunol. 38, 123–145 (2020).

  28. Sarkizova, S. et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. Nat. Biotechnol. 38, 199–209 (2020).

  29. Reynolds, G. et al. Developmental cell programs are co-opted in inflammatory skin disease. Science 371, eaba6500 (2021).

  30. Aizarani, N. et al. A human liver cell atlas reveals heterogeneity and epithelial progenitors. Nature 572, 199–204 (2019).

  31. MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).

  32. Vieira Braga, F. A. et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat. Med. 25, 1153–1163 (2019).

  33. Laughney, A. M. et al. Regenerative lineages and immune-mediated pruning in lung cancer metastasis. Nat. Med. 26, 259–269 (2020).

  34. Parikh, K. et al. Colonic epithelial cell diversity in health and inflammatory bowel disease. Nature 567, 49–55 (2019).

  35. Wang, Y. et al. Single-cell transcriptome analysis reveals differential nutrient absorption functions in human intestine. J. Exp. Med. 217, e20191130 (2020).

  36. Williams, D. W. et al. Human oral mucosa cell atlas reveals a stromal-neutrophil axis regulating tissue immunity. Cell 184, 4090–4104 (2021).

  37. Bannier-Hélaouët, M. et al. Exploring the human lacrimal gland using organoids and single-cell sequencing. Cell Stem Cell 28, 1221–1232 (2021).

  38. Kanate, A. S. et al. Indications for hematopoietic cell transplantation and immune effector cell therapy: guidelines from the American Society for Transplantation and Cellular Therapy. Biol. Blood Marrow Transplant. 26, 1247–1256 (2020).

  39. Van Galen, P. et al. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell 176, 1265–1281 (2019).

  40. Tyner, J. W. et al. Functional genomic landscape of acute myeloid leukaemia. Nature 562, 526–531 (2018).

  41. Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).

  42. Jiang, L. et al. A quantitative proteome map of the human body. Cell 183, 269–283 (2020).

  43. Uhlen, M. et al. A genome-wide transcriptomic analysis of protein-coding genes in human blood cells. Science 366, eaax9198 (2019).

  44. Cesana, M. et al. A CLK3-HMGA2 alternative splicing axis impacts human hematopoietic stem cell molecular identity throughout development. Cell Stem Cell 22, 575–588 (2018).

  45. Drissen, R., Thongjuea, S., Theilgaard-Monch, K. & Nerlov, C. Identification of two distinct pathways of human myelopoiesis. Sci. Immunol. 4, eaau7148 (2019).

  46. Kim, H. T. et al. Donor and recipient sex in allogeneic stem cell transplantation: what really matters. Haematologica 101, 1260–1266 (2016).

  47. Ofran, Y. et al. Diverse patterns of T-cell response against multiple newly identified human Y chromosome-encoded minor histocompatibility epitopes. Clin. Cancer Res. 16, 1642–1651 (2010).

  48. Miklos, D. B. et al. Antibody response to DBY minor histocompatibility antigen is induced after allogeneic stem cell transplantation and in healthy female donors. Blood 103, 353–359 (2004).

  49. Feng, X., Hui, K. M., Younes, H. M. & Brickner, A. G. Targeting minor histocompatibility antigens in graft versus tumor or graft versus leukemia responses. Trends Immunol. 29, 624–632 (2008).

  50. Bachireddy, P. et al. Mapping the evolution of T cell states during response and resistance to adoptive cellular therapy. Cell Rep. 37, 109992 (2021).

  51. Bachireddy, P. et al. Distinct evolutionary paths in chronic lymphocytic leukemia during resistance to the graft-versus-leukemia effect. Sci. Transl. Med. 12, eabb7661 (2020).

  52. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).

  53. Torikai, H. et al. A novel HLA-A*3303-restricted minor histocompatibility antigen encoded by an unconventional open reading frame of human TMSB4Y gene. J. Immunol. 173, 7046–7054 (2004).

  54. Ouspenskaia, T. et al. Unannotated proteins expand the MHC-I-restricted immunopeptidome in cancer. Nat. Biotechnol. 40, 209–217 (2022).

  55. Andreatta, M. et al. MS-Rescue: a computational pipeline to increase the quality and yield of immunopeptidomics experiments. Proteomics 19, e1800357 (2019).

  56. Lee, P. C. et al. Reversal of viral and epigenetic HLA class I repression in Merkel cell carcinoma. J. Clin. Invest. 132, e151666 (2022).

  57. Oliveira, G. et al. Phenotype, specificity and avidity of antitumour CD8+ T cells in melanoma. Nature 596, 119–125 (2021).

  58. Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2019).

  59. Chowell, D. et al. TCR contact residue hydrophobicity is a hallmark of immunogenic CD8+ T cell epitopes. Proc. Natl Acad. Sci. USA 112, E1754–E1762 (2015).

  60. Schaefer, M. R. et al. A novel trafficking signal within the HLA-C cytoplasmic tail allows regulated expression upon differentiation of macrophages. J. Immunol. 180, 7804–7817 (2008).

  61. Gabrielsen, I. S. M. et al. Transcriptomes of antigen presenting cells in human thymus. PLoS ONE 14, e0218858 (2019).

  62. Park, J. E. et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 367, eaay3224 (2020).

  63. Holtan, S. G. et al. Composite end point of graft-versus-host disease-free, relapse-free survival after allogeneic hematopoietic cell transplantation. Blood 125, 1333–1338 (2015).

  64. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  65. Lin, M. J. et al. Cancer vaccines: the next immunotherapy frontier. Nat. Cancer 3, 911–926 (2022).

  66. Rojas, L. A. et al. Personalized RNA neoantigen vaccines stimulate T cells in pancreatic cancer. Nature 618, 144–150 (2023).

  67. Lansford, J. L. et al. Computational modeling and confirmation of leukemia-associated minor histocompatibility antigens. Blood Adv. 2, 2052–2062 (2018).

  68. Olsen, K. S. et al. Shared graft-versus-leukemia minor histocompatibility antigens in DISCOVeRY-BMT. Blood Adv. 7, 1635–1649 (2023).

  69. Parkhurst, M. R. et al. Unique neoantigens arise from somatic mutations in patients with gastrointestinal cancers. Cancer Discov. 9, 1022–1035 (2019).

  70. Wolff, D. et al. National Institutes of Health Consensus Development Project on criteria for clinical trials in chronic graft-versus-host disease: IV. The 2020 highly morbid forms report. Transplant. Cell Ther. 27, 817–835 (2021).

  71. Lybaert, L. et al. Neoantigen-directed therapeutics in the clinic: where are we? Trends Cancer 9, 503–519 (2023).

  72. Bacigalupo, A. & Jones, R. PTCy: the ‘new’ standard for GVHD prophylaxis. Blood Rev. 62, 101096 (2023).

  73. Murata, M., Warren, E. H. & Riddell, S. R. A human minor histocompatibility antigen resulting from differential expression due to a gene deletion. J. Exp. Med. 197, 1279–1289 (2003).

  74. Broen, K. et al. A polymorphism in the splice donor site of ZNF419 results in the novel renal cell carcinoma-associated minor histocompatibility antigen ZAPHIR. PLoS ONE 6, e21699 (2011).

  75. Griffioen, M. et al. Identification of 4 novel HLA-B*40:01 restricted minor histocompatibility antigens and their potential as targets for graft-versus-leukemia reactivity. Haematologica 97, 1196–1204 (2012).

  76. Spierings, E. et al. Identification of HLA class II-restricted H–Y-specific T-helper epitope evoking CD4+ T-helper cells in H–Y-mismatched transplantation. Lancet 362, 610–615 (2003).

  77. Coghill, J. M. et al. Effector CD4+ T cells, the cytokines they generate, and GVHD: something old and something new. Blood 117, 3268–3276 (2011).

  78. Jones, S. C., Murphy, G. F., Friedman, T. M. & Korngold, R. Importance of minor histocompatibility antigen expression by nonhematopoietic tissues in a CD4+ T cell-mediated graft-versus-host disease model. J. Clin. Invest. 112, 1880–1886 (2003).

  79. Chaves, F. A., Lee, A. H., Nayak, J. L., Richards, K. A. & Sant, A. J. The utility and limitations of current web-available algorithms to predict peptides recognized by CD4 T cells in response to pathogen infection. J. Immunol. 188, 4235–4248 (2012).

  80. Dohner, H. et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 129, 424–447 (2017).

  81. Greenberg, P. L. et al. Revised international prognostic scoring system for myelodysplastic syndromes. Blood 120, 2454–2465 (2012).

  82. Przepiorka, D. et al. 1994 consensus conference on acute GVHD grading. Bone Marrow Transplant. 15, 825–828 (1995).

  83. Glucksberg, H. et al. Clinical manifestations of graft-versus-host disease in human recipients of marrow from HLA-matched sibling donors. Transplantation 18, 295–304 (1974).

  84. Pavletic, S. Z. et al. NCI first international workshop on the biology, prevention, and treatment of relapse after allogeneic hematopoietic stem cell transplantation: report from the committee on the epidemiology and natural history of relapse following allogeneic cell transplantation. Biol. Blood Marrow Transplant. 16, 871–890 (2010).

  85. Parry, E. M. et al. Evolutionary history of transformation from chronic lymphocytic leukemia to Richter syndrome. Nat. Med. 29, 158–169 (2023).

  86. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).

  87. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).

  88. Kim, H. et al. Development of a validated interferon score using NanoString technology. J. Interferon Cytokine Res. 38, 171–185 (2018).

  89. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).

  90. Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018).

  91. Quentmeier, H. et al. The LL-100 panel: 100 cell lines for blood cancer studies. Sci Rep. 9, 8218 (2019).

  92. Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).

  93. Klaeger, S. et al. Optimized liquid and gas phase fractionation increases HLA-peptidome coverage for primary cell and tissue samples. Mol. Cell. Proteomics 20, 100133 (2021).

  94. Cui, K. H., Warnes, G. M., Jeffrey, R. & Matthews, C. D. Sex determination of preimplantation embryos by human testis-determining-gene amplification. Lancet 343, 79–82 (1994).

  95. Bui, H. H. et al. Predicting population coverage of T-cell epitope-based diagnostics and vaccines. BMC Bioinformatics 7, 153 (2006).

  96. Samarajiwa, S. A., Forster, S., Auchettl, K. & Hertzog, P. J. INTERFEROME: the database of interferon regulated genes. Nucleic Acids Res. 37, D852–D857 (2009).

  97. Hookeri, N. nidhih2/mhags: v1.0.0 (v1.0.0). Zenodo https://doi.org/10.5281/zenodo.11658572 (2024).

  98. Hookeri, N. nidhih2/mhags-fm: v1.0.0 (v1.0.0). Zenodo https://doi.org/10.5281/zenodo.11658599 (2024).

Acknowledgements

We are grateful to S. Pollock, H. Lyon and F. Dao for expert assistance with sample collection and management at the Broad Institute; K. Rizza and the OTTR team (DFCI Department of Cellular Therapy) for assistance with clinical databases; A. Gusev for fruitful discussion on correlative analysis design; D. Hearsey and the members of the DFCI Ted and Eileen Pasquarello Tissue Bank in Hematologic Malignancies for provision of samples; the patients who generously consented to the research use of these samples; and all members of the Wu Laboratory for productive discussions. This research was supported by grants from the National Institutes of Health (NIH/NCI-P01 CA229092 and NIH/NHLBI P01 HL158505 to C.J.W. and NIH R01 HL157174 to D.B.K.) and from the Leukemia & Lymphoma Society (SCOR-22937-22 to C.J.W. and R.J.S.). Statistical analysis was supported by the DF/HCC Cancer Center Support Grant 5P30 CA006516. Mass spectrometry-based immunopeptidomics data acquisition and analysis were supported in part by NIH P01CA206978 (to S.A.C.), NCI Clinical Proteomic Tumor Analysis Consortium program U24CA270823 and U01CA271402 (to S.A.C.), as well as a grant from the Dr. Miriam and Sheldon G. Adelson Medical Research Foundation (to S.A.C.). N.C. was supported by the 2020 American Association for Cancer Research-Incyte Immuno-oncology Research Fellowship (20-40-46-CIER) and the Helen Gurley Brown Foundation. H.J. was supported by the NCI CaNCURE (grant 5R25CA174650). L.P. is a scholar of the American Society of Hematology (ASH), is a participant in the BIH Charité Digital Clinician Scientist Program funded by the DFG, the Charité – Universitätsmedizin Berlin and the Berlin Institute of Health at Charité (BIH) and is supported by the Max-Eder program of the German Cancer Aid (Deutsche Krebshilfe), by the Else Kröner-Fresenius-Stiftung (2023_EKEA.102) and the DKMS John Hansen Research Grant. D.A.B. acknowledges support from the Department of Defense Early Career Investigator grant (KCRP AKCI-ECI and W81XWH-20-1-0882), the Louis Goodman and Alfred Gilman Yale Scholar Fund and the Yale Cancer Center (supported by NIH/NCI research grant P30CA016359). G.O. was supported by the Claudia Adams Barr Program for Innovative Cancer Research and by DF/HCC Kidney Cancer SPORE P50 CA101942. S.L. is supported by the NCI Research Specialist Award (R50CA251956). L.S.K. is supported by the NIH under grants NIH/NIAID U19 Al1051731, NIH/NHLBI R01 HL095791, NIH/NHLBI P01 HL158504, NIH/NHLBI P01 HL158505 and NIH/NIAID U19 AI174967. Visual elements in Fig. 1 were created with BioRender.com.

Author information

Contributions

N.C. and C.J.W. conceived the project and directed the overall study. N.C. designed and performed the experimental and data analyses together with H.J., K.P. and L.P. K.S. developed the computational pipeline under the supervision of N.C., C.J.W., C.S. and G.G. N.H. analyzed public single-cell datasets, docked the pipeline on Terra and applied it to the DFCI-MRD patient cohort. L.S.K. provided the allo-HCT GI single-cell libraries. Y.S. analyzed the GvHD single-cell dataset. R.K.-R. and Y.S. ran the pipeline on the HP-MRD cohort. S.L. and K.J.L. assisted with NGS preparation and analysis. N.C. curated the clinical annotation of the patient cohort with the help of K.A.K., H.T.K. and V.T.H. J.S. and W.J.L. provided the DFCI-MRD cohort DNA samples. P.D.-F., V.G.-G.S. and C.M.-C. provided the samples and clinical annotation for the HP-MRD cohort. L.L. performed the 1000 Genomes simulation under the guidance of N.C., C.S. and G.G. J.K. and N.C. designed and performed the statistical analyses under the supervision of D.N. C.F. and S.S. curated AML cell line genomic and transcriptomic analysis. G.M.H., S.K., J.A., S.S., G.O., D.A.B., D.B.K., K.R.C. and S.A.C. generated and analyzed mass spectrometry results. G.O., R.J.S., J.R. and V.T.H. contributed to data discussion and interpretation. N.C. and C.J.W. wrote the manuscript. All authors discussed the results and read and approved the manuscript.

Corresponding author

Correspondence to Catherine J. Wu.

Ethics declarations

Competing interests

C.J.W. holds equity in BioNTech and receives research support from Pharmacyclics. D.B.K. is a scientific advisor for Immunitrack and Breakbio and owns equity in Affimed N.V., Agenus, Armata Pharmaceuticals, Breakbio, BioMarin Pharmaceutical, Celldex Therapeutics, Editas Medicine, Gilead Sciences, Immunitybio, IMV, Lexicon Pharmaceuticals and Neoleukin Therapeutics. BeiGene supported unrelated SARS-COV-2 research at Translational Immunogenomics Lab. R.J.S. consults or is on the advisory board of Kiadis, Juno Therapeutics, Gilead, Jasper, Jazz Pharmaceuticals, Precision Biosciences, Rheo Therapeutics, Takeda and NMDP/Be the Match. J.R. receives research funding from Kite/Gilead, Novartis and Oncternal and consults or is on advisory boards for Clade Therapeutics, Garuda Therapeutics, LifeVault Bio, Smart Immune and TriArm Bio. V.T.H. receives funding from Jazz Pharmaceuticals and consults or is on advisory boards for Jazz Pharmaceuticals, Janssen, Alexion Pharmaceuticals and Omeros. W.J.L. consults or is on the advisory board of CareDx, One Lambda and Thermo Fisher Scientific and receives royalty payments from Thermo Fisher Scientific. K.J.L. holds equity in Standard BioTools and is on the scientific advisory board for MBQ Pharma. S.A.C. is a member of the scientific advisory boards of PTM BioLabs, Kymera, Seer and PrognomIQ and holds equity in the latter three. D.A.B. reports honoraria from LM Education/Exchange Services; advisory board fees from Exelixis and AVEO; personal fees from Schlesinger Associates, Cancer Expert Now, Adnovate Strategies, MDedge, CancerNetwork, Catenion, OncLive, Cello Health BioConsulting, PWW Consulting, Haymarket Medical Network, Aptitude Health, ASCO Post/Harborside, Targeted Oncology, AbbVie, DLA Piper and Elephas; equity in CurIOS Therapeutics, Elephas and Fortress Biotech (subsidiary); research support from Exelixis (US) and AstraZeneca (UK), outside of the submitted work. G.O. is a consultant for Bicycle Therapeutics. L.S.K. is on the scientific advisory board for Mammoth Biosciences and HiFiBio; received research funding from Magenta Therapeutics, Tessera Therapeutics, Novartis, EMD Serono, Gilead Pharmaceuticals and Regeneron Pharmaceuticals; consulting fees from Vertex; grants/personal fees from Bristol Myers Squibb and royalties/partial funding for the current study from Bristol Myers Squibb. L.S.K.’s conflict of interest with Bristol Myers Squibb is managed under an agreement with Harvard Medical School. D.N. holds equity in Madrigal Pharmaceutics. G.G. receives research funds from Pharmacyclics, Bayer, Genentech, Ultima Genomics and IBM; is an inventor of patent applications related to MSMuTect, MSMutSig, MSIDetect, POLYSOLVER, SignatureAnalyzer-GPU and MinimuMM-seq; is a founder and consultant and holds privately held equity in Scorpion Therapeutics and is a founder and holds privately held equity in PreDICTA Biosciences. The other authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks Marcel van den Brink and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Pipeline details.

a, Detailed workflow for the prediction of autosomal (left) and Y-encoded (right) mHAgs. b, Pipeline outputs for the training AML cohort composed of 11 D–R pairs (see Supplementary Table 3). Shown are the median number (and interquartile range) of hits for each step of the pipeline for matched-related donor (MRD, blue; n = 2) and unrelated donor (URD, orange; n = 9) transplants. The number of discordant variants between donors and recipients was, as expected, higher in URD than in MRD transplants. Pie charts (right) show the distribution of discordant variant types in MRD (top) and URD (bottom) D–R pairs. SNP, single-nucleotide polymorphism; DEL, deletion; INS, insertion; GvHD, graft-versus-host disease; GvL, graft-versus-leukemia.
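
The discordance step and the SNP/DEL/INS breakdown in panel b can be sketched as follows. This is a simplified illustration, not the authors' code: the genotype format (alternate-allele counts keyed by variant), the function names and the toy variants are all assumptions, and only alternate alleles carried by the recipient but absent from the donor are considered. A fuller treatment would also handle reference alleles that the donor lacks.

```python
from collections import Counter


def variant_type(ref, alt):
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) > len(alt):
        return "DEL"
    if len(alt) > len(ref):
        return "INS"
    return "OTHER"  # e.g. multi-nucleotide substitutions


def gvh_discordant(donor, recipient):
    """Return variants whose alternate allele the recipient carries and the donor lacks.

    donor, recipient: dict mapping (chrom, pos, ref, alt) -> alt-allele count (0, 1 or 2).
    This input format is hypothetical.
    """
    hits = [
        key
        for key, count in recipient.items()
        if count > 0 and donor.get(key, 0) == 0
    ]
    return sorted(hits)


# Toy genotypes with made-up coordinates.
donor = {("1", 100, "A", "G"): 1, ("2", 200, "AT", "A"): 0}
recipient = {
    ("1", 100, "A", "G"): 1,    # shared with donor, not discordant
    ("2", 200, "AT", "A"): 1,   # deletion absent in donor
    ("3", 300, "C", "CAG"): 2,  # insertion absent in donor
    ("4", 400, "T", "C"): 1,    # SNP absent in donor
}
hits = gvh_discordant(donor, recipient)
type_counts = Counter(variant_type(ref, alt) for _, _, ref, alt in hits)
print(hits)
print(dict(type_counts))
```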

Extended Data Fig. 2 Single-cell data analysis to define the GvHD filter gene set.

a, Summary of the single-cell datasets used, covering the following GvHD target organs: oral mucosa, lacrimal gland (eye), skin, liver, colon (GI) and lung. For each organ analyzed, the number of datasets and their accession numbers are shown, together with the total number of cells after standard single-cell data QC analysis, as well as after removal of immune cells. The ID of the second lung dataset has been abbreviated for ease of visualization, but the full identifier is reported in the legend of c. b, UMAP plots showing clustering of the resident cell types for each organ. Note that for ‘eye’, the dataset includes both primary cells and organoids (derived from ductal cells), which were both analyzed. c, Violin plots of lineage-defining markers for each cell type across the different organ datasets. d, UMAP plots showing the relative contribution of individual datasets, for those organs with 2 datasets available. Vasc., vascular; Lymph., lymphatic; KC, keratinocyte; LSEC, liver sinusoidal endothelial cells; CT, crypt top.

Extended Data Fig. 3 Threshold definition for single-cell-based expression atlas.

a, To minimize the drop-out effect common in single-cell RNA-seq data, gene expression was analyzed in a pseudo-bulk fashion for each cluster. To define the threshold for positive expression having the best signal-to-noise ratio, the expression levels of lineage-specific markers such as MLANA (melan-A, expressed only in melanocytes), SFTPB (surfactant protein B, expressed only in the lung) and ALB (albumin, expressed only in the liver) were analyzed, using as control a pan-expressed gene, B2M. The dot plot shows the expression levels for all single-cell clusters (with the tiles next to their names indicating the organ of origin: green for ‘liver’, light blue for ‘lung’, yellow for ‘skin’, orange for ‘GI’, red for ‘oral mucosa’ and navy blue for ‘eye’). b, Comparison of the expression profile of fibroblasts from 2 independent single-cell datasets (oral mucosa and skin). Results of the linear regression analysis (R squared and p value) are reported and show a substantial transcriptional identity between fibroblasts across different anatomical sites and datasets. c, Comparison of the expression profile of fibroblasts derived from single-cell sequencing data versus bulk sequencing available through the GTEx repository. Fibroblasts were chosen as they were the only purified cell type available in both GTEx and single-cell datasets. Results of the linear regression analysis are reported, showing a significant transcriptional similarity across single-cell versus bulk RNA sequencing. Vasc., vascular; Lymph., lymphatic; KC, keratinocyte; LSEC, liver sinusoidal endothelial cells; CT, crypt top; CPM, counts per million; TPM, transcripts per million.
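The pseudo-bulk step described in a can be illustrated with a minimal sketch. It assumes raw counts are summed over all cells of a cluster and then scaled to counts per million; the function names, the input layout and the threshold argument are hypothetical, and the legend does not state the cutoff that was ultimately chosen.

```python
from collections import defaultdict


def pseudobulk_cpm(cell_counts, cluster_labels):
    """Sum raw counts per cluster, then scale each cluster to counts per million.

    cell_counts: list of dicts mapping gene -> raw count, one dict per cell
                 (hypothetical input layout).
    cluster_labels: list of cluster names, one per cell, in the same order.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for counts, label in zip(cell_counts, cluster_labels):
        for gene, n in counts.items():
            totals[label][gene] += n

    cpm = {}
    for label, gene_counts in totals.items():
        library_size = sum(gene_counts.values())
        if library_size == 0:
            cpm[label] = {}
            continue
        cpm[label] = {g: 1e6 * n / library_size for g, n in gene_counts.items()}
    return cpm


def is_expressed(cpm, cluster, gene, threshold):
    """Call a gene positive in a cluster when its pseudo-bulk CPM reaches the threshold.

    The threshold is an assumption of this sketch; a suitable value would be
    tuned on lineage markers (for example ALB) against a pan-expressed control (B2M).
    """
    return cpm.get(cluster, {}).get(gene, 0.0) >= threshold
```

Aggregating before thresholding means that a gene missed in many individual cells through drop-out can still be called positive for the cluster as a whole.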

Source data

Extended Data Fig. 4 GI single-cell RNA-sequencing of allo-HCT patients.

a, Schema depicting the patients included in the allo-HCT dataset and the analytic pipeline for the single-cell RNA-sequencing analysis. Briefly, single-cell RNA-sequencing data were generated from the biopsies of 3 allo-HCT patients undergoing diagnostic colonoscopy for suspected GI GvHD at a median time from transplant of 90 days (range: 22–103). Upon standard processing and QC, viable cells were clustered and manually annotated (see Supplementary Fig. 2). Immune cell clusters were excluded, and remaining resident non-immune cells were merged and harmonized with the healthy subject GI dataset used for the generation of the GvHD filter. CPM, counts per million. b, UMAP showing cluster annotations from the merged Seurat object containing both allo-HCT and healthy subject-derived GI cells (top) and violin plots depicting the lineage-defining markers used for cluster annotation (bottom). c, UMAP depicting the clusters colored based on the dataset of origin. d, Venn diagram showing the number of genes that were present in the allo-HCT GI dataset vs. the GI healthy subject dataset. e, Venn diagram showing the overlap of the genes expressed in the allo-HCT GI dataset vs. the overall GvHD filter. f, Enrichment analysis of interferon-related signatures (ref. 88; MSigDB IFNa and IFNg, ref. 89) in the allo-HCT vs. GI healthy subject datasets (p < 0.0001, 2-tailed Mann–Whitney test).

Source data

Extended Data Fig. 5 GvL filter details.

a, Schematic depicting the generation of the 2 components of the GvL filter, that is, AML and Hematopoietic filters. For the ‘AML filter’, a single-cell-based classifier (ref. 39) was applied to bulk RNA-seq data from the Beat AML cohort (ref. 40), to fully capture the AML transcriptional heterogeneity. For the ‘heme filter’, bulk RNA-seq from 18 purified mature hematopoietic cell types (ref. 43) as well as from hematopoietic stem and progenitor cells (HSPCs; refs. 44,45) were the starting source. From the list of expressed genes (TPM > 2), all those with expression in adult non-hematopoietic tissues per the GTEx RNA and protein repositories were excluded to define a set of 650 genes with preferential expression in AML and/or hematopoietic cells. b, Gender-specificity of the GvL filters: the GTEx filtering step was performed in a gender-specific fashion, as genes expressed in the male reproductive organs are not filtered out if the patient is female, and vice versa genes expressed in the female reproductive organs are maintained if the patient is male. c, Histograms depicting the chromosomal location of the 650 genes comprising the ‘AML’ and ‘heme’ filters; below each bar, relative chromosome size is depicted. For the X chromosome, only the pseudo-autosomal regions have been included in the analysis. d, Subcellular localization of the genes in the ‘AML’ and ‘heme’ filters. e, Biological functions of the genes included in the filters. Biological functions (from GO and superpaths) have been manually clustered in macro-groups as specified in Supplementary Table 2.
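The filtering logic in a and b can be sketched as a set operation. This is an illustration under stated assumptions, not the authors' implementation: the tissue names, the input dictionaries and the reuse of the TPM > 2 cutoff for the GTEx exclusion step are all hypothetical (the legend gives TPM > 2 only for calling genes expressed in AML or hematopoietic cells).

```python
# Hypothetical tissue labels; a real GTEx table uses its own tissue names.
MALE_REPRODUCTIVE = {"testis", "prostate"}
FEMALE_REPRODUCTIVE = {"ovary", "uterus", "vagina", "fallopian_tube"}


def gvl_filter(heme_tpm, tissue_tpm, recipient_sex, tpm_cutoff=2.0):
    """Return genes expressed in AML/hematopoietic cells but not in other tissues.

    heme_tpm: dict gene -> TPM in AML or hematopoietic cells.
    tissue_tpm: dict gene -> {tissue: TPM} for adult non-hematopoietic tissues.
    recipient_sex: "male" or "female"; reproductive organs of the opposite sex
                   are ignored, because the recipient does not carry them.
    tpm_cutoff: expression cutoff; applying it to the exclusion step as well
                is an assumption of this sketch.
    """
    if recipient_sex not in ("male", "female"):
        raise ValueError("recipient_sex must be 'male' or 'female'")
    ignored = FEMALE_REPRODUCTIVE if recipient_sex == "male" else MALE_REPRODUCTIVE

    kept = set()
    for gene, tpm in heme_tpm.items():
        if tpm <= tpm_cutoff:
            continue  # not expressed in AML/hematopoietic cells
        off_target = any(
            value > tpm_cutoff
            for tissue, value in tissue_tpm.get(gene, {}).items()
            if tissue not in ignored
        )
        if not off_target:
            kept.add(gene)
    return kept
```

With this layout, a gene expressed only in hematopoietic cells and testis passes the filter for a female recipient but is excluded for a male recipient.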

Extended Data Fig. 6 Y-encoded mHAg filter.

a, Schematic depicting the structure of the Y chromosome with a special focus on the genes in the male-specific region (MSY). Heatmap showing the expression pattern of the genes in the MSY across different healthy adult tissues: only the first 9 genes (RPS4Y1, DDX3Y, KDM5D, EIF1AY, ZFY, USP9Y, TMSB4Y, UTY and NLGN4Y) have evidence of expression (≥1 TPM) in ≥1 adult tissue site of GvHD. PAR, pseudo-autosomal region. b, Stacked histograms showing the number of predicted Y epitopes across individual HLA-A, HLA-B and HLA-C alleles and divided based on the MSY gene of origin. c, Bubble plot showing the median number of predicted epitopes for each MSY gene grouped based on the HLA peptide-binding motif from ref. 28.

Extended Data Fig. 7 Antigenicity and immunogenicity of Y mHAgs.

a, Correlation matrix showing the peptide-binding motifs of the HLA-A, HLA-B and HLA-C alleles from ref. 28; lateral panels display the individual HLA alleles belonging to each peptide-binding motif, whose corresponding monoallelic B721.221 immunopeptidomes have been analyzed in Fig. 2b. b, Hydrophobicity scores of the 410 Y mHAg peptides tested for immunogenicity, grouped by individual HLA restrictions: HLA-A0201 had the highest number of predicted binders with a score >0 (boxplots show min to max and median values; Kruskal–Wallis test with Dunn’s multiple comparisons test). c, Hydrophobicity scores of the 410 Y mHAgs grouped by HLA groups. Whiskers indicate min and max values, with all individual values shown (Kruskal–Wallis test with Dunn’s multiple comparisons test). d, Hydrophobicity scores of the predicted binders for each HLA allele, grouped based on the experimental evidence of T cell immunogenicity (per Fig. 2f): only for HLA-A0101 and HLA-C0501 was hydrophobicity significantly associated with immunogenicity (assessed with 2-tailed unpaired t test). Whiskers indicate min and max values, with all individual values shown. e, UMAP showing cluster annotations of single-cell thymic epithelial cells (TECs) from ref. 62 (left), with feature plots of cluster-defining markers (middle); normalized expression of HLA-A, HLA-B and HLA-C genes per cluster (right).

Source data

Extended Data Fig. 8 Tracking of Y mHAg-specific T cells ex vivo.

a, Flow cytometry plots showing the percentage of circulating CD8+ T cells specific for the indicated Y mHAgs at the listed time points, including donor before allo-HCT in a patient transplanted from his HLA-identical sister and experiencing severe chronic GvHD. An irrelevant epitope from the EBV EBNA3A protein (HLA-B0702-restricted) was used as control, as both patient and donor were EBV seropositive. b, Timeline depicting the patient clinical course, highlighting the onset and course of the severe chronic GvHD, involving primarily skin and liver as shown by the liver function tests (ALT in red and total bilirubin in green). Triangles, peripheral blood samples used for Y mHAg-specific T cell tracking; diamonds, EBV reactivation. MMF, mycophenolate mofetil; tx, transplant. c, Quantification of ZFY-C0501-specific T cells in the leukapheresis (LK) products of 7 additional female donors to male patients (F to M), compared with T cells stained with a control C0501-dextramer. Boxplots show min to max and median values; significance was assessed with 2-tailed Wilcoxon matched-pairs signed-rank test.

Source data

Extended Data Fig. 9 Autosomal mHAgs and GvHD.

a, Normal distribution of the autosomal mHAg load in the DFCI-MRD cohort. b, Cumulative incidence of NIH moderate/severe chronic GvHD, stratifying patients by overall autosomal mHAg load below (orange) or above (yellow) the median. No difference in 5-year cumulative incidence is observed: 42% (95% confidence interval: 33–52%) and 39% (95% confidence interval: 30–49%) for < median and > median, respectively; 2-sided p = 0.8 (Gray’s test). c, Distribution of patients experiencing grade II–IV skin (left) and GI (right) acute GvHD across deciles of skin and GI mHAgs, respectively. d, Distribution of patients experiencing NIH moderate/severe organ-specific chronic GvHD across deciles of mHAgs expressed in the indicated GvHD target organs: from left to right, skin, GI, liver, eye and oral. e, Heatmap depicting the co-occurrence of the 7 SNPs associated with liver acute GvHD in: from left to right, patients with liver acute GvHD, patients experiencing acute GvHD without liver involvement and patients with chronic liver GvHD. f, Number of co-occurring driver liver mHAgs in the 3 patient groups outlined in e and defined with the same color code. Boxplots show min to max and median values (Kruskal–Wallis test with Dunn’s multiple comparisons test). g, Promoter analysis of the genes harboring the SNPs associated with liver acute GvHD: 4 of 7 genes have interferon-responsive elements in their promoter region. Transcription factor binding site locations within 1500 base pairs (bp) upstream of the transcription start site (position 0) and the 5′ UTR are indicated.

Source data

Extended Data Fig. 10 Population coverage simulating a T cell-based immunotherapy approach targeting GRFS mHAgs.

a, Heatmap showing the donor–recipient pairs (DRPs) that are informative for the pool of 54 GRFS epitopes indicated in the columns. DRPs (rows) are grouped by population of origin of the simulated recipient, as shown in the inner right bar. The outer right bar shows the number of predicted epitopes per DRP. Gray histograms on the bottom indicate the number of informative DRPs for each epitope. b, Population coverage analysis for: from top to bottom, overall simulation cohort, EUR, EAS, SAS, AFR and AMR. The histogram bars denote the percentage of DRPs that are informative for the indicated number of epitope hits, while the open circles indicate the cumulative percentage of population coverage for each number of epitope hits. The percentage indicated in the top-right corner of each graph shows the % of population for which ≥1 GRFS epitope could be potentially targeted. The red line denotes the 90% threshold of population coverage, which is considered optimal.
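The coverage quantities plotted in b can be illustrated with a short sketch. It assumes each donor–recipient pair has already been reduced to a count of informative epitopes, and that the cumulative curve reports the share of pairs with at least a given number of hits; the function name and the input layout are hypothetical.

```python
from collections import Counter


def coverage_summary(hits_per_pair):
    """Summarize epitope hits across simulated donor-recipient pairs (DRPs).

    hits_per_pair: list of integers, the number of informative epitopes per DRP
                   (hypothetical input layout).
    Returns two dicts keyed by number of hits:
      percent_exact[k]    -> % of DRPs with exactly k hits (histogram bars)
      percent_at_least[k] -> % of DRPs with k or more hits (cumulative coverage,
                             under the assumption stated in the lead-in)
    """
    n = len(hits_per_pair)
    if n == 0:
        raise ValueError("at least one donor-recipient pair is required")

    counts = Counter(hits_per_pair)
    max_hits = max(hits_per_pair)

    percent_exact = {k: 100.0 * counts.get(k, 0) / n for k in range(max_hits + 1)}

    percent_at_least = {}
    running = 0
    for k in range(max_hits, -1, -1):  # accumulate from the highest hit count down
        running += counts.get(k, 0)
        percent_at_least[k] = 100.0 * running / n
    return percent_exact, percent_at_least
```

Under these assumptions, `percent_at_least[1]` corresponds to the figure shown in the top-right corner of each graph, and comparing it with 90 checks the coverage threshold marked by the red line.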

Supplementary information

Supplementary Information

Supplementary Figs. 1–18.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–7.

Supplementary Data

Code for correlative outcome analyses.

Source data

Source Data Fig. 2

Statistical source data for Fig. 2i.

Source Data Fig. 3

Statistical source data for Fig. 3.

Source Data Fig. 4

Statistical source data for Fig. 4c,g.

Source Data Fig. 5

Statistical source data for Fig. 5.

Source Data Extended Data Fig. 3

Statistical source data for Extended Data Fig. 3b,c.

Source Data Extended Data Fig. 4

Statistical source data for Extended Data Fig. 4f.

Source Data Extended Data Fig. 7

Statistical source data for Extended Data Fig. 7b–d.

Source Data Extended Data Fig. 8

Statistical source data for Extended Data Fig. 8c.

Source Data Extended Data Fig. 9

Statistical source data for Extended Data Fig. 9f.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Cieri, N., Hookeri, N., Stromhaug, K. et al. Systematic identification of minor histocompatibility antigens predicts outcomes of allogeneic hematopoietic cell transplantation. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-024-02348-3



  • DOI: https://doi.org/10.1038/s41587-024-02348-3
