  • Analysis

Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution

Abstract

The human leukocyte antigen (HLA) locus plays a critical role in complex traits spanning autoimmune and infectious diseases, transplantation and cancer. While coding variation in HLA genes has been extensively documented, regulatory genetic variation modulating HLA expression levels has not been comprehensively investigated. Here we mapped expression quantitative trait loci (eQTLs) for classical HLA genes across 1,073 individuals and 1,131,414 single cells from three tissues. To mitigate technical confounding, we developed scHLApers, a pipeline to accurately quantify single-cell HLA expression using personalized reference genomes. We identified cell-type-specific cis-eQTLs for every classical HLA gene. Modeling eQTLs at single-cell resolution revealed that many eQTL effects are dynamic across cell states even within a cell type. HLA-DQ genes exhibit particularly cell-state-dependent effects within myeloid, B and T cells. For example, a T cell HLA-DQA1 eQTL (rs3104371) is strongest in cytotoxic cells. Dynamic HLA regulation may underlie important interindividual variability in immune responses.

Fig. 1: Overview of study and scHLApers pipeline.
Fig. 2: Quantifying single-cell HLA expression using scHLApers.
Fig. 3: eQTLs for classical HLA genes from pseudobulk analysis.
Fig. 4: Integrating single cells into a unified cell state embedding across datasets.
Fig. 5: Identifying dynamic eQTLs by modeling single cells.
Fig. 6: Dynamic HLA-DQ eQTLs in myeloid and B cells.

Data availability

The GRCh38 reference genome (primary assembly) and Gencode v38 annotation file can be downloaded at https://www.gencodegenes.org/human/release_38.html. For the synovium dataset, the single-cell expression data are available on Synapse at https://doi.org/10.7303/syn52297840. Genotype data are available on the Arthritis and Autoimmune and Related Diseases Knowledge Portal (ARK Portal, https://arkportal.synapse.org/Explore/Datasets/DetailsPage?id=syn52297840). For intestine, the raw scRNA-seq data (BAM files) were obtained from the Broad Data Use Oversight System (DUOS) (dataset name: Ulcerative_Colitis_in_Colon_Regev_Xavier); the genotype data are available on dbGaP (phs001642). For PBMC-cultured, the raw scRNA-seq data (FASTQ files) were obtained from GEO (PRJNA682434), and the imputed low-pass WGS data are publicly available at SRA (PRJNA736483) and Zenodo (https://doi.org/10.5281/zenodo.4273999). For PBMC-blood (OneK1K cohort), both the raw scRNA-seq data (BAM files) and genotyping data are publicly available on GEO (GSE196830). The reprocessed versions of all scRNA-seq count matrices from this study after realignment with scHLApers are publicly available on Figshare (https://doi.org/10.6084/m9.figshare.24311335).

Code availability

Code and tutorials to run the scHLApers pipeline (v1.0) are available on GitHub (https://github.com/immunogenomics/scHLApers) and Zenodo (https://doi.org/10.5281/zenodo.10003910). Scripts for reproducing analyses in the manuscript are also available on GitHub (https://github.com/immunogenomics/hla2023) and Zenodo (https://doi.org/10.5281/zenodo.10003911).

References

  1. Lenz, T. L., Spirin, V., Jordan, D. M. & Sunyaev, S. R. Excess of deleterious mutations around HLA genes reveals evolutionary cost of balancing selection. Mol. Biol. Evol. 33, 2555–2564 (2016).

  2. Dendrou, C. A., Petersen, J., Rossjohn, J. & Fugger, L. HLA variation and disease. Nat. Rev. Immunol. 18, 325–339 (2018).

  3. Matzaraki, V., Kumar, V., Wijmenga, C. & Zhernakova, A. The MHC locus and genetic susceptibility to autoimmune and infectious diseases. Genome Biol. 18, 76 (2017).

  4. Trowsdale, J. & Knight, J. C. Major histocompatibility complex genomics and human disease. Annu. Rev. Genomics Hum. Genet. 14, 301–323 (2013).

  5. Okada, Y. et al. Fine mapping major histocompatibility complex associations in psoriasis and its clinical subtypes. Am. J. Hum. Genet. 95, 162–172 (2014).

  6. Raychaudhuri, S. et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat. Genet. 44, 291–296 (2012).

  7. Hollenbach, J. A. & Oksenberg, J. R. The immunogenetics of multiple sclerosis: a comprehensive review. J. Autoimmun. 64, 13–25 (2015).

  8. Vader, W. et al. The HLA-DQ2 gene dose effect in celiac disease is directly related to the magnitude and breadth of gluten-specific T cell responses. Proc. Natl Acad. Sci. USA 100, 12390–12395 (2003).

  9. Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47, 898–905 (2015).

  10. Ishigaki, K. et al. HLA autoimmune risk alleles restrict the hypervariable region of T cell receptors. Nat. Genet. 54, 393–402 (2022).

  11. Sharon, E. et al. Genetic variation in MHC proteins is associated with T cell receptor expression biases. Nat. Genet. 48, 995–1002 (2016).

  12. Broughton, S. E. et al. Biased T cell receptor usage directed against human leukocyte antigen DQ8-restricted gliadin peptides is associated with celiac disease. Immunity 37, 611–621 (2012).

  13. Apps, R. et al. Influence of HLA-C expression level on HIV control. Science 340, 87–91 (2013).

  14. Cavalli, G. et al. MHC class II super-enhancer increases surface expression of HLA-DR and HLA-DQ and affects cytokine production in autoimmune vitiligo. Proc. Natl Acad. Sci. USA 113, 1363–1368 (2016).

  15. Raj, P. et al. Regulatory polymorphisms modulate the expression of HLA class II molecules and promote autoimmunity. eLife 5, e12089 (2016).

  16. D’Antonio, M. et al. Systematic genetic analysis of the MHC region reveals mechanistic underpinnings of HLA type associations with disease. eLife 8, e48476 (2019).

  17. Aguiar, V. R. C., César, J., Delaneau, O., Dermitzakis, E. T. & Meyer, D. Expression estimation and eQTL mapping for HLA genes with a personalized pipeline. PLoS Genet. 15, e1008091 (2019).

  18. Gutierrez-Arcelus, M. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat. Genet. 52, 247–253 (2020).

  19. Nathan, A. et al. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature 606, 120–128 (2022).

  20. Cuomo, A. S. E. et al. CellRegMap: a statistical framework for mapping context-specific regulatory variants using scRNA-seq. Mol. Syst. Biol. 18, e10663 (2022).

  21. Schmiedel, B. J. et al. Single-cell eQTL analysis of activated T cell subsets reveals activation and cell type–dependent effects of disease-risk variants. Sci. Immunol. 7, eabm2508 (2022).

  22. Meyer, D., Aguiar, V. R. C., Bitarello, B. D., Brandt, D. Y. C. & Nunes, K. A genomic perspective on HLA evolution. Immunogenetics 70, 5–27 (2018).

  23. Brandt, D. Y. C. et al. Mapping bias overestimates reference allele frequencies at the HLA genes in the 1000 Genomes Project Phase I data. G3 5, 931–941 (2015).

  24. Sakaue, S. et al. A statistical genetics guide to identifying HLA alleles driving complex disease. Nat. Protoc. 18, 2625–2641 (2023).

  25. Aguiar, V. R. C., Masotti, C., Camargo, A. A. & Meyer, D. HLApers: HLA typing and quantification of expression with personalized index. Methods Mol. Biol. 2120, 101–112 (2020).

  26. Bettens, F. et al. Regulation of HLA class I expression by non-coding gene variations. PLoS Genet. 18, e1010212 (2022).

  27. Robinson, J. et al. IPD-IMGT/HLA database. Nucleic Acids Res. 48, D948–D955 (2020).

  28. Kaminow, B., Yunusov, D. & Dobin, A. STARsolo: accurate, fast and versatile mapping/quantification of single-cell and single-nucleus RNA-seq data. Preprint at bioRxiv https://doi.org/10.1101/2021.05.05.442755 (2021).

  29. Zhang, F. et al. Deconstruction of rheumatoid arthritis synovium defines inflammatory subtypes. Nature https://doi.org/10.1038/s41586-023-06708-y (2023).

  30. Smillie, C. S. et al. Intra- and inter-cellular rewiring of the human colon during ulcerative colitis. Cell 178, 714–730.e22 (2019).

  31. Randolph, H. E. et al. Genetic ancestry effects on the response to viral infection are pervasive but cell type specific. Science 374, 1127–1133 (2021).

  32. Yazar, S. et al. Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).

  33. Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS ONE 8, e64683 (2013).

  34. Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat. Genet. 53, 1504–1516 (2021).

  35. Dunlap, G. et al. Clonal associations of lymphocyte subsets and functional states revealed by single cell antigen receptor profiling of T and B cells in rheumatoid arthritis synovium. Preprint at bioRxiv https://doi.org/10.1101/2023.03.18.533282 (2023).

  36. Wang, Z. et al. Clonally diverse CD38+HLA-DR+CD8+ T cells persist during fatal H7N9 disease. Nat. Commun. 9, 824 (2018).

  37. Tippalagama, R. et al. HLA-DR marks recently divided antigen-specific effector CD4 T cells in active tuberculosis patients. J. Immunol. 207, 523–533 (2021).

  38. Soskic, B. et al. Immune disease risk variants regulate gene expression dynamics during CD4+ T cell activation. Nat. Genet. 54, 817–826 (2022).

  39. Holling, T. M., Schooten, E. & van Den Elsen, P. J. Function and regulation of MHC class II molecules in T-lymphocytes: of mice and men. Hum. Immunol. 65, 282–290 (2004).

  40. LaSalle, J. M., Tolentino, P. J., Freeman, G. J., Nadler, L. M. & Hafler, D. A. Early signaling defects in human T cells anergized by T cell presentation of autoantigen. J. Exp. Med. 176, 177–186 (1992).

  41. Lanzavecchia, A., Roosnek, E., Gregory, T., Berman, P. & Abrignani, S. T cells can present antigens such as HIV gp120 targeted to their own surface molecules. Nature 334, 530–532 (1988).

  42. Hagopian, W. et al. Co-occurrence of type 1 diabetes and celiac disease autoimmunity. Pediatrics 140, e20171305 (2017).

  43. Yamamoto, F. et al. Capturing differential allele-level expression and genotypes of all classical HLA loci and haplotypes by a new capture RNA-seq method. Front. Immunol. 11, 941 (2020).

  44. Kaur, G. et al. Structural and regulatory diversity shape HLA-C protein expression levels. Nat. Commun. 8, 15924 (2017).

  45. Kulkarni, S. et al. Genetic interplay between HLA-C and MIR148A in HIV control and Crohn disease. Proc. Natl Acad. Sci. USA 110, 20705–20710 (2013).

  46. Chandran, V. et al. Killer-cell immunoglobulin-like receptor gene polymorphisms and susceptibility to psoriatic arthritis. Rheumatology 53, 233–239 (2014).

  47. Ota, M. et al. Dynamic landscape of immune cell-specific gene regulation in immune-mediated diseases. Cell 184, 3006–3021.e17 (2021).

  48. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

  49. Kang, J. B. et al. Efficient and precise single-cell reference atlas mapping with Symphony. Nat. Commun. 12, 5890 (2021).

  50. Wilkinson, S. T. et al. Partial plasma cell differentiation as a mechanism of lost major histocompatibility complex class II expression in diffuse large B-cell lymphoma. Blood 119, 1459–1467 (2012).

  51. Yoon, H. S. et al. ZBTB32 is an early repressor of the CIITA and MHC class II gene expression during B cell differentiation to plasma cells. J. Immunol. 189, 2393–2403 (2012).

  52. Kumasaka, N. et al. Mapping interindividual dynamics of innate immune response at single-cell resolution. Nat. Genet. 55, 1066–1075 (2023).

  53. Kang, J. B., Raveane, A., Nathan, A., Soranzo, N. & Raychaudhuri, S. Methods and insights from single-cell expression quantitative trait loci. Annu. Rev. Genomics Hum. Genet. 24, 277–303 (2023).

  54. Yao, C. et al. Sex- and age-interacting eQTLs in human complex diseases. Hum. Mol. Genet. 23, 1947–1956 (2014).

  55. Davenport, E. E. et al. Discovering in vivo cytokine-eQTL interactions from a lupus clinical trial. Genome Biol. 19, 168 (2018).

  56. Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).

  57. Calzetti, F. et al. Human dendritic cell subset 4 (DC4) correlates to a subset of CD14dim/−CD16++ monocytes. J. Allergy Clin. Immunol. 141, 2276–2279.e3 (2018).

  58. Janeway, C. A., Travers, P., Walport, M. & Shlomchik, M. J. Immunobiology (CRC Press, 2001).

  59. Kambayashi, T. & Laufer, T. M. Atypical MHC class II-expressing antigen-presenting cells: can anything replace a dendritic cell? Nat. Rev. Immunol. 14, 719–730 (2014).

  60. Prugnolle, F. et al. Pathogen-driven selection and worldwide HLA class I diversity. Curr. Biol. 15, 1022–1027 (2005).

  61. Yeung, H.-Y. & Dendrou, C. A. Pregnancy immunogenetics and genomics: implications for pregnancy-related complications and autoimmune disease. Annu. Rev. Genomics Hum. Genet. 20, 73–97 (2019).

  62. Barreiro, L. B. & Quintana-Murci, L. From evolutionary genetics to human immunology: how selection shapes host defence genes. Nat. Rev. Genet. 11, 17–30 (2010).

  63. Petersdorf, E. W. et al. HLA-C expression levels define permissible mismatches in hematopoietic cell transplantation. Blood 124, 3996–4003 (2014).

  64. Chowell, D. et al. Patient HLA class I genotype influences cancer response to checkpoint blockade immunotherapy. Science 359, 582–587 (2018).

  65. Naranbhai, V. et al. HLA-A*03 and response to immune checkpoint blockade in cancer: an epidemiological biomarker study. Lancet Oncol. 23, 172–184 (2022).

  66. Matern, B. M. et al. Long-read nanopore sequencing validated for human leukocyte antigen class I typing in routine diagnostics. J. Mol. Diagn. 22, 912–919 (2020).

  67. Liu, C. et al. High-resolution HLA typing by long reads from the R10.3 Oxford nanopore flow cells. Hum. Immunol. 82, 288–295 (2021).

  68. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  69. Wallace, C. A more accurate method for colocalisation analysis allowing for multiple causal variants. PLoS Genet. 17, e1009440 (2021).

  70. van der Wijst, M. et al. The single-cell eQTLGen consortium. eLife 9, e52155 (2020).

  71. scHLApers. GitHub. https://github.com/immunogenomics/scHLApers (2023).

  72. IMGTHLA. GitHub. https://github.com/ANHIG/IMGTHLA (2023).

  73. hlaseqlib. GitHub. https://github.com/genevol-usp/hlaseqlib (2022).

  74. tutorial_HLAQCImputation.ipynb. GitHub. https://github.com/immunogenomics/HLA_analyses_tutorial/blob/main/tutorial_HLAQCImputation.ipynb (2023).

  75. SNP2HLA.py. GitHub. https://github.com/immunogenomics/HLA_analyses_tutorial/blob/main/scripts/SNP2HLA.py (2023).

  76. Chain file for hg19 to hg38 liftover. UCSC. http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz (2013).

  77. Darby, C. A., Stubbington, M. J. T., Marks, P. J., Martínez Barrio, Á. & Fiddes, I. T. scHLAcount: allele-specific HLA expression from single-cell gene expression data. Bioinformatics 36, 3905–3906 (2020).

  78. Azimuth. HuBMAP Consortium. https://app.azimuth.hubmapconsortium.org/app/human-pbmc (2020).

  79. Cuomo, A. S. E. et al. Optimizing expression quantitative trait locus mapping workflows for single-cell studies. Genome Biol. 22, 188 (2021).

  80. Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).

  81. Nakagawa, S., Johnson, P. C. D. & Schielzeth, H. The coefficient of determination R² and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. J. R. Soc. Interface 14, 20170213 (2017).

Acknowledgements

We thank A. Dobin, H. Randolph, H. Lau, C. Stevens and members of the Raychaudhuri Lab, in particular A. Gupta and Y. Baglaenko, for their helpful input and discussions. This work was funded by the National Institutes of Health grants T32GM007753 and T32GM144273 (J.B.K., L.R. and K.A.L.), F30AI172238 (J.B.K.), T32HG002295 (A.Z.S. and L.R.), T32AR007530 (A.N.), F30AI157385 (L.R.), R01AR063759 (S.R.), U01HG012009 (S.R.) and UC2AR081023 (S.R.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This project also received funding from the MGH Center for the Study of Inflammatory Bowel Disease grant DK-43351 (R.J.X.), a fellowship from the Fok Foundation (J.E.P.), the Arthritis National Research Foundation (M.G.-A.), Gilead Sciences Research Scholar grant (M.G.-A.), Lupus Research Alliance (M.G.-A.) and a Kennedy Trust KTRR Senior Research Fellowship (KENN202109) (Y.L.).

Author information

Contributions

J.B.K. and S.R. conceived the study. J.B.K., A.Z.S. and Y.L. developed the scHLApers pipeline. J.B.K., S.S. and S.G. performed HLA imputation and eQTL analysis. J.B.K. performed analysis and integration of the single-cell data. L.R. post-processed the PBMC-blood dataset. A.N., V.R.C.A., C.V., K.A.L. and M.G.-A. helped interpret data and analyses. F.Z., A.H.J., S.Y., J.A.-H., H.K., A.N.A., K.J., K.D., AMP RA/SLE, M.J.D., R.J.X., L.T.D., J.H.A., J.E.P., D.A.R. and M.B.B. generated and helped interpret data resources. S.R. supervised the project. J.B.K. and S.R. composed the initial manuscript draft. All authors provided critical intellectual feedback and participated in interpreting the data and revising the manuscript.

Corresponding author

Correspondence to Soumya Raychaudhuri.

Ethics declarations

Competing interests

J.B.K. is a consultant to Aditum Bio. R.J.X. is co-founder of Jnana Therapeutics and Celsius Therapeutics, scientific advisory board member at Nestlé, and board director at MoonLake Immunotherapeutics; these organizations had no roles in this study. M.B.B. is a consultant to GSK, 4FO Ventures, Third Rock Ventures and consultant and founder of Mestag Therapeutics. S.R. is a scientific advisor to Pfizer, Janssen and Sonoma Biotherapeutics, a founder of Mestag Therapeutics, and a consultant for AbbVie, Sanofi, Biogen and Nimbus Therapeutics. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Patrick Gaffney and Keishi Fujio for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Correcting HLA expression estimation bias with scHLApers.

a, Schematic showing how high HLA gene polymorphism leads to bias in read alignment to a single reference genome. Consider two hypothetical individuals who are either homozygous for HLA-DRB1 allele X (orange) or allele Y (blue), where the reference allele is X. Reads from X will align perfectly to the reference, leading to accurate HLA-DRB1 quantification. However, for Y, reads will fail to align to the reference due to discordant sequence content, leading to unmapped reads and underestimation of expression. b, Percentage change in expression (total UMIs for HLA gene per individual, y-axis) across cohorts (synovium, n = 69 individuals; intestine, n = 22; PBMC-cultured, n = 73; PBMC-blood, n = 909). c, Percentage change in estimated expression (total UMIs for HLA gene per individual, y-axis) in synovium (n = 69) as a function of the mean (between the individual’s two alleles) Levenshtein distance relative to the GRCh38 reference allele at the 3′ end of each gene (x-axis). For b and c, dashed horizontal red lines denote no change. Fitted linear regression line (blue) shown with 95% confidence region. d, Heatmap showing the alignment of reads to each gene in scHLApers (rows) versus where the same read aligned (‘came from’) in the standard pipeline (columns) for synovium (top) and PBMC-cultured (bottom). Columns include HLA genes, other regions in the extended MHC, or unmapped reads. Rows sum to 100%, and a darker color indicates that more of the reads aligning to a given gene in scHLApers came from the corresponding location in the standard pipeline. e, Phylogenetic tree derived from a multiple sequence alignment of HLA-C allelic genomic sequences. The reference allele is C*07:02. Yellow box shows alleles similar to the reference (‘reference-like’). Boxplot on right shows the change in HLA-B estimated UMI counts summed across cells from each sample (y-axis) compared to the genotype for HLA-C in terms of dosage of ‘reference-like’ alleles (x-axis), across n = 1,073 individuals from all cohorts. For b and e, boxplot center line represents median, lower/upper box limits represent 25/75% quantiles, whiskers extend to box limit ±1.5 × IQR, and outlying points are plotted individually.
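
The x-axis quantity in panel c (mean Levenshtein distance of an individual's two alleles to the reference allele) can be illustrated with a short sketch. This is a minimal illustration, not the authors' code: the function names and the toy sequences are hypothetical, and the choice of which 3′ sequence window to compare is left to the caller.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def mean_distance_to_reference(allele_1: str, allele_2: str, reference: str) -> float:
    """Mean edit distance of an individual's two alleles to the reference allele."""
    return (levenshtein(allele_1, reference) + levenshtein(allele_2, reference)) / 2


# Hypothetical toy sequences: one allele matches the reference, one differs by a base.
print(mean_distance_to_reference("ACGT", "ACTT", "ACGT"))
```

An individual homozygous for the reference allele has a mean distance of 0 under this definition, which corresponds to the case in panel a where reads align without bias.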

Extended Data Fig. 2 Concordance of eQTLs with bulk RNA-seq, differential allelic expression, and read alignment visualization.

a, Concordance between the effect sizes of lead HLA eQTLs identified in the multi-cohort pseudobulk model for B cells (this study, y-axis) and the same variant’s effect in LCLs identified through bulk RNA-seq eQTL analysis (Aguiar et al., x-axis). Because not all lead variants in this study were directly comparable due to different sets of tested variants, we tested the concordance of the most significant variant present in both datasets (triangles indicate that the exact lead variant in this study was also tested in Aguiar et al., whereas circles indicate that a ‘substitute’ lead variant was used for comparison). b, HLA-B expression in myeloid cells (top, n = 861 individuals) and HLA-C expression in B cells (bottom, n = 909), showing mean log(CP10k + 1)-normalized expression (y-axis) across cells for each individual in PBMC-blood by allele (x-axis). Each individual’s expression value is plotted once if they are homozygous (red) and twice if heterozygous (tan) for each allele (imputed dosage is rounded to the nearest integer). The black diamonds show the mean value for each allele (used to order the x-axis). c, Integrative Genomics Viewer (IGV) screenshots showing read alignments for alleles HLA-B*15:01 and HLA-C*07:01, associated with lower expression of the respective genes, for a representative individual in synovium.

Extended Data Fig. 3 Personalization improves eQTL effect size estimates.

a, Comparison of eQTL effect size estimates calculated using expression quantified by scHLApers (x-axis) vs. standard pipeline (y-axis). Each dot represents one of 12,045 MHC-wide genetic variants tested using the pseudobulk eQTL model per cell type (color). Pearson correlation is labeled for each gene. b, Example of eQTL effect correction through the use of corrected expression estimates, shown for HLA-DRB1 in B cells. eQTL effect sizes (y-axis) estimated for MHC variants along Chr. 6 (x-axis), shown for standard pipeline (top), scHLApers pipeline (middle), and the magnitude of difference between the betas from the two pipelines (bottom). The variant with the largest correction in estimated eQTL effect (HLA-DRB1*07:01) is labeled in orange, and the lead variant in the scHLApers pipeline (rs9271117) is labeled in blue. c, Boxplots visualizing the eQTL effects across individuals for HLA-DRB1*07:01 (left) and rs9271117 (right) using HLA-DRB1 expression estimates from the standard (top) vs. scHLApers (bottom) pipelines. Increased dosage of the ALT allele (x-axis) vs. HLA-DRB1 expression in B cells (y-axis: units are residual of inverse normal transformed mean log(CP10k + 1)-normalized expression across cells after regressing out covariates), across n = 1,069 individuals total (synovium, n = 65; intestine, n = 22; PBMC-cultured, n = 73; PBMC-blood, n = 909), plotted by dataset (color). For HLA-DRB1*07:01, ‘A’ denotes absence of the allele, and ‘T’ denotes presence (rather than REF/ALT nucleotides). Nominal Wald P-values are derived from linear regression (two-sided test).
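
The expression units in panel c (inverse normal transformed mean log(CP10k + 1)-normalized expression) can be sketched as follows. This is a hedged illustration rather than the pipeline's implementation: the function names are hypothetical, the natural logarithm and the Blom rank offset (3/8) are assumptions not stated in the caption, and the covariate regression step is omitted.

```python
import math
from statistics import NormalDist


def log_cp10k(gene_umis: int, total_umis: int) -> float:
    """log(CP10k + 1): counts per 10,000 UMIs in the cell, then log(1 + x)."""
    return math.log1p(gene_umis / total_umis * 1e4)


def mean_log_cp10k(cells):
    """Pseudobulk value for one individual; cells is a list of (gene_umis, total_umis)."""
    return sum(log_cp10k(g, t) for g, t in cells) / len(cells)


def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse normal transform across individuals (ties share the mean rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based mean rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Hypothetical per-individual pseudobulk values for one gene in one cell type.
pseudobulk = [mean_log_cp10k([(3, 5000), (1, 4000)]),
              mean_log_cp10k([(0, 6000), (2, 8000)]),
              mean_log_cp10k([(5, 5000), (4, 4500)])]
print(inverse_normal_transform(pseudobulk))
```

The transform depends only on ranks, so it places every cohort on the same standard-normal scale regardless of sequencing depth, which is one reason such a step is commonly applied before pooling datasets in a linear model.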

Extended Data Fig. 4 Testing single-cell NBME model for concordance with pseudobulk and for calibration for genotype-cell-state interactions.

a-e, The models in a-c test genotype main effects, whereas d and e test genotype-cell-state interaction. a,b, Concordance of genotype main effect estimates (a) and significance of genotype main effect (b) between the NBME model (y-axis) and the pseudobulk model for the PBMC-blood dataset (x-axis) across all cell types and classical HLA genes. c, Power of the NBME single-cell eQTL model to detect regulatory effects across allele frequencies. The proportion of simulations where the null hypothesis was appropriately rejected at α = 5 × 10⁻⁸ (y-axis) in the presence of a simulated eQTL effect across 1,000 simulations. Simulations were run across a range of eQTL allele frequencies (x-axis) and effect sizes (colors) using the PBMC-blood myeloid data and HLA-DQA1 expression. d,e, We permuted cell state (10 hPCs as a block) for 1,000 tests and obtained interaction P-values from a one-sided likelihood ratio test (LRT) comparing to the null model without G×hPC interaction terms. Q-Q plots showing statistical calibration (compared to uniform P-values) for the PME model (d) versus the NBME model (e) when testing for cell state interactions for representative class I (HLA-A) and class II (HLA-DPA1) genes in myeloid cells in PBMC-blood. The red line is the identity line. The histograms below show distributions of LRT P-values for HLA-DPA1.
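
The likelihood ratio test in d and e compares a full model with G×hPC interaction terms against a null model without them. The sketch below shows how an LRT P-value can be derived from two fitted log-likelihoods. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the example log-likelihoods are invented, and the 10 degrees of freedom assume one interaction term per hPC. The closed-form chi-square tail used here holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2
    term = 1.0
    total = 1.0
    # sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def lrt_p_value(loglik_null: float, loglik_full: float, df: int) -> float:
    """P-value for the LRT statistic 2 * (loglik_full - loglik_null)."""
    statistic = 2 * (loglik_full - loglik_null)
    return chi2_sf_even_df(max(statistic, 0.0), df)


# Hypothetical log-likelihoods; df = 10 assumes one G x hPC term per hPC.
print(lrt_p_value(loglik_null=-1520.4, loglik_full=-1508.9, df=10))
```

Under a well-calibrated model, P-values obtained this way from permuted cell states should be approximately uniform, which is what the Q-Q plots assess.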

Supplementary information

Supplementary Information

Supplementary notes and Figs. 1–15.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–15.

Supplementary Data 1

Multi-cohort pseudobulk eQTL full results for myeloid, B and T cells. Results from testing each of 12,050 variants for association with classical HLA gene expression in each cell type (total 8 genes × 3 cell types × 12,050 variants = 289,200 tests) in the multi-cohort pseudobulk linear model. Columns list the variants in multi-cohort analysis, cell type, gene, effect size of variant on covariate-corrected standardized gene expression (β), standard error of β estimate, nominal Wald P value from linear regression (two-sided test), and REF and ALT alleles. For metadata about each tested variant, see Supplementary Table 7.

Supplementary Data 2

Multi-cohort pseudobulk conditional analysis results. Results from conditional analysis identifying eQTLs, conditioning on the lead variant(s) from previous round(s). Columns list the variant, cell type, gene, round of conditional analysis (conditional_iter, ranging from 1 to 4 for primary to quaternary effects), effect size of eQTL (β), standard error of β estimate, and nominal Wald P-value from linear regression (two-sided test). Includes only variants with nominal P < 0.05 to reduce file size. For metadata about each tested variant, see Supplementary Table 7.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kang, J.B., Shen, A.Z., Gurajala, S. et al. Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution. Nat Genet 55, 2255–2268 (2023). https://doi.org/10.1038/s41588-023-01586-6

  • DOI: https://doi.org/10.1038/s41588-023-01586-6
