  • Analysis

A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response

An Author Correction to this article was published on 02 November 2021

This article has been updated

Abstract

Fine-mapping to plausible causal variation may be more effective in multi-ancestry cohorts, particularly in the MHC, which has population-specific structure. To enable such studies, we constructed a large (n = 21,546) HLA reference panel spanning five global populations based on whole-genome sequences. Despite population-specific long-range haplotypes, we demonstrated accurate imputation at G-group resolution (94.2%, 93.7%, 97.8% and 93.7% in admixed African (AA), East Asian (EAS), European (EUR) and Latino (LAT) populations). Applying HLA imputation to genome-wide association study data for HIV-1 viral load in three populations (EUR, AA and LAT), we obviated effects of previously reported associations from population-specific HIV studies and discovered a novel association at position 156 in HLA-B. We pinpointed the MHC association to three amino acid positions (97, 67 and 156) marking three consecutive pockets (C, B and D) within the HLA-B peptide-binding groove, explaining 12.9% of trait variance.

Fig. 1: A schematic showing the overall study design.
Fig. 2: The multi-ancestry HLA reference panel shows improvement in allele diversity and imputation accuracy.
Fig. 3: Stepwise conditional analysis of the allele and amino acid positions of classical HLA genes to HIV-1 viral load.
Fig. 4: Location and effect of three independently associated amino acid positions in HLA-B.
Fig. 5: Pairwise LD and haplotype structure for eight classical HLA genes in five population groups.

Data availability

All data for generating the figures presented in the manuscript are available at https://github.com/immunogenomics/HLA-TAPAS.

Code availability

HLA-TAPAS, https://github.com/immunogenomics/HLA-TAPAS; GATK version 3.6, https://software.broadinstitute.org/gatk/download/archive; HLA*PRG, https://github.com/AlexanderDilthey/MHC-PRG; HLA*LA, https://github.com/DiltheyLab/HLA-PRG-LA; PLINK version 1.90, https://www.cog-genomics.org/plink2; Beagle version 4.1, https://faculty.washington.edu/browning/beagle/b4_1.html; Hapl-o-Mat version 1.1, https://github.com/DKMS/Hapl-o-Mat/; BIGDAWG version 2.3.6, https://cran.r-project.org/web/packages/BIGDAWG/index.html.

References

  1. International HIV Controllers Study et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science 330, 1551–1557 (2010).

  2. Raychaudhuri, S. et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat. Genet. 44, 291–296 (2012).

  3. Evans, D. M. et al. Interaction between ERAP1 and HLA-B27 in ankylosing spondylitis implicates peptide handling in the mechanism for HLA-B27 in disease susceptibility. Nat. Genet. 43, 761–767 (2011).

  4. Snyder, A. et al. Genetic basis for clinical response to CTLA-4 blockade in melanoma. N. Engl. J. Med. 371, 2189–2199 (2014).

  5. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

  6. Horton, R. et al. Gene map of the extended human MHC. Nat. Rev. Genet. 5, 889–899 (2004).

  7. Gourraud, P.-A. et al. HLA diversity in the 1000 Genomes dataset. PLoS ONE 9, e97282 (2014).

  8. Robinson, J. et al. IPD-IMGT/HLA Database. Nucleic Acids Res. 48, D948–D955 (2020).

  9. Gonzalez-Galarza, F. F. et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data and new query tools. Nucleic Acids Res. 48, D783–D788 (2020).

  10. Dilthey, A. T., Moutsianas, L., Leslie, S. & McVean, G. HLA*IMP—an integrated framework for imputing classical HLA alleles from SNP genotypes. Bioinformatics 27, 968–972 (2011).

  11. Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS ONE 8, e64683 (2013).

  12. Zheng, X. et al. HIBAG—HLA genotype imputation with attribute bagging. Pharmacogenomics J. 14, 192–200 (2014).

  13. Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47, 898–905 (2015).

  14. McLaren, P. J. et al. Polymorphisms of large effect explain the majority of the host genetic contribution to variation of HIV-1 virus load. Proc. Natl Acad. Sci. USA 112, 14658–14663 (2015).

  15. Tian, C. et al. Genome-wide association and HLA region fine-mapping studies identify susceptibility loci for multiple common infections. Nat. Commun. 8, 599 (2017).

  16. Onengut-Gumuscu, S. et al. Type 1 diabetes risk in African-ancestry participants and utility of an ancestry-specific genetic risk score. Diabetes Care 42, 406–415 (2019).

  17. HIV/AIDS (WHO, 2021); https://www.who.int/news-room/fact-sheets/detail/hiv-aids

  18. McLaren, P. J. et al. Fine-mapping classical HLA variation associated with durable host control of HIV-1 infection in African Americans. Hum. Mol. Genet. 21, 4334–4347 (2012).

  19. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

  20. Okada, Y. et al. Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese. Nat. Commun. 9, 1631 (2018).

  21. Mitt, M. et al. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel. Eur. J. Hum. Genet. 25, 869–876 (2017).

  22. Hirata, J. et al. Genetic and phenotypic landscape of the major histocompatibilty complex region in the Japanese population. Nat. Genet. 51, 470–480 (2019).

  23. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  24. Nelis, M. et al. Genetic structure of Europeans: a view from the north-east. PLoS ONE 4, e5472 (2009).

  25. Dilthey, A., Cox, C., Iqbal, Z., Nelson, M. R. & McVean, G. Improved genome inference in the MHC using a population reference graph. Nat. Genet. 47, 682–688 (2015).

  26. Dilthey, A. T. et al. High-accuracy HLA type inference from whole-genome sequencing data using population reference graphs. PLoS Comput. Biol. 12, e1005151 (2016).

  27. Dilthey, A. T. et al. HLA*LA-HLA typing from linearly projected graph alignments. Bioinformatics 35, 4394–4396 (2019).

  28. Mellors, J. W. et al. Quantitation of HIV-1 RNA in plasma predicts outcome after seroconversion. Ann. Intern. Med. 122, 573–579 (1995).

  29. Bartha, I. et al. Estimating the respective contributions of human and viral genetic variation to HIV control. PLoS Comput. Biol. 13, e1005339 (2017).

  30. Blanco-Gelaz, M. A. et al. The amino acid at position 97 is involved in folding and surface expression of HLA-B27. Int. Immunol. 18, 211–220 (2006).

  31. Stewart-Jones, G. B. E. et al. Structures of three HIV-1 HLA-B*5703-peptide complexes and identification of related HLAs potentially associated with long-term nonprogression. J. Immunol. 175, 2459–2468 (2005).

  32. Archbold, J. K. et al. Natural micropolymorphism in human leukocyte antigens provides a basis for genetic control of antigen recognition. J. Exp. Med. 206, 209–219 (2009).

  33. Kløverpris, H. N. et al. HIV control through a single nucleotide on the HLA-B locus. J. Virol. 86, 11493–11500 (2012).

  34. Gaiha, G. D. et al. Structural topology defines protective CD8+ T cell epitopes in the HIV proteome. Science 364, 480–484 (2019).

  35. Browning, B. L. & Browning, S. R. A fast, powerful method for detecting identity by descent. Am. J. Hum. Genet. 88, 173–182 (2011).

  36. Hill, A. V. et al. Common West African HLA antigens are associated with protection from severe malaria. Nature 352, 595–600 (1991).

  37. Sanchez-Mazas, A. et al. The HLA-B landscape of Africa: signatures of pathogen-driven selection and molecular identification of candidate alleles to malaria protection. Mol. Ecol. 26, 6238–6252 (2017).

  38. Maiers, M., Gragert, L. & Klitz, W. High-resolution HLA alleles and haplotypes in the United States population. Hum. Immunol. 68, 779–788 (2007).

  39. Chen, J. J. et al. Hardy–Weinberg testing for HLA class II (DRB1, DQA1, DQB1, AND DPB1) loci in 26 human ethnic groups. Tissue Antigens 54, 533–542 (1999).

  40. Tshabalala, M. et al. Human leukocyte antigen-A, B, C, DRB1, and DQB1 allele and haplotype frequencies in a subset of 237 donors in the South African Bone Marrow Registry. J. Immunol. Res. 2018, 2031571 (2018).

  41. Hagenlocher, Y. et al. 6-Locus HLA allele and haplotype frequencies in a population of 1075 Russians from Karelia. Hum. Immunol. 80, 95–96 (2019).

  42. Nothnagel, M., Fürst, R. & Rohde, K. Entropy as a measure for linkage disequilibrium over multilocus haplotype blocks. Hum. Hered. 54, 186–198 (2002).

  43. Okada, Y. et al. Construction of a population-specific HLA imputation reference panel and its application to Graves’ disease risk in Japanese. Nat. Genet. 47, 798–802 (2015).

  44. Okada, Y. eLD: entropy-based linkage disequilibrium index between multiallelic sites. Hum. Genome Var. 5, 29 (2018).

  45. Chikata, T. et al. Host-specific adaptation of HIV-1 subtype B in the Japanese population. J. Virol. 88, 4764–4775 (2014).

  46. Nomura, E. et al. Mapping of a disease susceptibility locus in chromosome 6p in Japanese patients with ulcerative colitis. Genes Immun. 5, 477–483 (2004).

  47. Price, P. et al. The genetic basis for the association of the 8.1 ancestral haplotype (A1, B8, DR3) with multiple immunopathological diseases. Immunol. Rev. 167, 257–274 (1999).

  48. Horton, R. et al. Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project. Immunogenetics 60, 1–18 (2008).

  49. Graham, R. R. et al. Visualizing human leukocyte antigen class II risk haplotypes in human systemic lupus erythematosus. Am. J. Hum. Genet. 71, 543–553 (2002).

  50. Miller, F. W. et al. Genome-wide association study identifies HLA 8.1 ancestral haplotype alleles as major genetic risk factors for myositis phenotypes. Genes Immun. 16, 470–480 (2015).

  51. Haapasalo, K. et al. The psoriasis risk allele HLA-C*06:02 shows evidence of association with chronic or recurrent Streptococcal tonsillitis. Infect. Immun. 86, e00304–e00318 (2018).

  52. Salter-Townshend, M. & Myers, S. Fine-scale inference of ancestry segments without prior knowledge of admixing groups. Genetics 212, 869–889 (2019).

  53. Zhou, Q., Zhao, L. & Guan, Y. Strong selection at MHC in Mexicans since admixture. PLoS Genet. 12, e1005847 (2016).

  54. Meyer, D., C Aguiar, V. R., Bitarello, B. D., C Brandt, D. Y. & Nunes, K. A genomic perspective on HLA evolution. Immunogenetics 70, 5–27 (2018).

  55. Norris, E. T. et al. Admixture-enabled selection for rapid adaptive evolution in the Americas. Genome Biol. 21, 29 (2020).

  56. Guan, Y. Detecting structure of haplotypes and local ancestry. Genetics 196, 625–642 (2014).

  57. Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).

  58. Degenhardt, F. et al. Construction and benchmarking of a multi-ethnic reference panel for the imputation of HLA class I and II alleles. Hum. Mol. Genet. 28, 2078–2092 (2019).

  59. Ambardar, S. & Gowda, M. High-resolution full-length HLA typing method using third generation (Pac-Bio SMRT) sequencing technology. Methods Mol. Biol. 1802, 135–153 (2018).

  60. Macdonald, W. A. et al. A naturally selected dimorphism within the HLA-B44 supertype alters class I structure, peptide repertoire, and T cell recognition. J. Exp. Med. 198, 679–691 (2003).

  61. Kloverpris, H. N. et al. HLA-B*57 micropolymorphism shapes HLA allele-specific epitope immunogenicity, selection pressure, and HIV immune control. J. Virol. 86, 919–929 (2012).

  62. Carrington, M. & Walker, B. D. Immunogenetics of spontaneous control of HIV. Annu. Rev. Med. 63, 131–145 (2012).

  63. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).

  64. Khera, A. V. et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596 (2019).

  65. Torkamani, A. & Topol, E. Polygenic risk scores expand to obesity. Cell 177, 518–520 (2019).

  66. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  67. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).

  68. Julg, B. et al. Possession of HLA class II DRB1*1303 associates with reduced viral loads in chronic HIV-1 clade C and B infection. J. Infect. Dis. 203, 803–809 (2011).

  69. Schäfer, C., Schmidt, A. H. & Sauter, J. Hapl-o-Mat: open-source software for HLA haplotype frequency estimation from ambiguous and heterogeneous data. BMC Bioinformatics 18, 284 (2017).

  70. Pappas, D. J., Marin, W., Hollenbach, J. A. & Mack, S. J. Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): an integrated case–control analysis pipeline. Hum. Immunol. 77, 283–287 (2016).

  71. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  72. Price, A. L. et al. Long-range LD can confound genome scans in admixed populations. Am. J. Hum. Genet. 83, 132–135 (2008).

  73. Pasaniuc, B. et al. Analysis of Latino populations from GALA and MEC studies reveals genomic loci with biased local ancestry estimation. Bioinformatics 29, 1407–1415 (2013).

  74. McLaren, P. J. et al. Association study of common genetic variants and HIV-1 acquisition in 6,300 infected cases and 7,200 controls. PLoS Pathog. 9, e1003515 (2013).

  75. Okada, Y. et al. Contribution of a non-classical HLA gene, HLA-DOA, to the risk of rheumatoid arthritis. Am. J. Hum. Genet. 99, 366–374 (2016).

  76. Lenz, T. L. et al. Widespread non-additive and interaction effects within HLA loci modulate the risk of autoimmune diseases. Nat. Genet. 47, 1085–1090 (2015).

Acknowledgements

The study was supported by the National Institutes of Health (NIH) TB Research Unit Network, Grant U19 AI111224-01. We thank C. Willer, B. Vanderwerff and B. Klunder from the University of Michigan for helping to make the constructed reference panel available on the Michigan Imputation Server. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute (NHLBI); the National Institutes of Health; or the US Department of Health and Human Services. The GaP Registry at The Feinstein Institute for Medical Research provided fresh, de-identified human plasma; blood was collected from control participants under an institutional review board–approved protocol (IRB No. 09-081) and processed to isolate plasma. The GaP is a sub-protocol of the Tissue Donation Program at Northwell Health and a national resource for genotype–phenotype studies (https://www.feinsteininstitute.org/robert-s-boas-center-for-genomics-and-human-genetics/gap-registry/).
For some HIV cohort participants, DNA and data collection was supported by NIH/NIAID AIDS Clinical Trial Group (ACTG) grants UM1 AI068634, UM1 AI068636 and UM1 AI106701, and ACTG clinical research site grants A1069412, A1069423, A1069424, A1069503, AI025859, AI025868, AI027658, AI027661, AI027666, AI027675, AI032782, AI034853, AI038858, AI045008, AI046370, AI046376, AI050409, AI050410, AI050410, AI058740, AI060354, AI068636, AI069412, AI069415, AI069418, AI069419, AI069423, AI069424, AI069428, AI069432, AI069432, AI069434, AI069439, AI069447, AI069450, AI069452, AI069465, AI069467, AI069470, AI069471, AI069472, AI069474, AI069477, AI069481, AI069484, AI069494, AI069495, AI069496, AI069501, AI069501, AI069502, AI069503, AI069511, AI069513, AI069532, AI069534, AI069556, AI072626, AI073961, RR000046, RR000425, RR023561, RR024156, RR024160, RR024996, RR025008, RR025747, RR025777, RR025780, TR000004, TR000058, TR000124, TR000170, TR000439, TR000445, TR000457, TR001079, TR001082, TR001111 and TR024160. Molecular data for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the NHLBI. See the TOPMed omics support table (Supplementary Table 23) for study-specific omics support information. Core support including centralized genomic read mapping and genotype calling, along with variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support including phenotype harmonization, data management, sample-identity QC and general program coordination was provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The COPDGene project was supported by Award Number U01 HL089897 and Award Number U01 HL089856 from the NHLBI. 
The content is solely the responsibility of the authors and does not necessarily represent the official views of the NHLBI or the National Institutes of Health. The COPDGene project is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprised of AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, Novartis, Pfizer, Siemens and Sunovion. A full listing of COPDGene investigators can be found at http://www.copdgene.org/directory. The JHS is supported and conducted in collaboration with Jackson State University (HHSN268201800013I), Tougaloo College (HHSN268201800014I), Mississippi State Department of Health (HHSN268201800015I) and University of Mississippi Medical Center (HHSN268201800010I, HHSN268201800011I and HHSN268201800012I) contracts from the NHLBI and the National Institute on Minority Health and Health Disparities. We also thank the staff and participants of the JHS. MESA and the MESA SHARe project are conducted and supported by the NHLBI in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079 and UL1-TR-001420. MESA Family is conducted and supported by the NHLBI in collaboration with MESA investigators. Support is provided by grants and contracts R01HL071051, R01HL071205, R01HL071250, R01HL071251, R01HL071258 and R01HL071259, and by the National Center for Research Resources, grant UL1RR033176. 
The provision of genotyping data was supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR001881, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center. This project has been funded in whole or in part with federal funds from the Frederick National Laboratory for Cancer Research, under Contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US Government. This research was supported in part by the Intramural Research Program of the NIH, Frederick National Laboratory, Center for Cancer Research. The Diabetes Heart Study was supported by R01 HL92301, R01 HL67348, R01 NS058700, R01 AR48797, R01 DK071891, R01 AG058921, the General Clinical Research Center of the Wake Forest University School of Medicine (M01 RR07122, F32 HL085989), the American Diabetes Association and a pilot grant from the Claude Pepper Older Americans Independence Center of Wake Forest University Health Sciences (P60 AG10484). A. Metspalu is supported by Gentransmed grant 2014-2020.4.01.15-0012. D.W.H. is supported by NIH grants AI110527, AI077505, TR000445, AI069439 and AI110527. J.T.E. and P.E.S. were supported by NIH/NIAMS R01 AR042742, R01 AR050511 and R01 AR063611. Y.O. was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI (19H01021, 20K21834), AMED (JP20km0405211, JP20ek0109413, JP20ek0410075, JP20gm4010006 and JP20km0405217), Takeda Science Foundation, JST Moonshot R&D (JPMJMS2021, JPMJMS2024) and the Bioinformatics Initiative of Osaka University Graduate School of Medicine, Osaka University.

Author information

Contributions

Y. Luo and S. Raychaudhuri conceived, designed and performed analyses, wrote the manuscript and supervised the research. M. Kanai implemented the omnibus test for the HIV-1 fine-mapping study. Y. Luo, W.C., M. Kanai, P.E.S., J.T.E. and B. Han contributed to the development of the HLA-TAPAS pipeline. X. Li performed the selection analysis. S. Sakaue performed imputation comparison between Beagle v.4 and Minimac4. L. Forer, S. Schoenherr, C. Fuchsberger and A.V.S. hosted the HLA imputation server. J.T.E., M.G.-A. and P.K.G. helped with the GaP data acquisition. K. Yamamoto, K.O., D.W.H., X.G., N.D.P., Y.-D.I.C., J.I.R., K.D.T., S.S.R., A.C., J.G.W., S. Kathiresan, M.H.C., A. Metspalu, T.E. and Y.O. contributed to the WGS data acquisition. J. Fellay, M. Carrington and P.J.M. contributed to the HIV-1 data acquisition. All authors contributed to the writing of the manuscript.

Corresponding authors

Correspondence to Yang Luo or Soumya Raychaudhuri.

Ethics declarations

Competing interests

M.H.C. has received consulting or speaking fees from Illumina and AstraZeneca, and grant support from GSK and Bayer. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Pierre-Antoine Gourraud and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 HLA nomenclature.

Description of a classical HLA allele using current standard nomenclature. The first field corresponds to the serological antigen. The second field distinguishes HLA alleles that differ by one or more missense variants. The third field distinguishes HLA alleles that differ by one or more synonymous variants. The G-group distinguishes HLA alleles that differ by one or more synonymous variants within the exons that encode the peptide-binding groove regions (exons 2 and 3 for HLA class I genes and exon 2 for HLA class II genes).
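To make the field structure described above concrete, the following is a minimal sketch of a parser for allele names such as HLA-B*57:01:01G. The class name, function name and regular expression are illustrative assumptions and are not part of HLA-TAPAS; the sketch handles only the three fields and the trailing "G" covered in this caption, and ignores a fourth field and expression suffixes.

```python
import re
from typing import NamedTuple, Optional


class HLAAllele(NamedTuple):
    """Hypothetical container for the parts of an HLA allele name."""
    gene: str
    field1: str             # first field: serological antigen
    field2: Optional[str]   # second field: missense differences
    field3: Optional[str]   # third field: synonymous differences
    is_g_group: bool        # True when the name ends with the G-group "G"


# Optional "HLA-" prefix, gene, "*", then one to three colon-separated fields.
_PATTERN = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<f1>\d+)(?::(?P<f2>\d+))?(?::(?P<f3>\d+))?(?P<g>G)?$"
)


def parse_hla(name: str) -> HLAAllele:
    """Split an allele name into gene, fields and G-group flag."""
    m = _PATTERN.match(name.strip())
    if m is None:
        raise ValueError(f"unrecognized HLA allele name: {name!r}")
    return HLAAllele(m["gene"], m["f1"], m["f2"], m["f3"], m["g"] is not None)


def one_field(name: str) -> str:
    """Collapse an allele name to one-field resolution, e.g. B*57."""
    allele = parse_hla(name)
    return f"{allele.gene}*{allele.field1}"
```

For example, parse_hla("HLA-B*57:01:01G") yields gene B with fields 57, 01 and 01 and the G-group flag set, while one_field maps the same name to B*57.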

Extended Data Fig. 2 Correlation between imputed and typed dosage (dosage r²) of classical HLA alleles in 1,067 Admixed African HIV-1 individuals.

The x-axis shows the minor allele frequency observed in the SBT dataset. Blue points show G-group HLA alleles. Red points show one-field HLA alleles.
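As an illustration of the accuracy metric named in this figure, the sketch below computes a dosage r² as the squared Pearson correlation between imputed and typed allele dosages for one allele. This is the usual reading of the term and is assumed here rather than taken from the authors' code; the function name and the toy values are hypothetical.

```python
import math
from typing import Sequence


def dosage_r2(imputed: Sequence[float], typed: Sequence[float]) -> float:
    """Squared Pearson correlation between imputed and typed dosages.

    Each element is one individual's dosage (0 to 2) of a single allele.
    Returns NaN when either vector has no variance.
    """
    n = len(imputed)
    if n != len(typed) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mean_i = sum(imputed) / n
    mean_t = sum(typed) / n
    sxx = sum((x - mean_i) ** 2 for x in imputed)
    syy = sum((y - mean_t) ** 2 for y in typed)
    sxy = sum((x - mean_i) * (y - mean_t) for x, y in zip(imputed, typed))
    if sxx == 0 or syy == 0:
        return math.nan
    return (sxy * sxy) / (sxx * syy)


# Toy data (hypothetical): imputed dosages versus typed allele counts.
example = dosage_r2([0.1, 0.9, 1.8, 0.0], [0, 1, 2, 0])
```

A value near 1 indicates that the imputed dosages track the typed genotypes closely; rare alleles tend to give noisier estimates because few carriers contribute to the correlation.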

Extended Data Fig. 3 Association tests within the MHC to HIV-1 viral load.

The x-axis shows genomic position on chromosome 6 (build 37), and the y-axis shows the −log₁₀(P value) obtained from two-sided regression analyses for SNPs (gray), classical HLA alleles (blue) and amino acids (red). The dashed black line indicates the genome-wide significance threshold (P = 5 × 10⁻⁸). For biallelic markers (circles), results were calculated by a linear regression model including sex, cohort-specific principal components and an ancestry indicator as covariates. Associations at amino acid positions with more than two residues (diamonds) were calculated using a multi-degree-of-freedom omnibus test (one-sided F-test) including the same covariates. The top associated amino acid, classical HLA allele and SNPs are annotated in the figure. a, Of all variants tested, the top hit maps to amino acid position 97 in HLA-B. b, Subsequent conditional analysis controlling for all residues at position 97 in HLA-B revealed an independent association at position 67 in HLA-B. c, Results conditioned on positions 97 and 67 in HLA-B showed a third signal at position 156 in HLA-B. d, Results conditioned on positions 97, 67 and 156 in HLA-B showed that position 77 in HLA-A has the strongest association signal outside HLA-B among all amino acid positions. e, Results conditioned on all amino acid positions in HLA-B. Notably, amino acid positions were more significant than any single SNP or classical HLA allele in each conditional analysis for the three amino acid positions in HLA-B.
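The multi-degree-of-freedom omnibus test mentioned in this caption can be read as a comparison of two nested linear models: a null model with the covariates only, and a full model that adds one dosage column per non-reference residue at the position. The sketch below computes the resulting F statistic with a small pure-Python least-squares solver. It is an assumed, simplified illustration and not the implementation used in the study; all function names are hypothetical, and the P value (the upper tail of the F distribution with df1 and df2 degrees of freedom) is left out.

```python
def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def _rss(design, y):
    """Residual sum of squares of the least-squares fit of y on design."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = _solve(xtx, xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, yi in zip(design, y)
    )


def omnibus_f(y, covariates, residue_dosages):
    """F statistic for adding residue dosage columns to a covariate-only model.

    y: phenotype per individual (e.g. set point viral load).
    covariates: one tuple per individual (may be empty).
    residue_dosages: one tuple per individual with a dosage for each
        non-reference residue (k - 1 columns for a position with k residues).
    Returns (F, df1, df2).
    """
    null = [[1.0] + list(c) for c in covariates]
    full = [row + list(d) for row, d in zip(null, residue_dosages)]
    df1 = len(full[0]) - len(null[0])
    df2 = len(y) - len(full[0])
    if df1 < 1 or df2 < 1:
        raise ValueError("not enough degrees of freedom")
    rss_null, rss_full = _rss(null, y), _rss(full, y)
    return ((rss_null - rss_full) / df1) / (rss_full / df2), df1, df2
```

Conditional analysis, as in panels b to e, fits the same comparison after moving the already-selected positions' dosage columns into the covariates, so that the F statistic measures only the additional signal at the position being tested.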

Extended Data Fig. 4 Effect on set point viral load of individual residues at position 97 in HLA-B.

Mean set point viral load (spVL, RNA copies per milliliter) and its standard error for each of the six residues at position 97 in HLA-B, shown for the three populations independently. Data are presented as mean values ± standard errors. Residues are ranked from the most protective to the riskiest in the overall population. The analysis includes 3,901 AA, 7,455 EUR and 677 LAT independent individuals.

Extended Data Fig. 5 Global diversity of the MHC region.

Principal component analysis of the pairwise IBD distance between 21,546 individuals using MHC region markers. The first two principal components show separation of continental groups.

Extended Data Fig. 6 Diversity of eight classical HLA genes in the constructed multi-ancestry MHC reference panel.

Each gene is stratified by population (AA, Admixed African; EAS, East Asian; EUR, European; LAT, Latino; SAS, South Asian). The top two most common alleles within each classical gene of each population are plotted across all panels. Alleles that have frequencies greater than 1% are also labelled in the bar plots. a, Class I genes. b, Class II genes.

Extended Data Fig. 7 Allele diversity of eight classical HLA genes in global populations.

For each gene, the five most frequent alleles across all populations are shown (light blue, most frequent; dark blue, second most frequent; light green, third most frequent; dark green, fourth most frequent; red, fifth most frequent; gray, all other alleles).

Extended Data Fig. 8 Pairwise normalized entropy (ε) among all population groups.

The normalized entropy (ε) measures the difference between the haplotype frequency distributions under linkage disequilibrium and under linkage equilibrium, and takes values from 0 (no LD) to 1 (perfect LD).

Extended Data Fig. 9 Deviation from average genome-wide ancestry in Admixed African and Latino populations.

a,b, The x-axis is the genomic position on chromosome 6. The y-axis shows the local African ancestry deviation measure inferred at a given position for Admixed Africans (a) and Latinos (b). The MHC region (chr6:28Mb-34Mb) is highlighted in red shading. Local ancestries were estimated using RFMix (red) and ELAI (blue). The ancestry deviation measure is the difference between the African ancestry at a given genomic position and the genome-wide average estimated by ADMIXTURE with K = 3, normalized by the standard deviation of the ancestry estimate. The dashed line indicates the genome-wide significance threshold, set at ±4.42 standard deviations of the ancestry estimate from the genome-wide average.
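The ancestry deviation measure described in this caption is a standardized difference, which the sketch below illustrates. It assumes that the normalizing standard deviation is the sample standard deviation of the local ancestry estimates across positions, which the caption does not spell out; the function name and the toy values are hypothetical.

```python
from statistics import stdev
from typing import List, Sequence, Tuple


def ancestry_deviation(
    local_ancestry: Sequence[float],
    genome_wide_mean: float,
    threshold: float = 4.42,
) -> Tuple[List[float], List[int]]:
    """Standardized deviation of local ancestry from the genome-wide average.

    local_ancestry: mean African ancestry proportion at each position.
    genome_wide_mean: genome-wide average ancestry (e.g. from ADMIXTURE).
    threshold: significance cutoff in standard deviations (4.42 in the caption).
    Returns the per-position deviations and the indices beyond the cutoff.
    """
    # Assumption: normalize by the spread of local ancestry across positions.
    sd = stdev(local_ancestry)
    if sd == 0:
        raise ValueError("local ancestry has no variance")
    z = [(a - genome_wide_mean) / sd for a in local_ancestry]
    flagged = [i for i, v in enumerate(z) if abs(v) > threshold]
    return z, flagged


# Toy data (hypothetical): 99 unremarkable positions and one outlier.
toy = [0.79 if i % 2 == 0 else 0.81 for i in range(99)] + [0.95]
toy_z, toy_flagged = ancestry_deviation(toy, genome_wide_mean=0.80)
```

In the toy data only the final position exceeds the ±4.42 cutoff, mirroring how a region such as the MHC would stand out against the rest of the chromosome.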

Extended Data Fig. 10 Conditional analysis of other previously reported independently associated amino acid positions.

a,b, Manhattan plots of amino acid positions in the six classical HLA genes. Each point shows a single amino acid position and its omnibus P-value after controlling for independent positions that are associated with spVL in this study (positions 97, 67 and 156 in HLA-B) (a) and independent positions that are reported only in previous studies14,18 and not in the presented work (positions 45, 63 and 116 in HLA-B and positions 77 and 95 in HLA-A) (b). Independently associated amino acid positions that are reported only in the European population14 are shown in blue. Independently associated amino acid positions that are reported only in the African American population18 are shown in purple. Independently associated amino acid positions identified in this study are shown in red.

Supplementary information

Supplementary Information

Supplementary Figs. 1–18, Tables 1–23 and Note.

Reporting Summary

Supplementary Data 1

Summary of inferred HLA alleles at G-group resolution. For each allele observed in the reference panel, we list its overall frequency (Freq); P value (Pval) for difference in frequencies across populations (a two-sided chi-square test with four degrees of freedom); frequency within each continental population (EUR, European; AA, admixed African; LAT, Latino; SAS, South Asian; EAS, East Asian) and its accuracy in two validation cohorts (JPN (n = 288 independent individuals) and 1KG (n = 955 independent individuals)).
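As an illustration of a frequency-difference test with four degrees of freedom, the sketch below assumes a Pearson chi-square test on a 2 × 5 contingency table (copies of the allele versus all other alleles, across five populations), which gives (2 − 1)(5 − 1) = 4 degrees of freedom. The exact test construction used for this table is not described here, and the counts and function names are hypothetical.

```python
import math

# Sketch under stated assumptions; not the authors' implementation.


def chi_square_stat(allele_counts, total_counts):
    """Pearson chi-square statistic for allele vs. non-allele counts across populations."""
    n_all = sum(total_counts)
    n_allele = sum(allele_counts)
    n_other = n_all - n_allele
    stat = 0.0
    for observed, total in zip(allele_counts, total_counts):
        expected_allele = n_allele * total / n_all
        expected_other = n_other * total / n_all
        stat += (observed - expected_allele) ** 2 / expected_allele
        stat += ((total - observed) - expected_other) ** 2 / expected_other
    return stat


def chi_square_p_df4(stat):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom.

    For 4 degrees of freedom the survival function has the closed form
    exp(-x/2) * (1 + x/2).
    """
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


# Hypothetical allele counts out of 100 chromosomes per population.
equal_stat = chi_square_stat([10, 10, 10, 10, 10], [100] * 5)
skewed_stat = chi_square_stat([50, 10, 10, 10, 10], [100] * 5)
equal_p = chi_square_p_df4(equal_stat)
skewed_p = chi_square_p_df4(skewed_stat)
```

Identical frequencies in all five populations give a statistic of zero, whereas the skewed example gives a large statistic and a very small P value.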

Supplementary Data 2

Imputation dosage r² of each allele. Each row shows one imputed classical HLA allele and its imputation accuracy (measured in dosage r²) in three cohorts with validation data (G1K (n = 955), GaP Registry (GAP, n = 75) and HIV (n = 1,067)). We also report the allelic frequency using the gold standard and the inferred alleles.
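A minimal sketch of a dosage-based accuracy measure follows, assuming that dosage r² is the squared Pearson correlation between the imputed allele dosage (0 to 2) and the gold-standard allele count in each individual. That definition is an assumption, as it is not spelled out in this section, and the function name and example values are hypothetical.

```python
# Sketch under stated assumptions; not the authors' implementation.


def dosage_r2(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true allele counts."""
    n = len(imputed)
    if n != len(truth) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(imputed) / n
    mean_y = sum(truth) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, truth))
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in truth)
    if sxx == 0 or syy == 0:
        # The correlation is undefined when either vector is constant.
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: four individuals, imputed dosage vs. typed allele count.
perfect = dosage_r2([0, 1, 2], [0, 1, 2])
noisy = dosage_r2([0.1, 0.9, 1.8, 0.2], [0, 1, 2, 0])
```

Under this definition, an allele that is monomorphic in either the imputed or the typed data has no defined r², which the sketch reports as NaN.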

Supplementary Data 3

Association study of HIV-1 viral load within the MHC. Each row lists the association test results for one of the binary markers that we imputed across the extended MHC region. For each marker, its unique identifier (ID), genomic base pair position in GRCh37 (BP), reference allele (REF), alternative allele (ALT), minor allele frequency (MAF), effect size (BETA) and standard error (SE) in each conditional analysis are listed.

Supplementary Data 4

HLA haplotype frequency based on eight classical HLA alleles inferred at G-group resolution. All haplotypes with count > 1 in each population and overall are listed. AA represents individuals of admixed African ancestry; EAS represents individuals of East Asian ancestry; EUR represents individuals of European ancestry; LAT represents individuals of Latino ancestry; SAS represents individuals of South Asian ancestry.

Peer Review Information

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, Y., Kanai, M., Choi, W. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nat Genet 53, 1504–1516 (2021). https://doi.org/10.1038/s41588-021-00935-7

  • DOI: https://doi.org/10.1038/s41588-021-00935-7
