A recent genetic association study¹ identified a gene cluster on chromosome 3 as a risk locus for respiratory failure after infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A separate study (COVID-19 Host Genetics Initiative)² comprising 3,199 hospitalized patients with coronavirus disease 2019 (COVID-19) and control individuals showed that this cluster is the major genetic risk factor for severe symptoms after SARS-CoV-2 infection and hospitalization. Here we show that the risk is conferred by a genomic segment of around 50 kilobases in size that is inherited from Neanderthals and is carried by around 50% of people in south Asia and around 16% of people in Europe.
The summary statistics of the genome-wide association study that support the finding of this study are available from the COVID-19 Host Genetics Initiative (round 3, ANA_B2_V2: hospitalized patients with COVID-19 compared with population controls; https://www.covid19hg.org/). The genomes used are available from the 1000 Genomes Project (phase 3 release, https://www.internationalgenome.org/) and the Max Planck Institute for Evolutionary Anthropology (Chagyrskaya, Altai and Vindija 33.19, http://cdna.eva.mpg.de/neandertal/). The ancestral alleles are available at Ensembl (release 100, https://www.ensembl.org/). Map data are from OpenStreetMap and available from https://www.openstreetmap.org.
1. Ellinghaus, D. et al. Genomewide association study of severe COVID-19 with respiratory failure. N. Engl. J. Med. https://doi.org/10.1056/NEJMoa2020283 (2020).
2. COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020).
3. WHO. Coronavirus disease (COVID-19) Weekly Epidemiological Update and Weekly Operational Update: Weekly Epidemiological Update 14 September 2020 https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports (2020).
4. Vetter, P. et al. Clinical features of COVID-19. Br. Med. J. 369, m1470 (2020).
5. Zhou, F. et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 395, 1054–1062 (2020).
6. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).
7. Sankararaman, S., Patterson, N., Li, H., Pääbo, S. & Reich, D. The date of interbreeding between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012).
8. Prüfer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358, 655–658 (2017).
9. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).
10. Mafessoni, F. et al. A high-coverage Neandertal genome from Chagyrskaya Cave. Proc. Natl Acad. Sci. USA 117, 15132–15136 (2020).
11. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
12. Langergraber, K. E. et al. Generation times in wild chimpanzees and gorillas suggest earlier divergence times in great ape and human evolution. Proc. Natl Acad. Sci. USA 109, 15716–15721 (2012).
13. Kong, A. et al. A high-resolution recombination map of the human genome. Nat. Genet. 31, 241–247 (2002).
14. Huerta-Sánchez, E. et al. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature 512, 194–197 (2014).
15. Sankararaman, S. et al. The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507, 354–357 (2014).
16. Vernot, B. & Akey, J. M. Resurrecting surviving Neandertal lineages from modern human genomes. Science 343, 1017–1021 (2014).
17. Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 352, 235–239 (2016).
18. Steinrücken, M., Spence, J. P., Kamm, J. A., Wieczorek, E. & Song, Y. S. Model-based detection and analysis of introgressed Neanderthal ancestry in modern humans. Mol. Ecol. 27, 3873–3888 (2018).
19. Gittelman, R. M. et al. Archaic hominin admixture facilitated adaptation to out-of-Africa environments. Curr. Biol. 26, 3375–3382 (2016).
20. Chen, L., Wolf, A. B., Fu, W., Li, L. & Akey, J. M. Identifying and interpreting apparent Neanderthal ancestry in African individuals. Cell 180, 677–687 (2020).
21. Skov, L. et al. The nature of Neanderthal introgression revealed by 27,566 Icelandic genomes. Nature 582, 78–83 (2020).
22. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
23. OpenStreetMap. Planet OSM. https://planet.osm.org/ (2017).
24. Public Health England. COVID-19: Review of Disparities in Risks and Outcomes. https://www.gov.uk/government/publications/covid-19-review-of-disparities-in-risks-and-outcomes (2020).
25. Browning, S. R., Browning, B. L., Zhou, Y., Tucci, S. & Akey, J. M. Analysis of human sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61 (2018).
26. Dannemann, M., Andrés, A. M. & Kelso, J. Introgression of Neandertal- and Denisovan-like haplotypes contributes to adaptive variation in human Toll-like receptors. Am. J. Hum. Genet. 98, 22–33 (2016).
27. Zeberg, H., Kelso, J. & Pääbo, S. The Neandertal progesterone receptor. Mol. Biol. Evol. 37, 2655–2660 (2020).
28. Zeberg, H. et al. A Neanderthal sodium channel increases pain sensitivity in present-day humans. Curr. Biol. 30, 3465–3469 (2020).
29. Machiela, M. J. & Chanock, S. J. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics 31, 3555–3557 (2015).
30. Li, H. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27, 718–719 (2011).
31. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
32. Hasegawa, M., Kishino, H. & Yano, T. Dating of the human–ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174 (1985).
33. Yates, A. D. et al. Ensembl 2020. Nucleic Acids Res. 48, D682–D688 (2020).
34. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
We thank the COVID-19 Host Genetics Initiative for making the data from the genome-wide association study available, and the Max Planck Society and the NOMIS Foundation for funding.
The authors declare no competing interests.
Peer review information: Nature thanks Tobias Lenz, Yang Luo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Odds ratios for hospitalization due to COVID-19 for cohorts contributing to the meta-analysis (round 3) of the COVID-19 Host Genetics Initiative (rs35044562).
The odds ratio and the P value for the summary effect are odds ratio = 1.60 (95% confidence interval, 1.42–1.79) and P = 3.1 × 10⁻¹⁵ (two-sided z-test, n = 3,199 patients with COVID-19 and 897,488 controls over 8 independent studies). Data are the odds ratios and 95% confidence intervals. HOST(age), UK Biobank European (EUR), GENCOVID, deCODE and BelCovid use European population controls. BRACOVID, Genes & Health and FinnGen use American, south Asian and Finnish population controls, respectively.
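The relationship between the summary odds ratio, its confidence interval and the z-test can be checked with a short calculation. The Python sketch below assumes that the 95% interval is symmetric on the log-odds scale (a Wald-type interval), which this legend does not state. The function name `z_test_from_ci` is hypothetical. Because the published values are rounded to two decimals, the recovered P value should agree with the reported one only in order of magnitude.

```python
import math

def z_test_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), z and a two-sided P value from an odds ratio
    and its 95% confidence interval.

    Assumption (not stated in the legend): the interval is a Wald interval,
    that is, log(OR) +/- z_crit * SE on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    # Half the width of the interval on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability; erfc stays accurate for tiny tails.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Rounded summary values taken from the legend above.
se, z, p = z_test_from_ci(1.60, 1.42, 1.79)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, two-sided P = {p:.1e}")
```

Working on the log-odds scale is the usual choice because the sampling distribution of log(OR) is closer to normal than that of the odds ratio itself.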
Extended Data Fig. 2
Heat map of linkage disequilibrium between genetic variants where one allele is shared with three Neanderthal genomes and missing in 108 Yoruba individuals. The black box highlights a haplotype of 333.8 kb between rs17763537 and rs13068572 (chromosome 3: 45,843,315–46,177,096). Red, r² correlation; blue, D′ correlation.
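The two linkage disequilibrium measures shown in the heat map can be illustrated with a minimal Python sketch. It uses the standard definitions of r² and D′ for two biallelic variants on phased haplotypes. The function name `ld_stats` and the ten toy haplotypes are hypothetical and are not data from this study. The sketch also checks the stated haplotype length against the quoted coordinates.

```python
def ld_stats(haplotypes):
    """Return (r2, d_prime) for two biallelic variants.

    haplotypes: list of (a, b) tuples, one per phased chromosome,
    with alleles coded 0 or 1 at variant A and variant B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n      # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n      # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                         # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("LD is undefined when a variant is monomorphic")
    r2 = d * d / denom
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime

# Hypothetical toy data: 10 phased haplotypes (illustrative only).
toy = [(1, 1)] * 3 + [(0, 0)] * 6 + [(0, 1)] * 1
r2, d_prime = ld_stats(toy)
print(f"r2 = {r2:.3f}, D' = {d_prime:.3f}")

# Haplotype span from the coordinates quoted in the legend, in kilobases.
span_kb = (46_177_096 - 45_843_315) / 1000
print(f"span = {span_kb:.1f} kb")
```

In the toy data every haplotype carrying allele 1 at the first variant also carries allele 1 at the second, so D′ is 1, whereas r² is about 0.64 because the allele frequencies differ. The same distinction explains why the two colour scales in a heat map of this kind need not agree.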
Extended Data Fig. 3 Linkage disequilibrium between index variant rs11385942 and the index variant of the COVID-19 Host Genetics Initiative (rs35044562).
Shades of red indicate the extent of linkage disequilibrium (r²) in the populations included in the 1000 Genomes Project. Populations labelled ‘n/a’ are monomorphic for the protective allele of rs35044562. The previously described index variant (rs11385942)¹ does not have any genetic variants in linkage disequilibrium (r² > 0.8) in populations from Africa. Map source data from OpenStreetMap²³.
Extended Data Fig. 4 Phylogeny of haplotypes in individuals included in the 1000 Genomes Project and Neanderthals covering the genomic region of the core risk haplotype.
The shaded area highlights a monophyletic group that contains all present-day haplotypes carrying the risk allele at rs35044562 and the haplotypes of the three high-coverage Neanderthals. Arabic numbers show bootstrap support (100 replicates). The tree is rooted with the inferred ancestral human sequence. Scale bar, number of substitutions per nucleotide position.
Extended Data Fig. 5 Frequency differences between south and east Asia for haplotypes introgressed from Neanderthals.
The dashed line indicates the frequency difference for the Neanderthal haplotype that confers risk of severe COVID-19.
About this article
Cite this article
Zeberg, H., Pääbo, S. The major genetic risk factor for severe COVID-19 is inherited from Neanderthals. Nature (2020). https://doi.org/10.1038/s41586-020-2818-3