The genetic makeup of an individual contributes to susceptibility and response to viral infection. While environmental, clinical and social factors play a role in exposure to SARS-CoV-2 and in COVID-19 disease severity1,2, host genetics may also be important. Identifying host-specific genetic factors may reveal biological mechanisms of therapeutic relevance and clarify causal relationships of modifiable environmental risk factors for SARS-CoV-2 infection and outcomes. We formed a global network of researchers to investigate the role of human genetics in SARS-CoV-2 infection and COVID-19 severity. We describe the results of three genome-wide association meta-analyses comprising up to 49,562 COVID-19 patients from 46 studies across 19 countries. We report 13 genome-wide significant loci that are associated with SARS-CoV-2 infection or severe manifestations of COVID-19. Several of these loci correspond to previously documented associations with lung or autoimmune and inflammatory diseases3–7. They also represent potentially actionable mechanisms in response to infection. Mendelian randomization analyses support a causal role for smoking and body mass index in severe COVID-19, although not for type II diabetes. The identification of novel host genetic factors associated with COVID-19, with unprecedented speed, was made possible by the community of human genetics researchers coming together to prioritize sharing of data, results, resources and analytical frameworks. This working model of international collaboration underscores what is possible for future genetic discoveries in emerging pandemics, or indeed for any complex human disease.
This Supplementary Information file contains the following sections: New and replicated loci from COVID-19 HGI meta-analyses; Additional independent susceptibility signals at the 3p21.31 locus; Sensitivity analysis for use of population controls; Sensitivity analysis for overlapping samples between cohorts in Mendelian randomization analyses; Supplementary discussion on study limitations; Supplementary References; and titles and summaries for Supplementary Tables 1-13 (see Excel file for Supplementary Tables).
Quantile-quantile (QQ) plots for GWAS from all individual studies that contributed data. The QQ plots show the expected -log10(P-values) on the x-axis and the observed -log10(unadjusted P-values) from the two-tailed inverse variance weighted meta-analysis on the y-axis (the red line shows no deviation from the expected) for each study contributing data to the analyses. The sample size of cases and controls is listed for each study in the plot title, as well as the median lambda value.
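As a rough illustration of the quantities behind such a plot, the sketch below computes QQ-plot coordinates and a genomic inflation factor (lambda) from a list of P-values. It assumes the common definition of lambda as the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null; the source does not spell out its exact definition, and the function names `genomic_inflation_lambda` and `qq_points` are hypothetical.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def genomic_inflation_lambda(p_values):
    """Median observed chi-square (1 df) over its expected median under the null.

    Assumed definition; P-values must lie in (0, 1].
    """
    # A two-tailed P-value p corresponds to z = Phi^-1(p / 2), and chi2 = z^2.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution, roughly 0.455.
    expected_median = _STD_NORMAL.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(P), sorted ascending."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Expected order statistics of n uniform P-values: (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Under this definition, a lambda close to 1 means the observed test statistics track the null expectation and the points sit on the diagonal. Values well above 1 suggest inflation, for example from population stratification.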
LocusZoom plots to visualise the meta-analysis results at the loci passing genome-wide significance. For each genome-wide significant locus in the three meta-analyses (critical illness, hospitalization, and reported infection), we show 1) a Manhattan plot of the locus, where color represents a weighted-average r2 value (see Methods) to the lead variant (unadjusted P-values from the two-tailed inverse variance weighted meta-analysis); 2) r2 values to the lead variant across gnomAD v2 populations, i.e., African/African-American (AFR), Latino/Admixed American (AMR), Ashkenazi Jewish (ASJ), East Asian (EAS), Estonian (EST), Finnish (FIN), Non-Finnish Europeans (NFE), North-Western Europeans (NWE), and Southern Europeans (SEU); 3) genes at the locus; and 4) genes prioritized by each gene prioritization metric, where circle size represents the rank in each metric. Note that the COVID-19 lead variants were chosen across all the meta-analyses (Supplementary Table 2; see Methods) and were not necessarily the variant with the most significant P-value in each inverse variance weighted meta-analysis.
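The P-values in these plots come from a two-tailed inverse variance weighted meta-analysis. A minimal sketch of the standard fixed-effect form of that calculation for a single variant follows, assuming per-study effect sizes and standard errors are available and that the test statistic is approximately normal. The function name `ivw_meta` and the example numbers are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def ivw_meta(betas, ses):
    """Fixed-effect inverse variance weighted meta-analysis for one variant.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, two_tailed_p).
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-tailed P-value from the standard normal distribution.
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, p_value


# Two hypothetical studies with equal precision.
beta, se, p = ivw_meta([0.2, 0.4], [0.1, 0.1])
```

With equal standard errors the pooled estimate is the plain average of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.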
Scatter and funnel plots for each exposure - COVID-19 outcome pair. Scatter plots show the exposure variant effect size against the COVID-19 outcome variant effect size, with the corresponding standard errors. Funnel plots show the Mendelian randomization (MR) causal estimates for each variant against their precision, with asymmetry in the plot indicating potential violations of the assumptions of MR. Regression lines show the corresponding causal estimates from the fixed-effect inverse variance weighted (IVW, red-solid line) meta-analysis; MR-Egger regression (blue-dashed); weighted median estimator (WME, green-dashed); weighted mode-based estimator (WMBE, purple-dashed); and Mendelian Randomization Pleiotropy RESidual Sum and Outlier corrected (MR-PRESSO, orange-dashed). Variants highlighted in red were flagged as outliers by MR-PRESSO.
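To make the plotted quantities concrete, the sketch below computes the fixed-effect IVW causal estimate (the slope of a weighted regression through the origin on the scatter plot) and the per-variant points of a funnel plot. It covers only the IVW estimator, not MR-Egger, the weighted median, the weighted mode, or MR-PRESSO. It assumes first-order weights that ignore uncertainty in the exposure effects, and the function names are hypothetical.

```python
import math


def mr_ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    Equivalent to regressing outcome effects on exposure effects through
    the origin, weighting each variant by 1 / se_outcome^2.
    Returns (causal_estimate, standard_error).
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def funnel_points(beta_exposure, beta_outcome, se_outcome):
    """Per-variant (ratio estimate, precision) pairs for a funnel plot.

    The ratio estimate is beta_outcome / beta_exposure; precision is the
    inverse of its first-order standard error, |beta_exposure| / se_outcome.
    """
    return [(by / bx, abs(bx) / se)
            for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)]


# Three hypothetical variants that all imply the same causal effect of 0.5.
bx = [0.1, 0.2, 0.3]
by = [0.05, 0.10, 0.15]
se = [0.01, 0.01, 0.01]
estimate, std_err = mr_ivw(bx, by, se)
```

When every variant implies the same ratio, as in this toy input, the funnel plot is perfectly symmetric around the IVW estimate. Directional pleiotropy would show up as asymmetry among the low-precision variants.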
This file contains Supplementary Tables 1-13; see main Supplementary Information PDF for table titles and summaries.
This file contains the full authorship for the COVID-19 Host Genetics Initiative.
Cite this article
COVID-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature (2021). https://doi.org/10.1038/s41586-021-03767-x