Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants

Aragam, Krishna G.; Jiang, Tao; Goel, Anuj; Kanoni, Stavroula; Wolford, Brooke N.; Atri, Deepak S.; Weeks, Elle M.; Wang, Minxian; Hindy, George; Zhou, Wei; Grace, Christopher; Roselli, Carolina; Marston, Nicholas A.; Kamanu, Frederick K.; Surakka, Ida; Venegas, Loreto Muñoz; Sherliker, Paul; Koyama, Satoshi; Ishigaki, Kazuyoshi; Åsvold, Bjørn O.; Brown, Michael R.; Brumpton, Ben; de Vries, Paul S.; Giannakopoulou, Olga; Giardoglou, Panagiota; Gudbjartsson, Daniel F.; Güldener, Ulrich; Haider, Syed M. Ijlal; Helgadottir, Anna; Ibrahim, Maysson; Kastrati, Adnan; Kessler, Thorsten; Kyriakou, Theodosios; Konopka, Tomasz; Li, Ling; Ma, Lijiang; Meitinger, Thomas; Mucha, Sören; Munz, Matthias; Murgia, Federico; Nielsen, Jonas B.; Nöthen, Markus M.; Pang, Shichao; Reinberger, Tobias; Schnitzler, Gavin; Smedley, Damian; Thorleifsson, Gudmar; von Scheidt, Moritz; Ulirsch, Jacob C.; Arnar, David O.; Burtt, Noël P.; Costanzo, Maria C.; Flannick, Jason; Ito, Kaoru; Jang, Dong-Keun; Kamatani, Yoichiro; Khera, Amit V.; Komuro, Issei; Kullo, Iftikhar J.; Lotta, Luca A.; Nelson, Christopher P.; Roberts, Robert; Thorgeirsson, Gudmundur; Thorsteinsdottir, Unnur; Webb, Thomas R.; Baras, Aris; Björkegren, Johan L. M.; Boerwinkle, Eric; Dedoussis, George; Holm, Hilma; Hveem, Kristian; Melander, Olle; Morrison, Alanna C.; Orho-Melander, Marju; Rallidis, Loukianos S.; Ruusalepp, Arno; Sabatine, Marc S.; Stefansson, Kari; Zalloua, Pierre; Ellinor, Patrick T.; Farrall, Martin; Danesh, John; Ruff, Christian T.; Finucane, Hilary K.; Hopewell, Jemma C.; Clarke, Robert; Gupta, Rajat M.; Erdmann, Jeanette; Samani, Nilesh J.; Schunkert, Heribert; Watkins, Hugh; Willer, Cristen J.; Deloukas, Panos; Kathiresan, Sekar; Butterworth, Adam S.

doi:10.1038/s41588-022-01233-6

Article
Open access
Published: 06 December 2022

Nature Genetics volume 54, pages 1803–1815 (2022)

Abstract

The discovery of genetic loci associated with complex diseases has outpaced the elucidation of mechanisms of disease pathogenesis. Here we conducted a genome-wide association study (GWAS) for coronary artery disease (CAD) comprising 181,522 cases among 1,165,690 participants of predominantly European ancestry. We detected 241 associations, including 30 new loci. Cross-ancestry meta-analysis with a Japanese GWAS yielded 38 additional new loci. We prioritized likely causal variants using functionally informed fine-mapping, yielding 42 associations with fewer than five variants in the 95% credible set. Similarity-based clustering suggested roles for early developmental processes, cell cycle signaling and vascular cell migration and proliferation in the pathogenesis of CAD. We prioritized 220 candidate causal genes, combining eight complementary approaches, including 123 supported by three or more approaches. Using CRISPR–Cas9, we experimentally validated the effect of an enhancer in MYO9B, which appears to mediate CAD risk by regulating vascular cell motility. Our analysis identifies and systematically characterizes >250 risk loci for CAD to inform experimental interrogation of putative causal mechanisms for CAD.

Main

Coronary artery disease (CAD) remains the leading global cause of mortality, reflecting both risk behaviors and genetic susceptibility¹. Genetic association studies have identified >200 susceptibility loci for CAD. Consistent with other complex diseases, genetic analyses have revealed the polygenic architecture of CAD, enabled insights into disease etiology and facilitated the development of new tools for risk prediction^2,3,4,5,6,7,8,9,10. However, with the rapid increase in the availability of genetic data linked to health outcomes, the identification of disease-associated loci has outpaced their functional characterization.

Several in silico tools have emerged to elucidate the mechanisms connecting genomic regions to disease risk^11,12. Nonetheless, it remains challenging to identify causal genes as these tools frequently lack consensus¹³. Recent analyses have suggested the value of integrating ‘locus-based’ approaches with more global (similarity-based) assessments of shared pathways and functions to enhance the prediction of causal genes^13,14,15. The use of orthogonal and disease-specific resources to aid variant and gene classifications may expedite the transition from gene maps to disease mechanisms.

To extend these approaches to CAD, we analyzed imputed data from nine studies not previously included in genome-wide association study (GWAS) meta-analyses (86,847 cases and 417,789 controls) and combined results with data from UK Biobank, the CARDIoGRAMplusC4D Consortium and Biobank Japan, achieving a total sample of 210,842 CAD cases among 1,378,170 participants^2,3,7,10,16. Our objectives were to (1) discover new associations with CAD; (2) determine the impact of expanded genetic discovery for identifying biologically relevant loci and improving risk prediction; (3) implement a systematic, integrative approach to prioritize likely causal variants, genes and biological pathways, thereby providing a catalog of testable hypotheses for experimental follow-up; and (4) experimentally validate a new locus as proof of principle for our prioritization framework.

Results

Discovery of known and new CAD loci

Participants were largely (>95%) of European ancestry and 46% were female (Supplementary Table 1). In total, 20,073,070 variants were included in the discovery meta-analysis (Online Methods). We replicated 150 (69.4%) of 216 previously reported CAD loci at conventional genome-wide significance (P ≤ 5.0 × 10⁻⁸) and 38 (17.6%) at nominal significance (P ≤ 1.0 × 10⁻⁵; Supplementary Table 2). Approximate conditional analysis using Genome-wide Complex Trait Analysis (GCTA) identified 241 conditionally independent associations exceeding genome-wide significance at 198 loci (Supplementary Table 3, Extended Data Fig. 1 and Supplementary Data 1). In total, 54 sentinel variants were new, including 30 outside genomic regions previously reported for CAD (Table 1).

Table 1 New loci for CAD from primary meta-analysis

As in previous CAD GWAS⁹, we found genetic correlations with several CAD risk factors and other cardiovascular diseases (Supplementary Table 4). To identify potential etiological mechanisms for specific loci, we conducted a phenome-wide association scan (PheWAS) in UK Biobank (Supplementary Table 5). In total, 128 (53%) of the CAD-associated variants had directionally consistent associations with conventional CAD risk factors, such as blood lipids, blood pressure, hyperglycemia or adiposity.

Several new associations (Table 1) were near genes that have not been robustly implicated in CAD via genetic association studies but have strong biological plausibility. These included rs6883598 near FBN2, which encodes fibrillin-2, a protein that mediates the early stages of elastic fiber assembly and is associated with aortic aneurysms and Beals syndrome, a Marfan-like disorder^17,18,19; and rs1892971 near MMP13, which encodes matrix metalloproteinase (MMP)-13, an interstitial collagenase that influences the structural integrity of atherosclerotic plaques through regulation and organization of intraplaque collagen^20,21. While the sentinel variant near FBN2 was associated with blood pressure in the PheWAS, the lead variant near MMP13 was not associated with conventional CAD risk factors, suggesting it is likely to act through alternative pathways.

Allelic architecture

Of the 54 new associations, 46 sentinel variants were common (minor allele frequency (MAF) > 0.05) with relatively weak effects on CAD (odds ratio (OR) per CAD risk allele: 1.03–1.07; Fig. 1). The remaining eight were low frequency (MAF = 0.009–0.036), of which four had comparatively strong effects (OR = 1.30–1.44) and four had more modest effects (OR = 1.10–1.14; Extended Data Fig. 2). We then conducted gene-based tests of missense and predicted loss-of-function variants in UK Biobank (n = 33,941 CAD cases, 438,394 controls; Supplementary Table 6) and found a strong signal for PCSK9. We did not find evidence for further association with a burden of low-frequency or rare variants (Extended Data Fig. 3 and Supplementary Table 7).

**Fig. 1: Common variant association signals for CAD.**

Differential effects by sex

To identify associations that differ by sex, we conducted sex-stratified GWAS in a subset of studies comprising 77,080 CAD cases (Supplementary Table 8). We found ten associations that reached genome-wide significance (P ≤ 5.0 × 10⁻⁸) and had evidence (P ≤ 0.01) for between-sex heterogeneity (Supplementary Table 9). Lead variant rs7696877 was the only signal with a stronger effect in females (per-allele OR = 0.94) than in males (per-allele OR = 0.98, heterogeneity P = 0.007).
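
The between-sex comparison can be illustrated with a simple two-group heterogeneity test. The sketch below assumes a z-test on the difference between two independent per-allele log odds ratios (equivalent to Cochran's Q with two groups); this is an assumption about the form of the test, not a description of the procedure in the Online Methods. The odds ratios are those reported for rs7696877, while the standard errors are hypothetical values chosen only for illustration.

```python
import math

def sex_heterogeneity(beta_f, se_f, beta_m, se_m):
    """Two-sided z-test for a difference between two independent estimates.

    Inputs are per-allele log odds ratios and their standard errors.
    """
    z = (beta_f - beta_m) / math.sqrt(se_f ** 2 + se_m ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Odds ratios as reported for rs7696877; the standard errors are hypothetical.
beta_f, beta_m = math.log(0.94), math.log(0.98)
z, p = sex_heterogeneity(beta_f, 0.012, beta_m, 0.009)
print(f"z = {z:.2f}, heterogeneity P = {p:.3g}")
```

A negative z here simply reflects a more protective per-allele effect in females than in males for the coded allele.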

Subthreshold associations

At a significance level (P < 2.52 × 10⁻⁵) approximating a 1% false discovery rate (FDR), we identified a further 656 conditionally independent associations with CAD (Supplementary Table 10). Most (486, 74.1%) were common variants, and almost all had modest effects (per-allele OR < 1.07). Several associations had strong biological priors, including rs41279633 (P = 1.24 × 10⁻⁶) in NPC1L1, encoding Niemann-Pick C1-like 1, an important mediator of intestinal cholesterol absorption and the target of ezetimibe, a cholesterol-lowering drug. Other examples included PNPLA3 (rs738408; P = 1.04 × 10⁻⁵), the strongest locus for nonalcoholic fatty liver disease²², and TCF7L2 (rs7903146; P = 6.39 × 10⁻⁸), the strongest locus for type 2 diabetes²³. The percentage of heritability for CAD (on the liability scale) explained by the 241 conditionally independent associations reaching genome-wide significance was 15.5%, increasing to 36.1% for the 897 associations with P < 2.52 × 10⁻⁵.

Polygenic score associations with incident and recurrent CAD

We evaluated 362 polygenic risk scores (PRS) using combinations of derivation methods (Pruning and Thresholding²⁴ or LDpred algorithm²⁵) and summary statistics (from the current meta-analysis or an earlier 1000 genomes-imputed GWAS involving around 60,000 CAD cases⁷). We selected the optimal PRS for each combination of the derivation method and GWAS summary statistics based on prediction of incident CAD in a training dataset from the Malmö Diet and Cancer study (MDC; n = 22,872; n_{incident_cases} = 3,307; Supplementary Table 11). The two top-performing scores were those derived with LDpred and comprised 2,324,653 variants (2022 PRS) and 1,532,758 variants (2015 PRS; Supplementary Tables 12–15). In bootstrapping analyses, the 2022 PRS outperformed the 2015 PRS (age- and sex-adjusted mean hazard ratio (HR) per 1 s.d. higher PRS = 1.56 versus 1.49; P = 3.2 × 10⁻³¹; age- and sex-adjusted mean area under the receiver operator characteristic curve (AUC) = 0.742 versus 0.736; P = 6.5 × 10⁻¹⁶; Supplementary Table 16).

We validated both scores in a held-out subset of the MDC (n = 5,685; n_{incident_cases} = 815; Supplementary Table 11). The 2022 PRS was more strongly associated with incident CAD (HR = 1.61; 95% CI = 1.50–1.72) than the 2015 PRS (HR = 1.49; 95% CI = 1.39–1.59), providing improved stratification of participants at higher and lower risk for incident CAD (Fig. 2a). After adjustment for established risk factors (Online Methods), the 2022 PRS remained strongly associated with incident events (HR = 1.54; 95% CI = 1.42–1.66). The 2022 PRS yielded a 5.7-fold higher risk of CAD between the top and bottom deciles of the PRS, compared to a 3.8-fold higher risk with the 2015 PRS.
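
A polygenic score of the kind evaluated here is, at its core, a weighted sum of risk-allele dosages, usually standardized so that effects can be reported per 1 s.d. The sketch below shows only that final scoring step. It assumes the per-variant weights have already been derived (for example by LDpred or Pruning and Thresholding); the function names, weights and dosages are made-up toy values, not taken from the study.

```python
import statistics

def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0 to 2) for one participant."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))

def standardize(scores):
    """Rescale scores to mean 0 and s.d. 1 within the cohort."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]

# Hypothetical weights (log odds ratios) for three variants.
weights = [0.05, 0.03, 0.26]
# Hypothetical dosages for four participants.
cohort = [[0, 1, 0], [1, 1, 0], [2, 2, 0], [1, 0, 1]]

raw = [polygenic_score(d, weights) for d in cohort]
scaled = standardize(raw)
```

The standardized values are what a survival model would take as input to estimate a hazard ratio per 1 s.d. higher score.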

**Fig. 2: Polygenic prediction of incident and recurrent CAD.**

We then evaluated prediction of recurrent coronary events in the placebo arm of the Further Cardiovascular Outcomes Research with PCSK9 Inhibition in Subjects with Elevated Risk (FOURIER; n = 7,135; n_{incident_cases} = 673) clinical trial, a cohort of patients with established atherosclerotic cardiovascular disease²⁶. The 2022 PRS demonstrated better recurrent event prediction (HR = 1.20; 95% CI = 1.11–1.29) than the 2015 PRS (HR = 1.13; 95% CI = 1.04–1.22) and enhanced stratification of participants at higher and lower risk (Fig. 2b). The 2022 PRS yielded a 1.7-fold higher risk of recurrent coronary events between the top and bottom deciles of the PRS versus a 1.4-fold higher risk with the 2015 PRS.

Cross-ancestry comparison and meta-analysis

We used a large CAD GWAS from Biobank Japan to evaluate the genome-wide significant associations in East Asian ancestry participants³. Effect estimates for the 199 sentinel variants in both datasets were strongly positively correlated (r = 0.59) between the predominantly European ancestry meta-analysis and the Biobank Japan GWAS (Extended Data Fig. 4a), as were the effect allele frequencies (r = 0.76; Extended Data Fig. 4b). To assess the potential for enhanced, cross-ancestry discovery, we meta-analyzed the Biobank Japan summary statistics with the current analysis, yielding 38 additional new loci at genome-wide significance (Table 2, Fig. 1, and Supplementary Table 17). The sentinel variants were common (MAF > 5%) with weak effects (per-allele ORs: 1.026–1.059; Fig. 1), with the exception of rs75655731 near LINC005999, which was low-frequency (MAF = 1.4%) with a stronger effect (per-allele OR = 1.090); 36 of these associations were included in the 1% FDR set, including the aforementioned associations at TCF7L2 and PNPLA3.
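
The cross-ancestry step combines per-variant summary statistics from two sources. The following is a minimal sketch, assuming a fixed-effects inverse-variance-weighted model, which is a common choice for GWAS meta-analysis; the method actually used is specified in the Online Methods rather than in this section. The effect sizes and standard errors are hypothetical.

```python
import math

GENOME_WIDE_P = 5.0e-8

def ivw_meta(estimates):
    """Fixed-effects inverse-variance-weighted meta-analysis.

    estimates: iterable of (beta, se) pairs on the log odds ratio scale,
    all aligned to the same effect allele.
    """
    estimates = list(estimates)
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided P value
    return beta, se, p

# Hypothetical variant: subthreshold in each input, significant when combined.
european = (0.030, 0.0060)
japanese = (0.035, 0.0110)
beta, se, p = ivw_meta([european, japanese])
print(f"OR = {math.exp(beta):.3f}, P = {p:.2g}, significant = {p <= GENOME_WIDE_P}")
```

This shows how a signal in the 1% FDR set can cross the genome-wide threshold once a second, directionally consistent dataset is added.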

Table 2 New loci for CAD from meta-analysis with Biobank Japan

Prioritizing causal variants, genes and biological pathways

Using several independent approaches, we prioritized causal variants, effector genes, relevant tissues and intermediate causal pathways for all 279 significant associations. The presence of a protein-altering (that is, missense or predicted loss of function) variant has been shown to be a strong predictor of the causal gene, particularly if the variant is uncommon¹⁴. At 52 associations, the sentinel variant, or a strong proxy (r² ≥ 0.8), was a protein-altering variant (Supplementary Table 18). These included well-known low-frequency missense variants in PCSK9 (p.R46L) and ANGPTL4 (p.E40K)¹⁶. Nineteen of the 52 missense variants were new, including a missense variant (rs129415; p.G398R) in SCUBE1 that is strongly correlated with the CAD sentinel variant (r² = 0.99). SCUBE1 encodes signal peptide-CUB-EGF domain-containing protein 1, a glycoprotein secreted by activated platelets whose inhibition protects against thrombosis in mice²⁷.

Functionally informed fine-mapping

Incorporating functional annotations into fine-mapping approaches has been shown to improve identification of causal variants^28,29,30. Using ChromHMM-derived chromatin states from the NIH Roadmap Epigenomics Consortium to functionally annotate the genome, we found more than twofold enrichment for these states in the ten CAD-relevant cell/tissue types we tested, consistent with previous findings (Supplementary Table 19)⁷. Of 235 distance-based regions containing genome-wide significant associations, we found 127 (54.0%) with significant enrichment (Supplementary Table 20). The majority (78; 61.4%) of these enriched regions were relatively tissue specific, showing enrichment in fewer than three tissues, but eight regions showed widespread enrichment in seven or more tissues (Fig. 3a). Adipose (n = 33), liver (n = 26) and aorta (n = 21) showed the greatest enrichment for the most regions (Supplementary Table 20).

We applied a functionally informed fine-mapping method (functional genome-wide association analysis (FGWAS))²⁹, which uses chromatin state enrichment information to reweight GWAS summary statistics and compute variant-specific posterior probabilities of association (PPA). Among the 127 enriched regions, we identified 42 that contained fewer than five variants in the 95% credible set (Fig. 3b and Supplementary Table 21), while 53 regions contained a variant with PPA ≥ 0.5 (Fig. 3c and Supplementary Table 22), showing that the combination of functional annotation and high statistical power can pinpoint likely causal variants. Indeed, 14 regions were fine-mapped to a single variant, including missense variants in PCSK9, ANGPTL4 and APOE, plus other well-studied noncoding variants, such as rs9349379 (PHACTR1/EDN1)³¹ and rs2107595 (HDAC9/TWIST1)³².
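
The link between reweighted evidence, PPA and a 95% credible set can be sketched as follows. This is an illustrative simplification, not the FGWAS implementation: it assumes at most one causal variant per region, takes per-variant log Bayes factors as given, and represents functional annotation as a relative prior weight per variant. All names and numbers are hypothetical.

```python
import math

def posterior_probabilities(log_bayes_factors, prior_weights):
    """Per-variant posterior probability of association within one region."""
    logs = [lbf + math.log(w) for lbf, w in zip(log_bayes_factors, prior_weights)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(x - top) for x in logs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def credible_set(ppas, coverage=0.95):
    """Indices of the smallest set of variants whose PPAs sum to >= coverage."""
    order = sorted(range(len(ppas)), key=lambda i: ppas[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += ppas[i]
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical region: variant 1 has slightly weaker statistical support than
# variant 0 but sits in an enriched annotation, so its prior weight is higher.
log_bfs = [12.0, 11.5, 6.0, 5.0]
priors = [1.0, 4.0, 1.0, 1.0]
ppas = posterior_probabilities(log_bfs, priors)
best = max(range(len(ppas)), key=lambda i: ppas[i])
members = credible_set(ppas)
```

In this toy region the annotation-informed prior moves the highest PPA from the variant with the strongest association to its correlated neighbor, and the credible set contains two variants.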

At 12 loci, fine-mapping prioritized (PPA ≥ 0.5) variants that were not the sentinel. For example, at the low-density lipoprotein (LDL) cholesterol and adiposity-associated MAFB locus³³, the sentinel variant was rs2207132 (Supplementary Table 3 and Extended Data Fig. 5a). However, a strongly correlated variant (rs1883711; r² = 0.92) lies in a region annotated as a likely enhancer in liver and adipose tissue, the two enriched tissues at this locus (Extended Data Fig. 5b). Therefore, rs1883711 was upweighted by FGWAS (PPA = 0.77) over rs2207132 (PPA = 0.13). We queried CAD-associated variants for cis-expression quantitative trait loci (cis-eQTLs) in CAD-relevant tissues from the Stockholm-Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) and Genotype-Tissue Expression (GTEx) studies (Online Methods)^34,35. The eQTL for MAFB observed in liver samples from CAD patients in STARNET suggests that the CAD association is mediated by changes in MAFB expression (encoding MAF bZIP transcription factor B; Supplementary Table 22). MafB expression in macrophages is upregulated by oxidized LDL stimulation³⁶, while MafB deficiency in mice appears to increase atherosclerosis by inhibiting foam cell apoptosis³⁷.

Polygenic prioritization of candidate causal genes

Combining locus- and similarity-based approaches has been shown to enhance the prioritization of causal genes^14,38. However, established similarity-based methods have not leveraged the full polygenic signal to inform gene prioritization. We therefore incorporated a new similarity-based method for gene prioritization, the Polygenic Priority Score (PoPS), which uses the full genome-wide association data¹⁵. We applied PoPS to summary-level data from the GWAS meta-analysis. Initially, 57,543 features (including gene expression, protein–protein interaction networks and biological pathways) were considered, of which 19,091 features (33.2%) passed a marginal feature selection step and were input into the final PoPS model (Online Methods and Supplementary Table 23). We computed a PoPS score for all protein-coding genes within 500 kb of all 279 genome-wide associations and prioritized the gene with the highest PoPS score in each locus, resulting in 235 prioritized genes. PoPS prioritized many well-established genes implicated in CAD pathogenesis, including LDLR, APOB, PCSK9, SORT1, NOS3, VEGFA and IL6R (Supplementary Tables 24 and 25).
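
The per-locus selection step (highest-scoring gene within 500 kb of an association) can be sketched as below. This does not reproduce how PoPS scores are computed; it only shows the windowed argmax. Measuring distance to the gene body rather than to the transcription start site is an assumption, and the gene names, coordinates and scores are hypothetical.

```python
WINDOW_BP = 500_000

def distance_to_gene(position, start, end):
    """Distance in bp from a variant to a gene body (0 if inside the gene)."""
    if start <= position <= end:
        return 0
    return min(abs(position - start), abs(position - end))

def top_scoring_gene(position, genes, window=WINDOW_BP):
    """Name of the highest-scoring gene within the window, or None.

    genes: list of dicts with keys name, start, end and score, all on the
    same chromosome as the variant.
    """
    candidates = [
        g for g in genes
        if distance_to_gene(position, g["start"], g["end"]) <= window
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g["score"])["name"]

# Hypothetical locus: the nearest gene is not the top-scoring gene, and the
# highest score overall belongs to a gene outside the window.
genes = [
    {"name": "GENE_A", "start": 1_065_000, "end": 1_100_000, "score": 0.2},
    {"name": "GENE_B", "start": 850_000, "end": 892_000, "score": 1.4},
    {"name": "GENE_C", "start": 2_000_000, "end": 2_050_000, "score": 3.0},
]
chosen = top_scoring_gene(1_000_000, genes)
```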

Next, we identified features from the PoPS model which were most informative in prioritizing CAD-relevant genes. Hierarchical clustering yielded 2,852 clusters, which we ranked by relative contribution to the PoPS scores of prioritized genes (Fig. 4a). The highest-ranking cluster contained features indicating homeostatic regulation of blood lipids (Supplementary Table 26). Other top clusters were related to vascular cell function, migration and proliferation; the structure and function of the extracellular matrix and metabolic pathways including those in adipose tissue controlling thermoregulation, all well-established mechanisms in CAD pathogenesis^39,40,41. Additional high-ranking clusters highlighted early developmental processes and cell cycle signaling pathways as less recognized, but important, mediators of CAD risk.

**Fig. 4: PoPS informs the identification of causal genes for CAD.**

We then examined a locus where the PoPS method facilitated the prioritization of a putative causal gene. Lead variant rs1807214 lies in an intergenic region of chromosome 15 at which no causal gene has been established^7,8. Data from GTEx and STARNET identified cis-eQTLs for ABHD2, MFGE8 and HAPLN3 (Supplementary Tables 27 and 28). Prior locus-based algorithms have prioritized the nearest gene, ABHD2, located 65 kb downstream of the sentinel variant^5,38. However, PoPS prioritized MFGE8, located 108 kb upstream of the sentinel (Fig. 4b). MFGE8 encodes lactadherin, an integrin-binding glycoprotein implicated in vascular smooth muscle cell (VSMC) proliferation and invasion, and the secretion of proinflammatory molecules^42,43. In vitro deletion of this intergenic region by CRISPR–Cas9 increases MFGE8 expression—with no change to ABHD2 expression—and MFGE8 knock-down reduces coronary artery (CA)-VSMC and monocyte (THP-1) proliferation, lending functional support to MFGE8 as a likely causal mediator of the CAD association in this region⁴⁴.

Systematic prioritization of putative causal genes

We developed and applied a consensus-based prioritization framework involving eight similarity-based or locus-based predictors to systematically prioritize likely causal genes for all 279 genome-wide associations (Online Methods and Fig. 5a). Most likely causal genes were selected based on the highest (unweighted) number of the eight predictors. To test this framework, we generated an a priori set of 30 ‘positive control’ genes with well-established causal roles in CAD and assessed the accuracy of each predictor (Supplementary Table 29). Twenty-eight of the 30 positive control genes were correctly prioritized as the most likely causal gene based on the highest number of concordant predictors with a median of four concordant predictors per gene (Supplementary Table 30). All predictors demonstrated high accuracy, including nearest gene (90%), PoPS (90%), eQTL (85%) and mouse knock-outs (100%; Supplementary Table 30).
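
The unweighted consensus rule can be sketched as a vote count across predictors. The predictor names below are a subset chosen for illustration, the gene labels are hypothetical, and the alphabetical tie-break is an arbitrary assumption of this sketch rather than a rule stated in the text.

```python
from collections import Counter

def consensus_gene(predictions):
    """Pick the gene supported by the most predictors at one association.

    predictions: dict mapping predictor name -> set of genes it supports.
    Returns (gene, number_of_supporting_predictors), or (None, 0).
    """
    counts = Counter(g for genes in predictions.values() for g in set(genes))
    if not counts:
        return None, 0
    # Highest count wins; ties are broken alphabetically (assumption).
    gene, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return gene, n

# Hypothetical locus in which the nearest gene is outvoted.
predictions = {
    "nearest_gene": {"GENE_A"},
    "pops": {"GENE_B"},
    "eqtl": {"GENE_A", "GENE_B", "GENE_C"},
    "mouse_knockout": {"GENE_B"},
    "protein_altering": set(),
}
gene, support = consensus_gene(predictions)
```

Because every predictor counts equally, a gene backed by several weaker lines of evidence can outrank the nearest gene.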

**Fig. 5: Integrating eight gene prioritization predictors to identify most likely causal genes.**

We were able to prioritize a likely causal gene at 239 (85.7%) of the genome-wide associations based on having two or more concordant predictors, resulting in the prioritization of 220 genes (Supplementary Table 31). We considered 123 of these genes strongly prioritized (three or more concordant predictors; Fig. 5b and Supplementary Fig. 1). For 21 genes, the prioritized gene was not the nearest gene to the sentinel variant, including APOC3, PLTP and LOX. Agreement (the proportion of times that a predictor prioritized the same gene as the most likely causal gene) was high across predictors, including nearest gene (84%), PoPS (83%) and eQTLs (86%; Fig. 5a). Concordance (the proportion of times a pair of predictors both provided evidence for the consensus-based causal gene) was more variable (Extended Data Fig. 6); nearest gene and the presence of a protein-altering variant were typically concordant (71%), whereas monogenic genes and eQTLs were much less concordant (35%).
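
The support thresholds and the agreement metric defined above can be expressed compactly. In this sketch the tier labels are hypothetical names, and restricting the agreement denominator to associations where both the predictor and the consensus made a call is an assumption; the calls themselves are toy data.

```python
def support_tier(n_concordant):
    """Classify an association by its number of concordant predictors."""
    if n_concordant >= 3:
        return "strong"       # three or more concordant predictors
    if n_concordant >= 2:
        return "prioritized"  # two concordant predictors
    return "unresolved"

def agreement(predictor_calls, consensus_calls):
    """Share of associations where a predictor picks the consensus gene.

    Both arguments map an association ID to a gene name or None.
    """
    pairs = [
        (predictor_calls.get(locus), gene)
        for locus, gene in consensus_calls.items()
        if gene is not None and predictor_calls.get(locus) is not None
    ]
    if not pairs:
        return float("nan")
    return sum(p == c for p, c in pairs) / len(pairs)

# Hypothetical calls at four associations.
consensus_calls = {"locus1": "GENE_B", "locus2": "GENE_D",
                   "locus3": None, "locus4": "GENE_F"}
nearest_calls = {"locus1": "GENE_A", "locus2": "GENE_D",
                 "locus3": "GENE_E", "locus4": "GENE_F"}
nearest_agreement = agreement(nearest_calls, consensus_calls)
```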

Candidate loci with converging lines of evidence

Several newly identified CAD risk loci had strong variant- and gene-level evidence supporting their candidacy for functional interrogation. For example, we identified a CAD-associated region that was most strongly enriched in the aorta (Supplementary Table 3), with an intronic variant (rs4074793) in ITGA1 having a PPA of 0.95 (Extended Data Fig. 7a,b). Lead variant rs4074793 lies in a region annotated as a likely enhancer in several tissues and is the lead variant for a strong cis-eQTL for ITGA1 in liver among CAD patients from STARNET (P = 1.8 × 10⁻⁷³; Extended Data Fig. 7c). This eQTL was also seen in aorta, subcutaneous fat and mammary artery (Extended Data Fig. 7d). No other gene expression signals were seen at this locus, while PoPS also strongly prioritized ITGA1 as the likely causal gene (Supplementary Table 31). ITGA1 encodes integrin subunit alpha-1, a widely expressed protein that forms a heterodimer with integrin beta-1 and acts as a cell surface receptor for extracellular matrix components, such as collagens and laminins. The CAD risk allele (rs4074793-G), or strong proxies, were associated with elevated liver enzymes⁴⁵, C-reactive protein and LDL cholesterol⁴⁶, highlighting the influence of altered ITGA1 expression in the liver on lipid pathways as a likely causal pathway to CAD.

We also identified a new association with CAD at a gene-dense region enriched for epigenetic annotations in adipose, liver, monocytes and skeletal muscle myoblasts (Fig. 6a and Supplementary Table 20). FGWAS prioritized rs7246865 as the putative causal variant (PPA = 0.71). Among 30 genes within 500 kb of rs7246865, PoPS prioritized MYO9B (Supplementary Table 24), which encodes unconventional myosin-IXb, a myosin protein with Rho-GTPase signaling activity involved in cell migration⁴⁷. Evidence for the involvement of MYO9B was also provided by a cis-eQTL in tibial artery in GTEx (P = 5.3 × 10⁻⁸), with the CAD risk allele exhibiting lower MYO9B expression (Supplementary Table 27).

Experimental interrogation of a new CAD locus

We proceeded to investigate the functional significance of the MYO9B locus with respect to CAD risk. This genomic region is contained within a vascular tissue enhancer, as identified by a strong H3K27ac ChIP-seq signal in coronary artery, aorta and tibial artery (Fig. 6b). Using ATAC-seq of primary vascular cells, we identified open chromatin at rs7246865 in the following three cell types of relevance to CAD: immortalized human aortic endothelial cells (ECs), CA-VSMCs and monocytes (Fig. 6b).

We used CRISPR–Cas9 to delete the enhancer sequence in these cell types (Fig. 6b), achieving 53–72% effective deletion of a 131-bp segment within the enhancer (Fig. 6c). We measured the transcriptional effect of enhancer deletion on all genes expressed in these cell types within a 250-kb window surrounding rs7246865. The enhancer deletion resulted in reduced MYO9B and HAUS8 expression in ECs (Fig. 6d) and reduced MYO9B expression in CA-VSMCs (Fig. 6e), compatible with vascular GTEx eQTLs. There was no change in the expression of any other genes in the region in either cell type or of any gene in monocytes.

Finally, we sought to understand whether the enhancer is associated with a cellular phenotype of relevance to CAD. Given the cytoskeletal functions of MYO9B and HAUS8 in other cell types^47,48, we assessed the effects of these genes in a monolayer wound-healing assay, a composite of cell migration and proliferation⁴⁹. We observed that ECs with the enhancer deletion exhibited impaired wound healing, as did ECs with knock-outs of either MYO9B or HAUS8, suggesting that the regulatory effect of the enhancer contributes to CAD risk through impaired wound healing in ECs (Fig. 6f). We did not observe any effect on migration with deletion of the noncoding enhancer or MYO9B in CA-VSMCs.

Discussion

In a discovery analysis involving >200,000 cases of CAD and >1 million controls, we identified 279 genome-wide significant associations, including 82 reported here for the first time. We objectively prioritized likely causal variants and effector genes across all associations using functionally informed fine-mapping, a recently developed genome-wide gene prioritization method (PoPS), and systematic integration of locus-based and similarity-based predictors, with several tailored specifically to cardiovascular disease. Finally, informed by our prioritization framework, we experimentally interrogated a new CAD signal to establish a putative, mechanistic link between this genomic region and risk of CAD.

The large sample size enabled detection of more than 80 new genetic associations with CAD, predominantly common weak-effect variants. Our findings suggest that future, larger GWAS—at least those in European ancestry populations—are unlikely to discover many more large-effect common variants (that is, those with ORs greater than 1.05) associated with CAD. In fact, additional associations contributing to the long polygenic tail of CAD risk are likely to arise from the ~650 predominantly weak-effect signals among associations that reached the 1% FDR threshold, which in aggregate explained ~36% of the heritability of CAD. Notably, we identified 38 new loci when we incorporated recently published GWAS results based on only 29,000 CAD cases from Biobank Japan, demonstrating that future multi-ancestry analyses should enhance the yield of genetic discovery for CAD.

Consistent with previous studies, we demonstrated that a genome-wide PRS derived from this GWAS strongly predicts both incident and recurrent CAD^50,51,52,53. Notably, our new PRS demonstrated improved ability to discern those at higher and lower risk of CAD as compared to a widely used PRS derived from an earlier GWAS of ~61,000 CAD cases⁵². While the new PRS provides an improved tool for genetic risk prediction of CAD in the setting of primary and secondary prevention, our findings suggest that further increases in European-ancestry GWAS sample size may only modestly improve the predictive ability of the CAD PRS. More substantive improvements in polygenic risk prediction may arise from methodological developments, such as approaches that model interactions between variants or incorporate functional information^54,55. Moreover, further investigations are required to understand the extent to which genetic discovery analyses that include more non-European ancestry participants will improve the portability of PRS across ancestries, and whether this will result in improved prediction across all ancestry groups⁵⁶.

The weak effects of most CAD-associated variants do not preclude their contribution to important etiological insights with therapeutic implications, as the effects of pharmacologically perturbing identified targets are typically much stronger than those of naturally occurring genetic variants that are common in the population. For example, we uncovered common variant associations of weak effect at HMGCR and NPC1L1, which encode the targets of HMG-CoA reductase inhibitors (statins) and ezetimibe, respectively, two of the most effective and commonly prescribed medications for the prevention and management of CAD through lowering blood lipid levels. However, the translation of statistical associations into actionable biology and potential therapeutic targets requires elucidation of causal genes and mechanisms, which has lagged behind the rapid growth in genetic association discoveries.

Here we implemented strategies to enhance the identification of putative causal variants, genes and biological pathways. By incorporating epigenomic enrichment in disease-relevant tissues, an approach previously shown to improve fine-mapping over broader, disease-agnostic approaches²⁹, we prioritized likely causal variants that were not always those with the strongest statistical associations. Using a recently developed similarity-based tool (PoPS) that exploits the full genome-wide data to identify disease-enriched features, we prioritized >200 likely causal genes. Support for the validity of the genes prioritized by PoPS comes from the high ranking of features of known relevance to atherosclerosis (for example, lipid metabolism, extracellular matrix processes) from more than 50,000 tested features; the correct assignment of the most likely causal gene at several well-established lipid and nonlipid CAD loci; selection of the likely-correct causal gene over several other candidates in a region, including those in closer proximity to the sentinel (for example, MFGE8); and corroborating evidence at many loci from orthogonal gene prioritization methods, such as eQTLs in disease-relevant tissues.

As support from multiple, orthogonal lines of evidence increases the likelihood of prioritizing the correct causal gene, we propose an integrative, consensus-based prioritization framework that incorporates eight complementary predictors. By applying this framework to all 279 genome-wide associations, we systematically enhance the level of evidence around both known and new risk loci for CAD to arrive at 123 genes strongly prioritized on the basis of having three or more concordant predictors. Although distance from the sentinel variant has been shown to be a reasonable predictor of causal genes across many phenotypes^14,38, our integrative approach prioritized a gene that was not the nearest gene for 15% of associations. Also, at several newly identified associations, such as those nearest ITGA1 and MYO9B, we provide complementary lines of in silico evidence to nominate potential causal variants, genes and mechanistic pathways. Finally, we leveraged genome-editing and cell-based assays to interrogate the new association signal at chromosome 19, validating the involvement of MYO9B, but also implicating another putative causal gene, HAUS8. Importantly, these experimental findings substantiate our in silico prioritization of a region with apparent regulatory influence, and our similarity-based prioritization of cell migration pathways, as both MYO9B and HAUS8 may exert their influence on CAD risk through the control of vascular cell cytoskeleton. Furthermore, the findings raise the possibility that two genes at a locus may regulate a common, cellular pathway in coordinated fashion, such as seen for COL4A1 and COL4A2 at a well-established CAD risk locus⁵⁷. While experimental evidence is ultimately required to confirm causal mechanisms at all unresolved CAD risk loci, we provide a prioritization framework yielding evidence-based candidates that may be amenable to analogous functional follow-up.

Methods

Genetic discovery meta-analysis

Details of the ten de novo studies, including the source of participants, case and control definitions, basic participant characteristics, and ethics approval, are provided in Supplementary Note, Supplementary Table 1 and Extended Data Fig. 1. Study-specific sample and variant filters were applied before additive logistic (or logistic mixed) models were run, with CAD status as the outcome and adjusting for study-specific covariates, including those accounting for potential ancestry effects.

We performed an inverse-variance weighted meta-analysis on the betas and standard errors using METAL⁵⁸, combining the results from the ten de novo studies with previously published summary statistics. Variant-specific sample sizes were maximized by using a combination of summary statistics from prior CAD meta-analyses of the CARDIoGRAMplusC4D consortium, and additional variant filtering was performed, as detailed in Supplementary Note^2,7,10,16. The final dataset included 20,073,070 variants.
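As a minimal sketch of the fixed-effect inverse-variance weighted combination described above (an illustration only, not the METAL implementation), each study's beta can be weighted by the reciprocal of its squared standard error. The function name `ivw_meta` and the input values in the usage note are hypothetical.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis for one variant.

    betas, ses: per-study effect estimates and standard errors.
    Returns (pooled beta, pooled SE, z score, two-sided P value).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / SE_i^2
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

For example, `ivw_meta([0.10, 0.10], [0.02, 0.02])` pools two equally precise studies, so the pooled beta is unchanged and the standard error shrinks by a factor of √2.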

Joint association analysis

We performed joint association analysis using GCTA software⁵⁹. This approach fits an approximate multiple regression model using summary-level meta-analysis statistics and LD corrections estimated from a reference panel (here the UKBB sample using European ancestry participants only). We adopted a chromosome-wide stepwise selection procedure to select variants and estimate their joint effects at (i) a genome-wide significance level (P_Joint ≤ 5.0 × 10⁻⁸) in the meta-analyzed variants that reached genome-wide significance (n = 18,348) and (ii) an FDR 1% P value cut-off (P_Joint ≤ 2.52 × 10⁻⁵) in the 1% FDR variant list (n = 47,622). We identified 241 independent variants at the genome-wide significance threshold and 897 independent variants within the 1% FDR list.

Identifying previously reported regions and associations

To identify regions of the genome previously reported as having associations with CAD, we first collapsed variants reaching genome-wide significance by clumping variants within 500 kb of each other into a single locus. We compared these regions with all variants previously found to be associated with CAD at a genome-wide level of significance (P ≤ 5.0 × 10⁻⁸) from previous large-scale genetic association studies of CAD. Regions were annotated as ‘known’ if they included a previously reported CAD-associated variant. To assess which of our associations were previously reported or new, we examined the pairwise correlation between each of our 279 genome-wide significant sentinel variants and any nearby previously reported variants, defining ‘new’ as having r² < 0.2 in UK Biobank European ancestry participants.
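The locus definition above can be sketched as a single pass over position-sorted variants, merging any variant that lies within 500 kb of the previous one on the same chromosome. This is an illustrative sketch under stated assumptions: the function names are hypothetical, chromosomes are assumed to be coded with a single consistent type, and a region is taken to "include" a reported variant when the variant falls between the first and last clumped positions.

```python
def clump_into_loci(variants, window_bp=500_000):
    """Group (chrom, pos) pairs into loci of variants within window_bp of each other.

    Returns a list of (chrom, start, end) tuples.
    """
    loci = []
    for chrom, pos in sorted(variants):
        same_locus = (
            loci
            and loci[-1][0] == chrom
            and pos - loci[-1][2] <= window_bp
        )
        if same_locus:
            loci[-1][2] = pos  # extend the current locus to this variant
        else:
            loci.append([chrom, pos, pos])  # start a new locus
    return [tuple(locus) for locus in loci]


def annotate_known(loci, reported_variants):
    """Flag each locus as known if it contains a previously reported variant."""
    reported = list(reported_variants)
    flags = []
    for chrom, start, end in loci:
        known = any(c == chrom and start <= p <= end for c, p in reported)
        flags.append(known)
    return flags
```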

Genetic correlation analysis

Genetic correlation between CAD and conventional risk factors (total cholesterol, LDL cholesterol, HDL cholesterol, triglycerides, body mass index, systolic blood pressure and diastolic blood pressure) and cardiometabolic diseases (type 2 diabetes, ischemic stroke and heart failure) was assessed using LD Score Regression (LDSC)⁶⁰. We used the 1000 Genomes European ancestry LD file comprising ~1.2 million variants available at https://alkesgroup.broadinstitute.org/LDSCORE/.

PheWAS in UK Biobank

To understand the spectrum of phenotypic consequences of our 279 independent associations with CAD, we conducted a PheWAS in the UK Biobank (see Supplementary Note for complete analysis details). Briefly, we tested for associations with 53 cardiovascular and noncardiovascular diseases and 32 continuous traits, as listed in Supplementary Tables 32 and 33. A genetic variant was considered to be associated with a ‘conventional CAD risk factor’ if the CAD risk-increasing allele exhibited a directionally consistent/positive association with blood lipids (total cholesterol, LDL cholesterol, triglycerides or a diagnosis of hypercholesterolemia); blood pressure (systolic blood pressure, diastolic blood pressure or a diagnosis of hypertension); hyperglycemia (serum glucose, hemoglobin A1c or a diagnosis of type 2 diabetes) or adiposity (body mass index).

Rare variant analyses

Variant annotation was performed using Variant Effect Predictor (VEP) v96.0 with the LOFTEE plugin on version three imputed data and variants with an information score ≥0.8 (refs. ^61,62). Various gene-based groupings were tested (Supplementary Table 6) and allele frequencies from the entire UK Biobank cohort were used for groupings. Variants (n = 64,102) were considered to be in a gene if they fell within the gene coordinates as defined by GENCODE v19. Gene-based association tests were performed in SAIGE-GENE v0.35.8.5 using a white British subset of UK Biobank (28,683 CAD cases and 367,783 controls)⁶³. Software defaults were used, except that in step 0 the number of markers for the sparse matrix was 2,000, and in step 1 the tolerance for the preconditioned conjugate gradient to converge was 0.01 and variance ratios were estimated across MAC categories. At least two variants were required in each gene for testing. Covariates in the model included the genotyping array, the first five principal components calculated in the white British subset of samples, birth year and sex. Burden, SKAT and SKAT-O tests were performed for each gene. As no strong signals were observed except for the PCSK9 gene, we did not extend our rare variant testing to other studies.

Sex-specific analysis

We performed a sex-stratified GWAS analysis in UK Biobank following the same phenotype definition and sample exclusions as the main analysis. We used the SAIGE software and adjusted our single-variant association analysis for the first five genetic principal components and the genotyping array, separately for men and women⁶⁴. Based on promising initial results in UK Biobank, we collated sex-stratified GWAS summary statistics, as available, from other participating studies (Supplementary Table 6). Additional details of sex-specific analyses are provided in Supplementary Note.

FDR estimation

The FDR following the meta-analysis was assessed using the ‘q value’ R package. We generated q values for all 20.1 million variants. The P value cut-off for a q value of 1% was 2.52 × 10⁻⁵ and there were 47,622 variants reaching that threshold. Joint conditional analysis was performed using GCTA (as described earlier) to identify approximately independent association signals.
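To illustrate how a q value threshold maps back to a P value cut-off, the sketch below computes Benjamini-Hochberg adjusted P values, which coincide with Storey q values when the proportion of true nulls is fixed at 1. This is a swapped-in simplification, not the 'q value' R package used in the study, which estimates that proportion from the data. Function names are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values (q values with pi0 fixed at 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def p_value_cutoff(pvalues, q_threshold=0.01):
    """Largest P value whose q value is at or below q_threshold (None if no variant passes)."""
    qvalues = bh_qvalues(pvalues)
    passing = [p for p, q in zip(pvalues, qvalues) if q <= q_threshold]
    return max(passing) if passing else None
```

Applied genome-wide, the count of P values at or below the returned cut-off corresponds to the number of variants reaching the chosen FDR threshold.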

Estimation of heritability explained

Heritability calculations were based on a multifactorial liability-threshold model, implemented in the INDI-V calculator (http://cnsgenomics.com/shiny/INDI-V/), under the assumption of a baseline population risk (K) of 0.0719 and a twin heritability (H_L²) of 0.4 (refs. ^65,66). Single-variant regression estimates from the meta-analysis summary statistics were used to estimate heritability for the sentinel variants at the 241 conditionally independent genome-wide significant associations and the 897 conditionally independent associations reaching the 1% FDR threshold in the primary meta-analysis. To account for correlation between variants, multiple regression estimates from the GCTA joint association analysis were also used to estimate heritability for both sets of variants.

Cross-ancestry comparison

For cross-ancestry comparison, we used summary statistics from a recent GWAS of 29,319 CAD cases and 183,134 controls from Biobank Japan³. In total, 199 of the 241 sentinel variants from our primary meta-analysis were also found in the Biobank Japan study; after aligning effect alleles, we compared the beta estimates and minor allele frequencies using Pearson’s correlation coefficient. To investigate the effect of outliers on the between-ancestry correlation of beta estimates, we re-estimated the correlation coefficient after excluding three strong outliers (at ATXN2, FER and SLC22A1). We then performed an inverse-variance weighted meta-analysis on the beta estimates and standard errors, incorporating summary results from Biobank Japan and those from all other studies in our primary meta-analysis. After cross-ancestry meta-analysis, we again dropped variants that were only present in one study or had fewer than 30,000 cases in total from all contributing studies, leaving 23,333,163 variants after filtering. We then collapsed variants reaching genome-wide significance (P ≤ 5.0 × 10⁻⁸) by clumping variants within 500 kb into a single locus, resulting in 38 additional loci that did not contain a previously reported CAD variant.
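The allele alignment and correlation steps can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and strand flips and palindromic variants, which a real pipeline would need to handle, are deliberately ignored.

```python
import math


def align_beta(beta, effect_allele, other_allele, ref_effect, ref_other):
    """Express beta relative to the reference study's effect allele.

    Returns None when the allele pair does not match in either orientation.
    """
    if (effect_allele, other_allele) == (ref_effect, ref_other):
        return beta
    if (effect_allele, other_allele) == (ref_other, ref_effect):
        return -beta  # effect allele is swapped, so flip the sign
    return None


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Re-estimating the correlation without outliers then amounts to dropping the corresponding pairs from both lists before calling `pearson_r` again.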

Derivation and training of PRSs

PRS were derived using the pruning and thresholding method or the LDpred computational algorithm (LDpred v.1.0), with 503 European ancestry individuals derived from the 1000 Genomes Project study serving as the linkage disequilibrium reference panel⁶⁷. To evaluate the added utility of our GWAS for the prognostication of CAD risk, we compared two sets of scores using effect estimates from either the current meta-analysis or from our previous 1000 Genomes-imputed GWAS of CAD involving ~60,000 cases⁷. For each derivation method and summary statistic, we constructed a range of scores of varying sizes drawing from common genetic variants that overlapped between the current meta-analysis, the earlier 1000 Genomes-imputed CAD GWAS and our training/validation datasets from the MDC Study⁶⁸. Additional details on PRS derivation and training are contained in Supplementary Note.

Incident event prediction analyses

Cox proportional hazard models were used to assess the time-to-event relationship between each PRS and incident CAD events in the MDC study (see Supplementary Note for study details). Baseline models were adjusted for age and sex only, and then subsequently, for established risk factors for CAD (total cholesterol, HDL cholesterol, systolic blood pressure, body mass index, type 2 diabetes, current smoking status and family history of CAD). Harrell C-statistics were estimated using Cox proportional hazard analysis over a 21-year follow-up period to assess the discrimination of the PRS.
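For readers unfamiliar with the discrimination metric, the following is a minimal sketch of Harrell's C-statistic for right-censored data. It is a plain pairwise implementation for illustration, not the software used in the study; in practice the score would be the linear predictor from the fitted Cox model. As a simplification, pairs with tied follow-up times are skipped.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index.

    times: follow-up times; events: 1 if the event occurred, 0 if censored;
    scores: risk scores, where higher means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has the shorter follow-up and had an event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```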

Recurrent event prediction analyses

The two optimal PRS (2022 PRS and 2015 PRS) were calculated in participants of the genetic substudy of the FOURIER trial (see Supplementary Note for trial details) using the genotype dosage for each allele, multiplied by its weight and then summed across all variants. Patients received a raw score standardized per 1 s.d. (continuous), as well as a percentile score relative to the total cohort. All scoring was performed using PLINK v2.0 (www.cog-genomics.org/plink/2.0/)⁶⁹. Model goodness-of-fit was evaluated using the concordance statistic and Akaike’s Information Criterion. R version 3.6.1 was used for statistical analyses.
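The scoring arithmetic described above (dosage multiplied by weight, summed across variants, then standardized and ranked) can be sketched as below. The study used PLINK v2.0 for scoring; this sketch only mirrors the calculation, with hypothetical function names and the assumption that the sample standard deviation is used for standardization.

```python
import statistics


def raw_prs(dosages, weights):
    """Sum of effect-allele dosage times weight across variants for one person."""
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Rescale raw scores to mean 0 and s.d. 1 within the cohort."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample s.d.; an assumption of this sketch
    return [(s - mean) / sd for s in scores]


def percentile_ranks(scores):
    """Percentage of the cohort with a score at or below each person's score."""
    n = len(scores)
    return [100.0 * sum(1 for t in scores if t <= s) / n for s in scores]
```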

The clinical outcome of interest was recurrent major coronary events, defined as myocardial infarction, coronary revascularization or death from CAD (n = 673 incident cases). Participants in the genetic cohort were followed for a median of 2.3 years. All endpoints were formally adjudicated by a blinded clinical events committee during the trial. A Cox model was used to determine the HR per 1 s.d. higher level of the PRS and for the extreme deciles compared to the middle 80%. Analyses were adjusted for age, sex and ancestry (using principal components 1–5).

Identifying protein-altering variants

To identify protein-altering variants among our genome-wide significant associations, we took the 279 sentinel variants and their LD proxies at r² ≥ 0.8 as estimated in the European ancestry subset of UK Biobank and annotated them using the Ensembl VEP⁶². We selected for each sentinel variant any proxies identified as having a ‘high’ (that is, stop-gain and stop-loss, frameshift indel, donor and acceptor splice-site and initiator codon variants) or ‘moderate’ (that is, missense, in-frame indel, splice region) consequence and recorded the gene that the variant disrupts.

Functional GWAS analysis

To fine-map loci and identify credible functional variants, we applied FGWAS software²⁹. The software integrates GWAS summary statistics with epigenetic data and we used the ChromHMM-derived states from the NIH Roadmap Epigenomics Consortium on a selection of ten CAD-relevant cell/tissue types (adipose nuclei, aorta and human skeletal muscle myoblasts (HSMM), liver, human umbilical vein endothelial cells (HUVEC), kidney, adrenal gland, pancreatic islets, primary monocytes and T-cells from peripheral blood)^70,71. To maximize our search space to find functional elements, we prepared a custom state by merging likely functional ChromHMM states (enhancers, transcription start sites, repressed polycomb, transcription at 5′ and 3′ of gene) for each genomic position. We reweighted the GWAS by running a null model and then a model containing the custom annotation for each of the ten tissues. Regions of the genome that showed strong enrichment (>3 s.d. increment in Bayes factor (BF)) and had a genome-wide significant CAD-associated variant (P < 5.0 × 10⁻⁸) were selected. For each region, we identified the tissue that showed maximum increment in BF and then constructed a 95% credible functional set of variants based on the ranked PPA for each variant within a region.
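The final step, building a 95% credible set from ranked posterior probabilities of association (PPA), can be sketched as follows. This illustrates only the set construction, not the FGWAS model; the function name and the variant labels in the usage note are hypothetical, and PPAs are renormalized within the region as a precaution.

```python
def credible_set(ppa_by_variant, coverage=0.95):
    """Smallest set of top-ranked variants whose cumulative PPA reaches coverage."""
    total = sum(ppa_by_variant.values())
    if total <= 0:
        raise ValueError("PPAs must sum to a positive value")
    ranked = sorted(ppa_by_variant.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative = 0.0
    for variant, ppa in ranked:
        selected.append(variant)
        cumulative += ppa / total  # normalize within the region
        if cumulative >= coverage:
            break
    return selected
```

For example, a region with PPAs `{"v1": 0.90, "v2": 0.06, "v3": 0.03, "v4": 0.01}` yields the credible set `["v1", "v2"]`, because the top two variants together reach 96% of the posterior probability.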

eQTL analysis in CAD-relevant tissues

To examine whether the CAD associations were driven by changes in gene expression in CAD-relevant tissues and cell types, we interrogated cis-eQTLs from CAD-relevant tissues in the STARNET eQTL study and the GTEx study^34,35. Analysis-specific details are provided in the Supplementary Note.

Polygenic prioritization of candidate causal genes

We implemented PoPS, a similarity-based gene prioritization method designed to leverage the full genome-wide signal to nominate causal genes independent of methods utilizing GWAS data proximal to the gene¹⁵. Broadly, PoPS leverages polygenic enrichments of gene features including cell-type-specific gene expression, curated biological pathways and protein–protein interaction networks (Supplementary Table 23) to train a linear model to compute a PoPS for each gene (see Supplementary Note for further details).

Variants responsible for cardiovascular-relevant monogenic disorders

To identify genes harboring pathogenic variants responsible for cardiovascular-relevant monogenic disorders, we searched the NCBI’s ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/) on 26 June 2020. Variants were pruned to those within ±500 kb of our CAD sentinel variants; categorized as ‘pathogenic’ or ‘likely pathogenic’; with a listed phenotype; and with either (i) details of the evidence for pathogenicity, (ii) expert review of the gene or (iii) a gene that appears in practice guidelines. We then filtered variants that were annotated with a manually curated set of cardiovascular-relevant phenotype terms, including those related to CAD, CAD risk factors (lipids, metabolism, blood pressure, obesity and platelets), bleeding disorders and relevant cardiac, vasculature or neurological abnormalities (Supplementary Table 34). Where a variant was annotated with multiple genes, both genes were considered as potentially pathogenic.

Phenotyping knock-out mice

Human gene symbols were mapped to gene identifiers (HGNC) and mouse ortholog genes were obtained using Ensembl (www.ensembl.org). Phenotype data for single-gene knock-out models were obtained from the International Mouse Phenotyping Consortium, data release 10.1 (www.mousephenotype.org), and from the Mouse Genome Informatics database, data from July 2019 (www.informatics.jax.org). For each mouse model, reported phenotypes were grouped using the mammalian phenotype ontology hierarchy into broad categories relevant to CAD as follows: cardiovascular physiology (MP:0001544), cardiovascular morphology (MP:0002127), growth and body weight (MP:0001259), lipid homeostasis (MP:0002118), cholesterol homeostasis (MP:0005278) and lung morphology (MP:0001175). This resulted in mapping from genes to phenotypes in animals (Supplementary Table 35).

Rare variant associations, MR and drug evidence

To inform prioritization of causal genes within 1-Mb regions around our genome-wide associations, we reviewed the literature for three sources of evidence as follows: (1) rare coding variants previously associated with CAD, either individually or in aggregate gene-based tests, through whole-exome sequencing (WES) or exome array studies; (2) Mendelian randomization (MR) studies of gene expression, protein levels or proximal phenotypes that implicate specific genes as causal effector genes for CAD and (3) drugs proven to be effective for cardiovascular-relevant indications and that target specific proteins encoded by genes.

Systematic integration of gene prioritization evidence

To systematically prioritize likely causal genes for all 279 genome-wide associations, we integrated the following eight of the aforementioned similarity-based or locus-based predictors of causal genes: (1) the top two prioritized genes from PoPS; (2) genes with eQTLs in CAD-relevant tissues from STARNET or GTEx; (3) genes containing protein-altering variants that are in strong LD (r² ≥ 0.8) with the CAD sentinel variant; (4) genes harboring variants responsible for monogenic disorders of cardiovascular relevance according to ClinVar; (5) genes containing rare coding variants that have been associated with CAD risk in previous WES or array-based studies; (6) genes encoding proteins of causal relevance to CAD per MR studies or that are targets for established cardiovascular drugs; (7) genes that display cardiovascular-relevant phenotypes in knock-out mice from the International Mouse Phenotyping Consortium or Mouse Genome Informatics database; and (8) the nearest gene to the CAD sentinel variant (Fig. 5a). We prioritized the most likely ‘causal gene’ for each association using a consensus-based approach, selecting the gene with the highest, unweighted sum of evidence across all eight predictors.
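A minimal sketch of the unweighted consensus at a single locus is given below. Each predictor contributes one vote to every gene it nominates, and the gene with the most votes is selected. The function name, predictor labels and gene placeholders are hypothetical, and because tie handling is not specified here, the sketch simply returns all tied genes.

```python
from collections import Counter


def consensus_genes(predictor_genes):
    """Select the gene(s) with the highest unweighted count of supporting predictors.

    predictor_genes: dict mapping predictor name -> iterable of genes it nominates
    at one locus (for example, up to two genes for PoPS).
    Returns (list of top genes, Counter of votes per gene).
    """
    votes = Counter()
    for genes in predictor_genes.values():
        votes.update(set(genes))  # at most one vote per predictor per gene
    if not votes:
        return [], votes
    top = max(votes.values())
    winners = sorted(gene for gene, count in votes.items() if count == top)
    return winners, votes
```

A gene could then be labeled as strongly prioritized when its vote count is three or more, matching the threshold of three or more concordant predictors used in the text.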

We tested our approach by evaluating whether 30 (positive control) genes with established relevance to CAD were prioritized as the most likely causal genes within their respective genomic regions. Positive control genes were selected by a literature search that sought evidence from engineered mouse models of reduced gene expression (‘knock-out’ or ‘knock-down’ models), MR studies or successful drug targets. In addition, we defined two measures to summarize the relative contributions of individual predictors and pairs of predictors to the consensus-based approach. Specifically, we defined ‘agreement’ as the proportion of times that an individual predictor prioritized the same gene that was nominated as the most likely causal gene by the consensus-based framework. ‘Concordance’ was defined as the proportion of times a pair of predictors both converged on the gene that was nominated as the most likely causal gene by the consensus of the eight predictors.
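The two summary measures can be sketched as simple proportions. This sketch assumes the denominator is every locus with a consensus gene, which is one reasonable reading of the definitions above; function and variable names are hypothetical.

```python
def agreement(predictor_calls, consensus):
    """Proportion of loci where one predictor nominates the consensus gene.

    predictor_calls: dict locus -> set of genes nominated by the predictor.
    consensus: dict locus -> consensus gene.
    """
    hits = sum(
        1 for locus, gene in consensus.items()
        if gene in predictor_calls.get(locus, set())
    )
    return hits / len(consensus)


def concordance(calls_a, calls_b, consensus):
    """Proportion of loci where both predictors nominate the consensus gene."""
    hits = sum(
        1 for locus, gene in consensus.items()
        if gene in calls_a.get(locus, set()) and gene in calls_b.get(locus, set())
    )
    return hits / len(consensus)
```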

CRISPR–Cas9 genome editing in vascular cells

Human coronary artery VSMCs (Lonza CC-2583; culture media CC-31182) were used at passage five or earlier. Endothelial cell experiments were conducted with immortalized human aortic endothelial cells (ATCC CRL-4052; culture media Lifeline Cell Technology LL-0003). Monocyte experiments were conducted with THP-1 monocyte cells (ATCC TIB-202; culture media RPMI ATCC 30-2001, 10% FBS Sigma 12306C-500ML). Genome editing was performed as previously described (Supplementary Note)⁷².

Gene expression by qPCR

For assessment of gene expression, mRNA was extracted (Qiagen RNeasy kit; Qiagen, 74106) and DNase I digestion was performed (DNase I, Thermo Fisher 18068015) before cDNA synthesis (Applied Biosystems, 43-688-14) and qPCR (Applied Biosystems, 4444965). Gene expression was assessed by quantitative PCR with TaqMan probes (Invitrogen) for genes of interest (MYO9B: Hs00994622_m1; HAUS8: Hs00928622_m1; OCEL1: Hs00928613_m1; USE1: Hs00218426_m1; NR2F6: Hs00172870_m1; GAPDH: Hs03929097_g1). Data are shown relative to expression of GAPDH. Statistical analyses were conducted with unpaired two-way Student’s t test.

Noncoding enhancer characterization

Assay for transposase-accessible chromatin using sequencing (ATAC-seq) data for THP-1 monocytes and CA-VSMCs were previously available. We performed ATAC-seq in human immortalized aortic endothelial cells as previously described⁷³. H3K27ac ChIP-seq data were publicly available via ENCODE (coronary artery, ENCFF970RKM; aorta, ENCFF118EKX; tibial artery, ENCFF972ZHA).

Wound-healing assay

Wound-healing assays were performed as previously described (Platypus Technologies, CMAUFL4)⁴⁹. After genome editing, 15,000 cells per well were plated with well inserts in place in culture media. Inserts were then removed the day after plating. Prior to complete wound healing (48–72 h), cells were stained with Calcein AM dye (Invitrogen, C3099) and wound healing was quantified with a fluorescence plate reader (excitation 488 nm/emission 522 nm). Statistical analyses were conducted with one-way ANOVA between groups. Where specific software tools are not named, we used Stata or R for analyses.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Summary statistics are available upon publication through the CARDIoGRAMplusC4D website (http://www.cardiogramplusc4d.org/) and the NHGRI-EBI GWAS catalog (https://www.ebi.ac.uk/gwas/, accession codes: GCST90132314 (https://www.ebi.ac.uk/gwas/studies/GCST90132314) and GCST90132315 (https://www.ebi.ac.uk/gwas/studies/GCST90132315)). Interactive searchable Manhattan plots and a locus-specific epigenome annotation browser for functionally enriched loci are available at https://procardis.shinyapps.io/cadgen/. An interactive searchable browser detailing the locus-specific evidence prioritizing causal variants, genes and pathways is available at the Common Metabolic Diseases Knowledge Portal (https://hugeamp.org/method.html?trait=cad&dataset=cardiogram).

Other datasets used in this study include the NCBI’s ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/; accessed on 26 June 2020), a 1000 Genomes European ancestry LD file comprising ~1.2 million variants (https://alkesgroup.broadinstitute.org/LDSCORE/), the GTEx Consortium v7 data release (https://www.gtexportal.org/home/datasets), the Ensembl database (www.ensembl.org), the International Mouse Phenotyping Consortium, data release 10.1 (www.mousephenotype.org) and the Mouse Genome Informatics database, data from July 2019 (www.informatics.jax.org).

Code availability

Custom code for preparing the study-specific GWAS summary statistics files for meta-analysis can be found at https://github.com/cambridge-ceu/cardiogramplusC4D_GWAS. Custom code for PRS analysis using a modified version of LDpred 1.0 can be found at https://github.com/wavefancy/LDpredChrByChr.

References

GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet 396, 1204–1222 (2020).
Howson, J. M. et al. Fifteen new risk loci for coronary artery disease highlight arterial-wall-specific mechanisms. Nat. Genet. 49, 1113–1119 (2017).
Ishigaki, K. et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 52, 669–679 (2020).
Klarin, D. et al. Genetic analysis in UK Biobank links insulin resistance and transendothelial migration pathways to coronary artery disease. Nat. Genet. 49, 1392–1397 (2017).
Koyama, S. et al. Population-specific and trans-ancestry genome-wide analyses identify distinct and shared genetic risk loci for coronary artery disease. Nat. Genet. 52, 1169–1177 (2020).
Nelson, C. P. et al. Association analyses based on false discovery rate implicate new loci for coronary artery disease. Nat. Genet. 49, 1385–1391 (2017).
Nikpay, M. et al. A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
van der Harst, P. & Verweij, N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 122, 433–443 (2018).
Verweij, N. et al. Identification of 15 novel risk loci for coronary artery disease and genetic risk of recurrent events, atrial fibrillation and heart failure. Sci. Rep. 7, 2761 (2017).
Webb, T. R. et al. Systematic evaluation of pleiotropy identifies 6 further loci associated with coronary artery disease. J. Am. Coll. Card. 69, 823–836 (2017).
de Leeuw, C. A. et al. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015).
Barbeira, A. N. et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 22, 49 (2021).
Stacey, D. et al. ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci. Nucleic Acids Res. 47, e3 (2019).
Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Preprint at medRxiv https://doi.org/10.1101/2020.09.08.20190561 (2020).
Myocardial Infarction Genetics and CARDIoGRAM Exome Consortia Investigators. Coding variation in ANGPTL4, LPL, and SVEP1 and the risk of coronary disease. N. Engl. J. Med. 374, 1134–1144 (2016).
Zhang, H., Hu, W. & Ramirez, F. Developmental expression of fibrillin genes suggests heterogeneity of extracellular microfibrils. J. Cell Biol. 129, 1165–1176 (1995).
Putnam, E. A. et al. Fibrillin-2 (FBN2) mutations result in the Marfan-like disorder, congenital contractural arachnodactyly. Nat. Genet. 11, 456–458 (1995).
Takeda, N. et al. Congenital contractural arachnodactyly complicated with aortic dilatation and dissection: case report and review of literature. Am. J. Med. Genet. A 167A, 2382–2387 (2015).
Deguchi, J. O. et al. Matrix metalloproteinase-13/collagenase-3 deletion promotes collagen accumulation and organization in mouse atherosclerotic plaques. Circulation 112, 2708–2715 (2005).
Quillard, T. et al. Selective inhibition of matrix metalloproteinase-13 increases collagen content of established mouse atherosclerosis. Arterioscler. Thromb. Vasc. Biol. 31, 2464–2472 (2011).
Romeo, S. et al. Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 40, 1461–1465 (2008).
Grant, S. F. et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat. Genet. 38, 320–323 (2006).
Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
Vilhjálmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592 (2015).
Sabatine, M. S. et al. Rationale and design of the further cardiovascular outcomes research with PCSK9 inhibition in subjects with elevated risk trial. Am. Heart J. 173, 94–101 (2016).
Wu, M.-Y. et al. Inhibition of the plasma SCUBE1, a novel platelet adhesive protein, protects mice against thrombosis. Arterioscler. Thromb. Vasc. Biol. 34, 1390–1398 (2014).
Kichaev, G. et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 10, e1004722 (2014).
Pickrell, J. K. Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am. J. Hum. Genet. 94, 559–573 (2014).
Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
Gupta, R. M. et al. A genetic variant associated with five vascular diseases is a distal regulator of endothelin-1 gene expression. Cell 170, 522–533 (2017).
Prestel, M. et al. The atherosclerosis risk variant rs2107595 mediates allele-specific transcriptional regulation of HDAC9 via E2F3 and Rb1. Stroke 50, 2651–2660 (2019).
Surakka, I. et al. The impact of low-frequency and rare variants on lipid levels. Nat. Genet. 47, 589–597 (2015).
Franzén, O. et al. Cardiometabolic risk loci share downstream cis-and trans-gene regulation across tissues and diseases. Science 353, 827–830 (2016).
GTEx Consortium. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
Alasoo, K. et al. Genetic effects on promoter usage are highly context-specific and contribute to complex traits. Elife 8, e41673 (2019).
Hamada, M. et al. MafB promotes atherosclerosis by inhibiting foam-cell apoptosis. Nat. Commun. 5, 3147 (2014).
Mountjoy, E. et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 53, 1527–1533 (2021).
Wong, D., Turner, A. W. & Miller, C. L. Genetic insights into smooth muscle cell contributions to coronary artery disease. Arterioscler. Thromb. Vasc. Biol. 39, 1006–1017 (2019).
Article CAS Google Scholar
Bennett, M. R., Sinha, S. & Owens, G. K. Vascular smooth muscle cells in atherosclerosis. Circ. Res. 118, 692–702 (2016).
Article CAS Google Scholar
Fantuzzi, G. & Mazzone, T. Adipose tissue and atherosclerosis: exploring the connection. Arterioscler. Thromb. Vasc. Biol. 27, 996–1003 (2007).
Article CAS Google Scholar
Chiang, H.-Y., Chu, P.-H. & Lee, T.-H. MFG-E8 mediates arterial aging by promoting the proinflammatory phenotype of vascular smooth muscle cells. J. Biomed. Sci. 26, 61 (2019).
Article Google Scholar
Wang, M., Wang, H. H. & Lakatta, E. G. Milk fat globule epidermal growth factor VIII signaling in arterial wall remodeling. Curr. Vasc. Pharm. 11, 768–776 (2013).
Article CAS Google Scholar
Soubeyrand, S. et al. Regulation of MFGE8 by the intergenic coronary artery disease locus on 15q26.1. Atherosclerosis 284, 11–17 (2019).
Article CAS Google Scholar
Chambers, J. C. et al. Genome-wide association study identifies loci influencing concentrations of liver enzymes in plasma. Nat. Genet. 43, 1131–1138 (2011).
Article CAS Google Scholar
Klarin, D. et al. Genetics of blood lipids among ~300,000 multi-ethnic participants of the Million Veteran Program. Nat. Genet. 50, 1514–1523 (2018).
Article CAS Google Scholar
Hanley, P. J. et al. Motorized RhoGAP myosin IXb (Myo9b) controls cell shape and motility. Proc. Natl Acad. Sci. USA 107, 12145–12150 (2010).
Article CAS Google Scholar
Lu, Y. et al. Genome-wide identification of genes essential for podocyte cytoskeletons based on single-cell RNA sequencing. Kidney Int. 92, 1119–1129 (2017).
Article CAS Google Scholar
Gough, W. et al. A quantitative, facile, and high-throughput image-based cell migration method is a robust alternative to the scratch assay. J. Biomol. Screen. 16, 155–163 (2011).
Article Google Scholar
Damask, A. et al. Patients with high genome-wide polygenic risk scores for coronary artery disease may receive greater clinical benefit from alirocumab treatment in the ODYSSEY OUTCOMES trial. Circulation 141, 624–636 (2020).
Article Google Scholar
Hindy, G. et al. Genome-wide polygenic score, clinical risk factors, and long-term trajectories of coronary artery disease. Arterioscler. Thromb. Vasc. Biol. 40, 2738–2746 (2020).
Article CAS Google Scholar
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Article CAS Google Scholar
Marston, N. A. et al. Predicting benefit from evolocumab therapy in patients with atherosclerotic disease using a genetic risk score: results from the FOURIER trial. Circulation 141, 616–623 (2020).
Article Google Scholar
Amariuta, T. et al. Improving the trans-ancestry portability of polygenic risk scores by prioritizing variants in predicted cell-type-specific regulatory elements. Nat. Genet. 52, 1346–1354 (2020).
Article CAS Google Scholar
Xu, Y. et al. Machine learning optimized polygenic scores for blood cell traits identify sex-specific trajectories and genetic correlations with disease. Cell Genom. 2, 100086 (2022).
Article CAS Google Scholar
Duncan, L. et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 10, 3328 (2019).
Article CAS Google Scholar
Burbelo, P. D., Martin, G. R. & Yamada, Y. Alpha 1(IV) and alpha 2(IV) collagen genes are regulated by a bidirectional promoter and a shared enhancer. Proc. Natl Acad. Sci. USA 85, 9679–9682 (1988).
Article CAS Google Scholar
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Article CAS Google Scholar
Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
Article CAS Google Scholar
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Article CAS Google Scholar
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Article CAS Google Scholar
McLaren, W. et al. The ensembl variant effect predictor. Genome Biol. 17, 122 (2016).
Article Google Scholar
Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).
Article CAS Google Scholar
Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Article CAS Google Scholar
Muñoz, M. et al. Evaluating the contribution of genetics and familial shared environment to common disease using the UK Biobank. Nat. Genet. 48, 980–983 (2016).
Article Google Scholar
Witte, J. S., Visscher, P. M. & Wray, N. R. The contribution of genetic variants to disease depends on the ruler. Nat. Rev. Genet. 15, 765–776 (2014).
Article CAS Google Scholar
1000 Genomes Project Consortiumet al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Article Google Scholar
Berglund, G. et al. The Malmo Diet and Cancer study. Design and feasibility. J. Intern. Med. 233, 45–51 (1993).
Article CAS Google Scholar
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Article Google Scholar
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Article CAS Google Scholar
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Article CAS Google Scholar
Atri, D. S. et al. CRISPR-Cas9 genome editing of primary human vascular cells in vitro. Curr. Protoc. 1, e291 (2021).
Article CAS Google Scholar
Buenrostro, J. D. et al. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).
Article Google Scholar


Acknowledgements

T. Kessler is supported by the Corona-Foundation (Junior Research Group Translational Cardiovascular Genomics) and the German Research Foundation (DFG) as part of the Sonderforschungsbereich SFB 1123 (B02). T.J. was supported by a Medical Research Council DTP studentship (MR/S502443/1). J.D. is a British Heart Foundation Professor, European Research Council Senior Investigator, and National Institute for Health and Care Research (NIHR) Senior Investigator. J.C.H. acknowledges personal funding from the British Heart Foundation (FS/14/55/30806) and is a member of the Oxford BHF Centre of Research Excellence (RE/13/1/30181). R.C. has received funding from the British Heart Foundation and British Heart Foundation Centre of Research Excellence. O.G. has received funding from the British Heart Foundation (BHF) (FS/14/66/3129). P.S.d.V. was supported by American Heart Association grant number 18CDA34110116 and National Heart, Lung, and Blood Institute grant R01HL146860. The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung and Blood Institute, National Institutes of Health, Department of Health and Human Services (contract HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I and HHSN268201700005I), R01HL087641, R01HL059367 and R01HL086694; National Human Genome Research Institute contract U01HG004402; and National Institutes of Health contract HHSN268200625226C. We thank the staff and participants of the ARIC study for their important contributions. Infrastructure was partly supported by grant UL1RR025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research. 
The Trøndelag Health Study (The HUNT Study) is a collaboration between HUNT Research Centre (Faculty of Medicine and Health Sciences, NTNU, Norwegian University of Science and Technology), Trøndelag County Council, Central Norway Regional Health Authority and the Norwegian Institute of Public Health. The K.G. Jebsen Center for Genetic Epidemiology is financed by Stiftelsen Kristian Gerhard Jebsen; Faculty of Medicine and Health Sciences, NTNU, Norwegian University of Science and Technology; and Central Norway Regional Health Authority. Whole genome sequencing for the HUNT study was funded by HL109946. The GerMIFs gratefully acknowledge the support of the Bavarian State Ministry of Health and Care, which funded this work within its framework of DigiMed Bayern (grant DMB-1805-0001); the German Federal Ministry of Education and Research (BMBF) within the framework of ERA-NET on Cardiovascular Disease (Druggable-MI-genes, 01KL1802), within the scheme of target validation (BlockCAD, 16GW0198K) and within the framework of the e:Med research and funding concept (AbCD-Net, 01ZX1706C); the British Heart Foundation (BHF)/German Centre of Cardiovascular Research (DZHK) collaboration (VIAgenomics); and the German Research Foundation (DFG) as part of the Sonderforschungsbereich SFB 1123 (B02), the Sonderforschungsbereich SFB TRR 267 (B05) and EXC2167 (PMI). This work was supported by the British Heart Foundation (BHF) under grant RG/14/5/30893 (P.D.) and forms part of the research themes contributing to the translational research portfolios of the Barts Biomedical Research Centre funded by the UK National Institute for Health Research (NIHR). I.S. is supported by a Precision Health Scholars Award from the University of Michigan Medical School.
This work was supported by the European Commission (HEALTH-F2-2013-601456) and the TriPartite Immunometabolism Consortium (TrIC)-NovoNordisk Foundation (NNF15CC0018486), VIAgenomics (SP/19/2/344612), the British Heart Foundation, a Wellcome Trust core award (203141/Z/16/Z to M.F. and H.W.) and the NIHR Oxford Biomedical Research Centre. M.F. and H.W. are members of the Oxford BHF Centre of Research Excellence (RE/13/1/30181). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. C.P.N. and T.R.W. received funding from the British Heart Foundation (SP/16/4/32697). C.J.W. is funded by NIH grant R35-HL135824. B.N.W. is supported by the National Science Foundation Graduate Research Program (DGE, 1256260). This research was supported by BHF (SP/13/2/30111) and conducted using the UK Biobank Resource (application 9922). O.M. was funded by the Swedish Heart and Lung Foundation, the Swedish Research Council, the European Research Council ERC-AdG-2019-885003 and Lund University Infrastructure grant ‘Malmö population-based cohorts’ (STYR 2019/2046). T.R.W. is funded by the British Heart Foundation. I.K., S. Koyama, and K. Ito are funded by the Japan Agency for Medical Research and Development, AMED, under grants JP16ek0109070h0003, JP18kk0205008h0003, JP18kk0205001s0703, JP20km0405209 and JP20ek0109487. The Biobank Japan is supported by AMED under grant JP20km0605001. J.L.M.B. acknowledges research support from NIH R01HL125863, American Heart Association (A14SFRN20840000), the Swedish Research Council (2018-02529), the Heart Lung Foundation (20170265) and the Foundation Leducq (PlaqueOmics: New Roles of Smooth Muscle and Other Matrix Producing Cells in Atherosclerotic Plaque Stability and Rupture, 18CVD02). A.V.K. has been funded by grant 1K08HG010155 from the National Human Genome Research Institute. K.G.A. has received support from the American Heart Association Institute for Precision Cardiovascular Medicine (17IFUNP3384001), a KL2/Catalyst Medical Research Investigator Training (CMeRIT) award from the Harvard Catalyst (KL2 TR002542) and the NIH (1K08HL153937). A.S.B. has been supported by funding from the National Health and Medical Research Council (NHMRC) of Australia (APP2002375). D.S.A. has received support from a training grant from the NIH (T32HL007604). N.P.B., M.C.C., J.F. and D.-K.J. have been funded by the National Institute of Diabetes and Digestive and Kidney Diseases (2UM1DK105554). EPIC-CVD was funded by the European Research Council (268834) and the European Commission Framework Programme 7 (HEALTH-F2-2012-279233). The coordinating center was supported by core funding from the UK Medical Research Council (G0800270; MR/L003120/1), British Heart Foundation (SP/09/002, RG/13/13/30194, RG/18/13/33946) and NIHR Cambridge Biomedical Research Centre (BRC-1215-20014). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. This work was supported by Health Data Research UK, which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and Wellcome. Support for title page creation and format was provided by AuthorArranger, a tool developed at the National Cancer Institute.

Author information

These authors contributed equally: Krishna G. Aragam, Tao Jiang, Anuj Goel, Stavroula Kanoni, Brooke N. Wolford, Deepak S. Atri.
These authors jointly supervised this work: Rajat M. Gupta, Jeanette Erdmann, Nilesh J. Samani, Heribert Schunkert, Hugh Watkins, Cristen J. Willer, Panos Deloukas, Sekar Kathiresan, Adam S. Butterworth.
A list of members and their affiliations appears in the Supplementary Information.

Authors and Affiliations

Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
Krishna G. Aragam & Patrick T. Ellinor
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Krishna G. Aragam & Amit V. Khera
Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Krishna G. Aragam, Deepak S. Atri, Minxian Wang, Carolina Roselli, Gavin Schnitzler, Amit V. Khera, Patrick T. Ellinor & Rajat M. Gupta
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Krishna G. Aragam, Elle M. Weeks, Minxian Wang, Wei Zhou, Jacob C. Ulirsch, Noël P. Burtt, Maria C. Costanzo, Dong-Keun Jang, Amit V. Khera, Hilary K. Finucane & Rajat M. Gupta
BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Tao Jiang, John Danesh & Adam S. Butterworth
Radcliffe Department of Medicine, Division of Cardiovascular Medicine, University of Oxford, Oxford, UK
Anuj Goel, Christopher Grace, Martin Farrall & Hugh Watkins
Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
Anuj Goel, Christopher Grace, Theodosios Kyriakou, Martin Farrall & Hugh Watkins
William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
Stavroula Kanoni, Olga Giannakopoulou, Tomasz Konopka, Damian Smedley & Panos Deloukas
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
Brooke N. Wolford, Wei Zhou & Cristen J. Willer
Divisions of Cardiovascular Medicine and Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Deepak S. Atri & Rajat M. Gupta
Department of Population Medicine, Qatar University College of Medicine, Doha, Qatar
George Hindy
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
Wei Zhou, Jacob C. Ulirsch & Hilary K. Finucane
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Wei Zhou & Hilary K. Finucane
TIMI Study Group, Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Nicholas A. Marston, Frederick K. Kamanu, Marc S. Sabatine & Christian T. Ruff
Department of Internal Medicine, Division of Cardiology, University of Michigan, Ann Arbor, MI, USA
Ida Surakka, Jonas B. Nielsen & Cristen J. Willer
Institute for Cardiogenetics, University of Lübeck, Lübeck, Germany
Loreto Muñoz Venegas, Syed M. Ijlal Haider, Sören Mucha, Matthias Munz, Tobias Reinberger & Jeanette Erdmann
German Research Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Lübeck/Kiel, Lübeck, Germany
Loreto Muñoz Venegas, Syed M. Ijlal Haider, Sören Mucha, Matthias Munz, Tobias Reinberger & Jeanette Erdmann
Medical Research Council Population Health Research Unit, CTSU—Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
Paul Sherliker
Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Tsurumi-ku, Yokohama, Japan
Satoshi Koyama & Kaoru Ito
Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Tsurumi-ku, Yokohama, Japan
Kazuyoshi Ishigaki
Department of Public Health and Nursing, K.G. Jebsen Center for Genetic Epidemiology, Norwegian University of Science and Technology, NTNU, Trondheim, Norway
Bjørn O. Åsvold, Ben Brumpton, Jonas B. Nielsen & Kristian Hveem
HUNT Research Centre, Norwegian University of Science and Technology, Levanger, Norway
Bjørn O. Åsvold, Ben Brumpton & Kristian Hveem
Department of Endocrinology, Clinic of Medicine, St. Olavs Hospital, Trondheim, Norway
Bjørn O. Åsvold
Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
Michael R. Brown, Paul S. de Vries, Eric Boerwinkle & Alanna C. Morrison
Department of Nutrition–Dietetics, School of Health Science and Education, Harokopio University, Athens, Greece
Panagiota Giardoglou & George Dedoussis
deCODE Genetics/Amgen, Inc., Reykjavik, Iceland
Daniel F. Gudbjartsson, Anna Helgadottir, Gudmar Thorleifsson, David O. Arnar, Gudmundur Thorgeirsson, Unnur Thorsteinsdottir, Hilma Holm & Kari Stefansson
School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
Daniel F. Gudbjartsson
German Heart Centre Munich, Department of Cardiology, Technical University of Munich, Munich, Germany
Ulrich Güldener, Adnan Kastrati, Thorsten Kessler, Ling Li, Shichao Pang, Moritz von Scheidt & Heribert Schunkert
CTSU—Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
Maysson Ibrahim, Federico Murgia, Jemma C. Hopewell & Robert Clarke
German Research Center for Cardiovascular Research (DZHK e.V.), Partner Site Munich Heart Alliance, Munich, Germany
Adnan Kastrati, Thorsten Kessler, Thomas Meitinger, Moritz von Scheidt & Heribert Schunkert
Department of Genetics and Genomic Science, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Lijiang Ma
The Zena and Michael A. Wiener Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Lijiang Ma
Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Thomas Meitinger
Klinikum Rechts der Isar, Institute of Human Genetics, Technical University of Munich, Munich, Germany
Thomas Meitinger
School of Medicine and University Hospital Bonn, Institute of Human Genetics, University of Bonn, Bonn, Germany
Markus M. Nöthen
Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
Jacob C. Ulirsch
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Pierre Zalloua
Department of Cardiac Surgery, Tartu University Hospital and Institute of Clinical Medicine, Tartu University, Tartu, Estonia
Arno Ruusalepp
Department of Genetics and Genomic Sciences, Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Johan L. M. Björkegren
Integrated Cardio Metabolic Centre, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
Johan L. M. Björkegren
Department of Clinical Sciences in Malmö, Lund University, Malmö, Sweden
Olle Melander & Marju Orho-Melander
College of Medicine and Health Sciences, Khalifa University, Abu Dhabi, UAE
Pierre Zalloua
Department of Cardiovascular Medicine, The University of Tokyo, Tokyo, Japan
Issei Komuro
Faculty of Medicine, University of Iceland, Reykjavik, Iceland
David O. Arnar, Gudmundur Thorgeirsson, Unnur Thorsteinsdottir, Kari Stefansson & Nilesh J. Samani
Department of Internal Medicine, Division of Cardiology, Landspitali—National University Hospital of Iceland, Hringbraut, Reykjavik, Iceland
David O. Arnar & Gudmundur Thorgeirsson
Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, USA
Jason Flannick
Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
Yoichiro Kamatani
Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
Iftikhar J. Kullo
Regeneron Genetics Center, Regeneron Pharmaceuticals, Tarrytown, NY, USA
Luca A. Lotta & Aris Baras
Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, Glenfield Hospital, Leicester, UK
Christopher P. Nelson & Thomas R. Webb
Cardiovascular Genomics and Genetics, University of Arizona College of Medicine, Phoenix, AZ, USA
Robert Roberts
Clinical Gene Networks AB, Stockholm, Sweden
Johan L. M. Björkegren
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
Eric Boerwinkle
Second Department of Cardiology, Medical School, National and Kapodistrian University of Athens, University General Hospital Attikon, Athens, Greece
Loukianos S. Rallidis
National Institute for Health and Care Research Cambridge Biomedical Research Centre, Cambridge University Hospitals, Cambridge, UK
John Danesh & Adam S. Butterworth
The National Institute for Health and Care Research Blood and Transplant Unit (NIHR BTRU) in Donor Health and Genomics, University of Cambridge, Cambridge, UK
John Danesh & Adam S. Butterworth
Human Genetics, Wellcome Sanger Institute, Saffron Walden, UK
John Danesh
Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
John Danesh & Adam S. Butterworth
British Heart Foundation Centre of Research Excellence, Division of Cardiovascular Medicine, Addenbrooke’s Hospital, Cambridge, UK
John Danesh & Adam S. Butterworth
Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
Cristen J. Willer
Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
Panos Deloukas
Verve Therapeutics, Cambridge, MA, USA
Sekar Kathiresan

Authors

Krishna G. Aragam
Tao Jiang
Anuj Goel
Stavroula Kanoni
Brooke N. Wolford
Deepak S. Atri
Elle M. Weeks
Minxian Wang
George Hindy
Wei Zhou
Christopher Grace
Carolina Roselli
Nicholas A. Marston
Frederick K. Kamanu
Ida Surakka
Loreto Muñoz Venegas
Paul Sherliker
Satoshi Koyama
Kazuyoshi Ishigaki
Bjørn O. Åsvold
Michael R. Brown
Ben Brumpton
Paul S. de Vries
Olga Giannakopoulou
Panagiota Giardoglou
Daniel F. Gudbjartsson
Ulrich Güldener
Syed M. Ijlal Haider
Anna Helgadottir
Maysson Ibrahim
Adnan Kastrati
Thorsten Kessler
Theodosios Kyriakou
Tomasz Konopka
Ling Li
Lijiang Ma
Thomas Meitinger
Sören Mucha
Matthias Munz
Federico Murgia
Jonas B. Nielsen
Markus M. Nöthen
Shichao Pang
Tobias Reinberger
Gavin Schnitzler
Damian Smedley
Gudmar Thorleifsson
Moritz von Scheidt
Jacob C. Ulirsch
David O. Arnar
Noël P. Burtt
Maria C. Costanzo
Jason Flannick
Kaoru Ito
Dong-Keun Jang
Yoichiro Kamatani
Amit V. Khera
Issei Komuro
Iftikhar J. Kullo
Luca A. Lotta
Christopher P. Nelson
Robert Roberts
Gudmundur Thorgeirsson
Unnur Thorsteinsdottir
Thomas R. Webb
Aris Baras
Johan L. M. Björkegren
Eric Boerwinkle
George Dedoussis
Hilma Holm
Kristian Hveem
Olle Melander
Alanna C. Morrison
Marju Orho-Melander
Loukianos S. Rallidis
Arno Ruusalepp
Marc S. Sabatine
Kari Stefansson
Pierre Zalloua
Patrick T. Ellinor
Martin Farrall
John Danesh
Christian T. Ruff
Hilary K. Finucane
Jemma C. Hopewell
Robert Clarke
Rajat M. Gupta
Jeanette Erdmann
Nilesh J. Samani
Heribert Schunkert
Hugh Watkins
Cristen J. Willer
Panos Deloukas
Sekar Kathiresan
Adam S. Butterworth

Consortia

Biobank Japan

Satoshi Koyama, Kazuyoshi Ishigaki, Issei Komuro, Yoichiro Kamatani & Kaoru Ito

EPIC-CVD

Adam S. Butterworth, John Danesh & Olle Melander

The CARDIoGRAMplusC4D Consortium

Krishna G. Aragam, Tao Jiang, Anuj Goel, Stavroula Kanoni, Brooke N. Wolford, Deepak S. Atri, Elle M. Weeks, Minxian Wang, George Hindy, Wei Zhou, Christopher Grace, Carolina Roselli, Nicholas A. Marston, Frederick K. Kamanu, Ida Surakka, Loreto Muñoz Venegas, Paul Sherliker, Satoshi Koyama, Kazuyoshi Ishigaki, Bjørn O. Åsvold, Michael R. Brown, Ben Brumpton, Paul S. de Vries, Olga Giannakopoulou, Panagiota Giardoglou, Daniel F. Gudbjartsson, Ulrich Güldener, Syed M. Ijlal Haider, Anna Helgadottir, Maysson Ibrahim, Adnan Kastrati, Thorsten Kessler, Theodosios Kyriakou, Tomasz Konopka, Ling Li, Lijiang Ma, Thomas Meitinger, Sören Mucha, Matthias Munz, Federico Murgia, Jonas B. Nielsen, Markus M. Nöthen, Shichao Pang, Tobias Reinberger, Gavin Schnitzler, Damian Smedley, Gudmar Thorleifsson, Moritz von Scheidt, Jacob C. Ulirsch, David O. Arnar, Noël P. Burtt, Maria C. Costanzo, Jason Flannick, Kaoru Ito, Dong-Keun Jang, Yoichiro Kamatani, Amit V. Khera, Issei Komuro, Iftikhar J. Kullo, Luca A. Lotta, Christopher P. Nelson, Robert Roberts, Gudmundur Thorgeirsson, Unnur Thorsteinsdottir, Thomas R. Webb, Aris Baras, Johan L. M. Björkegren, Eric Boerwinkle, George Dedoussis, Hilma Holm, Kristian Hveem, Olle Melander, Alanna C. Morrison, Marju Orho-Melander, Loukianos S. Rallidis, Arno Ruusalepp, Marc S. Sabatine, Kari Stefansson, Pierre Zalloua, Patrick T. Ellinor, Martin Farrall, John Danesh, Christian T. Ruff, Hilary K. Finucane, Jemma C. Hopewell, Robert Clarke, Rajat M. Gupta, Jeanette Erdmann, Nilesh J. Samani, Heribert Schunkert, Hugh Watkins, Cristen J. Willer, Panos Deloukas, Sekar Kathiresan & Adam S. Butterworth

Contributions

GWAS was conducted by K.G.A., T.J., B.N.W., W.Z., C.R., I.S., L.M.V., B.O.Å., D.O.A., A.B., J.D., G.D., P.D., P.T.E., J.E., O.G., P.G., D.F.G., U.G., S.M.I.H., A.H., G.H., H.H., K.H., A.V.K., I.J.K., S. Kathiresan, T. Kessler, T. Kyriakou, A.K., L.L., N.A.M., T.M., S.M., M.M., C.P.N., J.B.N., M.M.N., S.P., L.S.R., T.R., C.T.R., M.S.S., H.S., K.S., G. Thorgeirsson, G. Thorleifsson, U.T., M.v.S., C.J.W. and T.R.W. Discovery meta-analysis of data was conducted by K.G.A., T.J. and A.S.B. Conditional analysis, FDR analysis and heritability estimation were performed by S. Kanoni. Rare variant analysis was performed by B.N.W., W.Z., I.S. and C.J.W. Sex analysis was performed by S. Kanoni, P.D., K.G.A., B.O.Å., E.B., M.R.B., B.B., A.S.B., R.C., J.D., P.S.d.V., J.E., M.F., A.G., C.G., U.G., S.M.I.H., G.H., J.C.U., K.H., M.I., T.J., A.V.K., S. Kathiresan, T. Kessler, T. Kyriakou, A.K., L.L., T.M., A.C.M., S.M., L.M.V., M.M., F.M., J.B.N., M.M.N., S.P., T.R., H.S., I.S., M.v.S., H.W., C.J.W., B.N.W., P.Z. and W.Z. PheWAS was conducted by K.G.A. Cross-ancestry analysis was performed by K.G.A., T.J., K. Ishigaki, K. Ito, Y.K., I.K., S. Koyama and A.S.B. Polygenic risk score associations were calculated by K.G.A., M.W., G.H., C.R., N.A.M., F.K.K., L.A.L., A.B., O.M., M.S.S., M.O.-M., A.V.K., P.T.E., C.T.R. and S. Kathiresan. Functionally informed fine-mapping was performed by A.G., C.G., M.F., J.C.U., R.C. and H.W. PoPS analysis was performed by E.M.W., H.K.F., K.G.A. and S. Kathiresan. Mouse knock-out phenotyping was performed by D.S., T. Konopka and P.D. eQTL analyses were performed by A.S.B., T.J., H.S., L.M., J.L.M.B., A.R. and P.D. Causal gene prioritization was performed by K.G.A., A.S.B., P.S., R.C., A.G., C.G., M.F., J.C.U. and H.W. Experimental work was performed by D.S.A., G.S. and R.M.G. Data visualization was performed by C.G., A.G., N.P.B., M.C.C., J.F., D.-K.J., A.S.B. and K.G.A. CARDIoGRAMplusC4D Executive Committee: J.E., N.J.S., H.S., H.W., P.D., R.R., M.F., S. Kathiresan and J.D. Conceptualization, initiation and oversight came from A.S.B., S. Kathiresan, J.D., C.R., N.J.S., H.S., J.E., H.W., P.D. and C.J.W. K.G.A., T.J., A.G., S. Kanoni, B.N.W., J.D., C.T.R., H.K.F., J.C.H., R.C., J.E., N.J.S., H.S., H.W., C.J.W., P.D., S. Kathiresan and A.S.B. drafted and edited the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Krishna G. Aragam or Adam S. Butterworth.

Ethics declarations

Competing interests

All deCODE affiliated authors are employees of deCODE/Amgen. The TIMI Study Group has received institutional research grant support through Brigham and Women’s from Abbott, Amgen, Aralez, AstraZeneca, Bayer HealthCare Pharmaceuticals, BRAHMS, Daiichi-Sankyo, Eisai, GlaxoSmithKline, Intarcia, Janssen, MedImmune, Merck, Novartis, Pfizer, Poxel, Quark Pharmaceuticals, Roche, Takeda, The Medicines Company, and Zora Biosciences. R.C., J.C.H., M.I. and F.M. work at the Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, which receives research grants from industry that are governed by the University of Oxford contracts that protect its independence and has a staff policy of not taking personal payments from industry; further details can be found at https://www.ndph.ox.ac.uk/files/about/ndph-independence-of-research-policy-jun-20.pdf. A.S.B. reports grants outside of this work from AstraZeneca, Bayer, Biogen, BioMarin, Bioverativ, Merck, Novartis and Sanofi. A.B. and L.A.L. are employees of Regeneron Pharmaceuticals and the spouse of C.J.W. works at Regeneron Pharmaceuticals. J.L.M.B. and A.R. are members of the board of directors, founders and shareholders of Clinical Gene Networks AB that has an invested interest in STARNET. J.D. serves on scientific advisory boards for AstraZeneca, Novartis, and UK Biobank and has received multiple grants from academic, charitable and industry sources outside of the submitted work. J.C.U. has received compensation for consulting from Goldfinch Bio and is an employee of Patch Biosciences. O.G. became a full-time employee of UCB while this manuscript was being drafted. The other authors declare no conflicts of interest.

Peer review

Peer review information

Nature Genetics thanks Xueling Sim and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Study design.

Flowchart depicting contributing studies and analysis strategy.

Extended Data Fig. 2 Genetic architecture of 897 association signals for CAD.

Minor allele frequency versus per-allele odds ratio for CAD for all sentinel variants reaching genome-wide significance or the 1% FDR threshold in our study. Colored circles indicate genome-wide significant associations (P < 5.0 × 10⁻⁸) with sentinel variants that are not correlated (r² < 0.2) with a previously reported variant (red), genome-wide significant sentinel variants correlated with a previously reported variant (blue), and associations reaching the 1% FDR threshold (P < 2.52 × 10⁻⁵) in our meta-analysis (gray). Two-sided P values are from Z-scores from fixed-effect inverse-variance weighted meta-analyses.
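The fixed-effect inverse-variance weighted meta-analysis named in this legend can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function name `ivw_fixed_effect` and the example inputs are hypothetical, and the inputs are assumed to be per-study log odds ratios with their standard errors.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study log odds ratios (assumed input format)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z_score, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / se_i^2
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided P value under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    two_sided_p = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, two_sided_p


# Hypothetical example: two studies reporting the same per-allele effect.
beta, se, z, p = ivw_fixed_effect([0.10, 0.10], [0.02, 0.02])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

A sentinel variant would then be compared against the thresholds given in the legend (P < 5.0 × 10⁻⁸ for genome-wide significance, P < 2.52 × 10⁻⁵ for the 1% FDR set).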

Extended Data Fig. 3 Gene-based association testing of rare variants in UK Biobank.

QQ-plot of aggregate variant association tests from 15,923 genes versus CAD in UK Biobank. Results presented here are for the SKAT-O test using the Mask 1 (‘lenient’) filter, which includes variants with minor allele frequency < 5% that are annotated as missense, frameshift, stop gain, stop loss or splice site. Results for all genes, tests and filters are in Supplementary Table 7. Details of masks and tests are in Supplementary Table 6. The red dashed line indicates the Bonferroni threshold accounting for the number of genes tested. The gray dashed line indicates the null hypothesis (that is, observed = expected under the null). The blue shaded area indicates the 95% confidence interval around the null.
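As a rough sketch of the two reference lines in such a QQ-plot, the snippet below computes a Bonferroni threshold for the 15,923 genes tested and the expected –log₁₀(P) quantiles under the null. The family-wise alpha of 0.05 is an assumption (the legend does not state it), and both function names are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha=0.05 is an assumed family-wise level."""
    return alpha / n_tests


def expected_neg_log10_p(n_tests):
    """Expected -log10(P) values under the null, smallest P first.

    Uses the expected order statistics of n uniform P values, rank / (n + 1).
    """
    return [-math.log10(rank / (n_tests + 1)) for rank in range(1, n_tests + 1)]


threshold = bonferroni_threshold(15923)
print(f"Bonferroni threshold: {threshold:.2e}")

# The x-axis of the QQ-plot; observed -log10(P) values, sorted descending, go on the y-axis.
expected = expected_neg_log10_p(15923)
print(f"largest expected -log10(P): {expected[0]:.2f}")
```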

Extended Data Fig. 4 Cross-ancestry comparison.

a, Comparison of allele frequencies between the meta-analysis and Biobank Japan. Black dots denote the allele frequencies for 199 sentinel variants reaching genome-wide significance in the (predominantly European ancestry) meta-analysis (y-axis) that were also present in the publicly available summary statistics from Biobank Japan (x-axis). Variants were aligned according to the effect allele in Supplementary Table 3. The Pearson correlation coefficient was 0.76. b, Comparison of beta estimates between the meta-analysis and Biobank Japan. Black dots denote the beta estimates for the CAD associations for 199 sentinel variants reaching genome-wide significance in the (predominantly European ancestry) meta-analysis (y-axis) that were also present in the publicly available summary statistics from Biobank Japan (x-axis). Variants were aligned according to the effect allele in Supplementary Table 3. Horizontal and vertical lines represent 95% confidence intervals. The Pearson correlation coefficient was 0.59, which increased to 0.85 when three outlying variants marked in red (at ATXN2, FER and SLC22A1) were excluded.

Extended Data Fig. 5 Epigenetically-informed fine-mapping of the MAFB locus.

a, Regional association plot from the CAD meta-analysis for the MAFB region. Colored dots represent the position (x-axis) in GRCh37 coordinates and –log₁₀(meta-analysis P value) (y-axis) of each variant in the region. Dots are shaded to represent the r² with the lead CAD variant (rs2207132), estimated using a random sample of 5,000 European ancestry participants from the UK Biobank. Recombination peaks are plotted in blue based on estimates of recombination from 1000 Genomes European-ancestry individuals. b, Tissue-specific imputed chromHMM states at the three credible set variants in the MAFB region. The top track shows the position on chromosome 20 (GRCh37) in the MAFB region. The second track shows as orange vertical bars the posterior probability (y-axis) for each variant in the window from the FGWAS fine-mapping, identifying rs1883711 (PPA = 0.77) as the most likely causal variant. The third track indicates as a black box the position of the imputed chromHMM state in each of the ten CAD-relevant tissues based on epigenomic data from the NIH Roadmap Epigenomics Consortium project. The yellow vertical line indicates the position of the most likely causal variant (rs1883711) with respect to the chromHMM states. rs1883711 lies in an enhancer region for liver (the most strongly enriched tissue for this region) and adipose, the two functionally enriched tissues in the region. The other two variants in the 95% credible set (rs2207132 and rs117113213) do not lie in regions annotated as chromHMM states. HSMM, human skeletal muscle myoblasts; HUVEC, human umbilical vein endothelial cells; PPA, posterior probability of being the causal variant.

Extended Data Fig. 6 Pairwise concordance of eight gene-prioritization predictors to identify most likely causal genes.

White squares lying on the diagonal contain the number of genes for which that predictor provided evidence (denominator) and the number of times for which that predictor prioritized the most likely causal gene at the locus (numerator). For example, eQTL data provided evidence for 105 causal genes, of which 90 (86%) were also the most likely causal gene at the locus. Blue squares below the diagonal show the concordance between pairs of predictors and contain the number of genes for which both predictors provided evidence (denominator) and the number of times for which the prioritized causal gene was the same (numerator). For example, the nearest gene and the presence of a protein-altering variant in high LD (r² > 0.8) with the CAD sentinel both provided evidence for a causal gene at 48 loci, of which they were concordant (that is, prioritized the same causal gene) at 34 (71%). Darker blue squares show higher levels of concordance. Orange squares above the diagonal show the discordance between pairs of predictors and contain the number of genes for which both predictors provided evidence (denominator) and the number of times for which the prioritized causal gene was different (numerator). For example, the nearest gene and the presence of a protein-altering variant in high LD (r² > 0.8) with the CAD sentinel both provided evidence for a causal gene at 48 loci, of which they were discordant (that is, prioritized a different causal gene) at 13 (27%). Darker orange squares show higher levels of discordance. See Fig. 5a for descriptions of the eight predictors used to prioritize causal genes.
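The numerator and denominator in each off-diagonal square can be illustrated with a small sketch. It assumes each predictor's calls are stored as a mapping from locus to prioritized gene; the predictor names, locus labels and gene labels below are placeholders, not data from the study.

```python
from itertools import combinations


def pairwise_concordance(calls):
    """Count agreement between every pair of predictors.

    calls: {predictor_name: {locus_id: prioritized_gene}} (assumed layout)
    Returns {(predictor_a, predictor_b): (n_same_gene, n_loci_with_both)}.
    """
    result = {}
    for first, second in combinations(sorted(calls), 2):
        # Only loci where both predictors provide evidence enter the denominator.
        shared_loci = calls[first].keys() & calls[second].keys()
        n_same = sum(
            1 for locus in shared_loci if calls[first][locus] == calls[second][locus]
        )
        result[(first, second)] = (n_same, len(shared_loci))
    return result


# Placeholder data: two predictors overlapping at two loci, agreeing at one.
toy_calls = {
    "nearest_gene": {"locus1": "GENE_A", "locus2": "GENE_B", "locus3": "GENE_C"},
    "eqtl": {"locus1": "GENE_A", "locus2": "GENE_X"},
}
for pair, (same, both) in pairwise_concordance(toy_calls).items():
    print(pair, f"{same}/{both} concordant, {both - same}/{both} discordant")
```

Discordance for a pair is simply the shared loci at which the two prioritized genes differ, so it is derived from the same two counts.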

Extended Data Fig. 7 Prioritizing the likely causal variant, gene and pathway at the ITGA1 locus.

a, Regional association plot from the primary CAD meta-analysis for the ITGA1 region. Colored dots represent the position (x-axis) in GRCh37 coordinates and –log₁₀(meta-analysis P value) (y-axis) of each variant in the region. Dots are shaded to represent the r² with the lead CAD variant (rs4074793), estimated using a random sample of 5,000 European-ancestry participants from UK Biobank. Recombination peaks are plotted in blue based on estimates of recombination from 1000 Genomes European-ancestry individuals. b, Tissue-specific imputed chromHMM states at the two credible set variants in the ITGA1 region. The top track shows the position on chromosome 5 (GRCh37) with respect to the ITGA1 gene. The second track shows as a vertical orange line the posterior probability (y-axis) for each variant in the region from the FGWAS fine-mapping, identifying rs4074793 (PPA = 0.95) as the likely causal variant. The third track indicates as a black box the position of an enhancer state in each of the ten CAD-relevant tissues, using custom imputed chromHMM states based on epigenomic data from the NIH Roadmap Epigenomics Consortium project. The yellow vertical line indicates the position of the likely causal variant (rs4074793) with respect to the chromHMM states. rs4074793 is annotated to a chromHMM state for all five tissues that show enrichment in the region. HSMM, human skeletal muscle myoblasts; HUVEC, human umbilical vein endothelial cells; PPA, posterior probability of being the causal variant. c, Effect of rs4074793 on ITGA1 expression in liver in the STARNET study. The plot shows the position (x-axis) in GRCh37 coordinates and –log₁₀(P value) (y-axis) of each variant in the region. The likely causal CAD variant rs4074793 is circled in black. Only variants with P < 0.01 are displayed. d, Associations of rs4074793 with ITGA1 expression and phenotypes from a phenome-wide association study. The per-allele association of rs4074793-G (the CAD risk allele) measured in s.d. units is plotted for each phenotype. The box indicates the point estimate and the horizontal bars represent the 95% confidence intervals. The top panel shows the association estimates for ITGA1 expression from the STARNET study. The bottom panel shows associations from UK Biobank (liver enzymes and inflammatory markers) and the literature (lipids⁴⁶). ALP, alkaline phosphatase; ALT, alanine aminotransferase; CRP, C-reactive protein; GGT, gamma glutamyltransferase; LDL-c, low-density lipoprotein cholesterol; Tchol, total cholesterol.

Supplementary information

Supplementary Information

Supplementary Fig. 1 and Supplementary Note.

Reporting Summary

Supplementary Data 1

Regional association plots for the 241 genome-wide significant signals from the primary CAD GWAS meta-analysis.

Supplementary Tables

Supplementary Tables 1–35.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Aragam, K.G., Jiang, T., Goel, A. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat Genet 54, 1803–1815 (2022). https://doi.org/10.1038/s41588-022-01233-6

Received: 26 April 2021
Accepted: 17 October 2022
Published: 06 December 2022
Issue Date: December 2022
DOI: https://doi.org/10.1038/s41588-022-01233-6
