Lack of genetic support for shared aetiology of Coronary Artery Disease and Late-onset Alzheimer’s disease

Epidemiological studies suggest a positive association between coronary artery disease (CAD) and late-onset Alzheimer’s disease (LOAD). This large-scale genetic study brings together ‘big data’ resources to examine the causal impact of genetic determinants of CAD on risk of LOAD. A two-sample Mendelian randomization approach was adopted to estimate the causal effect of CAD on risk of LOAD using summary data from 60,801 CAD cases from CARDIoGRAMplusC4D and 17,008 LOAD cases from the IGAP Consortium. Additional analyses assessed the independent relevance of genetic associations at the APOE locus for both CAD and LOAD. Higher genetically determined risk of CAD was associated with a slightly higher risk of LOAD (Odds Ratio (OR) per log-odds unit of CAD [95% CI]: 1.07 [1.01–1.15]; p = 0.027). However, after exclusion of the APOE locus, the estimate of the causal effect of CAD for LOAD was attenuated and no longer significant (OR 0.94 [0.88–1.01]; p = 0.072). This Mendelian randomization study indicates that the APOE locus is the chief determinant of shared genetic architecture between CAD and LOAD, and suggests a lack of causal relevance of CAD for risk of LOAD after exclusion of APOE.

Several studies have also reported that clinically manifest cardiovascular disease was associated with a higher risk of LOAD [16][17][18] . However, since LOAD has a very long latency period, such studies have been constrained by confounding, reverse causality bias and diagnostic misclassification. Mendelian randomization (MR) studies afford the potential to elucidate the causal relevance of lifelong differences in exposures with disease outcomes that are independent of confounding and reverse causation 19,20 . Moreover, MR studies have been successful in enhancing our understanding of the causal risk factors for cardiovascular diseases 21 .
The aim of the present study was to examine the causal relevance of CAD on risk of LOAD, and explore their shared genetic architecture. The objectives of this study were: (i) to assess the impact of genetic determinants of CAD on risk of LOAD; and (ii) to assess the independent relevance of genetic associations at the APOE locus for both CAD and LOAD.

Methods
Participating cohorts. The CARDIoGRAMplusC4D 5 summary statistics were derived from a 1000 Genomes-based meta-analysis of 48 studies, involving 60 801 CAD cases and 123 504 controls. Diagnosis of CAD included evidence of myocardial infarction, chronic stable angina with a revascularisation procedure or a coronary stenosis >50% and CAD cases had a mean age of approximately 60 years 5 . The International Genomics of Alzheimer's Project (IGAP) 11 summary statistics were derived from a 1000 Genomes-based meta-analysis of LOAD cases among European individuals, involving 17 008 LOAD cases and 37 154 controls. Diagnosis of LOAD followed assessment by a neurologist and LOAD cases had a mean age of approximately 74 years 11 .
This study presents a new analysis of anonymised summary data from previously published meta-analyses. Each of the individual studies had ethics approval from the relevant institutions, and participants (or, for those with substantial cognitive impairment, an appropriate proxy) provided written informed consent 5,11 .

Selection of variants.
Among the 57 genome-wide significant variants identified in the CARDIoGRAMplusC4D meta-analysis, 52 genetic variants were selected for the present study after exclusions 5 . Variants with recessive associations (n = 2) and variants that were unavailable in IGAP and lacked a suitable proxy (r² ≥ 0.80) (n = 3) were excluded from the analysis 11 (eTable 1). Summary estimates (per effect allele) were extracted from the CARDIoGRAMplusC4D 5 meta-analysis for CAD and from the IGAP 11 meta-analysis for LOAD (eTables 2 and 3).
A sensitivity analysis was performed using 190 of the 214 genetic variants (including 9 with a minor allele frequency <0.05) that were identified at a 5% False Discovery Rate (FDR) in the CARDIoGRAMplusC4D meta-analysis 22 . A total of 24 variants were excluded due to the absence of a suitable proxy (i.e. no variant with r² ≥ 0.80).
Mendelian randomization analysis. The effect of CAD (risk phenotype) on LOAD (outcome phenotype) was analysed by looking at the impact of each genetic marker's effect size for CAD on its effect size for LOAD. Effects of individual variants are given per copy of the effect allele unless otherwise stated. This was assessed by calculating the ratio of LOAD effect size/CAD effect size for each of the 52 variants, and combining them using a fixed effect meta-analysis model to estimate the causal effect 23 . The Cochran Q statistic was used to assess heterogeneity in risk estimates between the variants in the fixed effect meta-analysis. An online database (PhenoScanner) 24 was used to identify multiple phenotypes associated with individual genetic variants to investigate potential pleiotropy. Sensitivity analyses using Egger regression MR 25 were performed, a method that allows for invalid instrumental variables due to pleiotropy, using the MR-BASE R package 26 .
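The ratio-and-pooling step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation (the Egger analysis in the text was run with the MR-BASE R package). The function names and toy effect sizes are hypothetical, the per-variant standard error uses a first-order approximation that ignores uncertainty in the CAD estimate (an assumption), and Egger regression is written as a weighted regression with a free intercept.

```python
import math


def wald_ratios(beta_cad, beta_load, se_load):
    """Per-variant causal estimates: LOAD log-odds effect / CAD log-odds effect."""
    ratios = [by / bx for bx, by in zip(beta_cad, beta_load)]
    # First-order standard error; assumes the CAD effect is known without error.
    ses = [se / abs(bx) for bx, se in zip(beta_cad, se_load)]
    return ratios, ses


def fixed_effect_pool(estimates, ses):
    """Inverse-variance fixed-effect estimate, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return pooled, pooled_se, q


def egger_regression(beta_cad, beta_load, se_load):
    """Weighted regression of LOAD effects on CAD effects with a free intercept.

    Returns (intercept, slope); an intercept away from zero hints at
    directional pleiotropy.
    """
    # Orient every variant so that its CAD effect is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_cad]
    x = [s * bx for s, bx in zip(signs, beta_cad)]
    y = [s * by for s, by in zip(signs, beta_load)]
    w = [1.0 / se ** 2 for se in se_load]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


# Hypothetical log-odds effects for four variants (not taken from the study).
beta_cad = [0.10, 0.20, 0.05, 0.15]
beta_load = [0.05, 0.10, 0.025, 0.075]
se_load = [0.02, 0.03, 0.01, 0.02]

ratios, ses = wald_ratios(beta_cad, beta_load, se_load)
pooled, pooled_se, q = fixed_effect_pool(ratios, ses)
# Odds ratio for LOAD per log-odds unit of CAD, with a 95% confidence interval.
odds_ratio = math.exp(pooled)
ci_95 = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
```

In this sketch Cochran's Q is compared against a chi-squared distribution with (number of variants − 1) degrees of freedom. A single variant whose ratio differs sharply from the rest inflates Q, which is the pattern the heterogeneity test is designed to detect.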
Cross-trait LD score regression. The genetic correlation of effect statistics for CAD and for LOAD was estimated by cross-trait LD score regression 27 using a total of 5,403,795 variants that were studied in both the CARDIoGRAMplusC4D and IGAP meta-analyses. This method estimates the genetic correlation between the two traits using GWAS summary statistics and is unbiased by any overlap of participants in both study populations 27 .
Scan for shared genetic determinants at the APOE locus. Shared genetic determinants for LOAD and CAD at the APOE locus were investigated using a recently developed approach known as gwas-pw 28 . This method uses a statistical model to estimate the posterior probability that a genomic region adheres to four separate models. Models 1 and 2 are used to test whether a locus contains a genetic variant for only one of the two phenotypes. Models 3 and 4 are used to test whether a genetic variant exists for both phenotypes within the locus. More specifically, Model 3 assesses whether the same genetic variant influences both phenotypes, whilst Model 4 assesses whether the two phenotypes are influenced by separate genetic variants within a locus.
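To make the model comparison concrete, the posterior probability of each model for a region can be thought of as a prior-weighted Bayes factor normalised across the competing models. The sketch below illustrates only that normalisation step and is not the gwas-pw implementation. The prior weights and Bayes factors are invented for illustration, and including a "no association" option alongside the four models is an assumption of this sketch.

```python
def model_posteriors(priors, bayes_factors):
    """Normalise prior x Bayes factor across models to obtain posterior probabilities.

    Both arguments are dicts keyed by model name; Bayes factors are taken
    relative to the "no association" option (whose own factor is 1).
    """
    unnormalised = {m: priors[m] * bayes_factors[m] for m in priors}
    total = sum(unnormalised.values())
    return {m: value / total for m, value in unnormalised.items()}


# Hypothetical inputs for a single genomic region (not values from the study).
priors = {"null": 0.90, "model1": 0.04, "model2": 0.04, "model3": 0.01, "model4": 0.01}
bayes_factors = {"null": 1.0, "model1": 2.0, "model2": 2.0, "model3": 1e3, "model4": 9e3}

posteriors = model_posteriors(priors, bayes_factors)
best_model = max(posteriors, key=posteriors.get)
```

With these invented inputs, Model 4 (separate variants for the two phenotypes) receives most of the posterior mass.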

Results
Mendelian randomization analyses. Figure 1 shows the associations of the 52 variants that were used to estimate the causal effect. For each genome-wide significant variant, the odds ratio (OR) and 95% confidence intervals (CI) of the summary statistics in the CARDIoGRAMplusC4D and IGAP meta-analyses are presented. The variants are presented in descending order of the strength of their association with CAD; the APOE rs4420638 variant was the sole variant significantly associated with LOAD. Table 1 shows the causal effect estimates on LOAD after combining information across all 52 CAD variants. The results indicate a nominally significant causal association, consistent with a higher risk of CAD being associated with a 7% higher risk of LOAD (OR 1.07 for LOAD per log odds unit of CAD [1.01-1.15]; p = 0.027). However, there was significant heterogeneity between the causal effects for each of the variants included in the analysis (p < 2.2 × 10⁻³⁰⁸). After removal of the single outlying variant, rs4420638 at the APOE locus, there was no remaining heterogeneity (p = 0.351). Furthermore, after removal of the APOE variant, there was no longer any significant causal association of CAD with LOAD (OR 0.94 for LOAD per log odds unit of CAD [0.88-1.01]; p = 0.072). The causal estimate (based on the 52 CAD variants) from the Egger regression approach was not significant (p = 0.846), suggesting that the APOE variant may not be a valid instrumental variable due to pleiotropy at the APOE locus. Similar results were observed using data for the 190 CAD variants that were significantly associated with CAD at the 5% FDR threshold.
ε2 allele (rs7412) and the APOE ε4 allele (rs429358) were also included in the figure. Table 2 shows the linkage disequilibrium (LD) structure (with D′ and r²) between the variants within the APOE region.

Shared impact of APOE locus.
The gwas-pw method 28 was used to detect evidence of shared genetic determinants within the APOE locus (chr19:44,744,147-46,101,600) for CAD and LOAD. There was strong evidence for genetic variants influencing both phenotypes at the locus. Furthermore Model 4 (Posterior Probability: 0.90), which specifies separate genetic variants within the APOE locus influencing CAD and LOAD, had a higher posterior probability than Model 3 (Posterior Probability: 0.10) which specified a shared genetic variant in the APOE locus.

Discussion
Disparate genetic architecture of CAD and LOAD. The present study investigated whether CAD and LOAD share a genetic architecture, using large-scale GWAS meta-analyses for both diseases. The initial Mendelian randomization analysis suggested that a higher risk of CAD (per log odds unit) was also associated with a 7% higher risk of LOAD. However, there was significant heterogeneity between the causal effects of individual variants. This heterogeneity was entirely explained by a single variant (rs4420638) at the APOE locus. When the APOE variant was removed from the analysis, the causal effect on LOAD was completely attenuated and no longer significant. Thus, overall, genetic determinants associated with a higher risk of CAD were not significantly associated with LOAD after excluding variants at the APOE locus. In addition, the LD score regression analysis identified little or no genetic correlation between CAD and LOAD.
APOE locus. The APOE locus was investigated in greater detail since the rs4420638 variant was strongly associated with both CAD (p = 7.07 × 10⁻¹¹) and LOAD (p = 1.67 × 10⁻³⁹⁶). The gwas-pw analysis 28 suggested that the traits were influenced by separate genetic variants within the APOE locus. This indicates that the influence of the APOE locus on both LOAD and CAD may be mediated through different mechanisms.
In the case of LDL cholesterol, a stronger signal was detected for the APOE-ε2 variant (rs7412, p = 5.54 × 10⁻³⁰) 29 than for the APOE-ε4 variant (rs429358, p = 4.21 × 10⁻¹⁰) 29 , suggesting that the APOE-ε2 variant is strongly associated with LDL cholesterol pathways. In the case of amyloid beta load, the APOE-ε4 variant has been strongly associated with this phenotype (rs429358; OR not available, p = 5.45 × 10⁻¹⁴) 31 , suggesting that the APOE-ε4 effect may be mediated by amyloid beta pathways. However, the APOE-ε2 variant (rs7412) was not present in the analysis of amyloid beta.
In  11 suggesting that LOAD may be primarily associated with the APOE-ε4 variant: The APOE-ε4 (rs429358) variant is also the peak signal for cortical amyloid beta load 31 suggesting that variants in APOE for LOAD may be primarily associated with amyloid beta pathways. In contrast, CAD had a similar strength of association with both the APOE-ε4 variant (rs429358, OR [95% CI]: 1.10 [1.06-1.13]; p = 2.17 × 10⁻⁹; EA = C) and APOE-ε2 variant (rs7412, OR [95% CI]: 1.15 [1.10-1.20]; p = 8.17 × 10⁻¹¹; EA = C) 5 . The CAD peak variant (rs4420638) was in complete LD (by measures of D′) with the APOE-ε2 variant (rs7412), suggesting that variants in APOE for CAD may be primarily associated with LDL cholesterol pathways. Future analyses of individual participant data could permit exploration of the ε2/ε3/ε4 APOE haplotypes to further elucidate these relationships.
Exploration of potentially pleiotropic effects in each of the 52 variants for CAD (eTable 7) suggests that LDL cholesterol is unlikely to be involved in any shared biological pathways between LOAD and CAD, other than via the effects of APOE. Among the 10 CAD loci that were associated with significant differences in LDL-cholesterol concentrations, only one (APOE) was also associated with LOAD. Another study reported null associations between genetically predicted body mass index and LOAD 33 . Likewise, variants associated with type 2 diabetes were also unrelated to LOAD 34 .

Other MR analyses. Recently
The METASTROKE consortium 35 examined the genetic association between ischemic stroke (IS) and LOAD using GREML 36
Limitations of the study. The present analysis was performed on 52 variants selected from the CARDIoGRAMplusC4D meta-analysis, at which a genome-wide significant signal (p ≤ 5 × 10⁻⁸) had been identified 5,[38][39][40][41] . The results of this analysis were not materially altered when including 190 variants based on an FDR 5% threshold. Results from LD score regression, which uses information from across the genome, provide further support.
Results of a post-hoc power analysis are presented in eTable 12, which suggest that the analyses with genome-wide significant and FDR 5% instrumental variables had ~80% power to identify a ~10% effect on LOAD. Thus, later studies are unlikely to discover a material shared genetic architecture between CAD and LOAD.
The available evidence on the APOE locus indicated separate mechanisms by which this locus acts upon LOAD and CAD, which raises questions about the assumptions for MR analysis involving this locus. Further fine mapping studies of the APOE locus are needed to assess the genetic associations at these highly correlated variants. These studies could include populations with greater haplotype diversity to resolve tightly linked genetic signals that appear intractably interwoven in Europeans. However, we encountered difficulties with exploratory fine mapping studies of the APOE locus for LOAD, due to collinearity induced by the strong linkage disequilibrium between the variants in this locus.
The present study also had several limitations. Firstly, the study may be constrained by selection bias due to differences in age. The average age of onset of CAD was approximately 60 years, while the average age of onset of LOAD was approximately 74 years. Individuals predisposed to developing both LOAD and CAD may not have survived to old age, which may have led to underestimation of any association with LOAD. Moreover, the diagnosis of probable LOAD excludes a prior history of cerebrovascular disease, so studies of LOAD may have a reduced risk of overlap between CAD and LOAD. However, neuroimaging studies show that pathological changes in the brain precede the development of mild cognitive impairment and LOAD by 1-2 decades 7 .
Selection bias may also have influenced summary statistics between CAD and LOAD due to overlapping participants for each disease in some studies. However, the number of overlapping cases in the AGES, Rotterdam and Framingham Heart studies is very limited (up to 1000 LOAD cases in IGAP could potentially overlap with CAD cases included in the CARDIoGRAMplusC4D study).
Another possible limitation is that the CARDIoGRAMplusC4D data included some non-European samples, which could increase heterogeneity between the samples. However, the CARDIoGRAMplusC4D study reported that no heterogeneity between studies was observed at any of the genome-wide significant variants apart from the 9p21 locus 5 .
The present report demonstrated a different genetic architecture of CAD and LOAD. While further studies are required to elucidate links between cardiovascular disease risk factors and LOAD, additional MR studies of CAD are unlikely to be informative about the causes of LOAD.

Conclusions
Analyses were performed to investigate whether CAD and LOAD have a shared genetic architecture and whether CAD is a causal risk factor for LOAD, given the findings of observational studies. However, the present study demonstrated that although genetic predisposition to CAD was significantly associated with LOAD, this association was entirely mediated through the APOE locus. After exclusion of the APOE locus, CAD variants were no longer significantly associated with LOAD. Additional fine mapping studies are needed to dissect the independent relevance of APOE for both CAD and LOAD.