Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure

Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies.

Reviewer #1: Remarks to the Author: I am still concerned that the data sources likely carry a strong selection bias for specific disease entities. For example, cohorts that were exclusively selected from cardiac catheterization laboratories will have an overrepresentation of CAD cases (LURIC). Population-based samples will have a stronger representation of patients with arterial hypertension etc. This limitation should be further explored by stratification and discussed more thoroughly. Moreover, the authors should provide more details on individual data of the participants rather than describing crudely their sources in the appendix.
I am also doubtful that such diverse samples are suitable for MR analyses and thus support the far-reaching conclusions in this respect. For example, if both cases with HF and controls without HF have had a myocardial infarction (EPHESUS; SOLID) then the association study will fail to detect CAD risk alleles. If most cases with HF and controls without HF have diabetes (Go-Darts) then the association study will fail to detect diabetes risk alleles. Accordingly, any sort of adjustment or MR study will underestimate the effects of respective risk alleles!
A major aim of genetic association studies in complex disorders is to learn more about the molecular aetiology. This reviewer still suggests, in addition to a strategy of pooling all individuals irrespective of the disease-causing condition, to study individuals in whom HF is secondary to a well-defined condition, e.g. dilated cardiomyopathy or myocardial infarction, separately.
Table 1 should mention for each of the loci the established genome-wide significant associations with conditions that predispose to heart failure (CAD, obesity/BMI, aFib, hypertension). This increases the information content and puts the data into perspective.
Table 1, it is misleading to call the 9p21 locus by a gene that has been shown to be not involved in the aetiology. One way of dealing with the locus could be to name it 9p21/CDKN2B.
The lambda for the meta-analysis was now reported to be 1.127. This is fairly high and it might be good to have a genetic epidemiologist comment on this issue.

Reviewers' comments:
Reviewer #1 (Remarks to the Author): I am still concerned that the data sources likely carry a strong selection bias for specific disease entities. For example, cohorts that were exclusively selected from cardiac catheterization laboratories will have an overrepresentation of CAD cases (LURIC). Population-based samples will have a stronger representation of patients with arterial hypertension etc. This limitation should be further explored by stratification and discussed more thoroughly. Moreover, the authors should provide more details on individual data of the participants rather than describing crudely their sources in the appendix.
We agree with the reviewer that non-population cohorts differ with respect to upstream or co-morbid disease phenotypes. Detailed information on the demographic and clinical characteristics of participating studies is given in Supplementary Table 16. Of the 26 included studies, 9 were from non-population samples, accounting for ~18% of the total case population, of which 5 were performed in a uniform risk population (CAD: EPHESUS, SOLID; suspected CAD: LURIC; diabetes: GoDARTS; elevated cardiovascular risk: PROSPER). To highlight this, we have added the following text to the results section (Main text 309-311): The study sample comprised both population cohorts (17 studies, 38,780 HF cases, 893,657 controls) and case-control samples (9 studies, 8,529 cases, 36,357 controls); (see Supplementary Note for a detailed description of the included studies).
We provide empirical evidence to show that inclusion of such studies does not materially influence the estimates derived from Mendelian randomization (please see response to following reviewer question).
We acknowledge the importance of stratified analysis for this complex phenotype, and this will form the basis for our next collaborative meta-analysis. In that study we will perform large-scale stratified analysis using harmonised covariates for stratification. We highlight this future work in the discussion (Main text 509-513).
I am also doubtful that such diverse samples are suitable for MR analyses and thus support the far-reaching conclusions in this respect. For example, if both cases with HF and controls without HF have had a myocardial infarction (EPHESUS; SOLID) then the association study will fail to detect CAD risk alleles. If most cases with HF and controls without HF have diabetes (Go-Darts) then the association study will fail to detect diabetes risk alleles. Accordingly, any sort of adjustment or MR study will underestimate the effects of respective risk alleles!
We agree that the effects of upstream risk factor-associated alleles on heart failure will be underestimated in HF case-control studies performed within populations recruited with the corresponding risk background. Similarly, MR analysis would underestimate the overall effects of the given risk factor on HF.
To estimate the possible effects from the inclusion of these studies in the meta-analysis (~18% of cases, ~4% of controls), we have undertaken a sensitivity analysis by including only population samples in the meta-analysis (17 studies, 38,780 HF cases, 893,657 controls). We found that the effect estimates for both the HF-associated risk factor loci and the two-sample Mendelian randomisation analysis were consistent with the results from the full sample. The exclusion of case-control studies performed in patients with or at risk of CAD did not reduce the effect estimates of the MR analysis for CAD or for those HF risk loci with established CAD associations.
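The proportions quoted above can be reproduced from the study counts given in this response. The following is a minimal arithmetic check in Python; the variable names are illustrative and only the counts come from the text.

```python
# Counts quoted in the response letter.
population = {"cases": 38_780, "controls": 893_657}   # 17 population cohorts
case_control = {"cases": 8_529, "controls": 36_357}   # 9 non-population studies

total_cases = population["cases"] + case_control["cases"]
total_controls = population["controls"] + case_control["controls"]

# Share of the full meta-analysis contributed by non-population studies.
case_share = case_control["cases"] / total_cases
control_share = case_control["controls"] / total_controls

print(f"total cases: {total_cases:,}, total controls: {total_controls:,}")
print(f"non-population share of cases: {case_share:.1%}")
print(f"non-population share of controls: {control_share:.1%}")
```

The totals match the 47,309 cases and 930,014 controls reported for the full meta-analysis, and the shares round to the ~18% and ~4% stated above.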
We have added the following text to summarise the results of these analyses (Main text lines 463-467): We then performed a sensitivity analysis to explore potential bias arising from the inclusion of case-control samples by repeating the Mendelian randomisation analysis using heart failure GWAS estimates generated from population cohort studies only. The results of this analysis were consistent with those generated from the overall sample (data not shown).
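The comparison described above can be illustrated with a short sketch. It assumes a fixed-effect inverse-variance weighted (IVW) estimator, which is a common choice for two-sample Mendelian randomisation; the response does not state which estimator was used, and every number below is a made-up toy value, not data from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-outcome effect and its standard error."""
    # Each variant is weighted by beta_exposure^2 / se_outcome^2.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    # Per-variant Wald ratios: outcome effect divided by exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))

# Hypothetical variant-exposure effects (e.g. on a risk factor), toy values only.
beta_exposure = [0.10, 0.20, 0.15]

# Hypothetical variant-HF effects from the full sample and from population cohorts only.
est_full, se_full = ivw_estimate(beta_exposure, [0.050, 0.100, 0.075], [0.010] * 3)
est_pop, se_pop = ivw_estimate(beta_exposure, [0.052, 0.098, 0.076], [0.012] * 3)

for label, est, se in [("full sample", est_full, se_full),
                       ("population only", est_pop, se_pop)]:
    low, high = est - 1.96 * se, est + 1.96 * se
    print(f"{label}: estimate {est:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Because the population-only sample is a subset of the full sample, the two estimates are not independent, so a sketch like this supports only an informal comparison of point estimates and intervals.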
A major aim of genetic association studies in complex disorders is to learn more about the molecular aetiology. This reviewer still suggests, in addition to a strategy of pooling all individuals irrespective of the disease-causing condition, to study individuals in whom HF is secondary to a well-defined condition, e.g. dilated cardiomyopathy or myocardial infarction, separately.
We agree that looking at both pooled and stratified samples is important, as reflected in previous responses and in the main text discussion. These analyses are the focus of our next large-scale collaborative effort. For the purposes of this study, we designed and validated interoperable clinical classifiers against adjudicated case populations to harmonise phenotypes across studies and to achieve sufficient statistical power.
Table 1 should mention for each of the loci the established genome-wide significant associations with conditions that predispose to heart failure (CAD, obesity/BMI, aFib, hypertension). This increases the information content and puts the data into perspective.
We provide the association of sentinel variants or proxies with heart failure related traits (including those suggested) in Supplementary Table 4. To make the table clearer, we have filtered the results to present only those associations that reach genome-wide significance (P < 5 × 10⁻⁸). We are happy to add this information to Table 1; however, as an alternative, given limited space, we have coded this information into Figure 3 (see below).
Table 1, it is misleading to call the 9p21 locus by a gene that has been shown to be not involved in the aetiology. One way of dealing with the locus could be to name it 9p21/CDKN2B.
We thank the reviewer for their suggestion and have now amended the manuscript to refer to this locus as 9p21/CDKN2B.
The lambda for the meta-analysis was now reported to be 1.127. This is fairly high and it might be good to have a genetic epidemiologist comment on this issue.
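For readers unfamiliar with the statistic, the lambda referred to here is usually the genomic inflation factor: the median observed 1-df chi-square association statistic divided by its expected median under the null (about 0.4549). The sketch below uses that standard definition with the Python standard library only; how the authors actually computed the reported value is not stated in this text, and the function names are illustrative.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2

def p_to_chi2(p):
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z

def genomic_inflation(p_values):
    """Genomic inflation factor: median observed chi-square over the null median."""
    return median([p_to_chi2(p) for p in p_values]) / CHI2_1DF_MEDIAN

# Evenly spaced p-values mimic a null distribution, so lambda should be close to 1.
null_like = [(i + 0.5) / 1001 for i in range(1001)]
lambda_null = genomic_inflation(null_like)
print(f"lambda for null-like p-values: {lambda_null:.3f}")
```

A value above 1, such as the 1.127 mentioned by the reviewer, indicates test statistics that are larger on average than expected under the null.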