In recent years, multiple loci dispersed on the genome have been shown to be associated with coronary artery disease (CAD). We investigated whether these common genetic variants also hold value for CAD prediction in a large cohort of patients with familial hypercholesterolemia (FH). We genotyped a total of 41 single-nucleotide polymorphisms (SNPs) in 1701 FH patients, of whom 482 patients (28.3%) had at least one coronary event during an average follow up of 66 years. The association of each SNP with event-free survival time was calculated with a Cox proportional hazard model. In the cardiovascular disease risk factor adjusted analysis, the most significant SNP was rs1122608:G>T in the SMARCA4 gene near the LDL-receptor (LDLR) gene, with a hazard ratio for CAD risk of 0.74 (95% CI 0.49–0.99; P-value 0.021). However, none of the SNPs reached the Bonferroni threshold. Of all the known CAD loci analyzed, the SMARCA4 locus near the LDLR had the strongest negative association with CAD in this high-risk FH cohort. The effect is contrary to what was expected. None of the other loci showed association with CAD.
Familial hypercholesterolemia (FH) is an autosomal dominant disorder characterized by increased low-density lipoprotein cholesterol (LDL-C) levels and a predisposition to coronary artery disease (CAD). The diagnosis is based on stringent clinical criteria or on the identification of mutations in the LDL-receptor (LDLR), apolipoprotein B or proprotein convertase subtilisin/kexin type 9 (PCSK9) gene. The frequency of heterozygosity is at least 1/500 in most European countries.1 By virtue of the elevated LDL-C levels, FH results in lipid accumulation in the arterial wall and, as a consequence, accelerated atherosclerosis.2, 3 If left untreated, 50% of male and 30% of female heterozygous FH patients will develop CAD before 60 years of age.4 The age of onset and severity of CAD vary considerably between FH patients, even among individuals who share an identical gene defect.5
Previously, we performed a retrospective multicentre cohort study of 2400 FH patients, of whom 782 patients (32.6%) had at least one cardiovascular event during an average follow up of 66 years.6 In this cohort, we demonstrated that LDL-C levels are more important than the LDLR mutation type in determining the age of onset of CAD.7 We also showed that classical risk factors including male gender, smoking, hypertension, type 2 diabetes, low levels of high-density lipoprotein cholesterol (HDL-C) and elevated lipoprotein(a) levels were independent risk factors for the development of CAD. However, these factors combined explained only 18.7% of the variation in CVD occurrence.8 Thus, a considerable part of the variability in CVD occurrence remains to be disentangled and common genetic variation might provide one of the explanations.
Recent large-scale genome-wide association (GWA) studies have revealed common genetic variations at 45 loci that moderately affect (hazard ratios (HR) varying between 1.1 and 2.0) the incidence of CAD in the general Caucasian population.9, 10, 11, 12
We set out to address whether common variations within the 45 loci previously identified by recent GWA studies are modifiers of CAD risk in a high-risk population of heterozygous FH cases.
Materials and methods
Written informed consent was obtained from all patients. The Medical Ethics Review Board of each participating hospital approved the protocol, which complies with the Declaration of Helsinki.
Study design and study population
The Genetic Identification of Risk Factors in Familial Hypercholesterolemia (GIRaFH) is a retrospective multicenter cohort study. The study design and study population have been described elsewhere.6 Briefly, DNA samples from patients who, based on clinically oriented algorithms, are anticipated to suffer from FH are sent to the central core molecular diagnostic laboratory by physicians working at one of the nationwide lipid clinics. LDLR gene variation was genotyped according to previously published methods.13 DNA of a total of 9300 hypercholesterolemic patients was stored in the DNA database at the time of initiation of the cohort. Only those cases from larger lipid clinics were selected (9188) for further analysis, as smaller clinics normally only send DNA samples of the rare, usually very serious FH cases. Of this set, 4000 cases were randomly selected. After review of medical records, a group of 2400 patients fulfilled internationally established FH diagnostic criteria and were included in the study.14 Phenotypic, CVD event and cause of death data were acquired from the medical charts. None of the study population received primary prevention in the form of beta-blockers or aspirin. CAD was defined as angina pectoris (AP), acute coronary syndrome (ACS), percutaneous coronary interventions (PCI) or coronary artery bypass grafting (CABG).
On the basis of the latest published meta-analysis, a total of 45 single-nucleotide polymorphisms (SNPs) associated with CAD were identified.12 We did not include SNPs which were only associated with an intermediate trait such as lipid levels, type 2 diabetes or hypertension.
If these SNPs were not directly genotyped, imputed data, generated using MACH and the HapMap phase 2 data sets (build 36 release 22),15, 16, 17 were used. Finally, if this failed, proxies were looked up within a window of 500 kb (R2≥0.8).15, 16, 17 See Figure 1.
Genotyping and imputation
For 1701 of the 2400 GIRaFH cases DNA was available for additional genotyping. Genotyping was performed using the 50 K gene-centric Human CVD BeadChip18 and genotypes were called using the BeadStudio software (Illumina, San Diego, CA, USA) and subjected to quality control filters at the sample and the SNP level. After genotyping, PLINK v1.07 (http://pngu.mgh.harvard.edu/purcell/plink/) was used to test the SNPs for population substructure which could introduce false-positive associations. This was done by means of multidimensional scaling implementation.19 In addition, the SNPs were subjected to additional quality control filters based on sample size and minor allele frequencies (MAF). Samples with a call rate of <95% were excluded from further analysis. Genetic markers with a MAF <1% were excluded from further analysis. An identity-by-state analysis was performed to ensure that only Caucasian individuals were included in the final association analyses.
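The two exclusion rules stated above (sample call rate <95%, MAF <1%) can be illustrated with a minimal sketch. This is not the PLINK or BeadStudio workflow used in the study: the toy genotype data, the function names and the order in which the filters are applied (samples first, then SNPs) are assumptions made purely for illustration.

```python
# Hypothetical toy data: one list of calls per sample, one position per SNP.
# Each call is the number of minor alleles (0, 1 or 2), or None if missing.
genotypes = {
    "sample_A": [0, 1, 2, 0],
    "sample_B": [0, None, None, 0],
    "sample_C": [1, 1, 0, 0],
}


def sample_call_rate(calls):
    """Fraction of SNPs with a non-missing call for one sample."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF of one SNP over the non-missing calls (0.0 if none are observed)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def apply_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Drop samples below the call-rate cut-off, then SNPs below the MAF cut-off.

    Assumption: samples are filtered before SNPs; the text does not state the order.
    """
    kept_samples = {
        name: calls
        for name, calls in genotypes.items()
        if sample_call_rate(calls) >= min_call_rate
    }
    n_snps = len(next(iter(genotypes.values())))
    kept_snps = []
    for j in range(n_snps):
        column = [calls[j] for calls in kept_samples.values()]
        if minor_allele_frequency(column) >= min_maf:
            kept_snps.append(j)
    return kept_samples, kept_snps


kept_samples, kept_snps = apply_qc(genotypes)
print(sorted(kept_samples), kept_snps)
```

In this toy example, sample_B is removed for a call rate of 50%, and the last SNP is removed because no minor allele is observed among the remaining samples.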
With an effective sample size of 1701 cases and 483 events, the GIRaFH sample has 80% power to identify statistically significant associations for SNPs conferring a relative risk >2.2 and MAF >0.10. Differences between subgroups were tested with χ2 statistics or an independent sample t-test where appropriate. Triglycerides had a skewed distribution and therefore statistical analyses were performed on log-transformed data. The association of each SNP with event-free survival time was calculated with a Cox proportional hazard model in the R package ProbABEL.20 The event-free survival time was defined as the time from birth to the date of the CAD event or, when no event had occurred, as the time from birth to the date of inclusion in the study. An additive genetic model was applied in the Cox model and classical cardiovascular risk factors were used as covariates.21 We corrected for factors that had previously been shown to be associated with CAD risk in this population: age, gender, smoking, type 2 diabetes, hypertension and body mass index (BMI). Analyses were performed for the loci previously reported to be associated with CAD. Of the 45 reported SNPs, data were available for 41 SNPs after imputation. Significance was defined as a P-value <0.05 divided by the number of SNPs tested, yielding a significance level of 1.22 × 10−3 (41 SNPs based on the literature). All analyses were also performed separately for males and females. Statistical analyses were performed using SPSS software (version 17; Chicago, IL, USA).
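Two of the definitions in this paragraph lend themselves to a short worked example: the event-free survival time (time from birth to the first CAD event, or to inclusion when no event occurred) and the Bonferroni-corrected significance level. The sketch below is illustrative only; the function names, the example dates and the use of 365.25 days per year are assumptions and are not taken from the study's analysis scripts.

```python
from datetime import date


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def event_free_survival(birth, inclusion, first_cad_event=None):
    """Return (time in years, event indicator) for one patient.

    Time runs from birth to the first CAD event; patients without an event
    are censored at the date of inclusion (event indicator 0).
    """
    end = first_cad_event if first_cad_event is not None else inclusion
    years = (end - birth).days / 365.25  # assumption: average year length
    return years, int(first_cad_event is not None)


# 41 SNPs tested at an overall alpha of 0.05.
threshold = bonferroni_threshold(0.05, 41)
print(f"{threshold:.2e}")  # 1.22e-03

# Hypothetical patient with a CAD event before inclusion in the study.
years, event = event_free_survival(
    birth=date(1950, 1, 1),
    inclusion=date(2002, 6, 1),
    first_cad_event=date(2000, 1, 1),
)
print(round(years, 1), event)
```

Under this threshold, a nominal P-value such as 0.021 does not count as statistically significant, because it is larger than 0.05/41.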
All described variants will be submitted to the following public variant database: Leiden Open Variation Database (LOVD) 3 (http://databases.lovd.nl/shared/genes/).
Genotyping and imputation
Of the 1701 DNA samples, 7 individuals did not cluster appropriately in the identity-by-state (IBS) analysis, reflecting non-Caucasoid origin, and were subsequently excluded, leaving a total of 1694 DNA samples for analysis. A total of 38 978 SNPs met our quality control steps. No subjects were excluded because of low call rate. The genomic inflation factor was close to 1 (λ=1.07), indicating that the influence of population substructure and genotyping errors was negligible. Using HapMap, we were able to impute up to 2.5 million SNPs for all individuals. Out of the 45 SNPs, 9 were directly genotyped and 30 were imputed. For 2 of the remaining 6 SNPs, a proxy in LD with the lead SNP could be found: rs12205331:C>T with rs12197124:C>T (R2=1.0), and rs9369640:C>A with rs7751826:C>T (R2=1.0). In total, therefore, 9 SNPs were genotyped directly, 30 SNPs were imputed, proxy SNPs were found for 2 SNPs, and 4 SNPs could not be found within the imputed data, because they were not available in the reference panel or were poorly imputed. We only analyzed the 41 SNPs that were available after imputation.
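The SNP accounting in this paragraph can be checked with a few lines of arithmetic. The variable names below are illustrative; the counts are those reported in the text.

```python
# Counts as reported in the text.
reported_snps = 45
directly_genotyped = 9
imputed = 30
proxied = 2

# SNPs available for analysis, and those that could not be recovered.
available = directly_genotyped + imputed + proxied
unavailable = reported_snps - available

print(available, unavailable)  # 41 4
```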
Demographic data of the 1694 study cases are listed in Table 1. The average age at inclusion was higher in the CAD group. The mean age of onset of CAD was 49.1 (standard deviation; SD 10.7) years and the mean event-free survival in individuals without CAD was 47.3 (SD 12.6) years. During follow-up, 28% of our cohort developed CAD. Cardiovascular risk factors were significantly more prevalent in CAD cases than in controls. Treatment-naive LDL-C levels at the time of inclusion in the cohort did not differ between patients with and without CAD. All first visits to the lipid clinic took place between March 1969 and November 2002.
SNPs and risk of CAD
None of the 41 SNPs reached a significant P-value after the Bonferroni correction (P<1.22 × 10−3) (see Table 2). The best performing SNP in the cardiovascular risk factor adjusted analysis was rs1122608:G>T, in the SMARCA4 gene near the LDLR gene, with HR 0.74 (CI 0.49−0.99) and P-value 0.021 (Figure 2). No differences were observed in the gender-specific analysis (data not shown).
We tested the hypothesis that common genetic variants that were previously shown in GWA studies to be associated with CAD risk in the general population might affect the risk of CAD in a high-risk cohort of FH patients. As previously reported, established risk factors do associate with the risk of CAD in FH patients.8, 22 However, none of the tested CAD-associated SNPs significantly modified the risk of CAD in our FH cohort in analyses unadjusted or adjusted for established cardiovascular risk factors. The lowest observed P-value of association was for a SNP in the SMARCA4 gene, near the LDLR gene (in adjusted analysis; P=0.021); however, it showed a paradoxically protective effect.
FH patients are known to be at high CAD risk. Other patient cohorts with high risk are those with established cardiovascular disease and those with type 2 diabetes. Of these three patient categories, the effect of 9p21 variants on survival had been tested only in those with established CAD. In a prospective observational study including 846 Caucasoid cases who underwent CABG, the 9p21 SNP rs10116277:G>T was independently associated with all-cause mortality during 5 years of follow-up after surgery.23 Homozygotes for the minor allele of this SNP had an increased risk of all-cause mortality (HR 1.7; CI 1.1−2.7). The SNP even remained associated with outcome after adjustment for the Euroscore, a score commonly used to predict CVD outcome after CABG. In contrast, in a larger cohort of patients with established CAD (>8000 patients), a haplotype block with eight of the strongest 9p21 SNPs was associated with better prognosis in whites but not in blacks or Hispanics.24 Moreover, in line with our findings, the HRs for prognosis among the risk alleles were in the opposite direction compared with the published HRs for CAD/MI risk in the Caucasoid population for the two most widely reported 9p21 SNPs, rs2383207:A>G and rs10757278:A>G: the HRs for the G alleles were 0.75 (0.60−0.93; P=0.0083) and 0.81 (0.66−1.0; P=0.05), respectively. However, a less commonly reported linkage disequilibrium block consisting of six 9p21 SNPs was associated with worse prognosis.24 Compared with these two studies in high-risk populations, our study adds a considerable number of events in a high-risk population; a total of 482 CAD events occurred during follow-up in our cohort, whereas the studies by Muehlschlegel et al23 and Gong et al24 reported analyses on 38 and 134 CVD events, respectively. In summary, the only two other studies that addressed the effect of 9p21 SNPs showed conflicting results in high-risk populations.
Previous work conducted by our group to determine the genetic modifiers of CVD risk among FH patients showed that the G20210A polymorphism in the prothrombin gene was strongly associated with significantly increased CVD risk.14 However, in that publication, the threshold to reach statistical significance was rather lenient. In that paper, a P-value <0.001 was considered statistically significant; however, applying the Bonferroni correction, which is common practice nowadays, would suggest a P-value <0.00076. None of the SNPs reached the a priori determined value for statistical significance. Because the reported prothrombin variant is not considered to be a CAD risk SNP, its effect on survival was not calculated in our current analysis. In the current analysis, only 1701 samples of the original 1940 were available. This does not reflect selection bias; it merely reflects prior usage of the DNA. Baseline characteristics are similar in both studies.
Our study did not address the underlying explanation for the rather counterintuitive result. The top SNP in our analysis, rs1122608:G>T, had a protective effect, contrary to the latest papers. However, smaller reports have reported a protective effect of this variant for CAD and PAD.25, 26 Martinelli et al25 suggest that the effect they observed is due to the lipid effects of the variants.25, 26
The effect of common CAD-associated genetic variants in general population samples has also yielded conflicting results. In a study among ∼3000 cases and 3000 controls, a gene score comprising nine SNPs was associated with CAD risk.27 People in the top quintile for this gene score had a twofold increased risk of MI compared with those in the bottom quintile, corrected for age, sex and ancestry. In line with this, a SNP-based risk score designed in a Finnish cohort of 30 725 participants free of CVD was associated with the risk of a first CAD event, with a relative risk of 1.7 between the highest and lowest quintiles of the score adjusted for traditional risk factors: sex, LDL-C, HDL-C, current smoking, BMI, systolic and diastolic blood pressure, blood pressure treatment and prevalent type 2 diabetes.28 In contrast, Paynter et al29 did not observe an independent association between genetic risk factors and CAD risk in a cohort of 19 313 initially healthy women during 12.3 years of follow-up. A risk score based on 12 SNPs was clearly associated with CAD risk after adjustment for age. However, after additional adjustment for other traditional risk factors, this association disappeared.
We have previously shown that the type of LDLR mutations underlying FH, variation in LDL-C levels, and established classical risk factors explain 21.3% of the variation in CAD risk. Our current analysis suggests that environmental and/or unknown genetic factors may have a role. As there was no standardized information available on lifestyle factors such as dietary habits and physical activity, the effect of environmental factors and their potential interaction with genetic variants could not be studied, but it is unlikely that this could explain significant proportions of the remaining 80% of CAD risk prediction. At maximum, the common genetic variants we tested explained only 10% of the heritability of the trait, if we consider the heritability estimates of 40% for CAD to be correct.10 Much of the ‘missing heritability’ is expected to be explained by common variants not yet identified, rare variants, structural variants and copy number variations.
Several aspects of the design of our study need to be considered. The major strength of this study is the unparalleled cohort size, the detailed information on CAD events during follow-up and the high CAD event rate. However, our study is hampered by several limitations. First, power calculations suggest that our study was at the limit of the power needed to detect statistically significant associations with CAD, as the studied SNPs were previously shown to have a moderate effect (RR 1.1−1.7). In addition, the MAF of some of the SNPs was lower than that used in the power calculation. Second, the majority of the CAD SNPs analyzed were not directly genotyped but imputed using HapMap, and we cannot rule out misclassifications. However, Southam et al17 have shown that imputation of common variants is generally very accurate. Finally, the patients who were included in this study were referred to a Lipid Clinic. In theory, patients with the most detrimental genetic profiles might have died before referral. Therefore, the effect of genetic variants associated with a more severe CAD phenotype or early death could have been underestimated or missed.
In this high-risk cohort of patients with FH, common SNPs shown to be associated with CAD risk in the general population could not be associated with the disease.
This work was supported by Bloodomics consortium, European Union 6th Framework Programme (LSHM-CT-2004-503485), Ipse Movet, the Lifetime Achievement Award of the Dutch Heart Foundation (2010T082) to JJPK and grants from the British Heart Foundation and the National Institute for Health Research, England to WHO. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.