We aimed to replicate reported associations of 10 SNPs at eight distinct loci with overall ischemic stroke (IS) and its subtypes in an independent cohort of Dutch IS patients. We included 1,375 IS patients enrolled in a prospective multicenter hospital-based cohort in the Netherlands, and 1,533 population-level controls of Dutch descent. We tested these SNPs for association with overall IS and its subtypes (large artery atherosclerosis, small vessel disease and cardioembolic stroke (CE), as classified by TOAST) using an additive multivariable logistic regression model, adjusting for age and sex. We obtained odds ratios (OR) with 95% confidence intervals (95% CI) for the risk allele of each SNP analyzed and exact p-values by permutation. We confirmed the association at 4q25 (PITX2) (OR 1.43; 95% CI, 1.13–1.81, p = 0.029) and 16q22 (ZFHX3) (OR 1.62; 95% CI, 1.26–2.07, p = 0.001) as risk loci for CE. Locus 16q22 was also associated with overall IS (OR 1.24; 95% CI, 1.08–1.42, p = 0.016). Other loci previously associated with IS and/or its subtypes were not confirmed. In conclusion, we validated two loci (4q25, 16q22) associated with CE. In addition, our study suggests that the association of locus 16q22 may not be limited to CE but may extend to overall IS.
A substantial proportion of the etiology of acute ischemic stroke (IS) is thought to be attributable to (common) genetic variation1. Genome-wide association studies (GWAS) have estimated that the proportion of phenotypic variance of IS explained by common variants ranges between 16 and 40%, depending on subtype1. Thus far, GWAS have identified a small number of single nucleotide polymorphisms (SNPs) associated with overall IS or its subtypes large artery atherosclerosis (LAA), small vessel disease (SVD) and cardioembolism (CE)2,3,4,5,6,7,8,9,10,11. These loci have been suggested to be mostly or entirely subtype specific. Discovery of common variants influencing stroke is hindered by many challenges, including but not limited to the heterogeneity of the phenotype, high lifetime risk of stroke, late disease onset, and limited statistical power in studies performed to date. Thus, replication of presumed risk loci in independent cohorts is emphatically recommended before initiating fine-mapping efforts in search of causal variants and functional studies to discern the functional consequences of these variants12. We aimed to replicate the associations of eight loci with IS and/or its subtypes as previously reported in an independent set of patients with IS drawn from a Dutch cohort.
We included 1,375 patients with IS of Dutch descent who were enrolled in the Dutch Parelsnoer initiative (PSI) Cerebrovascular Disease13. This study represents an ongoing collaboration of eight university medical centers in the Netherlands in which clinical data, imaging and biomaterials of patients with stroke are prospectively and uniformly collected13. The present study includes patients with IS enrolled between September 2009 and November 2014. IS was defined as focal neurologic deficits of sudden onset originating from the brain and persisting for more than 24 hours, in the absence of hemorrhage as confirmed by imaging. We further classified IS subtypes according to the Trial of Org 10172 in Acute Stroke Treatment (TOAST) criteria as LAA, SVD, CE, stroke of other determined cause, or stroke of undetermined cause14. We used 1,533 population-level controls of Dutch descent15. Information on ancestry in patients and controls was obtained by self-report. The Medical Ethics Committee of the University Medical Center Utrecht approved the study and all patients provided written informed consent. The research described was conducted in accordance with relevant guidelines and regulations.
DNA of the cases and controls was extracted from peripheral blood. We genotyped 10 SNPs at eight loci (1p13.2 (TSPAN2), 4q25 (PITX2), 6p21.1 (SUPT3H/CDC5L), 7p21.1 (HDAC9), 9p21.3 (CDKN2B-AS1), 9q34 (ABO), 12q24 (ALDH2) and 16q22 (ZFHX3)) using KASP assays (LGC Genomics, Hoddesdon, UK).
We removed individuals with >25% missing genotypes (29 individuals; 7 cases and 22 controls). We tested each SNP for deviation from Hardy-Weinberg equilibrium (significance threshold p < 0.001) and calculated minor allele frequencies for each SNP in cases and controls. As the design of the study prevented us from performing principal component analyses to test for ancestral homogeneity, we compared risk allele frequencies with those from the Genome of the Netherlands (GoNL) Project16. GoNL comprises a comprehensive characterization of genetic variation of 769 individuals of Dutch ancestry as assessed by whole-genome sequencing16. Frequencies were calculated in the unrelated set of individuals in GoNL (N = 498). Next, we tested these SNPs for association with IS and its subtypes using an additive logistic regression model, in which each genotype is coded as 0, 1 or 2 copies of the risk allele, adjusted for age and sex. We report odds ratios (OR) with 95% confidence intervals (95% CI) for each risk allele as established in previous studies2,4. To assess whether retaining individuals with some missing genotypes affected the results, we performed a sensitivity analysis excluding every individual with at least one missing genotype. Accompanying exact p-values for the observed associations were obtained by performing 10,000 permutations. Analyses were performed in PLINK version 1.9b3.38.
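As an illustration of how a permutation-based p-value for an additive genetic model can be obtained, the sketch below shuffles case/control labels and compares a trend statistic with its permutation distribution. It is a simplified stand-in, not the analysis performed here: it omits the age and sex adjustment of the logistic model, and the function names, the toy genotypes and the add-one correction are assumptions made for illustration.

```python
import random

def trend_statistic(genotypes, is_case):
    """Additive-model score: n * sum(g*y) - sum(g) * sum(y).

    This is proportional to the covariance between risk-allele count
    (0, 1 or 2) and case status, kept in integers to avoid rounding.
    """
    n = len(genotypes)
    cross = sum(g * y for g, y in zip(genotypes, is_case))
    return n * cross - sum(genotypes) * sum(is_case)

def permutation_p_value(genotypes, is_case, n_perm=10_000, seed=1):
    """Two-sided empirical p-value from permuting case/control labels."""
    rng = random.Random(seed)
    observed = abs(trend_statistic(genotypes, is_case))
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(trend_statistic(genotypes, labels)) >= observed:
            hits += 1
    # Add-one correction (an assumption of this sketch) keeps p above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 30 cases enriched for the risk allele, 30 controls.
toy_genotypes = [2] * 20 + [1] * 10 + [0] * 20 + [1] * 10
toy_status = [1] * 30 + [0] * 30
toy_p = permutation_p_value(toy_genotypes, toy_status, n_perm=2000)
print(f"empirical p-value on toy data: {toy_p:.4f}")
```

Because the labels rather than the genotypes are shuffled, the genotype distribution is preserved in every permutation, which is what makes the resulting p-value independent of large-sample approximations.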
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
After quality control, our data set consisted of 1,368 IS patients (803 (58.7%) men, median age 67.3 years (interquartile range (IQR): 56.5–77.2)) and 1,511 controls of Dutch descent (926 (61.3%) men, median age 64.4 years (IQR: 58.0–70.3)). Baseline characteristics of the patients with IS are presented in Supplementary Table S1. All genotyped SNPs had call rates >98% and were in Hardy-Weinberg equilibrium. Risk allele frequencies of all variants showed high concordance with those reported in the Genome of the Netherlands Project (Table 1)16.
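The two per-SNP quality checks mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium, whereas the software used in the study may apply a different (for example exact) test, and the function names and genotype counts are hypothetical.

```python
import math

def call_rate(genotypes):
    """Fraction of successful genotype calls; None marks a failed call."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the three genotype counts and assumes both alleles are observed,
    so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts that match Hardy-Weinberg proportions exactly.
chi2_ok, p_ok = hwe_chi_square(100, 200, 100)
# Hypothetical counts with no heterozygotes at all.
chi2_bad, p_bad = hwe_chi_square(200, 0, 200)
print(p_ok >= 0.001, p_bad >= 0.001)
```

A SNP would pass the threshold used in this study when its p-value is at least 0.001.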
At the 4q25 (PITX2) locus, previously identified by GWAS, we confirmed the association at rs2634074 with CE stroke (OR = 1.43 for the T allele; 95% CI, 1.13–1.81, p = 0.029), but not at rs2200733 (OR = 1.30 for the T allele; 95% CI, 0.96–1.76; p = 0.60) (Table 1). We also replicated 16q22 (ZFHX3) as a risk locus for CE (OR = 1.62 for the T allele; 95% CI, 1.26–2.07, p = 0.0014) and found this locus significantly associated with overall IS (OR = 1.28 for the T allele; 95% CI, 1.12–1.47, p = 0.002) (Table 1). This association remained significant in a sensitivity analysis excluding cases with cardioembolic stroke (OR = 1.18; 95% CI, 1.03–1.36, p = 0.02), and when we only included patients with LAA stroke or SVD (OR = 1.22; 95% CI, 1.03–1.45, p = 0.02).
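For readers who want to relate the reported odds ratios to the underlying regression coefficients, the sketch below recovers a log odds ratio and its standard error from a 95% confidence interval. It assumes the intervals are Wald intervals, symmetric on the log scale; that is an assumption of this example, not something stated in the text, and the function names are illustrative. Wald p-values derived this way would not be expected to match the permutation-based p-values reported above.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_or_and_se(ci_low, ci_high):
    """Log odds ratio and standard error implied by a Wald 95% CI."""
    log_or = (math.log(ci_low) + math.log(ci_high)) / 2
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_or, se

def wald_ci(log_or, se):
    """Rebuild the 95% CI on the odds-ratio scale."""
    return math.exp(log_or - Z_95 * se), math.exp(log_or + Z_95 * se)

# Intervals reported in the text for cardioembolic stroke.
pitx2_log_or, pitx2_se = log_or_and_se(1.13, 1.81)  # rs2634074, 4q25
zfhx3_log_or, zfhx3_se = log_or_and_se(1.26, 2.07)  # 16q22
print(f"4q25: OR {math.exp(pitx2_log_or):.2f}, SE(log OR) {pitx2_se:.3f}")
print(f"16q22: OR {math.exp(zfhx3_log_or):.2f}, SE(log OR) {zfhx3_se:.3f}")
```

Under the Wald assumption the point estimate is the geometric mean of the interval bounds, which gives a quick consistency check between a reported OR and its CI.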
We could not replicate the previously-established associations at the 1p13.2 (TSPAN2), 6p21.1 (SUPT3H/CDC5L), 7p21.1 (HDAC9), 9p21.3 (CDKN2B-AS1), 9q34 (ABO) and 12q24 (ALDH2) loci with overall IS or its subtypes. However, all of the effect directions were consistent with the observed directions of the initial association reports except for the C allele of rs556621 at locus 6p21.1 (SUPT3H/CDC5L) (Table 1).
Results did not differ materially when individuals with missing genotypes were excluded (Supplementary Table S2).
In a well-defined cohort of patients with IS, we confirmed the 4q25 (PITX2) and 16q22 (ZFHX3) loci to be significantly associated with the IS subtype CE. We also found locus 16q22 to be significantly associated with overall IS. We were not able to replicate the previously established associations at the 1p13.2 (TSPAN2), 6p21.1 (SUPT3H/CDC5L), 7p21.1 (HDAC9), 9p21.3 (CDKN2B-AS1), 9q34 (ABO) and 12q24 (ALDH2) loci with overall IS or its subtypes, although, barring locus 6p21.1 (SUPT3H/CDC5L), the effect directions of these associations were consistent with expectation2,3,4,5,6,7,8,9,10,11.
Previous studies have consistently demonstrated the association of variants at 4q25 (PITX2) and 16q22 (ZFHX3) with atrial fibrillation, both in patients with and without IS3,10, and additionally with cardioembolic stroke2,3,4. In the 4q25 locus, we replicated only rs2634074 and not rs2200733, despite moderate linkage disequilibrium between the two SNPs (r² = 0.51) and an effect size comparable to that established previously2,3,4,9. This finding is likely explained by the difference in power to detect a statistically significant signal at each SNP (95% and 56%, respectively), a difference that results from their ~10% frequency difference. After the initial report of the association10, other studies found locus 16q22 to be specific for CE2,4, whereas our findings point to a possible association with both CE and overall IS. The association with overall IS remained significant after excluding cases with cardioembolic stroke, possibly suggesting a partially shared genetic architecture across different stroke subtypes17.
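The dependence of power on risk-allele frequency can be illustrated with a simple normal-approximation calculation for an allelic (two-proportion) test. This is a sketch only: the sample sizes, frequencies and odds ratio below are hypothetical, the method is not necessarily the one used to obtain the power figures quoted above, and it is not expected to reproduce them.

```python
from statistics import NormalDist

def case_allele_frequency(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate two-sided power of a two-proportion z-test on alleles."""
    nd = NormalDist()
    p0 = control_freq
    p1 = case_allele_frequency(p0, odds_ratio)
    a1, a0 = 2 * n_cases, 2 * n_controls  # two alleles per individual
    pooled = (a1 * p1 + a0 * p0) / (a1 + a0)
    se_null = (pooled * (1 - pooled) * (1 / a1 + 1 / a0)) ** 0.5
    se_alt = (p1 * (1 - p1) / a1 + p0 * (1 - p0) / a0) ** 0.5
    z = nd.inv_cdf(1 - alpha / 2)
    diff = abs(p1 - p0)
    upper = nd.cdf((diff - z * se_null) / se_alt)
    lower = nd.cdf((-diff - z * se_null) / se_alt)
    return upper + lower

# Hypothetical scenario: same odds ratio, two different allele frequencies.
power_rare = allelic_test_power(300, 1500, control_freq=0.10, odds_ratio=1.4)
power_common = allelic_test_power(300, 1500, control_freq=0.20, odds_ratio=1.4)
print(f"power at frequency 0.10: {power_rare:.2f}")
print(f"power at frequency 0.20: {power_common:.2f}")
```

With the odds ratio and sample size held fixed, the more common allele yields higher power in this approximation, which is the qualitative pattern invoked to explain why only one of the two 4q25 SNPs replicated.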
The variants at 4q25 lie near PITX2, which encodes a transcription factor that has convincingly been implicated in sinoatrial node development and regulation of cardiac action potentials18. Little is known about the role of ZFHX3 in ischemic stroke. Besides atrial fibrillation, this gene has also been implicated in the regulation of myogenic and neuronal differentiation, and as a tumor suppressor gene in multiple cancers. Additionally, sequence variants in the locus have been linked to Kawasaki disease10. These lines of evidence suggest that the role of this locus in IS might not be restricted to strokes of cardiac origin, which could explain its potential association with overall IS.
While recent publications have suggested that implicated variants are likely subtype specific2,4, it is noteworthy that some genetic overlap between diagnostic IS subtypes has also been reported17. Given the repeated discovery of this locus in cardioembolic stroke in large-scale GWAS, it is likely that the observed association of locus 16q22 with overall IS in this study is driven by a subset of patients with another IS subtype that may also have as yet undiscovered atrial fibrillation or cardioembolic stroke. In addition, significant associations of genetic risk scores for atrial fibrillation with overall IS were recently found to be almost entirely explained by an association with cardioembolic stroke19.
Several factors may have prevented us from being able to replicate all associations investigated here. First, we had limited power to detect (nominal) associations because of the relatively small cohort; power was particularly limited for lower-frequency variants and variants with modest effects, which describes the vast majority of loci discovered through genome-wide association studies (stroke loci included). Thus, it is entirely possible that non-replicated loci are truly associated with overall IS and its subtypes and would replicate in larger sample collections. Second, failure to replicate may also be due to phenotypic heterogeneity; subtyping approaches vary across studies, and subtyping is imperfect, as many samples are categorized as ‘undetermined,’ thus allowing for potentially incorrectly subtyped cases (and consequently, reducing power). However, to decrease diagnostic uncertainty we excluded patients with transient ischemic attacks. Despite these limitations, most variants showed comparable effect sizes in the same direction as reported previously.
In conclusion, we validated two loci (4q25, 16q22) associated with IS caused by CE. In addition, our study suggests that locus 16q22 may also be associated with overall IS, or with another subtype for which the current study lacked the power to demonstrate a significant association. Future studies should search for the causal variants underlying these loci by fine-mapping and should further discern which genes within the loci have functional consequences for disease.
Bevan, S. et al. Genetic heritability of ischemic stroke and the contribution of previously reported candidate gene and genomewide associations. Stroke 43, 3161–3167 (2012).
NINDS Stroke Genetics Network (SiGN) & International Stroke Genetics Consortium (ISGC). Loci associated with ischaemic stroke and its subtypes (SiGN): a genome-wide association study. Lancet Neurol. 15, 4–7 (2015).
Gretarsdottir, S. et al. Risk variants for atrial fibrillation on chromosome 4q25 associate with ischemic stroke. Ann. Neurol. 64, 402–409 (2008).
Traylor, M. et al. Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE Collaboration): A meta-analysis of genome-wide association studies. Lancet Neurol. 11, 951–962 (2012).
Holliday, E. G. et al. Common variants at 6p21.1 are associated with large artery atherosclerotic stroke. Nat. Genet. 44, 1147–51 (2012).
Malik, R. et al. Low-frequency and common genetic variation in ischemic stroke: The METASTROKE collaboration. Neurology 1–10, doi:10.1212/WNL.0000000000002528 (2016).
Gschwendtner, A. et al. Sequence variants on chromosome 9p21.3 confer risk for atherosclerotic stroke. Ann. Neurol. 65, 531–539 (2009).
Williams, F. M. K. et al. Ischemic stroke is associated with the ABO locus: the EuroCLOT study. Ann. Neurol. 73, 16–31 (2013).
Kilarski, L. L. et al. Meta-analysis in more than 17,900 cases of ischemic stroke reveals a novel association at 12q24.12. Neurology 83, 678–85 (2014).
Gudbjartsson, D. F. et al. A sequence variant in ZFHX3 on 16q22 associates with atrial fibrillation and ischemic stroke. Nat. Genet. 41, 876–8 (2009).
Bellenguez, C. et al. Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke. Nat. Genet. 44, 328–33 (2012).
Manolio, T. A. Bringing genome-wide association findings into clinical use. Nat. Rev. Genet. 14, 549–58 (2013).
Nederkoorn, P. J. et al. The Dutch String-of-Pearls Stroke Study: Protocol of a large prospective multicenter genetic cohort study. Int. J. Stroke 10, 120–122 (2015).
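Adams, H. P. et al. Classification of subtype of acute ischemic stroke. Definitions for use in a multicenter clinical trial. TOAST. Trial of Org 10172 in Acute Stroke Treatment. Stroke 24, 35–41 (1993).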
Huisman, M. H. B. et al. Family history of neurodegenerative and vascular diseases in ALS: A population-based study. Neurology 77, 1363–1369 (2011).
Genome of the Netherlands Consortium. Whole-genome sequence variation, population structure and demographic history of the Dutch population. Nat. Genet. 46, 818–25 (2014).
Holliday, E. G. et al. Genetic overlap between diagnostic subtypes of ischemic stroke. Stroke 46, 615–619 (2015).
Kirchhof, P. et al. PITX2c is expressed in the adult left atrium, and reducing Pitx2c expression promotes atrial fibrillation inducibility and complex changes in gene expression. Circ. Cardiovasc. Genet. 4, 123–133 (2011).
Lubitz, S. A. et al. Atrial fibrillation genetic risk and ischemic stroke mechanisms. Stroke, doi:10.1161/STROKEAHA.116.016198 (2017).
This study was funded by Parelsnoer Institute (PSI), which was co-financed by the Dutch Government, the Dutch Federation of University Medical Centers (UMC) and the eight UMCs (from 2007–2011). The continuation of the PSI is financed by the UMCs. Y.M. Ruigrok was supported by a clinical fellowship grant of the Netherlands Organization for Scientific Research (NWO) (project no. 40-00703-98-13533). C.J.M. Klijn is supported by a clinical established investigator grant from the Dutch Heart Foundation (2012T077), and an ASPASIA grant from The Netherlands Organisation for Health Research and Development, ZonMw (grant number 015008048).
The authors declare that they have no competing interests.