Genome-wide association studies (GWAS) have not consistently detected replicable genetic risk factors for ischemic stroke, potentially due to etiological heterogeneity of this trait. We performed GWAS of ischemic stroke and a major ischemic stroke subtype (large artery atherosclerosis, LAA) using 1,162 ischemic stroke cases (including 421 LAA cases) and 1,244 population controls from Australia. Evidence for a genetic influence on ischemic stroke risk was detected, but this influence was higher and more significant for the LAA subtype. We identified a new LAA susceptibility locus on chromosome 6p21.1 (rs556621: odds ratio (OR) = 1.62, P = 3.9 × 10−8) and replicated this association in 1,715 LAA cases and 52,695 population controls from 10 independent population cohorts (meta-analysis replication OR = 1.15, P = 3.9 × 10−4; discovery and replication combined OR = 1.21, P = 4.7 × 10−8). This study identifies a genetic risk locus for LAA and shows how analyzing etiological subtypes may better identify genetic risk alleles for ischemic stroke.
Stroke affects approximately 15 million persons worldwide each year1 and is a leading cause of death and adult acquired disability2,3. The vast majority of strokes are ischemic, involving cerebral artery blockage by atherosclerotic plaque or embolus. Although clinical risk factors for ischemic stroke are well established4, the genetic risk alleles are incompletely identified. Genetic influences on stroke risk are supported, however, by higher concordance among monozygotic than dizygotic twins5, increased risk among family members of affected individuals6 and high heritability of intermediate predictors, including carotid intima-media thickness (IMT: h2 ≈ 30–60%)7,8 and white matter lesions (h2 ≈ 50–70%)9,10.
With the exception of the 4q25 locus associated with atrial fibrillation and ischemic stroke11,12, the 9p21 region associated with coronary artery disease and ischemic stroke13,14 and a recently described 7p21.1 association with LAA15, GWAS for ischemic stroke have identified few convincingly associated variants. Inability to replicate many reported associations may be attributable to phenotypic heterogeneity, a challenge that could be partly addressed by more complete subtyping of ischemic stroke etiology. At least three major ischemic stroke etiological types are commonly distinguished: (i) large artery atherosclerosis (LAA); (ii) cardioembolism (CE); and (iii) small vessel occlusion (SVO)16. Genetic heterogeneity may contribute to this phenotypic diversity; a recent, well-powered GWAS of ischemic stroke detected heterogeneity of risk locus effects across stroke subtypes15, and family studies have also identified differences in subtype heritability, owing perhaps to variable roles of heritable intermediate phenotypes, such as hypertension and large vessel atherosclerosis17. The greatest familial risk has been associated with LAA, for which family history confers significant risk, even beyond the seventh decade of life6.
We conducted a GWAS of ischemic stroke in an Australian sample of European ancestry involving 1,230 cases and 1,280 population controls. The causal subtype of ischemic stroke was classified using TOAST criteria16. Demographic and clinical characteristics of the Australian Stroke Genetics Collaborative (ASGC) data set are summarized in Supplementary Table 1.
After quality control filtering of genotype data, data on 551,514 SNPs from 1,162 ischemic stroke cases and 1,244 controls were used for genotype imputation and genetic analysis. Before performing genome-wide association analyses, we assessed the genetic contribution to ischemic stroke and the LAA, CE and SVO subtypes using a recent method18 that estimates the proportion of phenotypic variance (Vg/Vp) attributable to variation in genotyped SNPs, where Vg is the component of phenotypic variance attributable to variation in genotyped SNPs and Vp is the total observed phenotypic variance. For ischemic stroke, the estimated genetic load was substantial (Vg/Vp IS = 0.39), with SNPs explaining a significant proportion of phenotypic variation (P = 4.5 × 10−4). For cases classified in the LAA subtype, we observed a higher, more significant estimate of genetic load (Vg/Vp LAA = 0.66; P = 5.6 × 10−5), consistent with previous reports of high familial risk for LAA6. Evidence for genetic contribution was less significant for the CE and SVO subtypes (Vg/Vp CE = 0.6, P = 0.0026 and Vg/Vp SVO = 0.1, P = 0.33, respectively; Table 1).
We performed two primary GWAS in the Australian discovery sample, comparing (i) all ischemic stroke cases (n = 1,162) and (ii) LAA cases (n = 421) with population controls (n = 1,244). GWAS of the CE and SVO subtypes, which both had fewer cases and a less significant Vg/Vp estimate, were performed as supplementary analyses (Supplementary Figs. 1 and 2 and Supplementary Tables 2 and 3). Genotypic effects were estimated using logistic regression models (1-degree-of-freedom additive trend tests) adjusted for age and sex. Results were compared with a prespecified significance threshold of 5 × 10−8, corresponding to Bonferroni adjustment of α = 0.05 for 1 × 10^6 independent tests. Quantile-quantile plots indicated excellent quality of the GWAS data and an absence of systematic bias caused by population substructure or other artifacts (Supplementary Fig. 3).
Analyses of ischemic stroke detected the strongest signals at several SNPs within the SLC5A4 gene on chromosome 22q12.3 (Fig. 1, Supplementary Fig. 4 and Supplementary Table 4). Peak association was detected at rs5998322 (Ptrend = 3.91 × 10−7; OR = 1.97, 95% confidence interval (CI) = 1.51–2.57) within exon 11. A strong signal was also detected 4 Mb downstream of this peak at a number of SNPs located within and upstream of the APOL2 gene (peak association at rs4479522: Ptrend = 3.23 × 10−6; OR = 1.34, 95% CI = 1.18–1.51). Analysis of rs5998322 adjusted for allele dosage at rs4479522 produced similar results to the unadjusted analysis (Ptrend = 4.47 × 10−7), suggesting independence of the two associated loci at 22q.
The GWAS of LAA detected two associated SNPs on chromosome 6p21.1 exceeding the prespecified threshold for genome-wide significance (α = 5 × 10−8; Figs. 1 and 2). These variants, rs556621 (Ptrend = 3.92 × 10−8; OR (A allele) = 1.62, 95% CI = 1.36–1.93) and rs556512 (Ptrend = 4.25 × 10−8; OR (A allele) = 1.62, 95% CI = 1.36–1.93) were in perfect linkage disequilibrium (LD) in HapMap Phase 2 Utah residents of Northern and Western European ancestry (CEU) data (r2 = 1, D′ = 1; Supplementary Table 5), with a minor (A) allele population frequency of 0.33. The rs556621 SNP was directly genotyped in our sample, whereas rs556512 was imputed with excellent reliability (imputation r2 = 0.99). Very similar effect sizes for rs556621 were estimated in logistic models further adjusted for the first ten ancestry principal components and several correlated clinical risk factors (Supplementary Table 6), indicating a lack of confounding by population substructure or clinically related heritable traits. Consistent but attenuated association of the 6p21.1 variants was observed for the broad ischemic stroke phenotype, with peak association also detected at rs556621 (P = 5.6 × 10−5; OR (A allele) = 1.29, 95% CI = 1.14–1.47) (Table 2). Supplementary analyses of CE and SVO subtypes revealed no association with rs556621 (P = 0.73 and 0.39, respectively; data not shown). In addition to the 6p21.1 locus, the LAA GWAS also detected clusters of suggestively associated SNPs (P < 1 × 10−5) at 14q32.33 and the second 22q12.3 locus detected in the GWAS of ischemic stroke (Supplementary Fig. 5 and Supplementary Table 7).
In a subsequent LAA GWAS adjusted for rs556621 genotype, no SNP showed evidence of strong independent association with LAA (peak P = 5.6 × 10−6 for rs11625862 at 14q32.33). Haplotype association tests across the 6p21.1 region also did not detect multi-marker haplotypes that were more strongly associated with LAA than the two index SNPs (data not shown).
The addition of rs556621 genotypes to a risk prediction model containing various clinical traits associated with LAA occurrence produced a small but significant increase in the area under the receiver operator characteristic (ROC) curve (ΔAUC = 0.01; P = 1.2 × 10−5; Supplementary Table 8), although this ΔAUC estimate may be inflated by estimation in the discovery cohort. To further assess the internal validity of the association at rs556621, the sample was randomly partitioned into training and test groups containing two-thirds and one-third of the LAA cases and controls, respectively. Association with LAA was evaluated in the training set, with genotyped SNPs reaching P < 1 × 10−4 (n = 44) then assessed in the test set (the remaining third of the sample). The index SNP at 6p21.1 (rs556621) reached P = 5.69 × 10−5 in the training set and was the only SNP associated with LAA in the independent test set after permutation-based adjustment for the testing of 44 non-independent SNPs (familywise adjusted P = 6.74 × 10−3; Supplementary Table 9).
External validity of the observed association of rs556621 with LAA risk was assessed in a replication study involving 10 independent population cohorts contributing 1,715 LAA cases (1,323 European and 392 US) and 52,695 controls (39,509 European and 13,186 US) of confirmed European ancestry. Details of the individual cohorts are provided in Supplementary Table 10 and the Supplementary Note. Association analyses for the index SNP at 6p21.1 (rs556621) were performed separately within each of the ten cohorts, with the results combined using fixed-effects, inverse variance–weighted meta-analysis. Because association evidence was assessed for a single SNP in the independent replication study, no multiple-testing adjustment was indicated, and the result was compared with a prespecified significance threshold of 0.05.
The replication study confirmed association of rs556621 with LAA (Ptrend = 3.9 × 10−4; OR (A allele) = 1.15, 95% CI = 1.06–1.24), with no evidence of between-study heterogeneity (P = 0.50, I2 = 0.0%) (Fig. 3, Table 2 and Supplementary Table 11). The estimated population-attributable risk for rs556621 in the replication study was ∼5%. When the discovery and replication cohorts were combined, meta-analyses yielded Ptrend = 4.7 × 10−8 for the association (OR = 1.21, 95% CI = 1.13–1.30). However, the heterogeneity statistic for the combined analysis was moderately significant (P = 0.02, I2 = 43.4%), indicating some inflation of the effect size in the discovery cohort (winner's curse). For this reason, the estimated effect in the independent replication study is likely a better estimate of the true population effect. Meta-analyses of rs556621 for overall ischemic stroke in the replication study showed no evidence for association, despite a greater than fivefold increase in case numbers (9,552 cases and 52,695 controls; Ptrend = 0.29; OR (A allele) = 1.02, 95% CI = 0.98–1.06; Supplementary Fig. 6). These results support the existence of a common 6p21.1 risk variant of modest but genuine effect specific to the LAA stroke subtype. Neither this SNP nor SNPs in high LD with rs556621 have previously been reported to be associated with coronary heart disease risk.
The 6p21.1 SNPs are located in an intergenic region of moderate LD (Supplementary Fig. 7), ∼200 kb upstream of the SUPT3H gene (forward strand) and ∼180 kb upstream of CDC5L (reverse strand). rs556621 and rs556512 both lie within a short stretch of genomic sequence that contains BCL3 and PBX3 transcription factor–binding motifs and is enriched for enhancer- and/or promoter-associated marks of histone protein modification. The associated SNPs or other correlated variants may thus function in regulating gene expression via altered responsiveness of key transcription factor–binding sites19. A number of predicted microRNAs (miRNAs) also lie in the vicinity of rs556621 (Supplementary Table 12), suggesting that variants in LD with rs556621 could also potentially regulate gene expression through alteration of regulatory miRNA sequences. Queries of four public expression quantitative trait locus (eQTL) databases did not identify rs556621 or proxy SNPs in high LD with rs556621 as cis eQTLs in the assayed tissue or cell types. Future targeted investigations in atherosclerotic neurovascular tissue may help to elucidate the mechanisms by which the associated SNPs influence LAA risk.
Suggestive association with both ischemic stroke and LAA was also detected for variants in a chromosome 22q12.3 region containing the APOL1-APOL4 gene cluster. These primate-specific genes are implicated in lipid metabolism and vascular biology20,21, where their expression is strongly induced by proinflammatory cytokines22,23,24. APOL2, APOL3 and APOL4 are thought to encode intracellular proteins; APOL2, across which association evidence was strongest, is almost exclusively expressed in the brain, with reduced expression in the heart23.
This is one of the first reported GWAS for large artery atherosclerosis, a major subtype of ischemic stroke. We report the identification of variants at 6p21.1 that associate with LAA risk in individuals of European ancestry. We also report a locus within the APOL1-APOL4 gene cluster that is suggestively associated with both LAA and broad ischemic stroke. The potential pathological function of these variants and their contributions to stroke risk in non-European populations remain to be determined.
MACH, http://www.sph.umich.edu/csg/yli/mach/index.html; Haploview, http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/haploview; UNPHASED, http://unphased.sourceforge.net/; LocusZoom, http://csg.sph.umich.edu/locuszoom/; METAL, http://www.sph.umich.edu/csg/abecasis/Metal/; SNAP, http://www.broadinstitute.org/mpg/snap/; SCAN–SNP and CNV Annotation Database, http://scan.bsd.uchicago.edu/newinterface/about.html; NCBI GTEx (Genotype-Tissue Expression) eQTL Browser, http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi; Pritchard laboratory UChicago eQTL browser, http://eqtl.uchicago.edu/cgi-bin/gbrowse/eqtl/; mRNA by SNP Browser, http://www.sph.umich.edu/csg/liang/asthma/.
Study participants: the ASGC discovery sample.
ASGC stroke cases comprised stroke patients of European ancestry who were admitted to four clinical centers across Australia (The Neurosciences Department at Gosford Hospital, Gosford; the Neurology Department at John Hunter Hospital, Newcastle; The Queen Elizabeth Hospital, Adelaide; and the Royal Perth Hospital, Perth) between 2003 and 2008. Stroke was defined by World Health Organization criteria as a sudden focal neurological deficit of vascular origin, lasting more than 24 h and confirmed by imaging, such as computerized tomography (CT) and/or magnetic resonance imaging (MRI) brain scan. Other investigative tests such as electrocardiogram, carotid Doppler and trans-esophageal echocardiogram were conducted to define ischemic stroke mechanism as clinically appropriate. Cases were excluded from participation if they were aged <18 years, were diagnosed with hemorrhagic stroke, had transient ischemic attack rather than ischemic stroke or were unable to undergo baseline brain imaging. On the basis of these criteria, a total of 1,230 ischemic stroke cases were included in the current study. Ischemic stroke subtypes were assigned using TOAST criteria on the basis of clinical, imaging and risk factor data16.
ASGC controls were participants in the Hunter Community Study (HCS), a population-based cohort of individuals aged 55–85 years, predominantly of European ancestry and residing in the Hunter Region in New South Wales, Australia. Detailed recruitment methods for the HCS have been previously described25. Briefly, participants were randomly selected from the New South Wales State electoral roll and were contacted by mail between 2004 and 2007. Consenting participants completed five detailed self-report questionnaires and attended the HCS data collection center, at which time a series of clinical measures were obtained. A total of 1,280 HCS participants were genotyped for the current study.
All study participants gave informed consent for participation in genetic studies. Approval for the individual studies was obtained from the relevant institutional ethics committees.
Study participants: replication cohorts.
Replication data were contributed by a total of 11 cohorts involved in the Metastroke and International Stroke Genetics Consortia (ISGC): the Atherosclerosis Risk in Communities Study (ARIC), the Bio-Repository of DNA in Stroke (BRAINS), deCODE Genetics, the Baltimore Genetics of Early Onset Stroke (GEOS) Study, the Heart and Vascular Health (HVH) Study, the Ischemic Stroke Genetics Study/Siblings With Ischemic Stroke Study (ISGS/SWISS), The Massachusetts General Hospital Genes Affecting Stroke Risk and Outcome Study (MGH-GASROS), the Milano stroke genetics study, the Rotterdam Study, the Wellcome Trust Case Control Consortium 2–Munich (WTCCC2-Munich) and the Wellcome Trust Case Control Consortium 2–UK (WTCCC2-UK). All replication cohorts defined ischemic stroke and the LAA, CE and SVO subtypes using clinical criteria consistent with those used for the ASGC discovery sample. Summary demographic data and clinical phenotyping details for these individual cohorts are provided in Supplementary Table 10 and the Supplementary Note.
Genome-wide genotyping and quality control: ASGC discovery sample.
ASGC cases and controls were genotyped using the Illumina HumanHap610-Quad array. Quality control excluded SNPs with genotype call rate of <0.95, deviation from Hardy-Weinberg equilibrium (P < 1 × 10−6) or minor allele frequency of <0.01. At the sample level, quality control excluded individuals with (i) genotype call rate of <95% (n = 4); (ii) genome-wide heterozygosity of <23.3% or >27.2% (n = 9); (iii) inadequate clinical data or inconsistent clinical and genotypic gender (n = 45); and (iv) an inferred first- or second-degree relative in the sample identified on the basis of pairwise allele sharing estimates (estimated genome proportion shared identical by descent (IBD) > 0.1875; n = 37). After these exclusions, Eigenstrat principal-components analysis (PCA) was performed, incorporating genotype data from Phase 3 HapMap populations (CEU, Han Chinese in Beijing, China (CHB), Japanese in Tokyo, Japan (JPT), Toscani in Italia (TSI) and Yoruba from Ibadan, Nigeria (YRI)). In eigenvector plots, the majority of ASGC samples clustered closely with European (CEU and TSI) reference populations. Eighteen samples (16 cases and 2 controls) showed prominent evidence of Asian ancestry and were removed. Principal-component and IBD analyses were performed using a pruned subset of quasi-independent SNPs (∼130,000 SNPs) to avoid confounding by LD. After quality control, 1,162 cases and 1,244 controls were available for association analyses at 551,514 SNPs.
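The three SNP-level filters described above (call rate, Hardy-Weinberg equilibrium and minor allele frequency) can be sketched from genotype counts as follows. This is an illustrative sketch only: the function names are hypothetical, and the Hardy-Weinberg test shown is a 1-degree-of-freedom chi-square goodness-of-fit test, which is an assumption because the text does not state which test was used.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value from a 1-d.f. chi-square goodness-of-fit test for HWE.

    Assumption: the study's actual HWE test is not specified in the text.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))

def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, hwe_p_min=1e-6, min_maf=0.01):
    """Apply the call-rate, MAF and HWE thresholds quoted in the text."""
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    if n_called == 0 or n_called / n_total < min_call_rate:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    if min(freq_a, 1 - freq_a) < min_maf:
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_p_min
```

For example, a SNP with genotype counts 250/500/250 and no missing calls passes all three filters, whereas the same SNP with 100 missing calls fails on call rate.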
Genotype imputation in the filtered sample was performed using MACH v1.0.16 on the basis of HapMap Phase 2 (release 24) phased haplotypes for samples of European ancestry (CEU). Subsequent quality control excluded imputed SNPs with minor allele frequency of <0.01 or ratio of observed dosage variance to expected binomial variance (r2) of <0.3.
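The imputation quality metric described above, the ratio of observed dosage variance to the variance expected under a binomial model, can be sketched as below. The function name is hypothetical, and the exact estimator implemented in MACH may differ in detail; this is a minimal illustration of the ratio as worded in the text.

```python
def imputation_r2(dosages):
    """Observed dosage variance divided by expected binomial variance 2p(1 - p).

    dosages: per-individual expected allele counts in [0, 2].
    """
    n = len(dosages)
    mean = sum(dosages) / n
    p = mean / 2                      # estimated allele frequency
    expected_var = 2 * p * (1 - p)
    if expected_var == 0:
        return 0.0                    # monomorphic: no information
    observed_var = sum((d - mean) ** 2 for d in dosages) / n
    return observed_var / expected_var

# Confidently imputed genotypes in Hardy-Weinberg proportions give a ratio of 1;
# dosages shrunk halfway toward the mean give 0.25 and would fail the 0.3 filter.
certain = [0.0] * 25 + [1.0] * 50 + [2.0] * 25
shrunk = [0.5] * 25 + [1.0] * 50 + [1.5] * 25
```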
Genotyping and quality control: replication cohorts.
Each replication cohort performed genome-wide genotyping, quality control and imputation as part of its own primary study. The particular arrays and quality control filters used by the individual cohorts are described in the Supplementary Note. Of the 11 cohorts, 6 directly genotyped rs556621, and 5 imputed allelic dosages for this SNP. To ensure the accuracy of results, imputed data were only included if the quality of imputation was high, defined as a ratio of observed to expected binomial dosage variance (r2) of >0.7. This resulted in the exclusion of one sample (HVH; r2 = 0.64). All other samples had r2 ≥0.95 for rs556621.
Estimating the proportion of phenotypic variation attributable to genotyped SNPs.
The proportion of case-control variation attributable to variation in genotyped SNPs was estimated in the discovery sample with GCTA software18,26, which uses genome-wide SNP data to estimate additive genetic relationships (correlations) between essentially unrelated individuals, using a linear mixed model (LMM) to estimate the contribution of genotyped SNPs (and causal variants in LD with genotyped SNPs) to observed variation in case-control status. Before analysis, additional quality control of genotype data was performed to reduce bias in variance estimates from the accrued effects of small genotyping errors27. We excluded SNPs with missingness of >0.1% or Hardy-Weinberg equilibrium P value of <1 × 10−4 and individuals with >0.1% missing genotype data or estimated relatedness of >0.05 (approximately closer than second cousins)27. After quality control, genotypes at 457,533 SNPs were available for estimating genetic effects for 1,079 ischemic stroke cases, 400 LAA cases, 288 SVO cases and 226 CE cases. Each case group was evaluated in a separate analysis using a common control sample of 1,172 individuals; all fitted LMMs were adjusted for age and sex. Heritability estimates shown in Table 1 relate to the observed (binary) risk scale and case-control proportions. We note that, although these estimates do not represent heritability in the conventional sense, the test statistics and their associated significance levels are invariant under adjustment for ascertainment bias or liability scale28.
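The relatedness filter described above relies on SNP-based estimates of pairwise genetic relationship. One commonly used estimator averages the product of standardized allele counts across SNPs; the sketch below implements that estimator for a single pair of individuals. The function names are hypothetical, and GCTA's own implementation (including its treatment of each individual's relationship with themselves) may differ in detail.

```python
def pairwise_relatedness(geno_j, geno_k, allele_freqs):
    """Average standardized genotype cross-product for one pair of individuals.

    geno_j, geno_k: allele counts (0, 1 or 2) at the same SNPs.
    allele_freqs:   frequency of the counted allele at each SNP.
    """
    total, n_used = 0.0, 0
    for x_j, x_k, p in zip(geno_j, geno_k, allele_freqs):
        var = 2 * p * (1 - p)
        if var == 0:                  # monomorphic SNPs carry no information
            continue
        total += (x_j - 2 * p) * (x_k - 2 * p) / var
        n_used += 1
    return total / n_used if n_used else 0.0

def too_closely_related(geno_j, geno_k, allele_freqs, cutoff=0.05):
    """Flag a pair exceeding the relatedness cutoff quoted in the text."""
    return pairwise_relatedness(geno_j, geno_k, allele_freqs) > cutoff
```

In practice the estimate is averaged over hundreds of thousands of SNPs, so values for unrelated pairs cluster tightly around zero.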
Genome-wide association analyses in the Australian discovery cohort.
Genome-wide association analyses were performed using 1-degree-of-freedom trend tests, assuming an additive effect of allele dosage. Parameters were estimated using logistic regression models adjusted for age and sex. Analyses were not adjusted for principal components of population ancestry, as observed genomic inflation factors in unadjusted models (λ = 1.031, λ1,000 = 1.026 for ischemic stroke; λ = 1.007, λ1,000 = 1.011 for LAA) indicated an absence of bias due to population stratification. Meta-analysis genomic control inflation factors (λ) were calculated as previously described, as were standardized values for a sample of 1,000 cases and 1,000 controls (λ1,000)29. Secondary analyses of peak regions were adjusted for ancestry principal components and clinical traits, including hypertension, hypercholesterolemia, diabetes mellitus, atrial fibrillation, myocardial infarction and smoking status, to investigate potential confounders of the observed genetic associations. Association tests were performed using maximum-likelihood estimated dosages for imputed SNPs and observed integer dosages for genotyped SNPs. Logistic models were fitted using mach2dat software, which calculates significance levels for estimated parameters using a likelihood-ratio test30,31. The two secondary logistic analyses conditioned on rs4479522 and rs556621 genotypes were adjusted for age, sex and integer-valued dosage of the test allele at conditioned SNPs.
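The genomic inflation factors quoted above can be illustrated with a short sketch. It assumes the widely used definitions: λ is the median observed 1-d.f. chi-square statistic divided by its expected median under the null (about 0.455), and λ1,000 rescales λ to a study of 1,000 cases and 1,000 controls. The function names are hypothetical, and the cited reference should be consulted for the exact procedure the authors followed.

```python
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.

def genomic_inflation(chi2_stats):
    """Lambda: median observed test statistic over its null expectation."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to an equivalent study of 1,000 cases and 1,000 controls."""
    scale = (1 / n_cases + 1 / n_controls) / (1 / 1000 + 1 / 1000)
    return 1 + (lam - 1) * scale

# Sample sizes and lambda values taken from the text.
is_l1000 = lambda_1000(1.031, n_cases=1162, n_controls=1244)
laa_l1000 = lambda_1000(1.007, n_cases=421, n_controls=1244)
```

With the reported sample sizes, this rescaling gives 1.026 for ischemic stroke and 1.011 for LAA, matching the λ1,000 values in the text.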
Pairwise LD between SNPs was assessed and visualized using Haploview software32 on the basis of European (CEU) HapMap Phase 2 data. Haplotype analyses of the 6p21.1 region used genotyped data and maximum-likelihood genotypes for SNPs imputed with high reliability (r2 > 0.7). Sliding window haplotypes incorporating from two to six adjacent SNPs were estimated and assessed for association with LAA case-control status using UNPHASED software33. Regional association plots were constructed using LocusZoom software34.
Meta-analysis of rs556621 in replication cohorts.
For rs556621, each replication sample performed logistic regression using a 1-degree-of-freedom trend test relating the presence of stroke (LAA or overall ischemic stroke) to allelic dosage, assuming an additive effect of the test allele. The test allele, estimated β coefficient, standard error and effective sample size were provided for the combined replication analysis. Fixed-effects, inverse variance–weighted meta-analysis of the ten replication cohorts providing high-quality data for rs556621 was performed using METAL software. Between-study heterogeneity was investigated using Cochran's Q statistic with its associated P value and the I2 metric, representing the percentage of between-study heterogeneity exceeding the value expected by chance. Population-attributable risk (PAR%) was estimated for rs556621 using the formula
$$\mathrm{PAR\%} = 100 \times \frac{p\,(\mathrm{OR} - 1)}{p\,(\mathrm{OR} - 1) + 1}$$
where OR is the odds ratio estimated using independent replication data and p is the prevalence of the risk allele in controls35.
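The meta-analysis, heterogeneity and attributable-risk calculations described in this section can be sketched as follows. This is not METAL itself: the function names are hypothetical, the PAR% calculation assumes the common formulation PAR% = 100 × p(OR − 1)/[p(OR − 1) + 1], and the allele frequency of 0.33 used in the example is the HapMap CEU minor allele frequency quoted in the text, taken here as a stand-in for the control frequency.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse variance-weighted fixed-effects pooling with Cochran's Q and I2."""
    weights = [1 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1 / w_sum)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal P value
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p_value, "q": q, "i2": i2}

def beta_se_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover log-OR and its standard error from an OR and 95% CI."""
    return math.log(odds_ratio), (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

def par_percent(odds_ratio, p):
    """Population-attributable risk (%) under the assumed formula."""
    x = p * (odds_ratio - 1)
    return 100 * x / (x + 1)

# Illustration using summary values reported in the text.
disc = beta_se_from_or_ci(1.62, 1.36, 1.93)   # discovery LAA result
repl = beta_se_from_or_ci(1.15, 1.06, 1.24)   # replication meta-analysis result
pooled = fixed_effects_meta([disc[0], repl[0]], [disc[1], repl[1]])
par = par_percent(1.15, 0.33)
```

Pooling only these two published summaries gives an odds ratio of roughly 1.22, close to the reported combined estimate of 1.21, and a PAR of just under 5%. Because the study pooled the individual cohorts rather than two summary estimates, the heterogeneity statistics from this two-input illustration are not comparable with those reported.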
Predictive modeling using ROC curves.
Predictive models incorporating clinical and genetic risk factors were evaluated for their ability to discriminate between case and control participants by calculating the area under the receiver operator characteristic (ROC) curve (AUC). ROC curves show the relationship between sensitivity (true positive rate) and 1 − specificity (false positive rate) for all possible cut-points of a diagnostic test. For specified covariates, the ROC curve was fitted and the AUC calculated using Stata software36, on the basis of parameter estimates from logistic regression models. Likelihood-ratio tests were used to assess the significance of changes in model fit.
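The AUC has a simple rank-based interpretation: it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes the AUC this way. It is an illustration of the quantity only, not the Stata routine used in the study, and the function name and example scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve as the fraction of concordant case-control pairs."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5     # ties contribute one half
    return concordant / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks from a logistic model.
example_auc = auc([0.8, 0.6], [0.6, 0.2])
```

A change in AUC between two nested models (for example, clinical covariates with and without a SNP genotype) is obtained by computing this quantity for each model's predicted risks and taking the difference.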
For the lead SNP at 6p21.1 (rs556621), proxy SNPs with r2 of >0.8 were identified from HapMap CEU Phases 1 and 2 (release 22) and 3 data (release 2) using SNAP (v2.2). Four publicly available eQTL databases were searched to determine whether genotypes of the lead or proxy SNPs have been previously associated with gene expression in cis in a range of tissue and cell types. We defined potential cis eQTLs as candidate SNPs associated with expression of a gene transcript mapping to a genomic region within 1 Mb37 at a nominal significance level of 1 × 10−3. The databases searched were (i) SCAN–SNP and CNV Annotation Database; (ii) the NCBI GTEx (Genotype-Tissue Expression) eQTL Browser; (iii) the Pritchard laboratory UChicago eQTL browser; and (iv) mRNA by SNP Browser v1.0.1. The tissue and cell types assessed in these databases include liver, brain, lymphoblastoid cell lines (LCLs), monocytes, fibroblasts and T cells.
A complete list of funding acknowledgments is included in the Supplementary Note. We are grateful to the participants with ischemic stroke and also to their families for participating in this study. Australian population control data were derived from the Hunter Community Study. We also thank the University of Newcastle for funding and the men and women of the Hunter region who participated in this study. This research was funded by grants from the Australian National Health and Medical Research Council (NHMRC; project grant 569257), the Australian National Heart Foundation (NHF; project grant G 04S 1623), the University of Newcastle, the Gladys M Brawn Fellowship scheme and the Vincent Fairfax Family Foundation in Australia. E.G.H. is supported by the Australian NHMRC Fellowship scheme. J.G. is supported by a Practitioner Fellowship from the NHMRC and a Senior Clinical Research Fellowship from the Australian Office of Health and Medical Research. The principal funding for the Wellcome Trust Case Control Consortium 2 (WTCCC2) ischemic stroke study was provided by the Wellcome Trust, as part of the WTCCC2 project (085475/B/08/Z, 085475/Z/08/Z and WT084724MA). This work was also supported by the European Community's Sixth Framework Programme (LSHM-CT-2007-037273), the Wellcome Trust core award (090532/Z/09/Z) and AstraZeneca. M. Farrall is a member of the Oxford British Heart Foundation (BHF) Centre of Research Excellence. The Siblings with Ischemic Stroke Study (SWISS) and the Ischemic Stroke Genetics Study (ISGS) were funded by grants from the US National Institute of Neurological Disorders and Stroke. Additional funding was provided by the US National Institute of Neurological Disorders and Stroke (U01NS069208). The Rotterdam Study received principal funding for this report from the Netherlands Heart Foundation (grant 2009B102).
Supplementary Tables 1–12, Supplementary Figures 1–7 and Supplementary Note