Introduction

Stroke is one of the leading causes of mortality, morbidity and disability worldwide.1 Thanks to genome-wide association studies (GWAS), the genetics of complex diseases has made dramatic steps forward in the last few years,2 but very few single-nucleotide polymorphisms (SNPs) have been consistently associated with ischemic stroke.3, 4, 5, 6, 7 Moreover, the relative risk conferred by individual genetic variants is usually low. Thus, in an attempt to estimate the aggregate effect of several gene variants, different genetic risk scores (GRS) have been constructed, including a few studies on stroke with inconsistent results.8, 9, 10, 11, 12, 13, 14, 15 Interestingly, the selection of the SNPs to be included in GRS calculation has been based on different criteria: SNPs detected by GWAS either associated or not associated with known cardiovascular risk factors, as well as SNPs in candidate genes or pathways.8, 9, 10, 11, 12, 13, 14, 15

High blood pressure (BP) is the major risk factor for stroke and among the most important ones for other cardiovascular events.16,17 Despite BP and hypertension being heritable traits,18 the search for genetic variants associated with these traits has been more challenging compared with other cardiovascular risk factors. Only in recent years have GWAS discovered several genetic variants that associate with BP-related traits.19, 20, 21 Indeed, a GRS based on 29 SNPs has been associated with the prevalence of hypertension and the incidence of coronary events and stroke in a meta-analysis,21 and more recently, also with the incidence of hypertension.22 The aim of the present study was to test the association of a GRS, consisting of the weighted allele sum of 29 SNPs previously associated with high BP, with ischemic stroke in a collaboration of three Swedish case–control studies, including more than 6000 individuals. Moreover, the possible improvement in the prediction of stroke over some ‘traditional risk factors’, such as hypertension, diabetes and smoking habit, was assessed using indices of calibration, discrimination and reclassification applicable to a case–control design.

Materials and methods

All study participants or, when applicable, their next-of-kin provided informed consent. The procedures were in accordance with the institutional guidelines. The Ethics Committee of the Medical Faculty of Lund University and the University of Gothenburg approved the study.

Subjects

Lund stroke register (LSR)

LSR is an ongoing prospective study, recruiting consecutive patients with first-ever stroke from 2001 and later from the primary uptake area of Skåne University Hospital, Lund. The patients were 18 years or older at stroke onset and all patients diagnosed as having ischemic stroke were examined with neuroimaging or autopsy of the brain. The large majority of these patients are treated at the stroke unit of Skåne University Hospital, Lund.23 Control subjects were individuals without stroke, randomly selected from the same geographical uptake area and age and gender matched to patients included during the first year (2001–2002) of the LSR project. All included subjects (or if they were unable to answer, their next-of-kin) provided informed consent to participate. In the current study, patients with ischemic stroke were included if they also had provided a blood sample for DNA analysis.

Malmö diet and cancer (MDC)

Eight hundred and seventy-three ischemic stroke cases and 867 age- and sex-matched controls were selected from subjects recruited and followed over time in the MDC, an ongoing longitudinal study in the urban area of Malmö, Sweden. Between 1991 and 1996, women aged 45–73 years and men aged 46–73 years, with residency in Malmö (~250 000 inhabitants), were invited by mail and by newspaper advertisement to participate in the MDC.24 In all, 28 449 participated out of an eligible population of 74 000. The participants were asked to complete a self-administered questionnaire at home, which included items on lifestyle factors, medication, and previous and current diseases.

Sahlgrenska Academy Study on Ischemic Stroke (SAHLSIS)

The SAHLSIS is a case–control study on ischemic stroke before 70 years of age, the design of which has been described in detail before.25 Briefly, white patients who presented with first-ever or recurrent acute ischemic stroke before reaching the age of 70 years (n=844) were consecutively recruited between 1998 and 2008 at four stroke units in Western Sweden. White community-control subjects (n=668) from the same geographic area as the cases were randomly selected to match cases for age and sex.

Stroke definition

In all three samples, ischemic stroke was defined, following the World Health Organization’s definition, as rapidly developed clinical signs of local or global loss of cerebral function that lasted for at least 24 h or led to death within 24 h. By definition, patients with transient ischemic attacks are excluded. CT, MRI or autopsy verified the infarction or excluded hemorrhage and nonvascular disease.23,25,26

Data about etiologic subtypes of ischemic stroke according to the modified ‘Trial of Org 10172 in Acute Stroke Treatment’ (TOAST) criteria were available only for the LSR and SAHLSIS samples.27

Hypertension definition in different studies

In the three samples, hypertension was defined as being on antihypertensive treatment or having a systolic BP/diastolic BP ≥160/90 mm Hg.

Diabetes mellitus definition in different studies

In the LSR study, the diagnosis of diabetes mellitus was based on fasting capillary or whole-blood glucose ≥6.1 mmol/l or plasma glucose ≥7.0 mmol/l at two time points, usually within a 0–2 day interval during hospitalization in the acute phase; or non-fasting glucose above 11 mmol/l together with clinical symptoms; or previous treatment for diabetes mellitus (diet, oral medication or insulin). In the MDC study, diabetes mellitus was defined as use of antidiabetic medication or self-reported history of a physician's diagnosis of diabetes. In SAHLSIS, diabetes mellitus was defined by diet or pharmacological treatment, fasting plasma glucose ≥7.0 mmol/l, or fasting blood glucose ≥6.1 mmol/l.

Anamnestic data in different studies

In the LSR study, information about medical history was obtained from subjects (or when applicable from next-of-kin or previous medical records) at baseline investigation (for cases – in connection to stroke onset; for control subjects – in connection to clinical examination for inclusion in the LSR study). In the MDC and SAHLSIS studies, information about medical history and smoking habits was derived from a structured self-administered questionnaire. Smoking habits were coded as ‘current’ vs ‘never’ or ‘former’.

Genotype analysis

Information about the different SNPs included in the GRS is reported in the Online Supplementary Methods. The SNPs were genotyped using IPLEX on a MassARRAY platform (Sequenom, San Diego, CA, USA) and at KBioscience in the United Kingdom according to the manufacturer’s standard protocols. Nearly 25% of the samples were run in duplicate. All genotypes were called by two different investigators. We pre-specified a threshold call rate of 90% per individual SNP and a threshold of P<10−3 for excluding SNPs according to Hardy–Weinberg equilibrium calculation in controls. Genetic data have been deposited at the European Genome-phenome Archive (http://www.ebi.ac.uk/ega/), which is hosted by the EBI, under accession number EGAS00001000936.
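The two quality-control thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names and the genotype-count interface are hypothetical, and the Hardy–Weinberg test is assumed to be the Pearson χ2-test with one degree of freedom applied to genotype counts in controls.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are the observed genotype counts (AA, AB, BB).
    Returns the chi-square statistic and its P-value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def passes_qc(call_rate, control_counts, min_call_rate=0.90, hwe_p_min=1e-3):
    """Keep a SNP if its call rate is at least 90% and its HWE P-value
    in controls is not below 10^-3 (thresholds taken from the text)."""
    _, p_value = hwe_chi2_p(*control_counts)
    return call_rate >= min_call_rate and p_value >= hwe_p_min
```

For example, passes_qc(0.97, (100, 200, 100)) keeps the SNP, because these counts match Hardy–Weinberg proportions exactly, whereas a SNP with no heterozygotes at the same allele frequency would be excluded.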

Genetic risk score

To create the multivariable GRS for each study participant we used the weighted method (weighted GRS) according to the beta value attributed to the 29 tested SNPs in previous association studies,19, 20, 21, 22 assuming each SNP to be independently associated with risk28 according to an additive genetic model. For each SNP included in the GRS calculation, values of 0, 1 and 2 were attributed according to the number of risk alleles (defined as coded) present. Subsequently, the number of corresponding coded alleles (0, 1 or 2) was multiplied by the absolute value of the beta-coefficient for hypertension, as detected in previous studies,19, 20, 21 and these products were then summed. Thus, in our approach, different SNPs contribute with different weights to the GRS value, as opposed to an alternative approach in which no weighting of effects is used and each SNP allele counts equally in the score. The risk score was then divided by the number of effectively genotyped SNPs to produce an average measure that takes into account the number of effectively genotyped SNPs, and the ratio was standardized. The GRS was modeled as a continuous variable (an increase of one unit means an increase of one SD of the GRS) and as tertiles (see the Online Supplementary Methods for the specific thresholds used to define the tertiles of GRS and for more details about the equation used to calculate the GRS). See also Supplementary Figure S1 for the distribution of GRS in the population. Only subjects with at least 27 valid genotypes were included in the final analysis (see also Supplementary Table S3 for the description of included/excluded subjects). This threshold was arbitrarily chosen so that not too many participants were excluded and, at the same time, not too much noise was introduced into the GRS by basing it on fewer than all the genotypes.
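The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the exact equation of the Online Supplementary Methods: the function names are hypothetical, missing genotypes are assumed to be encoded as None, and the standardization is assumed to use the sample mean and sample SD of the included subjects.

```python
from statistics import mean, stdev


def raw_grs(genotypes, betas, min_valid=27):
    """Average weighted allele score for one subject.

    genotypes: number of coded alleles per SNP (0, 1 or 2), None if missing.
    betas: per-SNP beta-coefficients from previous studies (same order).
    Returns None when fewer than min_valid genotypes are available.
    """
    valid = [(g, b) for g, b in zip(genotypes, betas) if g is not None]
    if len(valid) < min_valid:
        return None
    # Coded-allele count times |beta|, summed, then divided by the
    # number of effectively genotyped SNPs.
    return sum(g * abs(b) for g, b in valid) / len(valid)


def standardize(scores):
    """Z-score the raw scores so that one unit equals one SD of the GRS.

    Assumption: sample SD (n - 1 denominator); the text does not say which.
    """
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]
```

Subjects for whom raw_grs returns None would be dropped before standardization; tertile cut-offs would then be applied to the standardized values.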

Statistical analysis

Throughout the manuscript, continuous variables are reported as the mean±SD. The χ2-test (Pearson) was used to compare group frequencies and to test for deviations from Hardy–Weinberg equilibrium. Logistic regression analysis (stepwise method: backward Wald) was used in the multivariate models, with ischemic stroke as the dependent variable and either age and sex (model A), or age, sex, diabetes mellitus and smoking habits (model B), or age, sex, diabetes mellitus, smoking habits and hypertension (model C) as independent variables. The analyses were performed using the GRS modeled as a continuous variable and as tertiles (see above for more details).

An unbiased estimate of the variance explained by the GRS was obtained by evaluating the increase in explained variance of the trait when adding the GRS to all the covariates included in model C, when tested in logistic regression (Nagelkerke r2). Model calibration was assessed with the Hosmer–Lemeshow goodness-of-fit test.29 All these analyses were performed using SPSS statistical software (version 20.0; SPSS Inc. Chicago, IL, USA).
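As an illustration of the explained-variance measure, the following Python sketch computes Nagelkerke's r2 from the observed outcomes and the probabilities predicted by a fitted logistic model. It is not the SPSS implementation used in the study; the function names are hypothetical, and the null model is assumed to be intercept-only (every subject is assigned the sample proportion of cases).

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y (0/1) given probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi)
               for yi, pi in zip(y, p))


def nagelkerke_r2(y, p_model):
    """Nagelkerke r2: Cox-Snell r2 rescaled by its maximum attainable value."""
    n = len(y)
    case_fraction = sum(y) / n
    ll_null = log_likelihood(y, [case_fraction] * n)  # intercept-only model
    ll_model = log_likelihood(y, p_model)
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell / max_cox_snell
```

The increase in explained variance attributable to the GRS is then the difference between nagelkerke_r2 for model C plus the GRS and nagelkerke_r2 for model C alone.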

We assessed the improvement in risk discrimination by comparing the area under the receiver operator characteristic (ROC) curves (AUC) in models with all the nongenetic covariates significantly associated with stroke and in the same model plus the GRS. ROC curves were developed using a probability-weighted Cox model. The category-less net reclassification improvement (NRI) index for case–control studies and the integrated discrimination improvement (IDI) index were estimated according to Pencina et al.30,31 using the Hmisc library by Frank E Harrell Jr. implemented in the R statistical software (version 2.15.2).32
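To make the three indices concrete, the sketch below computes their point estimates from the predicted probabilities of a model without the GRS (p_old) and with the GRS (p_new). It is a plain Python illustration of the standard definitions, not the Hmisc code used in the study; the function names are hypothetical, and it provides neither standard errors nor any case–control weighting.

```python
def auc(y, p):
    """AUC as the probability that a random case has a higher predicted
    risk than a random control (ties count one half)."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def category_free_nri(y, p_old, p_new):
    """Category-free NRI(>0): net proportion of cases whose predicted risk
    goes up plus net proportion of controls whose predicted risk goes down.
    Returns (NRI in cases, NRI in controls, total NRI)."""
    def net_up(group):
        deltas = [n - o for yi, o, n in zip(y, p_old, p_new) if yi == group]
        up = sum(d > 0 for d in deltas)
        down = sum(d < 0 for d in deltas)
        return (up - down) / len(deltas)

    nri_cases = net_up(1)
    nri_controls = -net_up(0)
    return nri_cases, nri_controls, nri_cases + nri_controls


def idi(y, p_old, p_new):
    """IDI: change in the difference between the mean predicted risk of
    cases and the mean predicted risk of controls."""
    def discrimination_slope(p):
        cases = [pi for yi, pi in zip(y, p) if yi == 1]
        controls = [pi for yi, pi in zip(y, p) if yi == 0]
        return sum(cases) / len(cases) - sum(controls) / len(controls)

    return discrimination_slope(p_new) - discrimination_slope(p_old)
```

Under these definitions the total NRI is the sum of its case and control components, so each component can be inspected separately.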

All tests were two-sided and P-values <0.05 were considered statistically significant. Bonferroni adjustments were performed when appropriate.

Results

The clinical characteristics of the participants in the three case–control studies and in the combined sample are summarized in Table 1. Results concerning Hardy–Weinberg equilibrium and details about individual markers are presented in Supplementary Table S1.

Table 1 Baseline characteristics for the three case–control samples (subjects with at least 27 valid genotypes)

As expected, the GRS was significantly associated with hypertension, after adjustment for age and sex, in the combined sample (OR (95% CI): 1.175 (1.115–1.238) P=2.0E−09) and in the three samples separately (LSR: 1.194 (1.107–1.288) P=4.0E−06; MDC: 1.160 (1.050–1.282) P=0.004; SAHLSIS: 1.129 (1.005–1.268) P=0.041), whereas no association was evident between the GRS and either diabetes mellitus or smoking habits.

Association analysis between the GRS and ischemic stroke

Figure 1 shows the association between the GRS in tertiles and ischemic stroke in the combined sample and in the three studies separately. In the combined sample, using regression model A, adjusting for age and sex, the GRS was associated with ischemic stroke (Table 2). When diabetes mellitus, smoking habits and hypertension were also included in the model, the association was somewhat attenuated but remained significant in the combined sample (models B and C). A similar trend was evident in the MDC, SAHLSIS and LSR, although this association did not reach statistical significance in the latter sample. The OR for ischemic stroke was nearly 25% higher in individuals classified in the 3rd tertile according to the GRS compared with those classified in the 1st tertile (Table 2). However, the magnitude of this OR was lower than that of the traditional risk factors, including hypertension (Table 3). The proportion of variance explained by the fully adjusted logistic regression model was 0.170 with the GRS, as compared with 0.167 without the GRS (Nagelkerke r2), and model calibration was good for both models (P=0.671 for the model without the GRS and 0.544 for the model with the GRS). Only subjects with at least 27 valid genotypes were included in the analyses. However, the inclusion of subjects with 26 or 25 valid SNPs did not alter the results substantially (data not shown).

Figure 1 Association between the weighted GRS (in tertiles) and ischemic stroke in the combined sample (a) and in the three studies separately (b=LSR, c=MDC, d=SAHLSIS).

Table 2 Association of the GRS with ischemic stroke in MDC, LSR, SAHLSIS and in the combined sample
Table 3 Odds ratios (95% confidence intervals) as found in logistic regression (multivariate model C) for stroke incidence in the combined sample

In an exploratory association analysis of stroke subtypes, including only the SAHLSIS and LSR samples, the stroke subtypes that were associated with the GRS in multivariate analysis were cryptogenic stroke and the combined group ‘other or undetermined causes’ (see Supplementary Table S5) in the combined sample and in SAHLSIS. In addition, small vessel disease was associated with the GRS in SAHLSIS.

The AUC using the model with the GRS was not significantly improved as compared with the model without the GRS (0.672±0.007 vs 0.669±0.007; P>0.05, see Figure 2). The category-free NRI (>0), which is applicable to case–control studies, was statistically significant (0.0659±0.0265; P=0.013; see also Supplementary Table S4), but not when evaluated in cases (0.0328±0.0169) separately from controls (0.0331±0.0205; P=0.106 and ). The IDI was significant (0.001452±0.000498; P=0.003).

Figure 2 Receiver operator characteristics (ROC) curves for stroke discrimination using nongenetic risk factors (age, sex, hypertension, diabetes, smoking) as compared with nongenetic risk factors plus the GRS.

Association analysis between individual SNPs and ischemic stroke

Results from the association analysis between individual SNPs and ischemic stroke, after full adjustment (model C), are presented in Supplementary Table S2. A few SNPs showed a marginally significant P-value either in the combined sample or in the different studies separately. However, their significance did not withstand adjustment for multiple testing using the Bonferroni correction.
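As a brief illustration of the Bonferroni correction, the sketch below compares each P-value with the nominal threshold divided by the number of tests. The function name is hypothetical, and treating the 29 SNPs as 29 tests is an assumption made for this example only; the number of tests actually corrected for is not stated here.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag P-values that remain significant after Bonferroni correction,
    i.e. those below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

With 29 tests, the corrected threshold is 0.05/29, or about 0.0017, so a nominally significant P-value of 0.01 would not withstand the correction.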

Discussion

The main finding of the present study is that a GRS, previously associated with hypertension incidence and prevalence, is associated with ischemic stroke in a large collaborative study pooling together three Swedish case–control studies. This result suggests that life-long exposure to small genetically mediated differences in BP may be causally related to ischemic stroke. However, the effect size of the tested GRS is relatively low and the improvement in the AUC, NRI and IDI indexes is too modest to add clinically meaningful predictive value when it is added on top of known risk factors. Our results are in line with recent studies using similar GRS,21,33 which showed an overall modest improvement in relative risk estimates (either odds ratio or hazard ratio): in the International Consortium for Blood Pressure GWAS, the increase in the risk of stroke passing from the 1st to the 5th quintile of the GRS spanned between 1.23 and 1.34, and in a study involving several Finnish cohorts between 1.23 and 1.35.21,33

There are a number of possible reasons why the effect size of the GRS was low in our study and not even significant in all studies separately. First, none of the individual SNPs included in the GRS, when analyzed separately, had a nominal association with ischemic stroke and some had a paradoxical inverse effect (that is, the ‘risk allele’ confers protection in one or more subsamples of the present study). Second, it is possible that a different selection of beta-coefficients based on larger GWAS or on GWAS that analyze more appropriate population cohorts, or even beta-coefficients that take into account interaction between genes, could result in an improved GRS compared with the one that we used in this study. Furthermore, this is a GRS of hypertension and in the present samples the effect (OR) of hypertension on stroke is unexpectedly modest compared with other risk factors such as smoking and diabetes mellitus. Other reasons could relate to ischemic stroke per se. Ischemic stroke is a very heterogeneous disease with different etiological subtypes, whose causes, including genetic determinants, could be very different. In our exploratory analysis on stroke subtypes, the GRS showed association with cryptogenic stroke and with small vessel occlusion. However, the sample size for subtypes is small compared with genetic study standards, and thus further investigations of this GRS based on larger, well-characterized samples of patients with ischemic stroke are clearly warranted to see if there are subtype-specific associations.

In addition, the heritability of ischemic stroke, as estimated in previous studies, is quite low, making the discovery of SNPs consistently associated with it quite challenging.34, 35, 36, 37, 38 In contrast, the heritability of hypertension and BP-related traits seems higher18 and GWAS have reported different SNPs highly significantly associated with BP, even if with quite low effect size in terms of mmHg.19, 20, 21,39 This evidence could support the hypothesis that the genetic component of stroke susceptibility that is mediated by hypertension is quite diluted. Another hypothesis is that rarer variants in genes implicated in Mendelian forms of stroke may have larger effects. It has already been shown that genetic variants in genes implicated in Mendelian forms of hyper- or hypotension are associated with large variation in BP40,41 as compared with common SNPs.21 In the same manner, rare mutations implicated in Mendelian forms of stroke, as well as maternally inherited mitochondrial mutations, could contribute more than common SNPs to the occurrence of cerebrovascular events.42,43 Thus, we speculate that incorporating these rarer variants in GRS, as well as other SNPs directly related to stroke or stroke subtypes, could substantially augment the predictive ability of a genetic score.

Strengths of our study include the large sample size, with a number of stroke cases similar to that of studies on candidate genes in a recent meta-analysis,44 the genetic homogeneity of the populations, and the rigorous classification of stroke cases and controls. Although longitudinal studies are better suited to assess predictive ability than case–control studies with a cross-sectional design, we tried to mitigate this limitation by applying indices that have been adapted to this type of study. Another major limitation of the GRS is that no interactions between single SNPs and other genetic variants and/or other demographic or environmental factors were taken into account. The analysis of stroke subtypes is underpowered and could only be performed on two samples. Furthermore, nearly 300 stroke cases and controls in MDC (11% of the total sample) overlap between the present study and the first study of the GRS,21 potentially inflating the results. Finally, the effect of the GRS seems nonlinear, suggesting that the GRS we used, and especially the beta-coefficients derived from previous studies, could be non-optimal for a finer discrimination of stroke risk, at least in our samples.

Perspectives

In conclusion, we confirmed an independent association of a GRS with ischemic stroke in a case–control study. However, the low magnitude of the effect and the modest improvement in discrimination and reclassification indicate that the GRS, in its present form, is insufficient to add valuable information in the clinical setting. Larger genetic studies focusing on ischemic stroke subtypes could help clarify the pathophysiology of this disease, which carries a worldwide burden. Further studies are needed to develop newer GRS, including well-validated SNPs along with rarer genetic variants, associated with stroke.