Introduction

Subarachnoid hemorrhages (SAH), more than 90% of which are due to the rupture of intracranial aneurysms (IAs), cause considerable mortality and morbidity (Inagawa 2001; Longstreth et al. 1993) and are responsible for 2% of all deaths in Japan annually (Yamada et al. 2003a). Despite advances in diagnostic and treatment techniques, the prognosis of SAH remains unimproved. Prophylactic treatment by surgical clipping or endovascular coiling of an aneurysm prior to its rupture can reduce mortality and morbidity (Henkes et al. 2004; International Study of Unruptured Intracranial Aneurysms Investigators 1998; Raaymakers et al. 1998). However, it is costly and time-consuming to detect unruptured aneurysms by performing periodic screening examinations by magnetic resonance angiography (MRA). In addition, the indication of prophylactic treatment is controversial due to the low risk of rupture of unruptured aneurysms (International Study of Unruptured Intracranial Aneurysms Investigators 1998). If we can predict susceptibility to IA formation or vulnerability to IA rupturing, we can follow a high-risk strategy instead of a population strategy.

Recent evidence suggests that genetic factors, as well as environmental risk factors including smoking and hypertension, are associated with SAH (Ronkainen et al. 1998; Schievink et al. 1995), indicating that the genotypes of certain susceptibility genes can predict IA formation or rupturing. Thus, the identification of susceptibility genes for IA is one way to realize effective screening and treatment processes. At the same time, the identification of these genes will help us to understand the pathogenesis of IA, and will enable us to develop medical treatments to prevent IA growth and rupture as a new therapeutic option.

To clarify the genetic background of IA, several linkage studies have been conducted to date, and a total of nine possible loci have been proposed: 1p34.3–p36.13 (Nahed et al. 2005; Ruigrok 2006), 5p15.2–14.3 (Verlaan et al. 2006), 5q22–q31 (Onda et al. 2001), 7q11.2–q22.1 (Farnham et al. 2004; Onda et al. 2001), 11q24–q25 (Ozturk et al. 2006), 14q23–q31 (Onda et al. 2001; Ozturk et al. 2006), 17cen (Yamada et al. 2003b), 19q12–13 (Mineharu et al. 2007; van der Voet et al. 2004; Yamada et al. 2003b), and Xp22 (Olson et al. 2002; Yamada et al. 2003b). Linkage to 2p13 (Roos et al. 2004) was shown in a large consanguineous family, but this finding has been retracted because newly diagnosed affected siblings did not show linkage to this locus (Ruigrok 2006). Among these nine linkage regions, the perlecan gene (HSPG2) at 1p36.1–36.4 (Ruigrok et al. 2006a), the versican gene (CSPG2) at 5q14.3 (Ruigrok et al. 2006b), elastin (ELN) and LIM domain kinase 1 (LIMK1) genes at 7q11 (Onda et al. 2001; Akagawa et al. 2006), collagen alpha 2(I) (COL1A2) at 7q22 (Yoneyama et al. 2004), tumor necrosis factor receptor, superfamily member 13B (TNFRSF13B) at 17cen (Inoue et al. 2006), and kallikrein at 19q13 (Weinsheimer et al. 2007) have been proposed as susceptibility genes for IA. However, the linkage to 7q11 and the association of ELN polymorphisms with IA could not be replicated in several studies (Berthelemy-Okazaki et al. 2005; Hofer et al. 2003; Krex et al. 2004; Mineharu et al. 2006; Yamada et al. 2003b), and linkage to 17cen also could not be replicated in a study of the same ethnic group (Krischek et al. 2006). Further studies are therefore warranted to replicate the previously reported IA loci. The primary objective of this study was to replicate the previously suggested IA loci by association analyses using high-density single nucleotide polymorphism (SNP) markers. We further aimed to narrow down the confidence interval of the loci confirmed in this study using linkage disequilibrium (LD) mapping to identify a novel susceptibility gene for IA.

Materials and methods

Study population

We recruited 29 cases and 35 controls as a discovery cohort from a local community consisting of two adjacent cities, Daisen and Yuri-Honjo, in the Akita Prefecture of northeast Japan. All subjects were confirmed to have lived in these communities for more than three generations. Only male subjects were enrolled for an accurate construction of haplotype on the X chromosome. We recruited 237 cases and 253 controls as a replication cohort from several collaborating hospitals all over Japan. All of the case subjects were diagnosed by digital subtraction angiography (DSA) or confirmed to have IAs at surgery in the collaborating hospitals. We excluded case subjects who were affected with known heritable diseases, such as Ehlers–Danlos syndrome type IV, Marfan syndrome, neurofibromatosis type 1 or autosomal dominant polycystic kidney disease, or autoimmune diseases. Control subjects met the following criteria: (a) confirmation of the absence of IA by DSA, MRA or three-dimensional computerized tomography; (b) an age at diagnosis of ≥40 years; (c) no medical history of any stroke including IA or SAH; and (d) no family history of IA or SAH in first-degree relatives.

Individual and family history, ancestral information and lifestyle information were obtained by interviews. Past history and co-morbidity were also examined from the clinical charts of individual participants. Individuals with hypertension were defined as those who were diagnosed as having a systolic blood pressure of ≥140 mmHg or a diastolic blood pressure of ≥90 mmHg or those who were taking medicine after they had been found to have high blood pressure, which was confirmed in their medical records. Smokers were defined as current smokers. This study was approved by the Ethics Committee of Kyoto University Institutional Review Board and appropriate informed consent was obtained from all subjects.

Population stratification

Population-based association analysis is often confounded by population stratification, which causes unlinked markers to show association with a phenotype. Thus, we evaluated the degree of population stratification using the Fst statistic, calculated by a genomic control approach (Hao et al. 2004) using the STRUCTURE software (http://pritch.bsd.uchicago.edu/structure.html) (Pritchard and Rosenberg 1999). Fifty randomly selected, unlinked markers in the GeneChip 10 K mapping array, with a heterozygosity of ≥40%, were used as genomic controls, and Fst values were calculated using a mixture of 64 case and control subjects. An Fst = 1.0 suggests that the population consists of genetically distinct subgroups, and an Fst = 0 suggests the absence of genetically distinct subgroups, with all of the observed genomic variability occurring within populations. Fst values of up to 0.05 suggest negligible population stratification (Adeyemo et al. 2005).
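As a rough illustration of what the Fst statistic measures, the following sketch computes Wright's Fst for a single biallelic marker as (HT − HS)/HT, where HT is the expected heterozygosity in the pooled sample and HS is the mean expected heterozygosity within subgroups. This is not the STRUCTURE procedure used in the study; the function name, the weighting by subgroup size and the example frequencies are assumptions made only for illustration.

```python
def fst_biallelic(freqs, sizes):
    """Wright's Fst for one biallelic marker (illustrative sketch).

    freqs: frequency of one allele in each subgroup
    sizes: number of individuals in each subgroup (used as weights)
    """
    total = sum(sizes)
    weights = [n / total for n in sizes]
    # Allele frequency in the pooled population
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Expected heterozygosity of the pooled population (HT)
    h_total = 2 * p_bar * (1 - p_bar)
    # Weighted mean expected heterozygosity within subgroups (HS)
    h_within = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))
    if h_total == 0:
        return 0.0  # monomorphic marker: no variation to partition
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two subgroups of 29 and 35 subjects
print(fst_biallelic([0.30, 0.32], [29, 35]))
```

Identical frequencies in every subgroup give Fst = 0, and subgroups fixed for different alleles give Fst = 1, matching the interpretation given above.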

Genotyping

Single nucleotide polymorphism (SNP) markers were set at seven previously reported IA loci with a maximum LOD score of ≥3.0 (1p34.3–p36.13, 5q22–q31, 7q11.2–q22.1, 11q24–q25, 14q23–q31, 17cen and 19q12–13) using GeneChip 10 K Human Mapping arrays (Affymetrix, Santa Clara, CA, USA). To avoid a false negative result, we incorporated a safety margin and defined the candidate interval of each linkage region reported by previous works as a region with a LOD score of >0. DNA samples from the discovery cohort, including two control samples from Affymetrix, were assayed according to the protocol. The procedure was similar to one described previously (Hu et al. 2005). The LD coverage of the 10 K mapping array has not yet been determined, but it is estimated to be <30% with reference to the information in the HapMap set under pairwise SNP LD coefficients r² > 0.80, because the coverages of the 100 and 500 K arrays are 31 and 66%, respectively (Hao et al. 2007). The average heterozygosity for markers in the 10 K array is 0.32 and the average minor allele frequency is 0.24 in Asian populations. Variants with a call rate of ≤90% were excluded from the analysis. Additional genotyping was carried out by the PCR-Invader assay using TaqMan probes (Applied Biosystems TaqMan SNP Genotyping Assays; Foster City, CA, USA) or by the PCR-restriction fragment length polymorphisms (RFLP) method. Eight SNPs (rs413015, rs2273623, rs2224410, rs7152548, rs3742636, rs2025967, rs767757, rs2301113) around rs767603 at 14q23 were selected within the haplotype block of rs767603 estimated by GeneSpring GT2 software in the discovery cohort. These SNPs were commercially available as TaqMan probes and their minor allele frequencies were ≥20%.

Statistical analysis

Association analysis was conducted using GeneSpring GT2 software (Agilent Technology, Palo Alto, CA, USA). Haplotype frequency was estimated based on the expectation-maximization algorithm, and the differences in both SNP frequency and haplotype frequency were compared by chi-square statistics. Odds ratios (ORs) and 95% confidence intervals (95%CI) with adjustment for smoking and hypertension were obtained by logistic regression models using SAS software (Version 8.2; SAS Institute Inc., Cary, NC, USA). Thesias software (http://genecanvas.ecgene.net/) was used in haplotype analysis to adjust for covariates including smoking and hypertension (Tregouet et al. 2004). We applied the Bonferroni correction to account for multiple testing, which was corrected for the number of SNP markers or haplotypes at each tested locus. Permutation tests were performed with 10,000 iterations using the SumStat program (http://linkage.rockefeller.edu/register/). Although we tested seven linkage loci, we did not correct for this multiple testing, because to do so would have been too conservative. Instead, we confirmed the results of our initial association analysis in an independent cohort.
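The single-SNP part of this procedure can be sketched in a few lines: an allelic chi-square test on a 2 × 2 table of allele counts, a Bonferroni correction, and an unadjusted odds ratio with a Woolf-type 95% CI. This is only a minimal sketch under stated assumptions. It does not reproduce GeneSpring GT2, SAS or Thesias, the odds ratio is not adjusted for smoking or hypertension, and the function names and allele counts are hypothetical.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Assumes every row and column total is non-zero.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    num = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    den = ((case_a + case_b) * (ctrl_a + ctrl_b)
           * (case_a + ctrl_a) * (case_b + ctrl_b))
    chi2 = num / den
    # Upper-tail probability of a chi-square variable with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def bonferroni(p, n_tests):
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p * n_tests)


def odds_ratio_ci(case_a, case_b, ctrl_a, ctrl_b, z=1.96):
    """Unadjusted allelic odds ratio with a Woolf (log-scale) 95% CI."""
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    se = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


# Hypothetical allele counts (minor, major) in cases and controls
chi2, p = allelic_chi_square(12, 46, 30, 40)
print(chi2, p, bonferroni(p, 118))
print(odds_ratio_ci(12, 46, 30, 40))
```

The correction multiplies each p value by the number of markers or haplotypes tested at the locus, so the threshold becomes stricter as marker density increases.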

Linkage disequilibrium analysis

The pairwise SNP LD coefficients D′ and r² were calculated using Haploview software (http://www.broad.harvard.edu/mpg/haploview/download.php). LD patterns were compared with those in the JPT (east Japan-based population in Tokyo) HapMap data set.
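For reference, both coefficients follow from the haplotype frequency and the two allele frequencies. The sketch below applies the standard definitions to a pair of biallelic SNPs, assuming the haplotype frequency is already known. Haploview estimates it from unphased genotypes, a step not shown here, and the function name and example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """D' and r^2 for two biallelic SNPs (illustrative sketch).

    p_ab: frequency of the haplotype carrying allele A (SNP 1) and B (SNP 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    # Raw disequilibrium: observed minus expected haplotype frequency
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Allele A (frequency 0.2) always occurs with allele B (frequency 0.5)
print(pairwise_ld(0.2, 0.2, 0.5))
```

In this example D′ is 1 while r² is only 0.25, which shows why a block defined by D′ > 0.8 can still contain markers that predict one another poorly when their allele frequencies differ.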

Sequencing of candidate genes and genotyping of SNPs

Seven candidate genes, protein phosphatase, magnesium dependent, 1A, alpha isoform (PPM1A), sine oculis homeobox homolog 6 (SIX6), SIX1, SIX4, menage a trois 1 (MNAT1), protein kinase C, eta (PRKCH) and hypoxia-inducible factor 1, alpha subunit (HIF1A), were directly sequenced, including coding exons and intron–exon boundaries, in 16 randomly selected cases and 16 controls from the discovery cohort. SNPs identified by direct sequencing were genotyped in the replication cohort when a minor allele frequency was >0.05.

Results

The demographic characteristics of the study population are shown in Table 1. No significant difference was observed between case and control groups in terms of either sex or hypertension. Age and the proportion of smokers were higher among control subjects than among case subjects, increasing the likelihood of a genetic contribution and reducing the effect of environmental risk factors in case subjects. In the discovery cohort, most cases (93.1%) had a ruptured aneurysm, which would be incidental. According to STRUCTURE analysis, discovery samples were likely to consist of a genetically homogenous cohort without obvious population stratification, with an Fst of ≤0.003 (data not shown).

Table 1 Characteristics of the discovery and replication cohorts

The results of association analysis in the discovery cohort are shown in Table 2. The average call rate was 95.3%, and all SNPs followed Hardy–Weinberg equilibrium separately among both case and control subjects. The average D′ of SNP markers in the tested loci was between 0.13 and 0.19 (the average D′ of all the markers in the 10 K mapping array was 0.16). The density of markers differed among loci, ranging from 2.1 markers/Mb (7q11) to 5.3 markers/Mb (11q24–q25). We tested 118 SNPs across a 29.3-Mb interval (4.0 markers/Mb) at chromosome 14q23–q31. Within the locus, rs767603 showed a significant association with IA (Table 2; p = 0.00017, Bonferroni-corrected p = 0.023), and a haplotype including this SNP (rs1254332, rs116853 and rs767603) also showed a significant association (Table 2; p = 0.0018, Bonferroni-corrected p = 0.048). The association of rs767603 with IA was confirmed in a permutation test (permutation p = 0.018). No significant association was observed in the other tested loci. In the present study, we focused our analysis on the further refinement of the association signal at 14q23, because the marker in this locus (rs767603) showed a significant association with IA in both single SNP and haplotype analysis. The locus at chromosome 14q23 was identified by a study in a Japanese population (Onda et al. 2001), which also encouraged us to focus on this locus.

Table 2 Locus-wide association analysis in the previously reported linkage regions

To determine the extent of the association around rs767603 at 14q23, eight additional SNPs over a 1.6-Mb genomic region were genotyped in the discovery cohort. LD analysis revealed a 599-kb region of high linkage disequilibrium (D′ > 0.8), which contains rs767603 (Fig. 1). The patterns of LD and the range of the LD block (region with D′ of >0.8) were almost the same as those in the JPT HapMap data set. Within the 599-kb LD block, the most significant association was observed in rs767603 (Fig. 1; p = 0.00017). An adjacent polymorphism, rs2224410, showed the second-lowest p value (Fig. 1; p = 0.0010). To confirm the association in the discovery cohort, these two SNPs were genotyped in an independent cohort consisting of 237 cases and 253 controls recruited from all over Japan. Both SNPs showed a significant allelic association with IA (Table 3; OR = 0.61, 95% CI: 0.43–0.86, p = 0.0046 for rs767603; OR = 0.62, 95% CI: 0.44–0.89, p = 0.0082 for rs2224410). Haplotype analysis of the two SNPs also showed a significant association (Table 3; OR = 0.61, 95% CI: 0.43–0.87, p = 0.0060). These associations remained practically the same even after adjusting for sex, hypertension and smoking habits (data not shown). When we divided case subjects in the replication cohort into two groups (subjects with ruptured aneurysms and subjects with unruptured aneurysms), significant allelic association was observed among subjects with ruptured aneurysms (p = 0.0084 for rs767603 and p = 0.0070 for rs2224410) but a marginally significant association or an absence of association was observed among subjects with unruptured aneurysms (p = 0.087 for rs767603 and p = 0.205 for rs2224410).

Fig. 1

Association analysis and LD analysis in the discovery cohort. Pairwise linkage disequilibrium (LD) among 16 SNPs at 14q23 in the discovery cohort. The negative natural logarithm of the significance of allele association to IA is given in the graph. Dark red-shaded squares indicate D′ values >0.80. The numbers in the squares show 100D′ and no number indicates D′ = 1.0. The relative physical position of each SNP is given above the SNP ID. The names of the genes in Block 2 (599-kb LD block) and its flanking regions are given above the LD map.

Table 3 Association analysis in the discovery and replication cohorts for the SNPs in 14q23

To find a functionally relevant polymorphism, we selected seven candidate genes around the 599-kb LD block: PPM1A, SIX6, SIX1, SIX4, MNAT1, PRKCH and HIF1A. These genes were directly sequenced in 16 randomly selected cases and 16 controls in the discovery cohort. Among the 17 SNPs identified by sequencing, ten had a minor allele frequency of >0.05; these were genotyped and compared in the replication cohort. Two SNPs in the SIX6 gene, rs3759688 within the promoter region and rs1956558 within the 5′UTR, which showed complete LD (D′ = 1, r² = 1) with each other, showed a significant association with IA in the replication cohort (OR = 0.57, 95% CI: 0.32–1.00, p = 0.049, data not shown), although these were less significantly associated with IA than rs767603.

Discussion

We found consistent evidence of allelic and haplotype associations of rs767603 at chromosome 14q23–q31 with IA. Even when we analyzed individuals with ruptured and unruptured IAs separately, a similar association was observed in both groups. Although the allelic association in the unruptured IA group was marginally significant, this could be due to small sample size. LD analysis showed that rs767603 belongs to a 599-kb LD block harboring 13 genes, suggesting that a susceptibility gene might reside within this block. We have sequenced all coding regions of candidate genes within the 599-kb LD block and its flanking regions to find a relevant polymorphism for IA. However, no variants identified through the sequencing were more significantly associated with IA than rs767603.

Among seven susceptibility loci tested, we only found evidence of an association at the locus on 14q23. The absence of associations of the other tested loci could be partly due to the low density of markers among loci. Although D′ was similar among loci, marker densities ranged from 2.1 to 5.3 markers/Mb. In fact, the marker density of 14q23 was the highest among the IA loci reported for a Japanese population (chromosome 7q11, 14q23, 17cen and 19q13).

We failed to identify any variants likely to be involved in IA formation. It is possible that rs767603 itself may be involved in the etiology of IA by influencing the transcriptional level of its nearby gene, although this is unlikely because rs767603 lies as much as 14 kb upstream of the nearest gene (SIX1). Alternatively, it is more rational to assume that other variants in the promoter regions of our candidate genes or variants of other untested genes in LD with rs767603 may be causally associated with IA. Further work is needed to investigate such possibilities.

When we divided case subjects in the replication cohort into two groups, subjects with ruptured aneurysms and subjects with unruptured aneurysms, a significant allelic association was observed only among subjects with ruptured aneurysms, whereas the association among subjects with unruptured aneurysms was at best marginally significant. This is probably due to the small sample size for unruptured cases (153 ruptured cases and 84 unruptured cases). Because similar trends were observed in both groups, we treated them as one cohort.

The limitations of this study are the sparse marker density and the small size of the study population. The coverage of the 10 K array is less than 30%, and some markers were about 1 Mb away from each other. With such low SNP density and coverage, potential associations might go undetected. Thus, an association study with higher SNP density might reveal hidden associations. Given a risk allele frequency of 0.2, a prevalence of 0.01 and a relative risk of 1.5, 240 cases and 240 controls are needed for an alpha of ≤0.05 with 80% power according to the Genetic Power Calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/), supposing that a marker allele is in complete LD with a disease allele. However, the relative risks of susceptibility genes for IA were less than 1.5 in most recent studies (Yoneyama et al. 2004; Akagawa et al. 2006; Inoue et al. 2006; Ruigrok et al. 2006a, 2006b; Weinsheimer et al. 2007), and a marker allele is unlikely to be in complete LD with a disease allele because of the low coverage of SNP markers in the GeneChip 10 K mapping array. Thus, the power of the present study may not be sufficient to detect a relevant polymorphism for IA formation.
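The order of magnitude of this sample-size requirement can be checked with a normal-approximation formula for comparing allele frequencies between cases and controls. The sketch below is not the algorithm of the Genetic Power Calculator and is not expected to reproduce its figure exactly. It assumes a multiplicative model (relative risk 1.5 per risk allele), Hardy–Weinberg equilibrium, equal numbers of cases and controls and complete LD between marker and disease allele; all function names are hypothetical.

```python
import math
from statistics import NormalDist


def allele_freqs(p, rr, prevalence):
    """Risk-allele frequency in cases and controls (multiplicative model)."""
    q = 1 - p
    # Genotype frequencies for 0, 1 and 2 risk alleles under HWE
    geno = [q * q, 2 * p * q, p * p]
    risk = [1.0, rr, rr * rr]
    mean_risk = sum(g * r for g, r in zip(geno, risk))
    baseline = prevalence / mean_risk  # penetrance with 0 risk alleles
    pen = [baseline * r for r in risk]
    case_g = [g * f / prevalence for g, f in zip(geno, pen)]
    ctrl_g = [g * (1 - f) / (1 - prevalence) for g, f in zip(geno, pen)]
    p_case = case_g[2] + case_g[1] / 2
    p_ctrl = ctrl_g[2] + ctrl_g[1] / 2
    return p_case, p_ctrl


def cases_needed(p, rr, prevalence, alpha=0.05, power=0.80):
    """Approximate number of cases (= controls) for a two-sided allelic test."""
    p1, p2 = allele_freqs(p, rr, prevalence)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    alleles_per_group = num / (p1 - p2) ** 2
    return math.ceil(alleles_per_group / 2)  # two alleles per subject


print(cases_needed(0.2, 1.5, 0.01))
print(cases_needed(0.2, 1.2, 0.01))
```

Under these assumptions the requirement for a relative risk of 1.5 is a few hundred cases, and it rises steeply as the relative risk falls, which is consistent with the concern about power raised above.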

In conclusion, our findings confirmed the association of a locus for IA on 14q23, although we failed to identify a susceptibility gene for IA. The association was, however, consistently validated in three independent studies including ours. Further studies with higher density markers and a larger population are warranted.