Fine Mapping of Intracranial Aneurysm Susceptibility Based on a Genome-wide Association Study

In addition to conventional genome-wide association studies (GWAS), fine-mapping is increasingly used to identify the genetic function of variants associated with disease susceptibility. Here, we used a fine-mapping approach to evaluate causal variants based on a previous GWAS involving patients with intracranial aneurysm (IA). Fine-mapping analysis was conducted on the chromosomal data provided by a GWAS comprising 250 patients diagnosed with IA and 296 controls, using the posterior inclusion probability (PIP) and the log10-transformed Bayes factor (log10BF). The narrow-sense heritability (h²) explained by each causal variant was estimated. Subsequent gene expression analysis, based on transcripts per million (TPM) values, and functional network analysis were performed. Twenty causal candidate single nucleotide polymorphisms (SNPs) surpassed the genome-wide significance threshold for credible evidence (log10BF > 6.1). Four SNPs, namely rs75822236 (R535H, GBA; log10BF = 15.06), rs112859779 (G141S, TCF24; log10BF = 12.12), rs79134766 (A208T, OLFML2A; log10BF = 14.92), and rs371331393 (Q1932X, ARHGAP32; log10BF = 20.88), showed a complete PIP value in their respective chromosomal regions, suggesting a high probability of variant causality associated with IA. GBA was highly expressed in whole blood (TPM = 33.13), while TCF24 was rarely expressed in any tissue or cell type. No direct interaction was observed between the four causal genes; however, PSAP appeared to be particularly important via indirect correlations with other genes. Our results suggest that four mutations in GBA, TCF24, OLFML2A, and ARHGAP32 are linked to IA susceptibility and pathogenesis. Our approach may yield more informative mutations for subsequent GWAS.


Introduction
Intracranial aneurysm (IA) refers to an abnormal focal dilatation of a cerebral artery due to a weakening of the intima of the blood vessel wall. The prevalence of IA in the general adult population has been reported to be nearly 6% [1]. The incidence of unruptured intracranial aneurysms (UIAs) was 15.6 per 100,000 persons [2]. An asymptomatic UIA can rupture suddenly, resulting in subarachnoid hemorrhage (SAH), which is associated with a high mortality rate exceeding 50% within one month after ictus [3,4]. IA is a complex disease involving an interaction between clinical and genetic factors underlying its formation and growth [2]. Important clinical risk factors for IA include female sex, hypertension, and smoking. The risk of rupture is increased when the aneurysm is located between arterial branches or in the vertebrobasilar region, and with larger size at diagnosis or the presence of a bleb or daughter sac [5][6][7]. Genetic studies have been performed to identify genes associated with IA via linkage analysis, single nucleotide polymorphisms (SNPs) of known candidate genes or strongly correlated genes, and genome-wide association studies (GWAS) that screen multiple candidate genes. In particular, GWAS have revealed large-scale genetic associations with traits and diseases. Technically, a GWAS compares the allele frequencies of SNPs between cases and controls. However, the complex traits of IA are not entirely attributable to a single gene, but are caused by the influence of multiple genes [8,9]. Gene-gene and gene-environment interactions also affect traits and diseases. Given the inherent features of GWAS, genetic markers within the same linkage disequilibrium (LD) block exhibit similar associations. Accordingly, even if a candidate gene is identified via GWAS, it may merely indicate a statistically significant difference rather than represent an etiological factor.
Further, the precise location of the causative gene may differ within the same LD block. Thus, it is important to reduce errors via additional data processing that identifies false-positive results obtained in GWAS.
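To illustrate how a GWAS compares allele frequencies between cases and controls at a single SNP, the following minimal Python sketch computes an allelic odds ratio and a Wald test P-value from a 2×2 table of allele counts. The function name and the allele counts are hypothetical and are not taken from the study; only the group sizes (250 cases and 296 controls, hence 500 and 592 alleles) follow the cohort described here. Actual GWAS software typically fits regression models with covariates, so this is a simplified sketch rather than the method used in the original GWAS.

```python
import math


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic odds ratio, SE of ln(OR), and two-sided Wald P-value.

    Inputs are allele counts (two alleles per person) in cases and controls.
    All four counts must be greater than zero.
    """
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of ln(OR) from the 2x2 table (Woolf's formula)
    se_ln_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    z = math.log(odds_ratio) / se_ln_or
    # Two-sided P-value under a standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, se_ln_or, p_value


# Hypothetical allele counts: 250 cases (500 alleles), 296 controls (592 alleles)
odds_ratio, se_ln_or, p_value = allelic_test(150, 350, 100, 492)
print(f"OR = {odds_ratio:.2f}, SE(lnOR) = {se_ln_or:.3f}, P = {p_value:.2e}")
```

A significant result from such a test only flags an association; markers in LD with the tested SNP would show similar signals, which is the motivation for the fine-mapping step.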
Fine-mapping is one of the post-GWAS analyses used to narrow down the potential candidate variants directly affecting a trait [10]. This approach can be used to identify regions harboring possible causal susceptibility variants based on the LD structure of the population [9]. It can reveal complex correlations between causal variants and disease development using computational data, without in vivo or in vitro molecular biology studies [10,11]. Sekar et al. [11] showed that structurally diverse alleles of the complement component 4 genes contribute to schizophrenia via excessive complement activity, resulting in reduced numbers of synapses. Consequently, fine-mapping can be used to assess the functional role of a risk allele, which is challenging to investigate at the level of molecular mechanisms despite a strong genetic association. Fine-mapping of complex traits has been increasingly performed in many diseases, especially cancer and stroke, but has yet to be reported in IA. Here, for the first time, we performed a fine-mapping analysis based on previous GWAS data sets to identify the causal candidate variants associated with IA in a Korean adult cohort. We also performed a functional gene set enrichment analysis using the optimized candidate sets to analyze the biological relationship between the candidate genes and IA.

GWAS-based summary statistics
The analysis was based on the summary statistics provided by the previous IA GWAS. In brief, the study included 250 adult patients with saccular aneurysm and 296 controls enrolled between March 2015 and December 2020 [12,13]. The Axiom™ Asian Precision Medicine Research Array (APMRA) (Thermo Fisher Scientific, MA, USA) was used for genotyping of the study subjects. High-quality plates were defined by a plate pass rate higher than 95% for samples. The average call rate of passing samples was greater than 99%. A total of 512,575 SNPs passed quality control, which required a genotyping call rate of 95% or higher, a minor allele frequency of at least 1%, and a Hardy-Weinberg equilibrium P-value ≥ 1×10⁻⁶ [12]. The GWAS-based summary statistics included allele types, minor allele frequencies, effect sizes, and P-values. The Institutional Review Board and Ethics Committee approved all protocols of the study (No. 2016-3, 2019-06-006).
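The SNP-level quality control thresholds above (call rate ≥ 95%, minor allele frequency ≥ 1%, Hardy-Weinberg equilibrium P ≥ 1×10⁻⁶) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the Hardy-Weinberg test shown is a simple one-degree-of-freedom chi-square test, which is an assumption, since the text does not state which test was used in the original GWAS (genotyping pipelines often use an exact test instead).

```python
import math


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Return True if a SNP passes the call-rate, MAF, and HWE filters.

    n_aa, n_ab, n_bb are genotype counts; n_missing is the number of
    samples without a genotype call. Thresholds follow the text above.
    """
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    p = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(p, 1 - p)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chisq_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p


# Hypothetical genotype counts for one SNP in 546 samples (6 missing calls)
print(snp_passes_qc(n_aa=300, n_ab=200, n_bb=40, n_missing=6))
```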

Statistical Analysis
We performed a fine-mapping study to identify the role of causal candidate combinations in the susceptibility to IA using FINEMAP v1.3.1 [14]. Odds ratios (ORs) of individual SNPs were converted to effect sizes by natural log transformation (lnOR). The causality of each SNP or configuration was assessed using effect sizes, posterior inclusion probabilities (PIPs), and the narrow-sense heritability (h²) explained by the causal SNPs. The log10-transformed Bayes factors (log10BF) for the individual causal SNPs and configurations were estimated via FINEMAP analysis. A log10BF value greater than 6.1 was considered significant genome-wide evidence. The fine-mapping approach requires estimates of SNP correlations; therefore, LD matrices between SNPs were generated using PLINK v1.9 (https://www.cog-genomics.org/plink/) [15]. All fine-mapping tests were conducted on individual chromosomes (chr1-22) because of the LD-based mapping procedure. Manhattan and regional association plots of the fine-mapping results were obtained using the "qqman" package in R v3.6.1 (https://cran.r-project.org/web/packages/qqman) and LocusZoom v1.3, run with modified Python and R scripts [16]. Regional annotations and the functional impact of SNPs were described using the ANNOVAR program (http://www.openbioinformatics.org/annovar/), including PolyPhen-2 predictions [17].
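To make the quantities above more concrete, the following Python sketch converts an OR to lnOR, computes a single-SNP approximate Bayes factor on the log10 scale, and derives PIPs for a region. It is a deliberately simplified illustration and not the FINEMAP algorithm: it uses Wakefield's approximate Bayes factor, ignores LD, and assumes exactly one causal variant per region with equal prior probability for every SNP. The function names, the prior standard deviation of 0.2, and the example ORs and standard errors are all assumptions chosen for illustration, so the resulting values are not directly comparable with the log10BF threshold of 6.1 used in this study.

```python
import math


def log_odds(odds_ratio):
    """Effect size as the natural log of the odds ratio (lnOR)."""
    return math.log(odds_ratio)


def log10_abf(beta, se, prior_sd=0.2):
    """log10 of Wakefield's approximate Bayes factor in favor of association.

    beta: lnOR; se: its standard error; prior_sd: assumed prior SD of beta.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    shrink = w / (v + w)
    ln_abf = 0.5 * math.log(1 - shrink) + 0.5 * z * z * shrink
    return ln_abf / math.log(10)


def single_causal_pips(log10_bfs):
    """PIPs assuming exactly one causal SNP and equal priors in the region."""
    top = max(log10_bfs)
    # Subtract the maximum before exponentiating to avoid overflow
    weights = [10 ** (x - top) for x in log10_bfs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical summary statistics (OR, SE of lnOR) for three SNPs in one region
region = [(2.7, 0.10), (1.3, 0.12), (1.1, 0.15)]
bfs = [log10_abf(log_odds(or_), se) for or_, se in region]
for (or_, se), bf, pip in zip(region, bfs, single_causal_pips(bfs)):
    print(f"OR = {or_:.2f}  log10BF = {bf:.2f}  PIP = {pip:.4f}")
```

In this simplified setting a PIP close to 1 means that one SNP carries nearly all of the evidence in its region. FINEMAP additionally evaluates configurations of several causal SNPs using the LD matrix.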

Gene expression and functional network analyses
The expression of the causal candidate genes was evaluated in human blood, brain-specific tissues, and cells using the Genotype-Tissue Expression (GTEx) Portal (https://gtexportal.org/home/) [18]. Transcripts per million (TPM) values of a total of 56,200 genes were obtained for 13 brain tissues, 4 blood vessel samples (3 arterial tissues and 1 cell line of EBV-transformed lymphocytes), and whole blood. Subsequent gene functional network analysis was conducted using the GeneMANIA program (https://genemania.org/) [19].
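For readers unfamiliar with the TPM unit, the Python sketch below applies the standard TPM definition: read counts are divided by gene length, and the resulting rates are rescaled so that they sum to one million within a sample. The function name, gene labels, counts, and lengths are hypothetical; the values in this study come from the GTEx Portal, and the sketch is not a description of the GTEx processing pipeline.

```python
def tpm(read_counts, gene_lengths_kb):
    """Transcripts per million from raw read counts and gene lengths (kb).

    Both arguments are dicts keyed by gene name and describe one sample.
    """
    # Length-normalized rate: reads per kilobase of transcript
    rates = {g: read_counts[g] / gene_lengths_kb[g] for g in read_counts}
    scale = sum(rates.values())
    if scale == 0:
        return {g: 0.0 for g in rates}
    # Rescale so that all values in the sample sum to one million
    return {g: rate / scale * 1e6 for g, rate in rates.items()}


# Hypothetical counts and lengths for three genes in one sample
counts = {"geneA": 500, "geneB": 1200, "geneC": 30}
lengths_kb = {"geneA": 2.0, "geneB": 6.0, "geneC": 1.5}
for gene, value in tpm(counts, lengths_kb).items():
    print(f"{gene}: {value:.1f} TPM")
```

Because TPM values sum to the same total in every sample, they allow the relative expression of a gene to be compared across tissues.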

Gene expression and functional network analyses
Gene expression and functional network analyses were performed using the four causal candidate genes (Fig. 2 and Supplementary Table 2). GBA was broadly expressed in all tissues and cell lines (6 < TPM < 34). In particular, it was highly expressed in whole blood (TPM = 33.13). Conversely, TCF24 was rarely expressed in any tissue or cell type (TPM < 0.15). Expression of OLFML2A was moderate in all arteries (TPM = 2.37 to 9.54). ARHGAP32 was rarely expressed in EBV-transformed lymphocytes and whole blood (TPM = 0.0667 and 0.2274, respectively), whereas it was enriched in the brain and blood vessels (TPM between 3 and 24). No direct interaction was observed between the four causal genes (Fig. 3). However, these genes constitute a hub network interacting with neighboring genes, especially PSAP, SCARB2, and ASAH1 (Fig. 3).

Discussion
Large-scale GWAS and meta-analyses have been widely used over the past decade to identify and validate common or novel susceptibility gene variants in various diseases. However, because these are overall genotype-phenotype analyses, disease-modifying functional mutations and their direct biological relevance to disease have yet to be elucidated completely [20]. In addition, the heritability of a specific trait cannot be fully explained by the common SNPs of intronic or intergenic regions identified via GWAS, which are designed to detect common variants in common complex diseases. Accordingly, even when a large number of susceptibility loci were identified, only a few were replicated in an independent cohort. Thus, few disease-associated variants have been confirmed in functional in vitro studies or used in treatment [21]. To overcome these limitations, fine-mapping analysis has been introduced to identify variant causality in human complex diseases and as a cost-effective genotyping strategy [9]. To date, many studies have underscored the need for "feature selection" to identify relevant "variables" using parametric or non-parametric models. However, feature selection is not a simple task and requires substantial genetic investigation. It is important to identify the causal or driver mutations linked to the treatment of human complex diseases. The selection of genetic variants from GWAS is uncertain because of strongly correlated SNPs that reflect the pairwise LD structure at the population level. A fine-mapping analysis facilitates the identification of credible genetic variants, reduces selection bias such as false-positive variants arising from the initial GWAS, and improves the findings of molecular functional studies [9]. Here, we performed a fine-mapping analysis based on the results of a previous IA GWAS using the statistical method developed by Benner et al. [14]. Our findings may enable the identification of causal variants (true positives) and the exclusion of potential false positives via statistically significant fine-mapping analysis of transformed GWAS results. Therefore, these analytical methods may enable the selection of functional candidate variants based on the molecular mechanisms associated with IA formation.
In this study, we found four causal genes that are potentially linked to IA: GBA, TCF24, OLFML2A, and ARHGAP32. We speculate that these genetic variants, through amino acid substitutions or gain- or loss-of-function mutations, may cause dysfunctional immune responses and inflammation, which affect IA formation. The exonic variant rs75822236 of GBA, located at 1q22, was significantly associated with IA [12]. More specifically, the GWAS revealed that the "T" allele of this variant increased the risk of IA [12]. In addition, the fine-mapping analysis revealed a high log10BF (15.06) and PIP (1.0), suggesting that this variant was a true positive for IA. The role of GBA has mainly been investigated in Parkinson's disease (PD) and Gaucher disease (GD), a recessive lysosomal storage disorder, and has barely been investigated in IA. Mata et al. [22] reported that GBA mutations and E326K carrier status were related to impaired working memory and executive function in patients with PD. In GD, null or severe homozygous mutations of GBA resulted in little or no human glucocerebrosidase activity [23]. These findings suggest that phenotypes differ according to the various GBA mutations. Kleinloog et al. [24] reported enrichment of the lysosomal pathway in ruptured IA compared with UIA based on RNA sequencing analysis of the aneurysm wall. Although the lysosomal pathway does not reflect an acute reaction to IA rupture [24], it is likely that it is induced by inflammation after bleeding.
OLFML2A and TCF24 showed a protective effect against IA formation, with log10BF levels greater than 12 and complete PIPs. However, the relationship between these two genes and IA is still unclear, even though they have been implicated in cardiovascular diseases. Conversely, ARHGAP32 significantly increased the risk of IA, with the highest log10BF (20.88) and a complete PIP. ARHGAP32 encodes Rho GTPase-activating protein 32 and mediates N-methyl-D-aspartate receptor signaling [12]. The role of ARHGAP32 has mainly been investigated in the regulation of blood pressure. The Rho-specific GTPase-activating protein GRAF3 was highly expressed in smooth muscle cells (SMCs) and regulated blood pressure by inhibiting RhoA-mediated SMC contractility [25]. GRAF3-deficient mice also showed increased blood pressure in response to angiotensin II and endothelin 1 [26]. In clinical practice, many patients manifest both IA and hypertension. Inci et al. [27] reported that the rate of pre-existing hypertension was 43.5% in patients with IA, which was higher than the 24.4% in the normal population. Hypertension may contribute to degeneration of the internal elastic lamina, weakening of the vessel wall, and IA formation [27]. Nevertheless, it is unclear whether the role of ARHGAP32 in IA is mediated indirectly via chronic hypertension or directly via changes in vascular tone.
Functional network analyses showed that PSAP was an important gene in the development of IA. The role of the PSAP gene has rarely been investigated in IA and has mainly been studied in PD. Oji et al. [28] reported that two SNPs, rs4747203 and rs885828, in the intronic regions of the PSAP saposin D domain were linked to PD. PSAP mutation can also result in dopaminergic neurodegeneration and motor decline in mice. Although we did not include patients with PD, the fine-mapping analysis revealed that PD-related genes such as GBA and PSAP may contribute to IA. Lysosomal dysfunction and the resulting lysosomal storage disorder can contribute causally to PD. Putative damaging variants in at least one gene associated with lysosomal storage disorders were observed in most PD patients [29]. However, lysosomal dysfunction can also be observed in the arterial wall. Lysosomal changes in vascular SMCs were attributed to the accumulation of excessive substrate levels in the lysosomes in a primate model of atherosclerosis and hypertension [30]. Excessive sterol accumulation in lysosomes can disrupt lysosomal function [31]. Therefore, it is possible that lysosomal dysfunction directly affects IA formation or contributes to IA via atherosclerosis. Hokari et al. [32] reported that atherosclerotic factors strongly increased the risk of middle cerebral artery aneurysm compared with paraclinoid aneurysm. After the aneurysm was secured, consistent statin therapy was significantly correlated with a better prognosis [33]. Wu et al. [34] demonstrated that the autophagy-lysosomal pathway, which entails self-digestion of dysfunctional intracellular components by lysosomal enzymes, was an important pro-survival mechanism after SAH. However, that study investigated the role of the lysosome after SAH development, not in IA development itself.
Therefore, additional studies are needed to investigate the role of candidate variants in lysosomal dysfunction resulting in IA formation via abnormal extracellular matrix (ECM) remodeling in response to hemodynamic stress.
A fine-mapping analysis may enable the detection of mutations in candidate genes and thereby contribute to treatment based on genome-based precision medicine. Identifying causal variants beyond GWAS provides a crucial blueprint for predicting disease risk and preventing IA as well as metabolic diseases and stroke [35]. To the best of our knowledge, this was the first fine-mapping analysis based on the results of a GWAS of IA. Nevertheless, the interpretation of the results has some limitations. First, the sample size was relatively small, which limited the statistical power to discover more functional mutations, even though four common mutations were detected in our study. Second, variables regarding aneurysm location and size were not considered in the analysis. Microscopic analysis of the aneurysm during surgery often reveals atherosclerotic changes in the aneurysm wall, especially in elderly patients with larger aneurysms. Accordingly, GWAS and subsequent fine-mapping analysis should be performed together in a large number of patients with IA while taking their metabolic status into account.
In summary, fine-mapping analysis robustly identified four functional mutations in causal candidate genes, namely GBA, TCF24, OLFML2A, and ARHGAP32, that are associated with IA. According to our literature review and functional annotations, mutations in these genes appear to play roles in the immune and inflammatory systems. These mutations suggest a possible polygenic inheritance of IA formation. Finally, our study may provide more informative and replicable causal susceptibility variants for IA, including these four mutations, for a second-stage GWAS.

Declarations
The data that support the findings of this study were submitted as online supplemental material, and further detailed information is available upon request to the corresponding author. All genotype and phenotype resources are managed by "The First Korean Stroke Genetics Association Research" study constructed from the Sacred Heart Hospital Stroke Database.

Figure 2
A heatmap of the expression of GBA, TCF24, OLFML2A, and ARHGAP32 in human cells and tissues including artery, brain, and whole blood is presented. Gene expression was estimated as transcripts per million (TPM). Genes and types of cells or tissues were ordered via agglomerative hierarchical clustering.

Figure 3
Susceptibility to intracranial aneurysm (Homo sapiens) based on multiple protein interactions between proteins coded by four causal hub genes including GBA, TCF24, OLFML2A, and ARHGAP32. The network included neighboring genes correlated with four hub genes. The width of individual lines indicates the intensity of the interaction between proteins. The colors in each line indicate multiple functions including physical interaction, co-expression, prediction, co-localization, genetic interaction, pathways, and shared protein domains.

Supplementary Files
Supplementary file associated with this preprint: OnlineSupplementalData.docx