Genome-wide study of immune biomarkers in cerebrospinal fluid and serum from patients with bipolar disorder and controls

Bipolar disorder is a common, chronic psychiatric disorder. Despite high heritability, there is a paucity of identified genetic risk factors. Immune biomarkers are under more direct genetic influence than bipolar disorder. To explore the genetic associations with immune biomarker levels in cerebrospinal fluid (CSF) and blood serum that previously showed differences in bipolar disorder, we performed a study involving 291 individuals (184 bipolar disorder patients and 107 controls). The biomarkers assayed in both CSF and serum were chitinase-3-like protein-1 (YKL-40), monocyte chemoattractant protein-1 (MCP-1), soluble cluster of differentiation 14 (sCD14), and tissue inhibitor of metalloproteinases-1 and 2 (TIMP-1 and TIMP-2). C-reactive protein (CRP) was only quantified in serum, and interleukin 8 (IL-8) measures were only available in CSF. Genome-wide association studies were conducted using PLINK for each of three genotyping waves and incorporated covariates for population substructure, age, sex, and body mass index (BMI). Results were combined by meta-analysis. Genome-wide significant associations were detected for all biomarkers except TIMP-1 and TIMP-2 in CSF. The strongest association in CSF was found for markers within the CNTNAP5 gene with YKL-40 (rs150248456, P = 2.84 × 10⁻¹⁰). The strongest association in serum was also for YKL-40 but localized to the FANCI gene (rs188263039, P = 5.80 × 10⁻²⁶). This study revealed numerous biologically plausible genetic associations with immune biomarkers in CSF and blood serum. Importantly, the genetic variants regulating immune biomarker levels in CSF and blood serum differ. These results extend our knowledge of how biomarkers showing alterations in bipolar disorder are genetically regulated.

One line of research has explored immunoinflammatory processes in bipolar disorder and a number of studies have investigated immune markers in serum [12][13][14][15][16] . Serum immune markers are, however, not necessarily indicative of immune and inflammatory activity in the brain. This is because concentrations of cytokines and other proteins in serum or plasma come from production in peripheral tissues and thus do not reflect inflammatory processes in the brain [17][18][19] . Therefore, we previously investigated a set of immune biomarkers in cerebrospinal fluid (CSF) in bipolar disorder and healthy controls, including monocyte chemoattractant protein-1 (MCP-1), chitinase-3-like protein-1 (YKL-40), soluble cluster of differentiation 14 (sCD14), tissue inhibitor of metalloproteinases-1/2 (TIMP-1 and TIMP-2), and interleukin 8 (IL-8) 20,21 . Taken together, these results suggest that brain-specific immune mechanisms, beyond systemic inflammatory processes, are involved in the pathophysiology of bipolar disorder.
Immune biomarkers are products of gene expression and are as such measurable components in biological pathways between genotype and disease. Immune biomarkers are therefore potential endophenotypes for some psychiatric disorders by representing a measurable characteristic of the disorder more closely related to the genetic underpinnings than behavioral manifestations such as bipolar disorder 15,22,23 . Available evidence indicates a number of loci associated with immune biomarker levels among different populations 15,[24][25][26] , but few genome-wide association studies of multiple immune biomarkers have been published in bipolar disorder cohorts. Revealing genetic associations for CSF and blood serum biomarkers serving as indicators of biological processes implicated in bipolar disorder might yield new insights into the genetic underpinnings of bipolar disorder.
The aim of this study was to conduct a GWAS of a set of immune biomarkers in CSF and serum that previously have been found to differ between bipolar disorder patients and controls.

Subjects
The study population consisted of subjects with bipolar disorder as well as age- and sex-matched controls. Subjects were recruited from a long-term naturalistic study of bipolar disorder, the St. Göran Bipolar Project, at the bipolar outpatient unit at the Northern Stockholm psychiatric clinic in Stockholm, Sweden. The work-up and diagnostic procedures for patients and selection of controls have been described in detail previously 20,[27][28][29] . In brief, the key clinical assessment instrument was a Swedish version of the Affective Disorder Evaluation (ADE), which is a standardized interview protocol developed for the Systematic Treatment Enhancement Program of Bipolar Disorder (STEP-BD). The clinical diagnosis of bipolar disorder was made according to DSM-IV criteria as per the Structured Clinical Interview for DSM-IV. In addition, the Mini International Neuropsychiatric Interview (M.I.N.I.) was completed to screen for other psychiatric diagnoses. The ADE and M.I.N.I. interviews were conducted by board-certified psychiatrists, or residents in psychiatry. A best-estimate diagnostic decision was then made based on all information available by a consensus panel of experienced board-certified psychiatrists specialized in bipolar disorder. To gather a representative cohort of bipolar disorder patients, the St. Göran study aimed to have as few exclusion criteria as possible and persons with somatic disorders including autoimmune disorder were not excluded. Patients were not remunerated for participation.
Population-based controls living in the same catchment areas were randomly selected by Statistics Sweden and contacted by mail. Details of the recruitment, and inclusion and exclusion criteria can be found elsewhere 29,30 . Briefly, eligible persons were scheduled for a personal examination and investigated to exclude mental illness by a psychiatrist using the Mini International Neuropsychiatric Interview (M.I.N.I.) and selected parts of the ADE. Exclusion criteria were as follows: any current psychiatric disorder including personality disorder, a family history of schizophrenia or bipolar disorder in first-degree relatives, drug or alcohol abuse (based on DUDIT, AUDIT and serum levels of carbohydrate-deficient transferrin), and neurological conditions except mild migraines, as well as pregnancy, untreated endocrine disorders, dementia, and chronic systemic autoimmune disorders, except persons with controlled asthma and allergies. Control subjects were remunerated for their participation.
Only cases and controls with Scandinavian ancestry were included in this study. All participating subjects granted oral and written informed consent after complete description of the study. Ethical approvals for this study were granted by the Stockholm Regional Ethics Committee.
Sampling and biomarker analyses of CSF and blood
CSF and blood were obtained when the participants were in a stable condition. Sampling occurred between 9 and 10 am after an overnight fast. For each participant, 12 mL of CSF was collected and gently inverted to avoid gradient effects. Serum was obtained from blood samples after coagulation and centrifugation. Both serum and CSF samples were stored at −80°C at Karolinska Institutet Biobank, Sweden. The assays and kits for each biomarker have been described in detail previously 20,21 . All biomarker concentrations were measured by experienced and board-certified laboratory technicians at the Clinical Neurochemistry Laboratory in Mölndal, Sweden who were blinded to the clinical information.

Genotyping, quality control, and imputation
Blood samples were transferred to the Karolinska Institutet Biobank for DNA extraction. An aliquot of each DNA sample was shipped to the Broad Institute (Boston, USA) for genotyping. Whole-genome genotyping was conducted at the Broad Institute using PsychChip (wave 1), Affymetrix 6.0 (wave 2), and Illumina OmniExpress (wave 3) chips. All controls were in wave 1 and wave 3, while the bipolar cases were exclusively in wave 2. Following quality control steps, datasets were then imputed with the full 1000 Genomes Project integrated variant set as reference 31 . Details of genotyping, quality control, and imputation procedures have been described previously 32 . The three waves shared more than 10 million SNPs after imputation.
A total of 183 bipolar cases and 107 controls from the St. Göran project had both biomarker information and genotype data available.

Statistical analyses
We used PLINK version 1.9, SPSS Statistics 23 and R version 3.3.3 with the qqman package for all statistical analyses. Group differences between bipolar cases and controls were tested using the Mann-Whitney U test for age, BMI, and all biomarker concentrations, while Fisher's exact test was used for sex. Interquartile range was calculated for continuous variables.
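As an illustration of the descriptive comparisons described above, the following is a minimal sketch of an interquartile range and a two-sided Mann-Whitney U test. It is not the SPSS procedure used in the study: it assumes the large-sample normal approximation without continuity or tie correction, and the function names and example values are hypothetical.

```python
from math import erfc, sqrt
from statistics import quantiles


def interquartile_range(values):
    """Return Q3 - Q1 using the default (exclusive) quartile method."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, illustrative only).

    Assumption: no continuity correction and no tie correction of the variance.
    Returns the smaller U statistic and an approximate p-value.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values, then assign the average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = min(1.0, erfc(abs(z) / sqrt(2)))
    return u, p


# Hypothetical biomarker concentrations for two small groups.
cases = [1.0, 2.0, 3.0]
controls = [4.0, 5.0, 6.0]
u_stat, p_value = mann_whitney_u(cases, controls)
```

For samples as small as the hypothetical ones here, an exact test would normally be preferred; the approximation is more appropriate for group sizes like those in this study.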
The CRP distributions deviated from normality, prohibiting use of linear regressions in GWAS analysis. We therefore conducted a log transformation to adapt it to a distribution compatible with linear regression.
The GWAS analysis was conducted using imputed SNP dosages and linear regression models in PLINK for each wave and incorporated the first four multidimensional scaling (MDS) components, age, sex, and BMI as covariates. Results were then combined by meta-analysis in PLINK using a random-effects model. Although imputed data were used, there were still SNPs that were not present in one or more waves. Waves included in the meta-analysis are noted in the results. SNPs with low minor allele frequency (<1%) or poor imputation quality (INFO < 0.6) were removed from the meta-analysis results. A standard genome-wide significance threshold of p < 5 × 10⁻⁸ was used 33 . Linkage disequilibrium-based clumping was used to group correlated SNPs and define regions of association. Genes were identified based on UCSC hg19 coordinates. GWAS summary statistics were also uploaded to FUMA to obtain results from MAGMA gene-set analysis and MAGMA tissue expression analysis (GTEx v6, 30 general tissue types) 34,35 .
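To make the per-SNP combination step concrete, the sketch below shows one common form of random-effects meta-analysis together with the filtering thresholds stated above. It assumes the DerSimonian-Laird estimator of between-wave variance, which is an assumption on our part and not a description of PLINK's internals; all function names and input values are hypothetical.

```python
from math import erfc, sqrt

GWS_THRESHOLD = 5e-8  # genome-wide significance threshold stated in the text


def passes_filters(maf, info):
    """Keep SNPs with MAF >= 1% and imputation INFO >= 0.6 (thresholds from the text)."""
    return maf >= 0.01 and info >= 0.6


def random_effects_meta(betas, ses):
    """Combine per-wave effect estimates for one SNP (DerSimonian-Laird, assumed)."""
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    # Cochran's Q measures heterogeneity of the per-wave estimates.
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    # Random-effects weights add the between-wave variance tau2.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se_re = sqrt(1.0 / sum(w_re))
    p = erfc(abs(beta_re / se_re) / sqrt(2))  # two-sided normal p-value
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"beta": beta_re, "se": se_re, "p": p, "q": q, "tau2": tau2, "i2": i2}


# Hypothetical per-wave estimates for a single SNP.
result = random_effects_meta([0.42, 0.38, 0.45], [0.08, 0.10, 0.09])
is_gws = result["p"] < GWS_THRESHOLD
```

When the estimated between-wave variance is zero, the random-effects result coincides with the fixed-effects one; when it is positive, the standard error widens, which is why the random-effects model is the more conservative choice under heterogeneity.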
SNPs with p-values < 5 × 10⁻⁵ from either CSF or serum results were used for the difference test to compare the effect sizes between CSF and serum results. Effect sizes of these SNPs were standardized by dividing by the standard deviations of the biomarker levels. Pearson correlation coefficients were used to test the correlation of standardized effect sizes from CSF and serum results.
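The comparison of effect sizes across tissues can be sketched as follows. This is an illustrative outline only: the function names, the effect sizes, and the biomarker standard deviations are hypothetical, and the selection of SNPs by p-value is assumed to have been done beforehand.

```python
from math import sqrt


def standardize(betas, biomarker_sd):
    """Express effect sizes in units of the biomarker's standard deviation."""
    return [b / biomarker_sd for b in betas]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical effect sizes for the same SNPs in CSF and serum,
# with hypothetical biomarker standard deviations for each tissue.
csf_betas = [12.0, -8.0, 5.0, 20.0, -3.0]
serum_betas = [3.1, -1.2, 0.4, 4.8, 0.9]
csf_std = standardize(csf_betas, biomarker_sd=40.0)
serum_std = standardize(serum_betas, biomarker_sd=9.0)
r = pearson_r(csf_std, serum_std)
```

Note that Pearson's r is unchanged when each tissue's effect sizes are divided by a constant, so the standardization mainly serves to place the CSF and serum effects on a common scale for comparison.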

Code availability
The scripts used to generate these results can be provided upon request.

Demographics and clinical characteristics
We included 184 bipolar disorder patients (67 men and 117 women) and 107 controls (46 men and 61 women) for the serum measurements. A subset of this sample population who consented to lumbar puncture, 114 bipolar disorder patients (44 men and 70 women) and 83 controls (36 men and 47 women), comprised the study population for CSF measurements. A summary of sample sizes included in this study for each genotyping wave is described in Supplementary Table S1.
A description of demographic and clinical characteristics of the study population as well as comparison of differences between bipolar patients and controls can be found in Supplementary Table S2. Note that the comparisons of biomarker concentrations between cases and controls shown in Table S2 have been published previously 20,21 , but may differ somewhat as not all individuals had both biomarker information and genotype data available.

Genetic variants associated with immune biomarkers in CSF
More than 5.6 million SNPs from 114 bipolar patients and 83 controls were included in our final meta-analysis for biomarkers in CSF. The quantile-quantile plots for each of the six biomarkers showed moderate deviations from the null distribution at low p-values, indicating the presence of association signals, but no deviation at the higher p-values, which denotes well-matched cases and controls, i.e., no inflation (Fig. S1). The genomic inflation factor, λ, ranged from 1.05 to 1.11.
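For readers unfamiliar with the genomic inflation factor, the sketch below shows the usual definition: the median observed 1-df chi-square statistic divided by its expected median under the null. It is an illustration rather than the code used in the study; the function name and the input p-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Expected median of a chi-square distribution with 1 degree of freedom (about 0.455).
_CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(pvalues):
    """Return lambda from two-sided p-values, each in the interval (0, 1]."""
    # A two-sided p-value p corresponds to a z-score with tail area p / 2;
    # squaring the z-score gives the 1-df chi-square statistic.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2) ** 2 for p in pvalues]
    return median(chi2) / _CHI2_1DF_MEDIAN


# Hypothetical, perfectly uniform p-values: lambda should be close to 1.
uniform_p = [(i - 0.5) / 1001 for i in range(1, 1002)]
lam = genomic_inflation(uniform_p)
```

Values close to 1 indicate that the bulk of test statistics follow the null distribution, which matches the reading of the quantile-quantile plots given above.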
The genome-wide association analysis results for the six biomarkers in CSF are shown in Fig. 1. The number of genome-wide significant regions (index SNP P-values < 5 × 10⁻⁸) for each GWAS meta-analysis on YKL-40, MCP-1, sCD14, and IL-8 in CSF was 4, 3, 2, and 6, respectively. No SNPs of genome-wide significance were associated with TIMP-1 and TIMP-2. Table 1 lists the genome-wide significant (GWS) regions for each biomarker in CSF. The top GWS SNPs associated with YKL-40 were located in the genes CNTNAP5, EYS, and FER1L3. For MCP-1, the SNP with the strongest association was located within the ACAA2 gene (rs10438979, A/G, P = 1.64 × 10⁻⁹). The other two top GWS SNPs were located near or within LINC01288, FAM129A, and EDEM3. Two GWS SNPs were related to sCD14 levels in CSF; they are located in the gene regions of C8orf37-AS1 and TUT1.
The results from MAGMA tissue expression analysis for 30 general tissue types for biomarkers from CSF are shown in Fig. S2. The genetic variants associated with YKL-40 from CSF were significantly expressed in bladder. No other significant findings from GTEx were identified.

Genetic variants associated with serum immune biomarkers
Over five million SNPs from 184 bipolar patients and 107 controls were included in our final meta-analysis for biomarkers in serum. The quantile-quantile plots for each of the six biomarkers were consistent with no inflation and strong association signals (Fig. S3). The genomic inflation factor, λ, ranged from 1.01 to 1.05.
The results from MAGMA tissue expression analysis for 30 general tissue types for biomarkers from serum are shown in Fig. S4. None demonstrated significance.

Difference tests for top SNPs in CSF and serum
The genome-wide significant SNPs for serum and CSF for each biomarker with both measures did not overlap. To formally test for similarity in sub-significance associations, we compared the standardized effect sizes of top SNPs from CSF and serum (Table 3). From this test, there were significant positive correlations between standardized effect sizes of top SNPs in different tissues in YKL-40 (r = 0.533, P < 0.001), MCP-1 (r = 0.449, P < 0.001), and sCD14 (r = 0.092, P = 0.020). However, the correlation in sCD14 was rather modest while the standardized effect sizes of top SNPs from different tissues in YKL-40 and MCP-1 showed moderate correlation. The correlation coefficients of TIMP-1 (r = −0.036, P = 0.360) and TIMP-2 (r = −0.008, P = 0.796) were close to zero, which indicates no relationship between effect sizes of SNPs associated with CSF or serum.

Discussion
This is the first genome-wide study of multiple immune biomarkers in serum and CSF in bipolar disorder. Despite the modest sample size, a number of SNPs reached genome-wide statistical significance.

CSF markers
Among immune biomarkers in CSF, the strongest association was found between rs150248456 and YKL-40. Rs150248456 is located within an intron of CNTNAP5 (contactin associated protein like 5) on chromosome 2. The product of CNTNAP5 belongs to the neurexin family 36 . Interestingly, SNPs located in CNTNAP5 have been found to be significantly associated with mathematical ability, self-reported educational attainment, cognitive performance and response to antipsychotic treatment in schizophrenia 37,38 . Two GWS SNPs, represented only in genotyping waves containing control subjects, were observed in this area. Although CSF concentration of YKL-40 differs between bipolar cases and controls 20 , genes are likely to regulate the expression of biomarkers in people with and without bipolar disorder in a similar way. Interestingly, a previous GWAS of bipolar disorder in Norwegian individuals followed by replication in Icelandic samples also found nominally significant (P < 0.05) markers located in this gene area 39 . The top significant region associated with CSF YKL-40 in bipolar disorder cases was in the gene EYS (rs11753319), which is a novel locus. SNPs located in EYS were also reported to be significantly associated with mathematical ability, self-reported educational attainment, BMI, alcohol drinking, systolic blood pressure, mood disorder, unipolar depression, and schizophrenia 37,[40][41][42][43] . Several SNPs located near LINC01288 and EDEM3 were found to be genome-wide significantly associated with MCP-1. LINC01288 was reported to be significantly associated with BMI, while EDEM3 was found to be associated with educational attainment, mathematical ability, systemic lupus erythematosus, and cognitive performance 37,44,45 . Four GWS SNPs located in C8orf37-AS1 were found to be associated with CSF concentration of sCD14. Heel bone mineral density and smoking behaviors have previously been associated with C8orf37-AS1.
Table notes: Bold gene names include the index SNP within the gene. Genes not in bold are those that SNPs are closest to. GWS is defined as P < 5e-08. A1/A2, beta (β), p-value, waves are based on the meta-analysis of GWAS data from each wave. Position is the basepair position or given in UCSC hg19 coordinates. Chr chromosome, (Index)SNP single nucleotide polymorphism (with the strongest association in the genomic region), A1/A2 reference and alternate allele, Freq weighted average frequency of reference allele, β random-effects meta-analysis β estimate, p-value random-effects meta-analysis p-value, N number of SNPs in the reported region, Waves valid waves included for the SNP.
a Results from meta-analysis including wave 2.
Although no SNPs reached genome-wide significance for TIMP-1 or TIMP-2 concentration in CSF, several association peaks for the two biomarkers nearly reached GWS (Fig. 2). This suggests that larger sample sizes may yield significant associations.
Among genes associated with IL-8 in CSF, SATB2 is highly expressed in brain 46 , indicating a potential molecular mechanism from associated SNPs to IL-8 CSF levels.

Serum markers
As for immune biomarkers in serum, the top three GWS SNPs associated with YKL-40 are located in an intronic region of the FANCI gene on chromosome 15. The product of the FANCI gene is a member of the Fanconi anaemia complementation (FANC) group 46 . Genetic variation in the FANC group has previously been found to be associated with psychiatric illness 47 . When considering both cases and controls, the top SNP is located near the gene SLC39A12. This gene is highly expressed in brain and was identified by bioinformatic analyses as a significant molecular biomarker in the progression of psychiatric disorders, including bipolar disorder 48 . Moreover, the genes SPTLC1 and ROR2 that were associated with YKL-40 level in serum are moderately associated with schizophrenia and bipolar disorder (P = 5.79 × 10⁻⁷) according to a previous GWAS 49 .
The strongest association with sCD14 in serum was with rs190197089. This SNP is located within an intron of SLC8A1 (solute carrier family 8 member A1) on chromosome 2. There is another GWS SNP located in the same area. Both of them are from the meta-analysis of two waves (wave 1 and wave 2), but were unavailable in wave 3. Interestingly, the SLC8A1 gene has previously been associated with bipolar disorder 50,51 . The largest number of SNPs (n = 110) associated with sCD14 after meta-analysis with all 3 waves were identified on chromosome 5q31.3 (top: rs2569191). More than 15 genes are located in this area. The 5q31.3 region has previously been linked to bipolar disorder 50 . The other top SNP associated with sCD14 in serum, rs2569191, is located near the CD14 gene. CD14 encodes a surface antigen that is expressed by monocytes 46 .
There was one GWS locus associated with TIMP-1 level in serum. It is located near NPAS3, which is a transcription factor involved in neurogenesis. NPAS3 has been implicated in a pathway associated with bipolar disorder 52 . However, this was the result only from controls. Among genes where SNPs associated with TIMP-2 were located, FSTL5 is the gene with the highest expression in brain among all tissues 53 .
After log transformation of CRP levels in serum, one SNP (rs57213254) was GWS. Altered CRP levels in serum have been found to be associated with not only bipolar disorder, but also hypertension, stroke, coronary heart disease, type 2 diabetes mellitus, and cancer [54][55][56][57][58] . A number of loci were associated with serum CRP levels in a previous GWAS in the general population 25 . However, the GWS SNP associated with CRP in this study is novel. Replication is needed to confirm the association between this GWS SNP and altered CRP levels in bipolar patients and controls.
Compared with previous studies in larger samples, we still found a number of novel GWS SNPs associated with disease-associated immune biomarker level differences. This might imply possible biological pathways from genetic markers to inflammatory mediators in bipolar disorder. However, additional work is needed to confirm this conjecture. There are several potential explanations for the new associations despite the small sample size: First, previous GWAS of biomarkers have studied the general population or people with Alzheimer's disease 24,25 . The novel GWS SNPs found in the meta-analysis including wave 2 (which was case-only) might be associated with biomarker differences in bipolar disorder specifically. Second, although imputation was conducted on each wave, some SNPs did not exist across all three waves due to array differences or fluctuations in allele frequencies across waves. The GWS results from only the control waves which were not significant in prior studies 25,59 have very low minor allele frequencies (MAF < 0.015). They might therefore have been removed from wave 2 during quality control steps and may not have been analyzed in other studies. However, the biomarker levels differ between bipolar cases and controls, and genes regulate the expression of biomarkers in people with and without bipolar disorder in a similar way 20 . We could also posit that these findings might be specific to the Swedish population. Further studies are needed to affirm these assertions. Finally, it is possible that the slight deviations from normal distributions for biomarkers other than CRP could have resulted in false positive associations. However, the statistical tests used are robust to minor deviations in normality. False positives are a concern for all studies and may lead to variation in significant SNPs across studies, underscoring the importance of replication efforts.
The number of significant SNPs associated with disease-associated immune biomarkers differed depending on whether the sampling substrate was CSF or serum. GWS SNPs associated with blood serum biomarkers are more numerous than GWS SNPs in relation to CSF biomarkers. The reason might be the smaller sample size for CSF than blood serum. The associated SNPs and the genes where the SNPs localized are different between the same biomarker in CSF and serum. However, when comparing nominally associated SNPs from CSF and serum, effect sizes for YKL-40 and MCP-1 from CSF and serum are moderately correlated. Little correlation was observed in sCD14 while no correlation was found for TIMP-1 and TIMP-2. This mirrors findings from a previous study in this sample which observed correlations between CSF and serum levels for YKL-40 and MCP-1 20 but not for TIMP-1 and TIMP-2. These convergent lines of evidence indicate that some biomarkers share genetic regulation across tissues while others do not.
There are limitations of this study to consider. First, given the hurdle of lumbar puncture to collect CSF, the sample size is limited and we were unable to identify a replication cohort. Despite this, we found numerous loci associated with immune biomarkers, but our sample size is almost certainly not sufficient to identify all relevant loci. Second, genotyping was conducted on different chips (PsychChip, Affymetrix 6.0, and Illumina OmniExpress) that provide incomplete overlap of directly genotyped markers. Although we used the imputed SNP dosages in order to increase the number of SNPs shared by all three waves, the overlap is incomplete. Finally, there might be confounders associated with bipolar disorder such as smoking and alcohol use. Smoking and alcohol abuse might not only affect immune-related biomarker levels, but are also partly genetically mediated 60,61 . It is therefore possible that some significant genetic associations are due to the intermediary effect of smoking or alcohol consumption rather than being directly linked to biomarker levels.
Despite the limitations, there are several strengths of this study. First, this is the first combined GWAS of peripheral and central immune biomarkers focusing on bipolar disorder to date, which sheds light on the genetic regulation of the immune system in bipolar disorder. Second, we studied biomarkers measured in CSF that closely reflect the chemistry of the brain. Third, there is no evidence of heterogeneity for markers with the strongest association in our study, using Q-tests (P > 0.50) and I² index (I² = 0.00), while other SNPs showed high heterogeneity with Q-tests (P < 0.10) and I² index (I² > 10). We used the random-effects model for meta-analysis, which conservatively accounts for heterogeneity. We detected a number of SNPs associated with immune biomarker levels with large effect sizes. Compared to results from previous GWAS on bipolar disorder, the associations (i.e., effect sizes) between genetic variations and immune mediators are stronger than those with bipolar disorder itself 8,11,39 .
This study raises the possibility of using specific genetic markers as a proxy for immune biomarkers measured in CSF, which would be less cumbersome than collecting CSF through lumbar puncture. As sample size increases, the predictive ability will continue to improve. If replicated, the genetic markers may thus serve as indicators of biological processes implicated in psychiatric disorders. Indeed, specific genetic markers related to immune biomarker profiles might be used to differentiate between different psychiatric diseases.
In summary, a number of biologically plausible SNPs significantly influencing immune biomarker levels in CSF and serum which demonstrated prior alterations in bipolar disorder have been identified in this study. Several of these SNPs are located in genes reported to be associated with bipolar disorder. The genetic variants associated with immune biomarker levels in CSF differ compared with those in serum. However, nominally significant variants showed correlated effect sizes for some biomarkers and this generally corresponded to whether the biomarker levels directly exhibited correlations between CSF and serum. The results of these GWAS can provide a route for the future investigations of genetic factors and immune biomarkers to aid in accurate diagnosis and development of treatments for bipolar disorder.