Despite compelling evidence for a major genetic contribution to risk of bipolar mood disorder, conclusive evidence implicating specific genes or pathophysiological systems has proved elusive. In part this is likely to be related to the unknown validity of current phenotype definitions and consequent aetiological heterogeneity of samples. In the recent Wellcome Trust Case Control Consortium genome-wide association analysis of bipolar disorder (1868 cases, 2938 controls) one of the most strongly associated polymorphisms lay within the gene encoding the GABAA receptor β1 subunit, GABRB1. Aiming to increase biological homogeneity, we sought the diagnostic subset that showed the strongest signal at this polymorphism and used this to test for independent evidence of association with other members of the GABAA receptor gene family. The index signal was significantly enriched in the 279 cases meeting Research Diagnostic Criteria for schizoaffective disorder, bipolar type (P=3.8 × 10−6). Independently, these cases showed strong evidence that variation in GABAA receptor genes influences risk for this phenotype (independent system-wide P=6.6 × 10−5) with association signals also at GABRA4, GABRB3, GABRA5 and GABRR1. Our findings have the potential to inform understanding of presentation, pathogenesis and nosology of bipolar disorders. Our method of phenotype refinement may be useful in studies of other complex psychiatric and non-psychiatric disorders.
Bipolar disorder (BD; manic depressive illness)1 refers to an episodic recurrent pathological disturbance in mood (affect) ranging from extreme elation, or mania, to severe depression usually accompanied by disturbances in thinking and behaviour. Psychotic features (delusions and hallucinations) often occur. Pathogenesis is poorly understood and despite compelling evidence for a substantial genetic contribution to risk (heritability ∼80–90%),2, 3 conclusive evidence implicating specific genes or systems has proved elusive.4, 5 In part this is likely to be related to the unknown biological validity of current phenotype definitions that are based solely on clinical features and there is increasing evidence of genetic overlaps in susceptibility across the major mood and psychotic disorders.6, 7, 8, 9 Varying approaches have been used to investigate whether taking account of occurrence of psychosis may help to reduce heterogeneity in genetic studies of BD (for example, Potash et al.10 and Park et al.11) and to study samples of BD and schizophrenia together in the hope of identifying shared susceptibility loci (for example, Maziade et al.12). Molecular genetic approaches can be expected to contribute to an improvement in diagnostic classification through identification of the biological systems that underpin the clinical syndromes.13
The recently published Wellcome Trust Case Control Consortium (WTCCC) study14 described a genome-wide association analysis of 1868 bipolar cases and 2938 controls in which the 92nd most strongly associated polymorphism lay within the γ-amino butyric acid A (GABAA) receptor gene, GABRB1 (Tables 1a and b; rs7680321; allelic odds ratio (OR)=1.36 (95% confidence intervals (CI): 1.16–1.54); χ2=16.03, 1 d.f., P=6.2 × 10−5). GABAA receptors transduce the major central nervous system inhibitory neurotransmitter GABA. This genetic finding is of substantial interest because GABAA receptors have been hypothesized to be involved in the pathogenesis of mood15 and psychotic16 illnesses, have been implicated in anxiety17 and alcohol disorders,18 and are known to be involved in the actions of several psychoactive agents.19, 20, 21 GABAA receptors are hetero-pentameric chloride channels constructed from various permutations of the products of multiple genes (α1–6; β1–3; γ1–3; δ, ɛ, θ, π, ρ1–3). It is not completely known which permutations of subunits combine in nature, but native receptors usually contain two α-, two β- and one γ-subunit, the precise combination being a critical determinant of the physiological and pharmacological properties of the assembled receptor. Most of the genes encoding GABAA receptors are arranged genomically within clusters.22 For example, the cluster on chromosome 4p12 includes GABRB1, GABRA4, GABRA2 and GABRG1 within a stretch of 1.4 Mb of DNA.
Our aim in the current study was to identify a subset of bipolar cases showing an enriched signal at the index polymorphism, rs7680321, in the expectation that those cases would represent a group with greater biological homogeneity than the BD group as a whole. Under the hypothesis that this group might also exhibit relative aetiological homogeneity at other functionally related loci, we then sought to use this subset of cases to test for independent evidence for association with other polymorphisms in the GABAA receptor gene family.
Materials and methods
Our analyses used a subset of the single nucleotide polymorphisms (SNPs) and cases reported in the BD component of the WTCCC genome-wide association study of seven common familial diseases.14 All individuals were white and resident in the UK.
Bipolar disorder cases
The WTCCC bipolar data set comprised 1868 BD cases who were all over the age of 16 years, living in mainland UK and of European descent. Recruitment was undertaken throughout the UK by teams based in Aberdeen (8% of cases), Birmingham (35% of cases), Cardiff (33% of cases), London (15% of cases) and Newcastle (9% of cases). Individuals who had been in contact with mental health services were recruited if they suffered with a major mood disorder in which clinically significant episodes of elevated mood had occurred. This was defined as a lifetime diagnosis of a bipolar mood disorder according to Research Diagnostic Criteria23 and included: bipolar I disorder (71% of cases), schizoaffective disorder, bipolar type (SABP, 15% of cases), bipolar II disorder (9% of cases) and manic disorder (5% of cases). After providing written informed consent, all subjects were interviewed by a trained psychologist or psychiatrist using a semi-structured lifetime diagnostic psychiatric interview (in most cases the Schedules for Clinical Assessment in Neuropsychiatry24), and available psychiatric medical records were reviewed. Using all available data, best-estimate ratings were made for a set of key phenotypic measures on the basis of the OPCRIT checklist25 (which covers both psychopathology and course of illness) and lifetime psychiatric diagnoses were assigned according to the Research Diagnostic Criteria.23 Further details of clinical methodology can be found elsewhere.26, 27
The characteristics of the subset of 279 cases meeting Research Diagnostic Criteria for SABP were: 42% men; mean age at interview: 43.3 (s.d. 12.1) years; all individuals had experienced psychotic symptoms (delusions or hallucinations); mean age at onset of impairment due to mood disorder: 23.2 (s.d. 7.9) years; lifetime occurrence of rapid cycling (that is, 4 or more episodes of mood disorder within a 12-month period): 10%; lifetime occurrence of a postnatal episode of mania within 6 weeks of parturition (that is ‘postnatal’ or ‘puerperal psychosis’): 8%; lifetime occurrence of a definite suicide attempt: 17%.
Controls
There were 2938 controls, who were not screened to exclude presence of psychiatric illness, and came from two sources.
1958 Birth Cohort Controls
A total of 1458 controls came from the 1958 Birth Cohort (also known as the National Child Development Study) that includes all births in England, Wales and Scotland, during 1 week in 1958. From an original sample of over 17 000 births, survivors were followed up at ages 7, 11, 16, 23, 33 and 42 years (http://www.cls.ioe.ac.uk/studies.asp?section5000100020003). In a biomedical examination at 44–45 years of age (http://www.b58cgene.sgul.ac.uk/followup.php), 9377 cohort members were visited at home, providing 7692 blood samples with consent for future Epstein–Barr virus (EBV)-transformed cell lines. DNA samples extracted from 1500 cell lines of self-reported white ethnicity and representative of gender and each geographical region were selected for use as controls. Of these controls, 50% were men.
UK blood services controls
The second set of controls was made up of 1480 individuals selected from a sample of blood donors recruited as part of the current project. WTCCC in collaboration with the UK Blood Services (NHS Blood and Transplant in England, Scottish National Blood Transfusion Service in Scotland and The Welsh Blood Services in Wales) set up a UK national repository of anonymized samples of DNA and viable mononuclear cells from 3622 consenting blood donors, age range: 18–69 years (ethical approval 05/Q0106/74). A set of 1564 samples was selected from the 3622 samples recruited based on sex and geographical region (to reproduce the distribution of the samples of the 1958 Birth Cohort) for use as common controls in the WTCCC study. Of these controls, 48% were men.
Exploratory phase of phenotype optimization
The exploratory phase of the analysis involved identifying a subset of the bipolar cases in which there was enrichment of the association signal at the index SNP, rs7680321. The following subsets of bipolar cases were considered: (1) Research Diagnostic Criteria (RDC) bipolar I disorder or manic disorder (N=1418); (2) RDC bipolar II disorder (N=171); (3) RDC SABP (N=279); (4) Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSMIV)28 bipolar I disorder (N=1594); (5) DSMIV bipolar II disorder (N=134); (6) DSMIV SABP (N=98); (7) age of onset of impairment due to affective illness <20 years (N=484); (8) lifetime occurrence of rapid cycling of mood episodes (that is, four or more episodes per year, N=231); (9) lifetime occurrence of psychotic symptoms (N=1225); (10) lifetime occurrence of psychotic symptoms in at least 50% of episodes of mood disorder (N=316); (11) lifetime occurrence of predominantly mood-incongruent psychotic features (defined as >19 on the Bipolar Affective Disorder Dimension Scale (BADDS) incongruence scale;29 N=496). We identified the subset showing optimal signal enhancement using forward step-wise logistic regression as implemented in SPSS version 12.0.1, as follows. We considered only the BD cases and created a genotype–phenotype data set representing the individual alleles at rs7680321 (that is, the data set contained two entries for each individual, the first entry being for one of the two alleles at the index SNP and the second entry being for the other allele). The phenotype subsets were represented using binary variables (a case that was a member of a subset was assigned '1' for that subset, otherwise '0'). All binary phenotype variables were considered within a forward step-wise logistic regression model to predict the risk allele. Only one subset (RDC SABP) was retained in the logistic model as being a significant predictor of the risk allele.
Note that this exploratory phase was undertaken prior to, and was independent of, the subsequent hypothesis testing.
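The allele-level data layout described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study and does not perform the step-wise regression itself; it only shows how each case expands into two rows (one per allele) and how risk-allele frequency can be compared between members and non-members of a subset. The function names, the case identifiers, the subset label rdc_sabp and the allele letters are hypothetical.

```python
from typing import Dict, List, Tuple

# One row per allele: (case id, 1 if risk allele else 0, subset flags of the case).
Row = Tuple[str, int, Dict[str, int]]


def allele_level_rows(
    genotypes: Dict[str, Tuple[str, str]],
    subset_flags: Dict[str, Dict[str, int]],
    risk_allele: str,
) -> List[Row]:
    """Expand every case into two rows, one for each allele at the index SNP."""
    rows: List[Row] = []
    for case_id, alleles in genotypes.items():
        for allele in alleles:
            rows.append((case_id, int(allele == risk_allele), subset_flags[case_id]))
    return rows


def risk_allele_frequency(rows: List[Row], subset: str, member: int) -> float:
    """Risk-allele frequency among rows whose case has subset flag == member."""
    outcomes = [is_risk for _, is_risk, flags in rows if flags[subset] == member]
    return sum(outcomes) / len(outcomes)


# Toy data (hypothetical): four cases, two of them in the subset of interest.
toy_genotypes = {"c1": ("T", "T"), "c2": ("T", "C"), "c3": ("C", "C"), "c4": ("T", "C")}
toy_flags = {
    "c1": {"rdc_sabp": 1},
    "c2": {"rdc_sabp": 1},
    "c3": {"rdc_sabp": 0},
    "c4": {"rdc_sabp": 0},
}
toy_rows = allele_level_rows(toy_genotypes, toy_flags, risk_allele="T")
print(len(toy_rows))
print(risk_allele_frequency(toy_rows, "rdc_sabp", 1))
print(risk_allele_frequency(toy_rows, "rdc_sabp", 0))
```

In a full analysis the binary outcome column (risk allele or not) would be the dependent variable and the subset flags the candidate predictors of a step-wise logistic model.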
Polymorphisms used in analyses
The WTCCC data set comprised 469 557 SNPs distributed across the genome. For the hypothesis testing analyses reported here we selected SNPs for analysis that (1) had excellent genotyping quality, and (2) tagged common genetic variation at or near genes encoding GABAA receptors.
Selection of SNPs with high genotyping quality
For the current analyses, we selected only SNPs that had a minor allele frequency of at least 5% in our total sample and met stringent levels of genotyping quality. We used the following quality filter for inclusion of SNPs: (1) call rate >99.5% in WTCCC BD cases and controls, (2) Hardy–Weinberg P-value >0.001 in cases, (3) Hardy–Weinberg P-value >0.01 in controls, (4) good clustering of genotypes on visual inspection of cluster plots (as described in WTCCC paper). We have demonstrated that SNPs meeting these criteria showed a very high level of genotype agreement with genotypes scored independently in our laboratory using the Sequenom platform (of over 67 000 genotypes typed for 140 SNPs, we found 99.95% agreement; data not shown).
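As an illustration only, the numerical parts of this quality filter could be expressed as follows. The sketch assumes a simple 1 d.f. chi-square test for Hardy–Weinberg equilibrium (the source does not say which Hardy–Weinberg test was used), and the dictionary keys and function names are hypothetical. The visual inspection of cluster plots is represented by a manually supplied flag.

```python
import math
from typing import Dict, Tuple


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 d.f.) Hardy-Weinberg test from genotype counts (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(snp: Dict[str, object]) -> bool:
    """Apply the thresholds stated in the text to one SNP record."""
    case_counts: Tuple[int, int, int] = snp["case_counts"]  # type: ignore[assignment]
    control_counts: Tuple[int, int, int] = snp["control_counts"]  # type: ignore[assignment]
    return (
        snp["maf"] >= 0.05                       # minor allele frequency at least 5%
        and snp["call_rate"] > 0.995             # call rate >99.5%
        and hwe_p_value(*case_counts) > 0.001    # Hardy-Weinberg in cases
        and hwe_p_value(*control_counts) > 0.01  # Hardy-Weinberg in controls
        and bool(snp["clusters_ok"])             # result of visual inspection
    )


# Hypothetical SNP record for demonstration.
example_snp = {
    "maf": 0.30,
    "call_rate": 0.998,
    "case_counts": (25, 50, 25),
    "control_counts": (49, 42, 9),
    "clusters_ok": True,
}
print(passes_qc(example_snp))
```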
Selection of SNPs tagging common variation in region of genes encoding GABAA receptors
The locations of SNPs and the reference sequence of genes encoding GABAA receptors were determined using the Golden Path (NCBI build 35: http://genome.ucsc.edu/). There are 19 known GABAA receptor genes that are arranged in eight distinct chromosomal locations, with several of the genes being arranged in clusters. For each of these chromosomal locations, we selected for analysis typed SNPs meeting our quality filters that were located within the reference sequence of a GABAA receptor gene or that lay within 20 kb (upstream or downstream) of a GABAA receptor gene. To reduce the redundancy of information, we selected a set of SNPs that captured the common genetic variation within our sample. This reduced the number of SNPs examined, which has the beneficial effect of reducing multiple testing (of most relevance for the logistic regression analyses). The reduced redundancy is also desirable for set-based analyses30 (see below). We selected this subset using the Tagger option of HaploView version 3.3231 with the requirement for variation with minor allele frequency >5% to be captured at r2>0.95. For one of the genes (GABRD) there were no typed SNPs in the data set. In the reduced set of markers that we used in the analysis, there was a total of 225 SNPs distributed across 18 genes as follows: 4p12–13 cluster: GABRB1 (24 SNPs), GABRA4 (9 SNPs), GABRA2 (8 SNPs), GABRG1 (5 SNPs); 5q31–q25 cluster: GABRB2 (25 SNPs), GABRA6 (6 SNPs), GABRA1 (8 SNPs), GABRG2 (11 SNPs); 15q11–q13 cluster: GABRB3 (27 SNPs), GABRA5 (10 SNPs), GABRG3 (32 SNPs); Xq28 cluster: GABRQ (3 SNPs), GABRA3 (9 SNPs), GABRE (5 SNPs); 6q15: GABRR1 (12 SNPs), GABRR2 (9 SNPs); 5q35.1: GABRP (10 SNPs); 3q11.2: GABRR3 (12 SNPs).
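The positional rule (within the gene or within 20 kb of it) and the idea of one SNP capturing another at a given r2 can be sketched as below. This is not the Tagger algorithm: the r2 here is the squared correlation of genotype dosages, used as a simple stand-in, and all SNP names and coordinates are hypothetical.

```python
from typing import Dict, List, Sequence


def snps_near_gene(
    snp_positions: Dict[str, int],
    gene_start: int,
    gene_end: int,
    flank: int = 20_000,
) -> List[str]:
    """Names of SNPs inside the gene or within `flank` bases either side of it."""
    lower, upper = gene_start - flank, gene_end + flank
    return sorted(name for name, pos in snp_positions.items() if lower <= pos <= upper)


def dosage_r2(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors (0, 1, 2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no information about another SNP
    return sxy * sxy / (sxx * syy)


# Hypothetical coordinates: a gene spanning 100 000 to 200 000.
positions = {"snp_a": 80_000, "snp_b": 100_000, "snp_c": 150_000, "snp_d": 221_000}
print(snps_near_gene(positions, gene_start=100_000, gene_end=200_000))

# A SNP would be treated as captured by a tag if the pairwise value exceeds 0.95.
print(dosage_r2([0, 1, 2, 1], [0, 1, 2, 1]) > 0.95)
```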
Testing association at a single SNP
The Armitage trend test was used to assess association at each SNP, by comparing genotype distributions in cases vs controls using the association model option within the analysis package, PLINK version 0.99.32 We also calculated allelic ORs and their 95% CIs from the 2 × 2 contingency tables of allele counts.
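For readers who want to see the arithmetic, the following sketch computes a Cochran–Armitage trend statistic with additive weights (0, 1, 2) and an allelic OR with a 95% CI from genotype counts. It is an independent illustration rather than PLINK's implementation; the Woolf (log-based) interval is an assumption, since the source does not state how the CIs were derived, and the counts are invented.

```python
import math
from typing import Sequence, Tuple


def armitage_trend(cases: Sequence[int], controls: Sequence[int]) -> Tuple[float, float]:
    """Return (chi-square, P) for the 1 d.f. trend test.

    `cases` and `controls` hold genotype counts ordered by number of
    risk alleles carried (0, 1, 2).
    """
    weights = (0, 1, 2)
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    col = [r + s for r, s in zip(cases, controls)]
    t = sum(w * (r * s_total - s * r_total) for w, r, s in zip(weights, cases, controls))
    spread = sum(w * w * n * (n_total - n) for w, n in zip(weights, col))
    cross = sum(
        weights[i] * weights[j] * col[i] * col[j]
        for i in range(3)
        for j in range(i + 1, 3)
    )
    variance = (r_total * s_total / n_total) * (spread - 2 * cross)
    if variance == 0:
        return 0.0, 1.0
    chi2 = t * t / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def allelic_odds_ratio(
    cases: Sequence[int], controls: Sequence[int]
) -> Tuple[float, float, float]:
    """Allelic OR with a Woolf 95% CI (assumed method); needs non-zero allele counts."""
    a = cases[1] + 2 * cases[2]        # risk alleles in cases
    b = 2 * cases[0] + cases[1]        # other alleles in cases
    c = controls[1] + 2 * controls[2]  # risk alleles in controls
    d = 2 * controls[0] + controls[1]  # other alleles in controls
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, lower, upper


# Invented genotype counts for demonstration.
demo_cases, demo_controls = (10, 40, 50), (30, 50, 20)
print(armitage_trend(demo_cases, demo_controls))
print(allelic_odds_ratio(demo_cases, demo_controls))
```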
Testing for additional evidence of association at genes in the 4p12 cluster while allowing for the specific association signal at the index SNP, rs7680321
At the 4p12 cluster, logistic regression was used to confirm the presence of additional independent evidence of association over and above that resulting from association with the index SNP rs7680321. For logistic regression analyses, we used the forward step-wise option of SPSS version 12.0.1 and compared logistic models that included only rs7680321 with models that included rs7680321 and an additional SNP at the 4p12 cluster. Correction of the significance of model improvement was made by Bonferroni adjustment (that is, by multiplying the P-value by the total number of SNPs, excluding rs7680321, at the gene examined).
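The model comparison and correction can be sketched in a few lines. The example assumes that model improvement is judged by a 1 d.f. likelihood ratio test on the log-likelihoods of the two nested models; the exact statistic used by the SPSS step-wise procedure is not stated in the source, and the log-likelihoods and SNP count below are invented.

```python
import math


def lr_test_p(loglik_reduced: float, loglik_full: float) -> float:
    """P-value for adding one SNP to the model (likelihood ratio test, 1 d.f.)."""
    lr_statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return math.erfc(math.sqrt(lr_statistic / 2.0))


def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Multiply the P-value by the number of SNPs tested, capped at 1."""
    return min(1.0, p_value * n_tests)


# Invented values: index SNP only versus index SNP plus one additional SNP,
# with 23 additional SNPs examined at the gene.
p_raw = lr_test_p(loglik_reduced=-100.0, loglik_full=-95.0)
print(p_raw, bonferroni_adjust(p_raw, n_tests=23))
```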
Testing for gene-wide significance for SNP association at genes outside the 4p12 cluster
An empirical P-value was determined for the best SNP association in each gene, allowing for all SNPs tested, by using the set-based analysis option within PLINK32 with the set-max option set to 1 and with 100 000 permutations. (We discussed the analyses with Dr Purcell, developer of PLINK, to ensure that the version of PLINK used correctly implemented the set-based analyses). At the 15q cluster, logistic regression was used in a manner consistent with its use for the 4p12 cluster to allow for the effect of the GABRB3 SNP, rs890319 and test the independent significance of rs17561681 in GABRA5.
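The logic of a gene-wide empirical P-value for the best SNP can be illustrated with a small max-statistic permutation test. This is a simplified stand-in, not PLINK's set-based routine: the per-SNP statistic is N times the squared correlation between dosage and case status, labels are shuffled across individuals, and the empirical P-value is taken as (hits + 1)/(permutations + 1). The data and function names are hypothetical.

```python
import random
from typing import List, Sequence


def trend_statistic(dosages: Sequence[int], status: Sequence[int]) -> float:
    """N times the squared correlation between dosage (0/1/2) and status (0/1)."""
    n = len(status)
    mean_x, mean_y = sum(dosages) / n, sum(status) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in status)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, status))
    if sxx == 0 or syy == 0:
        return 0.0
    return n * sxy * sxy / (sxx * syy)


def gene_wide_empirical_p(
    snp_dosages: List[Sequence[int]],
    status: Sequence[int],
    n_perm: int = 1000,
    seed: int = 0,
) -> float:
    """Empirical P for the best SNP in a gene, allowing for all SNPs tested."""
    rng = random.Random(seed)
    observed = max(trend_statistic(d, status) for d in snp_dosages)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype relationship
        best = max(trend_statistic(d, shuffled) for d in snp_dosages)
        if best >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical gene with two SNPs typed in 40 individuals (20 cases, 20 controls).
demo_status = [1] * 20 + [0] * 20
demo_gene = [[2] * 20 + [0] * 20, [1] * 40]
print(gene_wide_empirical_p(demo_gene, demo_status, n_perm=200))
```

Because the maximum over SNPs is recomputed in every permutation, the resulting P-value accounts for the number of SNPs in the gene and for the correlation between them.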
Testing overall statistical significance of association at GABAA receptor gene SNPs—excluding the haplotype block with index SNP
To test the support in our data set for association over the set of SNPs examined, we used set-based analysis30 as implemented in PLINK32 using the default options and with 1 000 000 permutations. We omitted from this analysis the five SNPs that are within the haplotype block (as determined from our data set using HaploView version 3.3231) that includes the index SNP, rs7680321.
Population attributable fraction
We estimated population attributable fraction, AF, using the formula AF=f(OR−1)/(1+f(OR−1)), where f is the population frequency of the risk allele and OR is the estimated allelic OR between cases and controls for the risk allele.33 The CIs for AF were estimated assuming that population allele frequency was equal to that in the controls.
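A worked example may help. The sketch below uses the standard Levin form of the attributable fraction, AF = f(OR − 1)/(1 + f(OR − 1)); the function name and the input values are illustrative and are not taken from the study's tables.

```python
def attributable_fraction(risk_allele_freq: float, odds_ratio: float) -> float:
    """Levin's population attributable fraction for a risk allele."""
    excess = risk_allele_freq * (odds_ratio - 1.0)
    return excess / (1.0 + excess)


# Illustrative inputs: risk allele frequency 0.20 in controls, allelic OR 1.5.
# excess = 0.20 * 0.5 = 0.10, so AF = 0.10 / 1.10.
print(attributable_fraction(0.20, 1.5))
```

An OR of 1 gives an attributable fraction of zero, and the fraction rises with both the allele frequency and the OR.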
Canonical correlation analysis
We used canonical correlation analysis34 as implemented in SAS 8.02 (PROC CANCOR) to attempt to further refine the genotype–phenotype relationship within the RDC SABP sample (N=279). The genetic variables in the analysis were the genotypes at each of the five SNPs rs3934674, rs6414684, rs890319, rs17561681 and rs854579. The phenotype variables used were binary (present/not present) measures of the following variables (further details of definitions used for ratings are available on request from the corresponding author): male sex; lifetime occurrence of mood instability; lifetime occurrence of marked mood fluctuation; lifetime occurrence of alcohol abuse; family history of major mood disorder in first- or second-degree relative; family history of psychotic illness in first- or second-degree relative; onset of illness <20 years age; onset of illness >30 years age; presence of psychotic features in at least 50% of mood episodes; definite lifetime mood incongruence of psychotic features; definite incapacitating manic episode; definite major depressive episode; definite incapacitating major depressive episode; lifetime occurrence of postnatal mania; lifetime occurrence of rapid cycling; lifetime occurrence of suicidal ideation; lifetime occurrence of suicide attempt; objective good response to lithium; episodic course of illness; chronic course of illness; lifetime occurrence of features of disorganization; lifetime occurrence of persecutory delusions; sudden onset of first episode of illness; lifetime occurrence of auditory hallucinations and lifetime occurrence of panic episodes.
Results
In our initial exploratory analysis using logistic regression, we found that, of 11 clinical phenotypic subsets considered, the subset of 279 bipolar cases that met Research Diagnostic Criteria23 for SABP showed the strongest evidence for association with the risk allele at the index SNP, rs7680321 (Tables 1a and b; OR=1.80 (1.37–2.37); χ2=21.34, 1 d.f., P=3.8 × 10−6). As can be seen in Table 1b, the association signal is substantially less strong in the subset of cases meeting criteria for DSMIV SABP or the larger subset of cases with predominantly mood-incongruent psychotic features. The RDC SABP subset differed significantly from the remaining 1589 BD cases (χ2=7.38, 1 d.f., P=0.0066) in which the association signal at this polymorphism was attenuated (OR=1.26, CI: 1.08–1.47; χ2=9.06, 1 d.f.; P=2.6 × 10−3). It is important to stress that this exploratory phase of phenotype refinement was undertaken prior to, and independently of, the subsequent hypothesis testing.
We next sought additional evidence for association at GABAA receptor genes in these 279 SABP cases. To make this test independent of the index SNP (and therefore independent of the associated prior multiple testing), we excluded from our analysis all SNPs within the GABRB1 haplotype block containing rs7680321 because they are highly correlated with the index SNP.
In total, 5 of the 18 genes tested showed evidence independent of the index signal for association at gene-wide levels of statistical significance: GABRB1 (P=0.0039), GABRA5 (P=0.0024), GABRB3 (P=0.0107), GABRA4 (P=0.013), GABRR1 (P=0.0439; Table 2). For each gene the significance level has been corrected for all SNPs examined within that gene. We also obtained an experiment-wide empirical significance for association across the total set of 220 SNPs (excluding the index block) across the GABAA receptor genes using a permutation-based analysis (empirical P=6.6 × 10−5, with 1 000 000 permutations). This significance level takes account of the multiple SNPs tested across all genes. We observed no evidence for statistical interaction between the risk alleles, although it is important to recognize that our sample is not well powered to detect interactions.
We failed to find association at GABAA receptor genes when the control sample was compared to the 1589 bipolar cases not meeting criteria for RDC SABP (Table 3). Furthermore, we found no evidence of association when, using the same methodology and genotyping platform, we examined the set of SNPs at these genes within our sample of 476 white UK cases meeting DSMIV28 criteria for schizophrenia (Table 3).
Canonical correlation analysis did not reveal any significant relationship between genotype and phenotype variables or subgroups of these variables (that is, first canonical correlation coefficient not significant).
Discussion
Our data provide strong statistical support for the involvement of GABAA receptor genes in susceptibility to a component of the bipolar mood phenotype and point to a relatively specific effect on a form of bipolar spectrum illness meeting RDC criteria for SABP. Such cases, in addition to clear-cut episodes of mania, display psychotic symptoms (delusions and/or hallucinations) that are not easily understood as being the result of extreme mood change and that are often seen also in individuals diagnosed with schizophrenia. We note that the category of RDC SABP is itself a clinically heterogeneous category (albeit substantially less heterogeneous than the BD sample as a whole). However, we did not find any further clinical subdivision of this category that usefully refined the genetic signal in our data set.
The exploratory, phenotype refinement phase of our analysis identified the RDC SABP diagnostic subset of the BD cases as being of particular interest in the context of association at the index polymorphism in GABRB1. It is important to recognize that this phase of analysis was data-driven and considered a range of phenotypic subsets. It was not based upon a specific prior hypothesis about the genetic relationship between mood and psychotic disorders. This contrasts with the hypothesis-based approaches of researchers who have used occurrence of psychosis in an attempt to reduce heterogeneity in BD (for example, Potash et al.10 and Park et al.11) and to seek loci that overlap between BD and schizophrenia (for example, Maziade et al.12).
An important strength of our study is that, because our hypothesis testing was independent of the initial exploratory procedure, the results we have presented do not require correction for examining either multiple phenotypes or multiple biological systems. Neither do our hypothesis-driven analyses require the extremely stringent levels of statistical significance needed to assess the discovery-oriented findings from a genome-wide association study.4, 35 Moreover, the significance level should be interpreted against the background knowledge of several lines of evidence implicating GABAA receptors in psychiatric illness. Thus, the strong statistical support within the context of the substantial prior probability helps to provide confidence in the validity of our findings.
A measure of the population-level importance of a risk factor is provided by the attributable fraction,33 which can be interpreted as the proportion of cases in the population that could, in principle, be avoided if the risk of illness for those with risk alleles could be reduced to that of those without risk alleles. In our sample, estimates of population attributable fraction were in the range 10–20% for several of these risk alleles (Table 2). This suggests that variation at GABAA receptor genes makes an important contribution to the burden of this disease phenotype in the population.
None of the associated polymorphisms is predicted to cause a change in the amino-acid sequence of an encoded receptor protein, and there are no known coding variants that are sufficiently common to explain the observed association (by linkage disequilibrium). Thus, it is likely that the associations reflect pathologically relevant variants that modify gene expression and, hence, the subunit composition, and thereby the physiological and pharmacological properties, of the GABAA receptors. It will require other research approaches to confirm which subunit(s) are most relevant to bipolar illness and to identify the mechanisms involved.
Our findings have several implications. First, within our sample and at least with respect to GABAA receptor (dys)function, the cases meeting criteria for RDC SABP are more biologically homogeneous than our total sample of bipolar cases. This is consistent with emerging molecular genetic evidence for the existence of relatively specific genetic susceptibility for a form of major psychiatric illness that has features of both BD and schizophrenia.27, 36, 37 The RDC and other modern diagnostic criteria in psychiatry were developed on largely descriptive grounds and we consider it most unlikely that the SABP category will map directly onto the underlying biology. We do not believe that ‘schizoaffective disorder’ in general, or RDC schizoaffective disorder in particular, is a neatly defined, discrete, biological diagnostic entity. Our findings do, however, show that it can be useful for the purposes of research (and probably also clinical practice) to identify and classify together sets of cases with such clinical features. Whether, in the long run, this is best achieved by using categories, dimensions or some mixture of the two will require future study. Such further work aimed at refining the relationship between clinical phenotype and genetic risk factors has the potential to help psychiatry move towards a system of classification that relates more closely to underlying pathogenesis.
Second, our findings may help to explain some of the common and clinically important co-morbidity between BD and alcohol abuse, anxiety states and panic episodes,38 as these are disorders in which GABAergic transmission has been robustly implicated.17, 18 Indeed, it may soon be possible to start developing diagnostic classifications that group disorders together according to underlying pathogenesis.13 Such a move is likely to be beneficial for aetiological research as well as clinical management and would signal a shift from the current situation of a purely descriptive approach.
Finally, we note that our findings demonstrate the utility of a data-driven, iterative approach39 in biological studies of psychiatric, and other complex, disorders in which exploratory phenotype refinement is used to optimize an initial biological signal for further independent testing of specific hypotheses about the involvement of genes, proteins or systems. Here we have described use of this approach to perform independent tests within the same large data set. Provided close attention is paid to the comparability of phenotypic measures, the approach can also be used across independent data sets.
Disclosure/conflict of interest
We are indebted to all individuals who have participated in our research. Funding for recruitment and phenotype assessment has been provided by the Wellcome Trust and the Medical Research Council. We are grateful to Dr Shaun Purcell for advice and support in the use of the PLINK analysis software. The genotype analyses were funded by the Wellcome Trust and undertaken within the context of the Wellcome Trust Case Control Consortium (WTCCC). The members of the WTCCC are listed in online supplementary information.
About this article
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)