The Wellcome Trust Case Control Consortium (WTCCC)1 recently published a landmark study in the genetics of common diseases, reporting genome-wide association (GWA) results in ∼2000 Caucasian patients for each of seven common diseases and 3000 shared controls, each genotyped on an Affymetrix 500K platform of evenly distributed markers. Exceedingly rigorous quality control and statistical analyses were performed. Multiple earlier association hypotheses were supported (with significance defined as a nominal P-value <10⁻⁷) in several of these diseases. In bipolar disorder (BD), only one association met the criterion, and it did not correspond to any previous association report.
The hypothesis tested in a GWA is that somewhere in the genome, in detectable linkage disequilibrium with the tested markers, there is at least one allele or genotype that is significantly associated with the studied disease after correction for the number of tests performed. A key assumption of GWA is that a priori power to detect association is large enough for positive results to be credible (and for an absence of positive results to be notable). Altshuler and Daly,2 commenting on a priori power to detect association in the WTCCC study, emphasize that single-nucleotide polymorphisms (SNPs) with modest effect (modest odds ratio (OR)) are unlikely to be detected in data sets of the WTCCC size, even though some have in fact been detected.
If we assume a significance threshold of P<10⁻⁷, linkage disequilibrium between marker and disease allele of r²=0.8, and unscreened controls, the WTCCC study had 80% power to detect ORs ⩾1.6, 1.4 and 1.5 for allele frequencies of 0.1, 0.5 and 0.75, respectively. To detect ORs ⩾1.2, it would take ∼12 000, 5000 and 7000 cases (with 1.5 times as many controls) for the same allele frequencies. For a significance threshold of P<10⁻⁶ and ORs ⩾1.2, 80% power would require ∼11 000, 4000 and 6000 cases for these allele frequencies. Given the paucity of findings so far in BD, the implication of these calculations is that even larger sample sizes than were assembled by the WTCCC will be needed to detect genes with modest effect.
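The kind of power calculation described above can be sketched in a few lines. This is a minimal illustration, not the method used for the figures in the text: it assumes a two-sided allelic (two-proportion) z-test, treats the quoted allele frequency as the control frequency, ignores the dilution caused by unscreened controls, and approximates imperfect linkage disequilibrium by scaling the test statistic by the square root of r². The function names and the 100-case search step are hypothetical, and the output is not expected to reproduce the quoted numbers exactly.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_power(n_cases, n_controls, control_freq, odds_ratio,
                  r2=0.8, alpha=1e-7):
    """Approximate power of a two-sided allelic z-test (assumed model)."""
    p_case = case_allele_freq(control_freq, odds_ratio)
    # Each individual contributes two alleles to the comparison.
    variance = (p_case * (1.0 - p_case) / (2 * n_cases)
                + control_freq * (1.0 - control_freq) / (2 * n_controls))
    # Assumption: imperfect LD shrinks the expected z statistic by sqrt(r2).
    shift = sqrt(r2) * abs(p_case - control_freq) / sqrt(variance)
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2.0)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def cases_needed(control_freq, odds_ratio, target=0.8,
                 controls_per_case=1.5, step=100, max_cases=1_000_000,
                 r2=0.8, alpha=1e-7):
    """Smallest case count (in multiples of `step`) reaching the target power.

    Returns None if the target is not reached by `max_cases`.
    """
    n = step
    while n <= max_cases:
        n_controls = int(n * controls_per_case)
        if allelic_power(n, n_controls, control_freq, odds_ratio,
                         r2=r2, alpha=alpha) >= target:
            return n
        n += step
    return None


if __name__ == "__main__":
    for freq in (0.1, 0.5, 0.75):
        power = allelic_power(2000, 3000, freq, 1.5)
        needed = cases_needed(freq, 1.2)
        print(f"freq={freq}: power at OR 1.5 = {power:.2f}, "
              f"cases needed for OR 1.2 = {needed}")
```

The sketch makes the main dependencies visible: the required sample size grows steeply as the OR approaches 1 and as the allele frequency moves away from 0.5.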
Modest risk ratios can result from multiple disease genes and allelic heterogeneity (more than one disease-associated allele at the same locus or haplotype), since each allele of each gene must generally be detected separately. Epistasis and non-analyzed environmental factors may also reduce the OR of a true disease allele.
Some types of hypothesis have not, so far, been tested in the WTCCC data. The first type includes multi-marker hypotheses (including association with haplotypes and gene–gene interaction), copy number variation, subphenotype hypotheses and uncommon/rare-variants association.
A debate over whether uncommon alleles may be expected in common disease has gone on for some years.3, 4, 5 Nonetheless, the field would be unwise not to consider methods for detecting them. Multi-marker haplotypes6 or analysis of patterns of segmental sharing7 may succeed in detecting association with some uncommon disease alleles, but resequencing appears likely to be required to detect at least some, and possibly most, uncommon disease alleles, and certainly to detect rare variants associated with disease.8
Second, hypotheses based on linkage, candidate genes and other non-genome-wide hypotheses may be tested using data from a GWA experiment, but they are logically and statistically different from GWA. There are legitimate assumptions based on pre-existing linkage, association and aneuploidy results that would generate association experiments with considerably smaller probability spaces than a GWA experiment (that is, there is no probability of outcome defined in parts of the genome not included in the hypothesis). The nominal P-values required for statistical significance would be, therefore, less stringent than for a genome-wide experiment. Nonetheless, one must always be cautious about hypotheses that could conceivably be tailored to known results after an experiment. Repeated replication, independent biological corroboration and meta-analyses are more needed for a regional or gene-based hypothesis than for a straightforward GWA result.
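The effect of a smaller probability space on the required nominal P-value can be illustrated with a simple Bonferroni correction. This is only a sketch under stated assumptions: the text does not say which correction applies, the family-wise error rate of 0.05 is assumed, and the 2000-marker region is a hypothetical example rather than a figure from any study.

```python
def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test P-value threshold under an assumed Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Roughly 500,000 markers, as on a 500K genotyping platform.
genome_wide = bonferroni_threshold(500_000)
# Hypothetical regional hypothesis covering 2,000 markers.
regional = bonferroni_threshold(2_000)

if __name__ == "__main__":
    print(f"genome-wide threshold: {genome_wide:.1e}")
    print(f"regional threshold:    {regional:.1e}")
```

Under this assumption, 500,000 tests give a per-test threshold of about 10⁻⁷, while the hypothetical regional hypothesis tolerates a nominal P-value 250 times larger. That is why the hypothesis must be fixed before the data are examined.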
Bipolar disorder in the WTCCC has one significant GWA result, on 16p12, at SNP rs420259. This result is not terribly distant (11 Mb) from the peak of a non-parametric linkage report of the NIMH collaborative study9 and is 18 Mb from the peak for a parametric linkage report.10 However, there are multiple linkage scans that do not show this linkage. The 16p12 region does not appear positive in the major meta-analyses of BD linkage.11, 12, 13 Lack of persuasive linkage evidence does not invalidate an association finding, of course.
Recently, Baum et al.,14 in the McMahon lab at NIMH Intramural (USA), reported a two-stage GWA study of BD that pooled Caucasian samples and controls from an NIMH study in the first stage. Single-nucleotide polymorphisms with nominal P<0.05, OR >1.4 and near a known gene were tested on German cases and controls in a second stage. Genotyping was on an Illumina platform, whereas the WTCCC used an Affymetrix platform. The total numbers of cases and controls, and the Caucasian samples, were less than but comparable to the WTCCC BD and control samples. Single-nucleotide polymorphisms that showed association findings of P<0.05 in both sets of pooled case and control samples were studied with individual genotyping. Individual genotyping association results in Baum et al. were significant (P<10⁻⁷) for rs1012053, which is in the DGKH gene on chromosome 13q.14.
It is disappointing to note the lack of correspondence between the results of the two published GWA papers. None of the alleles selected for individual genotyping on the basis of the pooling experiment (in the Baum et al. study) has a suggestive P-value at or very close to the significant or suggestive SNPs in the WTCCC study. To look for overlap, we treated the P-values of the two studies as three samples (NIMH (pooled), German (pooled) and WTCCC), retained the SNPs with P<0.05 in all three samples and calculated a combined probability using Fisher's χ² method. Wherever possible, imputed genotypes from the WTCCC study (with its Affymetrix platform) were used to give genotypic and allelic frequencies for corresponding SNPs in the Baum study (with its Illumina platform). No combined values had P<10⁻⁶. The two ‘best’ values were for rs10791345 and rs4806874, with P=5 × 10⁻⁶ and 9 × 10⁻⁶, respectively. In the blogosphere (http://www.genetics.med.ed.ac.uk/blog/, http://www.polygenicpathways.co.uk/Bipolargenes.html), the two studies are interpreted as showing considerable overlap, but this is not statistically correct. In the Schizophrenia Research Forum (http://www.schizophreniaforum.org/res/sczgene/default.asp), the ‘best’ agreement was reported to be for the DFNB31 gene on chromosome 9, where the G allele of rs10982246 has a P-value of 2.6 × 10⁻⁶ in the WTCCC study (using a trend test), and the G allele of rs942518 (only 22 kb away) has a P-value of 0.0001 in the combined samples of the Baum et al. study. The imputed WTCCC G allele of rs942518 gives an association P-value of 0.43, however, so these data cannot be counted as a consistent association between the two studies.
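For readers unfamiliar with Fisher's combined probability, the calculation can be sketched as follows. The statistic is −2 times the sum of the natural logs of the k P-values, referred to a χ² distribution with 2k degrees of freedom. For even degrees of freedom the tail probability has a closed form, so no statistics library is needed. The function names and the example P-values are hypothetical, and the method assumes that the samples are independent.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's chi-square method."""
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one P-value, each in (0, 1]")
    # Half of the chi-square statistic: X / 2 = -sum(ln p).
    half_stat = -sum(log(p) for p in p_values)
    # Chi-square tail with 2k df: exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return exp(-half_stat) * total


def combine_if_nominal(p_values, cutoff=0.05):
    """Combine only when every sample passes the nominal cutoff."""
    if all(p < cutoff for p in p_values):
        return fisher_combined_p(p_values)
    return None


if __name__ == "__main__":
    # Made-up P-values for one SNP in three samples (illustration only).
    example = [0.01, 0.03, 0.002]
    print(combine_if_nominal(example))
    # One sample fails the P<0.05 filter, so nothing is combined.
    print(combine_if_nominal([0.01, 0.43, 0.002]))
```

A single P-value is returned unchanged, which is a convenient sanity check on the implementation.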
The design differences between the WTCCC and Baum et al. studies could have contributed to the discrepant outcomes. If this is the case, then the ongoing GWA studies of thousands of individuals from the NIMH samples in the US should yield results more similar to those of the WTCCC. These ongoing studies are based on the same sample source (NIMH) as Baum et al., but have larger sample sizes, and will perform individual genotyping on an Affymetrix platform comparable to that of the WTCCC.
But we suspect that the lack of consistent BD associations is due to the nature of the underlying genes. As noted above, there are a number of genetic analyses that have not yet been performed, including set-based analyses.7 For discovery of individual low-OR loci, the only systematic solution would be much larger samples, according to the discussions above and in Altshuler and Daly.2 For smaller samples, true positives may be detected but not replicated in other, similar-sized samples, and this may have led to the discrepancies between the two BD GWA publications.
For uncommon and rare-variants association, extensive resequencing in selected regions may be required. It is also possible that phenotypic refinements are needed, and that these may be generated by multivariate analysis of clinical data already present in the NIMH databases, or by biological studies of new individuals who volunteer for these large-scale samples.
1. The Wellcome Trust Case Control Consortium. Genome-wide association study of 14 000 cases of seven common diseases and 3000 shared controls. Nature 2007; 447 (7 June 2007): 661–683.
2. Altshuler D, Daly M. Guilt beyond a reasonable doubt. Nat Genet 2007; 39: 813–815.
3. Pritchard JK. Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet 2001; 69: 124–137.
4. Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease-common variant…or not? Hum Mol Genet 2002; 11: 2417–2423.
5. Reich DE, Lander ES. On the allelic spectrum of human disease. Trends Genet 2001; 17: 502–510.
6. Lin S, Chakravarti A, Cutler DJ. Exhaustive allelic transmission disequilibrium tests as a new approach to genome-wide association studies. Nat Genet 2004; 36: 1181–1188.
7. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet 2007; 81: 559–575.
8. Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA et al. An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets. Nat Genet 2005; 37: 1320–1322.
9. Dick DM, Foroud T, Edenberg HJ, Miller M, Bowman E, Rau NL et al. Apparent replication of suggestive linkage on chromosome 16 in the NIMH genetics initiative bipolar pedigrees. Am J Med Genet 2002; 114: 407–412.
10. Ekholm JM, Kieseppa T, Hiekkalinna T, Partonen T, Paunio T, Perola M et al. Evidence of susceptibility loci on 4q32 and 16p12 for bipolar disorder. Hum Mol Genet 2003; 12: 1907–1915.
11. Badner JA, Gershon ES. Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol Psychiatry 2002; 7: 405–411.
12. McQueen MB, Devlin B, Faraone SV, Nimgaonkar VL, Sklar P, Smoller JW et al. Combined analysis from eleven linkage studies of bipolar disorder provides strong evidence of susceptibility loci on chromosomes 6q and 8q. Am J Hum Genet 2005; 77: 582–595.
13. Segurado R, Detera-Wadleigh SD, Levinson DF, Lewis CM, Gill M, Nurnberger Jr JI et al. Genome scan meta-analysis of schizophrenia and bipolar disorder, part III: Bipolar disorder. Am J Hum Genet 2003; 73: 49–62.
14. Baum AE, Akula N, Cabanero M, Cardona I, Corona W, Klemens B et al. A genome-wide association study implicates diacylglycerol kinase eta (DGKH) and several other genes in the etiology of bipolar disorder. Mol Psychiatry 2007 (e-pub ahead of print).
Cite this article
Gershon, E., Liu, C. & Badner, J. Genome-wide association in bipolar. Mol Psychiatry 13, 1–2 (2008). https://doi.org/10.1038/sj.mp.4002117