Introduction

Schizophrenia is a severe psychiatric disorder with a lifetime risk of 1%. It is characterized by psychotic symptoms (delusions and hallucinations), apathy, altered emotional reactivity and disorganized behavior.1 Subtle cognitive and behavioral signs are often present from childhood onwards, but the characteristic features generally have their onset in the late teens and early 20s. Outcomes vary, but the typical course is one of relapses followed by only partial remission and a marked reduction in social and occupational functioning.

Schizophrenia has been the subject of an extensive series of family, twin and adoption studies (Figure 1), which clearly show that variance in population risk is largely the result of genetic factors, with heritability estimated at 80%. Patterns of recurrence in families do not suggest that schizophrenia in general can be attributed to a collection of single major genes, and are most compatible with a complex multi-locus model.3, 4 As for other disorders with complex inheritance,5 it has been increasingly postulated that most of the genetic variance will be attributable to weak genetic effects, as proposed many years ago,6 a hypothesis compatible with meta-analyses of the linkage data.7 Those hypotheses and data do not, however, preclude the existence of highly penetrant mutations in occasional families; indeed, there is now compelling evidence for several such events (St Clair et al.8 and see below).

Figure 1

Graph shows the risk of developing schizophrenia in the relatives of a schizophrenic proband. The data are based on the review of studies compiled by Gottesman.2

Given the increasing emphasis by most researchers on small genetic effects, it has been proposed that pure association approaches independent of genetic linkage offer the greatest potential in schizophrenia and other complex disorders,9 as they are better powered to detect such effects. However, the requirement to genotype hundreds of thousands of markers to attain even a modest degree of coverage of common human genomic variation has until recently restricted the application of pure association approaches to studies of functional candidate genes. Until now, the results of such studies have been disappointing, and in general, approaches based on position rather than on functional hypothesis appear to have met with greater success. In this review, our focus is on the results emerging from new approaches rather than on going over old ground. Nevertheless, it should be noted that several candidate susceptibility genes have been identified on the basis of an initial report of linkage, and for some of these, for example dysbindin (DTNBP1),10 neuregulin 1 (NRG1)11 and D-amino acid oxidase activator (DAOA),12 there is a substantial amount of support from follow-up studies, although the evidence is not decisive.13 Moreover, the gene encoding Disrupted in Schizophrenia 1 (DISC1) was also identified on the basis of positional data:15 it is disrupted by a balanced translocation that co-segregates with schizophrenia and other major mental disorders in a single family.8 Although subsequent population studies have not provided much evidence for additional associations with common variation in DISC1, it seems likely to be a true schizophrenia susceptibility gene, which illustrates well the desirability of retaining both linkage and association approaches as tools for gene discovery.
We will not consider the classic positional candidate genes further in this review, although we note that many potentially important insights and new hypotheses concerning the pathophysiology of schizophrenia have been gained through their study.13, 14, 16, 17 However, whether these insights are relevant to the disorder has yet to be confirmed, either by conclusive evidence supporting the status of these genes as susceptibility genes or, failing that, by a successful therapeutic intervention based on the insights.

Functional candidate genes

The pathophysiology of schizophrenia is largely unknown, but many hypotheses have been proposed. The choice of putative functional candidate genes is therefore almost unbounded by membership of any functional category. The largest number of studies relate to genes that encode proteins involved in neurotransmission, particularly dopaminergic and serotonergic, but it is difficult to think of any neurotransmitter system that has no plausibility, or that has not been the topic of investigation at one time or another. Reflecting the widely held neuro-developmental hypothesis of schizophrenia, the other popular category of candidate genes is that relating to brain development. As with other disorders, the quality of the functional candidate gene literature reflects the technology and genomic resources available at the time, with the vast majority of studies comprising tests of one or two variants in small samples. Even studies based on systematic mutation scanning have focused on subsets of the functional elements of genes, typically exons and promoter regions. Given these considerations, particularly the small genetic effect sizes (odds ratio (OR) <1.1) now thought typical of all but a small proportion of common susceptibility variants (Zeggini et al.18), in retrospect, it would have been surprising had the results been more reproducible than they have been. Negative findings abound, as do positive findings that have either failed to replicate or remain unconvincing because there have been few, if any, attempts to replicate them. The failure of candidate gene studies does not, however, invalidate the approach. As better information about the pathophysiology of schizophrenia emerges from the application of large samples to genome-wide association approaches, the application of those same large samples to detailed studies of candidate genes may meet with greater levels of success.

Meanwhile, to what extent any of the functional candidate data represent true positives is sufficiently debatable for there to be a wide spectrum of opinion concerning their interpretation. Nevertheless, there are a number of comments that can be made with considerable confidence. First, in schizophrenia nothing has emerged with support on a par with, for example, the evidence for CFH in macular degeneration,19 APOE in Alzheimer's disease20 or human leukocyte antigen in type I diabetes.21 Second, while most of the reported positives may not withstand testing in adequately powered samples, no functional candidate has yet been excluded from containing common risk alleles, let alone rare low penetrance alleles. Third, if as may be the case (see below), schizophrenia is a highly polygenic disorder, a substantial number of the hundreds of positive findings that are currently not well supported are likely to be true positives.

In the presence of a vast and inconsistent literature, a new approach to schizophrenia research has been the establishment of the SzGene database,22 the goal of which is to collate association data and, where possible, perform systematic meta-analyses. The resource is online and free to access (http://www.schizophreniaforum.org/res/sczgene/default.asp). On the basis of the April 2007 freeze, which contained data from 1179 publications relating to 3608 variants in 516 genes, the curators of the database performed meta-analyses where data were available from at least four case–control samples. Of 118 variants in 52 genes that met this criterion, 24 variants were nominally (P≤0.05) associated with schizophrenia. Only two of these were associated at P<0.001. One was in the gene encoding the dopamine D2 receptor (DRD2); the other was in TPH1, one of two genes that encode tryptophan hydroxylase, the rate-limiting enzyme in the synthesis of serotonin, a neurotransmitter whose function is targeted by several second-generation antipsychotics.

It may be relevant that both were among those single nucleotide polymorphisms (SNPs) that had been tested in the smallest numbers of subjects (473 and 879 cases, respectively). Nevertheless, the variant in DRD2, rs6277, has strong functional plausibility: although it is synonymous, it appears to alter mRNA stability and translational efficiency,23 with the associated allele conferring higher mRNA stability and translational efficiency. The gene itself is also a particularly appealing candidate because the hypothesis that psychotic symptoms arise from excess dopamine is one of the more convincing hypotheses relating to schizophrenia, and this specifically relates to dopamine activity at DRD2, as there is a strong correlation between the therapeutic efficacy of antipsychotic drugs and their affinity for this receptor.24 Although the most recent data in the SzGene database (Figure 2) continue to support the finding at DRD2, as does a recent meta-analysis for TPH1,25 both genes require additional support in larger samples before the findings can be viewed with confidence.

Figure 2

Example screenshot showing the graphical output from the SzGene website. In this example, the graph shows the pooled odds ratio estimates of all published association studies for polymorphism rs6277 in DRD2.
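
To make the idea of a pooled odds ratio concrete, the sketch below combines per-study odds ratios by fixed-effect inverse-variance weighting of the log odds ratios. It is an illustration only: the function name `pooled_odds_ratio` and the allele counts are hypothetical and invented for this example, and the weighting scheme is an assumption that may differ from the model actually used by the SzGene curators.

```python
import math


def pooled_odds_ratio(studies):
    """Fixed-effect (inverse-variance) pooled odds ratio.

    Each study is a tuple of allele counts:
    (case_risk, case_other, control_risk, control_other).
    Returns the pooled OR and its approximate 95% confidence interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in studies:
        log_or = math.log((a * d) / (b * c))
        # Woolf's approximation to the variance of the log odds ratio.
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        weight = 1 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    standard_error = math.sqrt(1 / total_weight)
    ci = (
        math.exp(pooled_log_or - 1.96 * standard_error),
        math.exp(pooled_log_or + 1.96 * standard_error),
    )
    return math.exp(pooled_log_or), ci


# Hypothetical allele counts for four case-control samples (not real data).
example_studies = [
    (120, 80, 100, 100),
    (210, 190, 180, 220),
    (95, 105, 90, 110),
    (300, 260, 270, 290),
]

pooled_or, (ci_low, ci_high) = pooled_odds_ratio(example_studies)
print(f"pooled OR = {pooled_or:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Larger samples have smaller variances and so receive more weight, which is why a pooled estimate can shift markedly when a few large studies are added to a meta-analysis built on small ones.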

The authors additionally ranked the associated genes using three sets of criteria, namely the amount of evidence, the consistency of the evidence and protection from bias.26 Among the associated genes, four SNPs met the most stringent category for each criterion. These SNPs were in the D1 dopamine receptor (DRD1), DTNBP1, TPH1 and methylenetetrahydrofolate reductase (MTHFR). However, a recent meta-analysis that included additional data27 no longer supports the marker in MTHFR, and therefore this gene will not be considered further.

The association at DRD1 together with the data on DRD2 would appear to provide etiological credibility to the dopamine hypothesis of schizophrenia, but it is premature to consider DRD1 a confirmed susceptibility gene. Not a single positive association study has yet been reported for this gene. Although this is not incompatible with a true finding of small effect, in the meta-analysis of Allen et al.22 the statistical support was weak (P=0.037) and would not even be gene-wide significant allowing for other variants at this locus. Moreover, the meta-analysis was based on only 725 cases and 1000 controls. In this context, it should be noted that another dopamine receptor, DRD3, previously showed somewhat stronger statistical support, which survived several waves of meta-analysis only for the significance to disappear as the sample size was increased to >10 000 subjects.28

DTNBP1, which codes for the dystrobrevin binding protein 1 (commonly known as dysbindin), is located on 6p24-22 within a schizophrenia linkage region10 and has been mentioned briefly earlier. The protein is expressed pre- and post-synaptically with high levels of expression in the cerebellum and the hippocampus.29 After the initial report of association,10 a large number of positive studies have been reported. However, there has been little consistency in the specific associated alleles or haplotypes,30 which suggests that if the association is correct, there is considerable variation in the linkage disequilibrium structure at the locus, or that there are multiple functional variants that contribute to the association. The greater consistency evident from the SzGene meta-analysis adds weight to the hypothesis that DTNBP1 is a true susceptibility gene for schizophrenia, although caveats remain concerning the modest support (P=0.003) from the meta-analysis. At a functional level, several of the associated haplotypes are associated with reduced DTNBP1 expression in the post-mortem brain31 compatible with other observations of underexpression of DTNBP1 in schizophrenia.29, 32 How DTNBP1 might be involved in schizophrenia is the topic of intense investigation, but leading hypotheses are that as a component of the biogenesis of lysosome-related organelles complex 1 (BLOC-1), variation in DTNBP1 might perturb pre-synaptic trafficking of neurotransmitter vesicles, in particular, those containing glutamate and dopamine33, 34 or impair internalization of dopamine DRD2 receptors.35

The development of the SzGene database is a welcome and timely addition to the schizophrenia genetics research environment, and represents a major step in facilitating an unbiased evaluation of the extensive genetic literature. The interested reader is therefore directed to a recent review by its curators in which some of its strengths and weaknesses are more extensively discussed.36

Genome-wide association studies

Genome-wide association studies (GWASs) have been widely embraced by those researching the genetics of complex disorders because they combine the power to detect small effects with the huge advantage of the positional genetics design, which requires no specific knowledge of pathogenesis. For some phenotypes, GWAS approaches have transformed our knowledge of genetic susceptibility,37 and the early evidence for schizophrenia suggests similar gains might accrue.

At the time of writing, there have been six published GWASs of schizophrenia, of which three have been based on DNA pooling and one has been limited to non-synonymous SNPs.38 In the first study based on individual genotyping (178 cases and 144 controls), the authors reported only the strongest result (P=3.7 × 10⁻⁷), which was equidistant (350 kb) from two genes, colony stimulating factor 2 receptor alpha (CSF2RA) and short stature homeobox isoform b (SHOX). CSF2RA is a subunit of a cytokine receptor, which regulates granulocytes and macrophages, whereas SHOX is a putative transcription factor that is not present in some people with short stature. The authors targeted CSF2RA and the neighboring interleukin 3 receptor alpha (IL3RA) for sequencing in a second small sample (71 cases and 31 controls). A number of significant associations were observed across the two genes, but to what extent these relate to the findings from the GWAS stage is unclear. A priori, replication in such a small sample would seem extremely unlikely unless the effect sizes are much larger than at a typical complex disease locus.

The second GWAS with individual genotypes39 was larger (738 cases and 733 controls), but none of the findings achieved genome-wide significance,40 and in the absence of follow-up data it is impossible to draw conclusions regarding any specific locus. It may be relevant that the sample was ethnically heterogeneous, which may have adversely impacted upon power.

The only other GWAS published with individual genotypes41 was based on an initial GWAS of 479 UK cases and 2937 UK controls, with sequential follow-up of loci surpassing a threshold of P<10⁻⁵ in up to 6829 cases and 9897 controls from Europe, the United States, Australia, Israel, Japan and China. Of 12 loci followed up, strong, replicated support (P<5 × 10⁻⁴) was observed for three, whereas another two were nominally significant. The distribution of test statistics for the 12 follow-up SNPs was highly improbable (P=9 × 10⁻⁸) under the null hypothesis that these were a random set of SNPs unrelated to disease, strongly supporting the existence of true associations. The best supported locus was in the vicinity of a putative transcription regulator (zinc finger protein 804A or ZNF804A). Although the evidence (P=1 × 10⁻⁷) for association fell short of genome-wide significance (P=5 × 10⁻⁸, Dudbridge and Gusnanto40), this threshold was surpassed when the affected phenotype was extended to include bipolar disorder (P=9 × 10⁻⁹), a phenotype for which there is prior evidence for shared genetic risk with schizophrenia.42 Thus, the ZNF804A locus is very likely a true susceptibility locus for schizophrenia and bipolar disorder, and the main challenge now is, under the assumption that it is a transcription factor, to understand what targets are regulated and which of these are relevant to pathogenesis.
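
As a rough guide to where a threshold of this magnitude comes from, the sketch below applies a simple Bonferroni correction and compares the two ZNF804A P-values quoted above against it. The figure of one million independent tests is an assumption made here for illustration; it is not necessarily how the cited threshold was derived, and the function name is hypothetical.

```python
FAMILY_WISE_ALPHA = 0.05
# Assumption for illustration: about one million independent common-variant tests.
ASSUMED_INDEPENDENT_TESTS = 1_000_000


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, ASSUMED_INDEPENDENT_TESTS)

# P-values for the ZNF804A locus as reported in the text.
reported_p_values = {
    "schizophrenia": 1e-7,
    "schizophrenia plus bipolar disorder": 9e-9,
}

for phenotype, p_value in reported_p_values.items():
    verdict = "surpasses" if p_value < threshold else "falls short of"
    print(f"{phenotype}: P={p_value:.0e} {verdict} the threshold of {threshold:.0e}")
```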

It has been argued that as schizophrenia is associated with reduced fecundity, common variants may not be relevant to its etiology, the idea being that selection pressure would remove such variants from the population. The identification of even one common susceptibility variant for schizophrenia now allows this hypothesis to be refuted, and it seems virtually inevitable that, because one common risk variant of small effect has escaped purifying selection (the OR for ZNF804A in the replication samples was 1.09), there will be many others. Power analysis of our study leads to similar conclusions. As we had power of only <0.001 to detect an effect of this size in any specific gene, it is probable that this low power was offset by the existence of hundreds of genes of similar effect size, one of which we detected.
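
The arithmetic behind this argument can be sketched as follows. The calculation assumes, purely for illustration, that the loci are independent and that each is detected with the same probability; the power of 0.001 is the upper bound quoted above, and the numbers of loci are hypothetical.

```python
def prob_at_least_one_detected(power_per_locus, n_loci):
    """Probability of detecting one or more loci, assuming independent loci
    that each have the same probability of being detected."""
    return 1 - (1 - power_per_locus) ** n_loci


POWER_PER_LOCUS = 0.001  # upper bound on per-locus power quoted in the text

# Hypothetical numbers of true risk loci of similar effect size.
for n_loci in (1, 100, 500, 1000):
    expected_hits = POWER_PER_LOCUS * n_loci
    chance = prob_at_least_one_detected(POWER_PER_LOCUS, n_loci)
    print(f"{n_loci:>5} loci: expected detections = {expected_hits:.2f}, "
          f"P(at least one) = {chance:.3f}")
```

With a single such locus, a detection would be a very unlikely event, whereas with several hundred loci a single detection becomes unremarkable, which is the sense in which the observed result points toward a highly polygenic architecture.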

Three GWASs38, 44, 45 were based on genotyping of pooled DNA samples, whereby DNA from all cases is mixed together and assayed quantitatively as a single sample, as is DNA from all controls, and the differences in allele frequency between the two pools are estimated. All three identified interesting rather than definitive findings, but of particular note, the study of Shifman and colleagues provided strong evidence for a female-specific association between reelin (RELN) and schizophrenia, a finding they were able to replicate in samples from Europe and the United States (meta-analysis P=8.8 × 10⁻⁷, uncorrected for models and gender-specific tests). Although awaiting further replication, the association with reelin is compatible with the widely held hypothesis of schizophrenia as a neuro-developmental disorder, reelin being involved in corticogenesis and also being associated with an autosomal-recessive form of lissencephaly.46

The total combined sample sizes reported thus far for which there are GWAS data based on individual genotyping are still small, numbering fewer than 1500 cases, and power to capture small genetic effects is still limited. Nevertheless, the findings provide cause for optimism that larger scale endeavors will meet with success in identifying additional loci. One caveat with respect to these larger endeavors is that unlike almost all other medical disorders, the diagnosis of schizophrenia is unsupported by validating laboratory tests. As a result, it is at least conceivable that efforts to extract more power by combining multiple samples might be hampered by increases in the amount of heterogeneity present.47 To what extent this concern will operate in practice is a real, but open question.

Copy number variants

It is now clear that copy number variants (CNVs) are ubiquitous in the population, a conclusion that stands even though early studies appear to have overestimated the physical size of the sequences contained within each CNV.48 This has led to the proposition that, in addition to their known role in a number of fairly uncommon syndromes, CNVs might contribute to more common phenotypes. Among these, neuro-developmental disorders including autism and mental retardation (MR) have been the most prominent focus of CNV research, and the first generation of genome-wide CNV scans has shown that de novo CNV mutations of about 100 kb or more contribute to about 10% of cases of these disorders (for example, Sebat et al.49 and Autism Genome Project Consortium50).

The idea that a proportion of schizophrenia might be caused by CNVs has been around for some time; indeed, until recently, this was the only class of DNA variation for which a pathogenic role was virtually undisputed. Thus, the 22q11 deletion (22q11DEL) gives rise to a dramatic increase in the risk of schizophrenia,51, 52 along with the other variable features of the velo-cardio-facial syndrome. To date, several specific schizophrenia risk genes within the deleted region have been suggested, but as yet none is supported by multiple independent replications.

There is now compelling evidence that additional CNVs contribute to schizophrenia. Kirov and colleagues43, 44 detected a single de novo event after scanning 93 subjects: a duplication of 1.4 Mb of chromosome 15. They also observed a rare transmitted deletion that affects part of neurexin 1 (NRXN1). Although not de novo, it was also carried by an affected sibling and was of particular interest as deletions of NRXN1 had been observed in both autism and MR (for example, Autism Genome Project Consortium50). Moreover, an NRXN1 CNV has been independently observed in a schizophrenic proband,53 and a large consortium has subsequently shown an increased rate of CNVs spanning exons of NRXN1 in schizophrenia.54 Thus, there is now strong evidence that reduced function of NRXN1 increases risk of a number of neuro-developmental disorders, including schizophrenia. Interestingly, the de novo duplication reported by Kirov et al.43, 44 also included the gene encoding amyloid-β A4 precursor protein-binding (APBA2), a protein with which NRXN1 interacts. Together, NRXN1 and APBA2 play a role in synaptic development and function, processes that are likely fundamental to schizophrenia etiology.

De novo CNV events were also uncommon in a study of childhood onset schizophrenia53 in stark contrast to another study where a comparison of the de novo rate in cases and controls revealed a rate of about 10% in cases and <2% in controls.55 The design of the study of Kirov and colleagues is very likely to underestimate the de novo rate, but it is not evident why the other two studies differ.53, 55 Which is the more representative remains to be elucidated.

Although the rate of de novo CNVs in schizophrenia is unresolved, three studies have now shown that the total burden (number of events or number of genes spanned by those events) of rare CNVs is increased in the disorder.53, 56, 57 Moreover, two of these have been large enough to identify association with specific rare loci as well as with total burden, and here the findings are in striking agreement.56, 57 The same two deletions were associated with schizophrenia in both studies, one at 1q21.1 (position 142.5–145.5 Mb), the other at 15q13.2 (28–31 Mb). The CNVs were rare and had strong effects on disease risk, with ORs of 6.6 and 17.9, respectively. This is now becoming a familiar pattern, and each CNV locus has also been associated with other phenotypes—in particular, the 1q21.1 locus is associated with microcephaly, various physical abnormalities and neuro-developmental abnormalities, including MR, autism and attention deficit hyperactivity disorder (ADHD).58 Each CNV spans multiple genes that are clearly high-priority candidate genes for the disorder, although it remains to be proven that more subtle variation within genes mapping to these regions contributes to disease susceptibility.

Despite the strong evidence for the involvement of specific CNV loci, it is clear that none of them is sufficient to cause disease, as each has been observed in individuals with no history of psychosis. Thus, although at least some of the loci are of fairly substantial effect, expression of disorder might still require environmental contributions or additional risk variants. Similar uncertainties surround the interpretation of the excess burden analysis. Those results could, in principle, be attributed to a number of rare highly penetrant CNVs or to more subtle effects. Finally, it is important to appreciate that we are as yet uncertain to what extent CNVs contribute to disorder. Currently, the specific associated CNVs are involved in only about 1% of cases, although this figure will probably increase as better-resolution platforms are deployed. These uncertainties aside, the early work shows CNV analysis has the potential to pinpoint novel pathogenic mechanisms and provides a strong rationale for further work in this area as a complementary approach to more traditional SNP-based analyses.

Conclusions

Association studies in schizophrenia have evolved in parallel with genome analysis technology, and mirror developments in complex disease genetics as a whole. Until recently, although there have been a number of findings, particularly from positional approaches, for which the evidence is credible and which are very likely to include true susceptibility genes, there have been no findings for which the support is unequivocal and which point to specific alleles or haplotypes. Some of the more robust findings for other disorders doubtless reflect the fact that for some loci, the effect sizes are at the larger end of the spectrum seen in common disease, but most of the success is recent and has come from GWAS applied to large patient and control samples and, of vital importance, follow-up analyses in even larger samples. Such approaches are currently being applied to schizophrenia and early results are promising. The application of genome-wide approaches has also revealed that sub-microscopic chromosomal abnormalities play a role in at least some cases, and the proportion of cases attributable to them will probably increase with the use of technologies with higher resolution.48 The foregoing suggests that efforts in the genetics of schizophrenia are bearing fruit, and there are strong grounds for optimism that research in this disorder is poised to benefit further from the same advances in genomics and post-genomics that are being successfully applied to common non-psychiatric disorders. The success of this work will depend, among other things, on the assembly of large, well-phenotyped patient samples, effective collaboration and sharing of patient resources, and the ability to handle and analyze increasingly large and complex data sets.
Moreover, as more risk loci are identified, there will be an increasing imperative to focus on translating genetic findings into clinical benefit through a greater understanding of pathogenic mechanisms, improved classification and the development of new interventions.