Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain, providing biological plausibility for the findings. Many findings have the potential to provide entirely new insights into aetiology, but associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that have important roles in immunity, providing support for the speculated link between the immune system and schizophrenia.
Core funding for the Psychiatric Genomics Consortium is from the US National Institute of Mental Health (U01 MH094421). We thank T. Lehner (NIMH). The work of the contributing groups was supported by numerous grants from governmental and charitable bodies as well as philanthropic donations. Details are provided in the Supplementary Notes. Membership of the Wellcome Trust Case Control Consortium and of the Psychosis Endophenotype International Consortium is provided in the Supplementary Notes.
Credible causal schizophrenia SNPs, coding variants, and eQTLs. Worksheet 1: Coding variants: Index SNP is the schizophrenia-associated SNP defining the schizophrenia-associated region. Coding variant, R2, and gene denote a coding credible SNP, its R² with the index SNP, and the gene containing the coding variant. CHR (chromosome), BP (base position), A1A2 (alleles 1 and 2), FRQ_A1 (frequency of allele 1), INFO (imputation quality) and P (P-value) refer to the index SNP in the discovery GWAS. P (incl rep) refers to the P-value for the index SNP including replication. Worksheets 2 and 3: Brain and blood eQTL: Credible SNP denotes a SNP within the schizophrenia credible set (defined in supplementary material) that is also a cis eQTL (transcript within 1 Mb, P(eQTL) < 1×10⁻⁴). P(cSCZ) is the schizophrenia (discovery) GWAS association P-value for the credible SNP. Prob(cSCZ) is the normalized probability of the credible variant being causal for schizophrenia. N(cSCZ) is the number of variants in the credible set of schizophrenia variants within a region spanned by eQTLs at P < 10⁻⁴. eQTL SNP is the most significant expression-associated SNP in the region for the gene in the next column (N.B., many regions have an eQTL for more than one gene). eQTLgene is the gene that is linked to the eQTL SNP. P(eQTL) is the association P-value between the eQTL SNP and the eQTLgene in the previous two columns. Prob(eQTL) is the normalized probability that the eQTL SNP is also the causal SNP for schizophrenia (higher values mean a higher probability of being causal). eQTLcumsum is the cumulative sum of the probabilities of all SNPs in the region, ordered by probability of being the functional SNP, up to and including the most significant eQTL SNP in the locus. PeQTL(SCZ) is the schizophrenia association P-value for the eQTL SNP. R2 (cSCZ/eQTL) is the R² between the credible SNP and the eQTL SNP. Associations to schizophrenia that are plausibly explained by an eQTL are in bold.
Separate worksheets provide information on brain and blood eQTL analyses. Distinct loci are alternately shaded/unshaded.
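The normalized probabilities and the cumulative sum described in this caption can be illustrated with a minimal sketch. It assumes each SNP in a region already has a non-negative weight proportional to its evidence of being causal; how those weights are derived is defined in the supplementary material and is not reproduced here. The function names, the SNP labels, and the coverage value are hypothetical and chosen only for illustration.

```python
def normalize(weights):
    """Scale non-negative per-SNP weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {snp: w / total for snp, w in weights.items()}


def ranked(probs):
    """Return (snp, probability) pairs, most probable first."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)


def cumulative_to(probs, target):
    """Cumulative probability over SNPs ranked by descending probability,
    up to and including `target` (analogous to the eQTLcumsum column)."""
    running = 0.0
    for snp, p in ranked(probs):
        running += p
        if snp == target:
            return running
    raise KeyError(target)


def credible_set(probs, coverage):
    """Smallest top-ranked set of SNPs whose probabilities reach `coverage`."""
    chosen, running = [], 0.0
    for snp, p in ranked(probs):
        chosen.append(snp)
        running += p
        if running >= coverage:
            break
    return chosen


# Hypothetical weights for three SNPs in one region (assumption, not real data).
weights = {"snpA": 5.0, "snpB": 3.0, "snpC": 2.0}
probs = normalize(weights)
print(probs)
print(cumulative_to(probs, "snpB"))
print(credible_set(probs, coverage=0.75))
```

A low cumulative sum at the eQTL SNP means that SNP ranks near the top of the region's candidates; a value close to 1 means most of the probability lies with other SNPs.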
Pathway analyses by ALIGATOR and INRICH. Enrichment analyses using ALIGATOR and INRICH were performed as described in Supplementary Text. Pathway ID denotes the pathway source: GO (Gene ontology; http://www.geneontology.org), KEGG (Kyoto Encyclopaedia of Genes and Genomes; http://www.genome.jp/kegg), PAN-PW (PANTHER; http://www.pantherdb.org/pathway), Reactome (http://www.reactome.org/download), BioCarta (downloaded from the Molecular Signatures Database v4.0 http://www.broadinstitute.org/gsea/msigdb/index.jsp), MGI (Mouse Genome Informatics; http://www.informatics.jax.org), and NCI pathways (NCI: http://pid.nci.nih.gov).
Risk Profile Score Analyses. Risk Profile Score (RPS) analysis was performed as described in supplementary text. The RPS datasets tab provides the name given to each sample in which RPS was performed (target label) and the datasets included (defined in Supplementary Table 1). The GWAS data used to define the risk alleles for each RPS analysis are the remaining GWAS samples. For various GWAS P-value thresholds (denoted PT), we calculated: 1) the significance of the case-control score difference (P tab), 2) the proportion of variance explained (Nagelkerke's R², R2 tab), 3) the proportion of variance on the liability scale explained by RPS (h2I tab), with standard error in brackets, 4) the area under the receiver operator characteristic curve (AUC tab), and 5) the odds ratio for the 10th RPS decile group compared with the lowest decile, with confidence interval in brackets. The Ncases tab denotes the number of cases in each target set.
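As a rough illustration of thresholded scoring and the AUC metric listed above, the sketch below sums risk-allele dosages weighted by the log odds ratio for every SNP whose discovery P-value falls below PT, then computes AUC as the probability that a randomly chosen case outscores a randomly chosen control. The weighting scheme, the strict `<` comparison, the function names, and all numbers are assumptions for illustration; the study's actual procedure is the one given in its supplementary text.

```python
import math


def risk_profile_score(dosages, snp_stats, p_threshold):
    """Sum of log(OR) * allele dosage over SNPs with P below the threshold.

    dosages:   {snp: count of the tested allele, 0..2} for one individual
    snp_stats: {snp: (odds_ratio, p_value)} from the discovery GWAS
    """
    score = 0.0
    for snp, (odds_ratio, p_value) in snp_stats.items():
        if p_value < p_threshold and snp in dosages:
            score += math.log(odds_ratio) * dosages[snp]
    return score


def auc(case_scores, control_scores):
    """Fraction of case/control pairs in which the case scores higher
    (ties count as half), equivalent to the area under the ROC curve."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical discovery statistics and one individual's dosages.
snp_stats = {"snp1": (1.10, 1e-9), "snp2": (0.95, 0.03), "snp3": (1.02, 0.4)}
dosages = {"snp1": 2, "snp2": 1, "snp3": 0}

for pt in (5e-8, 0.05, 1.0):
    print(pt, risk_profile_score(dosages, snp_stats, pt))
```

Relaxing PT admits more SNPs into the score, which is why each metric is reported separately per threshold.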
RPS analysis of MGS sample. Risk Profile Score (RPS) analysis was performed using the MGS dataset as target, using three distinct published sets of SCZ GWAS results: (1) the ISC (2009) study of 2615 cases and 3338 controls¹¹ (denoted ISC columns), (2) PGC1 (excluding MGS, denoted PGC1 columns) with 9320 cases and 10228 controls²², and (3) the current meta-analysis (excluding MGS, denoted Current columns) with 32838 cases and 44357 controls. For various GWAS P-value thresholds (denoted PT), we calculated: 1) the significance of the case-control score difference (P tab), 2) the proportion of variance explained (Nagelkerke's R², R2 tab), 3) the proportion of variance on the liability scale explained by RPS (h2I tab), with standard error in brackets, 4) the area under the receiver operator characteristic curve (AUC tab), and 5) the odds ratio for the 10th RPS decile group compared with the lowest decile, with confidence interval in brackets. The Ncases tab denotes the number of cases in each target set.
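The decile odds ratio in item 5 can be sketched with a simple 2×2 calculation. This is a minimal illustration, assuming deciles are formed by ranking all target individuals on their score and that the interval is a Wald (Woolf) 95% interval on the log odds ratio; the study may instead have used a regression-based estimate with covariates. The function name and the toy data are hypothetical.

```python
import math


def decile_odds_ratio(scores, is_case):
    """Odds of being a case in the top score decile relative to the bottom.

    Returns (odds_ratio, (lower, upper)) with an approximate 95% interval.
    """
    n = len(scores)
    size = n // 10
    if size == 0:
        raise ValueError("need at least 10 individuals")
    order = sorted(range(n), key=lambda i: scores[i])
    bottom, top = order[:size], order[-size:]

    def counts(indices):
        cases = sum(1 for i in indices if is_case[i])
        return cases, len(indices) - cases

    a, b = counts(top)      # cases, controls in the 10th decile
    c, d = counts(bottom)   # cases, controls in the lowest decile
    if 0 in (a, b, c, d):
        raise ValueError("empty cell: odds ratio is undefined")

    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, (lower, upper)


# Toy target set of 40 individuals (assumption, not study data):
# the lowest decile holds 1 case and 3 controls, the highest 3 cases and 1 control.
scores = list(range(40))
is_case = [True, False, False, False] + [False] * 32 + [False, True, True, True]
print(decile_odds_ratio(scores, is_case))
```

With real target sets the cells are much larger, so the interval is correspondingly narrower than in this toy example.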