A polygenic burden of rare disruptive mutations in schizophrenia


Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.

Figure 1: Composite set gene set analysis, stratified by mutation type.


  1. Sullivan, P. F., Kendler, K. S. & Neale, M. C. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch. Gen. Psychiatry 60, 1187–1192 (2003)
  2. Lichtenstein, P. et al. Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study. Lancet 373, 234–239 (2009)
  3. Ripke, S. et al. Genome-wide association analysis identifies 13 new risk loci for schizophrenia. Nature Genet. 45, 1150–1159 (2013)
  4. Levinson, D. F. et al. Copy number variants in schizophrenia: confirmation of five previous findings and new evidence for 3q29 microdeletions and VIPR2 duplications. Am. J. Psychiatry 168, 302–316 (2011)
  5. Sullivan, P. F., Daly, M. J. & O’Donovan, M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nature Rev. Genet. 13, 537–551 (2012)
  6. Stone, J. L. et al. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237–241 (2008)
  7. Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009)
  8. Lee, S. H. et al. Estimating the proportion of variation in susceptibility to schizophrenia captured by common SNPs. Nature Genet. 44, 247–250 (2012)
  9. Malhotra, D. & Sebat, J. CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell 148, 1223–1241 (2012)
  10. Psychiatric GWAS Consortium Bipolar Disorder Working Group. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nature Genet. 43, 977–983 (2011)
  11. Moreno-De-Luca, D. et al. Deletion 17q12 is a recurrent copy number variant that confers high risk of autism and schizophrenia. Am. J. Hum. Genet. 87, 618–630 (2010)
  12. Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet 381, 1371–1379 (2013)
  13. Kirov, G. et al. De novo CNV analysis implicates specific abnormalities of postsynaptic signalling complexes in the pathogenesis of schizophrenia. Mol. Psychiatry 17, 142–153 (2012)
  14. Need, A. C. et al. Exome sequencing followed by large-scale genotyping suggests a limited role for moderately rare risk factors of strong effect in schizophrenia. Am. J. Hum. Genet. 91, 303–312 (2012)
  15. Crowley, J. J. et al. Deep resequencing and association analysis of schizophrenia candidate genes. Mol. Psychiatry 18, 138–140 (2013)
  16. Takata, A. et al. A population-specific uncommon variant in GRIN3A associated with schizophrenia. Biol. Psychiatry 73, 532–539 (2013)
  17. Timms, A. E. et al. Support for the N-methyl-d-aspartate receptor hypofunction hypothesis of schizophrenia from exome sequencing in multiplex families. JAMA Psychiatry 3, 1–9 (2013)
  18. Girard, S. L. et al. Increased exonic de novo mutation rate in individuals with schizophrenia. Nature Genet. 43, 860–863 (2011)
  19. Xu, B. et al. De novo gene mutations highlight patterns of genetic and neural complexity in schizophrenia. Nature Genet. 44, 1365–1369 (2012)
  20. de Ligt, J. et al. Diagnostic exome sequencing in persons with severe intellectual disability. N. Engl. J. Med. 367, 1921–1929 (2012)
  21. Rauch, A. et al. Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 380, 1674–1682 (2012)
  22. Iossifov, I. et al. De novo gene disruptions in children on the autistic spectrum. Neuron 74, 285–299 (2012)
  23. Neale, B. M. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245 (2012)
  24. O’Roak, B. J. et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250 (2012)
  25. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012)
  26. Keinan, A. & Clark, A. G. Recent explosive human population growth has resulted in an excess of rare genetic variants. Science 336, 740–743 (2012)
  27. Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011)
  28. Liu, L. et al. Analysis of rare, exonic variation amongst subjects with autism spectrum disorders and population controls. PLoS Genet. 9, e1003443 (2013)
  29. Kiezun, A. et al. Exome sequencing and the genetic basis of complex traits. Nature Genet. 44, 623–630 (2012)
  30. Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature http://dx.doi.org/10.1038/nature12929 (this issue)
  31. Linderholm, K. R. et al. Increased levels of kynurenine and kynurenic acid in the CSF of patients with schizophrenia. Schizophr. Bull. 38, 426–432 (2012)
  32. Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium. Genome-wide association study identifies five new schizophrenia loci. Nature Genet. 43, 969–976 (2011)
  33. Bayés, À. et al. Characterisation of the proteome, diseases and evolution of the human postsynaptic density. Nature Neurosci. 14, 19–21 (2011)
  34. Dolmetsch, R. E., Xu, K. & Lewis, R. S. Calcium oscillations increase the efficiency and specificity of gene expression. Nature 392, 933–936 (1998)
  35. Yasuda, R., Sabatini, B. L. & Svoboda, K. Plasticity of calcium channels in dendritic spines. Nature Neurosci. 6, 948–955 (2003)
  36. Müller, C. S. et al. Quantitative proteomics of the Cav2 channel nano-environments in the mammalian brain. Proc. Natl Acad. Sci. USA 107, 14950–14957 (2010)
  37. Komiyama, N. H. et al. SynGAP regulates ERK/MAPK signaling, synaptic plasticity, and learning in the complex with postsynaptic density 95 and NMDA receptor. J. Neurosci. 22, 9721–9732 (2002)
  38. Berryer, M. H. et al. Mutations in SYNGAP1 cause intellectual disability, autism, and a specific form of epilepsy by inducing haploinsufficiency. Hum. Mutat. 34, 385–394 (2013)
  39. Betancur, C. Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting. Brain Res. 1380, 42–77 (2011)
  40. Darnell, J. C. et al. FMRP stalls ribosomal translocation on mRNAs linked to synaptic function and autism. Cell 146, 247–261 (2011)
  41. Callan, M. A. & Zarnescu, D. C. Heads-up: new roles for the fragile X mental retardation protein in neural stem and progenitor cells. Genesis 49, 424–440 (2011)
  42. Ascano, M., Jr et al. FMRP targets distinct mRNA sequence elements to regulate protein expression. Nature 492, 382–386 (2012)
  43. Migaud, M. et al. Enhanced long-term potentiation and impaired learning in mice with mutant postsynaptic density-95 protein. Nature 396, 433–439 (1998)
  44. Fernández, E. et al. Targeted tandem affinity purification of PSD-95 recovers core postsynaptic complexes and schizophrenia susceptibility proteins. Mol. Syst. Biol. 5, 269 (2009)
  45. Nithianantharajah, J. et al. Synaptic scaffold evolution generated components of vertebrate cognitive complexity. Nature Neurosci. 16, 16–24 (2013)
  46. Steward, O. & Worley, P. F. Selective targeting of newly synthesized Arc mRNA to active synapses requires NMDA receptor activation. Neuron 30, 227–240 (2001)
  47. Waltereit, R. et al. Arg3.1/Arc mRNA induction by Ca2+ and cAMP requires protein kinase A and mitogen-activated protein kinase/extracellular regulated kinase activation. J. Neurosci. 21, 5484–5493 (2001)
  48. Albrechtsen, A. et al. Exome sequencing-driven discovery of coding polymorphisms associated with common metabolic phenotypes. Diabetologia 56, 298–310 (2013)
  49. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler Transform. Bioinformatics 25, 1754–1760 (2009)
  50. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genet. 43, 491–498 (2011)

We are grateful for the participation of all subjects contributing to this research, and to the collection team that worked to recruit them: E. Flordal-Thelander, A.-B. Holmgren, M. Hallin, M. Lundin, A.-K. Sundberg, C. Pettersson, R. Satgunanthan-Dawoud, S. Hassellund, M. Rådstrom, B. Ohlander, L. Nyrén and I. Kizling. We acknowledge funding support from National Institutes of Health (NIH)/National Institute of Mental Health (NIMH) ARRA Grand Opportunity grant NIMH RC2 MH089905 (S.M.P., P.S.), the Sylvan Herman Foundation, the Stanley Center for Psychiatric Research, the Stanley Medical Research Institute, NIH/National Human Genome Research Institute (NHGRI) grant U54HG003067 (E.S.L.), NIH/NIMH grant R01 MH095088 (S.J.H.), NIH/NIMH grant R01 MH091115 (S.J.H.), the Tau Consortium (S.J.H.), NIH/NIMH grant R01 MH099126 (S.M.P.), NIH/NHGRI grant R01 HG005827 (S.M.P.), NIH/NIMH grant R01 MH077139 (P.F.S.), NIH/NIMH grant R01 MH095034 (P.S.), NIH/NIMH grant T32 MH017119 (L.D.), the Friedman Brain Institute at Mount Sinai School of Medicine, the Karolinska Institutet, Karolinska University Hospital, the Swedish Research Council, an ALF grant from Swedish County Council, the Söderström Königska Foundation, the Netherlands Scientific Organization (NWO 645-000-003), the Wellcome Trust, Genes to Cognition Program, The Medical Research Council and European Union projects GENCODYS no. 241995, EUROSPIN no. 242498 and SYNSYS no. 242167 (E.F., M.O.C., N.H.K., J.S.C., S.G.N.G.). Work at the Icahn School of Medicine at Mount Sinai was also supported by the Institute for Genomics and Multiscale Biology (including computational resources and staff expertise provided by the Department of Scientific Computing). The funders had no role in study design, execution, analysis or manuscript preparation.

Author information




Project leadership: S.M.P., J.L.M., P.F.S., S.A.M., C.M.H., P.S. Sample collection and phenotyping: A.K., P.K.E.M., P.F.S., K.C., J.L.M., C.M.H. Sample processing and data management: K.C., D.R., M.F., J.L.M. Sequencing and variant calling: M.d.P., E.B., K.S., K.G., T.F., S.G. Primary statistical analysis: D.R., M.F., S.M.P. Additional analyses: L.D., E.S., G.G., S.J.H., N.S., P.R., C.O’D., S.E.B. Determination of synaptic gene sets: S.J.H., E.F., M.O.C., N.H.K., J.S.C., S.G.N.G. Interpretation of main findings: S.M.P., E.M.S., E.S.L., S.J.H., M.F., P.F.S., S.A.M., P.S. Primary drafting of manuscript: S.M.P., P.F.S., S.A.M., E.S.L., P.S. Production and approval of the final manuscript: all authors.

Corresponding author

Correspondence to Shaun M. Purcell.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Ancestry and association summaries.

a, Multidimensional scaling plot of ancestry in the full Swedish GWAS sample, in which each point represents one individual; the left panel superimposes HapMap CEU and Finnish samples and the right panel highlights (in green) the subset of the full Swedish sample for whom we have exome sequence data. b, Quantile–quantile plot for gene-based SKAT results (MAF < 5% coding variants). Similar, or more conservative, profiles were obtained for other subsets of variants. c, Case enrichment of rare (MAF < 0.1%) and singleton disruptive mutations for the constituent sets of the primary/schizophrenia gene set (top panel in green) and the secondary (autism/intellectual disability) gene set (bottom panel in orange). The primary set is enriched in cases (MAF < 0.1%; disruptive mutations, P = 10^−4; singletons, P = 8 × 10^−4; significant after correction for multiple testing) whereas the autism/intellectual disability set shows only a modest trend (P = 0.04 for MAF < 0.1% and P = 0.03 for singletons) and is not significant after correction. x axis represents −log10(P); OR, odds ratio. Number of genes is for total in the set (whether or not they had a rare variant).

Extended Data Figure 2 Genic and phenotypic subset analyses for the composite set.

a, Individual gene-ranking of composite set genes. Genes are ranked by their case burden of rare disruptive mutations, from left to right, for the composite set. The squares along the bottom indicate to which sets each gene belongs. The red and blue triangles represent case and control counts for each gene. The lines above represent the statistical significance of the best test for this set: that is, the significance of the top K genes, evaluated by permutation. The black line represents results for the real data (disruptive MAF < 0.1% composite set analysis). The orange line represents the dummy condition, in which we artificially constructed a set in which the number of genes, statistical enrichment, odds ratio and case/control counts were similar to the real composite set. However, this set included the 25 top-ranked genes from individual gene-based tests (disruptive MAF < 0.1% variants), with the remainder selected at random. The profile of the best test line is markedly different between the real and dummy gene sets (note: truncated at P = 0.0001 reflecting the number of permutations performed). Whereas the dummy P value climbs quickly and then drops to the final aggregate result, the true composite set line continues to climb after 200 genes, indicating that many genes with a single disruptive mutation contribute to the observed set enrichment (rather than a relatively small proportion of the 1,796 genes accounting for the majority of the signal, as in the dummy set). b, Phenotypic characteristics of cases carrying mutations. Relationship between clinical and demographic measures in schizophrenia cases in relation to carrying one or more composite set disruptive risk alleles (MAF < 0.1%). Hospital Discharge Registry (HDR) data (ICD9 codes) were available on 979 of the 990 case carriers. 
All P values (uncorrected) are two-sided from a case-only joint logistic regression of carrier status (one or more risk alleles) on all admission and demographic variables including year of first and last admissions. The four pairs of columns represent analyses in which we varied the way in which the HDR admission data were represented (for drug abuse, general medical condition, epilepsy and intellectual disability). # admissions, independent variable is the untransformed number of admissions; >X admissions, independent variable is a binary 0/1 variable representing whether individuals had more than X admissions. Of all clinical/demographic measures considered, we observed a nominally significant increased likelihood that cases carrying a disruptive allele in the composite set have increased rates of secondary diagnoses of intellectual disability compared to other cases (based on HDR ICD9 codes).
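The permutation-based significance described in panel a can be illustrated with a minimal sketch. It is not the authors' procedure (their tests are specified in the Supplementary Information and may include adjustments not shown here); it simply assumes one count of rare disruptive mutations per individual for a gene set and shuffles case/control labels. The function name `burden_permutation_test` and its parameters are hypothetical.

```python
import random


def burden_permutation_test(counts, is_case, n_perm=10000, seed=0):
    """One-sided label-permutation test for excess mutation burden in cases.

    counts  : per-individual number of rare disruptive mutations in a gene set
    is_case : per-individual booleans (True = case, False = control)
    Returns (observed case burden, empirical P value).
    Illustrative assumption only: the statistic is the raw case total.
    """
    rng = random.Random(seed)
    observed = sum(c for c, case in zip(counts, is_case) if case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between phenotype and burden
        permuted = sum(c for c, case in zip(counts, labels) if case)
        if permuted >= observed:
            hits += 1
    # Add-one correction: the smallest attainable P is 1 / (n_perm + 1)
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): 10 carriers who are all cases
toy_counts = [1] * 10 + [0] * 10
toy_labels = [True] * 10 + [False] * 10
obs, p = burden_permutation_test(toy_counts, toy_labels, n_perm=2000)
print(obs, p)
```

Because the empirical P value cannot fall below roughly 1 divided by the number of permutations, a curve computed this way flattens at that floor, which is consistent with the truncation at P = 0.0001 noted in the caption.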

Extended Data Figure 3 Stratified enrichment analysis P values by developmental trajectory of expression in brain (BrainSpan and Human Brain Transcriptome (HBT) data sets).

a, Uncorrected P values for a set of exploratory analyses in which we stratified genes in the enrichment analyses by their developmental profile of brain expression. We used four schemes to classify genes as ‘brain expressed’ and/or ‘biased’ with respect to prenatal or postnatal expression (see Supplementary Information section 6 for details). We merged data on the hippocampus and dorsolateral prefrontal cortex for the BrainSpan classifications; to mirror the classification of Xu et al. (ref. 19) we kept separate these two groupings for the HBT data set. Results presented for MAF < 0.1% disruptive variants; similar results were obtained for singletons with the exception that the ‘K4’ prenatal enrichment signals were no longer significant. In general, the most consistent enrichment across variant classes, classification schemes and brain regions emerges for postnatally biased genes with high brain expression. b, Analysis of exome variants by developmental expression trajectory in human brain. Genes are grouped by cluster analysis of human postmortem brain expression into eight developmental trajectories, using RNA-sequencing data from the BrainSpan project. The top row gives the number of genes per cluster and the cluster centres in log2-scaled RPKM (reads per kilobase per million) values; solid and dotted lines indicate dorsolateral prefrontal cortex (DLPFC) and hippocampus (HPC), respectively. The bottom two rows show enrichment in the current study, relative to the exome-wide average, for singleton disruptive mutations in cases compared to controls, either subsetting all genes by expression profile (first row), or considering only genes in the composite set (second row). In both cases, we only observed nominally (P < 0.01) significant enrichment for genes that are postnatally biased.
By contrast, a list of genes with loss-of-function (LoF) de novo mutations (compiled and reported in Fromer et al., ref. 30) shows strong enrichment for prenatal bias (see ref. 30 for details on how de novo enrichment was calculated). Alternative approaches to classifying genes as prenatally or postnatally biased led to similar conclusions (Supplementary Information section 6).

Extended Data Table 1 Sample and detected variant properties
Extended Data Table 2 Genes prioritized as more likely to harbour large-effect alleles
Extended Data Table 3 Extended results for all PSD gene sets
Extended Data Table 4 Association results for individual CNV regions
Extended Data Table 5 Further stratification of enrichment analyses by class of variant
Extended Data Table 6 Gene set analysis of de novo genes from schizophrenia exome-sequencing studies
Extended Data Table 7 Summary of observed likely deleterious variants in ARC genes across studies

Supplementary information

Supplementary Information

This file contains Supplementary Methods and Supplementary References. (PDF 747 kb)

Supplementary Table 1

Genes and variants in highly enriched genesets: Details of specific variants for the genes in the small but highly enriched genesets that showed large odds ratios (>5): ARC, PSD-95, calcium channel genes, Müller et al. Cav2 genes, as well as disruptive KYNU variants. The variants from the four genesets are all singleton disruptive mutations, corresponding to the genes listed in ED Table 2. The (a) and (b) in the phenotype column indicate that the same individual carried two of these variants. (That the same case carries two of these alleles is not necessarily surprising: assuming equal and independent probabilities across cases for carrying each allele, of the 49 case-unique variants listed in this table, there is a 37% chance that at least one individual carries two of the 49 alleles, and a 7% chance that at least two individuals carry multiple alleles, based on 2,536 total cases.) The genotype meta-information fields are from GATK: AD = allele depth (number of reference, alternate reads); DP = total high quality reads used in calling; GQ = genotype quality (max 99); PL = phred-scaled genotype likelihoods (reference, heterozygote, homozygote). (XLSX 39 kb)
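The co-occurrence probabilities quoted above can be checked with a short calculation. The sketch below is one reading of the stated assumption, not the authors' own computation: each of the 49 singleton alleles is assigned independently and uniformly to one of the 2,536 cases, as in the classic birthday problem. The function names are hypothetical, and `math.perm` and `math.comb` need Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, perm


def prob_no_shared_carrier(n_alleles, n_cases):
    """P(every allele is carried by a different case) under uniform assignment."""
    return Fraction(perm(n_cases, n_alleles), n_cases ** n_alleles)


def prob_exactly_one_multi_carrier(n_alleles, n_cases):
    """P(exactly one case carries two or more alleles; all other carriers have one)."""
    total = Fraction(0)
    for k in range(2, n_alleles + 1):
        # Choose the k alleles that share a carrier; that carrier plus the
        # n_alleles - k single-allele carriers are n_alleles - k + 1 distinct cases.
        ways = comb(n_alleles, k) * perm(n_cases, n_alleles - k + 1)
        total += Fraction(ways, n_cases ** n_alleles)
    return total


n_alleles, n_cases = 49, 2536  # values taken from the table description
p_none = prob_no_shared_carrier(n_alleles, n_cases)
p_at_least_one = 1 - p_none
p_at_least_two = p_at_least_one - prob_exactly_one_multi_carrier(n_alleles, n_cases)
print(f"at least one multi-carrier:  {float(p_at_least_one):.3f}")
print(f"at least two multi-carriers: {float(p_at_least_two):.3f}")
```

Under these assumptions the two printed values should land close to the 37% and 7% figures given in the table description; exact fractions are used so that rounding happens only at print time.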

Supplementary Table 2

Genes selected for the composite set: Source indicated: PSD = any PSD gene; FMRP-target = Darnell et al. FMRP target; SCZ de novo = nonsynonymous de novo mutation in Xu et al., Girard et al. or Fromer et al.; Calcium channel = voltage-gated calcium channel. (XLSX 97 kb)

About this article

Cite this article

Purcell, S., Moran, J., Fromer, M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014). https://doi.org/10.1038/nature12975
