Rare coding variation has historically provided the most direct connections between gene function and disease pathogenesis. By meta-analysing the whole exomes of 24,248 schizophrenia cases and 97,322 controls, we implicate ultra-rare coding variants (URVs) in 10 genes as conferring substantial risk for schizophrenia (odds ratios of 3–50, P < 2.14 × 10⁻⁶) and 32 genes at a false discovery rate of <5%. These genes have the greatest expression in central nervous system neurons and have diverse molecular functions that include the formation, structure and function of the synapse. The associations of the NMDA (N-methyl-D-aspartate) receptor subunit GRIN2A and AMPA (α-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid) receptor subunit GRIA3 provide support for dysfunction of the glutamatergic system as a mechanistic hypothesis in the pathogenesis of schizophrenia. We observe an overlap of rare variant risk among schizophrenia, autism spectrum disorders¹, epilepsy and severe neurodevelopmental disorders², although different mutation types are implicated in some shared genes. Most genes described here, however, are not implicated in neurodevelopment. We demonstrate that genes prioritized from common variant analyses of schizophrenia are enriched in rare variant risk³, suggesting that common and rare genetic risk factors converge at least partially on the same underlying pathogenic biological processes. Even after excluding significantly associated genes, schizophrenia cases still carry a substantial excess of URVs, which indicates that more risk genes await discovery using this approach.
We describe all datasets in the manuscript or its Supplementary Information. We provide summary-level data at the variant and gene level in an online browser for viewing and download (https://schema.broadinstitute.org). There are no restrictions on the aggregated data released on the browser. For contributing datasets that are permitted to be distributed at the individual level, we have deposited or are currently depositing the data in a public repository (the database of Genotypes and Phenotypes (dbGaP) and/or the European Genome–Phenome Archive (EGA)), and we provide the accessions in Supplementary Table 1. Whole-exome sequence data generated under this study are currently hosted on and shared with the collaborating study groups via the controlled-access Terra platform (https://app.terra.bio/). The Terra environment, created by the Broad Institute, contains a rich system of workspace functionalities centred on data sharing and analysis. Requests for access to the controlled datasets are managed by data custodians of the SCHEMA consortium and the Broad Institute and are sent to sample contributing investigators for approval.
The software and code used are described throughout the Supplementary Methods. In brief, for sequence data generation, we used GATK versions 3.4 and 3.6, Picard version 1.1431 and VerifyBamID version 1.0.0. Sample and variant quality control and analyses were performed using Hail 0.1 and 0.2 (https://hail.is/), with functions and arguments referred to in the Supplementary Methods. Wrappers and methods using Hail code can be found at https://github.com/TarjinderSingh/hailutils. Additional (basic) processing and visualization were performed using base R (version 3.6) with tidyverse libraries (https://www.tidyverse.org/packages/).
Satterstrom, F. K. et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell 180, 568–584 (2020).
Kaplanis, J. et al. Evidence for 28 genetic disorders discovered by combining healthcare and research data. Nature 586, 757–762 (2020).
Trubetskoy, V. et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature https://doi.org/10.1038/s41586-022-04434-5 (2022).
McGrath, J., Saha, S., Chant, D. & Welham, J. Schizophrenia: a concise overview of incidence, prevalence, and mortality. Epidemiol. Rev. 30, 67–76 (2008).
Hjorthøj, C., Stürup, A. E., McGrath, J. J. & Nordentoft, M. Years of potential life lost and life expectancy in schizophrenia: a systematic review and meta-analysis. Lancet Psychiat. 4, 295–301 (2017).
Lehman, A. F. et al. Practice guideline for the treatment of patients with schizophrenia, second edition. Am. J. Psychiat. 161, 1–56 (2004).
Hyman, S. E. Revolution stalled. Sci. Transl. Med. 4, 155cm11 (2012).
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Genovese, G. et al. Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat. Neurosci. 19, 1433–1441 (2016).
Marshall, C. R. et al. Contribution of copy number variants to schizophrenia from a genome-wide study of 41,321 subjects. Nat. Genet. 49, 27–35 (2017).
Singh, T. et al. The contribution of rare variants to risk of schizophrenia in individuals with and without intellectual disability. Nat. Genet. 49, 1167–1173 (2017).
Gottesman, I. I. & Shields, J. A polygenic theory of schizophrenia. Proc. Natl Acad. Sci. USA 58, 199–205 (1967).
Loh, P.-R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).
Karayiorgou, M. et al. Schizophrenia susceptibility associated with interstitial deletions of chromosome 22q11. Proc. Natl Acad. Sci. USA 92, 7612–7616 (1995).
Zuk, O. et al. Searching for missing heritability: designing rare variant association studies. Proc. Natl Acad. Sci. USA 111, E455–E464 (2014).
Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
Rivas, M. A. et al. Effect of predicted protein-truncating genetic variants on the human transcriptome. Science 348, 666–669 (2015).
Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature 506, 179–184 (2014).
Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014).
Singh, T. et al. Rare loss-of-function variants in SETD1A are associated with schizophrenia and developmental disorders. Nat. Neurosci. 19, 571–577 (2016).
Howrigan, D. P. et al. Exome sequencing in schizophrenia-affected parent–offspring trios reveals risk conferred by protein-coding de novo mutations. Nat. Neurosci. 23, 185–193 (2020).
Gulsuner, S. et al. Genetics of schizophrenia in the South African Xhosa. Science 367, 569–573 (2020).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. Preprint at bioRxiv https://doi.org/10.1101/148353 (2017).
Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
Hu, W., MacDonald, M. L., Elswick, D. E. & Sweet, R. A. The glutamate hypothesis of schizophrenia: evidence from human brain tissue studies. Ann. NY Acad. Sci. 1338, 38–57 (2015).
Kang, H. J. et al. Spatio-temporal transcriptome of the human brain. Nature 478, 483–489 (2011).
Network and Pathway Analysis Subgroup of the Psychiatric Genetics Consortium. Psychiatric genome-wide association study analyses implicate neuronal, immune and histone pathways. Nat. Neurosci. 18, 199–209 (2015).
Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
Koopmans, F. et al. SynGO: an evidence-based, expert-curated knowledge base for the synapse. Neuron 103, 217–234 (2019).
Zeisel, A. et al. Molecular architecture of the mouse nervous system. Cell 174, 999–1014 (2018).
Skene, N. G. et al. Genetic identification of brain cell types underlying schizophrenia. Nat. Genet. 50, 825–833 (2018).
De Rubeis, S. et al. Synaptic, transcriptional and chromatin genes disrupted in autism. Nature 515, 209–215 (2014).
The Deciphering Developmental Disorders Study. Large-scale discovery of novel genetic causes of developmental disorders. Nature 519, 223–228 (2015).
Barbosa, S. et al. Opposite modulation of RAC1 by mutations in TRIO is associated with distinct, domain-specific neurodevelopmental disorders. Am. J. Hum. Genet. 106, 338–355 (2020).
Haukka, J., Suvisaari, J. & Lönnqvist, J. Fertility of patients with schizophrenia, their siblings, and the general population: a cohort study from 1950 to 1959 in Finland. Am. J. Psychiat. 160, 460–463 (2003).
Laursen, T. M. & Munk-Olsen, T. Reproductive patterns in psychotic patients. Schizophr. Res. 121, 234–240 (2010).
Power, R. A. et al. Fecundity of patients with schizophrenia, autism, bipolar disorder, depression, anorexia nervosa, or substance abuse vs their unaffected siblings. Arch. Gen. Psychiat. 70, 22–30 (2013).
Endele, S. et al. Mutations in GRIN2A and GRIN2B encoding regulatory subunits of NMDA receptors cause variable neurodevelopmental phenotypes. Nat. Genet. 42, 1021–1026 (2010).
Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
We would like to thank the patients and families who participated in our studies during the past two decades, without whom our research and findings would not be possible. The research reported in this publication was supported by the National Institute of Mental Health (NIMH) and the National Human Genome Research Institute of the National Institutes of Health under award numbers U01 MH105641, U01 MH105578, U01 MH105666, U01 MH109539, R01 MH085548, R01 MH085521, R01 MH124851 and U54 HG003067. We would also like to acknowledge support from K. Dauten and E. Dauten, the Stanley Family Foundation and the Dalio Foundation, which has enabled us to rapidly expand our data generation collections with the goal of moving towards better treatments for schizophrenia and other psychiatric disorders. Further, we wish to acknowledge all of the research participants in the BRIDGES cohort, which was supported by NIMH under award numbers R01 MH094145 (M.B. and R.M.M., PIs) and U01 MH105653 (M.B., PI). The collection and storage of cases and controls from the Centre for Addiction and Mental Health (CAMH) in Toronto and from the Institute of Psychiatry, Psychology and Neuroscience (IoPPN), King’s College London, in London was supported by funding from GlaxoSmithKline. CAMH was supported by the Canadian Institutes of Health Research (MOP-172013, J. B. Vincent, PI, CAMH). IoPPN was supported by funding from the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and the Maudsley NHS Foundation Trust and by King’s College London. The views expressed are those of the author(s) and not necessarily those of the UK NHS, the NIHR or the UK Department of Health. Case and control collection was supported by the Heinz C. Prechter Bipolar Research Fund at the University of Michigan Depression Center to M.G. McInnis.
Data and biomaterials were collected for the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD), a multi-centre, longitudinal project selected from responses to RFP NIMH-98-DS-0001, ‘Treatment for Bipolar Disorder’, which was led by G. Sachs and coordinated by Massachusetts General Hospital in Boston, with support from 2N01 MH080001-001. The Genomic Psychiatric Cohort (GPC) was supported by NIMH (U01 MH105641 (C.N.P., PI), R01 MH085548 (C.N.P. and M.T.P., PIs) and R01 MH104964 (C.N.P. and M.T.P., PIs)). The MCTFR study was supported through grants from the National Institutes of Health under numbers DA037904, DA024417, DA036216, DA05147, AA09367, DA024417, HG007022 and HL117626. The work at Cardiff University was supported by Medical Research Council Centre grant no. MR/L010305/1 and programme grant no. G0800509. We would like to acknowledge the Pritzker Neuropsychiatric Disorders Research Consortium for funding sample collection efforts. The iPSYCH team was supported by grants from the Lundbeck Foundation (R102-A9118, R155-2014-1724, and R248-2017-2003) and the Universities and University Hospitals of Aarhus and Copenhagen. The Danish National Biobank resource was supported by the Novo Nordisk Foundation. A.P. was supported by Academy of Finland Centre of Excellence in Complex Disease Genetics (grant nos. 312074 and 336824). S.V.F. is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement nos. 667302 and 965381; NIMH grants U01MH109536-01, U01AR076092-01A1, R0MH116037 and 5R01AG06495502; Oregon Health and Science University, Otsuka Pharmaceuticals and Supernus Pharmaceutical Company. T.S. was supported by a NARSAD Young Investigator Award from the Brain and Behavior Research Foundation.
M.J.D. is a founder of Maze Therapeutics and Neumora Therapeutics. B.M.N. is a member of the scientific advisory board at Deep Genomics and Neumora Therapeutics, a member of the scientific advisory committee at Milken and a consultant for Camp4 Therapeutics, Merck and Biogen. A.P. is a member of the genomics advisory board at AstraZeneca. M.C.O., M.J.O. and J.T.W. are supported by a collaborative research grant from Takeda Pharmaceuticals. E.A.S. is currently an employee of the Regeneron Genetics Center. D.S.P. was an employee of Genomics plc; all analyses reported in this paper were performed as part of his employment at Massachusetts General Hospital and the Broad Institute. The remaining authors declare no competing interests. In the past year, S.V.F. received income, potential income, travel expenses, continuing education support and/or research support from Aardvark, Akili, Genomind, Ironshore, KemPharm/Corium, Noven, Ondosis, Otsuka, Rhodes, Supernus, Takeda, Tris and Vallon. In previous years, S.V.F. received support from: Alcobra, Arbor, Aveksham, CogCubed, Eli Lilly, Enzymotec, Impact, Janssen, Lundbeck/Takeda, McNeil, NeuroLifeSciences, Neurovance, Novartis, Pfizer, Shire, and Sunovion. With his institution, S.V.F. has US patent US20130217707 A1 for the use of sodium-hydrogen exchange inhibitors in the treatment of ADHD. S.V.F. receives royalties from books published by Guilford Press: Straight Talk about Your Child’s Mental Health; Oxford University Press: Schizophrenia: The Facts; and Elsevier: ADHD: Non-Pharmacologic Interventions, and is Program Director of www.adhdinadults.com.
Peer review information
Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Schizophrenia case-control enrichment in constrained genes (pLI > 0.9) in different SCHEMA cohorts (n = 22,444 cases and n = 39,837 controls).
The odds ratio and standard error of PTVs and synonymous variants are provided for each cohort. The meta-analysed odds ratio and standard error are calculated using inverse-variance weighting. PTVs show consistent signals across the different cohorts, and synonymous variants do not deviate from expectation. Bars represent the 95% CIs of the point estimates.
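As an illustration of inverse-variance pooling, the following minimal Python sketch combines per-cohort estimates on the log odds ratio scale. It assumes a fixed-effect model and that each cohort reports a log odds ratio with its standard error; the caption does not state either detail, and the function names and example numbers are hypothetical, not values from the study.

```python
import math

def inverse_variance_meta(log_ors, ses):
    """Fixed-effect inverse-variance pooling of per-cohort log odds ratios.

    Assumption: inputs are log odds ratios and their standard errors.
    Returns (pooled_log_or, pooled_se).
    """
    if not log_ors or len(log_ors) != len(ses):
        raise ValueError("need equally sized, non-empty inputs")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_log_or, pooled_se

def odds_ratio_with_ci(log_or, se, z=1.96):
    """Back-transform a log odds ratio to an odds ratio with an approximate 95% CI."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

if __name__ == "__main__":
    # Hypothetical cohort-level estimates, for illustration only.
    cohort_log_ors = [math.log(1.3), math.log(1.2), math.log(1.4)]
    cohort_ses = [0.10, 0.08, 0.15]
    pooled, se = inverse_variance_meta(cohort_log_ors, cohort_ses)
    print(odds_ratio_with_ci(pooled, se))
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate sits closest to the most precisely estimated cohorts, and the pooled standard error is never larger than the smallest cohort-level one.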
Extended Data Fig. 2 Schizophrenia case-control enrichment in constrained genes (pLI > 0.9) stratified by different variant annotations and inferred consequences (n = 22,444 cases and n = 39,837 controls).
LoF: all loss-of-function or PTVs; LoFHC: high-confidence LOFTEE PTVs; LoFLC: low-confidence based on LOFTEE; MPC > 3: missense variants with MPC > 3; MPC 2 - 3: missense variants with MPC 2 - 3; Other missense: missense variants with MPC < 2; Syn: synonymous variants. The dot represents the odds ratio, and the bars represent the 95% CIs of the point estimates.
Extended Data Fig. 3 Enrichment of URVs in n = 4,403 ASD and n = 3,292 ADHD cases compared to n = 5,220 controls stratified by variant annotation and consequences in constrained genes (pLI > 0.9).
Two-sided P values from logistic regression displayed are from comparing the burden of variants of the labeled consequence in cases compared to controls. The dot represents the odds ratio, and the bars represent the 95% CIs of the point estimates.
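The relationship between a logistic regression coefficient, the plotted odds ratio, its 95% CI and a two-sided P value can be sketched as follows. This assumes a Wald test on the fitted coefficient; the caption does not say which test statistic was used, so the choice of test, the function name and the example inputs are all illustrative assumptions.

```python
import math

def wald_summary(beta, se, z_crit=1.959963984540054):
    """Summarise a logistic regression coefficient (log odds ratio).

    Assumption: a Wald test, z = beta / se, with a standard normal reference.
    Returns the odds ratio, an approximate 95% CI and the two-sided P value.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "odds_ratio": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "z": z,
        "p_two_sided": p_two_sided,
    }

if __name__ == "__main__":
    # Hypothetical coefficient and standard error, for illustration only.
    print(wald_summary(beta=0.25, se=0.08))
```

A confidence interval that excludes an odds ratio of 1 corresponds to a two-sided Wald P value below 0.05 under these assumptions.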
Extended Data Fig. 4 Schizophrenia case-control gene set enrichment in brain and non-brain GTEx tissues.
We test for the burden of rare PTVs in genes with the strongest specific expression in that tissue type relative to other tissues as defined in ref. 30. Gene set burden statistics are calculated using a logistic regression model of rare variants from n = 22,444 cases and n = 39,837 controls. We report two-sided P values. Each bar is a different tissue in GTEx, grouped by whether it is part of the central nervous system and sorted by P value (Supplementary Table 8).
This file contains Supplementary Methods, Supplementary Figs. 1–23, Supplementary Tables 2 and 3, full descriptions for Supplementary Tables 1–13, Supplementary Note and Supplementary References. See contents page for details.
Supplementary Tables 1 and 4–13; see Supplementary Information document for full descriptions.
About this article
Cite this article
Singh, T., Poterba, T., Curtis, D. et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature 604, 509–516 (2022). https://doi.org/10.1038/s41586-022-04556-w