Schizophrenia risk conferred by rare protein-truncating variants is conserved across diverse human populations

Liu, Dongjing; Meyer, Dara; Fennessy, Brian; Feng, Claudia; Cheng, Esther; Johnson, Jessica S.; Park, You Jeong; Rieder, Marysia-Kolbe; Ascolillo, Steven; de Pins, Agathe; Dobbyn, Amanda; Lebovitch, Dannielle; Moya, Emily; Nguyen, Tan-Hoang; Wilkins, Lillian; Hassan, Arsalan; Burdick, Katherine E.; Buxbaum, Joseph D.; Domenici, Enrico; Frangou, Sophia; Hartmann, Annette M.; Laurent-Levinson, Claudine; Malhotra, Dheeraj; Pato, Carlos N.; Pato, Michele T.; Ressler, Kerry; Roussos, Panos; Rujescu, Dan; Arango, Celso; Bertolino, Alessandro; Blasi, Giuseppe; Bocchio-Chiavetto, Luisella; Campion, Dominique; Carr, Vaughan; Fullerton, Janice M.; Gennarelli, Massimo; González-Peñas, Javier; Levinson, Douglas F.; Mowry, Bryan; Nimgaokar, Vishwajit L.; Pergola, Giulio; Rampino, Antonio; Cervilla, Jorge A.; Rivera, Margarita; Schwab, Sibylle G.; Wildenauer, Dieter B.; Daly, Mark; Neale, Benjamin; Singh, Tarjinder; O’Donovan, Michael C.; Owen, Michael J.; Walters, James T.; Ayub, Muhammad; Malhotra, Anil K.; Lencz, Todd; Sullivan, Patrick F.; Sklar, Pamela; Stahl, Eli A.; Huckins, Laura M.; Charney, Alexander W.

doi:10.1038/s41588-023-01305-1

Letter
Open access
Published: 13 March 2023

Nature Genetics volume 55, pages 369–376 (2023)

Abstract

Schizophrenia (SCZ) is a chronic mental illness and among the most debilitating conditions encountered in medical practice. A recent landmark SCZ study of the protein-coding regions of the genome identified a causal role for ten genes and a concentration of rare variant signals in evolutionarily constrained genes¹. This recent study—and most other large-scale human genetics studies—was mainly composed of individuals of European (EUR) ancestry, and the generalizability of the findings in non-EUR populations remains unclear. To address this gap, we designed a custom sequencing panel of 161 genes selected based on the current knowledge of SCZ genetics and sequenced a new cohort of 11,580 SCZ cases and 10,555 controls of diverse ancestries. Replicating earlier work, we found that cases carried a significantly higher burden of rare protein-truncating variants (PTVs) among evolutionarily constrained genes (odds ratio = 1.48; P = 5.4 × 10⁻⁶). In meta-analyses with existing datasets totaling up to 35,828 cases and 107,877 controls, this excess burden was largely consistent across five ancestral populations. Two genes (SRRM2 and AKAP11) were newly implicated as SCZ risk genes, and one gene (PCLO) was identified as shared by individuals with SCZ and those with autism. Overall, our results lend robust support to the rare allelic spectrum of the genetic architecture of SCZ being conserved across diverse human populations.

Main

SCZ is a severe, chronic psychiatric illness associated with lifelong progression and early mortality^2,3,4. The genetic architecture of SCZ includes clear contributions from common single-nucleotide polymorphisms (SNPs)⁵, large copy number variants (CNVs)⁶ and rare PTVs^{1,7,8,9,10,11,12,13,14}. Among these, rare PTVs provide unique value by linking disease risk to individual genes unambiguously. Most recently, the Schizophrenia Exome Sequencing Meta-Analysis (SCHEMA) Consortium increased the sequenced sample size for rare PTV investigations to 24,248 SCZ cases and 97,322 controls, established the rare PTV enrichment in genes under strong evolutionary constraint and identified ten genes with excess burden of rare PTVs in cases compared with controls¹. When considered alongside earlier studies, these results suggest that, with greater sample sizes, additional SCZ genes harboring rare PTVs will be discovered. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) remain cost prohibitive when applied at large scales, and targeted sequencing of carefully chosen genes is an alternative approach to rapidly achieve the required sample size for novel risk gene discovery.

Most large-scale human genetics research initiatives to date have failed to include diverse populations. Over 80% of genome-wide association study (GWAS) participants are of EUR ancestry, despite this group comprising less than one-quarter of the total human population^15,16. Studies of mental illness have contributed to this disparity with almost exclusively EUR GWAS cohorts despite the roughly equal prevalence of psychiatric disorders worldwide¹⁷. The limited evidence from SCZ GWASs and CNV studies of non-EUR populations suggests broadly shared genetic architecture with that of EUR populations, but ancestry-specific effects, such as the major histocompatibility complex locus in EUR populations, are also present^{18,19,20,21,22,23,24}. For rare genetic variants, findings on a broad range of complex human traits have been largely consistent across populations^{25,26,27,28,29,30}. Evidence for ancestry-specific rare variant effects is limited but starting to emerge, such as TMEM136 and serum lipid measurements in individuals of South Asian (SAS) ancestry²⁵. No studies have yet shown the effect of rare PTVs in diverse ancestries for SCZ.

Here, to diversify populations in SCZ studies and achieve sufficient power to discover novel risk genes, we designed a custom sequencing panel of 161 putative SCZ genes and applied it to case–control cohorts totaling 22,135 individuals from diverse ancestries (40% non-EUR; Fig. 1 and Supplementary Table 1). This study, outlined in Fig. 1a and hereafter referred to as the Psychiatric Genomics Consortium Phase 3 Targeted Sequencing of Schizophrenia Study (PGC3SEQ), was limited to cohorts that were not part of earlier SCZ sequencing initiatives such as SCHEMA. In constructing the sequencing panel, we used a data-driven algorithm to synthesize current knowledge of the genetic architecture of SCZ, including a preliminary version of the SCHEMA gene-level burden statistics^31,32, with the goal of enriching for genes likely to harbor excess rare PTVs in SCZ that had not reached exome-wide significance due to a lack of power. This algorithm^33,34 is a Bayesian framework that prioritizes genes by integrating gene-level burden statistics with gene membership in gene sets that have been implicated in SCZ (Fig. 1b and Supplementary Tables 2 and 3). The exonic regions of the 161 prioritized genes were sequenced on the Ion Torrent platform followed by rigorous quality control (Supplementary Figs. 1–6). Analyses comparing individuals with SCZ and controls were performed for rare PTVs (stop–gain, frameshift indels or essential splicing donor/acceptor) and deleterious missense variants (placed into tiers based on the missense badness, PolyPhen-2 and constraint (MPC) score³⁵; tier 1: MPC > 3; tier 2: MPC 2–3; nondamaging: MPC < 2), and synonymous variants were analyzed as a negative control. In our primary analysis, rare was defined as a minor allele count of ≤5 among the entire cohort. To maximize power, PGC3SEQ was further meta-analyzed with SCHEMA data (Supplementary Table 4 and Supplementary Fig. 7) and sequencing datasets for bipolar disorder and autism. We performed two broad types of analysis: (1) a global enrichment of all constrained genes on the custom panel (n = 80 genes) to investigate the overall role of rare disruptive variants in diverse ancestries; and (2) gene-level burden tests to identify novel SCZ risk genes.

**Fig. 1: Study design and cohort ancestry composition.**

PGC3SEQ SCZ cases carried a significantly higher burden of rare PTVs among the 80 constrained genes after adjusting for rare synonymous variant counts and five ancestry principal components (odds ratio (OR) = 1.48; P = 5.4 × 10⁻⁶; Fig. 2a and Supplementary Table 5), indicating an independent replication of the excess burden of rare PTVs observed in 3,063 constrained genes in SCHEMA. The higher effect size seen in PGC3SEQ compared with SCHEMA (OR_PGC3SEQ = 1.48 in 80 genes; OR_SCHEMA = 1.22 in 3,063 genes) demonstrates the effectiveness of the gene prioritization strategy used for PGC3SEQ. For the 80 genes available in both studies, the signal in PGC3SEQ was much attenuated compared with that in SCHEMA (OR_PGC3SEQ = 1.48 versus OR_SCHEMA = 3.0; Fig. 2a), indicating that effect sizes are probably overestimated in SCHEMA. In contrast, tier 1 and 2 missense variants were not significantly enriched in cases relative to controls in PGC3SEQ. The effects of missense variants were directionally consistent with those in SCHEMA, indicating that the nonsignificant results may be due to a lack of power. The burden of rare synonymous variants, which were analyzed as a negative control, was significantly higher in those with SCZ relative to controls in PGC3SEQ but not in SCHEMA. Sensitivity analysis showed that this signal was due to an overall higher burden of rare coding variants in people with SCZ relative to controls in PGC3SEQ, rather than due to technical bias or variability between contributing cohorts (Supplementary Note and Supplementary Fig. 8). The global PTV enrichment in PGC3SEQ remained significant after accounting for this overall higher baseline burden (OR = 1.4; P = 1.2 × 10⁻⁴; Supplementary Fig. 8c and Supplementary Table 5).

**Fig. 2: Global enrichment in 80 panel genes under strong constraint (pLI > 0.9).**

We performed meta-analyses of PGC3SEQ and SCHEMA to test whether the global enrichment signal was consistent across diverse ancestries (n = 57,323; ancestry breakdown in Fig. 1a). We assigned samples to five ancestral super-populations, as defined in the 1000 Genomes Project (Methods). At the aggregate level, four of the five populations displayed a higher burden of rare disruptive variants (PTV + MPC > 3 missense) in SCZ cases compared with controls (P < 0.05; Fig. 2b (left) and Supplementary Table 6). Although we did not find a nominally significant enrichment in the fifth ancestral population (SAS), the magnitude of enrichment was similar to that in the African (AFR) population (OR = 1.5), indicating that nonsignificance is probably a power issue (Supplementary Note and Supplementary Fig. 9). When considered separately, PGC3SEQ and SCHEMA provided independent support for the ancestry-stratified enrichments (all ancestries had OR > 1 in both datasets; Supplementary Table 6). Indeed, the PGC3SEQ data alone showed nominal significance for admixed American (AMR), East Asian (EAS) and EUR populations, exempt from any potential effect overestimation in SCHEMA. Differences, if any, in the strength of enrichment between pairs of ancestral populations were not sizable enough to be detected as significant. Across the five ancestral populations, the burden of tier 2 missense variants was elevated, although not significantly in most (OR = 1.1–1.2; Fig. 2b, middle), whereas synonymous variants were not enriched in any (Fig. 2b, right).

Having replicated the global rare PTV enrichment in PGC3SEQ and established its conservation across diverse populations, we then tested individual genes for harboring an excess burden of rare PTVs in cases relative to controls. In the PGC3SEQ data alone, none of the 161 genes sequenced were significant after Bonferroni correction (0.05/161 = 3.1 × 10⁻⁴; Supplementary Tables 7–9 and Supplementary Fig. 10). The direction of effects of all genes was overall consistent with the directions observed in SCHEMA (binomial test, P = 0.016) and this observation became more pronounced when considering only those 44 genes with a SCHEMA P value of <0.01 (binomial test, P = 0.002). Of the ten significant genes identified in SCHEMA, nine were included in the PGC3SEQ panel (GRIA3 was the lone exception). PGC3SEQ had enrichment of rare PTVs for these nine genes collectively (OR = 1.66; P = 0.03; 49 PTVs in cases versus 24 in controls) and two of the nine genes had P < 0.05 when considered individually (RB1CC1 and CUL1; Table 1). Notably, SETD1A—the gene with the strongest signal in SCHEMA—had a nonsignificant, weakened enrichment in PGC3SEQ, suggesting an overestimation of its effect magnitude in SCHEMA (OR_PGC3SEQ = 1.6 versus OR_SCHEMA = 20.1). Another gene implicated by SCHEMA that did not find support in PGC3SEQ was CACNA1G. Among the nine SCHEMA genes on the PGC3SEQ panel, CACNA1G had the largest number of PTVs in PGC3SEQ (n = 19) yet an OR of 0.42, directionally inconsistent with its effect in SCHEMA (OR_SCHEMA = 3.1). Despite some evidence of winner’s curse, altogether the gene-level replication tests in PGC3SEQ suggest that many of the SCHEMA genes probably confer genuine SCZ risk, including those not yet reaching exome-wide significance.
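The multiple-testing threshold and the direction-of-effect check described above can be illustrated with a short sketch. The Bonferroni value follows directly from the text. The sign test is written here as a one-sided binomial tail under a null concordance probability of 0.5, which is an assumption, since the text states neither the sidedness nor the exact counts used. The function names are illustrative.

```python
from math import comb


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def sign_test_p(n_concordant: int, n_total: int) -> float:
    """One-sided P(X >= n_concordant) for X ~ Binomial(n_total, 0.5).

    Assumption: under the null, each gene is equally likely to agree
    or disagree in direction with the reference study.
    """
    tail = sum(comb(n_total, k) for k in range(n_concordant, n_total + 1))
    return tail / 2 ** n_total


# Threshold for the 161 genes on the panel, as reported in the text.
threshold = bonferroni_threshold(0.05, 161)
print(f"{threshold:.1e}")  # 3.1e-04
```

Given the number of directionally concordant genes, `sign_test_p(n_concordant, 161)` would return the corresponding tail probability under these assumptions.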

Table 1 Attempted replication of the nine significant SCHEMA genes in PGC3SEQ

Combining PGC3SEQ and SCHEMA (totaling 35,828 cases and 107,877 controls) via a P value-based meta-analysis of gene-level statistics, we identified two new genes at the exome-wide significance threshold (Table 2 and Supplementary Table 7): SRRM2 (P = 7.2 × 10⁻⁷) and AKAP11 (P = 4.2 × 10⁻⁷). In previous work, SRRM2 has been shown to play a role in the tauopathy of Alzheimer’s disease^36,37,38, and de novo mutations in this gene have been linked to developmental disorders³⁹, while AKAP11 was suggested as a trans-gene linking to a SCZ GWAS locus in a recent study⁴⁰, which, together with our results, adds to examples of the convergence of common and rare variant associations in the same gene. A recent meta-analysis of SCHEMA and a bipolar disorder dataset also found exome-wide significance for AKAP11 (ref. ⁴¹), suggesting a role for this gene in the shared etiology of SCZ and bipolar disorder. The current study consolidates the role of AKAP11 in SCZ, independent of other psychiatric disorders.
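The text describes the combination only as a P value-based meta-analysis. One common way to implement such a combination is Stouffer's method with sample-size weights, sketched below as an assumption rather than as the procedure actually used. The function name, the two-sided input P values and the square-root weighting are all illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def stouffer_meta(p_values, directions, sample_sizes):
    """Combine two-sided P values into one meta-analysis Z score and P value.

    p_values:     two-sided P value per study, each in (0, 1]
    directions:   +1 for a risk-increasing effect, -1 for a protective effect
    sample_sizes: per-study sample size (assumed weighting basis)
    """
    weights = [sqrt(n) for n in sample_sizes]
    # -inv_cdf(p / 2) is the positive Z score for a two-sided P value;
    # it is numerically safer than inv_cdf(1 - p / 2) for tiny P values.
    z_scores = [
        d * -_STD_NORMAL.inv_cdf(p / 2) for p, d in zip(p_values, directions)
    ]
    z_meta = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    p_meta = 2 * _STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta
```

Under this scheme, two studies with concordant directions reinforce each other, while opposite directions cancel in proportion to their weights.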

Table 2 Novel exome-wide significant SCZ genes

Lastly, we meta-analyzed gene-level rare disruptive variant statistics from SCZ, autism spectrum disorder (ASD)⁴² and bipolar disorder⁴¹ to identify pleiotropic risk genes that are not detectable at the sample sizes attained by studies of any single disorder. This identified PCLO as a shared risk gene for SCZ and ASD (P = 5.8 × 10⁻⁸; Table 2). The result suggests that PCLO may be driving the common variant association at nearby loci reported in GWASs of SCZ⁴³ and other psychiatric disorders^44,45,46,47.

The major contribution that the PGC3SEQ study makes to the field of human genetics is demonstrating the cross-ancestry conservation of the risk conferred by a major class of genetic variation for the most severe adult mental illness. To date, the paucity of exome sequencing studies of non-EUR populations has impeded the field in developing a complete view of the genetic architecture of complex diseases, and has made it difficult to assess the degree to which rare PTV associations are susceptible to the well-known confounding effects of ancestry in GWASs and polygenic prediction studies^{48,49,50,51,52}. Here we addressed this knowledge gap with respect to severe mental illnesses. In doing so, findings previously established in predominantly EUR cohorts have been extended to non-EUR populations for one of the major classes of genetic risk variation. This observation was not a foregone conclusion, especially since the targeted gene list was derived from SCHEMA—a study of predominantly EUR cohorts. In effect, PGC3SEQ showed that the burden signal in genes with the strongest evidence in EUR populations is conserved across non-EUR populations. Our findings are also timely information following the publication of SCHEMA, showing that some of the top genes implicated in that study are probably false positives.

There are limitations to the current study. The Ion Torrent technology is known to have decreased accuracy for indels involving homopolymer repeats of the same nucleotide⁵³. We assessed the impact of such indels on our findings via a sensitivity analysis and found that excluding them would not change our conclusions (Supplementary Tables 10 and 11 and Supplementary Fig. 11). We used an interim version of the SCHEMA results for PGC3SEQ panel design, and this version is different from the published results due to changes in SCHEMA analytical strategy. Specifically, the interim SCHEMA statistics^31,32 did not include de novo mutations from trios, used a different strategy to combine PTV and missense variants and were compiled before the incorporation of Genome Aggregation Database (gnomAD) controls. Comparing the interim and published SCHEMA results, gene ranks underwent nontrivial changes, with only 27 overlapping genes between the top 100 lists in the two versions. Consequently, our panel probably included more random noise than it would have if panel construction had waited until SCHEMA was complete. As WES studies of other diseases approach the sample size achieved for SCZ, and strategies are considered for how to increase power, the current report offers valuable lessons, and we note that results on datasets as large as 24,000 cases and 50,000 controls can still change substantially as more samples are added. The possibility of such changes makes the targeted panel approach vulnerable, and perhaps WES and WGS are the safest strategies despite their cost.

In summary, rare PTVs have a robust role in SCZ, and across ancestries their effect is consistently concentrated in genes under strong evolutionary constraint. The deconvolution of this overall contribution into individual genes that may have ancestry-specific effects will require the sequencing of more individuals of diverse backgrounds. Achieving diversity in human genetic research must be a top priority to prevent health disparities from worsening as findings from genetic research begin to be translated into clinical practice.

Methods

Cohorts

A brief description of each contributing sample collection of PGC3SEQ is available in the Supplementary Note, along with the institutional review boards that approved the sample collections. To ensure compatibility with Psychiatric Genomics Consortium definitions, we defined cases as those having a diagnosis of SCZ or a schizoaffective disorder. A total of 23,352 samples selected to be nonoverlapping with SCHEMA as well as other previous and ongoing sequencing efforts in the field were identified and sequenced (Supplementary Table 1). The PGC3SEQ study protocol was approved by the Icahn School of Medicine at Mount Sinai ethical review board (16-00101).

Gene panel construction

We intended to build a panel of putative SCZ risk genes from within which the majority of new discoveries from additional WES and WGS would come. To this end, we applied both traditional burden statistics and the generalized/gene set transmission and de novo association test (gTADA) to the SCHEMA data.

Traditional burden statistics

For each gene in SCHEMA, the enrichment statistics of rare variants in cases compared with controls were calculated using Fisher’s exact test separately for PTVs and damaging missense variants, then the two classes of variants were combined using meta-analysis to generate a gene-level P value. Of note, this gene-level P value is different from that in the SCHEMA publication, which used a slightly different strategy in combining PTVs and missense variants, additionally incorporated evidence from de novo mutations using trio data and included external gnomAD controls. Such analysis strategy changes in the later stage of SCHEMA have led to nontrivial changes in gene rank, which may impact the power of our panel to implicate disease genes.
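As an illustration of the per-gene test, the sketch below computes a one-sided Fisher's exact P value from carrier counts using the hypergeometric distribution. Treating each individual as a carrier or noncarrier and testing only for enrichment in cases are simplifying assumptions. The text does not specify the sidedness or whether variant or carrier counts were used.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact P value for carrier enrichment in cases.

    Sums hypergeometric probabilities of seeing at least `case_carriers`
    carriers among cases, with all table margins held fixed.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, carriers)
    upper = min(carriers, case_total)
    tail = sum(
        comb(case_total, k) * comb(control_total, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denominator
```

For example, `fisher_exact_greater(2, 2, 0, 2)` evaluates the most extreme table with two carriers among four individuals.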

gTADA

gTADA is a generalized Bayesian framework where de novo and rare variant case/control data are integrated with gene-level external information to identify risk genes for neuropsychiatric disorders^33,34. We first sought to identify gene sets associated with SCZ in SCHEMA. Through curation of the literature, we identified an initial set of ~160 candidate gene sets. Next, each set was tested independently for association with SCZ in SCHEMA data using gTADA. From all of the sets tested, we identified 27 significantly enriched gene sets. We then calculated a joint enrichment Z score from the marginal Z scores and the gene set correlation matrix and kept the 25 gene sets with positive joint Z scores (Supplementary Table 2). For each of the 25 sets retained, gene-level statistics (posterior probability of being a risk gene) were then calculated. The genes were then ranked by this metric and the mean ranking across the 25 ranks was calculated.
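The final ranking step can be sketched as follows: within each gene set, genes are ordered by posterior probability, and each gene's ranks are then summarized across sets. The gene names and posterior values are hypothetical placeholders, and breaking ties alphabetically is an assumption made only to keep the example deterministic.

```python
def rank_genes(posteriors):
    """Return {gene: rank}, where rank 1 is the highest posterior probability."""
    ordered = sorted(posteriors, key=lambda gene: (-posteriors[gene], gene))
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def summarize_ranks(per_set_posteriors):
    """Mean and minimum rank per gene across gene sets.

    per_set_posteriors: list of {gene: posterior}, one dict per gene set,
    all covering the same genes.
    """
    per_set_ranks = [rank_genes(p) for p in per_set_posteriors]
    n_sets = len(per_set_ranks)
    return {
        gene: {
            "mean_rank": sum(r[gene] for r in per_set_ranks) / n_sets,
            "min_rank": min(r[gene] for r in per_set_ranks),
        }
        for gene in per_set_ranks[0]
    }


# Hypothetical posteriors for three genes in two gene sets.
toy_posteriors = [
    {"GENE_A": 0.9, "GENE_B": 0.5, "GENE_C": 0.1},
    {"GENE_A": 0.2, "GENE_B": 0.8, "GENE_C": 0.3},
]
summary = summarize_ranks(toy_posteriors)
```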

Combining traditional burden statistics and gTADA, genes in the top 100 based on the gTADA mean ranking across the 25 ranks or the top 100 based on the minimum ranking across the 25 ranks and/or the top 100 based on the burden test were included in the panel (Fig. 1b and Supplementary Table 3; n = 139 genes; six were later removed due to the logistics of designing the sequencing panel). We next included four genes with evidence for association with SCZ in both GWASs and SCHEMA, with the criteria being: gene burden test P value < 0.05; gene with a top 200 rank in gTADA; and gene start and stop positions spanning an SNP associated with SCZ in GWAS or, if not, gene located in a GWAS locus with fewer than or equal to ten genes. Finally, an additional 24 genes were chosen for inclusion by taking the best 24 gTADA rankings of the remaining genes with a burden P value < 0.05.
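The union rule for the core panel, and the bookkeeping that leads to 161 genes, can be sketched as below. The scores passed to the selection function are hypothetical (smaller is better in every list), and the function names are illustrative. Only the counts in the final lines come from the text.

```python
def top_n(scores, n):
    """Genes with the n smallest scores (smaller means better ranked)."""
    return set(sorted(scores, key=lambda gene: (scores[gene], gene))[:n])


def select_core_panel(mean_rank, min_rank, burden_p, n=100):
    """Union of the top-n genes from each of the three ranked lists."""
    return top_n(mean_rank, n) | top_n(min_rank, n) | top_n(burden_p, n)


# Panel-size bookkeeping using the counts reported in the text.
core_genes = 139          # union of the three top-100 lists
removed_for_design = 6    # dropped for panel-design logistics
gwas_supported = 4        # genes supported by both GWAS and SCHEMA
additional_gtada = 24     # best remaining gTADA ranks with burden P < 0.05
panel_size = core_genes - removed_for_design + gwas_supported + additional_gtada
print(panel_size)  # 161
```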

Based on the observation that gene-level rare single-nucleotide variant burden statistics have been consistent across ancestries in a wide range of diseases^{25,26,27,28,29,30}, our targeted panel was expected to have broad utility across ancestries, even though its construction used EUR-dominant datasets. This was further consolidated by findings from our own ancestry-stratified analysis (Fig. 2b).

Sequencing and variant calling

Ion AmpliSeq technology is an amplicon-based enrichment method for creating sequencing libraries. We used Ion AmpliSeq Designer version 6.13 to design amplicons that cover the exons of the 161 genes defined based on the Ion hg19 reference. The mean and median percentages of covered base pairs across all exons were 97.7 and 100%, respectively. Sequencing of the PGC3SEQ samples was performed on the Ion Torrent platform at Sema4 between June 2018 and April 2019. Sequencing plates were matched with respect to ancestry and case versus control composition whenever possible. The average sequencing depth across all samples was 224×. The Sema4 sequencing facility returned to the research team BAM files with flow signal and associated quality control metrics. Single-sample calling was performed using Torrent variantCaller version 5.8.0, which is specially optimized to exploit the underlying flow signal information generated by the Ion Torrent sequencing. Sites were left aligned and normalized and multiallelic sites were split into separate lines using BCFtools version 1.9 (http://samtools.github.io/bcftools/).

Genotype-level quality control

We interrogated the call set with respect to a variety of quality control metrics and implemented procedures to ensure rigorous quality control standards. In the absence of well-established quality control procedures specifically for Ion Torrent data, we drew on the idea of GATK’s variant quality score recalibration technique and developed a machine-learning genotype-level filter based on 177 quality metrics and annotation profiles, including Ion Torrent sequencing metrics such as QUAL, FMT/GQ and FMT/DP, allele-related metrics such as AF, HRUN and MLLD and coverage and allele frequency from the gnomAD database version 2 (https://gnomad.broadinstitute.org). Considering that the majority of SCHEMA data with which we meta-analyzed were generated on the Illumina platform, we calibrated our Ion Torrent targeted sequencing data using a subset of the control samples (n = 1,347) with available Illumina WES data. Specifically, we used XGBoost version 1.3 (ref. ⁵⁴) in Python version 3.7.3 to train the classifier in 70% of the Ion Torrent–Illumina paired data using Illumina as the ground truth. In the remaining 30% test set, the classifier achieved an area under the curve of 0.95, an accuracy of 95.3% and a false discovery rate of 4.4% for SNPs and an accuracy of 99.0% and a false discovery rate of 6.4% for indels. Applying the trained classifier to the test dataset improved the concordance between Ion Torrent and Illumina calls from 83.1 to 95.7%. We also compared our machine-learning classifier with a set of conventional hard filters and confirmed that the classifier performed uniformly better on all metrics considered (sensitivity, specificity, accuracy and false discovery rate).
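The evaluation metrics named above can be made concrete with a small sketch. It shows a 70/30 split and the four metrics computed from binary labels, where the truth label stands for agreement with the Illumina call. It does not reproduce the XGBoost classifier itself; the function names, the fixed random seed and the toy labels are assumptions for illustration.

```python
import random


def split_train_test(items, train_percent=70, seed=0):
    """Shuffle and split items into training and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]


def classifier_metrics(truth, predicted):
    """Sensitivity, specificity, accuracy and false discovery rate.

    truth / predicted: sequences of booleans (True = genotype call is real).
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "fdr": ratio(fp, fp + tp),
    }


# Toy labels: 2 true positives, 1 false negative, 1 false positive, 1 true negative.
metrics = classifier_metrics(
    [True, True, True, False, False],
    [True, True, False, True, False],
)
```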

Applying the machine-learning filter to the entire dataset, 83.2% of the calls were retained, and among the passed variants, 96% were SNPs and 4% were indels. Five out of 919 detected multiallelics passed the filter and were split into multiple biallelic variants. The proportion of calls that passed the filter was similar between the samples used for model training and testing (n = 1,347) and the remaining samples (83.9 versus 83.1%, respectively). Likewise, the pass rate was comparable between sites that were covered by both Illumina WES capture and our sequencing panel (33.8% of the calls fell into these regions) and sites covered only by our panel (85.8 versus 81.8%), indicating that the machine-learning model generalized well to new samples and new genomic regions.

Sample- and site-level quality control

To identify low-quality and outlier samples, we examined per-sample sequencing quality metrics, including the number of mapped reads, average read depth across the panel, on-target rate and uniformity rate. We also examined sample-level call set characteristics, including the call rate, inbreeding coefficient, transition-to-transversion ratio at heterozygote sites, heterozygous-to-homozygous call ratio, total number of variants, number of SNPs and indels and number of singletons. We visualized the distribution of the above quality control metrics (Supplementary Fig. 1) and identified 94 low-quality/outlier samples that met any one of the following criteria: MappedReads < 400,000; MeanDepth < 40; OnTarget < 80; Uniformity < 65; MissingCallRate > 0.3; Inbreeding_F > 0.6; Het_Hom_Ratio < 0.6; Total_SNPs < 400; and Total_Indels < 10. The number of low-quality or outlier samples was not significantly different between cases and controls (55 out of 12,045 cases were low quality or outliers and 38 out of 11,212 controls were low quality or outliers; chi-squared test, P = 0.1878). All of the quality control metrics were distributed similarly between SCZ cases and controls (Supplementary Fig. 2).
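The sample exclusion rule can be expressed as a small table of thresholds, with a sample flagged if it fails any one of them. The metric names and cutoffs are taken from the text; the dictionary layout, the function name and the example values are illustrative assumptions.

```python
import operator

# (comparison, cutoff): a sample is flagged when comparison(value, cutoff) is true.
QC_RULES = {
    "MappedReads": (operator.lt, 400_000),
    "MeanDepth": (operator.lt, 40),
    "OnTarget": (operator.lt, 80),
    "Uniformity": (operator.lt, 65),
    "MissingCallRate": (operator.gt, 0.3),
    "Inbreeding_F": (operator.gt, 0.6),
    "Het_Hom_Ratio": (operator.lt, 0.6),
    "Total_SNPs": (operator.lt, 400),
    "Total_Indels": (operator.lt, 10),
}


def failed_rules(sample):
    """Names of the QC criteria that a sample's metrics violate."""
    return [
        name
        for name, (compare, cutoff) in QC_RULES.items()
        if compare(sample[name], cutoff)
    ]


# Hypothetical metrics for one well-behaved sample.
good_sample = {
    "MappedReads": 900_000, "MeanDepth": 220, "OnTarget": 95,
    "Uniformity": 90, "MissingCallRate": 0.02, "Inbreeding_F": 0.01,
    "Het_Hom_Ratio": 1.5, "Total_SNPs": 700, "Total_Indels": 30,
}
# The same sample with shallow coverage and a high missing-call rate.
bad_sample = dict(good_sample, MeanDepth=25, MissingCallRate=0.4)
```

A sample would be excluded whenever `failed_rules(sample)` returns a non-empty list.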

When combining data from single-sample calls, a no call at a particular site in a particular sample was deemed as a homozygous reference genotype if the depth at that site in that sample was greater than ten and missing otherwise. Lastly, we applied the site-level filters to exclude variants with a missing rate of >10%.
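The handling of no-calls and the site-level missingness filter can be sketched as follows. Encoding a homozygous reference genotype as 0 and a missing genotype as `None` is an assumption of this example, as are the function names.

```python
HOM_REF = 0
MISSING = None


def resolve_no_call(depth, min_depth=10):
    """Treat a no-call as homozygous reference only when depth exceeds min_depth."""
    return HOM_REF if depth > min_depth else MISSING


def site_missing_rate(genotypes):
    """Fraction of samples with a missing genotype at one site."""
    return sum(1 for g in genotypes if g is MISSING) / len(genotypes)


def passes_site_filter(genotypes, max_missing=0.10):
    """Keep a site unless its missing rate is above max_missing (>10% is excluded)."""
    return site_missing_rate(genotypes) <= max_missing
```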

Sample relatedness

We used the population structure-adjusted relatedness estimation methods PC-AiR and PC-Relate to estimate pairwise relatedness between samples. In addition to the quality control steps performed per previous sections, we further performed linkage disequilibrium pruning on the dataset and removed indels before relatedness estimation. Considering that the conventional kinship coefficient ranges for varying degrees of relatedness may not be appropriate when the estimates are from targeted sequencing data covering only a small fraction of the genome, we derived empirical boundaries based on the clustering of sample pairs on an identity-by-descent kinship scatterplot (Supplementary Fig. 3). The unrelated and related pairs were clearly separated into two clusters with distinct patterns (unrelated pairs: lower oval-shaped cluster; related pairs: upper left). We identified 1,096 pairs of genetic relatives and retained one sample from each pair according to the following prioritization scheme: (1) the sample has fewer genetic relatives in the entire cohort; (2) patient with SCZ; (3) the sample has available genome-wide SNP data; (4) the sample has self-reported sex information; and (5) the sample has fewer missing genotypes for variants with a minor allele frequency (MAF) of <0.1%. These measures yielded a total of 22,135 unrelated individuals for downstream analysis.
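The prioritization scheme for choosing which member of a related pair to keep can be written as a sort key. This sketch assumes the five criteria are applied in order as successive tie-breakers, which the text implies but does not state explicitly. The field names are hypothetical.

```python
def priority_key(sample):
    """Smaller tuples are preferred; criteria are applied in the listed order."""
    return (
        sample["n_relatives"],           # (1) fewer genetic relatives in the cohort
        not sample["is_case"],           # (2) patients with SCZ preferred
        not sample["has_gwas_data"],     # (3) genome-wide SNP data available
        not sample["has_reported_sex"],  # (4) self-reported sex available
        sample["n_missing_rare"],        # (5) fewer missing genotypes, MAF < 0.1%
    )


def keep_from_pair(sample_a, sample_b):
    """Return the sample to retain from a pair of genetic relatives."""
    return min(sample_a, sample_b, key=priority_key)


# Hypothetical pair: equal relative counts, so case status decides.
control = {"id": "S1", "n_relatives": 1, "is_case": False,
           "has_gwas_data": True, "has_reported_sex": True, "n_missing_rare": 0}
case = {"id": "S2", "n_relatives": 1, "is_case": True,
        "has_gwas_data": False, "has_reported_sex": True, "n_missing_rare": 3}
kept = keep_from_pair(control, case)
```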

Control for population stratification

We calculated ancestry principal components for the 22,135 unrelated individuals in PLINK version 1.9 (ref. ⁵⁵) using 1,392 linkage disequilibrium-pruned common SNPs (MAF > 1%) that passed all quality control steps. Cases and controls were broadly matched on population structures (Supplementary Fig. 5a,b). The first five principal components were used in later association analysis to control for population substructure, based on the observation that: (1) the first five principal components explained 75% of the cumulative variance in the genetic variation among study participants; and (2) the ability of principal components to separate ancestral genetic backgrounds dissipated after the first five principal components (Supplementary Fig. 5c).

Ancestry assignment

The genetic ancestry assignment of the PGC3SEQ participants was done by calculating principal components jointly with 1000 Genomes phase 3 participants (n = 2,501), followed by a K-nearest-neighbor classification using the top three principal components. We restricted the analysis to 1,372 linkage disequilibrium-pruned common SNPs (MAF > 1%) that were present in both the study dataset and the reference dataset (1000 Genomes). The reference data were first cleaned and quality controlled using PLINK by filtering for missingness per individual (<10%) and missingness per SNP (<10%) and then subsetted to the variant set that passed all of the quality control filters in the PGC3SEQ cohort. The cleaned reference and study datasets were harmonized, combined and pruned for linkage disequilibrium, then input into PLINK for principal component analysis with default settings.

K-nearest-neighbor classification was used for ancestry assignment of the study participants. Cross-validation showed that K = 5 and the first three principal components best classified participants into five super-populations (AFR, AMR, EAS, EUR and SAS). Applying the trained classification model, we assigned each study participant to the super-population most common among the participant’s five nearest neighbors. About half of our study participants had self-reported ancestry and ethnicity data, which were broadly consistent with their genetically inferred ancestry. There was reasonable concordance between the country of origin of the sample collection and assigned ancestries (Supplementary Fig. 6).
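A minimal sketch of the K-nearest-neighbor assignment, using Euclidean distance on three principal components and a majority vote among five reference neighbors. The distance metric, the tie-breaking behavior and the toy coordinates are assumptions; the text specifies only K = 5 and the use of the first three principal components.

```python
from collections import Counter
from math import dist


def knn_assign(query_pcs, reference, k=5):
    """Assign the most common super-population among the k nearest references.

    reference: list of (pcs, label) tuples from the reference panel.
    Ties in the vote go to the label whose neighbor is closest, because
    votes are tallied in order of increasing distance.
    """
    nearest = sorted(reference, key=lambda item: dist(query_pcs, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical reference panel with two well-separated clusters.
toy_reference = (
    [((0.0 + i * 0.01, 0.0, 0.0), "EUR") for i in range(5)]
    + [((10.0 + i * 0.01, 10.0, 10.0), "AFR") for i in range(5)]
)
assigned = knn_assign((0.1, 0.0, 0.0), toy_reference)
```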

We then ran another round of principal component analysis for each global population separately to generate ancestry-specific principal components, identified ancestry-specific outliers on the principal component plots and removed the outliers and recalculated the principal components until no obvious outlier existed. After two rounds of recalculation, two EAS and seven SAS individuals were flagged as outliers within ancestry and were not included in the analysis in which stratification by population was performed.

Variant annotation

We employed the same variant annotation workflow as was used in SCHEMA for ease of replication and comparison. Specifically, annotation by LOFTEE (as implemented in the Variant Effect Predictor)⁵⁶ was applied to variants that passed all quality control filters, and the analysis was restricted to the canonical transcript with the most damaging annotation. The three broad types of coding variants analyzed were: (1) PTVs, defined as any mutation that introduced a stop codon, changed the frame of the open reading frame or introduced a change at a predicted splice donor or splice acceptor site; (2) missense variants, which included any single-nucleotide variant that caused an amino acid change; and (3) synonymous variants, which resulted in no amino acid change, as a negative control. Missense variants were further partitioned into groups with increasing deleteriousness based on the MPC score annotation³⁵. Tier 1 missense variants had an MPC score of >3, tier 2 missense variants had an MPC score of 2–3 and an MPC score of <2 indicated nondamaging missense variants. MPC was chosen as the missense classifier based on the SCHEMA results, in which, compared with Combined Annotation Dependent Depletion and PolyPhen, MPC most powerfully prioritized damaging missense de novo variants in ASD and developmental delay/intellectual disability trios¹.
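The missense tiering rule maps directly onto a small function. The thresholds come from the text; assigning scores of exactly 2 and exactly 3 to tier 2 is an assumption about boundary handling, and the function and label names are illustrative.

```python
def missense_tier(mpc_score):
    """Classify a missense variant by its MPC score.

    Tier 1: MPC > 3; tier 2: MPC 2-3 (boundaries assumed inclusive);
    nondamaging: MPC < 2.
    """
    if mpc_score > 3:
        return "tier1"
    if mpc_score >= 2:
        return "tier2"
    return "nondamaging"
```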

Use of SCHEMA data

SCHEMA is a large multisite collaboration aggregating, generating and analyzing high-throughput exome sequencing data of individuals with SCZ and controls to advance gene discovery. We accessed the post-quality control data of a subset of SCHEMA case–control samples with appropriate sharing permissions at the time of this work and did not reperform genotype- and sample-level filtering. In particular, the controls from gnomAD, as included in SCHEMA, were not used in the current study due to data sharing restrictions. After excluding 216 samples detected as genetic duplicates with a PGC3SEQ sample, the available SCHEMA datasets contained 19,108 cases and 18,001 controls (Supplementary Table 4). We used the genetic ancestry label for each individual determined by the SCHEMA analysis team and, within each ancestral group, calculated population-specific principal components using linkage disequilibrium-pruned SNPs with a MAF of >1%, a call rate of >95% and a Hardy–Weinberg P value of 1 × 10⁻⁶. Using a similar procedure to that used in the PGC3SEQ data analysis, we detected and removed 24 outlier samples from the EAS group. Supplementary Fig. 7 shows the ancestral composition of the SCHEMA cohort and Supplementary Table 4 displays the number of SCHEMA cases and controls used for this study by original sample collection.

Statistical approaches for global enrichment across constrained genes

We defined rare variants as those with a minor allele count of ≤5 in the entire sample for any ancestry-combined analysis and relaxed this threshold to MAF < 0.1% in ancestry-stratified analysis to preserve power. We counted the number of rare variants by annotation type observed in each participant in individual genes and added up the counts across the 80 constrained genes. The association between rare variant burden in the gene set of interest and SCZ status was tested using logistic regression with Firth’s penalized likelihood method to account for sparse data⁵⁷, while adjusting for ancestry principal components and baseline rare variant burden. The first five global principal components were used in the ancestry-combined analysis and the first four principal components calculated within each ancestry were used in ancestry-stratified analysis. The baseline rare variant burden was used to control for technical and biological differences between cases and controls. To ensure minimal correlation between the baseline burden and the burden of interest, we used the rare synonymous variant count as the baseline burden when the burden of interest was a PTV or missense variant and the rare nonsynonymous variant count as the baseline burden when the burden of interest was a synonymous variant. The significance threshold for the enrichment analysis was determined using the Bonferroni method, correcting for the five annotation classes tested (PTVs, the three missense groups and synonymous variants); that is, 0.05/5 = 0.01. P < 0.05 was used for nominal significance.
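The burden counting and the Bonferroni threshold described above can be sketched as follows. The record layout (keys `participant`, `gene`, `annotation`, `allele_count`), the function name and the toy gene names are hypothetical; the sketch covers only the counting step, not the Firth regression itself.

```python
from collections import defaultdict

def rare_variant_burden(records, gene_set, annotation, max_allele_count=5):
    """Count rare variants of one annotation class per participant,
    summed across all genes in gene_set.

    records: iterable of dicts with hypothetical keys 'participant', 'gene',
             'annotation' and 'allele_count' (minor allele count in the
             entire sample).
    """
    burden = defaultdict(int)
    for rec in records:
        is_rare = rec["allele_count"] <= max_allele_count
        if is_rare and rec["gene"] in gene_set and rec["annotation"] == annotation:
            burden[rec["participant"]] += 1
    return dict(burden)

# Bonferroni threshold for the five annotation classes tested
N_ANNOTATION_CLASSES = 5
BONFERRONI_ALPHA = 0.05 / N_ANNOTATION_CLASSES

# Toy carrier records (illustrative only, not study data)
records = [
    {"participant": "P1", "gene": "GENE_A", "annotation": "PTV", "allele_count": 1},
    {"participant": "P1", "gene": "GENE_B", "annotation": "PTV", "allele_count": 7},
    {"participant": "P2", "gene": "GENE_B", "annotation": "synonymous", "allele_count": 2},
    {"participant": "P2", "gene": "GENE_X", "annotation": "PTV", "allele_count": 1},
]
constrained_genes = {"GENE_A", "GENE_B"}
ptv_burden = rare_variant_burden(records, constrained_genes, "PTV")
synonymous_burden = rare_variant_burden(records, constrained_genes, "synonymous")
```

In this toy input only one PTV counts toward the burden: the second PTV in P1 exceeds the allele count cutoff and the PTV in P2 lies outside the gene set. The synonymous count would then serve as the baseline covariate for the PTV test.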

Using the available individual-level SCHEMA data, we performed global enrichment tests across the 80 constrained genes using approaches similar to those in the PGC3SEQ analysis. Specifically, we used logistic regression with Firth’s correction and adjusted for ancestry principal components, sex, sequencing cohort and baseline rare variant burden. The first five global principal components were used in the ancestry-combined analysis and the first four principal components calculated within each ancestry were used in ancestry-stratified analysis.

Four of the global populations (AFR, AMR, EUR and EAS) had n > 100 in both PGC3SEQ and SCHEMA and we used inverse variance-weighted meta-analysis to combine their odds ratios in the two cohorts (sample size by population in Fig. 1a). To balance the power reduction due to sample stratification, we relaxed the definition of rare variants to include those with a MAF of <0.1% (compared with a minor allele count of ≤5 in the ancestry-combined analysis). In the full SCHEMA cohort, missense variants with MPC > 3 had a global signal on par with PTVs¹; therefore, we grouped these two types of variants together in our analysis of both cohorts to further increase the power. Only PGC3SEQ contributed to the analysis of the SAS population.
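A minimal sketch of inverse variance-weighted pooling of two cohort-level estimates is given below. It assumes a fixed-effect model on the log odds ratio scale, which the text does not state explicitly, and the numeric inputs are made up for illustration.

```python
import math

def ivw_meta(log_odds_ratios, standard_errors):
    """Fixed-effect inverse variance-weighted meta-analysis.

    Returns the pooled log odds ratio and its standard error.
    """
    # Each study is weighted by the inverse of its sampling variance
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Illustrative inputs for one ancestry group: (PGC3SEQ, SCHEMA) estimates
pooled_log_or, pooled_se = ivw_meta([0.4, 0.2], [0.1, 0.1])
pooled_or = math.exp(pooled_log_or)
```

With equal standard errors the pooled estimate is the simple average of the two log odds ratios, and the pooled standard error shrinks by a factor of the square root of two.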

Statistical approaches for gene-based tests

Gene-based tests aggregate the effects of multiple rare variants and can increase the power to detect genetic associations⁵⁸. It is reasonable to assume that rare disruptive variants in a gene all have the same effect direction (variant alleles associated with higher risk) and under this scenario a burden test is appropriate. Considering the sparsity of the observed count data, we used Fisher’s exact test to compare the burden of PTVs in cases and controls and computed two-sided P values. The total disruptive burden per gene was quantified by adding up all PTVs (or synonymous variants, as a negative control) annotated to the gene. Unlike SCHEMA, we did not incorporate missense variants because they were not significantly enriched globally (Fig. 2a). We did not pursue a meta-analysis of the PTV and MPC > 3 variants because the extremely low number of MPC > 3 variants prohibited a reliable estimation of their effect magnitude, which would be used as weights in a meta-analysis. Although Fisher’s exact test cannot accommodate covariates such as ancestry principal components and baseline burden, this did not adversely affect our analysis as the Q–Q plot showed no sign of inflation in the statistics (Supplementary Fig. 10, top row).
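For a single gene, a two-sided Fisher's exact test on a 2 × 2 table can be sketched in plain Python. The table layout (carriers versus non-carriers in cases and controls) is an assumption about how the counts are arranged, and the two-sided P value follows the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: case carriers, b: case non-carriers,
    c: control carriers, d: control non-carriers (assumed layout).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among cases
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over tables at least as extreme; small tolerance guards float ties
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)

# Toy counts (illustrative only): 3 of 4 cases and 1 of 4 controls carry a PTV
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Because the test conditions on the table margins, it needs no asymptotic approximation, which is why it suits genes with only a handful of observed PTVs.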

In the gene-level analysis of SCHEMA, case–control cohorts and trio cohorts were meta-analyzed, and rare variants found in both types of cohort were not double counted. We combined gene-level P values from PGC3SEQ and SCHEMA (summary statistics obtained from the SCHEMA publication) using signed Stouffer’s method, with the sign of the Z scores being the effect direction of the PTVs and the weights of each study calculated as:

$$\frac{4}{\frac{1}{\#\,\mathrm{cases}} + \frac{1}{\#\,\mathrm{controls}}} + (\#\,\mathrm{trios\ in\ SCHEMA})$$

The above equation applies equal weight to the case–control data and trio data. Since only a subset of genes had de novo mutations in SCHEMA trios and the number of trios was small relative to the case–control sample size, fine-tuning weights would not meaningfully change our results. This meta-analysis totaled 35,828 SCZ cases and 107,877 controls, representing the largest SCZ sequencing dataset to date. The exome-wide significance level was determined to be 0.05/(23,321 tests performed in SCHEMA + 161 tests performed in PGC3SEQ) = 2.13 × 10⁻⁶. As expected, the meta-analysis P values deviated substantially from the null (Supplementary Fig. 10, middle left), consistent with an enrichment of risk genes in the targeted panel. Gene-level synonymous variant P values displayed the expected null distribution (Supplementary Fig. 10 (middle right) and Supplementary Table 9), assuring that the gene-level PTV results were free from technical or methodological artifacts agnostic to variant annotation.
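The weighting and combination steps can be sketched as below. The sketch converts two-sided P values to signed Z scores and combines them as Z = Σ wᵢZᵢ / √(Σ wᵢ²); using the stated weight directly as wᵢ (rather than, for example, its square root) is an assumption, as are the function names and the example inputs.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def effective_weight(n_cases, n_controls, n_trios=0):
    """Study weight: effective case-control sample size plus number of trios."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls) + n_trios

def signed_stouffer(p_values, directions, weights):
    """Weighted, signed Stouffer combination of two-sided P values.

    directions: +1 if PTVs are enriched in cases, -1 if enriched in controls.
    Returns the combined Z score and its two-sided P value.
    """
    # Two-sided P -> signed Z; -inv_cdf(p / 2) stays accurate for small P
    z_scores = [d * -_STD_NORMAL.inv_cdf(p / 2.0)
                for p, d in zip(p_values, directions)]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    combined_z = numerator / math.sqrt(sum(w * w for w in weights))
    # Two-sided tail probability via the complementary error function
    combined_p = math.erfc(abs(combined_z) / math.sqrt(2.0))
    return combined_z, combined_p

# Exome-wide threshold: 0.05 divided by the total number of gene-level tests
EXOME_WIDE_ALPHA = 0.05 / (23321 + 161)

# Illustrative inputs only: two studies, same direction, equal weights
z, p = signed_stouffer([0.05, 0.05], [1, 1], [1.0, 1.0])
```

For a balanced study with no trios the weight reduces to the total sample size; for example, 100 cases and 100 controls give a weight of 200.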

We then combined the two SCZ cohorts with the WES datasets of two other psychiatric diseases to identify genes shared across diagnoses. The two studies from which we obtained summary statistics were: (1) the latest release of the Autism Sequencing Consortium (ASC)⁴² (and we further converted the gene-level q values to P values); and (2) the WES of bipolar disorder by Palmer et al.⁴¹. Meta-analysis was performed as above and the same exome-wide significance threshold was applied (2.13 × 10⁻⁶). We noted some degree of control overlap between these studies (for example, SCHEMA and ASC both included Swedish controls from the same collection). As the overlap between SCHEMA and ASC constitutes only a small fraction of the entire sample, our analysis (and the discovery of PCLO) should be only minimally affected. Given the makeup of the contributing cohorts, the control overlap between SCZ and bipolar disorder is expected to be greater, although we did not identify any new genes.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

We describe all of the datasets in the Methods and Supplementary Information. The raw PGC3SEQ genotype and phenotype datasets are permitted to be distributed at the individual level and we have deposited the data in the database of Genotypes and Phenotypes under accession number phs003138.v1.p1. We provide the aggregated variant counts at the gene and gene set level in Supplementary Tables 1–9. SCHEMA summary-level data are available online for viewing and download (https://schema.broadinstitute.org). SCHEMA individual-level whole-exome sequence data are hosted on the controlled-access Terra platform (https://app.terra.bio/) and shared with the collaborating study groups. Requests for access to the controlled datasets are managed by data custodians of the SCHEMA Consortium and the Broad Institute and are sent to sample contributing investigators for approval. The gnomAD database can be accessed at https://gnomad.broadinstitute.org.

Code availability

The software and code used in this study are described in the Methods. In brief, we used Torrent variantCaller version 5.8.0 to call variants from the raw sequence data. For quality control and preprocessing, we used XGBoost version 1.3 in Python version 3.7.3, BCFtools version 1.9 and PLINK version 1.9. Reanalysis of the SCHEMA cohort was performed using Hail 0.1 and 0.2 (https://hail.is/). Main analyses of the PGC3SEQ data and their meta-analysis with SCHEMA were performed using R version 3.6 with various libraries. Visualization was generated with ggplot2 version 3.3.

References

Singh, T. et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature 604, 509–516 (2022).
Laursen, T. M., Nordentoft, M. & Mortensen, P. B. Excess early mortality in schizophrenia. Annu. Rev. Clin. Psychol. 10, 425–448 (2014).
Owen, M. J., Sawa, A. & Mortensen, P. B. Schizophrenia. Lancet 388, 86–97 (2016).
Hjorthøj, C., Stürup, A. E., McGrath, J. J. & Nordentoft, M. Years of potential life lost and life expectancy in schizophrenia: a systematic review and meta-analysis. Lancet Psychiatry 4, 295–301 (2017).
Trubetskoy, V. et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature 604, 502–508 (2022).
Marshall, C. R. et al. Contribution of copy number variants to schizophrenia from a genome-wide study of 41,321 subjects. Nat. Genet. 49, 27–35 (2017).
Lescai, F. et al. Meta-analysis of Scandinavian schizophrenia exomes. Preprint at bioRxiv https://doi.org/10.1101/836957 (2019).
Steinberg, S. et al. Truncating mutations in RBM12 are associated with psychosis. Nat. Genet. 49, 1251–1254 (2017).
Singh, T. et al. The contribution of rare variants to risk of schizophrenia in individuals with and without intellectual disability. Nat. Genet. 49, 1167–1173 (2017).
Genovese, G. et al. Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat. Neurosci. 19, 1433–1441 (2016).
Singh, T. et al. Rare loss-of-function variants in SETD1A are associated with schizophrenia and developmental disorders. Nat. Neurosci. 19, 571–577 (2016).
Takata, A. et al. Loss-of-function variants in schizophrenia risk and SETD1A as a candidate susceptibility gene. Neuron 82, 773–780 (2014).
Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature 506, 179–184 (2014).
Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014).
World Population Prospects 2019: Highlights (United Nations, 2019).
Popejoy, A. B. & Fullerton, S. M. Genomics is failing on diversity. Nature 538, 161–164 (2016).
Peterson, R. E. et al. Genome-wide association studies in ancestrally diverse populations: opportunities, methods, pitfalls, and recommendations. Cell 179, 589–603 (2019).
Bigdeli, T. B. et al. Contributions of common genetic variants to risk of schizophrenia among individuals of African and Latino ancestry. Mol. Psychiatry 25, 2455–2467 (2020).
Gulsuner, S. et al. Genetics of schizophrenia in the South African Xhosa. Science 367, 569–573 (2020).
Lam, M. et al. Comparative genetic architectures of schizophrenia in East Asian and European populations. Nat. Genet. 51, 1670–1678 (2019).
Li, Z. et al. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia. Nat. Genet. 49, 1576–1583 (2017).
Li, Z. et al. Genome-wide analysis of the role of copy number variation in schizophrenia risk in Chinese. Biol. Psychiatry 80, 331–337 (2016).
De Candia, T. R. et al. Additive genetic variation in schizophrenia risk is shared by populations of African and European descent. Am. J. Hum. Genet. 93, 463–470 (2013).
Mensah-Ablorh, A. et al. Meta-analysis of rare variant association tests in multiethnic populations. Genet. Epidemiol. 40, 57–65 (2016).
Hindy, G. et al. Rare coding variants in 35 genes associate with circulating lipid levels—a multi-ancestry analysis of 170,000 exomes. Am. J. Hum. Genet. 109, 81–96 (2022).
Ulirsch, J. C. et al. The genetic landscape of Diamond–Blackfan anemia. Am. J. Hum. Genet. 104, 356 (2019).
Guo, M. H., Plummer, L., Chan, Y.-M., Hirschhorn, J. N. & Lippincott, M. F. Burden testing of rare variants identified through exome sequencing via publicly available control data. Am. J. Hum. Genet. 103, 522–534 (2018).
Shaw, N. D. et al. SMCHD1 mutations associated with a rare muscular dystrophy can also cause isolated arhinia and Bosma arhinia microphthalmia syndrome. Nat. Genet. 49, 238–248 (2017).
Zhao, J. et al. A burden of rare variants associated with extremes of gene expression in human peripheral blood. Am. J. Hum. Genet. 98, 299–309 (2016).
Huffman, J. E. et al. Rare and low-frequency variants and their association with plasma levels of fibrinogen, FVII, FVIII, and vWF. Blood 126, e19–e29 (2015).
Singh, T., Neale, B., Daly, M. J. & SCHEMA Consortium Exome sequencing of 23,851 cases implicates novel risk genes and provides insights into the genetic architecture of schizophrenia. Eur. Neuropsychopharmacol. 29, S1098 (2019).
Singh, T., Neale, B., Daly, M. J. & SCHEMA Consortium Initial results from the meta-analysis of the whole-exomes of over 20,000 schizophrenia cases and 45,000 controls. Eur. Neuropsychopharmacol. 29, S813–S814 (2019).
Nguyen, H. T. et al. Integrative analysis of rare variants and pathway information shows convergent results between immune pathways, drug targets and epilepsy genes. Preprint at bioRxiv https://doi.org/10.1101/410100 (2018).
Nguyen, H. T. et al. Integrated Bayesian analysis of rare exonic variants to identify risk genes for schizophrenia and neurodevelopmental disorders. Genome Med. 9, 114 (2017).
Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. Preprint at bioRxiv https://doi.org/10.1101/148353 (2017).
McMillan, P. J. et al. Pathological tau drives ectopic nuclear speckle scaffold protein SRRM2 accumulation in neuron cytoplasm in Alzheimer’s disease. Acta Neuropathol. Commun. 9, 117 (2021).
Lester, E. et al. Tau aggregates are RNA–protein assemblies that mislocalize multiple nuclear speckle components. Neuron 109, 1675–1691.e9 (2021).
Fazeli, S. et al. A compound downregulation of SRRM2 and miR-27a-3p with upregulation of miR-27b-3p in PBMCs of Parkinson’s patients is associated with the early stage onset of disease. PLoS ONE 15, e0240855 (2020).
Kaplanis, J. et al. Evidence for 28 genetic disorders discovered by combining healthcare and research data. Nature 586, 757–762 (2020).
Liu, S. et al. Illuminating links between cis-regulators and trans-acting variants in the human prefrontal cortex. Genome Med. 14, 133 (2022).
Palmer, D. S. et al. Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia. Nat. Genet. 54, 541–547 (2022).
Satterstrom, F. K. et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell 180, 568–584.e23 (2020).
Bigdeli, T. B. et al. Genome-wide association studies of schizophrenia and bipolar disorder in a diverse cohort of US veterans. Schizophr. Bull. 47, 517–529 (2021).
Cross-Disorder Group of the Psychiatric Genomics Consortium Genomic relationships, novel loci, and pleiotropic mechanisms across eight psychiatric disorders. Cell 179, 1469–1482.e11 (2019).
Minelli, A. et al. PCLO gene: its role in vulnerability to major depressive disorder. J. Affect. Disord. 139, 250–255 (2012).
Choi, K. H. et al. Gene expression and genetic variation data implicate PCLO in bipolar disorder. Biol. Psychiatry 69, 353–359 (2011).
Sullivan, P. F. et al. Genome-wide association for major depressive disorder: a possible role for the presynaptic protein piccolo. Mol. Psychiatry 14, 359–375 (2009).
Lu, H. et al. Evaluating marginal genetic correlation of associated loci for complex diseases and traits between European and East Asian populations. Hum. Genet. 140, 1285–1297 (2021).
Guo, J. et al. Quantifying genetic heterogeneity between continental populations for human height and body mass index. Sci. Rep. 11, 5240 (2021).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Curtis, D. Polygenic risk score for schizophrenia is more strongly associated with ancestry than with schizophrenia. Psychiatr. Genet. 28, 85–89 (2018).
Brown, B. C. et al. Transethnic genetic-correlation estimates from summary statistics. Am. J. Hum. Genet. 99, 76–88 (2016).
Bragg, L. M., Stone, G., Butler, M. K., Hugenholtz, P. & Tyson, G. W. Shining a light on dark sequencing: characterising errors in Ion Torrent PGM data. PLoS Comput. Biol. 9, e1003031 (2013).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. Preprint at https://doi.org/10.48550/arXiv.1603.02754 (2016).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010).
Ma, C., Blackwell, T., Boehnke, M., Scott, L. J. & GoT2D investigators Recommended joint and meta-analysis strategies for case-control association testing of single low-count variants. Genet. Epidemiol. 37, 539–550 (2013).
Do, R., Kathiresan, S. & Abecasis, G. R. Exome sequencing and complex disease: practical aspects of rare variant association studies. Hum. Mol. Genet. 21, R1–R9 (2012).


Acknowledgements

A.W.C. is supported by the National Institute of Mental Health (NIMH; R01MH109536). L.M.H. is supported by the NIMH (R01MH118278, R01MH124839 and U01MH109536). M.C.O., M.J.O. and J.T.W. are supported by Medical Research Council Centre grant number MR/L010305/1 and program MR/P005748/1. For sample acquisition, curation and preparation, we are grateful to Leyden Delta, Magna Laboratories and their staff (M. Helthuis, J. Jansen and A. King). We also thank L. Hopkins and the core laboratory team at Cardiff University. B.N. is supported by 1R01MH124851. K.E.B. is supported by R01MH100125 and 1I01CX000995. J.D.B. is supported by P50MH066392. J.M.F. is supported by the Janette Mary O’Neil Research Fellowship and Australian National Health and Medical Research Council (NHMRC) Project Grant 1063960. P.R.S. is supported by the Australian NHMRC Program Grant 1037196 and Investigator Grant 1176716. D.B.W. and S.G.S. are supported by NHMRC grant 513861. M.R. has been funded by Instituto de Salud Carlos III projects PI18/00238 and PI18/00467 (cofunded by the European Regional Development Fund/European Social Fund A Way to Make Europe/Investing in Your Future). C.A. has received support from the Spanish Ministry of Science and Innovation Instituto de Salud Carlos III (PI19/024), cofinanced by European Regional Development Fund Funds from the European Commission, A Way of Making Europe, Centro de Investigación Biomédica en Red de Salud Mental, Madrid Regional Government (B2017/BMD-3740 AGES-CM-2) and European Union Horizon 2020 Program under the Innovative Medicines Initiative 2 Joint Undertaking (grant agreement number 115916 (Project PRISM) and grant agreement number 777394 (Project AIMS-2-TRIALS)), Fundación Familia Alonso and Fundación Alicia Koplowitz. We acknowledge the Biobanc of Parc Sanitari Sant Joan de Déu and Centro de Investigación Biomédica en Red de Salud Mental for samples and data procurement. J.G.-P. holds a Sara Borrell grant from Instituto de Salud Carlos III (CD20/00118). B.R. has received support from the Spanish Ministry of Science and Innovation Instituto de Salud Carlos III (PI18/00213 and Miguel Servet grants CPII21/00008 and MS16/00153), cofinanced by European Regional Development Fund Funds from the European Commission. M.G. and the work at IRCCS Centro San Giovanni di Dio Fatebenefratelli is supported by the Italian Ministry of Health (Ricerca Corrente). The CommonMind datasets were generated as part of the CommonMind Consortium, supported by funding from Takeda Pharmaceutical Company, F. Hoffmann-La Roche and National Institutes of Health grants R01MH085542, R01MH093725, P50MH066392, P50MH080405, R01MH097276, RO1MH-075916, P50M096891, P50MH084053S1, R37MH057881, AG02219, AG05138, MH06692, R01MH110921, R01MH109677, R01MH109897, U01MH103392 and U01MH116442, project ZIC MH002903 and contract HHSN271201300031C through the NIMH Intramural Research Program. Brain tissue for the study was obtained from the following brain bank collections: the Mount Sinai/JJ Peters VA Medical Center NIH Brain and Tissue Repository, University of Pennsylvania Alzheimer’s Disease Core Center, University of Pittsburgh Brain Tissue Donation Program and NIMH Human Brain Collection Core. P.R. is supported by R01AG067025, R01AG065582, R01AG050986, R01MH110921, U01MH116442, R01MH125246, R01MH106056 and R01MH109897. This study was supported by the Australian Schizophrenia Research Bank (chief investigators: V.C., U.S., R.J.S., A.V.J., B.M., P.T.M., S.V.C., F.A.H., C.P. and C.M.L.). The Australian Schizophrenia Research Bank is supported by Neuroscience Research Australia. The SCHEMA Consortium provided quality-controlled data on independent schizophrenia cohorts. We thank the SCHEMA Consortium team for sharing these data and results.

Author information

Authors and Affiliations

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Dongjing Liu, Dara Meyer, Brian Fennessy, Claudia Feng, Esther Cheng, You Jeong Park, Marysia-Kolbe Rieder, Steven Ascolillo, Agathe de Pins, Amanda Dobbyn, Emily Moya, Lillian Wilkins, John F. Fullard, Steven P. Kleopoulos, Panos Roussos & Alexander W. Charney
Wellcome Sanger Institute, Hinxton, UK
Claudia Feng
Pamela Sklar Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Jessica S. Johnson, Dannielle Lebovitch, Panos Roussos, Pamela Sklar, Eli A. Stahl & Laura M. Huckins
Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
You Jeong Park, Amanda Dobbyn, Brett L. Collins, Elodie Drapeau, Joseph Friedman, Vahram Haroutunian, Abraham Reichenberg, Joseph D. Buxbaum, Sophia Frangou, Panos Roussos & Alexander W. Charney
Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
Tan-Hoang Nguyen
University of Peshawar, Peshawar, Pakistan
Arsalan Hassan
Department of Psychiatry, Brigham and Women’s Hospital, Boston, MA, USA
Megan Shanahan & Katherine E. Burdick
Department of Psychiatry, Harvard Medical School, Boston, MA, USA
Katherine E. Burdick & Kerry Ressler
Centre for Computational and Systems Biology, Fondazione The Microsoft Research – University of Trento, Rovereto, Italy
Enrico Domenici
Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
Enrico Domenici
Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada
Sophia Frangou
Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria
Ina Giegling, Bettina Konte, Annette M. Hartmann & Dan Rujescu
Faculté de Médecine Sorbonne Université, Groupe de Recherche Clinique n°15—Troubles Psychiatriques et Développement, Department of Child and Adolescent Psychiatry, Hôpital Universitaire de la Pitié-Salpêtrière, Paris, France
David Cohen, Angèle Consoli, Marianna Giannitelli & Claudine Laurent-Levinson
Centre de Référence des Maladies Rares à Expression Psychiatrique, Department of Child and Adolescent Psychiatry, AP-HP Sorbonne Université, Hôpital Universitaire de la Pitié-Salpêtrière, Paris, France
David Cohen, Angèle Consoli, Marianna Giannitelli & Claudine Laurent-Levinson
Department of Neuroscience and Rare Diseases, Roche Pharma Research and Early Development, F. Hoffmann-La Roche, Basel, Switzerland
Dheeraj Malhotra
Department of Psychiatry and Behavioral Sciences, SUNY Downstate College of Medicine, New York, NY, USA
Penelope Georgakopoulos, Stella Kim Hansen, Carlos N. Pato & Michele T. Pato
Division of Depression and Anxiety Disorders, McLean Hospital, Belmont, MA, USA
Kerry Ressler
Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, New York, NY, USA
Panos Roussos
Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain
Celso Arango & Javier González-Peñas
Centro de Investigación Biomédica en Red de Salud Mental, Madrid, Spain
Jose L. Ayuso, Julio Bobes, Mar Fatjo, María D. Molto, Gerard Muntané, Ana G. Pinto, Belén Ramos, Elisabet Vilella, Celso Arango & Javier González-Peñas
Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy
Alessandro Bertolino, Giuseppe Blasi, Giulio Pergola & Antonio Rampino
Department of Theoretical and Applied Sciences, eCampus University, Novedrate, Italy
Luisella Bocchio-Chiavetto
Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
Luisella Bocchio-Chiavetto & Massimo Gennarelli
INSERM U1245, Rouen, France
Olivier Guillin & Dominique Campion
Centre Hospitalier du Rouvray, Rouen, France
Olivier Guillin & Dominique Campion
Neuroscience Research Australia, Sydney, New South Wales, Australia
Melissa J. Green, Peter R. Schofield, Cynthia Shannon Weickert, Vaughan Carr & Janice M. Fullerton
School of Psychiatry, University of New South Wales, Sydney, New South Wales, Australia
Melissa J. Green, Cynthia Shannon Weickert & Vaughan Carr
Department of Psychiatry, School of Clinical Sciences, Monash University, Melbourne, Victoria, Australia
Vaughan Carr
School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
Peter R. Schofield & Janice M. Fullerton
Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
Massimo Gennarelli
Department of Psychiatry, Stanford University, Stanford, CA, USA
Douglas F. Levinson
Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, Australia
Cheryl Filippich, Sathish Periyasamy, Heather J. Smith & Bryan Mowry
Queensland Centre for Mental Health Research, The University of Queensland, Brisbane, Queensland, Australia
Cheryl Filippich, Sathish Periyasamy, Heather J. Smith & Bryan Mowry
Department of Psychiatry, University of Pittsburgh School of Medicine, Western Psychiatric Hospital, Pittsburgh, PA, USA
Vishwajit L. Nimgaokar
Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
Vishwajit L. Nimgaokar
Institute of Neurosciences, Biomedical Research Centre, University of Granada, Granada, Spain
Blanca Gutierrez, Esther Molina, Jorge A. Cervilla & Margarita Rivera
Department of Psychiatry, San Cecilio University Hospital, University of Granada, Granada, Spain
Jorge A. Cervilla
Department of Biochemistry and Molecular Biology II, Faculty of Pharmacy, University of Granada, Granada, Spain
Margarita Rivera
Molecular Horizons, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, New South Wales, Australia
Sibylle G. Schwab
The University of Western Australia, Perth, Western Australia, Australia
Dieter B. Wildenauer
Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Mark Daly, Benjamin Neale & Tarjinder Singh
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Mark Daly, Benjamin Neale & Tarjinder Singh
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Mark Daly & Benjamin Neale
Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
Mark Daly
MRC Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
Michael C. O’Donovan, Michael J. Owen & James T. Walters
University College London, London, UK
Muhammad Ayub
Department of Psychiatry, Queen’s University, Kingston, Ontario, Canada
Muhammad Ayub
Department of Psychiatry, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
Anil K. Malhotra & Todd Lencz
Institute for Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA
Anil K. Malhotra & Todd Lencz
Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health, New York, NY, USA
Anil K. Malhotra & Todd Lencz
Departments of Genetics and Psychiatry, University of North Carolina, Chapel Hill, NC, USA
Patrick F. Sullivan
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Patrick F. Sullivan
Regeneron Pharmaceuticals, Tarrytown, NY, USA
Eli A. Stahl
St Andrew’s Toowoomba Hospital, Toowoomba, Queensland, Australia
Henry S. Aghanwa
Sir Cowasjee Jehangir Institute of Psychiatry, Hyderabad, Pakistan
Moin Ansari
King Edward Medical University, Lahore, Pakistan
Aftab Asif
Allama Iqbal Medical College, Lahore, Pakistan
Rubina Aslam
Department of Psychiatry, Universidad Autónoma de Madrid, Madrid, Spain
Jose L. Ayuso, Benedicto Crespo-Facorro & Lourdes Fañanás
Hospital Universitario de La Princesa, Instituto de Investigación Sanitaria, Madrid, Spain
Jose L. Ayuso
SUNY Downstate Health Sciences University, Brooklyn, NY, USA
Tim Bigdeli & James Knowles
Psychiatry Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
Stefano Bignotti
Faculty of Medicine and Health Sciences – Psychiatry, Universidad de Oviedo, Institute of Health Research of Principado de Asturias, Instituto de Neurociencias del Principado de Asturias, Oviedo, Spain
Julio Bobes
Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, GA, USA
Bekh Bradley, Adriana Lori, Vasiliki Michopolous & Abigail Powers
Virginia Commonwealth University, Richmond, VA, USA
Peter Buckley
School of Biomedical Sciences and Pharmacy, University of Newcastle, Newcastle, New South Wales, Australia
Murray J. Cairns, Rodney J. Scott & Paul A. Tooney
Hunter Medical Research Institute, Newcastle, New South Wales, Australia
Murray J. Cairns, Ulrich Schall & Paul A. Tooney
Centre for Brain and Mental Health Research, The University of Newcastle, Newcastle, New South Wales, Australia
Murray J. Cairns & Paul A. Tooney
Brain and Mind Centre, The University of Sydney, Sydney, New South Wales, Australia
Stanley V. Catts
School of Medicine, The University of Queensland, Brisbane, Queensland, Australia
Stanley V. Catts
New Millat Brain Center, Sahiwal, Pakistan
Abdul Rashid Chaudhry
Institut des Systèmes Intelligents et de Robotique, CNRS UMR7222, Sorbonne Université, Campus Pierre et Marie Curie, Faculté des Sciences et Ingénierie, Paris, France
David Cohen
Instituto de Investigación Sanitaria de Santiago de Compostela, Complexo Hospitalario Universitario de Santiago de Compostela, Servizo Galego de Saúde, Santiago de Compostela, Spain
Javier Costas & Jose Javier Suárez-Rama
Hospital Universitario Virgen del Rocío, Department of Psychiatry, Universidad de Sevilla, Sevilla, Spain
Benedicto Crespo-Facorro
Harvard Medical School, Boston, MA, USA
Nikolaos P. Daskalakis
McLean Hospital, Belmont, MA, USA
Nikolaos P. Daskalakis
Nicosia University School of Medicine, Nicosia, Cyprus
Michael Davidson
Icahn School of Medicine at Mount Sinai, New York, NY, USA
Kenneth L. Davis & Dolores Malaspina
Sheppard Pratt Hospital, Baltimore, MD, USA
Faith Dickerson
District Headquarter Hospital Faisalabad, Faisalabad, Pakistan
Imtiaz A. Dogar
Department of Evolutionary Biology, Ecology and Environmental Sciences, Faculty of Biology, University of Barcelona, Barcelona, Spain
Lourdes Fañanás
University of Arizona, Tucson, AZ, USA
Ayman Fanous
Veterans Affairs, New York, NY, USA
Ayman Fanous
University of Punjab, Lahore, Pakistan
Warda Fatima
FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain
Mar Fatjo
Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
Mar Fatjo
UFR Santé, Université de Rouen Normandie, Rouen, France
Olivier Guillin
Department of Psychiatry, Faculty of Medicine, University of Granada, Granada, Spain
Blanca Gutierrez
Drug Discovery Group, Cell and Molecular Biology Department, Cancer Programme, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
Herlina Y. Handoko
Professor Dr. Haroon Rashid Clinic, Lahore, Pakistan
Maryam Haroon
School of Medicine and Public Health, University of Newcastle, Newcastle, New South Wales, Australia
Frans A. Henskens & Brian J. Kelly
Lahore Institute of Research and Development, Lahore, Pakistan
Fahad Hussain & Bakht Khizar
Centre for Clinical Research in Neuropsychiatry, The University of Western Australia, Perth, Western Australia, Australia
Assen V. Jablensky & Muhammad Umar
Department of Psychiatry and Behavioural Sciences, Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan
Jamil Junejo
Al-Shamas Hospital, Sargodha, Pakistan
Shams-ud-Din A. Khan
Queen’s University, Kingston, Ontario, Canada
Muhammad N. S. Khan, Xudong Liu, Amy McNaughton & Amina Nasar
Nai Zindage Psychiatric Hospital, Multan, Pakistan
Anisuzzaman Khan & Calvin Sjaarda
Azad Jammu and Kashmir Medical College, Muzaffarabad, Pakistan
Hamid R. Khawaja
Department of Psychiatry, Cipto Mangunkusumo General Hospital, Universitas Indonesia, Jakarta, Indonesia
Agung A. A. A. Kusumawardhani
Nishtar Medical University, Multan, Pakistan
Naeemullah Leghari
School of Psychology, University of Newcastle, Newcastle, New South Wales, Australia
Carmel M. Loughland & Patricia T. Michie
Ar-Rahma Hospital, Multan, Pakistan
Khalid Mahmood
University of Health Sciences, Lahore, Pakistan
Saqib Mahmood
Ameena Clinic, Gujranwala, Pakistan
Danish Malik & Tanvir Nasr
Department of Nursing, Faculty of Health Sciences, University of Granada, Granada, Spain
Esther Molina
Department of Genetics, University of Valencia, Valencia, Spain
María D. Molto
Wali Neuropsychiatric Centre, Faisalabad, Pakistan
Asim Munir
Hospital Universitari Institut Pere Mata, IISPV, Universitat Rovira i Virgili, Reus, Spain
Gerard Muntané
Center for Addiction and Mental Health, Toronto, Ontario, Canada
Farooq Naeem & Elisabet Vilella
Department of Surgery, University of Michigan, Ann Arbor, MI, USA
Derek J. Nancarrow
Department of Psychological Medicine, University of Nigeria Teaching Hospital, Enugu, Nigeria
Jude U. Ohaeri
Laboratory of Statistical Genetics, The Rockefeller University, New York, NY, USA
Jurg Ott
Melbourne Neuropsychiatry Centre, University of Melbourne and Melbourne Health, Melbourne, Victoria, Australia
Christos Pantelis
The Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Melbourne, Australia
Christos Pantelis
NorthWestern Mental Health, Melbourne, Victoria, Australia
Christos Pantelis
Bioaraba Health Research Institute, OSI Araba, University Hospital, University of the Basque Country, Vitoria, Spain
Ana G. Pinto
Parc Sanitari Sant Joan de Déu, Barcelona, Spain
Belén Ramos
Punjab Institute of Mental Health, Lahore, Pakistan
Nusrat H. Rana
University of Utah, Salt Lake City, UT, USA
Mark Rapaport
James J. Peters VA Medical Center, New York, NY, USA
Abraham Reichenberg
Généthon, Paris, France
Safaa Saker-Delye
Priority Centre for Brain and Mental Health Research, The University of Newcastle, Mater Hospital, Newcastle, New South Wales, Australia
Ulrich Schall
Division of Molecular Medicine, NSW Health Pathology North, Newcastle, New South Wales, Australia
Rodney J. Scott
Department of Neuroscience, SUNY Upstate Medical University, Syracuse, NY, USA
Cynthia Shannon Weickert
Grupo de Medicina Xenómica, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Jose Javier Suárez-Rama
Shafique Psychiatric Clinic, Peshawar, Pakistan
Muhammad Tariq
Université de Paris, Faculté de Médecine, Hôpital Cochin-Tarnier, Paris, France
Florence Thibaut
INSERM U1266, Institut de Psychiatrie et de Neurosciences, Paris, France
Florence Thibaut
Sheba Medical Center, Ramat Gan, Israel
Mark Weiser
School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
Jin Qin Wu
Stanley Neurovirology Laboratory, Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, MD, USA
Robert Yolken

Authors

Dongjing Liu
Dara Meyer
Brian Fennessy
Claudia Feng
Esther Cheng
Jessica S. Johnson
You Jeong Park
Marysia-Kolbe Rieder
Steven Ascolillo
Agathe de Pins
Amanda Dobbyn
Dannielle Lebovitch
Emily Moya
Tan-Hoang Nguyen
Lillian Wilkins
Arsalan Hassan
Katherine E. Burdick
Joseph D. Buxbaum
Enrico Domenici
Sophia Frangou
Annette M. Hartmann
Claudine Laurent-Levinson
Dheeraj Malhotra
Carlos N. Pato
Michele T. Pato
Kerry Ressler
Panos Roussos
Dan Rujescu
Celso Arango
Alessandro Bertolino
Giuseppe Blasi
Luisella Bocchio-Chiavetto
Dominique Campion
Vaughan Carr
Janice M. Fullerton
Massimo Gennarelli
Javier González-Peñas
Douglas F. Levinson
Bryan Mowry
Vishwajit L. Nimgaokar
Giulio Pergola
Antonio Rampino
Jorge A. Cervilla
Margarita Rivera
Sibylle G. Schwab
Dieter B. Wildenauer
Mark Daly
Benjamin Neale
Tarjinder Singh
Michael C. O’Donovan
Michael J. Owen
James T. Walters
Muhammad Ayub
Anil K. Malhotra
Todd Lencz
Patrick F. Sullivan
Pamela Sklar
Eli A. Stahl
Laura M. Huckins
Alexander W. Charney

Consortia

Psychiatric Genomics Consortium Phase 3 Targeted Sequencing of Schizophrenia Study Team

Henry S. Aghanwa, Moin Ansari, Aftab Asif, Rubina Aslam, Jose L. Ayuso, Tim Bigdeli, Stefano Bignotti, Julio Bobes, Bekh Bradley, Peter Buckley, Murray J. Cairns, Stanley V. Catts, Abdul Rashid Chaudhry, David Cohen, Brett L. Collins, Angèle Consoli, Javier Costas, Benedicto Crespo-Facorro, Nikolaos P. Daskalakis, Michael Davidson, Kenneth L. Davis, Faith Dickerson, Imtiaz A. Dogar, Elodie Drapeau, Lourdes Fañanás, Ayman Fanous, Warda Fatima, Mar Fatjo, Cheryl Filippich, Joseph Friedman, John F. Fullard, Penelope Georgakopoulos, Marianna Giannitelli, Ina Giegling, Melissa J. Green, Olivier Guillin, Blanca Gutierrez, Herlina Y. Handoko, Stella Kim Hansen, Maryam Haroon, Vahram Haroutunian, Frans A. Henskens, Fahad Hussain, Assen V. Jablensky, Jamil Junejo, Brian J. Kelly, Shams-ud-Din A. Khan, Muhammad N. S. Khan, Anisuzzaman Khan, Hamid R. Khawaja, Bakht Khizar, Steven P. Kleopoulos, James Knowles, Bettina Konte, Agung A. A. A. Kusumawardhani, Naeemullah Leghari, Xudong Liu, Adriana Lori, Carmel M. Loughland, Khalid Mahmood, Saqib Mahmood, Dolores Malaspina, Danish Malik, Amy McNaughton, Patricia T. Michie, Vasiliki Michopolous, Esther Molina, María D. Molto, Asim Munir, Gerard Muntané, Farooq Naeem, Derek J. Nancarrow, Amina Nasar, Tanvir Nasr, Jude U. Ohaeri, Jurg Ott, Christos Pantelis, Sathish Periyasamy, Ana G. Pinto, Abigail Powers, Belén Ramos, Nusrat H. Rana, Mark Rapaport, Abraham Reichenberg, Safaa Saker-Delye, Ulrich Schall, Peter R. Schofield, Rodney J. Scott, Megan Shanahan, Cynthia Shannon Weickert, Calvin Sjaarda, Heather J. Smith, Jose Javier Suárez-Rama, Muhammad Tariq, Florence Thibaut, Paul A. Tooney, Muhammad Umar, Elisabet Vilella, Mark Weiser, Jin Qin Wu & Robert Yolken

Contributions

A.W.C., P.S., L.M.H., E.A.S., P.F.S. and M.C.O. conceived of and designed the study. J.D.B., K.E.B., E.D., S.F., A.M.H., C.L.-L., D.Malhotra, C.N.P., M.T.P., K.R., P.R., D.R., C.A., A.B., G.B., L.B.-C., D.C., V.C., J.M.F., M.G., J.G.-P., D.F.L., B.M., V.L.N., G.P., A.R., J.A.C., M.R., S.G.S., D.B.W., M.D., B.N., T.S., M.C.O., M.J.O., J.T.W., M.A., A.K.M., T.L., P.F.S., P.S. and PGC3SEQ Team contributed to the original recruitment of study participants and the collection of DNA samples. A.W.C., P.S., L.M.H., E.A.S., J.S.J. and Y.J.P. coordinated attainment of the samples from original collectors. A.W.C., E.A.S., J.S.J., Y.J.P., L.W., E.M., E.C., S.A., M.-K.R., D.Lebovitch and D.Meyer contributed to sample management, processing and sequencing. D.Liu, A.W.C., L.M.H., E.A.S., B.F., C.F., A.D., T.-H.N., A.d.P. and A.H. developed statistical pipelines and performed computational analyses. D.Liu, A.W.C. and L.M.H. wrote and/or edited the manuscript. All authors reviewed and approved the paper.

Corresponding authors

Correspondence to Dongjing Liu, Laura M. Huckins or Alexander W. Charney.

Ethics declarations

Competing interests

M.C.O., M.J.O. and J.T.W. are supported by a collaborative research grant from Takeda Pharmaceutical and Akrivia Health. A.K.M. is a consultant at Genomind and InformedDNA. D.M. is a full-time employee of F. Hoffmann-La Roche. M.D. is the Scientific Founder of Maze Therapeutics. C.A. has been a consultant to or has received honoraria or grants from Acadia, Angelini, Biogen, Boehringer, Gedeon Richter, Janssen-Cilag, Lundbeck, Medscape, Minerva, Otsuka, Pfizer, Roche, Sage, Servier, Shire, Schering-Plough, Sumitomo Dainippon Pharma, Sunovion and Takeda. D.R. served as a consultant for Janssen, received honoraria from Gerot-Lannacher, Janssen and Pharmagenetix, received travel support from Angelini and Janssen and served on the advisory boards of AC Immune, Roche and Rovi. E.A.S. is an employee of Regeneron. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Atsushi Takata, Santhosh Girirajan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Note, Figs. 1–11 and Tables 10 and 11.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–9.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, D., Meyer, D., Fennessy, B. et al. Schizophrenia risk conferred by rare protein-truncating variants is conserved across diverse human populations. Nat Genet 55, 369–376 (2023). https://doi.org/10.1038/s41588-023-01305-1

Received: 20 December 2021
Accepted: 23 January 2023
Published: 13 March 2023
Issue Date: March 2023
DOI: https://doi.org/10.1038/s41588-023-01305-1
