As technology advances, whole genome sequencing (WGS) is likely to supersede other genotyping technologies. The rate of this change depends on its relative cost and utility. Variants identified uniquely through WGS may reveal novel biological pathways underlying complex disorders and provide high-resolution insight into when, where, and in which cell type these pathways are affected. Alternatively, cheaper and less computationally intensive approaches may yield equivalent insights. Understanding the role of rare variants in the noncoding gene-regulating genome through pilot WGS projects will be critical to determining which of these two extremes best represents reality. With large cohorts, well-defined risk loci, and a compelling need to understand the underlying biology, psychiatric disorders have a role to play in this preliminary WGS assessment. The Whole Genome Sequencing for Psychiatric Disorders Consortium will integrate data for 18,000 individuals with psychiatric disorders, beginning with autism spectrum disorder, schizophrenia, bipolar disorder, and major depressive disorder, along with over 150,000 controls.
The authors acknowledge and thank the study participants and their families. The WGSPD is a public–private partnership between the NIMH, the Stanley Center for Psychiatric Research, and researchers at 11 academic institutions across the USA. This work was supported by grants from the NIMH, specifically U01 MH105653 (M.B.), U01 MH105641 (S.A.M.), U01 MH105573 (C.N.P.), U01 MH105670 (D.B.G.), U01 MH105575 (M.W.S., A.J.W.), U01 MH105669 (M.J.D., K.E.), U01 MH105575 (N.B.F., D.H.G., R.A.O.), U01 MH105666 (A.P.), U01 MH105630 (D.C.G.), U01 MH105632 (J.B.), U01 MH105634 (R.E.G.), U01 MH100239-03S1 (M.W.S., S.J.S., A.J.W.), and R01 MH095454 (N.B.F.); by grants from the Simons Foundation, SFARI #385110 (M.W.S., S.J.S., A.J.W., D.B.G.) and SFARI #401457 (D.H.G.); and by a gift from the Stanley Foundation (S.E.H.).
Integrated supplementary information
Supplementary Figure 1 Statistical power in the noncoding genome by cohort size, related to Figure 1 in the main manuscript.
We estimated the power at a significance threshold (alpha) of 5 × 10⁻⁵, accounting for 1,000 categories of noncoding variants, to detect an excess of noncoding variants at 122,500 risk loci in cases vs. controls as we varied the sample size and the risk:non-risk ratio, which represents annotation quality (Supplementary Tables 1 and 3). In (a) we assessed the power to detect an excess of de novo mutations at a relative risk of 5 as sample size increases. With a risk:non-risk ratio of 1:20, approximately equivalent to assessing protein-truncating variants in the coding genome, we achieve >80% power with a sample size of 5,000. In (b) we assessed the power to detect an excess burden of rare variants (allele frequency ≤0.1%) at a relative risk of 1.2. In (c) we assessed the power to identify an excess of de novo mutations at a specific genomic locus, e.g. the noncoding region regulating a single gene. Consequently, we set the significance threshold (alpha) at 2.5 × 10⁻⁶ to account for 20,000 genes. In (d) we assessed the power to identify an excess of rare variants (allele frequency ≤0.1%) at a specific nucleotide (alpha = 1.7 × 10⁻¹¹), since this yielded better power than testing for burden at a locus (alpha = 2.5 × 10⁻⁶).
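The significance thresholds quoted in this legend are consistent with a Bonferroni correction of a family-wise alpha of 0.05, and the arithmetic can be checked with a short sketch. The family-wise alpha of 0.05 and the figure of roughly 3 billion nucleotides are assumptions made here for illustration; the legend itself states only the 1,000 categories and the 20,000 genes. The function name `bonferroni_alpha` is likewise hypothetical.

```python
# Minimal sketch: Bonferroni-corrected thresholds for the tests in the legend.
# ASSUMPTION: a family-wise alpha of 0.05 (not stated in the legend).
FAMILY_WISE_ALPHA = 0.05


def bonferroni_alpha(n_tests, family_wise_alpha=FAMILY_WISE_ALPHA):
    """Return the per-test threshold for n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive number")
    return family_wise_alpha / n_tests


# Number of tests per analysis. The first two counts come from the legend;
# the nucleotide count is an ASSUMED approximate genome size of 3 billion.
tests = {
    "noncoding categories": 1_000,
    "genes": 20_000,
    "nucleotides": 3_000_000_000,
}

thresholds = {name: bonferroni_alpha(n) for name, n in tests.items()}

for name, alpha in thresholds.items():
    print(f"{name}: alpha = {alpha:.1e}")
```

Under these assumptions the printed thresholds round to 5.0e-05, 2.5e-06, and 1.7e-11, matching the values given for the category-wide, per-gene, and per-nucleotide tests.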