Abstract
Insomnia is a heritable, highly prevalent sleep disorder for which no sufficient treatment currently exists. Previous genome-wide association studies with up to 1.3 million subjects identified over 200 associated loci. This extreme polygenicity suggested that many more loci remain to be discovered. The current study almost doubled the sample size to 593,724 cases and 1,771,286 controls, thereby increasing statistical power, and identified 554 risk loci (including 364 novel loci). To capitalize on this large number of loci, we propose a novel strategy to prioritize genes using external biological resources and functional interactions between genes across risk loci. Of all 3,898 genes naively implicated from the risk loci, we prioritize 289 and find brain-tissue expression specificity and enrichment in specific gene sets of synaptic signaling functions and neuronal differentiation. We show that this novel gene prioritization strategy yields specific hypotheses on underlying mechanisms of insomnia that would have been missed by traditional approaches.
Data availability
The full GWAS summary statistics for UKB and the top 10,000 SNPs for 23andMe are available at https://ctg.cncr.nl/software/summary_statistics/. The full GWAS summary statistics for the 23andMe dataset will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of 23andMe participants. Please visit https://research.23andme.com/collaborate/#publication for more information and to apply to access the data. The following publicly available datasets were used in this manuscript: GTEx v.8 (https://gtexportal.org/home/datasets), Allen Human Brain Atlas (http://human.brain-map.org/static/download), scRNA-seq from Linnarsson’s laboratory (http://linnarssonlab.org/data/; GSE60361, GSE74672, GSE75330, GSE76381, GSE97478), DropViz (http://dropviz.org/), MSigDB v.6.2 (http://software.broadinstitute.org/gsea/msigdb/index.jsp), InWeb protein–protein interaction (https://inbio-discover.com/download), eQTLGen (https://www.eqtlgen.org/) and PsychEncode (http://resource.psychencode.org/).
Code availability
The R script used to perform the gene prioritization approach proposed in this manuscript is available at https://doi.org/10.5281/zenodo.6598552 (ref. 77). The following software and packages were used for data analysis: PLINK 2.0 (https://www.cog-genomics.org/plink/2.0/), METAL (http://csg.sph.umich.edu/abecasis/Metal/download/), MAGMA v.1.07 (https://ctg.cncr.nl/software/magma), FUMA (https://fuma.ctglab.nl/), LDscore (https://github.com/bulik/ldsc), LDstore v.1.1 (http://www.christianbenner.com/#), FINEMAP v.1.3.1 (http://www.christianbenner.com/#), PRSice v.2.2.1 (https://www.prsice.info/), Eagle2 (https://alkesgroup.broadinstitute.org/Eagle/downloads/), Minimac3 (https://genome.sph.umich.edu/wiki/Minimac3), REGENIE v.2.0.1 (https://rgcgithub.github.io/regenie/), MiXeR (https://github.com/precimed/mixer), BUHMBOX (https://software.broadinstitute.org/mpg/buhmbox/) and R v.3.6.0 (https://www.r-project.org/) with packages data.table v.1.12.2, GenomicRegion v.1.36.0, stats v.3.6.3, fpc v.2.2-3, coloc v.3.2-1, Rtsne v.0.15 and ggplot2 v.3.2.0.
References
Roth, T. Insomnia: definition, prevalence, etiology, and consequences. J. Clin. Sleep Med. 3, S7–S10 (2007).
Kripke, D. F., Garfinkel, L., Wingard, D. L., Klauber, M. R. & Marler, M. R. Mortality associated with sleep duration and insomnia. Arch. Gen. Psychiatry 59, 131–136 (2002).
Daley, M., Morin, C. M., Leblanc, M., Grégoire, J. & Savard, J. The economic burden of insomnia: direct and indirect costs for individuals with insomnia. Sleep 32, 55–64 (2009).
Lind, M. J., Aggen, S. H., Kirkpatrick, R. M., Kendler, K. S. & Amstadter, A. B. A longitudinal twin study of insomnia symptoms in adults. Sleep 38, 1423–1430 (2015).
Jansen, P. R. et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways. Nat. Genet. 51, 394–403 (2019).
Lane, J. M. et al. Genome-wide association analyses of sleep disturbance traits identify new loci and highlight shared genetics with neuropsychiatric and metabolic traits. Nat. Genet. 49, 274–281 (2017).
Lane, J. M. et al. Biological and clinical insights from genetics of insomnia symptoms. Nat. Genet. 51, 387–393 (2019).
Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).
Zaitlen, N., Paşaniuc, B., Gur, T., Ziv, E. & Halperin, E. Leveraging genetic variability across populations for the identification of causal variants. Am. J. Hum. Genet. 86, 23–33 (2010).
Findlay, G. M. et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).
Hammerschlag, A. R. et al. Genome-wide association analysis of insomnia complaints identifies risk genes and genetic overlap with psychiatric and metabolic traits. Nat. Genet. 49, 1584–1592 (2017).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Schormair, B. et al. Identification of novel risk loci for restless legs syndrome in genome-wide association studies in individuals of European ancestry: a meta-analysis. Lancet Neurol. 16, 898–907 (2017).
Tsai, F. J. et al. A genome-wide association study identifies susceptibility variants for type 2 diabetes in Han Chinese. PLoS Genet. 6, e1000847 (2010).
Schunkert, H. et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat. Genet. 43, 333–340 (2011).
Koido, K. et al. Associations between LSAMP gene polymorphisms and major depressive disorder and panic disorder. Transl. Psychiatry 2, e152 (2012).
Must, A. et al. Association of limbic system-associated membrane protein (LSAMP) to male completed suicide. BMC Med. Genet. 9, 34 (2008).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Kichaev, G. et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 10, 2–3 (2014).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).
Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).
Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
Li, T. et al. A scored human protein-protein interaction network to catalyze genomic interpretation. Nat. Methods 14, 61–64 (2016).
Sinnott-Armstrong, N., Naqvi, S., Rivas, M. & Pritchard, J. K. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. eLife 10, e58615 (2021).
Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
Wray, N. R. et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet. 50, 668–681 (2018).
Savage, J. E. et al. GWAS meta-analysis (N=279,930) identifies new genes and functional links to intelligence. Nat. Genet. 50, 912–919 (2018).
Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121 (2018).
Singh, K. et al. Neuronal growth and behavioral alterations in mice deficient for the psychiatric disease-associated negr1 gene. Front. Mol. Neurosci. 11, 30 (2018).
Singh, K. et al. Neural cell adhesion molecule Negr1 deficiency in mouse results in structural brain endophenotypes and behavioral deviations related to psychiatric disorders. Sci. Rep. 9, 5457 (2019).
Koike, N. et al. Transcriptional architecture and chromatin landscape of the core circadian clock in mammals. Science 338, 349–354 (2012).
Bonnet, M. H. & Arand, D. L. Hyperarousal and insomnia: state of the science. Sleep Med. Rev. 14, 9–15 (2010).
Hikosaka, O. The habenula: from stress evasion to value-based decision-making. Nat. Rev. Neurosci. 11, 503–513 (2010).
Benarroch, E. E. Habenula: recently recognized functions and potential clinical relevance. Neurology 58, 992–1000 (2015).
Zhao, H. & Rusak, B. Circadian firing-rate rhythms and light responses of rat habenular nucleus neurons in vivo and in vitro. Neuroscience 132, 519–528 (2005).
Haun, F., Eckenrode, T. C. & Murray, M. Habenula and thalamus cell transplants restore normal sleep behaviors disrupted by denervation of the interpeduncular nucleus. J. Neurosci. 12, 3282–3290 (1992).
Bianco, I. H. & Wilson, S. W. The habenular nuclei: a conserved asymmetric relay station in the vertebrate brain. Philos. Trans. R. Soc. B Biol. Sci. 364, 1005–1020 (2009).
Chrobok, L. et al. Intrinsic circadian timekeeping properties of the thalamic lateral geniculate nucleus. J. Neurosci. Res. 99, 3306–3324 (2021).
Harrington, M. E. The ventral lateral geniculate nucleus and the intergeniculate leaflet: interrelated structures in the visual and circadian systems. Neurosci. Biobehav. Rev. 21, 705–727 (1997).
Johnson, R. F., Moore, R. Y. & Morin, L. P. Lateral geniculate lesions alter circadian activity rhythms in the hamster. Brain Res. Bull. 22, 411–422 (1989).
Moore, R. Y. & Speh, J. C. GABA is the principal neurotransmitter of the circadian system. Neurosci. Lett. 150, 112–116 (1993).
Melzer, S. & Monyer, H. Diversity and function of corticopetal and corticofugal GABAergic projection neurons. Nat. Rev. Neurosci. 21, 499–515 (2020).
España, R. A. & Scammell, T. E. Sleep neurobiology from a clinical perspective. Sleep 34, 845–858 (2011).
Gottesmann, C. GABA mechanisms and sleep. Neuroscience 111, 231–239 (2002).
Kostin, A., Alam, M. A., McGinty, D. & Alam, M. N. Adult hypothalamic neurogenesis and sleep-wake dysfunction in aging. Sleep 44, zsaa173 (2021).
Levenson, J. C., Kay, D. B. & Buysse, D. J. The pathophysiology of insomnia. Chest 147, 1179–1192 (2015).
Spiegelhalder, K. et al. Neuroimaging insights into insomnia. Curr. Neurol. Neurosci. Rep. 15, 9 (2015).
Kay, D. B. & Buysse, D. J. Hyperarousal and beyond: new insights to the pathophysiology of insomnia disorder through functional neuroimaging studies. Brain Sci. 7, brainsci7030023 (2017).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Abraham, G., Qiu, Y. & Inouye, M. FlashPCA2: principal component analysis of Biobank-scale genotype datasets. Bioinformatics 33, 2776–2778 (2017).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Euesden, J., Lewis, C. M. & O’Reilly, P. F. PRSice: Polygenic Risk Score software. Bioinformatics 31, 1466–1468 (2015).
Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).
Zeisel, A. et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 348, 1138–1142 (2015).
Romanov, R. A. et al. Molecular interrogation of hypothalamic organization reveals distinct dopamine neuronal subtypes. Nat. Neurosci. 20, 176–188 (2017).
Marques, S. et al. Oligodendrocyte heterogeneity in the mouse juvenile and adult central nervous system. Science 352, 1326–1329 (2016).
La Manno, G. et al. Molecular diversity of midbrain development in mouse, human, and stem cells. Cell 167, 566–580 (2016).
Muñoz-Manchado, A. B. et al. Diversity of interneurons in the dorsal striatum revealed by single-cell RNA sequencing and PatchSeq. Cell Rep. 24, 2179–2190 (2018).
Saunders, A. et al. Molecular diversity and specializations among the cells of the adult mouse brain. Cell 174, 1015–1030 (2018).
Watanabe, K., Umićević Mirkov, M., de Leeuw, C. A., van den Heuvel, M. P. & Posthuma, D. Genetic mapping of cell type specificity for complex traits. Nat. Commun. 10, 3222 (2019).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Benner, C. et al. Prospects of fine-mapping trait-associated genomic regions by using summary statistics from genome-wide association studies. Am. J. Hum. Genet. 101, 539–551 (2017).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
Boyle, A. P. et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797 (2012).
Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Schmitt, A. D. et al. A compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell Rep. 17, 2042–2059 (2016).
van der Maaten, L. & Hinton, G. Visualizing high dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Watanabe, K. Gene prioritization using multi-loci information for insomnia meta analysis. https://doi.org/10.5281/zenodo.6598552
Acknowledgements
We thank both UKB and 23andMe participants who consented to participate in research, and researchers who collected and contributed the data. D.P. was funded by The Netherlands Organization for Scientific Research (no. NWO VICI 453-14-005), NWO Gravitation: BRAINSCAPES: A Roadmap from Neurogenetics to Neurobiology (grant no. 024.004.012) and a European Research Council advanced grant (no. ERC-2018-AdG GWAS2FUNC 834057). E.J.W.V.S. was funded by the European Research Council (no. ERC-ADG-2014-671084 INSOMNIA) and P.R.J. was funded by the Netherlands Organization for Scientific Research (no. ZonMW VENI-09150162010138). The research was conducted using the UK Biobank Resource (application no. 16406). Analyses were carried out on the Genetic Cluster Computer hosted by the Dutch National Computing and Networking Services, SurfSARA. We additionally thank the GTEx Portal for providing RNA-seq data. The research was based in part on data from the Million Veteran Program – Office of Research and Development, Veterans Health Administration, supported by award nos. CSP 575B and Merit 1I01CX001849.
Author information
Contributions
D.P. conceived the study. K.W. performed analyses. J.F.S. performed quality control on the UKB data and wrote the analysis pipeline. P.N., D.A.H., X.W. and the 23andMe Research Team contributed and analyzed the 23andMe cohort data. J.G., D.F.L., R.P. and M.B.S. performed PGS analysis for the MVP cohort. E.J.W.V.S. and A.B.S. provided valuable discussions. K.W., P.R.J. and D.P. wrote the paper.
Ethics declarations
Competing interests
P.N., X.W., D.A.H. and members of the 23andMe research team are employees of 23andMe, Inc. and hold stock or stock options in 23andMe, Inc. K.W. is a current employee of Regeneron Pharmaceuticals and holds stock and stock options in Regeneron Pharmaceuticals. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Phenotypic variance explained by polygenic risk scoring.
Bars are colored by P-value threshold of SNPs used to compute the polygenic risk score.
Extended Data Fig. 2 Genetic overlap between insomnia and 350 traits.
Significant genetic correlations of insomnia with 350 traits after Bonferroni correction (p < 9.07e-5). P-values were based on a two-sided Z-test. Each data point represents a trait and is colored by its domain category.
Extended Data Fig. 3 Distribution of PRS based on metabolic and psychiatric loci.
A single star represents nominal significance (p<0.05) and a double star represents significance after Bonferroni correction (p<0.05/9) in a two-sided Mann-Whitney U test (see Supplementary Table 21 for full results). The boxes indicate the 25% (Q1) and 75% (Q3) quantiles and horizontal black lines indicate the median. The minimum and maximum of the whiskers are Q1-1.5*IQR and Q3+1.5*IQR, where IQR is Q3-Q1. Data points that do not fall within the whisker interval are displayed as dots. Numbers of data points (individuals) are: for column 1 (based on metabolic loci), 300 top and 299 bottom 1%, 1495 top and bottom 5%, 2986 top and 2984 bottom 10% for overall health rating; 297 top and bottom 1%, 1475 top and 1471 bottom 5%, 2950 top and 2948 bottom 10% for body fat percentage; 281 top and 283 bottom 1%, 1405 top and 1402 bottom 5%, 2812 top and 2815 bottom 10% for depressive symptoms; for column 2 (based on psychiatric loci), 299 top and 298 bottom 1%, 1490 top and 1493 bottom 5%, 2986 top and 2990 bottom 10% for overall health rating; 297 top and 294 bottom 1%, 1470 top and 1477 bottom 5%, 2941 top and 2959 bottom 10% for body fat percentage; 287 top and 285 bottom 1%, 1403 top and 1418 bottom 5%, 2809 top and 2844 bottom 10% for depressive symptoms.
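The whisker rule in this caption can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function names `whisker_bounds` and `outliers` are hypothetical, the toy values are invented for illustration, and the quantile interpolation method is an assumption, since the caption does not state which one was used.

```python
from statistics import quantiles


def whisker_bounds(values):
    """Return (Q1 - 1.5*IQR, Q3 + 1.5*IQR) for a list of numbers.

    Assumption: quartiles use the "inclusive" interpolation method;
    the caption does not say which quantile definition was applied.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range, Q3 - Q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Values outside the whisker interval, i.e. those drawn as dots."""
    low, high = whisker_bounds(values)
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # invented illustration data
    print(whisker_bounds(toy))
    print(outliers(toy))
```

Under these assumptions, only values beyond the two bounds would appear as separate dots in a box plot of this kind.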
Extended Data Fig. 4 Additional conditional analyses for MAGMA tissue and brain region association analyses.
P-values were computed by MAGMA gene analysis based on a one-sided t-test for the regression coefficient of the gene expression. (a) P-values of brain regions from GTEx, with (Conditional) and without (Marginal) conditioning on the average expression across 13 brain regions. (b) Comparison of AHBA (low resolution) and DropViz datasets with MAGMA gene-property analysis. P-values (top) and standardized effect size (Beta, bottom) of brain regions from the AHBA low dataset and cell types from the DropViz dataset. The leftmost bar indicates the marginal association statistics for each item. The middle bar indicates the association statistics based only on genes present in both datasets (~11,000 genes). The rightmost bar indicates the association statistics based only on genes that are not available in the other dataset (~2,000 for AHBA low and ~4,000 for DropViz). The horizontal dashed line indicates the Bonferroni-corrected threshold for statistical significance (p=0.05/5974).
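The Bonferroni thresholds quoted in these captions (0.05/9 and 0.05/5974) are a significance level divided by a number of tests. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and the sketch does not reproduce any part of the MAGMA analysis itself.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True when a p-value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # Test counts taken from the captions: 9 and 5974.
    for n in (9, 5974):
        print(n, bonferroni_threshold(0.05, n))
```

With 5,974 tests the threshold is roughly 8.4e-6, so a p-value of 1e-5 would not pass while 1e-6 would.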
Extended Data Fig. 5 MAGMA gene-property and gene-set analyses conditioning on sets of genes from insomnia risk loci.
The top 5 (most significantly associated) brain regions/cell types/gene-sets (referred to as gene-sets hereafter) were selected for each dataset, except for DropViz, where 4 independently associated cell types were selected. For each gene-set, MAGMA was performed while conditioning on 3 sets of genes: high-confidence prioritized (HCP), unsolved and excluded genes.
Extended Data Fig. 6 Heatmap of the overlap of genes across significantly enriched gene-sets.
The displayed 18 gene-sets showed significant enrichment with 289 HCP genes. The heatmap is asymmetric. A cell of row i and column j represents the proportion of the prioritized genes in the gene-set i relative to the number of prioritized genes in the gene-set j.
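The asymmetry of such a heatmap can be seen in a small Python sketch. It rests on one reading of the caption, which is an assumption: cell (i, j) is the number of prioritized genes shared by gene-sets i and j, divided by the number of prioritized genes in gene-set j. The function name `overlap_matrix`, the gene-set labels and the gene identifiers are all hypothetical.

```python
def overlap_matrix(gene_sets, prioritized):
    """Build an asymmetric overlap table keyed by (row set, column set).

    Assumed definition: cell (i, j) = |P_i & P_j| / |P_j|, where P_x is the
    subset of gene-set x that is also in the prioritized gene list.
    """
    names = list(gene_sets)
    hits = {name: set(gene_sets[name]) & set(prioritized) for name in names}
    matrix = {}
    for i in names:
        for j in names:
            denominator = len(hits[j])
            shared = len(hits[i] & hits[j])
            # NaN marks columns whose gene-set has no prioritized genes.
            matrix[(i, j)] = shared / denominator if denominator else float("nan")
    return matrix


if __name__ == "__main__":
    # Invented toy data, not genes from the study.
    toy_sets = {"A": {"G1", "G2", "G3"}, "B": {"G2", "G3", "G4", "G5", "G9"}}
    toy_prioritized = {"G1", "G2", "G3", "G4", "G5"}
    for cell, value in overlap_matrix(toy_sets, toy_prioritized).items():
        print(cell, value)
```

Because the denominator depends on the column gene-set, cell (A, B) and cell (B, A) generally differ, which is why the heatmap is not symmetric.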
Supplementary information
Supplementary Information
Supplementary note and Figs. 1–8.
Supplementary Data 1
Locus Zoom plots for 554 loci.
Supplementary Tables
Supplementary Tables 1–49.
About this article
Cite this article
Watanabe, K., Jansen, P.R., Savage, J.E. et al. Genome-wide meta-analysis of insomnia prioritizes genes associated with metabolic and psychiatric pathways. Nat Genet 54, 1125–1132 (2022). https://doi.org/10.1038/s41588-022-01124-w
DOI: https://doi.org/10.1038/s41588-022-01124-w