  • Resource
Integrating whole-genome sequencing with multi-omic data reveals the impact of structural variants on gene regulation in the human brain

Abstract

Structural variants (SVs), which are genomic rearrangements of more than 50 base pairs, are an important source of genetic diversity and have been linked to many diseases. However, it remains unclear how they modulate human brain function and disease risk. Here we report 170,996 SVs discovered using 1,760 short-read whole genomes from aged adults and individuals with Alzheimer’s disease. By applying quantitative trait locus (SV-xQTL) analyses, we quantified the impact of cis-acting SVs on histone modifications, gene expression, splicing and protein abundance in postmortem brain tissues. More than 3,200 SVs were associated with at least one molecular phenotype. We found reproducibility of 65–99% of SV-eQTLs across cohorts and brain regions. SV associations with mRNA and proteins shared the same direction of effect in more than 87% of SV–gene pairs. Mediation analysis showed that ~8% of SV-eQTLs were mediated by histone acetylation and ~11% by splicing. Additionally, associations of SVs with progressive supranuclear palsy identified previously known and novel SVs.

Fig. 1: Study overview.
Fig. 2: Summary of SV calls across cohorts.
Fig. 3: Properties of SV-eQTLs.
Fig. 4: Impact of SVs on the gene regulatory cascade.
Fig. 5: Mediation of SV-xQTL.
Fig. 6: Impact of rare SVs on gene expression outliers.
Fig. 7: SVs associated with PSP and their effects on molecular phenotypes.

Data availability

Data supporting the findings of this study are available via the AD Knowledge Portal (https://adknowledgeportal.org). The AD Knowledge Portal is a platform for accessing data, analyses and tools generated by the AMP-AD Target Discovery Program and other National Institute on Aging (NIA)-supported programs to enable open-science practices and accelerate translational learning. The data, analyses and tools are shared early in the research cycle without a publication embargo on secondary use. Data are available for general research use according to the following requirements for data access and data attribution (https://adknowledgeportal.org/DataAccess/Instructions). For access to content described in this manuscript, including raw PacBio long-read sequencing data, individual-level SV calls and SV-xQTL summary statistics, see https://doi.org/10.7303/syn26952206. Individual-level genotyping and SV-xQTL summary statistics data are also being made available through NIAGADS (accession number NG00118). All SV site frequency data from 1,760 donors discovered separately in each cohort, complete nominal and permuted SV-xQTL summary statistics and disease status association summary statistics are publicly available on GitHub (https://github.com/RajLabMSSM/AMP_AD_StructuralVariation). The raw WGS data used for SV discovery are available for each cohort: ROS/MAP (ref. 26; syn10901595), MSBB (ref. 29; syn10901600) and Mayo Clinic (ref. 28; syn10901601). ROS/MAP H3K9ac ChIP-seq data are available at syn4896408, and TMT proteomics data are available at syn17015098. RNA-seq reprocessed data from all cohorts were obtained from the RNA-seq harmonization study (ref. 89; syn9702085). Splicing junction proportions were obtained from Raj et al. (ref. 86), and the corresponding sQTL visualization browser (Shiny app) is available at https://rajlab.shinyapps.io/sQTLviz_ROSMAP/. ROS/MAP data can also be requested at https://www.radc.rush.edu.

Code availability

All code used in this study has been provided in a single repository on GitHub (https://github.com/RajLabMSSM/AMP_AD_StructuralVariation).

References

  1. Sudmant, P. H. et al. An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (2015).

  2. Abel, H. J. et al. Mapping and characterization of structural variation in 17,795 human genomes. Nature 583, 83–89 (2020).

  3. Collins, R. L. et al. A structural variation reference for medical and population genetics. Nature 581, 444–451 (2020).

  4. Feuk, L., Carson, A. R. & Scherer, S. W. Structural variation in the human genome. Nat. Rev. Genet. 7, 85–97 (2006).

  5. Sharp, A. J., Cheng, Z. & Eichler, E. E. Structural variation of the human genome. Annu. Rev. Genomics Hum. Genet. 7, 407–442 (2006).

  6. Conrad, D. F. et al. Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712 (2010).

  7. Ebert, P. et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117 (2021).

  8. Byrska-Bishop, M. et al. High coverage whole genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios. Preprint at https://www.biorxiv.org/content/10.1101/2021.02.06.430068v1 (2021).

  9. Chaisson, M. J. P. et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 10, 1784 (2019).

  10. McCarthy, S. E. et al. Microduplications of 16p11.2 are associated with schizophrenia. Nat. Genet. 41, 1223–1227 (2009).

  11. Sekar, A. et al. Schizophrenia risk from complex variation of complement component 4. Nature 530, 177–183 (2016).

  12. Marshall, C. R. et al. Contribution of copy number variants to schizophrenia from a genome-wide study of 41,321 subjects. Nat. Genet. 49, 27–35 (2017).

  13. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368–372 (2010).

  14. Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 445–449 (2007).

  15. Mitra, I. et al. Patterns of de novo tandem repeat mutations and their role in autism. Nature 589, 246–250 (2021).

  16. Männik, K. et al. Copy number variations and cognitive phenotypes in unselected populations. JAMA 313, 2044–2054 (2015).

  17. Stefansson, H. et al. CNVs conferring risk of autism or schizophrenia affect cognition in controls. Nature 505, 361–366 (2014).

  18. Battle, A. et al. Impact of regulatory variation from RNA to protein. Science 347, 664–667 (2015).

  19. Li, Y. I. et al. RNA splicing is a primary link between genetic variation and disease. Science 352, 600–604 (2016).

  20. Ng, B. et al. An xQTL map integrates the genetic architecture of the human brain’s transcriptome and epigenome. Nat. Neurosci. 20, 1418–1426 (2017).

  21. Chiang, C. et al. The impact of structural variation on human gene expression. Nat. Genet. 49, 692–699 (2017).

  22. Scott, A. J., Chiang, C. & Hall, I. M. Structural variants are a major source of gene expression differences in humans and often affect multiple nearby genes. Genome Res. 31, 2249–2257 (2021).

  23. Ramsköld, D., Wang, E. T., Burge, C. B. & Sandberg, R. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data. PLoS Comput. Biol. 5, e1000598 (2009).

  24. Polymenidou, M. et al. Long pre-mRNA depletion and RNA missplicing contribute to neuronal vulnerability from loss of TDP-43. Nat. Neurosci. 14, 459–468 (2011).

  25. Sonawane, A. R. et al. Understanding tissue-specific gene regulation. Cell Rep. 21, 1077–1088 (2017).

  26. De Jager, P. L. et al. A multi-omic atlas of the human frontal cortex for aging and Alzheimer’s disease research. Sci. Data 5, 180142 (2018).

  27. Bennett, D. A. et al. Religious Orders Study and Rush Memory and Aging Project. J. Alzheimers Dis. 64, S161–S189 (2018).

  28. Allen, M. et al. Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci. Data 3, 160089 (2016).

  29. Wang, M. et al. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci. Data 5, 180185 (2018).

  30. Hodes, R. J. & Buckholtz, N. Accelerating medicines partnership: Alzheimer’s disease (AMP-AD) knowledge portal aids Alzheimer’s drug discovery through open data sharing. Expert Opin. Ther. Targets 20, 389–391 (2016).

  31. Lappalainen, I. et al. DbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 41, D936–D941 (2013).

  32. MacDonald, J. R., Ziman, R., Yuen, R. K. C., Feuk, L. & Scherer, S. W. The Database of Genomic Variants: a curated collection of structural variation in the human genome. Nucleic Acids Res. 42, D986–D992 (2014).

  33. Firth, H. V. & Wright, C. F., DDD Study. The Deciphering Developmental Disorders (DDD) study. Dev. Med. Child Neurol. 53, 702–703 (2011).

  34. Han, L. et al. Functional annotation of rare structural variation in the human brain. Nat. Commun. 11, 2990 (2020).

  35. Jakubosky, D. et al. Properties of structural variants and short tandem repeats associated with gene expression and complex traits. Nat. Commun. 11, 2927 (2020).

  36. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).

  37. Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).

  38. Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).

  39. Shi, Y. et al. Common variants on 8p12 and 1q24.2 confer risk of schizophrenia. Nat. Genet. 43, 1224–1227 (2011).

  40. Kondrashov, F. A. & Koonin, E. V. Origin of alternative splicing by tandem exon duplication. Hum. Mol. Genet. 10, 2661–2669 (2001).

  41. Sieberts, S. K. et al. Large eQTL meta-analysis reveals differing patterns between cerebral cortical and cerebellar brain regions. Sci. Data 7, 340 (2020).

  42. Lev-Maor, G. et al. Intronic Alus influence alternative splicing. PLoS Genet. 4, e1000204 (2008).

  43. Ade, C., Roy-Engel, A. M. & Deininger, P. L. Alu elements: an intrinsic source of human genome instability. Curr. Opin. Virol. 3, 639–645 (2013).

  44. Kim, D. S. & Hahn, Y. Identification of human-specific transcript variants induced by DNA insertions in the human genome. Bioinformatics 27, 14–21 (2011).

  45. Hancks, D. C., Ewing, A. D., Chen, J. E., Tokunaga, K. & Kazazian, H. H. Jr. Exon-trapping mediated by the human retrotransposon SVA. Genome Res. 19, 1983–1991 (2009).

  46. Crouse, W. L., Keele, G. R., Gastonguay, M. S., Churchill, G. A. & Valdar, W. A Bayesian model selection approach to mediation analysis. Preprint at https://www.biorxiv.org/content/10.1101/2021.07.19.452969v2.full (2021).

  47. Robins, C. et al. Genetic control of the human brain proteome. Am. J. Hum. Genet. 108, 400–410 (2021).

  48. Ferraro, N. M. et al. Transcriptomic signatures across human tissues identify functional rare genetic variation. Science 369, eaaz5900 (2020).

  49. Li, X. et al. The impact of rare variation on gene expression across tissues. Nature 550, 239–243 (2017).

  50. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).

  51. Nalls, M. A. et al. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson’s disease. Nat. Genet. 46, 989–993 (2014).

  52. Höglinger, G. U. et al. Identification of common variants influencing risk of the tauopathy progressive supranuclear palsy. Nat. Genet. 43, 699–705 (2011).

  53. Chen, J. A. et al. Joint genome-wide association study of progressive supranuclear palsy identifies novel susceptibility loci and genetic correlation to neurodegenerative diseases. Mol. Neurodegener. 13, 41 (2018).

  54. Corces, M. R. et al. Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer’s and Parkinson’s diseases. Nat. Genet. 52, 1158–1168 (2020).

  55. Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).

  56. Han, L. et al. Functional annotation of rare structural variation in the human brain. Nat. Commun. 11, 2990 (2020).

  57. Vogel, C. & Marcotte, E. M. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 13, 227–232 (2012).

  58. Jacques, P.-É., Jeyakani, J. & Bourque, G. The majority of primate-specific regulatory sequences are derived from transposable elements. PLoS Genet. 9, e1003504 (2013).

  59. Kellner, M. & Makałowski, W. Transposable elements significantly contributed to the core promoters in the human genome. Sci. China Life Sci. 62, 489–497 (2019).

  60. Bennett, E. A., Coleman, L. E., Tsui, C., Pittard, W. S. & Devine, S. E. Natural genetic variation caused by transposable elements in humans. Genetics 168, 933–951 (2004).

  61. Kwon, Y.-J. et al. Structure and expression analyses of SVA elements in relation to functional genes. Genomics Inform. 11, 142–148 (2013).

  62. Gianfrancesco, O. et al. The Role of SINE-VNTR-Alu (SVA) retrotransposons in shaping the human genome. Int. J. Mol. Sci. 20, 5977 (2019).

  63. Savage, A. L., Bubb, V. J., Breen, G. & Quinn, J. P. Characterisation of the potential function of SVA retrotransposons to modulate gene expression patterns. BMC Evol. Biol. 13, 101 (2013).

  64. Savage, A. L. et al. An evaluation of a SVA retrotransposon in the FUS promoter as a transcriptional regulator and its association to ALS. PLoS ONE 9, e90833 (2014).

  65. Gianfrancesco, O., Bubb, V. J. & Quinn, J. P. SVA retrotransposons as potential modulators of neuropeptide gene expression. Neuropeptides 64, 3–7 (2017).

  66. Quinn, J. P. & Bubb, V. J. SVA retrotransposons as modulators of gene expression. Mob. Genet. Elem. 4, e32102 (2014).

  67. Chander, V., Gibbs, R. A. & Sedlazeck, F. J. Evaluation of computational genotyping of structural variation for clinical diagnoses. Gigascience 8, giz110 (2019).

  68. Rausch, T. et al. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339 (2012).

  69. Layer, R. M., Chiang, C., Quinlan, A. R. & Hall, I. M. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 15, R84 (2014).

  70. Chen, X. et al. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32, 1220–1222 (2016).

  71. Chen, K. et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat. Methods 6, 677–681 (2009).

  72. Abyzov, A., Urban, A. E., Snyder, M. & Gerstein, M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 21, 974–984 (2011).

  73. Abyzov, A. et al. Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms. Nat. Commun. 6, 7256 (2015).

  74. Gardner, E. J. et al. The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology. Genome Res. 27, 1916–1929 (2017).

  75. Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).

  76. Geoffroy, V. et al. AnnotSV: an integrated tool for structural variations annotation. Bioinformatics 34, 3572–3574 (2018).

  77. Graffelman, J., Nelson, S., Gogarten, S. M. & Weir, B. S. Exact inference for Hardy–Weinberg proportions with missing genotypes: single and multiple imputation. G3 (Bethesda) 5, 2365–2373 (2015).

  78. Jiang, T. et al. Long-read-based human genomic structural variation detection with cuteSV. Genome Biol. 21, 189 (2020).

  79. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  80. Heller, D. & Vingron, M. SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics 36, 5519–5521 (2020).

  81. Zhao, X., Weber, A. M. & Mills, R. E. A recurrence-based approach for validating structural variation using long-read sequencing technology. Gigascience 6, 1–9 (2017).

  82. Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).

  83. Klein, H.-U. et al. Epigenome-wide study uncovers large-scale changes in histone acetylation driven by tau pathology in aging and Alzheimer’s human brains. Nat. Neurosci. 22, 37–46 (2019).

  84. Ping, L. et al. Global quantitative analysis of the human brain proteome in Alzheimer’s and Parkinson’s disease. Sci. Data 5, 180036 (2018).

  85. Johnson, E. C. B. et al. Deep proteomic network analysis of Alzheimer’s disease brain reveals alterations in RNA binding proteins and RNA splicing associated with disease. Mol. Neurodegener. 13, 1–22 (2018).

  86. Raj, T. et al. Integrative transcriptome analyses of the aging brain implicate altered splicing in Alzheimer’s disease susceptibility. Nat. Genet. 50, 1584–1592 (2018).

  87. The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

  88. Brechtmann, F. et al. OUTRIDER: a statistical method for detecting aberrantly expressed genes in RNA sequencing data. Am. J. Hum. Genet. 103, 907–917 (2018).

  89. Wan, Y.-W. et al. Meta-analysis of the Alzheimer’s disease human brain transcriptome and functional dissection in mouse models. Cell Rep. 32, 107908 (2020).

  90. Allen, M. et al. Gene expression, methylation and neuropathology correlations at progressive supranuclear palsy risk loci. Acta Neuropathol. 132, 197–211 (2016).

Acknowledgements

We thank the participants of AMP-AD cohorts for their essential contributions and gift to these projects. ROS/MAP study data were provided by the Rush Alzheimer’s Disease Center at Rush University Medical Center. Data collection was supported through funding by National Institute on Aging (NIA) grants P30AG10161, R01AG15819, R01AG17917, R01AG30146, R01AG36836, U01AG32984, U01AG46152 and U01AG61356 and by the Illinois Department of Public Health. Mayo RNA-seq study data were provided by the following sources: the Mayo Clinic Alzheimer’s Disease Genetic Studies, led by N. Ertekin-Taner and S. G. Younkin (Mayo Clinic, Jacksonville, Florida), using samples from the Mayo Clinic Study of Aging, the Mayo Clinic Alzheimer’s Disease Research Center and the Mayo Clinic Brain Bank. Data collection was supported through funding by NIA grants P50 AG016574, R01 AG032990, U01 AG046139, R01 AG018023, U01 AG006576, U01 AG006786, R01 AG025711, R01 AG017216 and R01 AG003949; by National Institute of Neurological Disorders and Stroke (NINDS) grant R01 NS080820; by the CurePSP Foundation; and by support from the Mayo Foundation. Study data include samples collected through the Sun Health Research Institute Brain and Body Donation Program of Sun City, Arizona. The Brain and Body Donation Program is supported by the NINDS (U24 NS072026, National Brain and Tissue Resource for Parkinson’s Disease and Related Disorders), the NIA (P30 AG19610, Arizona Alzheimer’s Disease Core Center), the Arizona Department of Health Services (contract 211002, Arizona Alzheimer’s Research Center), the Arizona Biomedical Research Commission (contracts 4001, 0011, 05-901 and 1001 to the Arizona Parkinson’s Disease Consortium) and the Michael J. Fox Foundation for Parkinson’s Research. Mount Sinai Brain Bank data were generated from postmortem brain tissue collected through the Mount Sinai VA Medical Center Brain Bank and were provided by E. Schadt of the Mount Sinai School of Medicine through funding from NIA grant U01AG046170. The authors thank B. Zhang and E. Wang for assistance with data sharing and members of the Raj and Crary laboratories for their feedback on the manuscript. We thank J. Humphrey for insightful comments and suggestions during this work. This work was supported by grants from the National Institutes of Health (NIH) (NIH NIA U01-AG068880, NIA R01-AG054005, NIA R56-AG055824 and NIA R01-AG054008). This work was supported, in part, through the computational and data resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai. We thank the Mount Sinai Technology Development core for help and support with performing long-read sequencing. Cartoons in Figs. 1 and 5b,c were created with BioRender. The research reported in this paper was supported by the Office of Research Infrastructure of the NIH under award number S10OD026880. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

Conceptualization: T.R. and R.A.V.; Methodology: T.R. and R.A.V.; Software: R.A.V.; Formal analysis: R.A.V. and K.P.L.; Resources and data curation: D.A.B., T.R. and J.F.C.; Writing—original draft: T.R. and R.A.V.; Writing—review and editing: T.R., D.A.B., J.F.C., R.A.V. and K.P.L.; Supervision, project administration and funding acquisition: T.R. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Towfique Raj.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Neuroscience thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Functional context and evolutionary constraints.

a, Cumulative fraction of SVs by minor allele frequency (MAF). b, Enrichment of SVs overlapping each region, stratified into common (MAF > 5%), rare (MAF < 5%) and singleton. Enrichment of OMIM genes (c), LoF-intolerant genes (d) and haploinsufficient genes (e) overlapping SVs in different frequency strata. Lines in the enrichment plots indicate Wald confidence intervals, and the midpoints represent the relative log odds.
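As an illustration of how a log odds estimate and its Wald confidence interval can be obtained, the following is a minimal sketch that assumes the enrichment is summarized as a 2×2 table of SV counts (in or out of a frequency stratum, overlapping or not overlapping a gene set). The function name, the table layout and the example counts are hypothetical and are not taken from the study's code.

```python
import math


def log_odds_wald(a, b, c, d, z=1.96):
    """Log odds ratio with a Wald confidence interval for a 2x2 table.

    Hypothetical layout (an assumption, not the authors' definition):
      a = SVs in the stratum that overlap the gene set
      b = SVs in the stratum that do not overlap it
      c = other SVs that overlap the gene set
      d = other SVs that do not overlap it
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio (Woolf's formula).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * se, log_or + z * se


# Made-up counts for illustration only.
estimate, lower, upper = log_odds_wald(20, 80, 10, 90)
print(f"log OR = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that spans zero, as in this made-up example, would indicate no clear enrichment or depletion at the chosen confidence level.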

Extended Data Fig. 2 Pairwise sharing of eQTLs among brain tissues and cohorts.

a, SV-eQTL sharing across different groups and regions measured by π1 from the qvalue R package. Columns represent the discovery sets, and rows represent the replication sets. b, Sharing according to mashR meta-analysis. SV-eQTLs with a local false sign rate (lfsr) lower than 0.05 in at least one of the two tissues were considered (n = 1,081–1,364 gene–SV pairs, depending on the pair of tissues compared). The lower triangle shows the proportion of sharing by sign (that is, effect estimates that have the same direction). The upper triangle shows the proportion of sharing by magnitude (that is, effect estimates that are in the same direction and within a factor of 2 in size).
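The two sharing measures in b, and a simplified version of the π1 statistic in a, can be sketched as follows. This is an illustrative sketch only: the function names and toy values are hypothetical, and the π1 estimate uses a single fixed λ rather than the more refined estimators offered by the qvalue package.

```python
def pairwise_sharing(beta_a, beta_b, lfsr_a, lfsr_b, cutoff=0.05, factor=2.0):
    """Proportion of pairs shared by sign and by magnitude between two tissues.

    A pair is considered if its lfsr is below `cutoff` in at least one tissue.
    """
    kept = [
        (x, y)
        for x, y, la, lb in zip(beta_a, beta_b, lfsr_a, lfsr_b)
        if min(la, lb) < cutoff
    ]
    if not kept:
        raise ValueError("no pair passes the lfsr cutoff")
    # Same direction of effect (x * y > 0 also guarantees both are non-zero).
    by_sign = sum(1 for x, y in kept if x * y > 0) / len(kept)
    # Same direction and effect sizes within the given factor of each other.
    by_magnitude = sum(
        1
        for x, y in kept
        if x * y > 0 and 1 / factor <= abs(x) / abs(y) <= factor
    ) / len(kept)
    return by_sign, by_magnitude


def pi1_fixed_lambda(replication_pvalues, lam=0.5):
    """Storey-type estimate of the non-null proportion at one fixed lambda."""
    m = len(replication_pvalues)
    pi0 = sum(1 for p in replication_pvalues if p > lam) / (m * (1 - lam))
    return 1 - min(1.0, pi0)


# Toy values for illustration only.
sign, magnitude = pairwise_sharing(
    [1.0, 0.5, -0.8, 0.3], [0.9, 2.0, 0.4, 0.3],
    [0.01, 0.2, 0.03, 0.5], [0.5, 0.04, 0.6, 0.9],
)
print(sign, magnitude, pi1_fixed_lambda([0.01, 0.02, 0.03, 0.6, 0.9]))
```

Here π1 would be computed on the replication-set P values of the associations that were significant in the discovery set.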

Extended Data Fig. 3 Comparison between brain and monocyte SV-eQTL effect sizes.

Scatter plot shows the slope of 429 eGenes mapped in ROS/MAP DLPFC and monocytes with a significant association in either dataset (FDR < 5%). Although the majority of effects are concordant in direction, many genes show opposite directions of effect between brain and monocytes (for example, ARL17B and CASP8). The x axis shows the effect size in DLPFC, and the y axis shows the effect size in monocytes for the same SV–gene pair. Dots colored in blue are significant only in monocytes, dots colored in grey are significant only in DLPFC and dots in red are significant in both. The Pearson correlation coefficient (and two-sided P value) of slopes for all 144 SV–gene pairs is shown on top.

Extended Data Fig. 4 SV-xQTL top hits.

Manhattan plots showing the top SV-xQTLs measured in ROS/MAP. Colored labels represent each SV class. a, SV-haQTL (H3K9ac), showing labels for associations with -log10(P-value)>30. b, SV-eQTL, labels for associations with -log10(P-value)>40. c, SV-sQTL, labels for associations with -log10(P-value)>40. d, SV-pQTL, labels for associations with -log10(P-value)>10.

Extended Data Fig. 5 SV-xQTL effect sizes.

Distribution of effect sizes for all SV-xQTLs by SV class. Plots on the left show results for all associated SVs; plots on the right show results only for SVs overlapping either the associated histone peak (SV-haQTL, a) or exonic regions of the associated gene (SV-eQTL in b and SV-pQTL in c).

Extended Data Fig. 6 SV-eQTL mediation by SV-pQTL.

a, The locus plot shows a 3.7 kb deletion (in red) that removes the splice acceptor sites on exon 16 of the gene ACOT11 (for which the deletion is both an SV-eQTL and an SV-pQTL). Genes and histone peaks colored in red had significant associations (FDR < 0.05) with the SV. b, Mediation analysis, performed on 112 biologically independent samples with both RNA-seq and proteomics data available, supports the mediation of the MROH7 SV-eQTL via the ACOT11 SV-pQTL (complete mediation posterior probability = 0.59). The scatter plot on the left shows the correlation between both phenotypes: the x axis is the residual mRNA expression of MROH7, and the y axis is the residual protein abundance of ACOT11. The Pearson correlation coefficient (R) and respective P value, as well as a linear regression line, are shown in the plot. In the box plots, the central line shows the median, the box spans the first to the third quartiles and the whiskers extend 1.5 times the IQR from the box. Nominal P values and effect sizes from the linear regression model are listed at the top of each box plot.

Extended Data Fig. 7 SV-xQTL in LD with schizophrenia GWAS variant.

a, Locus plot showing a 129 bp deletion that is in LD with a schizophrenia GWAS variant (rs8070345, R² = 0.94; ref. 6). The plot also shows genes and H3K9ac peaks near the SV. Genes colored in red represent phenotypes found to be significantly associated with the deletion at the RNA and protein levels (SV-eQTL and SV-pQTL at FDR < 5%). b, Box plot for the SV-eQTL association with the gene SRR (n = 456 biologically independent samples). c, Box plot for the SV-pQTL association with the gene SRR (n = 272 biologically independent samples). In the box plots, slopes (β) and FDR-adjusted P values are shown for each association (linear regression model), the central line shows the median, the box spans the first to the third quartiles and the whiskers extend 1.5 times the IQR from the box.

Extended Data Fig. 8 SV-xQTL in LD with schizophrenia GWAS variant.

a, Locus plot showing a 5,191 bp deletion that is in LD with a schizophrenia GWAS variant (rs66691851, R² = 0.95; ref. 6). The plot also shows genes and H3K9ac peaks near the SV. Genes and H3K9ac bars colored in red represent phenotypes found to be significantly associated with the deletion (SV-eQTL and SV-haQTL at FDR < 5%). b, Box plot for the SV-eQTL association with the PCCB gene (n = 456 biologically independent samples). c, Box plot for the SV-haQTL association with a peak in the promoter region of STAG1 (n = 571 biologically independent samples). In the box plots, slopes (β) and FDR-adjusted P values are shown for each association (linear regression model), the central line shows the median, the box spans the first to the third quartiles and the whiskers extend 1.5 times the IQR from the box.

Extended Data Fig. 9 SV-xQTL in LD with Alzheimer’s disease GWAS variant.

a, Locus plot showing an 82 bp insertion that is in LD with an Alzheimer’s disease GWAS variant (rs73045691, R² = 0.80; ref. 6). The plot also shows genes and H3K9ac peaks near the SV. Genes colored in red represent phenotypes found to be significantly associated with the insertion (SV-eQTL and SV-sQTL at FDR < 5%). b, Box plot for the SV-eQTL association with the APOC1 gene (n = 456 biologically independent samples). c, Box plot for the SV-sQTL association with splicing of APOC2 (n = 505 biologically independent samples). In the box plots, slopes (β) and FDR-adjusted P values are shown for each association (linear regression model), the central line shows the median, the box spans the first to the third quartiles and the whiskers extend 1.5 times the IQR from the box.

Extended Data Fig. 10 Quality assessment of variant calling.

In silico benchmarking and validation. a, Benchmarking of individual SV discovery tools and combined tools (“Merged”) for the sample HG002, evaluated against the Genome in a Bottle v0.6 Tier 1 call set using truvari. The “Merged” strategy was defined by the best F1 score after testing all possible combinations of tools (for insertions and deletions separately). The same merging criteria were applied to all samples in AMP-AD. b, Benchmarking results of all 1,760 AMP-AD samples evaluated against the Genome in a Bottle v0.6 Tier 1 call set using truvari. In the box plots, the central line shows the median, the box spans the first to the third quartiles and the whiskers extend 1.5 times the IQR from the box.
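The selection of a merging strategy by F1 score can be illustrated with a minimal sketch. It assumes that each candidate combination of tools has already been compared against the truth set and reduced to counts of true positives, false positives and false negatives; the combination labels and counts below are made up and do not reflect the study's benchmarking results.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 score from benchmark counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_combination(counts):
    """Return the label of the combination with the highest F1 score.

    `counts` maps a label to a (tp, fp, fn) tuple.
    """
    return max(counts, key=lambda label: precision_recall_f1(*counts[label])[2])


# Hypothetical counts for one SV type (for example, deletions).
deletion_counts = {
    "toolA": (80, 20, 20),
    "toolB": (50, 5, 50),
    "toolA+toolB": (90, 30, 10),
}
print(best_combination(deletion_counts))
```

Running the selection separately per SV type, as the caption describes for insertions and deletions, only requires one such table of counts for each type.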

Supplementary information

Supplementary Information

Supplementary Figs. 1–22, Supplementary Tables 1–5 and Supplementary Methods

Reporting Summary

About this article

Cite this article

Vialle, R.A., de Paiva Lopes, K., Bennett, D.A. et al. Integrating whole-genome sequencing with multi-omic data reveals the impact of structural variants on gene regulation in the human brain. Nat Neurosci 25, 504–514 (2022). https://doi.org/10.1038/s41593-022-01031-7

  • DOI: https://doi.org/10.1038/s41593-022-01031-7
