Abstract
Common genetic risk for neuropsychiatric disorders is enriched in regulatory elements active during cortical neurogenesis. However, how these variants influence gene regulation remains poorly understood. To model the functional impact of common genetic variation on the noncoding genome during human cortical development, we performed the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and analyzed chromatin accessibility quantitative trait loci (QTL) in cultured human neural progenitor cells and their differentiated neuronal progeny from 87 donors. We identified significant genetic effects on 988/1,839 neuron/progenitor regulatory elements, with highly cell-type and temporally specific effects. A subset (roughly 30%) of chromatin accessibility-QTL were also associated with changes in gene expression. Motif-disrupting alleles of transcriptional activators generally led to decreases in chromatin accessibility, whereas motif-disrupting alleles of repressors led to increases in chromatin accessibility. By integrating cell-type-specific chromatin accessibility-QTL and brain-relevant genome-wide association data, we were able to fine-map and identify regulatory mechanisms underlying noncoding neuropsychiatric disorder risk loci.
Data availability
Data generated in this paper (including metadata) can be accessed via dbGaP at accession number phs001958.v1.p1. The RNA-seq and genotype datasets used for fetal cortical eQTL analysis are available at dbGaP with accession number phs001900. REST ChIP–seq data in H1 embryonic stem cells and neurons differentiated from H1 cells are available via the ENCODE portal (https://www.encodeproject.org/) with the following identifiers: ENCSR000BTV and ENCSR000BHM.
Code availability
All code used in this paper is deposited on Bitbucket at https://bitbucket.org/steinlabunc/celltypespecificcaqtls_wasp/src/master/.
References
Grasby, K. L. et al. The genetic architecture of the human cerebral cortex. Science 367, eaay6690 (2020).
Sullivan, P. F. & Geschwind, D. H. Defining the genetic, genomic, cellular, and diagnostic architectures of psychiatric disorders. Cell 177, 162–183 (2019).
Barešić, A., Nash, A. J., Dahoun, T., Howes, O. & Lenhard, B. Understanding the genetics of neuropsychiatric disorders: the potential role of genomic regulatory blocks. Mol. Psychiatry 25, 6–18 (2019).
Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).
Lee, P. H. et al. Principles and methods of in-silico prioritization of non-coding regulatory variants. Hum. Genet. 137, 15–30 (2018).
Albert, F. W. & Kruglyak, L. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).
Kumasaka, N., Knights, A. J. & Gaffney, D. J. High-resolution genetic mapping of putative causal interactions between regions of open chromatin. Nat. Genet. 51, 128–137 (2019).
Davis, C. A. et al. The encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 46, D794–D801 (2018).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
GTEx Consortium. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).
PsychENCODE Consortium. et al. The PsychENCODE project. Nat. Neurosci. 18, 1707–1712 (2015).
Won, H. et al. Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527 (2016).
Walker, R. L. et al. Genetic control of expression and splicing in developing human brain informs disease mechanisms. Cell 179, 750–771.e22 (2019).
de la Torre-Ubieta, L. et al. The dynamic landscape of open chromatin during human cortical neurogenesis. Cell 172, 289–304.e18 (2018).
Bryois, J. et al. Evaluation of chromatin accessibility in prefrontal cortex of individuals with schizophrenia. Nat. Commun. 9, 3121 (2018).
Schwartzentruber, J. et al. Molecular and functional variation in iPSC-derived sensory neurons. Nat. Genet. 50, 54–61 (2018).
Stein, J. L. et al. A quantitative framework to evaluate modeling of cortical development by neural stem cells. Neuron 83, 69–86 (2014).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Hansen, D. V., Lui, J. H., Parker, P. R. L. & Kriegstein, A. R. Neurogenic radial glia in the outer subventricular zone of human neocortex. Nature 464, 554–561 (2010).
Pollen, A. A. et al. Molecular identity of human outer radial glia during cortical development. Cell 163, 55–67 (2015).
Polioudakis, D. et al. A single-cell transcriptomic atlas of human neocortical development during mid-gestation. Neuron 103, 785–801.e8 (2019).
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Yang, J., Zaitlen, N. A., Goddard, M. E., Visscher, P. M. & Price, A. L. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 46, 100–106 (2014).
Pickrell, J. K. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).
Aygün, N. et al. Genetic influences on cell type specific gene expression and splicing during neurogenesis elucidate regulatory mechanisms of brain traits. Preprint at bioRxiv https://doi.org/10.1101/2020.10.21.349019 (2020).
Huang, Q. Q., Ritchie, S. C., Brozynska, M. & Inouye, M. Power, false discovery rate and Winner’s Curse in eQTL studies. Nucleic Acids Res. 46, e133 (2018).
Gate, R. E. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat. Genet. 50, 1140–1150 (2018).
Loeb-Hennard, C., Cousin, X., Prengel, I. & Kremmer, E. Cloning and expression pattern of vat-1 homolog gene in zebrafish. Gene Expr. Patterns 5, 91–96 (2004).
Feng, L., Hatten, M. E. & Heintz, N. Brain lipid-binding protein (BLBP): a novel signaling system in the developing mammalian CNS. Neuron 12, 895–908 (1994).
Hsu, Y.-C. et al. Brain-specific 1B promoter of FGF1 gene facilitates the isolation of neural stem/progenitor cells with self-renewal and multipotent capacities. Dev. Dyn. 238, 302–314 (2009).
Ballas, N., Grunseich, C., Lu, D. D., Speh, J. C. & Mandel, G. REST and its corepressors mediate plasticity of neuronal gene chromatin throughout neurogenesis. Cell 121, 645–657 (2005).
Pastinen, T. Genome-wide allele-specific analysis: insights into regulatory variation. Nat. Rev. Genet. 11, 533–538 (2010).
Heinz, S., Romanoski, C. E., Benner, C. & Glass, C. K. The selection and function of cell type-specific enhancers. Nat. Rev. Mol. Cell Biol. 16, 144–154 (2015).
Diedenhofen, B. & Musch, J. cocor: a comprehensive solution for the statistical comparison of correlations. PLoS ONE 10, e0121945 (2015).
Behera, V. et al. Exploiting genetic variation to uncover rules of transcription factor binding and chromatin accessibility. Nat. Commun. 9, 782 (2018).
Bergsland, M., Werme, M., Malewicz, M., Perlmann, T. & Muhr, J. The establishment of neuronal properties is controlled by Sox4 and Sox11. Genes Dev. 20, 3475–3486 (2006).
Pattabiraman, K., Shibata, M., Lorente Galdos, B., Andrijevic, D. & Sestan, N. Regulation of prefrontal patterning, connectivity and synaptogenesis by retinoic acid. Biol. Psychiatry 87, S132 (2020).
Tsunemoto, R. et al. Diverse reprogramming codes for neuronal identity. Nature 557, 375–380 (2018).
He, X. et al. Expression of a large family of POU-domain regulatory genes in mammalian brain development. Nature 340, 35–41 (1989).
Wang, H. et al. ZEB1 represses neural differentiation and cooperates with CTBP2 to dynamically regulate cell migration during neocortex development. Cell Rep. 27, 2335–2353.e6 (2019).
Rakic, P. Specification of cerebral cortical areas. Science 241, 170–176 (1988).
Li, S. et al. Regulatory mechanisms of major depressive disorder risk variants. Mol. Psychiatry 25, 1926–1945 (2020).
Dobbyn, A. et al. Landscape of conditional eQTL in dorsolateral prefrontal cortex and co-localization with schizophrenia GWAS. Am. J. Hum. Genet. 102, 1169–1184 (2018).
Li, M. et al. Integrative functional genomic analysis of human brain development and neuropsychiatric risks. Science 362, eaat7615 (2018).
Skene, N. G. et al. Genetic identification of brain cell types underlying schizophrenia. Nat. Genet. 50, 825–833 (2018).
Gaffney, D. J. et al. Dissecting the regulatory architecture of gene expression QTLs. Genome Biol. 13, R7 (2012).
Bell, J. T. et al. DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biol. 12, R10 (2011).
Chen, K. & Rajewsky, N. Natural selection on human microRNA binding sites inferred from SNP data. Nat. Genet. 38, 1452–1456 (2006).
Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431 (2018).
Thiel, G., Greengard, P. & Südhof, T. C. Characterization of tissue-specific transcription by the human synapsin I gene promoter. Proc. Natl Acad. Sci. USA 88, 3431–3435 (1991).
Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).
Orchard, P., Kyono, Y., Hensley, J., Kitzman, J. O. & Parker, S. C. J. Quantification, dynamic visualization, and validation of bias in ATAC-seq data with ataqv. Cell Syst. 10, 298–306.e4 (2020).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arXiv https://arxiv.org/abs/1303.3997 (2013).
van de Geijn, B., McVicker, G., Gilad, Y. & Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat. Methods 12, 1061–1063 (2015).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Feng, J., Liu, T. & Zhang, Y. Using MACS to identify peaks from ChIP–seq data. Curr. Protoc. Bioinforma. 34, 2.14.1–2.14.14 (2011).
Lun, A. T. L. & Smyth, G. K. CSAW: a Bioconductor package for differential binding analysis of ChIP–seq data using sliding windows. Nucleic Acids Res. 44, e45 (2016).
Hansen, K. D., Irizarry, R. A. & Wu, Z. Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 13, 204–216 (2012).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Roadmap Epigenomics Consortium. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
Ernst, J. & Kellis, M. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat. Biotechnol. 33, 364–376 (2015).
Alexa, A. & Rahnenfuhrer, J. topGO: Enrichment analysis for gene ontology. R Package Version 2 (2010).
Mathelier, A. et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 44, D110–D115 (2016).
Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012).
Delaneau, O., Marchini, J. & Zagury, J.-F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2011).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
Price, A. L., Zaitlen, N. A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).
Davis, J. R. et al. An efficient multiple-testing adjustment for eQTL studies that accounts for linkage disequilibrium between variants. Am. J. Hum. Genet. 98, 216–224 (2016).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).
Shim, H. et al. A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 Caucasians. PLoS ONE 10, e0120758 (2015).
Kang, H. M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).
Han, B. & Eskin, E. Interpreting meta-analyses of genome-wide association studies. PLoS Genet. 8, e1002555 (2012).
Dabney, A., Storey, J. D. & Warnes, G. R. qvalue: Q-value estimation for false discovery rate control. R Package Version 1 (2010).
Coetzee, S. G., Coetzee, G. A. & Hazelett, D. J. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 31, 3847–3849 (2015).
Touzet, H. & Varré, J.-S. Efficient and accurate P-value computation for position weight matrices. Algorithms Mol. Biol. 2, 15 (2007).
Shannon, P. & Richards, M. MotifDb: An annotated collection of protein-DNA binding sequence motifs. R Package Version 1 (2014).
Finucane, H. K. et al. Partitioning heritability by functional category using GWAS summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Demontis, D. et al. Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder. Nat. Genet. 51, 63–75 (2019).
Grove, J. et al. Identification of common genetic risk variants for autism spectrum disorder. Nat. Genet. 51, 431–444 (2019).
Savage, J. E. et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat. Genet. 50, 912–919 (2018).
Wray, N. R. et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet. 50, 668–681 (2018).
Stahl, E. A. et al. Genome-wide association study identifies 30 loci associated with bipolar disorder. Nat. Genet. 51, 793–803 (2019).
Pardiñas, A. F. et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat. Genet. 50, 381–389 (2018).
Jansen, P. R. et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways. Nat. Genet. 51, 394–403 (2019).
Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121 (2018).
Okbay, A. et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat. Genet. 48, 624–633 (2016).
Nagel, M. et al. Meta-analysis of genome-wide association studies for neuroticism in 449,484 individuals identifies novel genetic loci and pathways. Nat. Genet. 50, 920–927 (2018).
Duncan, L. et al. Significant locus and metabolic genetic correlations revealed in genome-wide association study of anorexia nervosa. Am. J. Psychiatry 174, 850–858 (2017).
Otowa, T. et al. Meta-analysis of genome-wide association studies of anxiety disorders. Mol. Psychiatry 21, 1391–1399 (2016).
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
International League Against Epilepsy Consortium on Complex Epilepsies. Genome-wide mega-analysis identifies 16 loci and highlights diverse biological mechanisms in the common epilepsies. Nat. Commun. 9, 5269 (2018).
Nalls, M. A. et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 18, 1091–1102 (2019).
Civelek, M. et al. Genetic regulation of adipose gene expression and cardio-metabolic traits. Am. J. Hum. Genet. 100, 428–443 (2017).
Hahne, F. & Ivanek, R. Visualizing genomic data using GVIZ and Bioconductor. Methods Mol. Biol. 1418, 335–351 (2016).
Han, D. W. et al. Direct reprogramming of fibroblasts into neural stem cells by defined factors. Cell Stem Cell 10, 465–472 (2012).
Liang, H. et al. Neural development is dependent on the function of specificity protein 2 in cell cycle progression. Development 140, 552–561 (2013).
Naka, H., Nakamura, S., Shimazaki, T. & Okano, H. Requirement for COUP-TFI and II in the temporal specification of neural stem cells in CNS development. Nat. Neurosci. 11, 1014–1023 (2008).
Liu, Y. & Zhang, Y. ETV5 is essential for neuronal differentiation of human neural progenitor cells by repressing NEUROG2 expression. Stem Cell Rev. Rep. 15, 703–716 (2019).
Araújo, J. A. et al. Direct reprogramming of adult human somatic stem cells into functional neurons using Sox2, Ascl1, and Neurog2. Front. Cell. Neurosci. 12, https://doi.org/10.3389/fncel.2018.00155 (2018).
Fathi, A., Rasouli, H., Yeganeh, M., Salekdeh, G. H. & Baharvand, H. Efficient differentiation of human embryonic stem cells toward dopaminergic neurons using recombinant LMX1A factor. Mol. Biotechnol. 57, 184–194 (2015).
Cubelos, B., Briz, C. G., Esteban-Ortega, G. M. & Nieto, M. Cux1 and Cux2 selectively target basal and apical dendritic compartments of layer II-III cortical neurons. Dev. Neurobiol. 75, 163–172 (2015).
Zimmer, C., Tiveron, M.-C., Bodmer, R. & Cremer, H. Dynamics of Cux2 expression suggests that an early pool of SVZ precursors is fated to become upper cortical layer neurons. Cereb. Cortex 14, 1408–1420 (2004).
Acknowledgements
This work was supported by the National Institutes of Health (NIH) (grant nos. R00MH102357, U54EB020403, R01MH118349 and R01MH120125), Brain Research Foundation and NC TraCS Pilot funding to J.L.S. D.H.G. was supported by NIH (grant nos. R37 MH060233, R01 MH094714, U01MH116489 and R01 MH110927). The following core facilities were used for this project: UNC Neuroscience Center Microscopy Core (P30NS045892), UNC Mammalian Genotyping Core, CGIBD Advanced Analytics Core (NIH grant no. P30 DK034987), UNC Flow Cytometry Core Facility, UNC Vector Core and UNC Research Computing. Additional core facilities used for this project were UCLA CFAR (5P30 AI028697) and the UCLA Neuroscience Genomics Core. We thank K.L. Mohlke for helpful comments. Adult caQTLs were supported by the PsychENCODE Consortium: grant nos. U01MH103392, U01MH103365, U01MH103346, U01MH103340, U01MH103339, R21MH109956, R21MH105881, R21MH105853, R21MH103877, R21MH102791, R01MH111721, R01MH110928, R01MH110927, R01MH110926, R01MH110921, R01MH110920, R01MH110905, R01MH109715, R01MH109677, R01MH105898, R01MH105898, R01MH094714, P50MH106934, U01MH116488, U01MH116487, U01MH116492, U01MH116489, U01MH116438, U01MH116441, U01MH116442, R01MH114911, R01MH114899, R01MH114901, R01MH117293, R01MH117291 and R01MH117292 awarded to: S. Akbarian (Icahn School of Medicine at Mount Sinai), G. Crawford (Duke University), S. Dracheva (Icahn School of Medicine at Mount Sinai), P. Farnham (University of Southern California), M. Gerstein (Yale University), D.H.G. (University of California, Los Angeles), F. Goes (Johns Hopkins University), T.M. Hyde (Lieber Institute for Brain Development), A. Jaffe (Lieber Institute for Brain Development), J.A. Knowles (University of Southern California), C. Liu (SUNY Upstate Medical University), D. Pinto (Icahn School of Medicine at Mount Sinai), P. Roussos (Icahn School of Medicine at Mount Sinai), S. Sanders (University of California, San Francisco), N. Sestan (Yale University), P. Sklar (Icahn School of Medicine at Mount Sinai), M. State (University of California, San Francisco), P. Sullivan (University of North Carolina), F. Vaccarino (Yale University), D. Weinberger (Lieber Institute for Brain Development), S. Weissman (Yale University), K. White (University of Chicago), J. Willsey (University of California, San Francisco) and P. Zandi (Johns Hopkins University). We acknowledge the ENCODE Consortium and the ENCODE production laboratory(ies) generating the particular dataset(s).
Contributions
J.L.S., D.H.G. and L.T.U. conceived the study. J.L.S. directed and supervised the study. J.L.S. along with D.H.G. provided funding. A.L.E., K.E.C., K.P.C., M.Y., L.T.U. and J.L.S. cultured HNP cells. A.L.E. performed library preparation. M.J.L. preprocessed the RNA-seq data for eQTL. N.A. performed eQTL analysis. O.K. performed immunocytochemistry. O.K., J.M.W., F.A.K. and D.L. performed the functional validation assays. M.E.G., A.A.-K. and G.E.C. provided access to adult dlPFC caQTL data. M.I.L. aided in ASCA methodology. D.L. performed preprocessing, differential accessibility, caQTL, ASCA, colocalization and motif analyses. J.L.S. and D.L. wrote the paper. All authors commented on and approved the final version of the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Neuroscience thanks Andrew Jaffe and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Flowchart for cell culture and preprocessing of ATAC-seq data.
a, Flowchart of cell culture for 17 rounds. b, The FACS gates for sorting EGFP+ neurons. c, Images of immunofluorescence for cell markers in progenitor cultures. Immunolabeling experiments were repeated in at least 10 unique donor cell lines with similar results. The scale bar represents 100 μm. d, Images of immunofluorescence for cell markers in 8-week differentiated cultures. Immunolabeling experiments were repeated in at least 10 unique donor cell lines with similar results. The scale bar represents 100 μm. e, Box plots of total sequence depth (forward reads and reverse reads), unique read number (forward reads and reverse reads), duplicate rate, mitochondrial duplicate rate, TSS enrichment and the fraction of reads in called peak regions (FRiP score) in neurons (N = 61) and progenitors (N = 76) compared to previously published data (N_GZ = 3 biologically independent samples with 3-4 replicates for each sample, N_CP = 3 biologically independent samples with 3 replicates for each sample; ref. 15). The center of the box is the median of the data, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data.
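The box-plot convention used in panel e (and in the later Extended Data box plots) can be illustrated with a minimal sketch. The function name `box_plot_summary`, the returned keys and the choice of the "inclusive" quartile rule are assumptions made for illustration; the plotting software used for the figures may interpolate quartiles differently.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Summarize data the way the caption describes a box plot (illustrative only)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # 25th percentile, median and 75th percentile (assumed 'inclusive' rule).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers reach the most extreme observations within 1.5 * IQR of the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "minimum": data[0],
        "lower_whisker": min(v for v in data if v >= lower_fence),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(v for v in data if v <= upper_fence),
        "maximum": data[-1],
    }


if __name__ == "__main__":
    # Hypothetical per-sample values, not data from the study.
    print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Points that fall beyond a whisker (such as the value 100 in the hypothetical input) are still reported as the maximum or minimum of the data.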
Extended Data Fig. 2 ATAC-seq data QC.
a, Peak calling versus library sequencing depth. We observed a slower rise in the number of new peaks called after 15 million filtered read pairs. This indicates a reasonable balance between read depth and number of peaks called using an average of 14 million read pairs after filtering in our samples. b, Insert size histograms for 3 randomly selected neuron and progenitor samples. c, PCA plot for ATAC-seq data (N = 137) before batch correction (left) and after batch correction (right), colored by sorter. We corrected normalized reads within ATAC-seq peaks in neurons by sorter locations. Then, we corrected normalized reads within ATAC-seq peaks in neurons and progenitors by cell culture round. d, Correlations of batch-corrected normalized reads across donors and within donors. Correlations within donors were significantly higher than correlations across donors in progenitors (n = 15). Correlations within donors were higher than correlations across donors in neurons (n = 4), but the difference was not significant (p = 0.07). P values are estimated by two-sided Wilcoxon tests. The center of the box is the median of the data, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data. e, Correlations between PC1 to PC10 from normalized reads in neurons with known technical and biological factors. f, Correlations between PC1 to PC10 from batch-corrected normalized reads in progenitors with known technical and biological factors.
Extended Data Fig. 3 Annotating differentially accessible peaks during neuronal differentiation.
a, Gene ontology (GO) enrichment of differentially accessible peaks at the TSS. Progenitor peaks (left) and neuron peaks (right) showed enrichment for GO terms related to proliferation and differentiation, as expected. b, TFs with significantly differentially enriched conserved binding sites in differentially accessible peaks. The statistical test identifies TFs likely involved in neural progenitor proliferation and maintenance (progenitorTFs; top) or neurogenesis and maturation (neuronTFs; bottom). The top 30 significantly enriched TFs are shown in this figure, and the full list can be found in Supplementary Table 2. Within progenitorTFs, we found TFs previously characterized to have key roles for neural stem cell renewal and reprogramming, such as SOX2 (refs. 101,102), and those known to be required for the maintenance of stem cells in cortex, such as NR2F1, ETV5 and SP2 (refs. 103,104,105). Within neuronTFs, NEUROG2 and LMX1A were identified, which are known to drive neuronal differentiation (refs. 106,107), as well as TFs shown to induce neuronal identity from fibroblasts, including ASCL2 and the POU family (ref. 39). NeuronTFs also included CUX1/2, a marker for layer II-III neurons (refs. 61,108), and other laminar markers such as TBR1 and FOXP1. c, Schematic of known functions for selected progenitorTFs and neuronTFs.
Extended Data Fig. 4 Features of caQTLs.
a, Flowchart for caQTL data analysis. b, PCA plot for ATAC-seq data on sex chromosomes (chrX and chrY), colored by sex from genotype data, showing that sex could be called using ATAC-seq data. c, MDS plot for genotype data of HapMap3 and donors in this study, colored by populations from HapMap3 data. ASW: African ancestry in Southwest USA; CEU: Utah residents with Northern and Western European ancestry from the CEPH collection; CHB: Han Chinese in Beijing, China; CHD: Chinese in Metropolitan Denver, Colorado; GIH: Gujarati Indians in Houston, Texas; JPT: Japanese in Tokyo, Japan; LWK: Luhya in Webuye, Kenya; MEX: Mexican ancestry in Los Angeles, California; MKK: Maasai in Kinyawa, Kenya; TSI: Toscans in Italy; YRI: Yoruba in Ibadan, Nigeria. d, Neuron and progenitor caPeaks enrichment at epigenetically annotated regulatory elements from fetal brain (Epigenetics Roadmap ID = E081). e, Comparison of percent variance explained (r²) for shared neuron/progenitor caQTLs and fetal brain eQTLs (subset to the same sample size). P values are estimated by two-sided paired Student's t-tests. The center of the box is the median of the data, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data.
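As a rough illustration of the percent variance explained compared in panel e, the sketch below regresses accessibility on allele dosage and reports the squared Pearson correlation. This is a simplification and not the study's method: the captions state that caQTL P values come from a mixed linear model, whereas this sketch uses ordinary least squares with no covariates or relatedness term. The function name `variance_explained` and the example values are hypothetical.

```python
def variance_explained(dosages, accessibility):
    """Return (slope, r_squared) for accessibility regressed on allele dosage.

    Simplified ordinary least squares; ignores covariates and kinship
    (assumption made for illustration only).
    """
    n = len(dosages)
    if n != len(accessibility) or n < 3:
        raise ValueError("need matching inputs with at least three donors")
    mean_x = sum(dosages) / n
    mean_y = sum(accessibility) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in accessibility)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, accessibility))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or accessibility has no variance")
    slope = sxy / sxx                 # effect size per alternate allele
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return slope, r_squared


if __name__ == "__main__":
    # Hypothetical dosages (0, 1 or 2 alternate alleles) and normalized accessibility.
    slope, r2 = variance_explained([0, 1, 2, 0, 1, 2], [1.1, 1.9, 3.2, 0.8, 2.1, 2.9])
    print(f"slope={slope:.3f}, r2={r2:.3f}")
```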
Extended Data Fig. 5 Examples of fine-mapping and regulatory mechanisms underlying eQTLs.
a, Colocalization of a progenitor-specific caQTL and fetal cortical eQTL for ETFDH. b, caQTL for rs11544037 and the labeled peak in progenitors (N = 76). P values are estimated by a mixed linear effects model using a two-sided test (Methods). c, eQTL of ETFDH in bulk fetal cortex (N = 235). P values are estimated by a mixed linear effects model using a two-sided test (Methods). d, The expression of TFs whose motifs are disrupted by rs11544037 (ref. 22) (LFC = -0.32, FDR = 7.55e-18; ref. 26). e, The motif logo of RAD21, where the red box shows the position disrupted by rs11544037. Schematic cartoon of mechanisms for rs11544037 regulating chromatin accessibility and gene expression. f, Luciferase signals for alleles of rs11544037 in progenitors (N = 8). P value is from two-sided paired t-tests. g, Colocalization of a progenitor-specific caQTL and eQTL for FGF1. h, caQTL for rs11960262 and the labeled peak in progenitors (N = 76). P values are estimated by a mixed linear effects model using a two-sided test (Methods). i, eQTL of FGF1 in progenitors (N = 85). P values are estimated by a mixed linear effects model using a two-sided test (Methods). j, The expression of TFs whose motifs are disrupted by rs11960262. k, The motif logo of EGR1, where the red box shows the position disrupted by rs11960262. Schematic cartoon of mechanisms for rs11960262 regulating chromatin accessibility and gene expression. (For box plots in (b-c), (f) and (h-i), the center of the box is the median, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data.)
Extended Data Fig. 6 Features of ASCA.
a, Density plot for caPeak length from shared caQTLs and ASCA, and from peaks only significant in ASCA in neurons (top) and progenitors (bottom). P values are estimated by two-sided Student’s t-tests. b, The neuron ASCA (caSNP: rs62332390; caPeak: chr4:148,441,611-148,446,300; P values are estimated by the negative binomial generalized linear models from DESeq2 using a two-sided test; ref. 62) is not a significant caQTL (N = 61; P values are estimated by the mixed linear effects model using a two-sided test) in neurons because the caPeak was very wide (4,689 bp) and only the region near the ASCA SNP shows an association with genotype. c, The neuron ASCA (caSNP: rs77191441; caPeak: chr5:116,571,961-116,576,710; P values are estimated by the negative binomial generalized linear models from DESeq2 using a two-sided test; ref. 62) is not a significant caQTL (N = 61; P values are estimated by the mixed linear effects model using a two-sided test) in neurons because its low minor allele frequency reduces the power to detect a caQTL. d, ASCA between rs185220 (see Fig. 3) and chromatin accessibility in progenitors (left) and neurons (right). P values are estimated by the negative binomial generalized linear models from DESeq2 using a two-sided test (ref. 62). (For box plots in (b) and (c), the center of the box is the median, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data.)
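The idea behind allele-specific chromatin accessibility (ASCA) can be sketched for a single heterozygous donor: if an allele affects accessibility, ATAC-seq reads covering the SNP should depart from an even split between the two alleles. The sketch below swaps in an exact two-sided binomial test against a 50:50 expectation, which is a deliberately simpler technique than the DESeq2 negative binomial generalized linear models named in the caption; the function name `allelic_imbalance` and the read counts are hypothetical.

```python
from math import comb


def allelic_imbalance(ref_reads, alt_reads):
    """Return (reference allele fraction, two-sided exact binomial P value).

    Tests the null hypothesis that reads are drawn from both alleles with
    equal probability. Illustrative stand-in, not the DESeq2-based ASCA test.
    """
    if ref_reads < 0 or alt_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads overlap the SNP")
    ref_fraction = ref_reads / total
    # Under p = 0.5 the distribution is symmetric, so the two-sided P value
    # is twice the smaller tail, capped at 1.
    smaller = min(ref_reads, alt_reads)
    tail = sum(comb(total, i) for i in range(smaller + 1))
    p_value = min(1.0, 2 * tail / 2 ** total)
    return ref_fraction, p_value


if __name__ == "__main__":
    # Hypothetical allele counts at a heterozygous site in one donor.
    fraction, p = allelic_imbalance(ref_reads=8, alt_reads=2)
    print(f"reference fraction={fraction:.2f}, P={p:.4f}")
```

Because the test uses only reads that overlap the SNP, it is insensitive to signal elsewhere in a wide peak, which mirrors the caption's explanation for sites detected by ASCA but not as caQTLs.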
Extended Data Fig. 7 Comparison to adult dorsolateral prefrontal cortex (DLPFC) caQTLs.
a, Shared accessible peaks overlap at epigenetically annotated regulatory elements from different tissues. The percentage of accessible peak bp overlapping epigenetically annotated regulatory elements is shown. From left to right, tissues are ordered by bp percentage overlap with enhancers and promoters. Shared peaks overlap with both adult and fetal regulatory elements. b, PCA plot for read counts from shared peaks in adult DLPFC, neurons and progenitors. c, Correlations of effect sizes for significant neuron caQTLs and the same SNP-peak pairs in adult DLPFC (left). Correlations of effect sizes for significant progenitor caQTLs and the same SNP-peak pairs in adult DLPFC (right).
Extended Data Fig. 8 An example of a neuron-specific caQTL leading to regulatory mechanisms underlying GWAS loci.
a, Numbers of colocalizations between ASCA and GWAS loci. b, The neuron-specific significant caQTL (caSNP: rs9930307; caPeak: chr16:9,805,221-9,805,420) colocalized with a schizophrenia GWAS locus (index SNP: rs7191183). c, Box plot for the caQTL (left, N = 61; P values are estimated by the mixed linear effects model using a two-sided test) and ASCA (right) (caSNP: rs9930307; caPeak: chr16:9,805,221-9,805,420; P values are estimated by the negative binomial generalized linear models from DESeq2 using a two-sided test; ref. 62). d, The expression of TFs whose motifs are disrupted by rs9930307. e, The motif logo of TP53 and the position disrupted by rs9930307. f, The box plot for luciferase signal for alleles of rs9930307 in progenitors (N = 8). P value is from two-sided paired Student's t-tests. (For box plots in (c) and (f), the center of the box is the median of the data, the bounds of the box are the 25th percentile and 75th percentile of the data, and the whisker boundary is 1.5 times the IQR. Maximum and minimum are the maximum and minimum of the data.)
Cite this article
Liang, D., Elwell, A.L., Aygün, N. et al. Cell-type-specific effects of genetic variation on chromatin accessibility during human neuronal differentiation. Nat Neurosci 24, 941–953 (2021). https://doi.org/10.1038/s41593-021-00858-w
DOI: https://doi.org/10.1038/s41593-021-00858-w