Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer’s and Parkinson’s diseases

Corces, M. Ryan; Shcherbina, Anna; Kundu, Soumya; Gloudemans, Michael J.; Frésard, Laure; Granja, Jeffrey M.; Louie, Bryan H.; Eulalio, Tiffany; Shams, Shadi; Bagdatli, S. Tansu; Mumbach, Maxwell R.; Liu, Boxiang; Montine, Kathleen S.; Greenleaf, William J.; Kundaje, Anshul; Montgomery, Stephen B.; Chang, Howard Y.; Montine, Thomas J.

doi:10.1038/s41588-020-00721-x

Article
Published: 26 October 2020

Nature Genetics volume 52, pages 1158–1168 (2020)

Abstract

Genome-wide association studies of neurological diseases have identified thousands of variants associated with disease phenotypes. However, most of these variants do not alter coding sequences, making it difficult to assign their function. Here, we present a multi-omic epigenetic atlas of the adult human brain through profiling of single-cell chromatin accessibility landscapes and three-dimensional chromatin interactions of diverse adult brain regions across a cohort of cognitively healthy individuals. We developed a machine-learning classifier to integrate this multi-omic framework and predict dozens of functional SNPs for Alzheimer’s and Parkinson’s diseases, nominating target genes and cell types for previously orphaned loci from genome-wide association studies. Moreover, we dissected the complex inverted haplotype of the MAPT (encoding tau) Parkinson’s disease risk locus, identifying putative ectopic regulatory interactions in neurons that may mediate this disease association. This work expands understanding of inherited variation and provides a roadmap for the epigenomic dissection of causal regulatory variation in disease.

**Fig. 1: scATAC–seq identifies cell-type-specific chromatin accessibility in the adult brain.**

**Fig. 2: Subclustering identifies diverse biologically relevant neuronal cell types in the adult brain.**

**Fig. 3: Machine learning predicts functional polymorphisms in AD and PD.**

**Fig. 4: Vertical integration of multi-omic data and machine-learning nominate gene targets in AD and PD.**

**Fig. 5: Epigenetic deconvolution of the *MAPT* locus explains haplotype-associated transcriptional changes.**

Data availability

All data generated in this work are available through GEO accession no. GSE147672. To facilitate wide access to our data, we created a WashU Epigenome browser session (session ID: drS3o1n4kJ) for our scATAC–seq data in the following track formats: (1) broad cell types (Corces_scATAC_BroadCellTypes); (2) broad clusters (Corces_scATAC_BroadClusters); (3) neuron subclusters (Corces_scATAC_NeuronSubClusters); and (4) neuron subclustered cell types/LDSC groups (Corces_scATAC_NeuronSubCellTypes). These tracks are accessible via the following link: http://epigenomegateway.wustl.edu/legacy/?genome=hg38&session=drS3o1n4kJ.

Code availability

All custom code used in this work is available at the following GitHub repository: https://github.com/kundajelab/alzheimers_parkinsons.

Acknowledgements

This work was supported by National Institutes of Health (NIH) grant nos. NS062684, AG057707, AG053959, AG047366 (to T.J.M.), HG007735 (to H.Y.C.), HG009431 (to S.B.M./A.K.), AG066490 (to S.B.M.) and AG059918 (to M.R.C.). Additional support for patient sample collection was provided by NIH grant nos. AG005136 and AG019610. Sequencing data for this project were generated on an Illumina HiSeq 4000 System supported in part by NIH grant no. S10OD018220. Additional resources at the Stanford Center for Genomics and Personalized Medicine Sequencing Center were supported by NIH grant no. S10OD025212. M.R.C. is supported by the American Society of Hematology Scholar Award. A.S. is supported by the Stanford BioX Bowes fellowship. M.J.G. and T.E. are supported by a National Library of Medicine training grant no. 5T15LM007033-36. M.J.G. is additionally supported by a Stanford Graduate Fellowship. H.Y.C. is an Investigator of the Howard Hughes Medical Institute.

Author information

Authors and Affiliations

Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
M. Ryan Corces, Michael J. Gloudemans, Laure Frésard, Bryan H. Louie, Tiffany Eulalio, Boxiang Liu, Kathleen S. Montine, Stephen B. Montgomery & Thomas J. Montine
Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
M. Ryan Corces, Jeffrey M. Granja, Bryan H. Louie, Shadi Shams, S. Tansu Bagdatli, Maxwell R. Mumbach, William J. Greenleaf & Howard Y. Chang
Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
Anna Shcherbina, Michael J. Gloudemans & Tiffany Eulalio
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Anna Shcherbina, Soumya Kundu, Jeffrey M. Granja, Shadi Shams, S. Tansu Bagdatli, Maxwell R. Mumbach, William J. Greenleaf, Anshul Kundaje, Stephen B. Montgomery & Howard Y. Chang
Department of Computer Science, Stanford University, Stanford, CA, USA
Soumya Kundu & Anshul Kundaje
Program in Biophysics, Stanford University, Stanford, CA, USA
Jeffrey M. Granja
Department of Biology, Stanford University, Stanford, CA, USA
Boxiang Liu
Baidu Research, Sunnyvale, CA, USA
Boxiang Liu
Department of Applied Physics, Stanford University, Stanford, CA, USA
William J. Greenleaf
Chan Zuckerberg Biohub, San Francisco, CA, USA
William J. Greenleaf
Program in Epithelial Biology, Stanford University, Stanford, CA, USA
Howard Y. Chang
Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
Howard Y. Chang

Contributions

M.R.C., H.Y.C. and T.J.M. conceived of and designed the project. M.R.C. and T.J.M. compiled the figures and wrote the manuscript with help and input from all authors. A.S. and M.R.C. performed the bulk ATAC–seq data processing and analysis. M.R.C. performed all HiChIP data analysis with help from M.R.M. and J.M.G. J.M.G., M.R.C. and A.S. performed all scATAC–seq data processing and analysis with supervision from W.J.G., A.K., S.B.M. and H.Y.C. M.J.G. performed the GWAS locus curation, colocalization analysis and GTEx analysis. M.J.G., L.F. and B.L. performed all LD score regression analysis with supervision from S.B.M. S.K. and A.S. performed the machine-learning analysis with supervision from A.K. S.K. and T.E. performed the allelic imbalance analyses with supervision from A.K. and S.B.M. B.H.L., S.S. and M.R.C. performed all ATAC–seq, scATAC–seq and HiChIP data generation with help from S.T.B. and M.R.M. K.S.M. curated the frozen tissue specimens used in this work.

Corresponding authors

Correspondence to Howard Y. Chang or Thomas J. Montine.

Ethics declarations

Competing interests

H.Y.C. is a cofounder of Accent Therapeutics, Boundless Bio, and an advisor to 10x Genomics, Arsenal Bio and Spring Discovery. S.B.M. is on the scientific advisory board of MyOme. A.K. is a consultant for Biogen. A.S. is a consultant for MyoKardia. W.J.G. is a consultant for Guardant Health, 10x Genomics and Protillion Biosciences.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Region-centric scATAC-seq identifies cellular and regional heterogeneity in chromatin accessibility in adult brain.

a, b, UMAP dimensionality reduction (a) prior to and (b) after batch correction with Harmony of scATAC-seq data from 10 different samples. Each dot represents a single cell (N = 70,631). Dots are colored by the sample of origin. Color labels are shown in Extended Data Fig. 1b. c, The same UMAP dimensionality reduction shown in Extended Data Fig. 1b but each cell is colored by its gene activity score for the annotated lineage-defining gene. Gene activity scores were imputed using MAGIC. Grey represents the minimum gene activity score while purple represents the maximum gene activity score for the given gene. The minimum and maximum scores are shown in the bottom left of each panel. The gene of interest and the cell type that it identified are shown in the upper left of each panel. MSNs – medium spiny neurons. d, Heatmap of cell type-specific markers used to define the cell type corresponding to each cluster. Color represents the row-wise Z-score of chromatin accessibility in the vicinity of each gene for each cluster. e, Cluster residence heatmap showing the percent of each cluster that is composed of cells from each sample. Cell numbers were normalized across samples prior to calculating cluster residence percentages to account for differences in total pass filter cells per sample. f–h, UMAP dimensionality reduction as shown in Extended Data Fig. 1b but colored by (f) the gross brain region from which each cell was obtained, (g) the biological sex of the donor for each cell, or (h) the predicted cell class for each cell. i–k, Bar plot showing the number of cells identified in our scATAC-seq data from (i) each of the annotated cell classes, (j) each of the annotated donors/samples, or (k) each of the gross brain regions subdivided based on cell class. Color represents the predicted cell class as shown in the legend of Extended Data Fig. 1h. l, m, Bar plot showing the percentage of cells in our scATAC-seq data from (l) each of the gross brain regions subdivided based on cell class or (m) each of the annotated cell classes subdivided based on donor/sample of origin. Color represents (l) the predicted cell class as shown in Extended Data Fig. 1h or (m) the biological sample from which the cells were obtained.
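
The cluster residence calculation described for panel e can be illustrated with a minimal sketch. The caption does not specify how cell numbers were normalized, so this example assumes that each sample's counts are first converted to fractions of that sample's total cells, and that those fractions are then rescaled to sum to 100% within each cluster. The function name `cluster_residence` and the toy counts are hypothetical and are not taken from the authors' code.

```python
def cluster_residence(counts):
    """Percent of each cluster attributable to each sample.

    counts: dict mapping cluster -> dict mapping sample -> number of cells.
    Assumption: counts are normalized by each sample's total cell number
    before percentages are computed, so deeply sampled donors do not dominate.
    """
    samples = sorted({s for per_sample in counts.values() for s in per_sample})
    # Total pass-filter cells per sample across all clusters.
    totals = {s: sum(per.get(s, 0) for per in counts.values()) for s in samples}

    residence = {}
    for cluster, per_sample in counts.items():
        # Fraction of each sample's cells that fall in this cluster.
        fractions = {
            s: (per_sample.get(s, 0) / totals[s]) if totals[s] else 0.0
            for s in samples
        }
        denom = sum(fractions.values())
        # Rescale so the sample contributions within a cluster sum to 100%.
        residence[cluster] = {
            s: (100.0 * f / denom) if denom else 0.0 for s, f in fractions.items()
        }
    return residence


# Toy example: sample A has ten times more cells than sample B,
# but both place 10% of their cells in cluster C1.
toy = {"C1": {"A": 100, "B": 10}, "C2": {"A": 900, "B": 90}}
result = cluster_residence(toy)
print(result["C1"])
```

Under this assumed normalization, both samples contribute equally to C1 even though sample A supplies ten times as many raw cells, which is the effect the caption attributes to normalizing for differences in total pass filter cells per sample.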

Extended Data Fig. 2 Cellular heterogeneity in brain tissue necessitates single-cell approaches to capture biological complexity.

a, b, Bar plot of the log2(Fold Change) in the percent of peaks mapping to various genomic annotations comparing peaks from (a) the scATAC-seq peak set that are not overlapped by a peak from the bulk ATAC-seq peak set to peaks that are overlapped by a peak from the bulk ATAC-seq peak set or (b) the scATAC-seq peak set that were identified as cell type-unique through feature binarization to all peaks from the scATAC-seq peak set. c, Sequencing tracks of lineage-defining factors shown across all 24 scATAC-seq clusters (except Cluster 18 – putative doublets). From left to right, NEFL (neurons; chr8:24933431-24966791), AIF1 (aka IBA1, microglia; chr6:31607841-31617906), MOG (oligodendrocytes; chr6:29652183-29699713), GJB6 (astrocytes; chr13:20200243-20239571), and PDGFRA (OPCs; chr4:54209541-54303643). d, Box-and-whisker plots showing the distribution of the number of single cells from our scATAC-seq data showing accessibility within (left) each peak from the set of peaks from the scATAC-seq peak set that overlap a peak from the bulk ATAC-seq peak set (N = 120,941 peaks) and (right) each peak from the set of peaks from the scATAC-seq peak set that do not overlap a peak from the bulk ATAC-seq peak set (N = 238,081 peaks). The lower and upper ends of the box represent the 25th and 75th percentiles and the internal line represents the median. The whiskers represent 1.5 multiplied by the inter-quartile range. P-value determined by Kolmogorov–Smirnov test. e, Dot plot showing the inter-region Pearson correlation of pseudo-bulk replicates composed of all cells from either SMTG, PARL, or MDFG within each of the clusters shown. The clusters shown were selected based on biological relevance (that is, clusters annotated as “substantia nigra astrocytes” should not be compared across isocortical regions) and on cluster size (that is, clusters with small numbers of isocortical cells would not provide robust comparisons).
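
The quantity plotted in panels a and b, the log2 fold change in the percent of peaks mapping to each genomic annotation, can be sketched as follows. This is an illustration only: the function names, the annotation labels, and the choice to skip annotations missing from either peak set are assumptions made here, not details taken from the caption or from the authors' repository.

```python
import math
from collections import Counter


def percent_by_annotation(labels):
    """Return the percent of peaks carrying each annotation label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def annotation_log2_fold_change(test_labels, reference_labels):
    """log2(percent in test set / percent in reference set) per annotation.

    Annotations absent from either set are skipped (an assumption made here
    to avoid dividing by zero or taking the log of zero).
    """
    test_pct = percent_by_annotation(test_labels)
    ref_pct = percent_by_annotation(reference_labels)
    shared = sorted(set(test_pct) & set(ref_pct))
    return {name: math.log2(test_pct[name] / ref_pct[name]) for name in shared}


# Toy example: one annotation label per peak (labels are hypothetical).
not_in_bulk = ["promoter", "distal", "distal", "distal"]
in_bulk = ["promoter", "promoter", "distal", "distal"]
print(annotation_log2_fold_change(not_in_bulk, in_bulk))
```

In the toy data, promoters make up 25% of the first peak set and 50% of the second, giving a log2 fold change of -1, while distal elements are enriched in the first set.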

Extended Data Fig. 3 Neuronal sub-clustering identifies diverse biologically relevant populations of neurons.

a–d, UMAP dimensionality reduction of neuronal cells (identified as Clusters 1-7, 11, and 12 from Fig. 1e) (a) prior to or (b–d) after batch correction with Harmony of scATAC-seq data from 10 different samples. Each dot represents a single cell (N = 21,116). Dots are colored by (a, b) the sample of origin, (c) the neuronal sub-cluster (repeated from Fig. 2a), or (d) its gene activity score for the annotated lineage-defining gene. In (d), gene activity scores were imputed using MAGIC. Grey represents the minimum gene activity score while purple represents the maximum gene activity score for the given gene. The minimum and maximum scores are shown in the bottom left of each panel. The gene of interest is shown in the upper right of each panel. e, Heatmap of gene activity scores for all neuronal markers used in identifying relevant cell types for neuronal sub-clusters. Color represents the column-wise z-scores for each gene across all neuronal sub-clusters with values thresholded at -2 and +2. Neuronal cluster “major annotation” is shown by color along with a cluster description to the right of the plot. f–h, The same UMAP dimensionality reduction shown in Extended Data Fig. 3c but cells are colored by (f) the major cell class annotation, (g) a more granular neuronal sub-annotation, or (h) the neuronal cell class annotation. Assignment was made based on gene activity scores of lineage-defining genes. The cell class annotation shown in (h) was used to perform LD score regression analysis.
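The heatmap scaling described for panel e (column-wise z-scores thresholded at -2 and +2) can be sketched as follows. This is an illustration only: the function name `columnwise_z` is hypothetical, rows are assumed to be neuronal sub-clusters and columns genes, and the population standard deviation is an assumption because the legend does not say which estimator was used.

```python
from statistics import mean, pstdev


def columnwise_z(matrix, low=-2.0, high=2.0):
    """Z-score each column of a row-major matrix, then clip to [low, high]."""
    scaled_columns = []
    for column in zip(*matrix):
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            # A constant column carries no contrast; map it to zero.
            scaled_columns.append([0.0] * len(column))
            continue
        scaled_columns.append(
            [min(high, max(low, (value - mu) / sd)) for value in column]
        )
    # Transpose back so rows are sub-clusters again.
    return [list(row) for row in zip(*scaled_columns)]
```

Clipping keeps a single extreme gene activity score from compressing the color range for the remaining sub-clusters.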

Extended Data Fig. 4 Sub-clustering of cells from the substantia nigra identifies TH-positive dopaminergic neurons.

a–d, UMAP dimensionality reduction after iterative LSI of scATAC-seq data from substantia nigra cells from 2 different samples. Each dot represents a single cell (N = 11,199). Dots are colored by (a) their corresponding substantia nigra sub-cluster, (b) the sample of origin, or (c, d) its gene activity score for (c) the tyrosine hydroxylase (TH) gene, a specific marker of dopaminergic neurons or (d) other lineage-defining genes. In (c, d), gene activity scores were imputed using MAGIC. Grey represents the minimum gene activity score while purple represents the maximum gene activity score. The minimum and maximum scores are shown in the bottom left of each panel. In (a–c), the predicted cluster cell type identities are overlaid on the UMAPs.

Extended Data Fig. 5 HiChIP and co-accessibility predict enhancer-promoter interactions in primary adult human brain.

a, Heatmap representation of HiChIP interaction signal at 100-kb, 25-kb, and 5-kb resolution at the OLIG2 locus. Sample shown represents the substantia nigra from donor 03-41. Signal is normalized to the square root of the coverage. The maximum value of the color range and the coordinates along chromosome 21 are shown below each panel. b, Bar plots showing the total number of paired-end reads sequenced for each HiChIP library generated in this study. Color represents the brain region from which the data was generated. c, Bar plots showing the number of valid interaction pairs identified in HiChIP data from all samples profiled in this study. Color represents the type of interaction identified. d, Bar plot showing the overlap of FitHiChIP loop calls from the 4 gross brain regions profiled. Color indicates whether the loop was identified in a single region (unique) or more than one region (shared). e, Bar plot showing the classification of FitHiChIP loop calls based on whether the loop call contained an ATAC-seq peak (from either the bulk ATAC-seq peak set or the scATAC-seq peak set) or TSS in one, both, or no anchor. f, Bar plots showing the number of Cicero-predicted co-accessibility-based peak links that are observed in HiChIP (left) or the number of HiChIP-based FitHiChIP loop calls that are predicted as peak links by Cicero (right). g, Bar plot showing the number of cell type-specific peaks (defined as peaks identified through feature binarization; N = 221,062) or non-cell type-specific peaks (defined as scATAC-seq peaks that were not identified through feature binarization; N = 137,960) that overlap or do not overlap a Cicero-predicted co-accessibility linkage. Significance determined by Kolmogorov–Smirnov test.
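The anchor-based classification described for panel e can be sketched as a simple interval-overlap check. This is a toy illustration rather than the pipeline used in the study: `overlaps` and `classify_loop` are hypothetical names, intervals are assumed to be half-open `(chrom, start, end)` tuples, and the feature list stands in for the combined set of ATAC-seq peaks and TSSs.

```python
def overlaps(anchor, feature):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    (chrom_a, start_a, end_a), (chrom_f, start_f, end_f) = anchor, feature
    return chrom_a == chrom_f and start_a < end_f and start_f < end_a


def classify_loop(anchor1, anchor2, features):
    """Label a loop by how many of its anchors contain a peak or TSS."""
    hits = sum(
        any(overlaps(anchor, feature) for feature in features)
        for anchor in (anchor1, anchor2)
    )
    return {0: "no anchor", 1: "one anchor", 2: "both anchors"}[hits]
```

For genome-scale loop sets, an interval tree or a sorted sweep would replace the nested scan used here for clarity.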

Extended Data Fig. 6 A multi-omic tiered approach leverages machine learning to predict functional noncoding SNPs in AD and PD.

a, Flow chart of the analytical framework used to prioritize noncoding SNPs and predict functionality. The highest confidence SNPs (Tier 1) are supported by either machine learning predictions, allelic imbalance, or both. Moderate confidence SNPs (Tier 2) are supported by the presence of the SNP within a peak and a HiChIP loop or co-accessibility peak link that connects the SNP to a gene. Lower confidence SNPs (Tier 3) are only supported by the presence of the SNP in a peak. b, c, Box plot showing the area under (b) the precision-recall curve or (c) the receiver-operating characteristics curve for the gkm-SVM machine learning classifier. Performance for each of the 24 broad clusters is shown with dots representing outliers. The lower and upper ends of the box represent the 25th and 75th percentiles. The whiskers represent 1.5 multiplied by the inter-quartile range. The center line represents the median. d, GkmExplain importance scores shown across all 10 folds for each base across a 100-bp window surrounding rs636317 for the effect (left) and non-effect (right) bases. e, Dot plots showing comparison of the GkmExplain score, ISM score, and deltaSVM score. Each dot represents an individual SNP test in a given fold. Dot color represents the GWAS locus number. The only off-diagonal dots (circled) correspond to repetitive regions within the MAPT locus where the deltaSVM score appears to be particularly sensitive. f, Dot plot showing allelic imbalance assessed by RASQUAL across all bulk ATAC-seq data used in this study from a region-specific analysis. Significance is assessed by RASQUAL (see Methods). Dot color indicates the brain region found to have significant allelic imbalance. Grey dots do not pass significance testing based on an empirical distribution of permuted null q-values and a 10% false discovery rate. A RASQUAL effect size greater than 0.5 indicates that the alternate allele is enriched while less than 0.5 indicates that the reference allele is enriched. The plot is divided to show SNPs within the MAPT and DNAH17 loci (bottom) and SNPs in all other loci (top). SNPs mentioned in downstream analyses are highlighted by red text.
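The tier logic summarized for panel a can be written as a short decision function. This is a simplified sketch based only on the legend text: the name `assign_tier` and its boolean arguments are hypothetical, tiers are assumed to be checked from highest to lowest confidence, and any prerequisites that the full framework places on Tier 1 SNPs are not modeled.

```python
def assign_tier(in_peak, has_gene_link, ml_support, allelic_imbalance):
    """Return 1, 2 or 3 for the confidence tier, or None if nothing supports the SNP.

    in_peak           -- SNP lies within an accessible chromatin peak
    has_gene_link     -- a HiChIP loop or co-accessibility link connects it to a gene
    ml_support        -- the machine-learning classifier predicts an allelic effect
    allelic_imbalance -- significant allelic imbalance was observed
    """
    if ml_support or allelic_imbalance:
        return 1  # highest confidence: ML prediction, allelic imbalance, or both
    if in_peak and has_gene_link:
        return 2  # moderate confidence: in a peak and linked to a gene
    if in_peak:
        return 3  # lower confidence: only the peak overlap
    return None
```

For example, `assign_tier(True, True, False, False)` gives Tier 2, while adding either form of functional support moves the same SNP to Tier 1.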

Extended Data Fig. 7 Multi-omic characterization of well-studied AD-related GWAS loci pinpoints putative functional noncoding SNPs.

a, c, Normalized scATAC-seq-derived pseudo-bulk tracks, H3K27ac HiChIP loop calls, co-accessibility correlations, and publicly available H3K4me3 PLAC-seq loop calls (Nott et al. 2019) in (a) the BIN1 gene locus (chr2:127045000-127182000) and (c) the MS4A gene locus (chr11:60023000-60554000). scATAC-seq tracks represent the aggregate signal of all cells from the given cell type and have been normalized to the total number of reads in TSS regions, enabling direct comparison of tracks across cell types. For HiChIP, each line represents a FitHiChIP loop call connecting the points on each end. Red lines contain one anchor overlapping the SNP of interest while grey lines do not. For co-accessibility, only interactions involving the accessible chromatin region of interest are shown. For PLAC-seq, MAPS loop calls from microglia (blue), neurons (orange), and oligodendrocytes (purple) are shown. b, d, GkmExplain importance scores for each base in the 50-bp region surrounding (b) rs13025717 or (d) rs636317 for the effect and non-effect alleles from the gkm-SVM model for microglia (Cluster 24). The predicted motif affected by the SNP is shown at the bottom and the SNP of interest is highlighted in blue. e, Dot plot showing allelic imbalance at rs636317. Significance of allelic imbalance was determined by RASQUAL. The bulk ATAC-seq counts determined by WASP and ASEReadCounter for the reference/non-effect (A) allele and variant/effect (T) allele are plotted. Each dot represents an individual bulk ATAC-seq sample (N = 140) colored by the brain region from which the sample was collected. Samples where fewer than 3 reads were present to support both the reference and variant allele (that is, presumed homozygotes or samples with insufficient sequencing depth) are shown in grey. The blue line represents a linear regression of the non-grey points and the grey box represents the 95% confidence interval of that regression.
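The sample filtering and regression described for panel e can be sketched as follows. This is an illustration under stated assumptions: `split_informative` and `fit_line` are hypothetical names, a sample is treated as informative only when each allele has at least 3 supporting reads, reference counts are placed on the x axis and variant counts on the y axis, and the line is fitted by ordinary least squares without the confidence interval.

```python
def split_informative(samples, min_reads=3):
    """Separate (ref_count, alt_count) pairs into informative and greyed-out samples."""
    informative, greyed = [], []
    for ref_count, alt_count in samples:
        if ref_count >= min_reads and alt_count >= min_reads:
            informative.append((ref_count, alt_count))
        else:
            # Presumed homozygote or too shallow to call imbalance.
            greyed.append((ref_count, alt_count))
    return informative, greyed


def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Under these axis assumptions, a slope above 1 would mean that the variant allele tends to collect more reads than the reference allele in heterozygous samples.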

Extended Data Fig. 8 Multi-omic characterization of noncoding SNPs identifies novel genes implicated in PD.

a, c, Normalized scATAC-seq-derived pseudo-bulk tracks, H3K27ac HiChIP loop calls, co-accessibility correlations, and publicly available H3K4me3 PLAC-seq loop calls (Nott et al. 2019) in (a) the IP6K2 gene locus (chr3:48671000-49205000) or (c) the TMEM163 gene locus (chr2:134429000-134905000). scATAC-seq tracks represent the aggregate signal of all cells from the given cell type and have been normalized to the total number of reads in TSS regions, enabling direct comparison of tracks across cell types. For HiChIP, each line represents a FitHiChIP loop call connecting the points on each end. Red lines contain one anchor overlapping the SNP of interest while grey lines do not. For co-accessibility, only interactions involving the accessible chromatin region of interest are shown. For PLAC-seq, MAPS loop calls from microglia (blue), neurons (orange), and oligodendrocytes (purple) are shown. b, d, GkmExplain importance scores for each base in the 50-bp region surrounding (b) rs6781790 or (d) rs7599054 for the effect and non-effect alleles from the gkm-SVM model for (b) astrocytes (Cluster 15) or (d) microglia (Cluster 24). The predicted motif affected by the SNP is shown at the bottom and the SNP of interest is highlighted in blue. e, Dot plot comparing the –log10(p-value) from 23andMe PD GWAS data with the –log10(p-value) from GTEx Caudate eQTL data of SNPs in the TMEM163 locus. Each dot represents an individual SNP. Dot color represents the r² value of LD with the lead SNP (rs7599054 – purple diamond) within a European reference population. f, g, Dot plots showing the genomic coordinates of each SNP and the –log10(p-value) from (f) 23andMe PD GWAS data or (g) GTEx Caudate eQTL data. Dots are colored as in Extended Data Fig. 8e. In (e–g), p-values are based on genome-wide chi-squared statistics from the relevant GWAS and eQTL studies.

Extended Data Fig. 9 Epigenomic dissection of the MAPT locus.

a, Flowchart illustrating the analytical scheme used to identify bins with significant allelic imbalance across the H1 and H2 MAPT haplotypes. b, Heatmaps showing chromatin accessibility in 500-bp bins identified as having significantly different accessibility across MAPT haplotypes. Regions are shown for homozygous samples without allelic read splitting (left) and for heterozygous samples after allelic read splitting (right). Bin start coordinates are shown to the right. c, Box and whiskers plots for multiple regions which show differential chromatin accessibility across the H1 and H2 MAPT haplotypes. Each dot represents a single homozygous H1 (N = 91) or homozygous H2 (N = 12) sample. Heterozygotes are not shown. The lower and upper ends of the box represent the 25th and 75th percentiles. The whiskers represent 1.5 multiplied by the inter-quartile range. The center line represents the median. d, e, Gene expression of (d) the KANSL1-AS1 gene or (e) the MAPK8IP1P2 gene shown as a box plot from GTEx cortex brain samples subdivided based on MAPT haplotype. The lower and upper ends of the box represent the 25th and 75th percentiles. The whiskers represent 1.5 multiplied by the inter-quartile range. The center line represents the median. ***p < 10^-5 based on Wilcoxon rank sum test. N = 117 H1/H1, 78 H1/H2, and 10 H2/H2. f, Sequencing tracks from pseudo-bulk data derived from predicted cell types in scATAC-seq data. This region represents a zoomed-in view of the predicted distal regulatory region (chr17:45216500-45324000) that interacts with the MAPT promoter in the H1 haplotype. Putative neuron-specific regulatory elements are highlighted in blue. g, Box plots showing differential HiChIP interaction signal occurring between regions within the MAPT inversion and regions outside the inversion (“left” or “right”). The schematic at the top explains the analysis performed. The box plots show normalized HiChIP interaction counts for the H1 (N = 6) and H2 (N = 6) haplotypes for upstream/“left” interactions and downstream/“right” interactions. P-value determined by paired two-sided t-test.

Supplementary information

Supplementary Information

Supplementary Figs. 1–6, Notes 1–7 and Methods

Reporting Summary

Supplementary Tables 1 and 2

Donor information, sequencing statistics, GWAS SNP metadata and multi-omic characterizations

Supplementary Data 1

Final merged peak set derived from all bulk ATAC–seq data.

Supplementary Data 2

Quality control information for all individual cells profiled by scATAC–seq and the cluster residence information for all clusters and samples.

Supplementary Data 3

Final merged peak set derived from all scATAC–seq data.

Supplementary Data 4

Results of feature binarization from scATAC–seq data showing cell-type-specific peaks.

Supplementary Data 5

CIBERSORT signature matrices for the cell group-specific and cluster-specific classifiers.

Supplementary Data 6

Results of differential accessibility comparisons between the substantia nigra and isocortex for astrocytes, OPCs, oligodendrocytes, and microglia.

Supplementary Data 7

Neuronal cluster definitions, and cell class-specific peaks, TF motifs, and genes.

Supplementary Data 8

Results of all LD score regression analyses across all conditions and cell types.

Supplementary Data 9

All FitHiChIP loop calls overlapping a SNP on at least one anchor.

Supplementary Data 10

Results of RASQUAL allelic imbalance analysis and allele counts from WASP/ASEReadCounter for all tested SNPs.

About this article

Cite this article

Corces, M.R., Shcherbina, A., Kundu, S. et al. Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer’s and Parkinson’s diseases. Nat Genet 52, 1158–1168 (2020). https://doi.org/10.1038/s41588-020-00721-x

Received: 27 December 2019
Accepted: 18 September 2020
Published: 26 October 2020
Issue Date: November 2020
DOI: https://doi.org/10.1038/s41588-020-00721-x
