Abstract
Human-specific genomic changes contribute to the unique functionalities of the human brain (refs. 1–5). The cellular heterogeneity of the human brain (refs. 6, 7) and the complex regulation of gene expression highlight the need to characterize human-specific molecular features at cellular resolution. Here we analysed single-nucleus RNA-sequencing and single-nucleus assay for transposase-accessible chromatin with sequencing datasets for human, chimpanzee and rhesus macaque brain tissue from posterior cingulate cortex. We show a human-specific increase of oligodendrocyte progenitor cells and a decrease of mature oligodendrocytes across cortical tissues. Human-specific regulatory changes were accelerated in oligodendrocyte progenitor cells, and we highlight key biological pathways that may be associated with the proportional changes. We also identify human-specific regulatory changes in neuronal subtypes, which reveal human-specific upregulation of FOXP2 in only two of the neuronal subtypes. We additionally identify hundreds of new human accelerated genomic regions associated with human-specific chromatin accessibility changes. Our data also reveal that FOS::JUN and FOX motifs are enriched in the human-specifically accessible chromatin regions of excitatory neuronal subtypes. Together, our results reveal several new mechanisms underlying the evolutionary innovation of the human brain at cell-type resolution.
Data availability
Raw and processed data are available at National Center for Biotechnology Information GEO under the accession number GSE192774. Processed data associated with ref. 7 were accessed from https://assets.nemoarchive.org/dat-ek5dbmu. Other datasets were obtained using their GEO accession numbers (GSE127774, for ref. 11; GSE107638, GSE123936 and GSE139914, for ref. 18; GSE18653 for ref. 19).
Code availability
All analysis scripts are available at https://github.com/konopkalab/Comparative_snATAC_snRNA.
References
King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
Konopka, G. et al. Human-specific transcriptional networks in the brain. Neuron 75, 601–617 (2012).
Liu, X. et al. Extension of cortical synaptic development distinguishes humans from chimpanzees and macaques. Genome Res. 22, 611–622 (2012).
Sousa, A. M. M. et al. Molecular and cellular reorganization of neural circuits in the human lineage. Science 358, 1027–1032 (2017).
Zhu, Y. et al. Spatiotemporal transcriptomic divergence across human and macaque brain development. Science https://doi.org/10.1126/science.aat8077 (2018).
Hodge, R. D. et al. Conserved cell types with divergent features in human versus mouse cortex. Nature 573, 61–68 (2019).
Bakken, T. E. et al. Comparative cellular analysis of motor cortex in human, marmoset and mouse. Nature 598, 111–119 (2021).
Miller, D. J. et al. Prolonged myelination in human neocortical evolution. Proc. Natl Acad. Sci. USA 109, 16480–16485 (2012).
Jakel, S. et al. Altered human oligodendrocyte heterogeneity in multiple sclerosis. Nature 566, 543–547 (2019).
Jeong, H. et al. Evolution of DNA methylation in the human brain. Nat. Commun. 12, 2021 (2021).
Khrameeva, E. et al. Single-cell-resolution transcriptome map of human, chimpanzee, bonobo, and macaque brains. Genome Res. 30, 776–789 (2020).
Kozlenkov, A. et al. Evolution of regulatory signatures in primate cortical neurons at cell-type resolution. Proc. Natl Acad. Sci. USA 117, 28422–28432 (2020).
Krienen, F. M. et al. Innovations present in the primate interneuron repertoire. Nature 586, 262–269 (2020).
Ma, S. et al. Molecular and cellular evolution of the primate dorsolateral prefrontal cortex. Science https://doi.org/10.1126/science.abo7257 (2022).
Mendizabal, I. et al. Comparative methylome analyses identify epigenetic regulatory loci of human brain evolution. Mol. Biol. Evol. 33, 2947–2959 (2016).
Li, W., Mai, X. & Liu, C. The default mode network and social understanding of others: what do brain connectivity studies tell us. Front. Hum. Neurosci. 8, 74 (2014).
Wang, D. et al. Altered functional connectivity of the cingulate subregions in schizophrenia. Transl. Psychiatry 5, e575 (2015).
Berto, S. et al. Accelerated evolution of oligodendrocytes in the human brain. Proc. Natl Acad. Sci. USA 116, 24334–24342 (2019).
Franjic, D. et al. Transcriptomic taxonomy and neurogenic trajectories of adult human, macaque, and pig hippocampal and entorhinal cells. Neuron 110, 452–469 (2022).
Brown, T. L. & Verden, D. R. Cytoskeletal regulation of oligodendrocyte differentiation and myelination. J. Neurosci. 37, 7797–7799 (2017).
Caglayan, E., Liu, Y. & Konopka, G. Neuronal ambient RNA contamination causes misinterpreted and masked cell types in brain single-nuclei datasets. Neuron https://doi.org/10.1016/j.neuron.2022.09.010 (2022).
Lake, B. B. et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70–80 (2018).
Velmeshev, D. et al. Single-cell genomics identifies cell type-specific molecular changes in autism. Science 364, 685–689 (2019).
Fumagalli, M. et al. The ubiquitin ligase Mdm2 controls oligodendrocyte maturation by intertwining mTOR with G protein-coupled receptor kinase 2 in the regulation of GPR17 receptor desensitization. Glia 63, 2327–2339 (2015).
den Hoed, J., Devaraju, K. & Fisher, S. E. Molecular networks of the FOXP2 transcription factor in the brain. EMBO Rep. 22, e52803 (2021).
Konopka, G. et al. Human-specific transcriptional regulation of CNS development genes by FOXP2. Nature 462, 213–217 (2009).
Doan, R. N. et al. Mutations in human accelerated regions disrupt cognition and social behavior. Cell 167, 341–354 (2016).
Franchini, L. F. & Pollard, K. S. Human evolution: the non-coding revolution. BMC Biol. 15, 89 (2017).
Capra, J. A., Erwin, G. D., McKinsey, G., Rubenstein, J. L. & Pollard, K. S. Many human accelerated regions are developmental enhancers. Philos. Trans. R. Soc. Lond. B 368, 20130025 (2013).
Girskis, K. M. et al. Rewiring of human neurodevelopmental gene regulatory programs by human accelerated regions. Neuron https://doi.org/10.1016/j.neuron.2021.08.005 (2021).
Wagnon, J. L. et al. CELF4 regulates translation and local abundance of a vast set of mRNAs, including genes associated with regulation of synaptic function. PLoS Genet. 8, e1003067 (2012).
Lundgaard, I. et al. Neuregulin and BDNF induce a switch to NMDA receptor-dependent myelination by oligodendrocytes. PLoS Biol. 11, e1001743 (2013).
Prufer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).
Arora, V. et al. Increased Grik4 gene dosage causes imbalanced circuit output and human disease-related behaviors. Cell Rep. 23, 3827–3838 (2018).
Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010).
Yap, E. L. & Greenberg, M. E. Activity-regulated transcription: bridging the gap between neural activity and behavior. Neuron 100, 330–348 (2018).
Berto, S. et al. Gene-expression correlates of the oscillatory signatures supporting human episodic memory encoding. Nat. Neurosci. 24, 554–564 (2021).
Ducker, G. S. & Rabinowitz, J. D. One-carbon metabolism in health and disease. Cell Metab. 25, 27–42 (2017).
Yeung, M. S. et al. Dynamics of oligodendrocyte generation and myelination in the human brain. Cell 159, 766–774 (2014).
Marques, S. et al. Oligodendrocyte heterogeneity in the mouse juvenile and adult central nervous system. Science 352, 1326–1329 (2016).
Buchanan, J. et al. Oligodendrocyte precursor cells ingest axons in the mouse neocortex. Proc. Natl Acad. Sci. USA 119, e2202580119 (2022).
Jorstad, N. L. et al. Comparative transcriptomics reveals human-specific cortical features. Preprint at bioRxiv https://doi.org/10.1101/2022.09.19.508480 (2022).
Berg, M. et al. FastCAR: Fast Correction for Ambient RNA to facilitate differential gene expression analysis in single-cell RNA-sequencing datasets. Preprint at bioRxiv https://doi.org/10.1101/2022.07.19.500594 (2022).
McLean, C. Y. et al. Human-specific loss of regulatory DNA and the evolution of human-specific traits. Nature 471, 216–219 (2011).
Hickey, S. L., Berto, S. & Konopka, G. Chromatin decondensation by FOXP2 promotes human neuron maturation and expression of neurodevelopmental disease genes. Cell Rep. 27, 1699–1711 (2019).
Yang, C. C. et al. Discovering chromatin motifs using FAIRE sequencing and the human diploid genome. BMC Genomics 14, 310 (2013).
Ataman, B. et al. Evolution of osteocrin as an activity-regulated factor in the primate brain. Nature 539, 242–247 (2016).
Pruunsild, P., Bengtson, C. P. & Bading, H. Networks of cultured iPSC-derived neurons reveal the human synaptic activity-regulated adaptive gene program. Cell Rep. 18, 122–135 (2017).
Qiu, J. et al. Evidence for evolutionary divergence of activity-dependent gene expression in developing neurons. Elife https://doi.org/10.7554/eLife.20337 (2016).
Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018).
Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics 30, 1006–1007 (2014).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
Fleming, S. J., Marioni, J. C. & Babadi, M. Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender. Preprint at bioRxiv https://doi.org/10.1101/791699v2 (2019).
Howe, K. L. et al. Ensembl 2021. Nucleic Acids Res. 49, D884–D891 (2021).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Picard Toolkit (Broad Institute, 2019); http://broadinstitute.github.io/picard/.
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Kuhn, R. M., Haussler, D. & Kent, W. J. The UCSC genome browser and associated tools. Brief. Bioinform. 14, 144–161 (2013).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Lareau, C. A., Ma, S., Duarte, F. M. & Buenrostro, J. D. Inference and effects of barcode multiplets in droplet-based single-cell assays. Nat. Commun. 11, 866 (2020).
Stuart, T., Srivastava, A., Madad, S., Lareau, C. A. & Satija, R. Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341 (2021).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871 (2018).
Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Statist. Softw. 67, 1–48 (2015).
Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).
Chen, Y., Lun, A. T. & Smyth, G. K. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Res 5, 1438 (2016).
McCarthy, D. J., Campbell, K. R., Lun, A. T. & Wills, Q. F. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 33, 1179–1186 (2017).
Gontarz, P. et al. Comparison of differential accessibility analysis strategies for ATAC-seq data. Sci. Rep. 10, 10150 (2020).
Wang, X., Park, J., Susztak, K., Zhang, N. R. & Li, M. Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat. Commun. 10, 380 (2019).
Mendizabal, I. et al. Cell type-specific epigenetic links to schizophrenia risk in the brain. Genome Biol. 20, 135 (2019).
van Arensbergen, J., van Steensel, B. & Bussemaker, H. J. In search of the determinants of enhancer-promoter interaction specificity. Trends Cell Biol. 24, 695–702 (2014).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
Cavalcante, R. G. & Sartor, M. A. annotatr: genomic regions in context. Bioinformatics 33, 2381–2383 (2017).
Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 46, D260–D266 (2018).
Schep, A. motifmatchr: Fast motif matching in R. R version 1.4.0. (2018).
Kolde, R. pheatmap: Pretty heatmaps. R version 4.1.1. https://cran.r-project.org/web/packages/pheatmap/index.html (2012).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Gittelman, R. M. et al. Comprehensive identification and analysis of human accelerated regulatory DNA. Genome Res. 25, 1245–1255 (2015).
Blanchette, M. et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14, 708–715 (2004).
Hubisz, M. J., Pollard, K. S. & Siepel, A. PHAST and RPHAST: phylogenetic analysis with space/time models. Brief. Bioinform. 12, 41–51 (2011).
Mafessoni, F. et al. A high-coverage Neandertal genome from Chagyrskaya Cave. Proc. Natl Acad. Sci. USA 117, 15132–15136 (2020).
Prufer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358, 655–658 (2017).
The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Ghandi, M. et al. gkmSVM: an R package for gapped-kmer SVM. Bioinformatics 32, 2205–2207 (2016).
Acknowledgements
We thank C. Tamminga and K. Gleason at UT Southwestern for postmortem human tissues. We also thank H. Stroud and M. Chahrour for their critical comments on the manuscript. We also thank K. Luby-Phelps and S. Yamazaki for their help with imaging. G.K. is a Jon Heighten Scholar in Autism Research and Townsend Distinguished Chair in Research on Autism Spectrum Disorders at UT Southwestern. E.C. is a Neural Scientist Training Program Fellow in the Peter O’Donnell Brain Institute at UT Southwestern. This work was partially supported by the James S. McDonnell Foundation 21st Century Science Initiative in Understanding Human Cognition Scholar Award to G.K.; NHGRI (HG011641) to G.K., S.V.Y. and C.C.S.; National Science Foundation (SBE-131719 and EF-2021635) to S.V.Y. and C.C.S.; the NIMH (MH103517) to T.M.P., G.K. and S.V.Y.; NIH grants T32DA007290 and T32HL139438 to F.A., American Heart Association Postdoctoral Fellowship (915654) to Y.L. and NIMH grant MH126481 to R.M.V. and G.K. The National Chimpanzee Brain Resource was supported by NINDS (R24NS092988). Macaque tissue collection and archiving was supported by funding from the NIH National Center for Research Resources (P51RR165; superseded by the Office of Research Infrastructure Programs (OD P51OD11132)) to the Yerkes National Primate Research Center. The Zeiss LSM 880 with Airyscan was supported by NIH grant 1S10OD021684-01 to K. Luby-Phelps.
Author information
Contributions
S.V.Y., G.K. and E.C. designed the study. T.M.P. and C.C.S. carried out tissue dissections. S.V.Y., T.M.P., C.C.S. and G.K. obtained funding. F.A. collected snRNA-seq and snATAC-seq data. Y.L., R.M.V. and E.O. carried out smFISH experiments. Y.L. carried out image quantification. E.C. carried out all analyses. T.M.P. edited the manuscript. E.C., S.V.Y. and G.K. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Extended data figures and tables
Extended Data Fig. 1 Annotation and quality control of single-nuclei RNA-seq and single-nuclei ATAC-seq.
(a) Distribution of sex and humanized age of samples. (b) Broadly annotated UMAP of nuclei per species. (c) Total nuclei number per sample after filtering. (d) Normalized, log (ln) transformed expression values of major cell type markers. (e) Violin plots of number of detected UMIs (log10 transformed) per major cell type. (f) Percentage of cells contributed per individual per species per major cell type. (g) Broad annotation of snATAC-seq data per species. (h) Total nuclei number per sample after quality control. (i) Nucleosome band pattern per sample; each line represents one sample. First, second and third peaks represent nucleosome free, mononucleosome and dinucleosome fractions, respectively. (j) Percentage of cells contributed per individual per species per major cell type. (k) Clarity of annotation transfer from snRNA-seq to snATAC-seq as displayed by Jaccard similarity index, which is the number of nuclei with the same final annotation and prediction (intersection) divided by the total number of nuclei with a given annotation or prediction (union). y-axis represents final annotation; x-axis represents the prediction which was assigned by label transfer per nucleus. Higher values indicate more similarity between final annotation and initial prediction. (l) Fraction of reads in peaks per sample (N = 9280, 5383, 5657, 4655, 5941, 4381, 5691, 4690, 3321, 6426, 5984, 6793 left to right). Boxplots represent median and interquartile range.
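The Jaccard similarity index described in panel (k) can be illustrated with a minimal Python sketch. The function name `jaccard_matrix` and the label vectors are hypothetical toy values chosen for illustration; this is not the authors' implementation, only a direct reading of the definition (intersection divided by union) given in the legend.

```python
from collections import Counter


def jaccard_matrix(final_labels, predicted_labels):
    """Jaccard index for every (final annotation, prediction) pair.

    Both inputs hold one label per nucleus, in the same order.
    """
    if len(final_labels) != len(predicted_labels):
        raise ValueError("one final label and one prediction are required per nucleus")
    n_final = Counter(final_labels)
    n_pred = Counter(predicted_labels)
    n_both = Counter(zip(final_labels, predicted_labels))
    matrix = {}
    for f in n_final:
        for p in n_pred:
            intersection = n_both[(f, p)]
            # Nuclei carrying annotation f or prediction p (or both)
            union = n_final[f] + n_pred[p] - intersection
            matrix[(f, p)] = intersection / union
    return matrix


# Toy labels (hypothetical, not taken from the study)
final = ["OPC", "OPC", "MOL", "MOL", "MOL", "Astro"]
pred = ["OPC", "MOL", "MOL", "MOL", "MOL", "Astro"]
scores = jaccard_matrix(final, pred)
print(scores[("OPC", "OPC")])  # 0.5
print(scores[("MOL", "MOL")])  # 0.75
```

In this toy case one of the two OPC nuclei was predicted as MOL, so the OPC/OPC entry drops to 0.5, whereas a perfectly transferred label such as Astro scores 1.0.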
Extended Data Fig. 2 Annotation of oligodendrocyte lineage cells.
(a) UMAP visualization of integrated and annotated oligodendrocyte lineage nuclei in snRNA-seq. Oligodendrocyte: mature oligodendrocytes, COP: committed oligodendrocyte progenitor cells, OPC: oligodendrocyte progenitor cells. (b) Percentage of nuclei per sample for each subtype in snRNA-seq. (c) Normalized and scaled (z-scored) expression values of major oligodendrocyte lineage cell type markers. (d) UMAP visualization of annotated oligodendrocyte lineage nuclei in snATAC-seq per species. (e) Clarity of annotation transfer from snRNA-seq to snATAC-seq as displayed by Jaccard similarity index (similar to Extended Data Fig. 1k). (f) Percentage of cells contributed per individual per species per major cell type. (g-h) smFISH of PDGFRA (OPC) and MOG (MOL) in the posterior cingulate cortex of humans (g) and chimpanzees (h). Images span all cortical layers in both species. Scale bar, 100 μm. Similar results have been obtained for 4 bins across 2 humans and for 6 bins across 3 chimpanzees (see Extended Data Fig. 4c, d).
Extended Data Fig. 3 Integration and annotation of neurons.
(a) Annotated UMAP of excitatory neurons integrated across all species in snRNA-seq. (b) Percentage of nuclei per sample for each excitatory subtype in snRNA-seq. (c) Annotated snATAC-seq per species in the UMAP space. All 14 subtypes identified in snRNA-seq are also distinctly annotated in snATAC-seq for all species. (d) Percentage of nuclei per sample for each excitatory subtype in snATAC-seq. (e) Excitatory subtype markers for validation of annotation (expressions are normalized and log transformed). Note that the individual cells are plotted for C1QL3 since the expression level is not sufficient for a violin plot. (f) Clarity of annotation transfer from snRNA-seq to snATAC-seq as displayed by Jaccard similarity index (similar to Extended Data Fig. 1k). (g) Annotated UMAP of inhibitory neurons integrated across all species in snRNA-seq. (h) Percentage of nuclei per sample for each inhibitory subtype in snRNA-seq. (i) Known inhibitory subtype markers for validation of annotation. Expression levels are normalized and log transformed. Note that the individual cells are plotted for NMBR, PAX6, SYT6 since the expression level is not sufficient for a violin plot. (j) Annotated snATAC-seq per species in the UMAP space. All 8 subtypes identified in snRNA-seq are also distinctly annotated in snATAC-seq for all species. (k) Percentage of nuclei per sample for each inhibitory subtype in snATAC-seq. (l) Clarity of annotation transfer from snRNA-seq to snATAC-seq as displayed by Jaccard similarity index (similar to Extended Data Fig. 1k).
Extended Data Fig. 4 Additional analyses on the oligodendrocyte lineage.
(a) UMAP of MOLs and OPCs in the anterior cingulate cortex (ACC). (b) Percentage of cells contributed per individual per species per cell type. (c-f) Fractions of MOLs and OPCs in smFISH experiments per section (see Fig. 1). Stitched column images encompassing all layers were divided into 5 equal parts from upper (Section 1) to lower layers (Section 5) in all images from human and chimpanzee. (c-d) are data from posterior cingulate cortex (PCC), and (e-f) are data from ACC. Each data point is a bin that contains sections from all layers. c-d: 4 bins from 2 humans, 6 bins from 3 chimpanzees. e-f: 9 bins from 3 humans and 3 chimpanzees. Data are represented as mean values ± SEM. (g) Deconvoluted proportions from OLIG2+ bulk RNA-seq dataset (reference datasets: (left) chimpanzee, (right) rhesus macaque from this study). N = 22 (human), 10 (chimpanzee), 10 (rhesus macaque) individuals (P-value: Wilcoxon rank sum test, two-sided). (h-i) Fraction of OPCs or MOLs in glia in (h) caudate nucleus and (i) dentate gyrus per species. Each dot represents a sample (P-value: Wilcoxon rank sum test, two-sided. Caudate nucleus: N = 6 per species. Dentate gyrus: N = 6 for human, 3 for rhesus macaque). Box plots represent median and interquartile range in panels g-i. (j) Number of species-specific regulatory changes: PCC snRNA-seq (top), ACC snRNA-seq (middle) and PCC snATAC-seq (bottom, log10 transformed for better readability). (k) Distributions of UMIs (unique molecular identifiers) in ACC and PCC oligodendrocyte lineage nuclei (N = 12 individuals both for PCC and ACC). Box plots represent median and interquartile range. (l) Enrichment results between species-specifically expressed genes in ACC (x-axis) and PCC (this study, y-axis). Blue asterisk indicates a significant overlap (FDR < 0.05, Fisher’s exact test, one-sided).
Extended Data Fig. 5 Additional analyses of the regulatory changes in neuronal subtypes.
(a-b) Number of regulatory changes that are human-specific, chimpanzee-specific or differential between rhesus macaque and human or between rhesus macaque and chimpanzee in (a) snRNA-seq or (b) snATAC-seq (log10 transformed for better readability). (c) Scatter plots of number of HS-Genes and CS-Genes per neuronal subtype. Dashed rectangles indicate the subtypes with an excess number of human-specific regulatory gene expression changes (two-sided chi-square test, FDR < 0.05). Shaded area indicates 95% confidence interval around the best fit (R indicates Spearman’s rank correlation coefficient). (d) Same as (c) for HS-CREs and CS-CREs identified in snATAC-seq data. (e) Percentage distribution of excitatory HS-Genes that are found in only one subtype or shared among increasing number of subtypes (x-axis). The sum of all percentages equals 100. From left to right: excitatory snRNA-seq, excitatory snATAC-seq, inhibitory snRNA-seq, inhibitory snATAC-seq.
Extended Data Fig. 6 Comparisons of neuronal expression patterns between this dataset and previous comparative bulk datasets.
(a-c) Enrichments of species-specific expression patterns between this study and previous bulk studies between excitatory neurons (left) and inhibitory neurons (right). (a) Transcriptomic changes between the Kozlenkov et al. dataset and this dataset, (b) epigenomic changes between the Kozlenkov et al. dataset and this dataset, (c) transcriptomic changes between the Berto et al. dataset and this dataset. FDR values are from a Fisher’s exact test with multiple testing correction. (d-e) Subtype-specific changes are captured less in the bulk RNA-seq datasets. (d) Comparison of excitatory HS-Genes between a previous bulk analysis and this dataset. Top: odds ratio between the bulk dataset and this dataset with increasing subtype specificity of HS-Genes (from right to left). Bottom: percentage of HS-Genes that were also found in the bulk dataset. (e) Same comparison as (d) with HS-CREs. (f-g) Subtype-specific changes are captured less when the subtypes are combined within the same dataset. (f) Same comparison as (d) with HS-Genes but this time pseudobulk data results are obtained by combining the subtypes in this study. (g) Same comparison as (f) with HS-CREs.
Extended Data Fig. 7 Associations between HS-Genes and HS-CREs.
(a) The specificity of association between HS-Genes and HS-CREs decreases with increasing distance from the transcription start site (TSS). Y-axis shows the odds ratio, which is defined by the ratio of HS-Genes associated with HS-CREs divided by the ratio of not significant genes (NS-Genes) associated with HS-CREs. We calculated the odds ratio for increasing distance from the TSS in both directions for four different associations (from left to right): HS-Open-CRE & HS-Up-Gene, HS-Open-CRE & HS-Down-Gene, HS-Close-CRE & HS-Up-Gene, HS-Close-CRE & HS-Down-Gene. The value for each observation was obtained by taking the mean across all cell types. (b-d) Enrichments between HS-CRE associated genes within a 10-kb window from the TSS and HS-Genes per cell type. (e-f) Putative target genes of human-specific FOXP2 upregulation in (e) L5-6_THEMIS_1 and (f) L4-6_RORB_2 cells. All genes show human-specific up- or downregulation in their respective subtype and reside within 500 kb of at least one human-specific chromatin accessibility change that has a FOXP2 motif. Dark blue circles indicate the genes that are not altered in the other 12 excitatory subtypes (similar to FOXP2 itself). Red loop in (a) indicates that FOXP2 itself is also identified with this analysis in the L5-6_THEMIS_1 subtype.
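The odds ratio in panel (a) can be sketched in Python as a ratio of two proportions, evaluated at increasing distance windows around the TSS. The sketch assumes that "ratio" in the legend means the fraction of genes with at least one HS-CRE inside the window; the function name `association_ratio`, the single-chromosome coordinates and the window sizes are all hypothetical and are not taken from the study.

```python
def association_ratio(hs_gene_tss, ns_gene_tss, hs_cre_positions, window):
    """Fraction of HS-Genes near an HS-CRE divided by the same fraction for NS-Genes.

    All positions are assumed to lie on one chromosome (toy simplification).
    """

    def fraction_linked(tss_list):
        linked = sum(
            any(abs(cre - tss) <= window for cre in hs_cre_positions)
            for tss in tss_list
        )
        return linked / len(tss_list)

    hs_fraction = fraction_linked(hs_gene_tss)
    ns_fraction = fraction_linked(ns_gene_tss)
    # Undefined when no NS-Gene is linked; reported here as infinity
    return hs_fraction / ns_fraction if ns_fraction > 0 else float("inf")


# Toy coordinates (hypothetical)
hs_tss = [1_000, 50_000, 200_000]
ns_tss = [5_000, 120_000, 300_000, 400_000]
hs_cres = [1_500, 52_000, 310_000]

for window in (5_000, 10_000, 100_000):
    ratio = association_ratio(hs_tss, ns_tss, hs_cres, window)
    print(window, round(ratio, 3))
# 5000 2.667
# 10000 1.333
# 100000 0.667
```

With these toy values the ratio falls as the window widens, because distant windows capture background genes as readily as human-specific ones, which is the qualitative pattern the legend describes.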
Extended Data Fig. 8 Further associations between human-specific substitutions and human-specific chromatin accessibility changes.
(a) Pie-chart distribution of published HARs overlapping CREs in this dataset. (b) Ratio of non-BA23 CREs overlapping BA23 CREs (denominator: all CREs in BA23). Each dot represents an independent library prep. Red datasets indicate cortical regions, blue datasets indicate subcortical regions. (Sample sizes: Superior Middle Temporal Gyri: 8, Middle Frontal Gyrus: 12, Parietal Lobe: 7, Hippocampus: 16, Caudate: 32, Putamen: 11, Substantia Nigra: 14. Box plots represent median and interquartile range.) (c) Overlap between cortical HARs (identified in this study) and published HARs (P-value: one-sided chi-square test). (d) Number of HS-CREs associated with a cortical HAR or a published HAR. (e-f) Examples of HS-Open-CRE associated HARs. Bottom panel shows the multi-species alignment for CELF4 HAR. Dots represent no change from the human (hg38) sequence. Human-specific changes conserved in other lineages are highlighted in shaded blue. (g) Enrichment of human-specific substitutions within the HS-CREs per major cell type. Enrichment is tested by a negative binomial regression model with CRE length and evolution of the CRE (HS-CRE or not HS-CRE) as the predictor variables and the number of human-specific substitutions as the response variable (significance: likelihood ratio test). (h) Example of an HS-Open-CRE with many human-specific substitutions. (i) Overlap of substitutions that are specific to the human lineage (in comparison to chimpanzee, gorilla and gibbon) and previously identified modern human substitutions. (j) Log fold changes of substitution and HS-CRE association for substitutions on the human (blue boxplots) and modern human lineage (tile red dots) per cell type (except for excitatory cells). Human lineage-specific substitutions were randomly downsampled 100 times to 12,161 (the number of modern human-specific variants) for comparison. Box plots represent median and interquartile range.
Extended Data Fig. 9 Supplementary motif enrichment results.
(a-b) Hierarchical clustering of motif enrichments (log-fold change) in HS-Open-CREs across (a) excitatory and (b) inhibitory neuronal subtypes. Transcription factors (TFs) associated with each motif enrichment are displayed in rows and the neuronal subtypes are displayed in columns. Only the motifs enriched in at least one subtype are displayed. (c-d) Accessibility of (c) FOX and (d) FOS / JUN family TFs. Accessibility is assessed by the normalized gene activity scores (calculated using Cicero; ref. 66) per gene per subtype. (e) Annotated UMAP of excitatory neurons in snRNA-seq of surgically resected samples (referred to as PMI0 compared to postmortem BA23 human samples that are referred to as PMI24 in this figure). (f) Percentage of nuclei per sample for each excitatory subtype in snRNA-seq. (g) Enrichments of species-specifically expressed genes when PMI0 or PMI24 datasets were used as the human dataset in the comparative analyses. (h) Pearson correlations (test for P-value is two-sided) between the log fold changes of HS-Open-CRE motif enrichments when PMI0 or PMI24 datasets were used as the human dataset in the comparative analyses. (i) Heatmap of FOS / JUN motif enrichments per excitatory subtype in HS-Open-CREs. Colors correspond to –log10(FDR); numbers correspond to log fold change of enrichment.
Extended Data Fig. 10 Comparisons with external datasets.
(a) Expression levels of three ambient RNA markers highly expressed in neurons (SYT1, SNAP25 and NRGN; ref. 21) in the Ma et al. dataset (ref. 14). The dot plot is generated through the interactive web tool linked to the original publication. Dashed square brackets indicate glial cell types, which show exceptionally low levels in the human dataset. Note that the smallest dot shows the presence of a transcript in 40% of the cells. (b) Same as (a) using this PCC dataset. Neuronal ambient RNA markers are detected at very low levels in glial cells across species after ambient RNA removal. (c-e) Enrichment of HS-Genes between the previous study (y-axis) and the current study (x-axis) with two alternative methods. (f) Enrichment of HS-CREs between the previous study (y-axis) and the current study (x-axis) with two alternative methods. For simplicity, we combined all HS-Genes from the subtypes of a major cell type (e.g. all excitatory neuronal subtypes were combined for the excitatory cell type comparisons). P-values were computed using a Fisher’s exact test (one-sided) and false discovery rate (FDR) was calculated per panel.
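The one-sided Fisher's exact test used for these overlap enrichments is equivalent to an upper-tail hypergeometric probability, which the following Python sketch computes with exact integer arithmetic. The legend does not state which FDR procedure was applied; the Benjamini-Hochberg adjustment shown here is an assumption, and the function names and toy set sizes are hypothetical.

```python
from math import comb


def fisher_exact_greater(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b genes are drawn from `universe` genes,
    of which size_a belong to the first list (one-sided Fisher's exact test)."""
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example (hypothetical): two lists of 5 genes sharing 3, in a universe of 20
p = fisher_exact_greater(overlap=3, size_a=5, size_b=5, universe=20)
print(round(p, 4))  # 0.0726
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Computing the tail with `math.comb` avoids floating-point underflow for large gene universes, because the division happens only once at the end.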
Supplementary information
Supplementary Table 1
Sample information and demographics. This table summarizes information per sample tissue including demographics and sequencing information. It contains information for snRNA-seq, snATAC-seq and smFISH samples.
Supplementary Table 2
Cell-type proportion statistics. This table summarizes cell statistics regarding cell-type proportion comparisons between species. It contains ratios per sample as well as the statistics (likelihood ratio tests, two-sided; Methods) for each comparison.
Supplementary Table 3
Species-specific genes per cell type. This table reports differentially expressed genes per cell type from the snRNA-seq experiments. Species-specific regulation of each gene is annotated along with the direction of regulation (upregulation and downregulation are shown as UP and DOWN, respectively). Statistics were obtained using edgeR on the aggregated count matrix (Methods).
Supplementary Table 4
Species-specific CREs per cell type. This table reports differentially accessible CREs per cell type from the snATAC-seq experiments. Species-specific regulation of each CRE is annotated along with the direction of regulation (open and closed chromatin are shown as UP and DOWN, respectively). Statistics were obtained using edgeR on the aggregated count matrix (Methods).
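The "aggregated count matrix" mentioned for Supplementary Tables 3 and 4 refers to summing per-nucleus counts into one profile per sample and cell type before testing. A minimal Python sketch of that aggregation step follows; the function name `aggregate_counts`, the sample identifiers and the feature names are hypothetical, and the downstream edgeR modelling (an R package) is not reproduced here.

```python
from collections import defaultdict


def aggregate_counts(nuclei):
    """Sum counts over nuclei that share a (cell type, sample) pair.

    `nuclei` is an iterable of (sample_id, cell_type, {feature: count}),
    where a feature is a gene (snRNA-seq) or a CRE (snATAC-seq).
    """
    summed = defaultdict(lambda: defaultdict(int))
    for sample_id, cell_type, counts in nuclei:
        for feature, n in counts.items():
            summed[(cell_type, sample_id)][feature] += n
    return {key: dict(features) for key, features in summed.items()}


# Toy nuclei (hypothetical identifiers and counts)
nuclei = [
    ("human_1", "OPC", {"GENE_A": 2, "GENE_B": 0}),
    ("human_1", "OPC", {"GENE_A": 1, "GENE_B": 4}),
    ("human_1", "MOL", {"GENE_A": 0, "GENE_B": 3}),
    ("chimp_1", "OPC", {"GENE_A": 5}),
]
pseudobulk = aggregate_counts(nuclei)
print(pseudobulk[("OPC", "human_1")])  # {'GENE_A': 3, 'GENE_B': 4}
```

Each resulting column then represents one biological replicate, so the individual rather than the nucleus is the unit of replication in the statistical test.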
Supplementary Table 5
GO enrichment of human-specific oligodendrocyte lineage genes. This table provides GO enrichment statistics (Fisher’s exact test, one-sided) for human-specifically upregulated or downregulated genes for the oligodendrocyte lineage. Only significant enrichments are reported.
Supplementary Table 6
CRE–gene association. This table provides a list of HS-CREs within the 500-kb vicinity of an HS-Gene TSS that changed in the same direction. The list is provided per cell type.
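The pairing rule behind this table (an HS-CRE within 500 kb of an HS-Gene TSS, changing in the same direction) can be sketched in Python as follows. Measuring distance from the CRE midpoint is an assumption, as are the record layout, the function name `link_cres_to_genes` and all identifiers and coordinates; the UP and DOWN labels follow the convention described for Supplementary Tables 3 and 4.

```python
def link_cres_to_genes(cres, genes, max_distance=500_000):
    """Pair each CRE with every gene on the same chromosome whose TSS lies
    within max_distance and whose change has the same direction."""
    links = []
    for cre in cres:
        # Assumption: distance is measured from the CRE midpoint
        midpoint = (cre["start"] + cre["end"]) // 2
        for gene in genes:
            same_chrom = cre["chrom"] == gene["chrom"]
            near = abs(midpoint - gene["tss"]) <= max_distance
            concordant = cre["direction"] == gene["direction"]
            if same_chrom and near and concordant:
                links.append((cre["id"], gene["name"]))
    return links


# Toy records (hypothetical)
cres = [
    {"id": "cre1", "chrom": "chr1", "start": 100_000, "end": 100_500, "direction": "UP"},
    {"id": "cre2", "chrom": "chr1", "start": 900_000, "end": 900_400, "direction": "UP"},
    {"id": "cre3", "chrom": "chr2", "start": 100_000, "end": 100_500, "direction": "UP"},
    {"id": "cre4", "chrom": "chr1", "start": 300_000, "end": 300_200, "direction": "DOWN"},
]
genes = [{"name": "GENE_A", "chrom": "chr1", "tss": 250_000, "direction": "UP"}]

links = link_cres_to_genes(cres, genes)
print(links)  # [('cre1', 'GENE_A')]
```

Only `cre1` is linked: `cre2` is too far away, `cre3` is on another chromosome and `cre4` changes in the opposite direction.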
Supplementary Table 7
CRE–HAR overlap. This table provides the CREs that overlap with the publicly available HARs as well as the cortical HARs identified in this study.
Supplementary Table 8
CRE–modern-variant overlap. This table provides the CREs that overlap with the publicly available modern human-specific variants that evolved after the split from Neanderthals and Denisovans. Note that the modern human-specific variants are further refined in this study (Methods).
Supplementary Table 9
Human-specific substitutions within CREs. This table provides human-specific substitutions detected within the CREs. Allele frequency in human populations is reported for each substitution.
Supplementary Table 10
Motif enrichment. This table provides motif enrichment (likelihood ratio test, two-sided; Methods) results for HS-Open-CREs using either human samples from this study (PMI_24) or the surgically resected samples with no PMI (PMI_0).
About this article
Cite this article
Caglayan, E., Ayhan, F., Liu, Y. et al. Molecular features driving cellular complexity of human brain evolution. Nature 620, 145–153 (2023). https://doi.org/10.1038/s41586-023-06338-4
DOI: https://doi.org/10.1038/s41586-023-06338-4