Article | Published:

Variation among intact tissue samples reveals the core transcriptional features of human CNS cell classes

Nature Neuroscience volume 21, pages 1171–1184 (2018)

Abstract

It is widely assumed that cells must be physically isolated to study their molecular profiles. However, intact tissue samples naturally exhibit variation in cellular composition, which drives covariation of cell-class-specific molecular features. By analyzing transcriptional covariation in 7,221 intact CNS samples from 840 neurotypical individuals, representing billions of cells, we reveal the core transcriptional identities of major CNS cell classes in humans. By modeling intact CNS transcriptomes as a function of variation in cellular composition, we identify cell-class-specific transcriptional differences in Alzheimer’s disease, among brain regions, and between species. Among these, we show that PMP2 is expressed by human but not mouse astrocytes and significantly increases mouse astrocyte size upon ectopic expression in vivo, causing them to more closely resemble their human counterparts. Our work is available as an online resource (http://oldhamlab.ctec.ucsf.edu/) and provides a generalizable strategy for determining the core molecular features of cellular identity in intact biological systems.

Data availability

All gene expression data sets analyzed in this study are publicly available (accession codes and URLs are provided in Supplementary Table 1). Genome-wide estimates of expression fidelity for major human CNS cell classes are provided on our web site (http://oldhamlab.ctec.ucsf.edu/). All other data that support the findings of this study are available from the corresponding author upon reasonable request.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

We are grateful to B. Dispensa (UCSF), J. Hesse (UCSF), D. Kleinhesselink (UCSF), and J. Jed (UCSF) for technical support. We thank A. Molinaro (UCSF) for statistical consultations, D. Rowitch (UCSF) for astrocyte discussions, and E. Huang (UCSF) and M. Paredes (UCSF) for human brain samples. We apologize that many relevant and important publications are not cited, due to space limitations. This work was supported by the UCSF Program for Breakthrough Biomedical Research (to M.C.O.), which is funded in part by the Sandler Foundation, a Scholar Award from the UCSF Weill Institute for Neurosciences (to M.C.O.), a Research Grant from The Shurl and Kay Curci Foundation (to M.C.O.), NIMH R01MH113896 (to M.C.O.), a Pew Scholars Award (to A.V.M.), NIMH K08MH104417 (to A.V.M.), the Burroughs Wellcome Fund (to A.V.M.), and National Institute of General Medical Sciences (NIGMS) Medical Scientist Training Program grant #T32GM007618.

Author information

Affiliations

  1. Department of Neurological Surgery, University of California at San Francisco, San Francisco, CA, USA

    • Kevin W. Kelley & Michael C. Oldham

  2. The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California at San Francisco, San Francisco, CA, USA

    • Kevin W. Kelley, Hiromi Nakao-Inoue, Anna V. Molofsky & Michael C. Oldham

  3. Weill Institute for Neurosciences, University of California at San Francisco, San Francisco, CA, USA

    • Kevin W. Kelley, Hiromi Nakao-Inoue, Anna V. Molofsky & Michael C. Oldham

  4. Department of Psychiatry, University of California at San Francisco, San Francisco, CA, USA

    • Kevin W. Kelley, Hiromi Nakao-Inoue & Anna V. Molofsky

  5. Medical Scientist Training Program and Neuroscience Graduate Program, University of California at San Francisco, San Francisco, CA, USA

    • Kevin W. Kelley

Contributions

K.W.K. and M.C.O. conceived and designed the analytical strategies and wrote the manuscript. K.W.K. performed most data analyses and histological experiments. K.W.K. and H.N.-I. performed PMP2 expression experiments under supervision from A.V.M.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Michael C. Oldham.

Integrated supplementary information

  1. Supplementary Figure 1 Cell-class-specific gene coexpression relationships accurately predict variation in cellular composition among heterogeneous samples (related to Figs. 1–8).

    Cell-class module eigengenes were calculated for astrocytes, oligodendrocytes, microglia, and neurons from gene coexpression analysis of synthetic mixtures of single-cell RNA-seq data from adult human brain¹ (A) or adult mouse brain² (B). Each cell-class module eigengene was defined as the first principal component of the synthetic coexpression module that was maximally enriched with the corresponding cell-class markers (Methods). C) kME values for synthetic cell-class modules from adult mouse brains accurately predicted the results of differential expression analysis for each cell class (n=10 synthetic datasets; ‘up’ / ‘down’ denote up- and down-regulated genes for each cell class). Data are from Tasic et al.²

    1. Darmanis, S., et al. A survey of human brain transcriptome diversity at the single cell level. PNAS 112, 7285-7290 (2015).
    2. Tasic, B., et al. Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335-346 (2016).
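The eigengene definition in this legend (first principal component of a module's expression matrix) can be illustrated with a minimal sketch. It assumes a genes × samples NumPy array; the function name `module_eigengene`, the simulated marker data, and the sign convention are illustrative assumptions and are not taken from the authors' code.

```python
import numpy as np

def module_eigengene(expr):
    """First principal component across samples of a genes x samples matrix."""
    expr = np.asarray(expr, dtype=float)
    # Standardize each gene (row) across samples so no single gene dominates.
    centered = expr - expr.mean(axis=1, keepdims=True)
    sd = centered.std(axis=1, ddof=1, keepdims=True)
    scaled = centered / np.where(sd == 0, 1.0, sd)
    # Rows of vt are orthonormal vectors in sample space; vt[0] is PC1.
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    eigengene = vt[0]
    # The sign of a principal component is arbitrary; orient it to agree
    # with the mean standardized expression of the module.
    if np.dot(eigengene, scaled.mean(axis=0)) < 0:
        eigengene = -eigengene
    return eigengene

# Hypothetical data: 10 marker genes whose expression tracks the
# fraction of one cell class in each of 50 samples, plus noise.
rng = np.random.default_rng(0)
abundance = rng.uniform(0.1, 0.9, size=50)
markers = np.outer(rng.uniform(5, 10, size=10), abundance)
markers += rng.normal(0, 0.2, size=markers.shape)

me = module_eigengene(markers)
r = np.corrcoef(me, abundance)[0, 1]  # correlation with simulated abundance
```

In this toy setting the eigengene closely tracks the simulated abundance, which is the property the legend describes for the synthetic mixtures.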

  2. Supplementary Figure 2 Gene expression fidelity is robust to the choice of gene set used for enrichment analysis (related to Figs. 1 and 2).

    Consensus expression fidelity was calculated as described in Fig. 1c-g using four independent gene sets for each cell class. Astrocyte and oligodendrocyte expression fidelity 1–4 were calculated using markers from refs. 1, 2, 3, and 4, respectively. Microglia expression fidelity 1–3 were calculated using markers from refs. 5, 6, and 2, respectively, and microglia expression fidelity 4 was calculated from the immune system phenotype pathway (MP:0005378)⁷. Neuron expression fidelity 1–4 were calculated using markers from refs. 1, 2, 3, and 8, respectively.

    1. Cahoy, J.D., et al. A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J. Neurosci. 28, 264-278 (2008).
    2. Zhang, Y., et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J. Neurosci. 34, 11929-11947 (2014).
    3. Lein, E.S., et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168-176 (2007).
    4. Doyle, J.P., et al. Application of a translational profiling approach for the comparative analysis of CNS cell types. Cell 135, 749-762 (2008).
    5. Hickman, S.E., et al. The microglial sensome revealed by direct RNA sequencing. Nat. Neurosci. 16, 1896-1905 (2013).
    6. Butovsky, O., et al. Identification of a unique TGF-beta-dependent molecular and functional signature in microglia. Nat. Neurosci. 17, 131-143 (2014).
    7. Zhang, Y., et al. Systematic analysis, comparison, and integration of disease based human genetic association data and mouse genetic phenotypic information. BMC Med. Genomics 3, 1 (2010).
    8. Collins, M.O., et al. Molecular characterization and comparison of the components and multiprotein complexes in the postsynaptic proteome. J. Neurochem. 97 Suppl 1, 16-23 (2006).

  3. Supplementary Figure 3 Concordant high-fidelity and single-nucleus RNA-seq genes are more likely to be detected in single-nucleus data than discordant genes (related to Fig. 2).

    A-D) Proportion of cells expressing concordant and discordant genes (Fig. 2f-i) for astrocytes, oligodendrocytes, microglia, and neurons. Data are from Habib et al.¹ (n=1909 astrocyte, 2965 oligodendrocyte, 389 microglia, and 7735 neuron independent single-nucleus samples). E-H) Proportion of cells expressing concordant and discordant genes (Fig. 2f-i) for astrocytes, oligodendrocytes, microglia, and neurons. Data are from Lake et al.² (n=2524 astrocyte, 4369 oligodendrocyte, 756 microglia, and 25854 neuron independent single-nucleus samples).

    1. Habib, N., et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955-958 (2017).
    2. Lake, B.B., et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70-80 (2018).

  4. Supplementary Figure 4 The genes with the highest expression fidelity for human CNS cell classes are consistently coexpressed across CNS regions and technology platforms (related to Figs. 1–4).

    Expression patterns of the top 10 high-fidelity genes for astrocytes, oligodendrocytes, microglia, and neurons in four human CNS regions. Transcriptomes in each region were analyzed using different technology platforms. Columns 1-4 correspond to the following datasets: Hs.FCX.RNAseq, Hs.HIP.GSE11882, Hs.AMY.ABI, and Hs.DI.GSE46707 (Table S1).

  5. Supplementary Figure 5 Human brain histological validation of high-fidelity genes (related to Fig. 3).

    A) NeuN, DBNDD2, and PON2 are expressed in separate cells in adult human frontal cortex (FCX). B) APBB1IP1 is coexpressed with AIF1 and absent in GFAP+ astrocytes in adult human subcortical white matter (WM). Scale bar: 50μm. Immunostaining was repeated at least twice on independent samples with similar results.

  6. Supplementary Figure 6 Variation among intact tissue samples reveals transcriptional signatures of human choroid plexus cells, mural cells, oligodendrocyte precursor cells (OPCs), and Purkinje neurons (related to Fig. 4).

    A-D) Top: high-fidelity genes for each cell class (top 10 are shown) are consistently coexpressed in independent datasets. Middle: consensus gene expression fidelity distributions for each cell class with canonical markers of major cell classes labeled in green (neurons), red (astrocytes), blue (oligodendrocytes), and black (microglia). Gene expression fidelity distributions for published sets of markers (A1, A2, O1, O2, M1, M2, N1, N2, Cp1, Cp2, Mu1, Mu2, Op1, Op2, P1, P2; Methods) were cross-referenced with high-fidelity genes (top three percentiles). Gray shading: significant enrichment (one-sided Fisher's exact test). Note that Cp1, Mu1, Op1, and P1 were the gene sets used for module enrichment analysis (Table S2). The number of independent samples used to calculate fidelity for each gene is provided in Table S3. Bottom: mouse in situ hybridization data from the Allen Mouse Brain Atlas¹ for high-fidelity genes in lateral ventricle (A), ventral midbrain/hypothalamus (B), cortex (C), and cerebellum (D). Scale bar: 200μm; inset scale bar: 500μm.

    1. Lein, E.S., et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168-176 (2007).
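The one-sided Fisher's exact test used for the enrichment shading is equivalent to an upper-tail hypergeometric probability. The sketch below shows that calculation with the standard library only; the function name `fisher_enrichment_p` and all gene counts are hypothetical and chosen purely for illustration.

```python
from math import comb

def fisher_enrichment_p(overlap, n_high_fidelity, n_markers, n_universe):
    """Upper-tail hypergeometric p-value: P(overlap >= observed).

    Treats the high-fidelity genes as a random draw of size n_high_fidelity
    from n_universe genes, of which n_markers belong to the published set.
    """
    max_k = min(n_high_fidelity, n_markers)
    total = comb(n_universe, n_high_fidelity)
    tail = sum(
        comb(n_markers, k) * comb(n_universe - n_markers, n_high_fidelity - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total

# Hypothetical numbers: 10,000 genes, 300 high-fidelity genes (top three
# percentiles), a 100-gene marker set, and 30 genes shared by both.
p_example = fisher_enrichment_p(30, 300, 100, 10_000)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for small tail probabilities, at the cost of speed for very large gene universes.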

  7. Supplementary Figure 7 Variation in cellular abundance predicts gene expression in transcriptomes from intact CNS samples (related to Fig. 5).

    Total percent variance explained (mean adj. r²) for ~9600 genes whose expression levels were modeled as a function of inferred astrocyte, oligodendrocyte, microglia, and neuron abundance in each of 47 regional datasets (subset to n=40 samples; values are mean ± 2 s.e.m., 10 iterations). Gene expression modeling results were essentially identical when up to 90% of the data was excluded prior to determining the high-fidelity genes used for estimating relative cellular abundance.
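A minimal sketch of this kind of model follows: one gene's expression is regressed on four inferred cell-class abundances by ordinary least squares, and the adjusted r² is reported. The function name `adjusted_r2`, the simulated abundances, and the use of plain least squares are assumptions for illustration; the legend does not specify the fitting procedure beyond the quantities modeled.

```python
import numpy as np

def adjusted_r2(y, X):
    """Adjusted r-squared of an OLS fit of y on the columns of X plus an intercept."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Penalize for the number of predictors p given n samples.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical example: 40 samples, 4 abundance estimates
# (astrocyte, oligodendrocyte, microglia, neuron).
rng = np.random.default_rng(1)
n_samples = 40
abund = rng.normal(size=(n_samples, 4))
astro_gene = 2.0 * abund[:, 0] + rng.normal(0, 0.5, n_samples)  # tracks astrocytes
noise_gene = rng.normal(size=n_samples)                         # unrelated to abundance

adj_specific = adjusted_r2(astro_gene, abund)
adj_noise = adjusted_r2(noise_gene, abund)
```

Averaging such values over all modeled genes in a dataset would give a quantity analogous to the mean adjusted r² described in the legend.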

  8. Supplementary Figure 8 Gene expression modeling in AD (related to Fig. 6).

    A,B) Expression patterns of the top 10 high-fidelity genes for each cell class were used to estimate the relative abundance of neurons, astrocytes, microglia, and oligodendrocytes in samples from control (CTRL) and AD subjects as illustrated in Fig. 5a. Data in (A) consist of FCX, TCX, PCX, and HIP samples¹ (n=71 CTRL and 71 AD independent samples). Data in (B) consist of FCX, TCX, and HIP samples² (n=32 CTRL and 32 AD independent samples). P-values indicate the significance of differences in estimated cellular abundance between CTRL and AD (two-sided Wilcoxon rank-sum test). C) T-values of AD risk genes³ from cell-class models calculated for CTRL and AD samples in three independent datasets: GSE38350¹, GSE44770⁴, GSE36980². Gene order was determined by hierarchical clustering. D) Enrichment analysis (one-sided Fisher’s exact test) of significant (≥ 2 datasets) up-regulated neuron genes in AD. E) Enrichment analysis (one-sided Fisher’s exact test) of significant (≥ 2 datasets) up-regulated microglia genes in AD. Benjamini and Hochberg corrected q-values are displayed. F,G) Expression levels for select genes show cell-intrinsic expression increases in AD neurons (F) or microglia (G) relative to CTRL after controlling for variation in cellular abundance. Black lines in (D,E) denote q-value = 0.05.

    1. Cribbs, D.H., et al. Extensive innate immune gene activation accompanies brain aging, increasing vulnerability to cognitive decline and neurodegeneration: a microarray study. J. Neuroinflammation 9, 179 (2012).
    2. Hokama, M., et al. Altered expression of diabetes-related genes in Alzheimer's disease brains: the Hisayama study. Cereb. Cortex 24, 2476-2488 (2014).
    3. Karch, C.M., Cruchaga, C. & Goate, A.M. Alzheimer's disease genetics: from the bench to the clinic. Neuron 83, 11-26 (2014).
    4. Zhang, B., et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer's disease. Cell 153, 707-720 (2013).
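The Benjamini and Hochberg correction mentioned in this legend converts a set of p-values into q-values by scaling each ranked p-value and enforcing monotonicity. The following is a minimal sketch of the standard procedure; the function name `bh_qvalues` and the input p-values are hypothetical.

```python
import numpy as np

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q

# Hypothetical enrichment p-values for four gene sets.
q = bh_qvalues([0.01, 0.04, 0.03, 0.005])
# q is approximately [0.02, 0.04, 0.04, 0.02]
```

With these inputs all four gene sets would fall below a q-value threshold of 0.05, the cutoff marked by the black lines in panels D and E.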

  9. Supplementary Figure 9 Cell-class module eigengenes (ME) accurately reflect cellular abundance and are robust to gene expression perturbations (related to Fig. 6).

    MEs were constructed from the top 10 astrocyte, oligodendrocyte, microglia, and neuron genes from two synthetic datasets, simulating a condition and control scenario, each consisting of 100 samples of randomly aggregated single-cell RNA-seq data from human brain¹ (Methods). In one of the two synthetic datasets, a subset of the top 10 genes was systematically perturbed by the indicated fold-changes. Pearson correlation coefficients between the perturbed module eigengenes and actual cellular abundance for astrocytes, oligodendrocytes, microglia, and neurons are shown.

    1. Darmanis, S., et al. A survey of human brain transcriptome diversity at the single cell level. PNAS 112, 7285-7290 (2015).

  10. Supplementary Figure 10 Gene expression modeling identifies neuronal expression differences between CNS regions (related to Fig. 7).

    A) Genes conservatively predicted to be expressed by human neurons in restricted brain regions: FCX, STR, HIP, DI, MID. Depicted genes were differentially modeled (same criteria as Fig. 7f-i) and differentially expressed by greater than 50 percentile units in region 1 vs. region 2. B) Enrichment analysis of regional neuron genes (one-sided Fisher’s exact test). Benjamini and Hochberg corrected q-values are shown. Black line denotes q-value = 0.05. V.G.: Voltage gated; sig.: signaling. C-H) Examples of differentially modeled regional neuron genes. Gene expression was modeled via linear regression as a function of estimated neuronal abundance in samples from each brain region. Data in (C, D, G) are from the Allen Institute (ref. 1). Data in (E, F) are from GSE46706 (ref. 2). Data in (H) are from GTEx (ref. 3). Also displayed are Allen Mouse Brain Atlas in situ hybridization data (ref. 4) for the indicated genes. Scale bar: 500 μm. 1. Hawrylycz, M., et al. Canonical genetic signatures of the adult human brain. Nat. Neurosci. 18, 1832-1844 (2015). 2. Ramasamy, A., et al. Genetic variability in the regulation of gene expression in ten regions of the human brain. Nat. Neurosci. 17, 1418-1428 (2014). 3. GTEx Consortium. Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648-660 (2015). 4. Lein, E.S., et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168-176 (2007).

  11. Supplementary Figure 11 Integrative gene coexpression analysis of mouse brain transcriptomes reveals consensus transcriptional profiles of astrocytes, oligodendrocytes, microglia, and neurons (related to Fig. 8).

    A-D) Left: consensus gene expression fidelity distributions for mouse astrocytes (A), oligodendrocytes (O), microglia (M), and neurons (N). Canonical markers of each cell class are labeled in red (A), blue (O), black (M), and green (N). Right: gene expression fidelity distributions for published cell-class markers (A1, O1, M1, N1: Zhang et al. (ref. 1); A2, O2, N2: Cahoy et al. (ref. 2); M2: Butovsky et al. (ref. 3); A3, O3, N3: Lein et al. (ref. 4); M3: Hickman et al. (ref. 5)) were cross-referenced with high-fidelity genes from each consensus signature (z-score > 20). Gray shading: significant enrichment (one-sided Fisher's exact test). Note that A2, O2, M3, and N2 were the gene sets used for module enrichment analysis (Table S2). The number of independent samples used to calculate fidelity for each gene is provided in Table S7. E-H) The top 50 genes ranked by consensus expression fidelity for mouse astrocytes, oligodendrocytes, microglia, or neurons. Expression levels represent averages of mean percentile ranks for all datasets where gene data were present. Mutation intolerance data were obtained from ExAC (ref. 6). PubMed citations were obtained by queries with gene symbol and cell class (for example, gene symbol and 'neuron'). Cellular localization data were extracted from COMPARTMENTS (ref. 7). Predicted protein-protein interactions (PPI) were obtained from STRING (ref. 8). A link is shown if the combined score between two proteins was >350. 1. Zhang, Y., et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J. Neurosci. 34, 11929-11947 (2014). 2. Cahoy, J.D., et al. A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J. Neurosci. 28, 264-278 (2008). 3. Butovsky, O., et al. Identification of a unique TGF-beta-dependent molecular and functional signature in microglia. Nat. Neurosci. 17, 131-143 (2014). 4. Lein, E.S., et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168-176 (2007). 5. Hickman, S.E., et al. The microglial sensome revealed by direct RNA sequencing. Nat. Neurosci. 16, 1896-1905 (2013). 6. Lek, M., et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285-291 (2016). 7. Binder, J.X., et al. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database Feb 25, bau012 (2014). 8. Szklarczyk, D., et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447-452 (2015).

  12. Supplementary Figure 12 Gene expression modeling identifies cell-class-specific transcriptional differences between humans and mice (related to Fig. 8).

    A) Examples of linear regression modeling results using human, chimpanzee, macaque, and mouse brain transcriptomes (Human: PCX (ref. 1); Chimp: FCX (refs. 2–5); Macaque: CTX (ref. 6); Mouse: CTX, HIP, and CB (ref. 7)). SLC1A3 is predicted to be expressed by astrocytes in all species, MRVI1 by astrocytes in primates but not mice, and PLA2G7 by astrocytes in mice but not primates. B) Astrocyte modeling results and mean expression percentiles for genes in (A) from all brain transcriptomes in all species. Bars denote median values and error bars denote s.e.m. The number of independent samples and datasets for each species are provided in Table S1. C-D) Single-molecule FISH for MRVI1 and ALDH1L1 (C) and PLA2G7 and ALDH1L1 (D) in human and mouse cerebral cortex. Scale bar: 20 μm. FISH experiments were performed only once. E) PMP2 is expressed by mouse Schwann cells in sciatic nerve from a postnatal day 42 animal. Scale bar: 40 μm. Immunostaining was performed twice with similar results. F) Top: lentivirus construct. Left: representative examples of control (CTRL) and PMP2-positive astrocytes from mouse somatosensory cortex. Scale bar: 10 μm. Right: Quantification of maximum diameter in CTRL and PMP2-positive astrocytes. n=4 animals per group, n=37 CTRL and 31 PMP2-infected astrocytes, bars denote mean ± s.e.m., with significance determined by a one-sided Welch’s t-test on animal averages. 1. Hawrylycz, M., et al. Canonical genetic signatures of the adult human brain. Nat. Neurosci. 18, 1832-1844 (2015). 2. Khaitovich, P., et al. Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 309, 1850-1854 (2005). 3. Franz, H., et al. Systematic analysis of gene expression in human brains before and after death. Genome Biol. 6, R112 (2005). 4. Khaitovich, P., et al. Positive selection on gene expression in the human brain. Curr. Biol. 16, R356-358 (2006). 5. Somel, M., et al. Transcriptional neoteny in the human brain. PNAS 106, 5743-5748 (2009). 6. Bernard, A., et al. Transcriptional architecture of the primate neocortex. Neuron 73, 1083-1099 (2012). 7. Matarin, M., et al. A genome-wide gene-expression analysis and database in transgenic mice during development of amyloid or tau pathology. Cell Rep 10, 633-644 (2015).
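
The legends for Supplementary Figures 9 and 10 describe two steps: summarizing the top marker genes of a cell class as a module eigengene that tracks cellular abundance, and modeling a gene's expression by linear regression on that abundance estimate. The following is a minimal sketch of those two steps on simulated data, not the authors' implementation. It assumes the eigengene is the first principal component of the standardized marker-gene matrix (the usual definition of a module eigengene); the function names, the toy data, and all parameter values are hypothetical.

```python
import numpy as np


def module_eigengene(expr):
    """First principal component of a (samples x genes) marker-gene matrix.

    Assumption: genes are standardized before the decomposition, and the
    sign is fixed so the eigengene correlates positively with the mean
    standardized expression of the marker genes.
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me


def regression_t_value(y, x):
    """t-statistic of the slope from simple linear regression of y on x."""
    n = len(y)
    xc = x - x.mean()
    slope = np.dot(xc, y - y.mean()) / np.dot(xc, xc)
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    se = np.sqrt(np.dot(resid, resid) / (n - 2) / np.dot(xc, xc))
    return slope / se


# Toy data (hypothetical): 100 samples whose true cell-class abundance varies.
rng = np.random.default_rng(0)
n_samples = 100
abundance = rng.uniform(0.1, 0.9, n_samples)

# Ten marker genes whose expression scales with abundance, plus noise.
markers = abundance[:, None] * rng.uniform(5, 10, 10)
markers = markers + rng.normal(0, 0.3, (n_samples, 10))

me = module_eigengene(markers)
r = np.corrcoef(me, abundance)[0, 1]  # eigengene vs. true abundance

# One gene expressed by the cell class and one unrelated gene.
cell_gene = 4.0 * abundance + rng.normal(0, 0.3, n_samples)
other_gene = rng.normal(0, 0.3, n_samples)
t_cell = regression_t_value(cell_gene, me)
t_other = regression_t_value(other_gene, me)

print(f"Pearson r (eigengene vs. abundance): {r:.3f}")
print(f"t-value, cell-class gene: {t_cell:.1f}; unrelated gene: {t_other:.1f}")
```

In this toy setting, a gene expressed by the cell class yields a much larger t-value than an unrelated gene, which is the intuition behind comparing t-values across regions or species.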

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–12

  2. Reporting Summary

  3. Supplementary Note

  4. Supplementary Table 1 CNS transcriptomes analyzed in this study.

    Dataset information including CNS region(s), sample preparation, technology platform, quality control metrics, sample size, and reference for each dataset analyzed in this study.

  5. Supplementary Table 2 Enrichment p-values for cell-class gene sets in human gene coexpression modules.

    One-sided Fisher’s exact test p-value enrichments for cell-class gene sets across human regional datasets.

  6. Supplementary Table 3 Genome-wide expression fidelity for major CNS cell classes in humans.

    Gene identifiers, dataset representations, sample sizes, cell-class fidelity metrics, and mean expression percentiles for 18451 genes.

  7. Supplementary Table 4 Genome-wide comparison of expression fidelity for human CNS cell classes and differential expression results from SN RNA-seq data.

    Cell-class fidelity metrics and SN differential expression statistics for astrocytes, microglia, oligodendrocytes, and neurons for each of the 15494 shared genes between this study and Habib et al. (ref. 1; n=1909 astrocyte, 2965 oligodendrocyte, 389 microglia, and 7735 neuron independent single-nucleus samples) and Lake et al. (ref. 2; n=2524 astrocyte, 4369 oligodendrocyte, 756 microglia, and 25854 neuron independent single-nucleus samples). 1. Habib, N., et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955-958 (2017). 2. Lake, B.B., et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70-80 (2018).

  8. Supplementary Table 5 Cell-class-specific transcriptional differences between CTRL and AD.

    Significant (p < 0.05) cell-class-specific expression differences between CTRL and AD after controlling for differences in cellular abundance in three independent datasets: GSE38350 (ref. 1), GSE44770 (ref. 2), and GSE36980 (ref. 3). Statistical significance was determined for each gene by comparing the differences in t-values (obtained by linear regression for each cell class) between CTRL and AD to differences observed after permuting sample labels (n=1000 permutations). 1. Cribbs, D.H., et al. Extensive innate immune gene activation accompanies brain aging, increasing vulnerability to cognitive decline and neurodegeneration: a microarray study. J. Neuroinflammation 9, 179 (2012). 2. Zhang, B., et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer's disease. Cell 153, 707-720 (2013). 3. Hokama, M., et al. Altered expression of diabetes-related genes in Alzheimer's disease brains: the Hisayama study. Cereb. Cortex 24, 2476-2488 (2014).

  9. Supplementary Table 6 Cell-class-specific transcriptional differences among human CNS regions.

    Significant (p < 2.67 × 10^−8; Bonferroni correction for the total number of gene models) cell-class-specific expression differences among human CNS regions (workflow shown in Fig. 7F).

  10. Supplementary Table 7 Genome-wide expression fidelity for major CNS cell classes in mice.

    Gene identifiers, dataset representations, sample sizes, cell-class fidelity metrics, and mean expression percentiles for 18739 genes.

  11. Supplementary Table 8 Cell-class-specific transcriptional differences among species.

    Predicted cell-class-specific transcriptional differences between humans and mice with gene identifier information, cell-class modeling statistics, and expression percentile values.

  12. Supplementary Table 9 RNAscope single-molecule FISH probe information.
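
The description of Supplementary Table 5 above outlines a permutation test: the difference in regression t-values between CTRL and AD is compared with the differences obtained after permuting sample labels (n=1000 permutations). The following is a minimal sketch of that idea on simulated data, not the authors' code. The two-sided comparison, the add-one p-value estimate, the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def slope_t(y, x):
    """t-statistic of the slope from simple linear regression of y on x."""
    n = len(y)
    xc = x - x.mean()
    slope = np.dot(xc, y - y.mean()) / np.dot(xc, xc)
    resid = (y - y.mean()) - slope * xc
    se = np.sqrt(np.dot(resid, resid) / (n - 2) / np.dot(xc, xc))
    return slope / se


def permutation_p_value(expr, abundance, is_ad, n_perm=1000, seed=0):
    """Permutation p-value for the AD-minus-CTRL difference in t-values.

    Assumptions (not stated in the source): the test is two-sided and the
    p-value uses the add-one estimate (1 + hits) / (n_perm + 1).
    """
    rng = np.random.default_rng(seed)

    def t_diff(labels):
        t_ad = slope_t(expr[labels], abundance[labels])
        t_ctrl = slope_t(expr[~labels], abundance[~labels])
        return t_ad - t_ctrl

    observed = t_diff(is_ad)
    # Shuffle the CTRL/AD labels while keeping group sizes fixed.
    null = np.array([t_diff(rng.permutation(is_ad)) for _ in range(n_perm)])
    hits = np.sum(np.abs(null) >= abs(observed))
    return observed, (1 + hits) / (n_perm + 1)


# Toy data (hypothetical): 60 CTRL and 60 AD samples.
rng = np.random.default_rng(1)
is_ad = np.repeat([False, True], 60)
abundance = rng.uniform(0.1, 0.9, is_ad.size)

# A gene that tracks cell-class abundance in AD samples only.
expr = np.where(is_ad, 3.0 * abundance, 0.0) + rng.normal(0, 0.3, is_ad.size)

observed, p = permutation_p_value(expr, abundance, is_ad)
print(f"observed t-value difference: {observed:.1f}, permutation p: {p:.4f}")
```

With 1000 permutations, the smallest p-value this estimate can return is 1/1001, so any stricter threshold would require more permutations.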

About this article

DOI

https://doi.org/10.1038/s41593-018-0216-z