This article has been updated

Abstract

The structure and function of the human brain are highly stereotyped, implying a conserved molecular program responsible for its development, cellular structure and function. We applied a correlation-based metric called differential stability to assess reproducibility of gene expression patterning across 132 structures in six individual brains, revealing mesoscale genetic organization. The genes with the highest differential stability are highly biologically relevant, with enrichment for brain-related annotations, disease associations, drug targets and literature citations. Using genes with high differential stability, we identified 32 anatomically diverse and reproducible gene expression signatures, which represent distinct cell types, intracellular components and/or associations with neurodevelopmental and neurodegenerative disorders. Genes in neuron-associated compared to non-neuronal networks showed higher preservation between human and mouse; however, many diversely patterned genes displayed marked shifts in regulation between species. Finally, highly consistent transcriptional architecture in neocortex is correlated with resting state functional connectivity, suggesting a link between conserved gene expression and functionally relevant circuitry.
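As a rough illustration of the correlation-based metric, the sketch below assumes that differential stability for one gene is the mean Pearson correlation, over all pairs of brains, of that gene's expression profile across a shared set of structures. The function names and toy values are hypothetical and are not taken from the authors' code.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def differential_stability(profiles):
    """Mean pairwise Pearson correlation between brains (assumed definition).

    profiles: one list per brain, each holding the gene's expression in the
    same ordered set of structures (assumed to be sampled in every brain).
    """
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: 3 brains x 4 structures for each of two genes.
patterned = [[1.0, 5.0, 2.0, 8.0], [1.2, 4.8, 2.1, 7.9], [0.9, 5.3, 1.8, 8.2]]
unpatterned = [[3.0, 3.1, 2.9, 3.0], [3.1, 2.9, 3.0, 3.0], [2.9, 3.0, 3.1, 3.0]]

print(differential_stability(patterned))    # close to 1: reproducible pattern
print(differential_stability(unpatterned))  # low: no shared pattern
```

A gene whose regional pattern repeats across brains scores near 1, while a gene whose small fluctuations do not line up between brains scores low, even though its overall expression level is stable.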


Change history

  • 31 August 2017

    In the version of this article initially published, the third and fourth paragraphs of Online Methods section “Differential stability in cortex and resting state network analysis” read as follows:

    The next step was to map the Allen Human Brain Atlas (AHBA) tissue samples to the HCP 52 region parcellation so that comparison could be made. Using the MNI centroid coordinate of the AHBA samples, and by manually examining each of the AHBA tissue samples using the online tools, one can assign a set of HCP space voxels to each AHBA tissue sample. As each of the 52 parcels is composed of a set of voxels, we now have potentially one-to-many map from AHBA tissue to HCP parcels. If all ABHA tissue samples belong to a common HCP parcel, we average the gene expression of that tissue in the corresponding parcel. However, some of the 52 parcels represent smaller regions of the brain and therefore there is no unique assignment of AHBA gene expression tissue samples to that region. Therefore, if a collection of AHBA tissue samples intersects more than one region, we average the gene expression values as before but fractionally weight the expression contribution to each of the interesting HCP parcels. This has the effect of allowing some assignment of expression without overweighting non-unique samples. Supplementary Table 12 gives the sample distribution by parcels as well as the uniquely assigned samples.

    To obtain the expression correlation matrix for a given gene (Fig. 7c, right panel), we transformed the expression values of that gene into z-scores over all the sampled brain regions (averaging sample data for those samples contained in the same parcel) and calculated the coexpression as the outer product of this z-score vector. Thus, if two regions both show high expression or low expression of the gene of interest, they will have a high positive coexpression value for that gene, whereas if they show opposite expression patterns, they will have a large negative value for that gene. After generating these matrices, we compared each of the 17,348 gene coexpression matrices to the parcellated connectome matrix by calculating the Pearson's correlation between the vectorized elements above the diagonal of the matrices (Fig. 7d). We also obtained a significance value for each gene-connectivity comparison using the randomized gene coexpression matrices. Supplementary Table 12 gives the complete distribution of tissue samples by HCP parcel for the 52 regions and the functional genetic correlations and P-values.

    In the current version, these paragraphs have been rewritten to unambiguously explain how each RSN parcel was mapped to the AHBA samples. The original version did not clearly delineate the approach for each of the three possible cases in which RSN parcels could overlap the AHBA samples. The new text also has an additional paragraph describing the rationale behind the two sets of P-values included in Supplementary Table 12. The error has been corrected in the HTML and PDF versions of the article.
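The coexpression step in the originally published text quoted above (z-score a gene across parcels, take the outer product, then correlate the above-diagonal entries with the connectome matrix) can be sketched as follows. This is only an illustration under stated assumptions: a population standard deviation is used for the z-scores, the four-parcel expression vector and connectivity matrix are hypothetical, and the randomization-based significance test is not shown.

```python
from math import sqrt


def zscores(values):
    """Z-score a non-constant vector (population SD; an assumption)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def coexpression_matrix(expr):
    """Outer product of the z-scored, parcel-averaged expression vector."""
    z = zscores(expr)
    return [[zi * zj for zj in z] for zi in z]


def upper_triangle(matrix):
    """Vectorize the elements strictly above the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


# Hypothetical data: one gene in 4 parcels, and a 4 x 4 connectivity matrix.
expr = [2.0, 2.2, 0.5, 0.4]
conn = [
    [1.0, 0.8, -0.3, -0.4],
    [0.8, 1.0, -0.2, -0.3],
    [-0.3, -0.2, 1.0, 0.7],
    [-0.4, -0.3, 0.7, 1.0],
]

coexpr = coexpression_matrix(expr)
r = pearson(upper_triangle(coexpr), upper_triangle(conn))
print(r)  # positive: parcels with similar expression are also connected
```

Parcels on the same side of the gene's mean expression get a positive coexpression entry and parcels on opposite sides get a negative one, which is the behavior the quoted paragraph describes.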

References

  1. The HapMap and genome-wide association studies in diagnosis and therapy. Annu. Rev. Med. 60, 443–456 (2009).

  2. Genome-wide association studies: potential next steps on a genetic journey. Hum. Mol. Genet. 17, R156–R165 (2008).

  3. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).

  4. Spatio-temporal transcriptome of the human brain. Nature 478, 483–489 (2011).

  5. Functional organization of the transcriptome in human brain. Nat. Neurosci. 11, 1271–1282 (2008).

  6. Gene expression analyses reveal molecular relationships among 20 regions of the human CNS. Neurogenetics 7, 67–80 (2006).

  7. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature 474, 380–384 (2011).

  8. Preservation of ranking order in the expression of human Housekeeping genes. PLoS ONE 6, e29314 (2011).

  9. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).

  10. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, 17 (2005).

  11. Transcriptional landscape of the prenatal human brain. Nature 508, 199–206 (2014).

  12. Conservation and evolution of gene coexpression networks in human and chimpanzee brains. Proc. Natl. Acad. Sci. USA 103, 17973–17978 (2006).

  13. Gene coexpression networks in human brain identify epigenetic modifications in alcohol dependence. J. Neurosci. 32, 1884–1897 (2012).

  14. Transcriptional architecture of the primate neocortex. Neuron 73, 1083–1099 (2012).

  15. Screening the human protocadherin 8 (PCDH8) gene in schizophrenia. Genes Brain Behav. 1, 187–191 (2002).

  16. Neuroscience in the era of functional genomics and systems biology. Nature 461, 908–915 (2009).

  17. Human-specific transcriptional networks in the brain. Neuron 75, 601–617 (2012).

  18. The role of Foxg1 and dorsal midline signaling in the generation of Cajal-Retzius subtypes. J. Neurosci. 27, 11103–11111 (2007).

  19. Generation of Cajal-Retzius neurons in mouse forebrain is regulated by transforming growth factor beta-Fox signaling pathways. Dev. Biol. 313, 35–46 (2008).

  20. Adult mouse brain gene expression patterns bear an embryologic imprint. Proc. Natl. Acad. Sci. USA 102, 10357–10362 (2005).

  21. Autworks: a cross-disease network biology application for autism and related disorders. BMC Med. Genomics 5, 56 (2012).

  22. Is my network module preserved and reproducible? PLoS Comput. Biol. 7, e1001057 (2011).

  23. A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J. Neurosci. 28, 264–278 (2008).

  24. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 37, W305–W311 (2009).

  25. Identification of new putative susceptibility genes for several psychiatric disorders by association analysis of regulatory and non-synonymous SNPs of 306 genes involved in neurotransmission and neurodevelopment. Am. J. Med. Genet. B Neuropsychiatr. Genet. 150B, 808–816 (2009).

  26. Association study of 182 candidate genes in anorexia nervosa. Am. J. Med. Genet. B Neuropsychiatr. Genet. 153B, 1070–1080 (2010).

  27. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168–176 (2007).

  28. Correlated gene expression supports synchronous activity in brain networks. Science 348, 1241–1244 (2015).

  29. The WU-Minn Human Connectome Project: an overview. Neuroimage 80, 62–79 (2013).

  30. Functional connectomics from resting-state fMRI. Trends Cogn. Sci. 17, 666–682 (2013).

  31. The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J. Neurophysiol. 106, 1125–1165 (2011).

  32. The MET oncogene is a functional marker of a glioblastoma stem cell subtype. Cancer Res. 72, 4537–4550 (2012).

  33. Expression of complement messenger RNAs and proteins by human oligodendroglial cells. Glia 42, 417–423 (2003).

  34. A polymorphism in the complement component C1r is not associated with sporadic Alzheimer's disease. Neurosci. Lett. 336, 101–104 (2003).

  35. Human postmortem brain-derived cerebrovascular smooth muscle cells express all genes of the classical complement pathway: a potential mechanism for vascular damage in cerebral amyloid angiopathy and Alzheimer's disease. Microvasc. Res. 75, 411–419 (2008).

  36. Divergence of human and mouse brain transcriptome highlights Alzheimer disease pathways. Proc. Natl. Acad. Sci. USA 107, 12698–12703 (2010).

  37. Divergent and nonuniform gene expression patterns in mouse brain. Proc. Natl. Acad. Sci. USA 107, 19049–19054 (2010).

  38. Astrocytic complexity distinguishes the human brain. Trends Neurosci. 29, 547–553 (2006).

  39. Large-scale cellular-resolution gene profiling in human neocortex reveals species-specific molecular signatures. Cell 149, 483–496 (2012).

  40. Consistent resting-state networks across healthy subjects. Proc. Natl. Acad. Sci. USA 103, 13848–13853 (2006).

  41. Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proc. Natl. Acad. Sci. USA 100, 253–258 (2003).

  42. Correlated gene expression supports synchronous activity in brain networks. Science 348, 1241–1244 (2015).

  43. Relationships between gene expression and brain wiring in the adult rodent brain. PLoS Comput. Biol. 7, e1001049 (2011).

  44. Large-scale analysis of gene expression and connectivity in the rodent brain: insights through data integration. Front. Neuroinform. 5, 12 (2011).

  45. Resting-state fMRI in the Human Connectome Project. Neuroimage 80, 144–168 (2013).

  46. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics 24, 719–720 (2008).

  47. Strategies for aggregating gene expression data: the collapseRows R function. BMC Bioinformatics 12, 322 (2011).

  48. Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers. Neuroimage 90, 449–468 (2014).

  49. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Trans. Med. Imaging 23, 137–152 (2004).

Acknowledgements

The authors thank the Allen Institute for Brain Science founders, Paul G. Allen and Jody Allen, for their vision, encouragement and support. Research was supported by the Allen Institute for Brain Science. We also gratefully acknowledge support from the US National Institute on Drug Abuse, grant 4R33DA027644; D. Wall of Stanford University School of Medicine; and 1U54MH091657 (NIH Blueprint for Neuroscience Research).

Author information

Author notes

    Michael Hawrylycz, Jeremy A Miller & Vilas Menon

    These authors contributed equally to this work.

Affiliations

  1. The Allen Institute for Brain Science, Seattle, Washington, USA.

    Michael Hawrylycz, Jeremy A Miller, Vilas Menon, David Feng, Tim Dolbeare, Angela L Guillozet-Bongaarts, Chang-Kyu Lee, Amy Bernard, Aaron Szafer, Forrest Collman, Stefan Mihalas, Zizhen Yao, John Phillips, Lydia Ng, Chinh Dang, Allan Jones, Christof Koch & Ed Lein

  2. Division of Biomedical Informatics, Cincinnati Children's Hospital and Medical Center, Cincinnati, Ohio, USA.

    Anil G Jegga & Bruce J Aronow

  3. Department of Anatomy and Neurobiology, Washington University, St. Louis, Missouri, USA.

    Matthew F Glasser, Donna L Dierker & David C Van Essen

  4. Center for Complex Networks Research, Northeastern University, Boston, Massachusetts, USA.

    Jörg Menche & Albert-László Barabási

  5. Department of Physics, Northeastern University, Boston, Massachusetts, USA.

    Jörg Menche & Albert-László Barabási

  6. Center for Network Science, Central European University, Budapest, Hungary.

    Jörg Menche & Albert-László Barabási

  7. Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Jiangsu, China.

    Pascal Grange

  8. Department of Electrical Engineering and Computing Systems, University of Cincinnati, Cincinnati, Ohio, USA.

    Kenneth A Berman

  9. Institute for Protein Design, University of Washington, Seattle, Washington, USA.

    Lance Stewart

  10. Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA.

    Albert-László Barabási

  11. Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

    Albert-László Barabási

  12. Department of Neuroscience, Georgetown University, Washington, DC, USA.

    Jay Schulkin

  13. Department of Radiology, The University of Washington, Seattle, Washington, USA.

    David R Haynor

Contributions

M.H., J.A.M. and V.M. performed the primary analyses, with supporting analyses by A.L.G.-B., F.C., K.A.B., P.G., Z.Y., L.S., A.-L.B. and J.S. Graphics and networks analysis were done by D.F., T.D., L.N., C.D. and J.A.M. Annotation analysis was done by A.G.J. and B.J.A. M.F.G. and D.L.D. performed resting state analysis with V.M., S.M. and D.C.V.E. Data processing and normalization were done by C.-K.L., L.N., C.D., A.B., J.P., A.S. and M.H. J.A.M., D.R.H., D.C.V.E., A.J., C.K. and E.L. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Michael Hawrylycz or Ed Lein.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–9, Supplementary Table 1 and Supplementary Analysis

  2. Supplementary Methods Checklist

  3. Supplementary Table 1: Neuroanatomical sampling overview of the Allen Human Brain Atlas and of each analysis in the manuscript.

    A hierarchical ontology spanning all major architectural subdivisions was created to support the microarray sampling strategy. Each structure in this tree is designated a specific RGB color used throughout this paper. The number of samples isolated from each brain for each brain region is shown in the first worksheet, along with a summary of the specific subdivisions sampled and the sample isolation method used (Macro = scalpel macrodissection, LMD = laser microdissection). The complete hierarchical ontology and fine structure sampling for each brain is provided in the second worksheet, collapsed down to a single column (the complete version is available at the Allen Brain Atlas data portal, or can be reconstructed using the “structure ID” and “parent structure ID” columns). The table contains the structure ID, acronym, hemisphere, color hex triplet, and the number of samples for that structure in brains 1–6. Asterisks (*) in the “structure acronym” column indicate the 96 brain regions that were sampled sufficiently to be included in the analysis in Figure 6. Additionally, the “subregion for analysis” column specifies the portions of the ontology which were averaged together to form the 132 broad brain structures in the analyses in Figures 2–5. Asterisks (*) in this column indicate the 65 brain regions sampled across all six brains that were shown in visualizations (** note: CPLV, Pa, and CGS were present in fewer than 6 samples but were also shown in the visualizations to highlight ependymal regions in module M26). The third worksheet shows which mouse structures were matched with each human structure for comparison between species (Figure 6). Structures listed in the “Mouse counterparts” column are structure names from the ontology of the Allen Mouse Brain Atlas. A red “none” indicates that the listed human region did not have comparable structures in mouse. Note that, although cc and RaM had comparable mouse counterparts, ISH quantification was not available for these structures.

CSV files

  1. Supplementary Table 2: Differential stability metrics for every gene in the analysis.

    Several different metrics of differential stability are provided for all 17,348 genes included in the analysis. Genes and the corresponding probes are listed and ordered descending by Pearson correlation (this is the metric used in the manuscript). Alternative metrics include MaxDiff (the maximum occurring differential between pairs of structures), Tau (the average Kendall Tau correlation across regions for each pair of brains), Euclid (the average Euclidean distance between expression levels in pairs of brains), and AvgVar (the average across-region variability between the six brains).

  2. Supplementary Table 3: Genes with high expression but low variability are enriched for housekeeping functions.

    2,236 genes have expression in the lowest quartile of variability and also have relatively low differential stability (DS<0.5). These genes are sorted descending by DS and also include their associated probes, their average log2 expression levels, and log2 standard deviations. These stable (but not differentially stable) genes are enriched for housekeeping functions such as RNA binding (p<3.26e-21), KEGG spliceosome pathway (p<6.4e-13), and mitochondrial ribosomal proteins (p<1.32e-10).

  3. Supplementary Table 4: Enrichment analysis for the top 10th percentile set of genes (n=1735) ranked by differential stability.

    The complete list of enrichments for the blue bars shown in Figure 2d, including significant enrichments for gene ontology categories, transcription factor binding sites, miRNA targets, and drug targets. The first three columns indicate the category, ID, and name of the enrichment list. The next four columns show p-values from a hypergeometric test for enrichment and q-values after correction for multiple comparison, as well as the overlapping and total number of genes in the list. The final column shows all overlapping genes in each category.

  4. Supplementary Table 5: Disease enrichments for the top 10th percentile set of DS genes are primarily brain related.

    The complete list of significant disease enrichments for the data shown in Figure 2f, based on 2289 gene sets from the Autworks database. Column B shows the disease tested for enrichment. The next two columns show Bonferroni corrected p-values from a hypergeometric test for enrichment, as well as the overlapping number of genes in the list. The final column shows all overlapping genes in each category.

  5. Supplementary Table 6: Module assignments and eigengene correlations for each gene assigned to a module in the consensus network.

    The great majority of all genes (90.1%, 15,627) are correlated with 32 modules with ME correlation > 0.4. This table includes these genes and their associated probes, their final module assignment color (column C), label (column D), and recoloring based on neuronal content (column E). Each gene's correlation to the corresponding ME is shown (column F) along with the DS metric (column G; reproduced from Supplementary Table 2). Finally, the initial module assignment is included (column H) to allow regeneration of the ME from the expression data.

  6. Supplementary Table 7: Per module count of marker genes for cell type and subcellular compartment.

    The number of genes showing at least 1.5-fold enrichment for astrocytes (column C), neurons (column D), and oligodendrocytes (column E; ref. 23) is unevenly distributed across modules. Modules are labeled based on the percent of neuron-related genes (100 * column D / column B) in each module (see Fig. 3c). The number of genes in each module associated with discrete cellular subcompartments was also determined (columns F-O; Foster, LJ, de Hoog, CL, Zhang, Y, Zhang, Y, Xie, X, Mootha, VK, Mann, M. 2006. A Mammalian Organelle Map by Protein Correlation Profiling. Cell 125-1: 187-199); however, these data were not used in the manuscript.

  7. Supplementary Table 8: Complete module enrichments based on ToppGene lists.

    The complete list of enrichments for the heatmap shown in Figure 3e. We used the ToppGene portal to identify significant enrichments in gene ontology, pathways, cytoband, disease association, transcription factor binding sites, micro RNAs, drug targets, and protein-protein interactions. The first three columns indicate the category, ID, and name of the enrichment list. The next four columns show p-values from a hypergeometric test for enrichment and q-values after correction for multiple comparison, as well as the overlapping and total number of genes in the list. Column H shows the module being tested. The final column shows all overlapping genes in each category. All enrichments shown have an FDR corrected q-value < 0.05.

  8. Supplementary Table 11: Local differential stability metrics for 20 brain regions.

    Local differential stability metrics are provided for all 17,348 genes included in the analysis. Genes are listed and ordered descending by Pearson correlation (the metric used in the manuscript). The remaining columns list differential stability calculated using only the subregions of a particular brain region. Note that for cortical and cerebellar regions, these metrics include the full set of subregions and not the data averaged by lobe. Figure 7 uses the DS values for cerebral cortex that are listed in column E.
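Several of the enrichment tables above report p-values from a hypergeometric test together with multiple-comparison-corrected q-values. The sketch below shows one common way to compute such values, assuming an upper-tail (over-representation) test and a Benjamini-Hochberg adjustment; the exact procedure used by the ToppGene portal may differ, and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, list_size, category_size, universe):
    """P(X >= overlap) for a gene list of list_size drawn from universe genes,
    of which category_size carry the annotation (upper-tail test; assumed)."""
    upper = min(list_size, category_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(category_size, k) * comb(universe - category_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical counts: 15 of 100 listed genes fall in a 50-gene category
# drawn from a universe of 1,000 genes.
p = hypergeom_pvalue(15, 100, 50, 1000)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Exact integer binomial coefficients keep the tail sum free of floating-point underflow for small examples; for a universe of 17,348 genes a library routine would likely be faster.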

Excel files

  1. Supplementary Table 9: Enrichment analysis for the 302 genes with high DS that are not assigned to any module.

    The complete list of enrichments for the heatmap shown in Figure 3e. The “Singular Genes” tab lists the genes and their differential stability metric, while the “Annotation” tab lists the enrichments. We used the ToppGene portal to identify significant enrichments in gene ontology, pathways, transcription factor binding sites, and other categories. The first four columns indicate the category, ID, and name of the enrichment list, as well as the source of the category. The next four columns show p-values from a hypergeometric test for enrichment and q-values after correction for multiple comparison, as well as the overlapping and total number of genes in the list. The final column shows all overlapping genes in each category. All enrichments shown have a B&H FDR corrected q-value < 0.01.

  2. Supplementary Table 10: Conserved or non-conserved expression patterning between mouse and human.

    All 2,651 genes with reliable expression patterns in both mouse and human data sets (Methods), as well as their corresponding human probes, are shown. Column C indicates whether the pattern agrees between species (correlated to the correct module eigengene in mouse with ρ > 0.4), or disagrees between species (correlated to the correct module eigengene in mouse with ρ < 0.4, but highly correlated to a different module with ρ > 0.8). All remaining genes, which cannot be definitively listed as agreeing or strongly disagreeing between species, are listed as ambiguous (or uncorrelated). The final column lists the original module assignment in human. Note that unassigned genes which are correlated to any module with ρ > 0.8 in mouse are listed as disagreeing between species.

  3. Supplementary Table 12: Parcel assignment from AHBA to Human Connectome functional imaging parcellation.

    For each of the 52 parcels of the functional connectome, the number of samples from each of the 6 AHBA brains is shown, followed by the average number over the 6 brains. As each AHBA parcel may belong to multiple regions, the next six columns give the number of unique AHBA parcels assigned to each functional connectome parcel. The final two columns give the size in voxels of the corresponding functional connectome parcel and the percentage of total cortex voxels of that parcel. Analyses in the main manuscript are presented for all AHBA parcels and in the Supplementary methods using unique parcels.
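The agree/disagree/ambiguous labels described for Supplementary Table 10 follow a simple threshold rule, which can be sketched as below. This is a reading of the table description rather than the authors' code: the function name, the dictionary layout and the module labels other than M26 are hypothetical, and a correlation of exactly 0.4 (not covered by the description) is treated here as not agreeing.

```python
def classify_gene(human_module, mouse_corr, agree_cut=0.4, strong_cut=0.8):
    """Label a gene's mouse-human patterning as agree, disagree or ambiguous.

    human_module: module assigned in human, or None for an unassigned gene.
    mouse_corr: dict mapping module label to the correlation between the
        gene's mouse expression pattern and that module's eigengene.
    """
    own = mouse_corr.get(human_module) if human_module is not None else None
    if own is not None and own > agree_cut:
        return "agree"
    # Not matched to its own module: look for a strong match elsewhere.
    others = [r for module, r in mouse_corr.items() if module != human_module]
    if any(r > strong_cut for r in others):
        return "disagree"
    return "ambiguous"


# Hypothetical correlations for three genes.
print(classify_gene("M26", {"M26": 0.65, "M01": 0.10}))  # agree
print(classify_gene("M26", {"M26": 0.20, "M01": 0.90}))  # disagree
print(classify_gene("M26", {"M26": 0.20, "M01": 0.50}))  # ambiguous
print(classify_gene(None, {"M26": 0.85, "M01": 0.10}))   # disagree
```

Unassigned genes are handled by the same second branch, matching the note that an unassigned gene correlated to any module with ρ > 0.8 in mouse is listed as disagreeing.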

Zip files

  1. Supplementary Data Set 1

    Zip file containing code and input required to reproduce figures.

  2. Supplementary Data Set 2

    Zip file containing truncated and summarized gene expression data, which is used along with Supplementary Data Set 1.

About this article

DOI

https://doi.org/10.1038/nn.4171
