Resource

Coexpression networks identify brain region–specific enhancer RNAs in the human brain

Nature Neuroscience volume 18, pages 1168–1174 (2015)

Abstract

Despite major progress in identifying enhancer regions on a genome-wide scale, the majority of available data are limited to model organisms and human transformed cell lines. We have identified a robust set of enhancer RNAs (eRNAs) expressed in the human brain and constructed networks assessing eRNA-gene coexpression interactions across human fetal brain and multiple adult brain regions. Our data identify brain region–specific eRNAs and show that enhancer regions expressing eRNAs are enriched for genetic variants associated with autism spectrum disorders.

Acknowledgements

The authors are grateful to P. Carninci and members of the RIKEN Omics Science Center for helpful discussions; the RIKEN GeNAS facility for library preparation and sequencing data preprocessing; S. Miyauchi for technical support in the initial stages of the project; K. Morris, I. Dawes and B. Ballard for critically reading the manuscript; and G. Sutton for editorial assistance. This work was supported by a NARSAD young investigator award, a JSPS Grant-in-Aid, an NHMRC project grant (APP1062510) and an ARC DECRA fellowship (DE140101033) to I.V.

Author information

Author notes

    • Pu Yao
    • Peijie Lin

    These authors contributed equally to this work.

Affiliations

  1. School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia.

    • Pu Yao
    • Peijie Lin
    • Akira Gokoolparsadh
    • Amelia Assareh
    • Irina Voineagu
  2. QFAB Bioinformatics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia.

    • Mike W C Thang

Contributions

P.Y. analyzed the CAGE and RNA-seq data; I.V. carried out the coexpression network analysis; P.L. carried out the GWAS set enrichment analyses; A.G., A.A. and M.W.C.T. contributed to the data analysis. I.V. conceived the project and supervised all aspects of the project. I.V. wrote the paper with input from all authors.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Irina Voineagu.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–5

  2. Supplementary Methods Checklist

Excel files

  1. Supplementary Table 1: Intergenic and intronic brain-expressed enhancers (BEEs)

    This table lists intergenic and intronic BEEs (genomic coordinates, annotation and mean expression values using the CAGE and RNA-seq data generated in this study). For replicated BEEs, the table also includes data on their overlap with H3K4me1 and H3K27ac peaks from Zhu et al. 2013 and on brain region–specific expression using FANTOM5 CAGE data (see Methods). Robust brain-expressed enhancers (rBEEs) are shown in bold.

    • nSamplesCAGE: number of CAGE samples in which the BEE was expressed at >0.5 TPM.
    • AvgCAGEExp: average BEE expression across the 4 CAGE samples (Frontal, Temporal, Occipital and Cerebellum).
    • nSamplesRNASeq: number of RNA-seq samples in which the BEE was expressed at >0.5 TPM.
    • AvgRNASeqExp: average BEE expression of the enhancer region across the 4 RNA-seq samples (Frontal, Temporal, Occipital and Cerebellum).
    • H3K27ac, H3K4me1: whether the BEE overlaps a chromatin modification peak in inferior temporal gyrus, midfrontal gyrus or anterior caudate nucleus, based on data from Zhu et al. 2013.
    • enChr: whether the BEE overlaps either an H3K27ac or an H3K4me1 peak.
    • Fetal brain, amygdala, caudate.nucleus, cerebellum, temporal, globus.pallidus, hippocampus, locus.coeruleus, medulla.oblongata, occipital, parietal, thalamus: these columns list the number of FANTOM5 samples from each category showing expression levels above 0.5 TPM.
    • Nreg: the number of brain regions with expression values >0.5 TPM in at least 2 distinct samples from the same region (for simplicity, fetal brain was included as a category/region).

  2. Supplementary Table 2: RNA samples used for CAGE and RNA-seq

    Source, Catalogue Number, Organism, Gender, Age (years), RNA concentration and RIN information for the RNA samples used for CAGE and RNA-seq data generated in this study (RIN: RNA integrity number measured by an Agilent Bioanalyzer).

  3. Supplementary Table 3: Summary of RNA-seq and CAGE data

    Number of sequencing reads: total number of reads obtained for RNA-seq and CAGE respectively. Number of mapped reads (mapQ >10): number of sequencing reads mapped to the human genome, after filtering for a mapping quality score (mapQ) >10. Mapping rate: the ratio between the number of mapped reads with mapQ >10 and the total number of reads.

  4. Supplementary Table 4: Expression of robust brain-expressed enhancers (rBEEs) in primary cultured cells

    Chr, start, end: hg19 genomic coordinates. Annot: annotation as intergenic or intronic. Astrocytes (TPM), Neurons (TPM): mean normalized expression values (tags per million) in cultured astrocytes and neurons respectively.

  5. Supplementary Table 5: Percentage of protein-encoding genes expressed in cultured cells at varying expression thresholds

    "+","-": presence or absence of expression above the corresponding threshold in each type of cultured cells.

  6. Supplementary Table 6: List of published enhancer data sets used for GWAS enrichment analyses

  7. Supplementary Table 7: GWAS enrichment P values

    Columns: GWAS datasets (ADD: attention deficit disorder, ASD: autism spectrum disorders, BPI: bipolar illness, MDD: major depressive disorder, SCZ: schizophrenia, BMI: body mass index). Rows: enhancer sets (see Results and Methods for details on GWAS datasets and enhancer sets). ROADMAP_brain_clusters denotes the union of the 17 ROADMAP brain clusters. p-values were obtained by 1000 random permutations of SNP labels and were Bonferroni corrected for multiple comparisons for each GWAS dataset (see Methods for details). p-values < 0.05 are highlighted in red.

  8. Supplementary Table 8: Disease-associated genes

    This table contains the list of genes implicated in neuropsychiatric and neurodevelopmental disorders used for overrepresentation analyses.

  9. Supplementary Table 9: FANTOM5 samples used for WGCNA

    List of FANTOM5 samples used for the co-expression network analysis. Library IDs and sample descriptions were obtained from: http://fantom.gsc.riken.jp/5/data/

  10. Supplementary Table 10: WGCNA kME values and module assignment

    Module Label, Module Color: numeric and color labels assigned to co-expression modules. kME: module membership value (see Methods for details). pvalBH: Benjamini and Hochberg corrected p-values for module membership.

  11. Supplementary Table 11: WGCNA module eigengene values

  12. Supplementary Table 12: Coexpression module annotation

    N rBEE, N Genes: number of rBEEs and genes included in the module, respectively. Annotation: brain region or developmental stage for which the co-expression module shows significantly higher module eigengene values (modules without significant p-values for any brain region or developmental stage are not included in the list). Module Eigengene Significance: Benjamini and Hochberg corrected Wilcoxon test p-values for module eigengene significance (see Methods). Enrichment p-value: Benjamini and Hochberg corrected hypergeometric test p-values for brain region marker enrichment (see Methods). The description of brain region marker lists is available at: http://www.inside-r.org/packages/cran/WGCNA/docs/userListEnrichment. For each co-expression module, only the most significantly enriched brain marker list relevant to the module annotation is shown. The region marker lists only include adult brain regions, so no value is reported for fetal brain modules.

  13. Supplementary Table 13: Coexpression network properties of robust brain-expressed enhancers

    This table lists the WGCNA module membership for rBEEs (rBEE module, rBEE kME) and their closest genes (closestGene Module, closestGene kME). TO: topological overlap value for the rBEE-closest gene pair. Chr, start, end: rBEE hg19 chromosomal coordinates. Annot: rBEE annotation as intergenic or intronic. closestGeneDist: the distance between rBEE and its closest gene, in number of base-pairs.

  14. Supplementary Table 14: Topological overlap values for intergenic rBEE and the top 20 coexpressed genes located in cis (i.e., within 500 MB)

    TO: topological overlap values (see Methods).
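
The expression-count columns described for Supplementary Table 1 (per-region sample counts above 0.5 TPM, and Nreg) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the input layout (a list of region/TPM pairs per enhancer) and the example values are assumptions made purely for illustration. Only the 0.5 TPM cutoff and the "at least 2 distinct samples from the same region" rule come from the table description.

```python
from collections import defaultdict

TPM_THRESHOLD = 0.5  # expression cutoff stated in the table description
MIN_SAMPLES = 2      # "at least 2 distinct samples from the same region"


def count_expressed(tpm_values, threshold=TPM_THRESHOLD):
    """Count samples whose expression is strictly above the threshold."""
    return sum(1 for value in tpm_values if value > threshold)


def summarize_enhancer(samples):
    """Summarize one enhancer from (region, tpm) pairs (hypothetical layout).

    Returns a dict of per-region counts of samples above the threshold,
    and Nreg, the number of regions with at least MIN_SAMPLES such samples.
    """
    by_region = defaultdict(list)
    for region, tpm in samples:
        by_region[region].append(tpm)
    counts = {region: count_expressed(values) for region, values in by_region.items()}
    nreg = sum(1 for count in counts.values() if count >= MIN_SAMPLES)
    return counts, nreg


# Hypothetical expression values, for illustration only.
example = [
    ("cerebellum", 1.2), ("cerebellum", 0.9), ("cerebellum", 0.1),
    ("hippocampus", 0.7), ("hippocampus", 0.3),
    ("fetal brain", 2.0), ("fetal brain", 0.6),
]
counts, nreg = summarize_enhancer(example)
print(counts)
print(nreg)
```

In this toy input, cerebellum and fetal brain each have two samples above the cutoff and hippocampus has one, so Nreg is 2. Fetal brain is counted as a region, as the table description specifies.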

Zip files

  1. Supplementary Code: Supplementary data analysis code

    These scripts contain annotated data analysis code for identifying rBEE regions, WGCNA network analysis and GWAS set enrichment analysis. See Readme.pdf for details on each script.
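
For readers without access to the supplementary scripts, the permutation-based P values described for Supplementary Table 7 (1000 random permutations of SNP labels, followed by Bonferroni correction) can be sketched in Python as follows. This is a hedged illustration, not the published method: the set statistic (mean association score of SNPs in the enhancer set), the add-one correction to the empirical P value, and all function and variable names are assumptions. The actual statistic is described in the paper's Methods and supplementary code.

```python
import random


def permutation_p_value(snp_scores, in_set, n_perm=1000, seed=0):
    """Empirical P value for enrichment of high scores among SNPs in a set.

    snp_scores: per-SNP association scores, higher meaning stronger
                (for example -log10 of the GWAS P value).
    in_set:     booleans marking the SNPs that fall in the enhancer set.
    The mean-score statistic is an assumption made for this sketch.
    """
    n_in = sum(in_set)
    if n_in == 0:
        raise ValueError("the enhancer set contains no SNPs")
    rng = random.Random(seed)
    observed = sum(s for s, m in zip(snp_scores, in_set) if m) / n_in
    labels = list(in_set)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute SNP labels, keeping the set size fixed
        stat = sum(s for s, m in zip(snp_scores, labels) if m) / n_in
        if stat >= observed:
            hits += 1
    # Add-one correction (an assumption) keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical data: 100 SNPs, the 10 strongest fall in the enhancer set.
scores = list(range(100))
membership = [i >= 90 for i in range(100)]
p_enriched = permutation_p_value(scores, membership)
print(p_enriched)
print(bonferroni([0.25, 0.75]))
```

Permuting the labels rather than resampling SNPs keeps the size of the enhancer set constant across permutations, so each permuted statistic is comparable to the observed one.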

About this article

DOI

https://doi.org/10.1038/nn.4063
