We carried out an integrative analysis of enhancer landscape and gene expression dynamics during hematopoietic differentiation using DNase-seq, histone mark ChIP-seq and RNA sequencing to model how the early establishment of enhancers and regulatory locus complexity govern gene expression changes at cell state transitions. We found that high-complexity genes—those with a large total number of DNase-mapped enhancers across the lineage—differ architecturally and functionally from low-complexity genes, achieve larger expression changes and are enriched for both cell type–specific and transition enhancers, which are established in hematopoietic stem and progenitor cells and maintained in one differentiated cell fate but lost in others. We then developed a quantitative model to accurately predict gene expression changes from the DNA sequence content and lineage history of active enhancers. Our method suggests a new mechanistic role for PU.1 at transition peaks during B cell specification and can be used to correct assignments of enhancers to genes.
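The two definitions in the abstract, regulatory locus complexity and transition enhancers, can be illustrated with a minimal sketch. This is not the authors' pipeline. The gene names, cell-type labels, toy data and function names below are all hypothetical, and the sketch assumes that enhancers have already been assigned to genes and that "maintained in one differentiated cell fate but lost in others" means accessible in exactly one of the fates being compared.

```python
from collections import defaultdict

# Hypothetical toy input: one tuple per enhancer, giving the gene it is
# assigned to and the set of cell types in which its DHS peak is called.
toy = [
    ("GeneA", {"HSPC", "B"}),
    ("GeneA", {"HSPC", "B", "Mono"}),
    ("GeneA", {"Mono"}),
    ("GeneB", {"HSPC"}),
]


def locus_complexity(enhancers):
    """Count the enhancers assigned to each gene across the whole lineage."""
    counts = defaultdict(int)
    for gene, accessible_in in enhancers:
        if accessible_in:  # skip enhancers that are accessible nowhere
            counts[gene] += 1
    return dict(counts)


def is_transition_enhancer(accessible_in, progenitor, fates):
    """Return True if the enhancer is accessible in the progenitor and in
    exactly one of the compared fates (assumed reading of the abstract)."""
    if progenitor not in accessible_in:
        return False
    kept = [fate for fate in fates if fate in accessible_in]
    return len(fates) > 1 and len(kept) == 1


complexity = locus_complexity(toy)
transition_flags = [
    is_transition_enhancer(accessible_in, "HSPC", ["B", "Mono"])
    for _, accessible_in in toy
]
```

In this toy data, only the first enhancer counts as a transition enhancer: it is open in the progenitor and kept in B cells but not in monocytes. A threshold separating high-complexity from low-complexity genes is deliberately left out, because the abstract does not state one.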
We thank A. Kundaje for extensive advice on the processing of Roadmap Epigenomics data sets, and we thank A. Arvey for helpful discussions at early stages in the project. This work was supported by US National Institutes of Health grants R01-HG006798, U01-HG007893 and U01-HG007033.
The authors declare no competing financial interests.
Supplementary Figures 1–43. (PDF 24850 kb)
Data sets used in this study and their accession numbers. (XLSX 39 kb)
Number of DHSs in each cell type in promoters, introns and intergenic regions. (XLSX 38 kb)
GO analysis of high-complexity, highly expressed genes in different cell types. (XLSX 49 kb)
Sharing of peaks between monocytes and B cells and between T cells and NK cells. (XLSX 34 kb)
Transcription factor SeqGL scores learned in different cell types. (XLSX 43 kb)
Performance of regression model for predicting changes in gene expression in cell state transitions. (XLSX 28 kb)
Gene reassignments for all the cell types. (XLSX 62 kb)
About this article
Cite this article
González, A., Setty, M. & Leslie, C. Early enhancer establishment and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation. Nat Genet 47, 1249–1259 (2015). https://doi.org/10.1038/ng.3402