Early enhancer establishment and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation


We carried out an integrative analysis of enhancer landscape and gene expression dynamics during hematopoietic differentiation using DNase-seq, histone mark ChIP-seq and RNA sequencing to model how the early establishment of enhancers and regulatory locus complexity govern gene expression changes at cell state transitions. We found that high-complexity genes—those with a large total number of DNase-mapped enhancers across the lineage—differ architecturally and functionally from low-complexity genes, achieve larger expression changes and are enriched for both cell type–specific and transition enhancers, which are established in hematopoietic stem and progenitor cells and maintained in one differentiated cell fate but lost in others. We then developed a quantitative model to accurately predict gene expression changes from the DNA sequence content and lineage history of active enhancers. Our method suggests a new mechanistic role for PU.1 at transition peaks during B cell specification and can be used to correct assignments of enhancers to genes.
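The two definitions in the abstract can be made concrete with a small sketch. It counts the DHS-mapped enhancers assigned to each gene across the lineage (its complexity) and labels an enhancer as a transition enhancer when it is present in hematopoietic stem and progenitor cells and retained in exactly one differentiated cell type. Everything else is an assumption made for illustration, not the authors' implementation: the toy enhancer records, the gene and cell type names, the `threshold` that separates high from low complexity, and the rule that a cell type-specific enhancer is one absent from progenitors and present in a single differentiated cell type.

```python
from collections import defaultdict

# Toy input, made up for illustration: each DHS enhancer is assigned to one gene
# and annotated with the set of cell types in which it is accessible.
PROGENITOR = "HSPC"
ENHANCERS = [
    {"id": "e1", "gene": "GeneA", "cell_types": {"HSPC", "B"}},
    {"id": "e2", "gene": "GeneA", "cell_types": {"B"}},
    {"id": "e3", "gene": "GeneA", "cell_types": {"HSPC", "B", "T", "Mono"}},
    {"id": "e4", "gene": "GeneA", "cell_types": {"HSPC"}},
    {"id": "e5", "gene": "GeneB", "cell_types": {"T"}},
]


def classify_enhancer(cell_types, progenitor=PROGENITOR):
    """Label an enhancer from the set of cell types in which it is accessible."""
    differentiated = cell_types - {progenitor}
    in_progenitor = progenitor in cell_types
    if in_progenitor and len(differentiated) == 1:
        # Established in progenitors, kept in one fate, lost in the others.
        return "transition"
    if not in_progenitor and len(differentiated) == 1:
        # Assumed definition: accessible in a single differentiated cell type only.
        return "cell_type_specific"
    return "other"


def gene_complexity(enhancers):
    """Count the enhancers assigned to each gene across the whole lineage."""
    counts = defaultdict(int)
    for enhancer in enhancers:
        counts[enhancer["gene"]] += 1
    return dict(counts)


def complexity_class(count, threshold):
    """Split genes into high and low complexity; the threshold is hypothetical."""
    return "high" if count >= threshold else "low"


if __name__ == "__main__":
    for gene, count in gene_complexity(ENHANCERS).items():
        print(gene, count, complexity_class(count, threshold=3))
    for enhancer in ENHANCERS:
        print(enhancer["id"], classify_enhancer(enhancer["cell_types"]))
```

In a real analysis the enhancer-to-gene assignment and the complexity cutoff would come from the DHS atlas itself; here they are fixed by hand so that the logic is easy to follow.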

Figure 1: The DHS atlas defines high- and low-complexity genes during hematopoietic differentiation.
Figure 2: High-complexity genes contain enhancers with distinct dynamics.
Figure 3: Gain of active enhancers in cell state transitions correlates with increased expression.
Figure 4: SeqGL identifies multiple transcription factor sequence signals in B cell DNase peaks.
Figure 5: Regression model suggests a role for PU.1 in the early establishment of B cell enhancers.
Figure 6: Regression model proposes reassignment of enhancers to genes.
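The relationship named in the Figure 3 caption, that gaining active enhancers in a cell state transition goes with increased expression, can be illustrated with a minimal sketch. It computes the net gain of active enhancers per gene between two cell states and the Pearson correlation of that gain with the log2 expression fold change. The gene names, enhancer identifiers, and fold-change values are invented toy data, and the helper names `net_enhancer_gain` and `pearson` are hypothetical. This is a simple correlation, not the regression model of Figures 5 and 6.

```python
import math


def net_enhancer_gain(before, after):
    """Enhancers gained minus enhancers lost between two cell states."""
    gained = after - before
    lost = before - after
    return len(gained) - len(lost)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy data, made up for illustration: active enhancer sets per gene before and
# after a transition, plus the log2 fold change in expression.
GENES = {
    "GeneA": {"before": {"e1"}, "after": {"e1", "e2", "e3"}, "log2fc": 2.0},
    "GeneB": {"before": {"e4", "e5"}, "after": {"e4"}, "log2fc": -1.0},
    "GeneC": {"before": {"e6"}, "after": {"e6"}, "log2fc": 0.0},
}

gains = [net_enhancer_gain(g["before"], g["after"]) for g in GENES.values()]
fold_changes = [g["log2fc"] for g in GENES.values()]

if __name__ == "__main__":
    print("net gains:", gains)
    print("correlation:", pearson(gains, fold_changes))
```

The toy values were chosen to be perfectly linear, so the correlation comes out at essentially 1; real data would be far noisier.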



We thank A. Kundaje for extensive advice on the processing of Roadmap Epigenomics data sets, and we thank A. Arvey for helpful discussions at early stages in the project. This work was supported by US National Institutes of Health grants R01-HG006798, U01-HG007893 and U01-HG007033.

Author information




A.J.G. performed computational analyses to construct the DHS atlas, characterize gene complexity classes, describe histone modifications at enhancer classes, and relate the gain and loss of active DHSs to gene expression changes, and contributed to writing the manuscript. M.S. developed the DNase peak calling pipeline and the SeqGL tool, performed the regression analysis and iterative reassignment of enhancers, and contributed to writing the manuscript. C.S.L. conceived the project, advised on the analysis and algorithm development, supervised the research and wrote the manuscript.

Corresponding author

Correspondence to Christina S Leslie.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–43. (PDF 24850 kb)

Supplementary Table 1

Data sets used in this study and their accession numbers. (XLSX 39 kb)

Supplementary Table 2

Number of DHSs in each cell type in promoters, introns and intergenic regions. (XLSX 38 kb)

Supplementary Table 3

GO analysis of high-complexity, highly expressed genes in different cell types. (XLSX 49 kb)

Supplementary Table 4

Sharing of peaks between monocytes and B cells and between T cells and NK cells. (XLSX 34 kb)

Supplementary Table 5

Transcription factor SeqGL scores learned in different cell types. (XLSX 43 kb)

Supplementary Table 6

Performance of regression model for predicting changes in gene expression in cell state transitions. (XLSX 28 kb)

Supplementary Table 7

Gene reassignments for all the cell types. (XLSX 62 kb)


Cite this article

González, A., Setty, M. & Leslie, C. Early enhancer establishment and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation. Nat Genet 47, 1249–1259 (2015). https://doi.org/10.1038/ng.3402
