Analysis

Early enhancer establishment and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation

Nature Genetics volume 47, pages 1249–1259 (2015)

Abstract

We carried out an integrative analysis of enhancer landscape and gene expression dynamics during hematopoietic differentiation using DNase-seq, histone mark ChIP-seq and RNA sequencing to model how the early establishment of enhancers and regulatory locus complexity govern gene expression changes at cell state transitions. We found that high-complexity genes—those with a large total number of DNase-mapped enhancers across the lineage—differ architecturally and functionally from low-complexity genes, achieve larger expression changes and are enriched for both cell type–specific and transition enhancers, which are established in hematopoietic stem and progenitor cells and maintained in one differentiated cell fate but lost in others. We then developed a quantitative model to accurately predict gene expression changes from the DNA sequence content and lineage history of active enhancers. Our method suggests a new mechanistic role for PU.1 at transition peaks during B cell specification and can be used to correct assignments of enhancers to genes.
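The two definitions in the abstract, regulatory locus complexity and transition enhancers, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout, the function names, the cell type labels, the threshold `HIGH_COMPLEXITY_MIN` and the reading of "maintained in one differentiated cell fate" as "exactly one of the fates being compared" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical cutoff for calling a gene "high-complexity"; the abstract
# only says "a large total number" and gives no value.
HIGH_COMPLEXITY_MIN = 3


def gene_complexity(assignments):
    """Count distinct enhancers assigned to each gene across all cell types.

    assignments: iterable of (enhancer_id, gene, cell_type) tuples
    (an assumed layout, not a format described in the source).
    """
    enhancers_by_gene = defaultdict(set)
    for enhancer_id, gene, _cell_type in assignments:
        # A set ensures an enhancer seen in several cell types counts once.
        enhancers_by_gene[gene].add(enhancer_id)
    return {gene: len(ids) for gene, ids in enhancers_by_gene.items()}


def is_transition_enhancer(active_in, progenitor, fates):
    """Return True if the enhancer is active in the progenitor and in
    exactly one of the differentiated fates being compared."""
    if progenitor not in active_in or len(fates) < 2:
        return False
    kept = [fate for fate in fates if fate in active_in]
    return len(kept) == 1


# Toy data with made-up gene and enhancer names.
assignments = [
    ("e1", "GENE_A", "HSPC"),
    ("e1", "GENE_A", "B"),
    ("e2", "GENE_A", "B"),
    ("e3", "GENE_A", "monocyte"),
    ("e4", "GENE_B", "HSPC"),
]
complexity = gene_complexity(assignments)
high_complexity = {g for g, n in complexity.items() if n >= HIGH_COMPLEXITY_MIN}
```

With this toy input, `GENE_A` has three distinct enhancers and is the only gene above the hypothetical cutoff. An enhancer active in `{"HSPC", "B"}` counts as a transition enhancer when B cells and monocytes are the fates compared, while one active in all three cell types does not.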

Acknowledgements

We thank A. Kundaje for extensive advice on the processing of Roadmap Epigenomics data sets, and we thank A. Arvey for helpful discussions at early stages in the project. This work was supported by US National Institutes of Health grants R01-HG006798, U01-HG007893 and U01-HG007033.

Author information

Author notes

    • Alvaro J González & Manu Setty

    These authors contributed equally to this work.

Affiliations

  1. Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, New York, USA.

    • Alvaro J González, Manu Setty & Christina S Leslie

Authors

  1. Alvaro J González

  2. Manu Setty

  3. Christina S Leslie

Contributions

A.J.G. performed computational analyses to construct the DHS atlas, characterize gene complexity classes, describe histone modifications at enhancer classes, and quantify gain and loss of active DHSs with gene expression changes and contributed to writing the manuscript. M.S. developed the DNase peak calling pipeline and the SeqGL tool, performed the regression analysis and iterative reassignment of enhancers, and contributed to writing the manuscript. C.S.L. conceived the project, advised on the analysis and algorithm development, supervised the research and wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Christina S Leslie.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–43.

Excel files

  1. Supplementary Table 1

    Data sets used in this study and their accession numbers.

  2. Supplementary Table 2

    Number of DHSs in each cell type in promoters, introns and intergenic regions.

  3. Supplementary Table 3

    GO analysis of high-complexity, highly expressed genes in different cell types.

  4. Supplementary Table 4

    Sharing of peaks between monocytes and B cells and between T cells and NK cells.

  5. Supplementary Table 5

    Transcription factor SeqGL scores learned in different cell types.

  6. Supplementary Table 6

    Performance of regression model for predicting changes in gene expression in cell state transitions.

  7. Supplementary Table 7

    Gene reassignments for all the cell types.

About this article

DOI

https://doi.org/10.1038/ng.3402
