Chromatin-state discovery and genome annotation with ChromHMM

Abstract

Noncoding DNA regions have central roles in human biology, evolution, and disease. ChromHMM helps to annotate the noncoding genome using epigenomic information across one or multiple cell types. It combines multiple genome-wide epigenomic maps, and uses combinatorial and spatial mark patterns to infer a complete annotation for each cell type. ChromHMM learns chromatin-state signatures using a multivariate hidden Markov model (HMM) that explicitly models the combinatorial presence or absence of each mark. ChromHMM uses these signatures to generate a genome-wide annotation for each cell type by calculating the most probable state for each genomic segment. ChromHMM provides an automated enrichment analysis of the resulting annotations to facilitate the functional interpretations of each chromatin state. ChromHMM is distinguished by its modeling emphasis on combinations of marks, its tight integration with downstream functional enrichment analyses, its speed, and its ease of use. Chromatin states are learned, annotations are produced, and enrichments are computed within 1 d.
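The two modeling ideas in the abstract, a state-specific presence/absence probability for each mark and a most-probable-state call per genomic segment, can be illustrated with a small sketch. This is not ChromHMM's code (ChromHMM is a separate software package), and every number below is invented for illustration rather than learned from data. The sketch assumes that marks are independent given the state, so the emission probability is a product of Bernoulli terms. It also assumes that the per-segment call is made by forward-backward posterior decoding. The function names `emission_prob` and `posterior_states` are hypothetical.

```python
from math import prod


def emission_prob(state_probs, marks):
    """P(observed marks | state) as a product of independent Bernoulli terms.

    state_probs[m] is the assumed probability that mark m is present in this state;
    marks[m] is 1 if mark m was called present in the segment, else 0.
    """
    return prod(p if m else 1.0 - p for p, m in zip(state_probs, marks))


def posterior_states(observations, init, trans, emit):
    """Return the index of the most probable state for each segment.

    Uses the scaled forward-backward recursions of a standard HMM.
    """
    n_states = len(init)
    n_bins = len(observations)
    # Emission likelihood of every segment under every state.
    like = [[emission_prob(emit[k], obs) for k in range(n_states)]
            for obs in observations]

    # Forward pass, normalizing each step to avoid underflow.
    fwd = []
    cur = [init[k] * like[0][k] for k in range(n_states)]
    total = sum(cur)
    fwd.append([v / total for v in cur])
    for t in range(1, n_bins):
        cur = [like[t][k] * sum(fwd[t - 1][j] * trans[j][k] for j in range(n_states))
               for k in range(n_states)]
        total = sum(cur)
        fwd.append([v / total for v in cur])

    # Backward pass, also normalized at each step.
    bwd = [[1.0] * n_states for _ in range(n_bins)]
    for t in range(n_bins - 2, -1, -1):
        cur = [sum(trans[k][j] * like[t + 1][j] * bwd[t + 1][j] for j in range(n_states))
               for k in range(n_states)]
        total = sum(cur)
        bwd[t] = [v / total for v in cur]

    # The posterior is proportional to forward * backward; take the argmax per segment.
    states = []
    for t in range(n_bins):
        post = [fwd[t][k] * bwd[t][k] for k in range(n_states)]
        states.append(max(range(n_states), key=post.__getitem__))
    return states


# Toy model with two marks and two states (all values are illustrative assumptions).
emit = [[0.90, 0.10],   # state 0: first mark usually present, second usually absent
        [0.05, 0.05]]   # state 1: both marks usually absent
trans = [[0.8, 0.2],
         [0.2, 0.8]]
init = [0.5, 0.5]
observations = [(1, 0), (1, 0), (0, 0), (0, 0)]  # binarized mark calls per segment

print(posterior_states(observations, init, trans, emit))  # [0, 0, 1, 1]
```

In this toy model the first two segments are assigned to the state in which the first mark is likely present, and the last two to the state in which both marks are likely absent. A real analysis would estimate the emission and transition parameters from genome-wide data instead of fixing them by hand.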

Figure 1: Overview of ChromHMM.
Figure 2: Overview of different options for handling multiple cell types in ChromHMM.
Figure 3: Example webpage screenshots.

Acknowledgements

We acknowledge the ENCODE and Roadmap Epigenomics consortia for generation and processing of data to which we have previously applied ChromHMM. We acknowledge the users of ChromHMM who have provided useful feedback on the software. We acknowledge funding from U.S. National Institutes of Health grants U54HG004570, RC1HG005334 (M.K.), R01ES024995, U01HG007912 and U01MH105578 (J.E.); a U.S. National Science Foundation Postdoctoral Fellowship (0905968) and CAREER Award 1254200 (J.E.); and an Alfred P. Sloan Fellowship (J.E.).

Author information

Contributions

J.E. and M.K. wrote this protocol and previously developed ChromHMM.

Corresponding authors

Correspondence to Jason Ernst or Manolis Kellis.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Ernst, J., Kellis, M. Chromatin-state discovery and genome annotation with ChromHMM. Nat Protoc 12, 2478–2492 (2017). https://doi.org/10.1038/nprot.2017.124
