Noncoding DNA regions have central roles in human biology, evolution, and disease. ChromHMM helps to annotate the noncoding genome using epigenomic information across one or multiple cell types. It combines multiple genome-wide epigenomic maps, and uses combinatorial and spatial mark patterns to infer a complete annotation for each cell type. ChromHMM learns chromatin-state signatures using a multivariate hidden Markov model (HMM) that explicitly models the combinatorial presence or absence of each mark. ChromHMM uses these signatures to generate a genome-wide annotation for each cell type by calculating the most probable state for each genomic segment. ChromHMM provides an automated enrichment analysis of the resulting annotations to facilitate the functional interpretation of each chromatin state. ChromHMM is distinguished by its modeling emphasis on combinations of marks, its tight integration with downstream functional enrichment analyses, its speed, and its ease of use. Chromatin states are learned, annotations are produced, and enrichments are computed within 1 day.
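The modeling idea in the abstract can be illustrated with a small sketch. It is not ChromHMM's implementation (ChromHMM is a Java program that also learns its parameters from data). It assumes each state emits the presence or absence of each mark as an independent Bernoulli variable, and it takes "most probable state for each genomic segment" to mean the state with the highest posterior probability from the forward-backward algorithm. The function names, the two-mark and two-state setup, and all parameter values are hypothetical and chosen only for illustration.

```python
from math import prod


def emission_prob(obs, p_state):
    """Probability of one binary mark vector under one state.

    Assumes marks are independent Bernoulli variables given the state:
    p_state[m] is the probability that mark m is present.
    """
    return prod(p if o else 1.0 - p for o, p in zip(obs, p_state))


def posterior_states(observations, init, trans, emit):
    """Return, for each bin, the index of the state with the highest posterior.

    Uses the forward-backward algorithm with per-bin rescaling, which keeps
    the numbers in range without changing which state wins in each bin.
    """
    n_states = len(init)
    n_bins = len(observations)
    # b[t][k]: probability of the marks observed in bin t under state k
    b = [[emission_prob(o, emit[k]) for k in range(n_states)]
         for o in observations]

    # Forward pass
    fwd = []
    for t in range(n_bins):
        if t == 0:
            row = [init[k] * b[0][k] for k in range(n_states)]
        else:
            prev = fwd[-1]
            row = [b[t][k] * sum(prev[j] * trans[j][k] for j in range(n_states))
                   for k in range(n_states)]
        total = sum(row)
        fwd.append([x / total for x in row])

    # Backward pass
    bwd = [[1.0] * n_states for _ in range(n_bins)]
    for t in range(n_bins - 2, -1, -1):
        row = [sum(trans[k][j] * b[t + 1][j] * bwd[t + 1][j]
                   for j in range(n_states))
               for k in range(n_states)]
        total = sum(row)
        bwd[t] = [x / total for x in row]

    # Posterior is proportional to forward * backward; pick the best state per bin
    labels = []
    for t in range(n_bins):
        post = [fwd[t][k] * bwd[t][k] for k in range(n_states)]
        labels.append(max(range(n_states), key=post.__getitem__))
    return labels


# Hypothetical toy model: two marks, two states (all values are made up).
emit = [[0.9, 0.1],   # state 0: first mark usually present, second usually absent
        [0.1, 0.9]]   # state 1: the reverse
init = [0.5, 0.5]
trans = [[0.8, 0.2],
         [0.2, 0.8]]
# Binarized presence/absence calls for four consecutive genomic bins
bins = [(1, 0), (1, 0), (0, 1), (0, 1)]
labels = posterior_states(bins, init, trans, emit)
print(labels)
```

In this toy input the first two bins carry only the first mark and the last two carry only the second, so the sketch assigns them to state 0 and state 1, respectively. Real use differs in scale and in that the emission and transition parameters are learned rather than supplied.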
We acknowledge the ENCODE and Roadmap Epigenomics consortia for generation and processing of data to which we have previously applied ChromHMM. We acknowledge the users of ChromHMM who have provided useful feedback on the software. We acknowledge funding from U.S. National Institutes of Health grants U54HG004570, RC1HG005334 (M.K.), R01ES024995, U01HG007912 and U01MH105578 (J.E.); a U.S. National Science Foundation Postdoctoral Fellowship (0905968) and CAREER Award 1254200 (J.E.); and an Alfred P. Sloan Fellowship (J.E.).
The authors declare no competing financial interests.
About this article
Cite this article
Ernst, J., Kellis, M. Chromatin-state discovery and genome annotation with ChromHMM. Nat Protoc 12, 2478–2492 (2017). https://doi.org/10.1038/nprot.2017.124