Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms1,2,3. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal ‘arms’, and centromeres distributed along their lengths4,5. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.
We used chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) to generate profiles of core histones, histone variants, histone modifications and chromatin-associated proteins (Fig. 1, Supplementary Fig. 1 and Supplementary Tables 1 and 2). Additional data include DNase I hypersensitivity sites in fly and human cells, and nucleosome occupancy maps in all three organisms. Compared to our initial publications1,2,3, this represents a tripling of available fly and worm data sets and a substantial increase in human data sets (Fig. 1b, c). Uniform quality standards for experimental protocols, antibody validation and data processing were used throughout the projects6. Detailed analyses of related transcriptome and transcription factor data are presented in accompanying papers7,8.
We performed systematic cross-species comparisons of chromatin composition and organization, focusing on targets profiled in at least two organisms (Fig. 1). Sample types used most extensively in our analyses are human cell lines H1-hESC, GM12878 and K562; fly late embryos, third instar larvae and cell lines S2, Kc, BG3; and worm early embryos and stage 3 larvae. Our conclusions are summarized in Extended Data Table 1.
Not surprisingly, the three species show many common chromatin features. Most of the genome in each species is marked by at least one histone modification (Supplementary Fig. 2), and modification patterns are similar around promoters, gene bodies, enhancers and other chromosomal elements (Supplementary Figs 3–12). Nucleosome occupancy patterns around protein-coding genes and enhancers are also largely similar across species, although we observed subtle differences in H3K4me3 enrichment patterns around transcription start sites (TSSs) (Extended Data Fig. 1a and Supplementary Figs 12–14). The configuration and composition of large-scale features such as lamina-associated domains (LADs) are similar (Supplementary Figs 15–17). LADs in human and fly are associated with late replication and H3K27me3 enrichment, suggesting a repressive chromatin environment (Supplementary Fig. 18). Finally, DNA structural features associated with nucleosome positioning are strongly conserved (Supplementary Figs 19 and 20).
Although patterns of histone modifications across active and silent genes are largely similar in all three species, there are some notable differences (Extended Data Fig. 1b). For example, H3K23ac is enriched at promoters of expressed genes in worm, but is enriched across gene bodies of both expressed and silent genes in fly. H4K20me1 is enriched on both expressed and silent genes in human but only on expressed genes in fly and worm (Extended Data Fig. 1b). Enrichment of H3K36me3 in genes expressed with stage or tissue specificity is lower than in genes expressed broadly, possibly because profiling was carried out on mixed tissues (Supplementary Figs 21–23; see Supplementary Methods). Although the co-occurrence of pairs of histone modifications is largely similar across the three species, there are clearly some species-specific patterns (Extended Data Fig. 1c and Supplementary Figs 24 and 25).
Previous studies showed that in human9,10 and fly1,11 prevalent combinations of marks or ‘chromatin states’ correlate with functional features such as promoters, enhancers, transcribed regions, Polycomb-associated domains, and heterochromatin. ‘Chromatin state maps’ provide a concise and systematic annotation of the genome. To compare chromatin states across the three organisms, we developed and applied a novel hierarchical non-parametric machine-learning method called hiHMM (see Supplementary Methods) to generate chromatin state maps from eight histone marks mapped in common, and compared the results with published methods (Fig. 2 and Supplementary Figs 26–28). We find that combinatorial patterns of histone modifications are largely conserved. Based on correlations with functional elements (Supplementary Figs 29–32), we categorized the 16 states into six groups: promoter (state 1), enhancer (states 2 and 3), gene body (states 4–9), Polycomb-repressed (states 10 and 11), heterochromatin (states 12 and 13), and weak or low signal (states 14–16).
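The grouping of the 16 states into six categories can be written down as a small lookup table. The sketch below only encodes the assignment stated in the text; it is not part of hiHMM, and the names STATE_GROUPS, group_of and summarize are hypothetical helpers introduced for illustration.

```python
# State-to-group assignment exactly as listed in the text.
# The dictionary layout and function names are illustrative assumptions.
STATE_GROUPS = {
    "promoter": [1],
    "enhancer": [2, 3],
    "gene body": [4, 5, 6, 7, 8, 9],
    "Polycomb-repressed": [10, 11],
    "heterochromatin": [12, 13],
    "weak or low signal": [14, 15, 16],
}

# Invert the table so that each state number maps to its group name.
STATE_TO_GROUP = {
    state: group for group, states in STATE_GROUPS.items() for state in states
}


def group_of(state: int) -> str:
    """Return the functional group for a chromatin state numbered 1 to 16."""
    if state not in STATE_TO_GROUP:
        raise ValueError(f"unknown chromatin state: {state}")
    return STATE_TO_GROUP[state]


def summarize(states):
    """Count how many genomic bins fall into each of the six groups."""
    counts = {group: 0 for group in STATE_GROUPS}
    for state in states:
        counts[group_of(state)] += 1
    return counts
```

Given a per-bin list of state labels from any segmentation, summarize returns the number of bins per group, which is one simple way to compare state composition between samples.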
Heterochromatin is a classically defined and distinct chromosomal domain with important roles in genome organization, genome stability, chromosome inheritance and gene regulation. It is typically enriched for H3K9me3 (ref. 12), which we used as a proxy for identifying heterochromatic domains (Fig. 3a and Supplementary Figs 33 and 34). As expected, the majority of the H3K9me3-enriched domains in human and fly are concentrated in the pericentromeric regions (as well as other specific domains, such as the Y chromosome and fly 4th chromosome), whereas in worm they are distributed throughout the distal chromosomal ‘arms’11,13,14 (Fig. 3a). In all three organisms, we find that more of the genome is associated with H3K9me3 in differentiated cells and tissues compared to embryonic cells and tissues (Extended Data Fig. 2a). We also observe large cell-type-specific blocks of H3K9me3 in human and fly11,14,15 (Supplementary Fig. 35). These results suggest a molecular basis for the classical concept of ‘facultative heterochromatin’ formation to silence blocks of genes as cells specialize.
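The fraction of a genome associated with H3K9me3 can be estimated from a list of enriched domains. The following is a minimal sketch, assuming domains are given as half-open (start, end) intervals per chromosome; the domain-calling step itself is not shown, and the function names and input layout are assumptions rather than the procedure used for Extended Data Fig. 2a.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of double counting.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


def coverage_fraction(domains_by_chrom, chrom_sizes):
    """Fraction of the genome covered by enriched domains.

    domains_by_chrom: dict mapping chromosome name to a list of (start, end).
    chrom_sizes: dict mapping chromosome name to its length in bp.
    """
    covered = 0
    for intervals in domains_by_chrom.values():
        covered += sum(end - start for start, end in merge_intervals(intervals))
    return covered / sum(chrom_sizes.values())
```

Computing this value separately for embryonic and differentiated samples gives the kind of comparison described above; merging intervals first avoids counting overlapping domains twice.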
Two distinct types of transcriptionally repressed chromatin have been described. As discussed above, classical ‘heterochromatin’ is generally concentrated in specific chromosomal regions and enriched for H3K9me3 and also H3K9me2 (ref. 12). In contrast, Polycomb-associated silenced domains, involved in cell-type-specific silencing of developmentally regulated genes11,14, are scattered across the genome and enriched for H3K27me3. We found that the organization and composition of these two types of transcriptionally silent domains differ across species. First, human, fly and worm display significant differences in H3K9 methylation patterns. H3K9me2 shows a stronger correlation with H3K9me3 in fly than in worm (r = 0.89 versus r = 0.40, respectively), whereas H3K9me2 is well correlated with H3K9me1 in worm but not in fly (r = 0.44 versus r = −0.32, respectively) (Fig. 3b). These findings suggest potential differences in heterochromatin in the three organisms (see below). Second, the chromatin state maps reveal two distinct types of Polycomb-associated repressed regions: strong H3K27me3 accompanied by marks for active genes or enhancers (Fig. 2, state 10; perhaps due to mixed tissues in whole embryos or larvae for fly and worm), and strong H3K27me3 without active marks (state 11) (see also Supplementary Fig. 31). Third, we observe a worm-specific association of H3K9me3 and H3K27me3. These two marks are enriched together in states 12 and 13 in worm but not in human and fly. This unexpectedly strong association between H3K9me3 and H3K27me3 in worm (observed with several validated antibodies; Extended Data Fig. 2b) suggests a species-specific difference in the organization of silent chromatin.
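The r values quoted above are correlation coefficients between genome-wide profiles of two marks. As a minimal sketch, assuming Pearson correlation on log2 fold-enrichment values averaged in fixed-size bins (as described in the Extended Data Figure 3 legend), the calculation could look like the following; bin_means and pearson_r are hypothetical helper names, and the input is assumed to be one enrichment value per genomic position or probe.

```python
import math


def bin_means(values, bin_size):
    """Average consecutive values into fixed-size bins, dropping any partial last bin."""
    n_bins = len(values) // bin_size
    return [
        sum(values[i * bin_size:(i + 1) * bin_size]) / bin_size
        for i in range(n_bins)
    ]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)
```

Applying pearson_r to the binned H3K9me2 and H3K9me3 profiles of one sample yields a single coefficient per mark pair, which can then be compared between species.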
We also compared the patterns of histone modifications on expressed and silent genes in euchromatin and heterochromatin (Extended Data Fig. 2c and Supplementary Fig. 36). We previously reported prominent depletion of H3K9me3 at TSSs and high levels of H3K9me3 in the gene bodies of expressed genes located in fly heterochromatin14, and now find a similar pattern in human (Extended Data Fig. 2c and Supplementary Fig. 36). In these two species, H3K9me3 is highly enriched in the body of both expressed and silent genes in heterochromatic regions. In contrast, expressed genes in worm heterochromatin have lower H3K9me3 enrichment across gene bodies compared to silent genes (Extended Data Fig. 2c and Supplementary Figs 36 and 37). There are also conspicuous differences in the patterns of H3K27me3 in the three organisms. In human and fly, H3K27me3 is highly associated with silent genes in euchromatic regions, but not with silent genes in heterochromatic regions. In contrast, consistent with the worm-specific association between H3K27me3 and H3K9me3, we observe high levels of H3K27me3 on silent genes in worm heterochromatin, whereas silent euchromatic genes show modest enrichment of H3K27me3 (Extended Data Fig. 2c and Supplementary Fig. 36).
Our results suggest three distinct types of repressed chromatin (Extended Data Fig. 3). The first contains H3K27me3 with little or no H3K9me3 (human and fly states 10 and 11, and worm state 11), corresponding to developmentally regulated Polycomb-silenced domains in human and fly, and probably in worm as well. The second is enriched for H3K9me3 and lacks H3K27me3 (human and fly states 12 and 13), corresponding to constitutive, predominantly pericentric heterochromatin in human and fly, which is essentially absent from the worm genome. The third contains both H3K9me3 and H3K27me3 and occurs predominantly in worm (worm states 10, 12 and 13). Co-occurrence of these marks is consistent with the observation that H3K9me3 and H3K27me3 are both required for silencing of heterochromatic transgenes in worms16. H3K9me3 and H3K27me3 may reside on the same or adjacent nucleosomes in individual cells17,18; alternatively the two marks may occur in different cell types in the embryos and larvae analysed here. Further studies are needed to resolve this and determine the functional consequences of the overlapping distributions of H3K9me3 and H3K27me3 observed in worm.
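The three types of repressed chromatin amount to a decision table over two marks. The sketch below encodes that table only; the enrichment threshold and the function names are hypothetical choices for illustration, not values or procedures taken from this study.

```python
def is_enriched(log2_fold_enrichment: float, threshold: float = 1.0) -> bool:
    """Call a region enriched if its log2 fold enrichment exceeds a cutoff.

    The default threshold of 1.0 is an arbitrary illustrative value.
    """
    return log2_fold_enrichment > threshold


def repressed_chromatin_type(h3k27me3: bool, h3k9me3: bool) -> str:
    """Assign a region to one of the three repressed chromatin types in the text."""
    if h3k27me3 and h3k9me3:
        # Both marks together: occurs predominantly in worm.
        return "H3K9me3 and H3K27me3"
    if h3k27me3:
        # H3K27me3 with little or no H3K9me3: Polycomb-silenced domains.
        return "Polycomb-silenced"
    if h3k9me3:
        # H3K9me3 without H3K27me3: constitutive heterochromatin.
        return "constitutive heterochromatin"
    return "not classified as repressed"
```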
Genome-wide chromatin conformation capture (Hi-C) assays have revealed prominent topological domains in human19 and fly20,21. Although their boundaries are enriched for insulator elements and active genes19,20 (Supplementary Fig. 38), the interiors generally contain a relatively uniform chromatin state: active, Polycomb-repressed, heterochromatin, or low signal22 (Supplementary Fig. 39). We found that chromatin state similarity between neighbouring regions correlates with chromatin interaction domains determined by Hi-C (Fig. 3c, Supplementary Fig. 40 and Supplementary Methods). This suggests that topological domains can be largely predicted by chromatin marks when Hi-C data are not available (Supplementary Figs 41 and 42).
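The idea that domain boundaries coincide with changes in chromatin state between neighbouring regions can be illustrated with a simple similarity scan. This is a rough sketch under stated assumptions and not the procedure in the Supplementary Methods: each genomic bin is represented by a vector of histone mark enrichment values, similarity is measured by cosine similarity, and the threshold is an arbitrary illustrative choice.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two mark vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def neighbour_similarity(profile):
    """Similarity between each bin and the next one along the chromosome."""
    return [
        cosine_similarity(profile[i], profile[i + 1])
        for i in range(len(profile) - 1)
    ]


def candidate_boundaries(profile, threshold=0.5):
    """Indices of bins that start a new block, where similarity to the previous bin is low."""
    similarities = neighbour_similarity(profile)
    return [i + 1 for i, s in enumerate(similarities) if s < threshold]
```

For a profile such as [[1, 0], [1, 0], [0, 1], [0, 1]], the only low-similarity transition lies between the second and third bins, so bin index 2 is reported as the start of a new block. Candidate boundaries found this way could then be compared with Hi-C-defined domain boundaries where such data exist.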
C. elegans and D. melanogaster have been used extensively for understanding human gene function, development and disease. Our analyses of chromatin architecture and the large public resource we have generated provide a blueprint for interpreting experimental results in these model systems, extending their relevance to human biology. They also provide a foundation for researchers to investigate how diverse genome functions are regulated in the context of chromatin structure.
modENCODE Consortium et al. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 330, 1787–1797 (2010)
Gerstein, M. B. et al. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–1787 (2010)
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)
Gassmann, R. et al. An inverse relationship to germline transcription defines centromeric chromatin in C. elegans. Nature 484, 534–537 (2012)
Blower, M. D., Sullivan, B. A. & Karpen, G. H. Conserved organization of centromeric chromatin in flies and humans. Dev. Cell 2, 319–330 (2002)
Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012)
Gerstein, M. B. et al. Comparative analysis of the transcriptome across distant species. Nature http://dx.doi.org/10.1038/nature13424 (this issue)
Boyle, A. P. et al. Comparative analysis of regulatory information and circuits across distant species. Nature http://dx.doi.org/10.1038/nature13668 (this issue)
Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011)
Hoffman, M. M. et al. Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Res. 41, 827–841 (2013)
Kharchenko, P. V. et al. Comprehensive analysis of the chromatin landscape in Drosophila melanogaster. Nature 471, 480–485 (2011)
Elgin, S. C. & Reuter, G. Position-effect variegation, heterochromatin formation, and gene silencing in Drosophila. Cold Spring Harb. Perspect. Biol. 5, a017780 (2013)
Liu, T. et al. Broad chromosomal domains of histone modification patterns in C. elegans. Genome Res. 21, 227–236 (2011)
Riddle, N. C. et al. Plasticity in patterns of histone modifications and chromosomal proteins in Drosophila heterochromatin. Genome Res. 21, 147–163 (2011)
Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010)
Towbin, B. D. et al. Step-wise methylation of histone H3K9 positions heterochromatin at the nuclear periphery. Cell 150, 934–947 (2012)
Bilodeau, S., Kagey, M. H., Frampton, G. M., Rahl, P. B. & Young, R. A. SetDB1 contributes to repression of genes encoding developmental regulators and maintenance of ES cell state. Genes Dev. 23, 2484–2489 (2009)
Voigt, P. et al. Asymmetrically modified nucleosomes. Cell 151, 181–193 (2012)
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012)
Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012)
Hou, C., Li, L., Qin, Z. S. & Corces, V. G. Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol. Cell 48, 471–484 (2012)
Zhu, J. et al. Genome-wide chromatin state transitions associated with developmental and environmental cues. Cell 152, 642–654 (2013)
Chen, R. A. et al. The landscape of RNA polymerase II transcription initiation in C. elegans reveals promoter and enhancer architectures. Genome Res. 23, 1339–1347 (2013)
Egelhofer, T. A. et al. An assessment of histone-modification antibody quality. Nature Struct. Mol. Biol. 18, 91–93 (2011)
Hayashi-Takanaka, Y. et al. Tracking epigenetic histone modifications in single cells using Fab-based live endogenous modification labeling. Nucleic Acids Res. 39, 6475–6488 (2011)
Chandra, T. et al. Independence of repressive histone marks and chromatin compaction during senescent heterochromatic layer formation. Mol. Cell 47, 203–214 (2012)
Bender, L. B., Cao, R., Zhang, Y. & Strome, S. The MES-2/MES-3/MES-6 complex and regulation of histone H3 methylation in C. elegans. Curr. Biol. 14, 1639–1643 (2004)
This project was mainly funded by NHGRI U01HG004258 (G.H.K., S.C.R.E., M.I.K., P.J.P., V.P.), U01HG004270 (J.D.L., J.A., A.F.D., X.S.L., S.S.), U01HG004279 (D.M.M.), U54HG004570 (B.E.B.) and U01HG004695 (W.S.N.). It is also supported by NHBIB 5RL9EB008539 (J.W.K.H.), NHGRI K99HG006259 (M.M.H.), NIGMS fellowships (S.C.J.P., E.N.L.), NIH U54CA121852 (T.D.T.), NSF 1122374 (D.S.D.), National Natural Science Foundation of China 31028011 (X.S.L.), MEST Korea MHW-2013-HI13C2164 (J.H.K.), NRF-2012-0000994 (K.-A.S.), and Wellcome Trust 54523 (J.A.). We thank D. Acevedo and C. Kennedy for technical assistance.
The authors declare no competing financial interests.
All data are available at the project data portals (http://data.modencode.org; http://encodeproject.org), modMine (http://intermine.modencode.org), or our database and web application (http://encode-x.med.harvard.edu/data_sets/chromatin/). More information and annotated files are available at http://encodeproject.org/comparative.
Extended data figures and tables
Extended Data Figure 1 Chromatin features at TSSs and gene bodies, and co-occurrence of histone modifications.
a, Comparative analysis of promoter architecture at transcription start sites (TSSs). From the top, H3K4me3 (human GM12878, fly L3 and worm L3), DNase I hypersensitivity sites (DHSs), GC content, and nascent transcript (GRO-seq in human IMR90 and fly S2 cells). Human promoters, and to a lesser extent worm promoters (as defined using recently published capRNA-seq data23), exhibit a bimodal enrichment for H3K4me3 and other active marks around TSSs. In contrast, fly promoters clearly exhibit a unimodal distribution of active marks, downstream of TSSs. As genes that have a neighbouring gene within 1 kb of a TSS or TES (transcription end site) were removed from this analysis, any bimodal histone modification pattern cannot be attributed to nearby genes. This difference is also not explained by chromatin accessibility determined by DHS, or by fluctuations in GC content around the TSSs, although the GC profiles are highly variable across species. b, Average gene body profiles of histone modifications on protein-coding genes in human GM12878, fly L3 and worm L3. c, Genome-wide correlations between histone modifications show intra- and inter-species similarities and differences. Top left, pairwise correlations between marks in each genome, averaged across all three species. Bottom right, pairwise correlations, averaged over cell types and developmental stages, within each species (pie chart), with inter-species variance (grey-scale background) and intra-species variance (grey-scale small rectangles) of correlation coefficients for human, fly and worm. Modifications enriched within or near actively transcribed genes are consistently correlated with each other in all three organisms. In contrast, we found a major difference in the co-occurrence pattern of two key repressive chromatin marks (black cell in bottom left): H3K27me3 (related to Polycomb (Pc)-mediated silencing) and H3K9me3 (related to heterochromatin). 
These two marks are strongly correlated at both developmental stages analysed in worm, whereas their correlation is low in human (r = −0.24 to −0.06) and fly (r = −0.03 to −0.1).
Extended Data Figure 2
a, Genomic coverage of H3K9me3 in multiple cell types and developmental stages. Embryonic cell lines or stages are marked with an asterisk and a black bar. b, Evidence that overlapping H3K9me3 and H3K27me3 ChIP signals in worm are not due to antibody cross-reactivity. ChIP-chip experiments were performed from early embryo (EE) extracts with three different H3K9me3 antibodies (from Abcam, Upstate and H. Kimura) and three different H3K27me3 antibodies (from Active Motif, Upstate and H. Kimura). The H3K9me3 antibodies show similar enrichment profiles (top panel) and high genome-wide correlation coefficients (bottom left). The same is true for H3K27me3 antibodies. There is significant overlap between the H3K9me3 and H3K27me3 ChIP signal, especially on chromosome arms, resulting in relatively high genome-wide correlation coefficients (Extended Data Fig. 1c). The Abcam and Upstate H3K9me3 antibodies showed low-level cross-reactivity with H3K27me3 on dot blots24, and the Abcam H3K9me3 ChIP signal overlapped with H3K27me3 on chromosome centres. The Kimura monoclonal antibodies against H3K9me3 and H3K27me3 showed the least overlap and smallest genome-wide correlation. In enzyme-linked immunosorbent assay (ELISA) using histone H3 peptides containing different modifications, each Kimura H3K9me3 or H3K27me3 antibody recognized the modified tail against which it was raised and did not cross-react with the other modified tail25,26, providing support for their specificity. Specificity of the Kimura antibodies was further analysed (bottom right panel) by immunostaining germlines from wild type, met-2 set-25 mutants (which lack H3K9 histone methyltransferase (HMT) activity16), and mes-2 mutants (which lack H3K27 HMT activity27). Staining with anti-H3K9me3 was robust in wild type and in mes-2, but undetectable in met-2 set-25. Staining with anti-H3K27me3 was robust in wild type and in met-2 set-25, but undetectable in mes-2.
Finally, we note that the laboratories that analysed H3K9me3 and H3K27me3 in other systems used Abcam H3K9me3 (for human and fly) and Upstate H3K27me3 (for human), and in these cases observed non-overlapping distributions. Another paper also reported non-overlapping distributions of H3K9me3 and H3K27me3 in human fibroblast cells using the Kimura antibodies26. The overlapping distributions that we observe in worms using any of those antibodies suggest that H3K9me3 and H3K27me3 occupy overlapping regions in worms. Those overlapping regions may exist in individual cells or in different cell sub-populations in embryo and L3 preparations. c, Average gene body profiles of H3K9me3 and H3K27me3 on expressed and silent genes in euchromatin and heterochromatin in human K562 cells, fly L3 and worm L3.
Extended Data Figure 3
a, The correlation of H3K27me3 and H3K9me3 enrichment for human K562 (left panel), fly L3 (second panel), and worm EE chromosome arms (third panel) and centres (right panel) with a 10-kb bin (top) and a 1-kb bin (bottom). The density was calculated as a frequency of bins that fall in the area in the scatter plot (darker grey at a higher frequency). r indicates Pearson correlation coefficients between binned H3K27me3 fold enrichment (log2) and H3K9me3 fold enrichment (log2). Worm chromosome arms have a distinctly high correlation between H3K27me3 and H3K9me3. The lower correlation in worm chromosome centres is due to the overall absence of H3K9me3 in these regions. b, Schematic diagrams of the distributions of silent domains along the chromosomes in human (H1-hESC), fly (S2) and worm (EE). In human and fly, the majority of the H3K9me3-enriched domains are located in the pericentric regions (as well as telomeres), while the H3K27me3-enriched domains are distributed along the chromosome arms. H3K27me3-enriched domains are negatively correlated with H3K36me3-enriched domains, although in human, there is some overlap of H3K27me3 and H3K36me3 in bivalent domains. CENP-A resides at the centromere. In contrast, in worm the majority of H3K9me3-enriched domains are located in the arms, whereas H3K27me3-enriched domains are distributed throughout the arms and centres of the chromosomes and are anti-correlated with H3K36me3-enriched domains. In arms and centres, domains that are permissive for CENP-A incorporation generally reside within H3K27me3-enriched domains.
This file contains Supplementary Methods, Supplementary Figures 1-42, Supplementary Tables 1-3, and Supplementary References. See Supplementary Contents for details. (PDF 15080 kb)
Ho, J., Jung, Y., Liu, T. et al. Comparative analysis of metazoan chromatin organization. Nature 512, 449–452 (2014). https://doi.org/10.1038/nature13415