Prediction of histone post-translational modification patterns based on nascent transcription data

Abstract

The role of histone modifications in transcription remains incompletely understood. Here, we examine the relationship between histone modifications and transcription using experimental perturbations combined with sensitive machine-learning tools. Transcription predicted the variation in active histone marks and complex chromatin states, like bivalent promoters, down to single-nucleosome resolution and at an accuracy that rivaled the correspondence between independent ChIP-seq experiments. Blocking transcription rapidly removed two punctate marks, H3K4me3 and H3K27ac, from chromatin, indicating that transcription is required for active histone modifications. Transcription was also required for maintenance of H3K27me3, consistent with a role for RNA in recruiting PRC2. A subset of DNase-I-hypersensitive sites were refractory to prediction, precluding models where transcription initiates pervasively at any open chromatin. Our results, in combination with past literature, support a model in which active histone modifications serve a supportive, rather than an essential regulatory, role in transcription.

Fig. 1: dHIT imputes histone modifications using nascent transcription.
Fig. 2: dHIT identifies bivalent H3K4me3-/H3K27me3-marked genes.
Fig. 3: Inference of chromatin states defined by ChromHMM using transcription.
Fig. 4: ChIP-seq measures changes in histone modifications following transcription inhibition by Trp.
Fig. 5: Chromatin accessibility is not sufficient for transcription initiation.
Fig. 6: Transcription is required for chromatin landscaping.

Data Availability

Publicly available data used in this study are listed in Supplementary Tables 1 and 2. Tables in CSV format can be downloaded from https://github.com/alexachivu/dHITpaper_2021. Data generated in this study can be found in the Gene Expression Omnibus under accession code GSE163043.

Code Availability

dHIT software and scripts can be found on GitHub (https://github.com/Danko-Lab/histone-mark-imputation). Custom code for analyzing sequencing data can be found on GitHub (https://github.com/alexachivu/dHITpaper_2021/blob/main/Git.code_dHIT.uploa)

References

  1. Allfrey, V. G., Faulkner, R. & Mirsky, A. E. Acetylation and methylation of histones and their possible role in the regulation of RNA synthesis. Proc. Natl Acad. Sci. USA 51, 786–794 (1964).

  2. Ho, J. W. K. et al. Comparative analysis of metazoan chromatin organization. Nature 512, 449–452 (2014).

  3. Weiner, A. et al. High-resolution chromatin dynamics during a yeast stress response. Mol. Cell 58, 371–386 (2015).

  4. Sebé-Pedrós, A. et al. The dynamic regulatory genome of Capsaspora and the origin of animal multicellularity. Cell 165, 1224–1237 (2016).

  5. Schwartzentruber, J. et al. Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma. Nature 482, 226–231 (2012).

  6. Béguelin, W. et al. EZH2 is required for germinal center formation and somatic EZH2 mutations promote lymphoid transformation. Cancer Cell 23, 677–692 (2013).

  7. Gu, Y. et al. The t(4;11) chromosome translocation of human acute leukemias fuses the ALL-1 gene, related to Drosophila trithorax, to the AF-4 gene. Cell 71, 701–708 (1992).

  8. Milne, T. A. et al. MLL associates specifically with a subset of transcriptionally active target genes. Proc. Natl Acad. Sci. USA 102, 14765–14770 (2005).

  9. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837 (2007).

  10. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).

  11. Claussnitzer, M. et al. FTO obesity variant circuitry and adipocyte browning in humans. N. Engl. J. Med. 373, 895–907 (2015).

  12. Pickrell, J. K. Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am. J. Hum. Genet. 94, 559–573 (2014).

  13. Portela, A. & Esteller, M. Epigenetic modifications and human disease. Nat. Biotechnol. 28, 1057–1068 (2010).

  14. Henikoff, S. & Shilatifard, A. Histone modification: cause or cog? Trends Genet. 27, 389–396 (2011).

  15. Morgan, M. A. J. & Shilatifard, A. Reevaluating the roles of histone-modifying enzymes and their associated chromatin modifications in transcriptional regulation. Nat. Genet. 52, 1271–1281 (2020).

  16. Bernstein, B. E. et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326 (2006).

  17. Heintzman, N. D. et al. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39, 311–318 (2007).

  18. Outchkourov, N. S. et al. Balancing of histone H3K4 methylation states by the Kdm5c/SMCX histone demethylase modulates promoter and enhancer function. Cell Rep. 3, 1071–1079 (2013).

  19. Pradeepa, M. M. et al. Histone H3 globular domain acetylation identifies a new class of enhancers. Nat. Genet. 48, 681–686 (2016).

  20. Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).

  21. Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013).

  22. Chu, T. et al. Chromatin run-on and sequencing maps the transcriptional regulatory landscape of glioblastoma multiforme. Nat. Genet. 50, 1553–1564 (2018).

  23. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  24. Roadmap Epigenomics Consortium. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

  25. Wang, Z. et al. Identification of regulatory elements from nascent transcription using dREG. Genome Res. 29, 293–303 (2018).

  26. Durham, T. J., Libbrecht, M. W., Howbert, J. J., Bilmes, J. & Noble, W. S. PREDICTD PaRallel epigenomics data imputation with cloud-based tensor decomposition. Nat. Commun. 9, 1402 (2018).

  27. Schreiber, J. et al. Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome. Genome Biol. 21, 81 (2020).

  28. Henikoff, S., Henikoff, J. G., Kaya-Okur, H. S. & Ahmad, K. Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation. Elife 9, e63274 (2020).

  29. Tome, J. M. et al. Single-molecule nascent RNA sequencing identifies regulatory domain architecture at promoters and enhancers. Nat. Genet. 50, 1533–1541 (2018).

  30. Chen, Y. et al. Principles for RNA metabolism and alternative transcription initiation within closely spaced promoters. Nat. Genet. 48, 984–994 (2016).

  31. Scruggs, B. S. et al. Bidirectional transcription arises from two distinct hubs of transcription factor binding and active chromatin. Mol. Cell 58, 1101–1112 (2015).

  32. Schreiber, J., Singh, R., Bilmes, J. & Noble, W. S. A pitfall for machine learning methods aiming to predict across cell types. Genome Biol. 21, 282 (2020).

  33. Becker, J. S. et al. Genomic and proteomic resolution of heterochromatin and its restriction of alternate fate genes. Mol. Cell 68, 1023–1037.e15 (2017).

  34. Auerbach, R. K. et al. Mapping accessible chromatin regions using Sono-Seq. Proc. Natl Acad. Sci. USA 106, 14926–14931 (2009).

  35. Shah, R. N. et al. Examining the roles of H3K4 methylation states with systematically characterized antibodies. Mol. Cell 72, 162–177 (2018).

  36. Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010).

  37. Jonkers, I., Kwak, H. & Lis, J. T. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. Elife 3, e02407 (2014).

  38. Ernst, J. & Kellis, M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825 (2010).

  39. Ernst, J. & Kellis, M. Chromatin-state discovery and genome annotation with ChromHMM. Nat. Protoc. 12, 2478–2492 (2017).

  40. Burns, E. N. et al. Generation of an equine biobank to be used for Functional Annotation of Animal Genomes project. Anim. Genet. 49, 564–570 (2018).

  41. Giuffra, E. & Tuggle, C. K., FAANG Consortium. Functional Annotation of Animal Genomes (FAANG): current achievements and roadmap. Annu Rev. Anim. Biosci. 7, 65–88 (2019).

  42. Kingsley, N. B. et al. Functionally annotating regulatory elements in the equine genome using histone mark ChIP-seq. Genes 11, 3 (2019).

  43. Chou, S.-P. et al. Genetic dissection of the RNA polymerase II transcription cycle. Preprint at bioRxiv https://doi.org/10.1101/2021.05.23.445279 (2021).

  44. Vispé, S. et al. Triptolide is an inhibitor of RNA polymerase I and II-dependent transcription leading predominantly to down-regulation of short-lived mRNA. Mol. Cancer Ther. 8, 2780–2790 (2009).

  45. Titov, D. V. et al. XPB, a subunit of TFIIH, is a target of the natural product triptolide. Nat. Chem. Biol. 7, 182–188 (2011).

  46. Krogan, N. J. et al. Methylation of histone H3 by Set2 in Saccharomyces cerevisiae is linked to transcriptional elongation by RNA polymerase II. Mol. Cell. Biol. 23, 4207–4218 (2003).

  47. Kizer, K. O. et al. A novel domain in Set2 mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. Mol. Cell. Biol. 25, 3305–3316 (2005).

  48. Steger, D. J. et al. DOT1L/KMT4 recruitment and H3K79 methylation are ubiquitously coupled with gene transcription in mammalian cells. Mol. Cell. Biol. 28, 2825–2839 (2008).

  49. Zheng, Y., Tipton, J. D., Thomas, P. M., Kelleher, N. L. & Sweet, S. M. M. Site-specific human histone H3 methylation stability: fast K4me3 turnover. Proteomics 14, 2190–2199 (2014).

  50. Long, Y. et al. RNA is essential for PRC2 chromatin occupancy and function in human pluripotent stem cells. Nat. Genet. 52, 1–8 (2020).

  51. Felsenfeld, G. A brief history of epigenetics. Cold Spring Harb. Perspect. Biol. 6, a018200 (2014).

  52. Young, R. S., Kumar, Y., Bickmore, W. A. & Taylor, M. S. Bidirectional transcription initiation marks accessible chromatin and is not specific to enhancers. Genome Biol. 18, 242 (2017).

  53. Struhl, K. Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nat. Struct. Mol. Biol. 14, 103–105 (2007).

  54. Danko, C. G. et al. Identification of active transcriptional regulatory elements from GRO-seq data. Nat. Methods 12, 433–438 (2015).

  55. Core, L. J. et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320 (2014).

  56. Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).

  57. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

  58. Gilchrist, D. A. et al. NELF-mediated stalling of Pol II can enhance gene expression by blocking promoter-proximal nucleosome assembly. Genes Dev. 22, 1921–1933 (2008).

  59. Skene, P. J. & Henikoff, S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife Sci. 6, e21856 (2017).

  60. Martin, B. J. E. et al. Transcription shapes genome-wide histone acetylation patterns. Nat. Commun. 12, 1–9 (2021).

  61. Lewis, J. J. et al. The Dryas iulia genome supports multiple gains of a W chromosome from a B chromosome in butterflies. Genome Biol. Evol. 13, evab128 (2021).

  62. Cicconardi, F. et al. Chromosome fusion affects genetic diversity and evolutionary turnover of functional loci but consistently depends on chromosome size. Mol. Biol. Evol. 38, 4449–4462 (2021).

  63. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  64. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  65. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  66. Li, D., Hsu, S., Purushotham, D., Sears, R. L. & Wang, T. WashU Epigenome Browser update 2019. Nucleic Acids Res. 47, W158–W165 (2019).

  67. Zhou, X. et al. The human epigenome browser at Washington University. Nat. Methods 8, 989–990 (2011).

  68. Bonhoure, N. et al. Quantifying ChIP-seq data: a spiking method providing an internal reference for sample-to-sample normalization. Genome Res. 24, 1157–1168 (2014).

  69. Smith, J. P., Dutta, A. B., Sathyan, K. M., Guertin, M. J. & Sheffield, N. C. PEPPRO: quality control and processing of nascent RNA profiling data. Genome Biol. 22, 155 (2021).

  70. Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012).

  71. Wang, Z., Chu, T., Choate, L. A. & Danko, C. G. Rgtsvm: support vector machines on a GPU in R. Preprint at arXiv https://arxiv.org/abs/1706.05544 (2017).

  72. Andersson, R. et al. Nuclear stability and transcriptional directionality separate functionally distinct RNA species. Nat. Commun. 5, 5336 (2014).

  73. Allen, M. A. et al. Global analysis of p53-regulated transcription identifies its direct targets and unexpected regulatory mechanisms. Elife 27, e02200 (2014).

  74. Danko, C. G. et al. Dynamic evolution of regulatory element ensembles in primate CD4+ T cells. Nat. Ecol. Evolution https://doi.org/10.1038/s41559-017-0447-5 (2018).

  75. Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE Blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).

  76. Nair, S., Kim, D. S., Perricone, J. & Kundaje, A. Integrating regulatory DNA sequence and gene expression to predict genome-wide chromatin accessibility across cellular contexts. Bioinformatics 35, i108–i116 (2019).

  77. Favorov, A. et al. Exploring massive, genome scale datasets with the GenometriCorr package. PLoS Comput. Biol. 8, e1002529 (2012).

  78. Ku, M. et al. Genomewide analysis of PRC1 and PRC2 occupancy identifies two classes of bivalent domains. PLoS Genet. 4, e1000242 (2008).

Acknowledgements

We thank the Extreme Science and Engineering Discovery Environment (allocation number TG-MCB160061) and Nvidia (GPU grant) for providing the computational resources required in this study. We thank J. Lewis, H. Yu, A. Vihervaara, M. DeBerardine and all members of the Danko and Lis laboratories for valuable discussions and suggestions. Work in this publication was supported by the National Human Genome Research Institute (grant R01-HG009309 to C.G.D.), the Zweig Memorial Fund for Equine Research (grant to D.F.A. and C.G.D.), the LiaoNing Revitalization Talents Program (grant XLYC2002010 to Z.W.) and the National Institutes of Health (grant RM1GM139738 to J.T.L.). D.F.A. is an investigator of the Dorothy Russell Havemeyer Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Some of the figures in this article were created using BioRender.

Author information

Contributions

Z.W., A.G.C. and C.G.D. designed the study. Z.W., T.C. and C.G.D. developed the support-vector regression method. A.G.C., E.J.R. and L.A.C. performed experimental research. A.G.C., Z.W., S.-P.C., J.T.L. and C.G.D. analyzed and interpreted sequencing data. A.G.C. performed and analyzed Trp experiments. D.C.M., N.B.K., J.L.P., C.J.F., R.R.B., D.F.A., E.J.R. and Z.W. prepared and analyzed data from FAANG horse liver tissue. Z.W., A.G.C., J.T.L. and C.G.D. wrote the manuscript. All authors were involved in revisions and approved the final manuscript.

Corresponding author

Correspondence to Charles G. Danko.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Karen Adelman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Imputation of histone marks using nascent transcription.

Scatterplots show predicted (Y-axis) as a function of experimental ChIP-seq signal (X-axis) for ten different histone modifications in K562 and GM12878. Plots show correlations in a holdout chromosome (chr22) at three distinct length scales.

Extended Data Fig. 2 Evaluating dHIT predictions.

a. ROC and PRC plots describe the relationship between imputed and ENCODE ChIP-seq data within ENCODE peaks on chr21, which was held out during dHIT training. b. Quantification of the area under the curve for both the ROC and PRC plots in A. (c–l) Heatmaps show the experimental and imputed abundance of active, punctate histone marks in K562 (C-G) or GM12878 (H-L). Heatmaps show all peak calls based on experimental ChIP-seq data, ordered by the highest total signal intensity. m. Scatterplots depict imputed H3K9me3 (Y-axis) as a function of experimental CUT&Tag signal (X-axis) for H3K9me3 in K562. Spearman correlations were computed on the holdout chromosomes chr21 (A) and chr22 (B). n. Mean-squared error (MSE) quantification at different subsets of genomic sites in GM12878.
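
The two summary statistics named in this legend, area under the ROC curve and mean-squared error, can be illustrated with a minimal sketch. This is not the dHIT evaluation code: the function names and toy values are assumptions made for illustration, and the ROC area is computed through its pairwise-ranking interpretation rather than by tracing the curve.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise-ranking identity.

    scores: imputed signal per window (hypothetical toy values).
    labels: 1 if the window overlaps an experimental peak, else 0.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_squared_error(predicted, observed):
    """Mean of squared differences between imputed and experimental signal."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Toy example: four windows, two of which overlap peaks.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

The pairwise form is quadratic in the number of windows, so it suits small illustrations only; a rank-based implementation would be preferable genome-wide.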

Extended Data Fig. 3 Comparison between experimental and imputed MNase ChIP-seq.

(a,b) Heatmaps show the Pearson (A) and Spearman (B) correlations between predicted and experimental MNase ChIP-seq in 10 kb windows on a holdout chromosome (chr22). c. Genome browser plots show the distribution of PRO-seq, DNase-I hypersensitivity signal, and the signal for H3K4me3, H3K4me2, and H3K4me1 derived from MNase ChIP-seq and imputation near 9 transcribed regions in K562 cells. d. Heatmaps show MNase ChIP-seq and imputed signal intensity for H3K36me3, a gene body mark, deposited in the body of annotated genes. Genes are sorted by gene length. e. Heatmaps show the distribution of transcription (left) and histone modifications (right) predicted using transcription. Rows represent transcription initiation domains in GM12878 cells defined using GRO-cap data by Core, Martins, et al. (2014) Nat. Genet. Heatmaps were ordered by the distance between the most frequently used TSS in each transcription initiation domain on the plus and minus strand.

Extended Data Fig. 4 Evaluation of cross-cell line imputation by different metrics.

(a-c) Heatmaps show Pearson’s correlation (A), Spearman’s rank correlation (B) and Jensen-Shannon divergence (C) between predicted and ChIP-seq measurements of nine histone modifications. Values are computed in 10 kb windows on the holdout chromosome (chr22) in humans, chr1 in horse, and chr1 in mice. Empty cells indicate that no experimental data are available for comparison in the cell type shown. (d) Heatmap shows Pearson’s correlation between the training dataset in K562 cells and experimental data collected in the indicated human cell line. Values are computed in 1 kb windows on the holdout chromosome (chr22) in humans. (e) Heatmap shows Pearson’s correlation between the ENCODE experimental data and either the imputed data or the average signal of the other human cell lines investigated. Values are computed in 1 kb windows on the holdout chromosome (chr22) in GM12878.
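
As a rough illustration of the three metrics in this legend, the sketch below bins two signal tracks into fixed-size windows and computes Pearson's correlation, Spearman's rank correlation and the Jensen-Shannon divergence between them. It is an assumption-laden toy, not the authors' pipeline: the helper names, the bin size and the input values are hypothetical, and the divergence is taken in base 2 after normalising each track to sum to 1.

```python
import math


def bin_signal(values, bin_size):
    """Sum a fine-grained signal track into consecutive fixed-size windows."""
    return [sum(values[i:i + bin_size]) for i in range(0, len(values), bin_size)]


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


def js_divergence(x, y):
    """Jensen-Shannon divergence (base 2) of two non-negative tracks."""
    total_x, total_y = sum(x), sum(y)
    p = [a / total_x for a in x]
    q = [b / total_y for b in y]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy tracks: the "imputed" track is exactly twice the "experimental" one.
experimental = bin_signal([0, 1, 2, 3, 4, 5, 6, 7], 2)
imputed = bin_signal([0, 2, 4, 6, 8, 10, 12, 14], 2)
r = pearson(experimental, imputed)
rho = spearman(experimental, imputed)
jsd = js_divergence(experimental, imputed)
```

Because the divergence is computed on normalised tracks, it is insensitive to a global scaling difference between imputed and experimental signal, whereas the correlations respond to how signal co-varies across windows.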

Extended Data Fig. 5 Comparison between imputation and multiple ChIP-seq experiments.

Box-and-whisker plot shows the Pearson correlation between different experimental datasets for six histone marks in K562 and GM12878. The correlation between data imputed in K562 and GM12878 and the ENCODE experimental data in the same cell line is shown by red and blue squares, respectively. All values are computed on a holdout chromosome (chr22) not used during training and are presented as mean values +/- standard deviation.

Extended Data Fig. 6 Comparison between imputed and experimental ChIP-seq.

a. Browser shot shows the ENCODE, imputed, and experimental ChIP-seq signals at the CERK locus. b. Meta plots compare the H3K27ac content of two different sets of H3K27ac annotated peaks: peaks high in ENCODE signal and depleted in imputed ChIP (top) or vice versa (bottom). c. Genome browser compares experimental and predicted H3K27me3 signals at all four Hox gene clusters in relation to PRO-seq signal. d. Principal component analysis of 86 H3K27me3 ChIP-seq datasets from the Epigenome Roadmap project. e. Genome browser shows the distribution of H3K27me3 in 8 of the Epigenome Roadmap cell lines. f. Quantification of PC1 H3K27me3 signal in 5 classes of cells. An unpaired Wilcoxon test was used to compare the Primary/Adult to the Pluripotent classes.

Extended Data Fig. 7 Chromatin annotations with dHIT.

a. Enrichment of 18 chromatin states near RefSeq annotated transcription start sites for histone abundance predicted by dHIT (thick solid line), ChIP-seq from Broad (thin solid line), or using an alternative source of ChIP-seq data (thin dashed line). (b,c) Confusion matrix shows the Jaccard distance between dHIT and ChIP-seq data in 18 chromatin states (B) or between two separate sources of ChIP-seq data (C). Color scales are shown beside the plot, and are identical between panels (B) and (C). d. Genome browser shows the distribution of transcription, H3K27ac, H3K4me3, H3K4me1 and H3K27me3 in equine liver. e. Genome browser shows the distribution of eight histone marks in mouse brain (top) and H3K27ac across nine murine tissues (bottom).
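
A Jaccard comparison of two chromatin-state annotations, as in panels b and c, can be sketched as follows. The sketch assumes each annotation is a list with one state label per genomic bin; the state names (`TssA`, `Enh`, `Quies`) and the toy labels are illustrative placeholders, not values from the study, and the function names are hypothetical.

```python
def jaccard_index(labels_a, labels_b, state_a, state_b):
    """Jaccard index between bins labelled state_a in one annotation
    and bins labelled state_b in the other (intersection over union)."""
    in_a = {i for i, s in enumerate(labels_a) if s == state_a}
    in_b = {i for i, s in enumerate(labels_b) if s == state_b}
    union = in_a | in_b
    return len(in_a & in_b) / len(union) if union else 0.0


def jaccard_matrix(labels_a, labels_b, states):
    """All-against-all matrix of Jaccard indices, rows from annotation A."""
    return [[jaccard_index(labels_a, labels_b, sa, sb) for sb in states]
            for sa in states]


# Toy annotations over six bins (assumed labels, for illustration only).
from_imputed = ["TssA", "TssA", "Enh", "Quies", "Quies", "Enh"]
from_chipseq = ["TssA", "Enh", "Enh", "Quies", "Quies", "Quies"]
states = ["TssA", "Enh", "Quies"]
matrix = jaccard_matrix(from_imputed, from_chipseq, states)
# Jaccard distance is 1 minus the index.
distance = [[1.0 - v for v in row] for row in matrix]
```

A strong diagonal in `matrix` means the two annotations assign the same state to the same bins; off-diagonal mass shows which states are confused with one another.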

Extended Data Fig. 8 Data validation.

a. PCA shows the first two principal components of nine histone modifications in nine murine tissues (81 total datasets) in 100 bp bins on mm10 chr1. b. PCA of active, punctate marks (H3K4me3, H3K4me2, H3K9ac, and H3K27ac) shows that active punctate marks cluster by tissue. c. Genome browser shows the distribution of H3K36me3, H3K4me3, and H3K4me1 observed using ChIP-seq experiments or predicted using either PRO-seq or H3K4me2. Data are shown in two loci covering several transcribed genes (top) and near the transcription start site of ZNF74 (bottom). d. Correlations between PRO-seq 0 h and H3K4me3 and H3K27ac at TSSs. e. Heatmaps centered on transcription initiation domains show loss in transcription measured by PRO-seq after Trp treatment. f. Genome browser shows loss in transcription, measured by PRO-seq after Trp treatment, at both enhancers and gene promoters. g. Spearman correlations between ChIP-seq replicates (left), each ChIP-seq replicate and ENCODE data (middle) genome-wide at 10 kb resolution, and at ENCODE peaks between merged replicates and ENCODE. h. H3 CUT&RUN 10 kb resolution Spearman correlation between replicates. i. Genome-wide, 10 kb resolution PCA of all ChIP-seq samples.

Extended Data Fig. 9 Changes in histone marks during Triptolide time course.

a. Heatmaps compare the level of H3K36me3 ChIP-seq after Triptolide inhibition. b. Meta plots show the H3K4me1 levels in a 4 kb window centered on transcription start sites in K562 cells. c. Meta plots show the level of H3K27me3 in a 40 kb window centered on EZH2 binding sites. d. Meta plots show transcription content of EZH2 binding sites during the Triptolide time course. e. Heatmaps show H3K27me3 signal within gene bodies during the Triptolide treatment. Genes are sorted by gene length. f. Schematic of the western blot experimental design. (g,h) Each western blot depicts the abundance of chromatin-bound histone mark or Pol II during the indicated Triptolide incubation time point. Each blot represents a different experiment. A dilution series of the untreated samples was used as a standard curve to quantify changes in signal. Experiments were repeated at least twice and a minimum of 2 replicates per histone mark are provided. MM denotes the molecular-weight marker, in kDa. i. Each western blot depicts the abundance of chromatin-bound H3K27ac or H3K27me3 during the indicated incubation time point of Triptolide, or Triptolide and Trichostatin dual treatment. Each blot represents a different experiment. A dilution series of the untreated samples was used as a standard curve to quantify changes in signal. Ponceau staining of the imaged membranes is also shown as a total protein loading control. j. Quantification of H3K27ac/H3K27me3 signals of the western blot in I. H3K27me3 was used as loading control. All values are depicted as mean values +/− SD.
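
One plausible way to read a band intensity off a dilution-series standard curve is piecewise-linear interpolation, sketched below. The legend does not state how the authors performed this quantification, so the interpolation scheme, the function name and all numbers here are assumptions for illustration only.

```python
def relative_abundance(intensity, standard_fractions, standard_intensities):
    """Map a band intensity to a fraction of the untreated sample.

    standard_fractions: loaded fraction of untreated sample per standard lane.
    standard_intensities: measured band intensity for each of those lanes.
    Assumes the standard intensities are distinct and increase with the
    loaded fraction. Values outside the curve are clamped to its ends.
    """
    points = sorted(zip(standard_intensities, standard_fractions))
    if intensity <= points[0][0]:
        return points[0][1]
    if intensity >= points[-1][0]:
        return points[-1][1]
    for (i0, f0), (i1, f1) in zip(points, points[1:]):
        if i0 <= intensity <= i1:
            # Linear interpolation between the two flanking standards.
            return f0 + (f1 - f0) * (intensity - i0) / (i1 - i0)


# Hypothetical standard curve: 100%, 50% and 25% of the untreated sample.
fractions = [1.0, 0.5, 0.25]
intensities = [1000.0, 520.0, 240.0]
estimate = relative_abundance(760.0, fractions, intensities)
```

Clamping at the ends of the curve avoids extrapolating beyond the dilution range, at the cost of under-reporting losses larger than the most dilute standard.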

Extended Data Fig. 10 Studying transcription activators and repressors.

a. Bar plots display absorbance quantified at 590 nm for alamarBlue dye incubated with K562 cells during Triptolide, or Triptolide and Trichostatin A, treatments. Two technical replicates were averaged for each time point. R1 and R2 denote separate biological replicates of the experiment. b. Scatterplots display the loss in H3K4me3 (left) and H3K27ac (right) as a function of Pol II transcription (top) or change in transcription (bottom). Changes in histone marks and transcription were calculated as log2 fold changes between 4 h of Triptolide treatment and untreated cells. Plots show Spearman’s Rho between conditions. (c-h) Scatterplots show experimental DNase-I hypersensitivity (x axis) as a function of predicted DNase-I hypersensitivity (y axis) in 100 bp windows intersected with transcriptional repressors (C-E) or transcriptional activators (F-H). (i-j) Meta (I) and Violin (J) plots display TBP (TATA-binding protein) CUT&RUN signal at gene promoters and enhancers in a short 30 min Triptolide time course.
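
The log2 fold change used in panel b can be written out as a short sketch. The pseudocount is an assumption added here to keep the ratio defined when a site has zero signal; the legend does not say whether one was used, and the function name and read counts are hypothetical.

```python
import math


def log2_fold_change(treated, untreated, pseudocount=1.0):
    """log2 of treated over untreated signal at one site.

    The pseudocount (an assumption of this sketch) is added to both
    values so that sites with zero signal do not divide by zero.
    """
    return math.log2((treated + pseudocount) / (untreated + pseudocount))


# Hypothetical read counts at one TSS: 4 h Triptolide versus untreated.
change = log2_fold_change(treated=3, untreated=15)
unchanged = log2_fold_change(treated=7, untreated=7)
```

Negative values indicate loss of signal after treatment; a value of -2 corresponds to a fourfold reduction.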

Supplementary information

Supplementary Information

Supplementary Notes 1–7.

Reporting Summary

Peer Review Information

Supplementary Table 1

Excel document describing the source of all PRO-seq and ChIP-seq data used in our study.

About this article

Cite this article

Wang, Z., Chivu, A.G., Choate, L.A. et al. Prediction of histone post-translational modification patterns based on nascent transcription data. Nat Genet 54, 295–305 (2022). https://doi.org/10.1038/s41588-022-01026-x

  • DOI: https://doi.org/10.1038/s41588-022-01026-x
