Evaluating deep learning for predicting epigenomic profiles

Toneyan, Shushan; Tang, Ziqi; Koo, Peter K.

doi:10.1038/s42256-022-00570-9

Analysis
Published: 05 December 2022

Nature Machine Intelligence volume 4, pages 1088–1100 (2022)

A preprint version of the article is available at bioRxiv.

Abstract

Deep learning has been successful at predicting epigenomic profiles from DNA sequences. Most approaches frame this task as a binary classification relying on peak callers to define functional activity. Recently, quantitative models have emerged to directly predict the experimental coverage values as a regression. As new models with different architectures and training configurations continue to emerge, a major bottleneck is forming due to the lack of ability to fairly assess the novelty of proposed models and their utility for downstream biological discovery. Here we introduce a unified evaluation framework and use it to compare various binary and quantitative models trained to predict chromatin accessibility data. We highlight various modelling choices that affect generalization performance, including a downstream application of predicting variant effects. In addition, we introduce a robustness metric that can be used to enhance model selection and improve variant effect predictions. Our empirical study largely supports that quantitative modelling of epigenomic profiles leads to better generalizability and interpretability.

**Fig. 2: Evaluation of Basenji-based quantitative models.**

**Fig. 3: Testing model robustness against translational shifts.**

**Fig. 4: Performance comparison between binary and quantitative models.**

**Fig. 5: Comparison of functional effect predictions.**

**Fig. 6: GIA for ResidualBind-32 on PC-3 cell line.**

Data availability

The processed ATAC-seq data, the JASPAR 2022 core vertebrate motifs data set and the CAGI5 challenge data set that support the findings of this study are available in Zenodo at https://doi.org/10.5281/zenodo.6464031⁵⁸.

Code availability

The code to reproduce the results and figures in this study is available in Zenodo at https://doi.org/10.5281/zenodo.6464031⁵⁸. The open-source project repository is available on GitHub at https://github.com/shtoneyan/gopher. A stable version of the code used to generate the results in the paper is available in Zenodo at https://doi.org/10.5281/zenodo.6977213⁵⁹.

References

Quang, D. & Xie, X. FactorNet: a deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data. Methods 166, 40–47 (2019).
Li, H., Quang, D. & Guan, Y. Anchor: trans-cell type prediction of transcription factor binding sites. Genome Res. 29, 281–292 (2019).
Zheng, A. et al. Deep neural networks identify sequence context features predictive of transcription factor binding. Nat. Mach. Intell. 3, 172–180 (2021).
Kelley, D. R., Snoek, J. & Rinn, J. L. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26, 990–999 (2016).
Minnoye, L. et al. Cross-species analysis of enhancer logic using deep learning. Genome Res. 30, 1815–1834 (2020).
Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
Yin, Q., Wu, M., Liu, Q., Lv, H. & Jiang, R. DeepHistone: a deep learning approach to predicting histone modifications. BMC Genomics https://doi.org/10.1186/s12864-019-5489-4 (2019).
Dey, K. K. et al. Evaluating the informativeness of deep learning annotations for human complex diseases. Nat. Commun. 11, 4703 (2020).
Cheng, J., Çelik, M. H., Kundaje, A. & Gagneur, J. MTSplice predicts effects of genetic variants on tissue-specific splicing. Genome Biol. 22, 1–19 (2021).
Zhou, J. et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nat. Genet. 51, 973–980 (2019).
Park, C. Y. et al. Genome-wide landscape of RNA-binding protein target site dysregulation reveals a major impact on psychiatric disorder risk. Nat. Genet. 53, 166–173 (2021).
Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
Kim, D. S. et al. The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation. Nat. Genet. 53, 1564–1576 (2021).
Novakovsky, G., Saraswat, M., Fornes, O., Mostafavi, S. & Wasserman, W. W. Biologically relevant transfer learning improves transcription factor binding prediction. Genome Biol. 22, 1–25 (2021).
Atak, Z. K. et al. Interpretation of allele-specific chromatin accessibility using cell state-aware deep learning. Genome Res. 31, 1082–1096 (2021).
Li, J., Pu, Y., Tang, J., Zou, Q. & Guo, F. DeepATT: a hybrid category attention neural network for identifying functional effects of DNA sequences. Brief. Bioinform. 22, bbaa159 (2021).
Karbalayghareh, A., Sahin, M. & Leslie, C. S. Chromatin interaction-aware gene regulatory modeling with graph attention networks. Genome Res. 32, 930–944 (2022).
Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
Janssens, J. et al. Decoding gene regulation in the fly brain. Nature 601, 630–636 (2022).
Vaishnav, E. D. et al. The evolution, evolvability and engineering of gene regulatory DNA. Nature 603, 455–463 (2022).
Zhou, J. Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat. Genet. 54, 725–734 (2022).
Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
Kelley, D. R. Cross-species regulatory sequence activity prediction. PLoS Comput. Biol. 16, e1008050 (2020).
Maslova, A. et al. Deep learning of immune cell differentiation. Proc. Natl Acad. Sci. USA 117, 25655–25666 (2020).
Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
Koohy, H., Down, T. A., Spivakov, M. & Hubbard, T. A comparison of peak callers used for DNase-seq data. PLoS ONE 9, e96303 (2014).
Madry, A., Makelov, A., Schmidt, L., Tsipras, D. & Vladu, A. Towards deep learning models resistant to adversarial attacks. In Proc. International Conference on Learning Representations (2018).
Cohen, J., Rosenfeld, E. & Kolter, Z. Certified adversarial robustness via randomized smoothing. In Proc. International Conference on Machine Learning (ed. Chaudhuri, K. and Salakhutdinov, R.) 97, 1310–1320 (PMLR, 2019).
Goodfellow, I. J., Shlens, J. & Szegedy, C. Explaining and harnessing adversarial examples. In Proc. 3rd International Conference on Learning Representations (ed. Bengio, Y., LeCun, Y.) (2015).
Yu, F. & Koltun, V. Multi-scale context aggregation by dilated convolutions. In Proc. International Conference on Learning Representations (ed. Bengio, Y., LeCun, Y.) (2016).
Koo, P. K., Majdandzic, A., Ploenzke, M., Anand, P. & Paul, S. B. Global importance analysis: an interpretability method to quantify importance of genomic features in deep neural networks. PLoS Comput. Biol. 17, e1008925 (2021).
Kircher, M. et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun. 10, 3583 (2019).
Shigaki, D. et al. Integration of multiple epigenomic marks improves prediction of variant impact in saturation mutagenesis reporter assay. Hum. Mutat. 40, 1280–1291 (2019).
Koo, P. K. & Eddy, S. R. Representation learning of genomic sequence motifs with convolutional neural networks. PLoS Comput. Biol. 15, e1007560 (2019).
Koo, P. K. & Ploenzke, M. Improving representations of genomic sequence motifs in convolutional networks with exponential activations. Nat. Mach. Intell. 3, 258–266 (2021).
Ghotra, R., Lee, N. K., Tripathy, R. & Koo, P. K. Designing interpretable convolution-based hybrid networks for genomics. Preprint at bioRxiv https://doi.org/10.1101/2021.07.13.452181 (2021).
Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).
Mauduit, D. et al. Analysis of long and short enhancers in melanoma cell states. eLife 10, e71735 (2021).
de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).
Stormo, G. D., Schneider, T. D., Gold, L. & Ehrenfeucht, A. Use of the ‘perceptron’ algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res. 10, 2997–3011 (1982).
Le, D. D. et al. Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding. Proc. Natl Acad. Sci. USA 115, E3702–E3711 (2018).
Levo, M. et al. Unraveling determinants of transcription factor binding outside the core binding site. Genome Res. 25, 1018–1029 (2015).
Kodzius, R. et al. CAGE: cap analysis of gene expression. Nat. Methods 3, 211–222 (2006).
Vaswani, A. et al. Attention is all you need. In International Conference on Neural Information Processing Systems Vol. 30, 5998–6008 (Curran Associates, 2017).
Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21–29 (2015).
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Yu, F., Koltun, V. & Funkhouser, T. Dilated residual networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 472–480 (Curran Associates, 2017).
Hendrycks, D. & Gimpel, K. Gaussian error linear units (GeLUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proc. International Conference on Machine Learning 448–456 (ed. Bach F., Blei D.M.) (PMLR, 2015).
Kingma, D. & Ba, J. Adam: A method for stochastic optimization. In Proc. International Conference on Learning Representations (ed. Bengio, Y., LeCun, Y.) (2015).
Biewald, L. Software for experiment tracking with weights and biases. Weights & Biases https://www.wandb.com/ (2020).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (Curran Associates, 2016).
Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. & Noble, W. S. Quantifying similarity between motifs. Genome Biol. 8, 1–9 (2007).
Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173 (2021).
Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. In Workshop Track Proc. International Conference on Learning Representations (ed. Bengio, Y., LeCun, Y.) (2014).
Tareen, A. & Kinney, J. B. Logomaker: beautiful sequence logos in Python. Bioinformatics 36, 2272–2274 (2020).
Toneyan, S., Tang, Z. & Koo, P. Evaluating deep learning for predicting epigenomic profiles. Zenodo https://doi.org/10.5281/zenodo.6464031 (2022).
Toneyan, S., Tang, Z. & Kaczmarzyk, J. shtoneyan/gopher: stable. Zenodo https://doi.org/10.5281/zenodo.6977213 (2022).

Acknowledgements

This work was supported in part by funding from NIH grant R01HG012131 and the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory. This work was performed with assistance from US National Institutes of Health grant S10OD028632-01. We thank J. Kaczmarzyk and other members of the Koo laboratory for helpful discussions.

Author information

These authors contributed equally: Shushan Toneyan and Ziqi Tang.

Authors and Affiliations

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Shushan Toneyan, Ziqi Tang & Peter K. Koo

Contributions

S.T., Z.T. and P.K.K. conceived the experiments. S.T. and Z.T. wrote the code base, conducted the experiments and analysed and interpreted the results. All authors contributed to the manuscript.

Corresponding author

Correspondence to Peter K. Koo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Anat Kreimer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Evaluation of BPNet-based quantitative models.

(a) Loss function analysis. Scatter plot of the whole-chromosome Pearson’s r versus the MSE for different loss functions (shown in a different color) and different target resolutions (shown in a different marker). The results for the scaled Pearson’s r loss function were removed due to poor training runs. (b) Bin resolution analysis. Plot of the whole-chromosome Pearson’s r for models trained on a given bin size (y-axis) with predictions that were systematically down-sampled to a lower resolution for evaluation (x-axis). (a, b) Pearson’s r represents the average across cell lines.
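
One way to read panel (b): predictions made at a fine bin size are re-binned to a coarser one, then compared with experimental coverage at that same coarser resolution using Pearson's r. The following is a minimal sketch of that evaluation step, not the authors' implementation. The helper names `rebin` and `pearson_r`, the choice to average (rather than sum) within each coarse bin, and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def rebin(coverage, factor):
    """Average consecutive groups of `factor` bins (assumed aggregation).

    Any trailing bins that do not fill a complete group are dropped.
    """
    n_bins = len(coverage) // factor
    return [
        sum(coverage[i * factor:(i + 1) * factor]) / factor
        for i in range(n_bins)
    ]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy coverage tracks (hypothetical values, one number per bin).
predicted = [0.0, 2.0, 1.0, 3.0, 4.0, 6.0, 5.0, 7.0]
observed = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 6.0, 6.0]

r_native = pearson_r(predicted, observed)
# Down-sample both tracks by a factor of 2 before scoring.
r_coarse = pearson_r(rebin(predicted, 2), rebin(observed, 2))
```

In this toy case the bin-to-bin jitter in `predicted` averages out after re-binning, so `r_coarse` is higher than `r_native`. Real tracks need not behave this way.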

Extended Data Fig. 2 The effect of smoothing coverage on performance.

Basenji-based models were trained on target resolutions (y-axis) and evaluated using different levels of smoothing with a box-car filter. For each higher resolution model, a box-car filter was applied to both predictions and experimental coverage values with various kernel sizes prior to calculating the average Pearson’s r (x-axis). Pearson’s r represents the average across cell lines.

Extended Data Fig. 3 Performance comparison between quantitative and binary models.

Scatter plot of the classification-based AUPR versus the regression-based Pearson’s r for various binary models (blue) and quantitative models (orange) on peak-centred test data (left) and whole-chromosome test data (right). Metrics represent the average across cell lines.

Extended Data Fig. 4 GIA for optimal flanking nucleotides of motifs in PC-3 cell line for various models.

Ranked plot of the global importance of each tested flank for the AP-1 motif (left column), ATAAA motif (middle column) and GATA motif (right column), with each model shown in a separate row. The dashed line represents the global importance of the core motif with random flanks. The hue in the first column represents the position-weight-matrix (PWM) score for an AP-1 motif from the JASPAR database (ID: MA0491.1). The first three rows are quantitative models; the rest are binary models (with (exp) in the name indicating that the first-layer ReLU activation has been replaced with an exponential function). For binary models, the results are based on the logits before the output sigmoid activation. The black dot in each plot in the first column indicates ‘TGTGATTCATG’, which has a high PWM score (12.800) but yields a global importance close to that of the core motif with randomized flanks.

Extended Data Fig. 5 GIA for distance dependence between AP-1 and other motifs for PC-3 cell line for various models.

Global importance plot for sequences with an AP-1 motif fixed at the centre of the sequence and another motif that is systematically placed at different locations. Positive and negative values represent the first positions at which the motifs with optimized flanks could be embedded without overlapping. The first column shows results where the second motif is an identical AP-1 motif, the centre column shows results for the ATAAA motif and the right column for the GATA motif. All the motifs were embedded with optimized flanks. Red vertical dashed lines indicate the 1,024 bp position. Each row corresponds to a different trained model: the first three are quantitative models and the rest are binary models (with (exp) in the name indicating that the first-layer ReLU activation has been replaced with an exponential function). For binary models, the results are based on the logits before the output sigmoid activation.

Extended Data Fig. 6 GIA for cooperative interactions between AP-1 and other motifs for PC-3 cell line for various models.

Each column corresponds to a motif pair: two copies of AP-1, AP-1 and ATAAA, or AP-1 and GATA. Each row corresponds to a different trained model: the first three are quantitative models and the rest are binary models (with (exp) in the name indicating that the first-layer ReLU activation has been replaced with an exponential function). For binary models, the results are based on the logits before the output sigmoid activation. Blue and orange box plots show the global importance scores for the 1,000 sampled sequences when motif 1 or motif 2 is individually embedded. The green box plot shows the case when both motifs are embedded in the same sequence. The red box plot shows the sum of the blue and orange boxes as an estimate of the global importance if there is no interaction. The pairs were embedded at the optimal distance identified in the distance-dependence GIA experiments. Box plots show the first and third quartiles, the central line is the median, and the whiskers show the range of the data with outliers removed. For each motif pair experiment, n = 1,000 independent samples were drawn from the test set sequences.
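
The comparison in this figure can be sketched as follows: global importance is taken as the mean change in model output when a pattern is embedded into background sequences, and the no-interaction estimate is the sum of the two single-motif scores. This is a minimal sketch under stated assumptions, not the study's pipeline. `toy_model`, `embed`, `global_importance`, the motif strings, the embedding positions and the A/C-only backgrounds are all hypothetical stand-ins (the study samples backgrounds from test set sequences and queries trained networks).

```python
import random


def embed(seq, motif, pos):
    """Overwrite `seq` with `motif` starting at index `pos`."""
    return seq[:pos] + motif + seq[pos + len(motif):]


def global_importance(model, backgrounds, inserts):
    """Mean change in model output after embedding (motif, position) pairs."""
    deltas = []
    for seq in backgrounds:
        mutated = seq
        for motif, pos in inserts:
            mutated = embed(mutated, motif, pos)
        deltas.append(model(mutated) - model(seq))
    return sum(deltas) / len(deltas)


def toy_model(seq):
    """Hypothetical stand-in for a trained network: two motif effects
    plus a bonus when both motifs are present (a built-in interaction)."""
    has_ap1 = "TGACTCA" in seq
    has_gata = "GATAA" in seq
    return 1.0 * has_ap1 + 0.5 * has_gata + 1.0 * (has_ap1 and has_gata)


# Toy backgrounds drawn from A/C only, so neither motif occurs by chance.
rng = random.Random(0)
backgrounds = ["".join(rng.choice("AC") for _ in range(40)) for _ in range(100)]

motif_1, motif_2 = ("TGACTCA", 10), ("GATAA", 25)  # non-overlapping positions
gi_1 = global_importance(toy_model, backgrounds, [motif_1])
gi_2 = global_importance(toy_model, backgrounds, [motif_2])
gi_both = global_importance(toy_model, backgrounds, [motif_1, motif_2])

additive_estimate = gi_1 + gi_2            # expected if there is no interaction
interaction = gi_both - additive_estimate  # excess attributable to cooperativity
```

A joint score well above the additive estimate, as the toy model produces by construction, is the pattern one would read as a cooperative interaction. A joint score near the additive estimate suggests the motifs act independently.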

Supplementary information

Supplementary Information

Supplementary Tables 1–6, Figs. 1–4 and Notes 1–3.

Reporting Summary

Supplementary Data 1

Full evaluation of model prediction performance.

Supplementary Data 2

Model architecture and hyperparameters.

Supplementary Data 3

CAGI5 data set prediction performance.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Toneyan, S., Tang, Z. & Koo, P.K. Evaluating deep learning for predicting epigenomic profiles. Nat Mach Intell 4, 1088–1100 (2022). https://doi.org/10.1038/s42256-022-00570-9

Received: 05 May 2022
Accepted: 19 October 2022
Published: 05 December 2022
Issue Date: December 2022
DOI: https://doi.org/10.1038/s42256-022-00570-9
