Improving representations of genomic sequence motifs in convolutional networks with exponential activations

Koo, Peter K.; Ploenzke, Matt

doi:10.1038/s42256-020-00291-x

Article
Published: 08 February 2021

Improving representations of genomic sequence motifs in convolutional networks with exponential activations

Nature Machine Intelligence volume 3, pages 258–266 (2021)Cite this article

2789 Accesses
27 Citations
71 Altmetric
Metrics details

Subjects

A preprint version of the article is available at bioRxiv.

Abstract

Deep convolutional neural networks (CNNs) trained on regulatory genomic sequences tend to build representations in a distributed manner, making it a challenge to extract learned features that are biologically meaningful, such as sequence motifs. Here we perform a comprehensive analysis of synthetic sequences to investigate the role that CNN activations have on model interpretability. We show that employing an exponential activation in the first layer filters consistently leads to interpretable and robust representations of motifs compared with other commonly used activations. Strikingly, we demonstrate that CNNs with better test performance do not necessarily imply more interpretable representations with attribution methods. We find that CNNs with exponential activations significantly improve the efficacy of recovering biologically meaningful representations with attribution methods. We demonstrate that these results generalize to real DNA sequences across several in vivo datasets. Together, this work demonstrates how a small modification to existing CNNs (that is, setting exponential activations in the first layer) can substantially improve the robustness and interpretabilty of learned representations directly in convolutional filters and indirectly with attribution methods.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on Springer Link
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Motif representation performance.**

**Fig. 2: Interpretability performance of saliency maps.**

**Fig. 3: Attribution score comparison for real regulatory DNA sequences.**

Highly accurate protein structure prediction with AlphaFold

Article Open access 15 July 2021

Inferring gene regulatory networks from single-cell multiome data using atlas-scale external data

Article Open access 12 April 2024

scGPT: toward building a foundation model for single-cell multi-omics using generative AI

Article 26 February 2024

Data availability

Data for tasks 1, 3, 5 and 6, and code to generate data for Task 2, are available at https://doi.org/10.5281/zenodo.4301062. Data for Task 4 are available via ref. ¹.

Code availability

Code to reproduce results and figures is available at https://doi.org/10.5281/zenodo.4301062.

References

Kelley, D. R., Snoek, J. & Rinn, J. L. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26, 990–998 (2016).
Article Google Scholar
Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
Article Google Scholar
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548 (2019).
Article Google Scholar
Bogard, N., Linder, J., Rosenberg, A. B. & Seelig, G. A deep neural network for predicting and engineering alternative polyadenylation. Cell 178, 91–106 (2019).
Article Google Scholar
Koo, P. K. & Ploenzke, M. Deep learning for inferring transcription factor binding sites. Curr. Opin. Syst. Biol. 19, 16–23 (2020).
Article Google Scholar
Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at https://arxiv.org/abs/1312.6034 (2013).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In International Conference on Machine Learning Vol. 70, 3319–3328 (ICML, 2017).
Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. In International Conference on Machine Learning Vol. 70, 3145–3153 (ICML, 2017).
Lundberg, S. & Lee, S. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 4765–4774 (NeurIPS, 2017).
Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–8 (2015).
Article Google Scholar
Selvaraju, R. et al. Grad-cam: visual explanations from deep networks via gradient-based localization. In IEEE International Conference on Computer Vision 618–626 (IEEE, 2017).
Jha, A., Aicher, J. K., Gazzara, M. R., Singh, D. & Barash, Y. Enhanced integrated gradients: improving interpretability of deep learning models using splicing codes as a case study. Genome Biol. 21, 1–22 (2020).
Article Google Scholar
Erhan, D., Bengio, Y., Courville, A. & Vincent, P. Visualizing higher-layer features of a deep network. In ICML Workshop on Learning Feature Hierarchies Vol. 1341 (ICML, 2009).
Yosinski, J., Clune, J., Nguyen, A., Fuchs, T. & Lipson, H. Understanding neural networks through deep visualization. Preprint at https://arxiv.org/abs/1506.06579 (2015).
Lanchantin, J., Singh, R., Lin, Z. & Qi, Y. Deep motif: visualizing genomic sequence classifications. Preprint at https://arxiv.org/abs/1605.01133 (2016).
Shrikumar, A. et al. echnical Note on Transcription Factor Motif Discovery from Importance Scores (TF-MoDISco) version 0.5. 1.1. Preprint at https://arxiv.org/abs/1811.00416 (2018).
Koo, P., Qian, S., Kaplun, G., Volf, V. & Kalimeris, D. Robust neural networks are more interpretable for genomics. Preprint at https://www.biorxiv.org/content/10.1101/657437v1 (2019).
Koo, P. K. & Eddy, S. R. Representation learning of genomic sequence motifs with convolutional neural networks. PLoS Comput. Biol. https://doi.org/10.1371/journal.pcbi.1007560 (2019).
Ploenzke, M. & Irizarry, R. Interpretable convolution methods for learning genomic sequence motifs. Preprint at https://www.biorxiv.org/content/10.1101/411934v1 (2018).
Raghu, M., Poole, B., Kleinberg, J., Ganguli, S. & Sohl-Dickstein, J. On the expressive power of deep neural networks. Preprint at https://arxiv.org/abs/1606.05336 (2016).
Kelley, D. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–50 (2018).
Article Google Scholar
Nair, V. & Hinton, G. E. Rectified linear units improve restricted boltzmann machines. In International Conference on Machine Learning, 807–814 (2010).
Dugas, C., Bengio, Y., Belisle, F., Nadeau, C. & Garcia, R. Incorporating second-order functional knowledge for better option pricing. In Advances in Neural Information Processing Systems 472–478 (NeurIPS, 2001).
Clevert, D. A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). Preprint at https://arxiv.org/abs/1511.07289 (2015).
Pennington, J., Schoenholz, S. & Ganguli, S. Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice. In Advances in Neural Information Processing Systems 4785–4795 (NeurIPS, 2017).
Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. & Noble, W. S. Quantifying similarity between motifs. Genome Biol. 8, R24 (2007).
Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proc. 13th International Conference on Artificial Intelligence and Statistics Vol. 9, 249–256 (AISTATS, 2010).
He, K., Zhang, X., Ren, S. & Sun, J. Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In IEEE International Conference on Computer Vision 1026–1034 (IEEE, 2015).
LeCun, Y. A., Bottou, L., Orr, G. B. & Müller, K.-R. in Neural networks: Tricks of the Trade 9–48 (Springer, 2012).
Klambauer, G., Unterthiner, T., Mayr, A. & Hochreiter, S. Self-normalizing neural networks. In Advances in Neural Information Processing Systems 971–980 (NeurIPS, 2017).
Siggers, T. & Gordan, R. Protein-DNA binding: complexities and multi-protein codes. Nucleic Acids Res. 42, 2099–2111 (2014).
Article Google Scholar
Stormo, G. D., Schneider, T. D., Gold, L. & Ehrenfeucht, A. Use of the ‘perceptron’ algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res. 10, 2997–3011 (1982).
Article Google Scholar
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Article Google Scholar
Grant, C. E., Bailey, T. L. & Noble, W. S. Fimo: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
Article Google Scholar
Inukai, S., Kock, K. H. & Bulyk, M. L. Transcription factor-DNA binding: beyond binding site motifs. Curr. Opin. Genet. Dev. 43, 110–119 (2017).
Article Google Scholar
Simcha, D., Price, N. D. & Geman, D. The limits of de novo DNA motif discovery. PLoS One 7, e47836 (2012).
Kulkarni, M. M. & Arnosti, D. N. Information display by transcriptional enhancers. Development 130, 6569–6575 (2003).
Article Google Scholar
Slattery, M. et al. Absence of a simple code: how transcription factors read the genome. Trends Biochem. Sci. 39, 381–99 (2014).
Article Google Scholar
Tsipras, D., Santurkar, S., Engstrom, L., Turner, A. & Madry, A. Robustness may be at odds with accuracy. Preprint at https://arxiv.org/abs/1805.12152 (2018).
Adebayo, J. et al. Sanity checks for saliency maps. In Advances in Neural Information Processing Systems 9505–9515 (NeurIPS, 2018).
Sixt, L., Granz, M. & Landgraf, T. When explanations lie: why modified BP attribution fails. Preprint at https://arxiv.org/abs/1912.09818 (2019).
Adebayo, J., Gilmer, J., Goodfellow, I. & Kim, B. Local explanation methods for deep neural networks lack sensitivity to parameter values. Preprint at https://arxiv.org/abs/1810.03307 (2018).
Piper, M., Gronostajski, R. & Messina, G. Nuclear factor one X in development and disease. Trends Cell Biol. 29, 20–30 (2019).
Article Google Scholar
Forrest, M. P. et al. The emerging roles of TCF4 in disease and development. Trends Mol. Med. 20, 322–331 (2014).
Article Google Scholar
Wei, B. et al. A protein activity assay to measure global transcription factor activity reveals determinants of chromatin accessibility. Nat. Biotechnol. 36, 521–529 (2018).
Article Google Scholar
Koo, P. K., Ploenzke, M., Anand, P., Paul, S. & Majdandzic, A. Global importance analysis: a method to quantify importance of genomic features in deep neural networks. Preprint at https://www.biorxiv.org/content/10.1101/2020.09.08.288068v1 (2020).
Mathelier, A. et al. Jaspar 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 44, D110–D115 (2016).
Article Google Scholar
Consortium, E. P. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Article Google Scholar
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Article Google Scholar
Vakoc, C R. ZBED2 is an antagonist of interferon regulatory factor 1 and modifies cell identity in pancreatic cancer. Proc. Natl Acad. Sci. USA 117, 11471–11482 (2020).
Article Google Scholar
Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. Preprint at https://arxiv.org/abs/1502.03167 (2015).
Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
MathSciNet MATH Google Scholar
Kingma, D. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Tareen, A. & Kinney, J. Logomaker: beautiful sequence logos in python. Preprint at https://www.biorxiv.org/content/10.1101/635029v1 (2019).

Download references

Acknowledgements

This work was supported in part by funding from the NCI Cancer Center Support Grant (CA045508) and the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory. M.P. was supported by NIH NCI RFA-CA-19-002. The authors would like to thank D. Krotov, who provided inspiration for the exponential activation. We would also like to thank J. Kinney, A. Tareen and the members of the Koo laboratory for helpful discussions.

Author information

These authors contributed equally: Peter K. Koo and Matt Ploenzke.

Authors and Affiliations

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Peter K. Koo
Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
Matt Ploenzke

Authors

Peter K. Koo
View author publications
You can also search for this author in PubMed Google Scholar
Matt Ploenzke
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

P.K.K. conceived of the experiments. P.K.K. and M.P. conducted the experiments. P.K.K. and M.P. analysed the results. All authors reviewed the manuscript.

Corresponding author

Correspondence to Peter K. Koo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Smita Krishnaswamy and the other, anonymous reviewer(s), for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Task 1 motif representations for CNNs with modified activations.

a, Boxplot of the fraction of filters that match ground truth motifs for different CNNs with traditional and modified activations. b, Boxplot of the fraction of filters that match ground truth motifs for an ablation study of transformations for modified activations. c, First layer filter scans from CNN-deep with relu activations (top) and exponential activations (middle). Each colour represents a different filter. d, Motif scans (top) and PWM scans (middle) using ground truth motifs and their reverse-complements (each colour represents a different filter scan). Negative PWM scan values were rectified to a value of zero. c,d, The information content of the sequence model used to generate the synthetic sequence (ground truth), which has 3 embedded motifs centred at positions 15, 85, and 150, is shown at the bottom. e, Boxplot of the fraction of filters that match ground truth motifs for CNN-deep with various activations: log activations trained with and without L2-regularization (Log-Relu-L2 and Log-Relu, respectively) and relu activations with and without L2-regularization. a,b,e, Each boxplot represents the performance across 10 models trained with different random intialisations (box represents first and third quartile and the red line represents the median).

Extended Data Fig. 2 Interpretability performance comparison of different attribution methods.

Boxplots of the interpretability AUROC (a) and AUPR b, for CNN-local (top) and CNN-dist (bottom) with relu activations (left) and exponential activations (right) for different attribution methods. Each boxplot represents the performance across 10 models trained with different random intialisations (box represents first and third quartile and the red line represents the median). Sequence logo of a saliency map for a Task 3 test sequence generated with different attribution methods for CNN-deep with relu activations (c) and exponential activations (d). The right y-axis label shows the interpretability AUROC score. c-d, The sequence logo for the ground truth sequence model is shown at the bottom.

Supplementary information

Supplementary Information

Supplementary Tables 1–8 and Figs. 1–16.

Reporting Summary

Supplementary Data

TomTom results for Task 2 and Task 4.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Koo, P.K., Ploenzke, M. Improving representations of genomic sequence motifs in convolutional networks with exponential activations. Nat Mach Intell 3, 258–266 (2021). https://doi.org/10.1038/s42256-020-00291-x

Download citation

Received: 11 June 2020
Accepted: 18 December 2020
Published: 08 February 2021
Issue Date: March 2021
DOI: https://doi.org/10.1038/s42256-020-00291-x

This article is cited by

Hold out the genome: a roadmap to solving the cis-regulatory code
- Carl G. de Boer
- Jussi Taipale
Nature (2024)
Cell-type-directed design of synthetic enhancers
- Ibrahim I. Taskiran
- Katina I. Spanier
- Stein Aerts
Nature (2024)
EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations
- Nicholas Keone Lee
- Ziqi Tang
- Peter K. Koo
Genome Biology (2023)
ExplaiNN: interpretable and transparent neural networks for genomics
- Gherman Novakovsky
- Oriol Fornes
- Wyeth W. Wasserman
Genome Biology (2023)
Correcting gradient-based interpretations of deep neural networks for genomics
- Antonio Majdandzic
- Chandana Rajesh
- Peter K. Koo
Genome Biology (2023)