Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms

AlQuraishi, Mohammed; Sorger, Peter K.

doi:10.1038/s41592-021-01283-4

Perspective
Published: 04 October 2021

Nature Methods volume 18, pages 1169–1180 (2021)

Abstract

Deep learning using neural networks relies on a class of machine-learnable models constructed using ‘differentiable programs’. These programs can combine mathematical equations specific to a particular domain of natural science with general-purpose, machine-learnable components trained on experimental data. Such programs are having a growing impact on molecular and cellular biology. In this Perspective, we describe an emerging ‘differentiable biology’ in which phenomena ranging from the small and specific (for example, one experimental assay) to the broad and complex (for example, protein folding) can be modeled effectively and efficiently, often by exploiting knowledge about basic natural phenomena to overcome the limitations of sparse, incomplete and noisy data. By distilling differentiable biology into a small set of conceptual primitives and illustrative vignettes, we show how it can help to address long-standing challenges in integrating multimodal data from diverse experiments across biological scales. This promises to benefit fields as diverse as biophysics and functional genomics.
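
To make the central idea concrete, the following minimal sketch (an illustration under stated assumptions, not code from the article) combines a domain equation, here a single-site binding isotherm, with a learnable assay gain, and fits both by gradient descent. The data, the parameter names (`log_kd`, `gain`) and the learning-rate settings are hypothetical. The gradients are derived by hand to keep the example dependency-free; in a differentiable programming framework such as TensorFlow, PyTorch or JAX, automatic differentiation would supply them.

```python
import math

# Hypothetical, noise-free "assay" readings generated from the model itself
# (true Kd = 2.0, true gain = 1.5). Illustrative values, not data from the article.
ligand = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
signal = [1.5 * c / (2.0 + c) for c in ligand]


def predict(log_kd, gain, c):
    """Domain equation (fraction bound = c / (Kd + c)) scaled by a learnable gain."""
    kd = math.exp(log_kd)  # parameterizing by log(Kd) keeps Kd positive
    return gain * c / (kd + c)


def loss_and_grads(log_kd, gain):
    """Mean squared error and its gradients with respect to log_kd and gain."""
    kd = math.exp(log_kd)
    n = len(ligand)
    loss = g_log_kd = g_gain = 0.0
    for c, y in zip(ligand, signal):
        frac = c / (kd + c)
        err = gain * frac - y
        loss += err * err / n
        # Chain rule: d(frac)/d(log_kd) = -c * kd / (kd + c)^2
        g_log_kd += 2.0 * err * gain * (-c * kd / (kd + c) ** 2) / n
        g_gain += 2.0 * err * frac / n
    return loss, g_log_kd, g_gain


def fit(steps=10000, lr=0.1):
    """Plain gradient descent from an arbitrary starting guess (Kd = 1, gain = 1)."""
    log_kd, gain = 0.0, 1.0
    for _ in range(steps):
        _, g_log_kd, g_gain = loss_and_grads(log_kd, gain)
        log_kd -= lr * g_log_kd
        gain -= lr * g_gain
    return math.exp(log_kd), gain


kd_fit, gain_fit = fit()
print(f"fitted Kd = {kd_fit:.3f}, fitted gain = {gain_fit:.3f}")
```

Because the binding equation fixes the functional form, only two parameters have to be learned from six measurements, which is the sense in which prior knowledge can compensate for sparse data.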

**Fig. 3: Differentiable programming fuses principles-based and data-driven modeling.**

**Fig. 4: Protein structure prediction vignette.**

Acknowledgements

We thank N. Bouatta for comments on early versions of this manuscript. This work is supported by DARPA PANACEA program grant HR00111920022 and NCI/NIH grant U54-CA225088 to P.K.S.

Author information

Authors and Affiliations

Department of Systems Biology, Columbia University, New York, NY, USA
Mohammed AlQuraishi
Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Mohammed AlQuraishi & Peter K. Sorger

Authors

Mohammed AlQuraishi
Peter K. Sorger

Corresponding authors

Correspondence to Mohammed AlQuraishi or Peter K. Sorger.

Ethics declarations

Competing interests

P.K.S. is a member of the scientific advisory board or board of directors of Glencoe Software, Applied Biomath, RareCyte and NanoString and a consultant to Montai Health and Merck; he has equity in several of these companies. P.K.S. declares that none of these relationships are directly or indirectly related to the content of this manuscript. M.A.Q. is a member of the scientific advisory board of FL2021-002, a Foresite Labs company, and consults for Interline Therapeutics.

Additional information

Peer review information: Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

AlQuraishi, M., Sorger, P.K. Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms. Nat Methods 18, 1169–1180 (2021). https://doi.org/10.1038/s41592-021-01283-4

Received: 13 January 2021
Accepted: 27 August 2021
Published: 04 October 2021
Issue Date: October 2021
DOI: https://doi.org/10.1038/s41592-021-01283-4
