Article

Using deep learning to model the hierarchical structure and function of a cell

Nature Methods volume 15, pages 290–298 (2018)

Abstract

Although artificial neural networks are powerful classifiers, their internal structures are hard to interpret. In the life sciences, extensive knowledge of cell biology provides an opportunity to design visible neural networks (VNNs) that couple the model's inner workings to those of real systems. Here we develop DCell, a VNN embedded in the hierarchical structure of 2,526 subsystems comprising a eukaryotic cell (http://d-cell.ucsd.edu/). Trained on several million genotypes, DCell simulates cellular growth nearly as accurately as laboratory observations. During simulation, genotypes induce patterns of subsystem activities, enabling in silico investigations of the molecular mechanisms underlying genotype–phenotype associations. These mechanisms can be validated, and many are unexpected; some are governed by Boolean logic. Cumulatively, 80% of the importance for growth prediction is captured by 484 subsystems (21%), reflecting the emergence of a complex phenotype. DCell provides a foundation for decoding the genetics of disease, drug resistance and synthetic life.
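To make the idea of a visible neural network concrete, the following is a minimal sketch of a network whose wiring mirrors a subsystem hierarchy. It is not DCell's implementation: the four-node hierarchy, the gene names, the width of two neurons per subsystem, the tanh activation, the random untrained weights and the mean read-out are all assumptions chosen for illustration. What it shows is the structural principle from the abstract: each subsystem receives input only from its own genes and its child subsystems, so a genotype produces an inspectable state for every subsystem on the way to the final prediction.

```python
import math
import random

# Hypothetical toy hierarchy (not the real 2,526-subsystem ontology).
# Each subsystem lists the genes annotated to it and its child subsystems.
HIERARCHY = {
    "dna_repair":  {"genes": ["g1", "g2"], "children": []},
    "cell_wall":   {"genes": ["g3"],       "children": []},
    "stress_resp": {"genes": ["g4"],       "children": ["dna_repair", "cell_wall"]},
    "cell":        {"genes": [],           "children": ["stress_resp"]},
}
NEURONS_PER_SUBSYSTEM = 2  # assumed width of each subsystem's neuron bank


def init_weights(hierarchy, width, seed=0):
    """Create random (untrained) weights and zero biases for every subsystem."""
    rng = random.Random(seed)
    weights = {}
    for name, node in hierarchy.items():
        # Inputs are the subsystem's own genes plus all neurons of its children.
        n_inputs = len(node["genes"]) + width * len(node["children"])
        weights[name] = {
            "W": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(width)],
            "b": [0.0] * width,
        }
    return weights


def forward(genotype, hierarchy, weights, root="cell"):
    """Propagate a genotype (set of disrupted genes) up the hierarchy.

    Returns the state of every subsystem and a scalar read-out at the root.
    """
    states = {}

    def visit(name):
        if name in states:  # a subsystem with several parents is computed once
            return states[name]
        node = hierarchy[name]
        inputs = [1.0 if g in genotype else 0.0 for g in node["genes"]]
        for child in node["children"]:
            inputs.extend(visit(child))
        w = weights[name]
        states[name] = [
            math.tanh(sum(wi * x for wi, x in zip(row, inputs)) + b)
            for row, b in zip(w["W"], w["b"])
        ]
        return states[name]

    root_state = visit(root)
    # Assumed read-out: mean of the root subsystem's neurons.
    prediction = sum(root_state) / len(root_state)
    return states, prediction


weights = init_weights(HIERARCHY, NEURONS_PER_SUBSYSTEM)
states, prediction = forward({"g1", "g3"}, HIERARCHY, weights)
for name, state in states.items():
    print(name, [round(x, 3) for x in state])
print("read-out:", round(prediction, 3))
```

Because every intermediate state is keyed by a named subsystem, one can ask which subsystems change when a given pair of genes is disrupted, which is the kind of in silico inspection the abstract describes. In this sketch the weights are random, so the numbers carry no biological meaning; a real model would learn them from genotype–phenotype data.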

Acknowledgements

We gratefully acknowledge support for this work provided by grants from the National Institutes of Health to T.I. (TR002026, GM103504, CA209891, ES014811). We also wish to thank T. Sejnowski and M. Kramer for very helpful comments during development of this work.

Author information

Author notes

    Jianzhu Ma, Michael Ku Yu & Samson Fong

    These authors contributed equally to this work.

Affiliations

  1. Department of Medicine, University of California San Diego, La Jolla, California, USA.

    Jianzhu Ma, Michael Ku Yu, Samson Fong, Keiichiro Ono, Eric Sage, Barry Demchak & Trey Ideker

  2. Program in Bioinformatics, University of California San Diego, La Jolla, California, USA.

    Michael Ku Yu & Trey Ideker

  3. Department of Bioengineering, University of California San Diego, La Jolla, California, USA.

    Samson Fong & Trey Ideker

  4. Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.

    Roded Sharan

Contributions

J.M., M.K.Y., S.F., R.S. and T.I. designed the study and developed the conceptual ideas. J.M. implemented the main algorithm. M.K.Y. collected all the input sources. J.M. and S.F. implemented all other computational methods and conducted analysis. J.M., M.K.Y., S.F. and T.I. wrote the manuscript with suggestions from the other authors. J.M., M.K.Y., S.F., K.O., E.S. and B.D. designed and developed the server.

Competing interests

T.I. is co-founder of Data4Cure, Inc. and has an equity interest. T.I. has an equity interest in Ideaya BioSciences, Inc. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.

Corresponding author

Correspondence to Trey Ideker.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–3

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Table 1: RLIPP scores for subsystems in the Gene Ontology and CliXO

  2. Supplementary Table 2: Boolean logic approximating the states of subsystems in the Gene Ontology and CliXO

About this article

DOI

https://doi.org/10.1038/nmeth.4627