Perspective

Learning biological networks: from modules to dynamics

Nature Chemical Biology volume 4, pages 658–664 (2008)

Abstract

Learning regulatory networks from genomics data is an important problem with applications spanning all of biology and biomedicine. Functional genomics projects offer a cost-effective means of greatly expanding the completeness of our regulatory models, and for some prokaryotic organisms they offer a means of learning accurate models that incorporate the majority of the genome. There are, however, several reasons to believe that regulatory network inference is beyond our current reach, such as (i) the combinatorics of the problem, (ii) factors we can't (or don't often) collect genome-wide measurements for and (iii) dynamics that elude cost-effective experimental designs. Recent works have demonstrated the ability to reconstruct large fractions of prokaryotic regulatory networks from compendiums of genomics data; they have also demonstrated that these global regulatory models can be used to predict the dynamics of the transcriptome. We review an overall strategy for the reconstruction of global networks based on these results in microbial systems.

Acknowledgements

We thank E. Vanden-Eijnden, D. Reiss, A. Madar, N. Baliga, B. Church and P. Waltman. We thank D. Shasha and the anonymous reviewers for detailed and insightful comments. R.B. is supported by the US National Science Foundation (DBI-0820757), the US Department of Energy GTL program and the US Department of Defense Computing and Society program.

Author information

Affiliations

  1. Richard Bonneau is in the Biology and Courant Computer Science Department, New York University, 100 Washington Square East, 1009 Silver Center, New York, New York 10003-6688, USA. bonneau@nyu.edu

About this article

DOI

https://doi.org/10.1038/nchembio.122
