Key Points
- Software is crucial for biological research, has an impact on research productivity and enables researchers to explore massive databases and knowledge-bases.
- The workflow in systems biology generally consists of iterative cycles of: experiments; data acquisition and analysis; modelling; and computational analysis. Each of these processes is supported by software tools.
- Standardization and interoperability are crucial for the efficient use of software tools and data resources.
- A software platform enables various software tools, data resources and knowledge sources to be accessible in a consistent manner. This dramatically improves the productivity of research and reduces potential errors in the workflow.
- In the future, software platforms and data or knowledge resources need to be supported through community-wide efforts. However, this requires a broader understanding of social dynamics, psychology and the economics of research activities; additionally, platforms need to be supported by user-friendly software tools.
Abstract
Understanding complex biological systems requires extensive support from software tools. Such tools are needed at each step of a systems biology computational workflow, which typically consists of data handling, network inference, deep curation, dynamical simulation and model analysis. In addition, there are now efforts to develop integrated software platforms, so that tools that are used at different stages of the workflow and by different researchers can easily be used together. This Review describes the types of software tools that are required at different stages of systems biology research and the current options that are available for systems biology researchers. We also discuss the challenges and prospects for modelling the effects of genetic changes on physiology and the concept of an integrated platform.
Acknowledgements
This work is, in part, supported by funding from the HD-Physiology Project of the Japan Society for the Promotion of Science (JSPS) to the Okinawa Institute of Science and Technology (OIST). Additional support is from a Canon Foundation Grant, the International Strategic Collaborative Research Program (BBSRC-JST) of the Japan Science and Technology Agency (JST), the Exploratory Research for Advanced Technology (ERATO) programme of JST to the Systems Biology Institute (SBI) and from a strategic cooperation partnership between the Luxembourg Centre for Systems Biomedicine and SBI.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary information S1 (table)
A comprehensive table of software tools and resources, including further information and Weblinks (XLS 52 kb)
Glossary
- Mutual information: A dimensionless quantity that measures the extent to which one random variable is informative about another variable. Zero mutual information between two random variables means that they are independent.
- Meta-database: A database for storing metadata, which was originally defined as 'data about data', such as tags and keywords. The database is used for integrating independent distributed databases.
- Ordinary differential equations (ODEs): A type of differential equation involving functions of one independent variable, such as time, and derivatives of the functions with respect to the variable.
- Partial differential equations: A type of differential equation involving functions of several independent variables, such as time and spatial axes (that is, x, y and z), and partial derivatives of the functions with respect to those variables.
- Agent-based modelling: A class of computational models that simulate the interaction of agents to study their effects on a system. Agents are autonomous, decision-making entities that have heterogeneous characteristics; examples of agents are molecules or cells.
- Process algebra: A mathematical modelling language for describing the behaviour of distributed systems.
- Rule-based modelling: When used in biochemical science, this term refers to a way to model molecules and proteins as objects that interact with each other. The interactions are described as rules that define how the objects transform their attributes and the relationships between the objects.
- Phase-space analysis: A way to analyse the dynamics of a system in a space (the phase-space), in which each of the possible states of the system is represented as a unique point.
- Bifurcation analysis: A way to analyse the qualitative changes in the dynamics of a system that are caused by varying one or several parameter values continuously.
- Homeodynamics: A concept that views an organism as a dynamical system; this concept emerged after the concept of homeostasis. Biological systems can be considered as homeodynamic: they can lose stability and show diverse behaviours, such as bi-stability, periodicity and chaotic dynamics.
- Constraint-based reconstruction and analysis (COBRA): A suite of methods to simulate, analyse and predict various phenotypes using genome-scale models. These methods are used particularly for metabolic networks.
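As an illustration of the mutual information entry in the glossary, the following minimal Python sketch estimates mutual information from paired discrete observations. It is not taken from the article or from any of the tools it reviews: the function name `mutual_information`, the toy data sets and the choice of a base-2 logarithm (which reports the result in bits) are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of mutual information (base-2 log) from (x, y) observations.

    Hypothetical helper for illustration only; it uses empirical frequencies
    and applies no bias correction.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Each term compares the joint probability with the product of marginals.
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Toy data (assumed): every combination equally frequent, so x tells nothing about y.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Toy data (assumed): y always equals x, so x fully determines y.
identical = [(0, 0), (1, 1), (0, 0), (1, 1)]

print(mutual_information(independent))
print(mutual_information(identical))
```

In the first data set the joint frequencies equal the product of the marginal frequencies, so every term in the sum vanishes, which matches the glossary statement that zero mutual information corresponds to independence. Estimates from real expression data would generally need binning of continuous values and larger samples, which this sketch does not address.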
Cite this article
Ghosh, S., Matsuoka, Y., Asai, Y. et al. Software for systems biology: from tools to integrated platforms. Nat Rev Genet 12, 821–832 (2011). https://doi.org/10.1038/nrg3096
DOI: https://doi.org/10.1038/nrg3096