  • Opinion
Beyond standardization: dynamic software infrastructures for systems biology

Abstract

Progress in systems biology is seriously hindered by the slow production of suitable software infrastructures. Biologists need infrastructure that connects easily to work done in other laboratories, and standardization helps with this. However, the infrastructure must also accommodate the specifics of each biological system, and appropriate mechanisms to support such variation are currently lacking. We argue that a minimal computer language, and a software tool called a generator, can be used to quickly produce customized software infrastructures that 'systems biologists really want to have'.

Figure 1: Software infrastructure for systems biology.
Figure 2: Cost effectiveness of development strategies.

Acknowledgements

The authors would like to thank the reviewers for their valuable suggestions, and R. W. Williams, F. C. P. Holstege, J. P. Nap, E. O. de Brock, and R. Breitling for their comments on an earlier version of this article. Furthermore, we would like to thank R. A. Scheltema, B. M. Tesson and D. I. Matthijssen for assistance with the development of the Showcase. This work was supported by grants from The Netherlands Organization for Scientific Research, Program Earth and Life Sciences and The Netherlands Ministry of Economic Affairs, Programs Biopartner and Biorange.

Author information

Corresponding author

Correspondence to Ritsert C. Jansen.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

FURTHER INFORMATION

Apache Tomcat

BioArray Software Environment

Bioconductor

BioMOBY

Bio* toolkits

CaCORE

CCPN

Code Generation Network

Complex Trait Consortium

Cytoscape

Eclipse

EMBOSS

GeneNetwork

Generic Model Organism Database

Groningen Bioinformatics Centre

MIAME

Microarray Gene Expression Data Society

MOLGENIS

MySQL

Online Showcase

PHP

PISE

PostgreSQL

Software Engineering Institute

Taverna

The Apache Velocity Project

The R Project

UCSC Genome Browser

W3C Semantic Web

W3C Web Services Activity

Glossary

Analysis workflow

The transformation of raw data into biological evidence by applying algorithms, tools and services in a certain order.

Code generator

A code generator translates a domain-specific language into a general-purpose language (such as Java), which is then compiled (for example, by the Java compiler) into a separate program for execution later.

Design pattern

A general, repeatable solution to a commonly occurring problem. It is a description or template for how to solve the problem.

Domain-specific language

A minimal language to describe features for a certain domain in a compact and easy way.

Genetical genomics

A strategy to map genetic determinants that underlie variations in transcript, protein or metabolite abundance that are observed in genetically different individuals.

Interpreter

An interpreter reads a domain-specific language and executes it directly, without first producing a separate program.

Module

A unit of functionality that has a clear interface so it can be easily assembled and (re)used interchangeably. Good modules hide implementation details so that a change inside one module does not require changes in other modules.

Software architecture

Software components, the externally visible properties of those components and the relationships among them.
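
As an illustration of the glossary terms 'domain-specific language', 'code generator' and 'interpreter', the following minimal Python sketch may help. It is not the authors' language or generator: the one-line-per-entity syntax, the entity names and the functions `parse`, `generate` and `interpret` are all hypothetical and chosen only for this example. The generator route emits source text that is executed afterwards, whereas the interpreter route acts on the model directly.

```python
# Hypothetical mini-DSL (an assumption for illustration, not the authors'
# language): one entity per line, written as "Name: field type, field type".
MODEL = """
Sample: name str, strain str
Measurement: probe str, value float
"""

def parse(text):
    """Turn DSL text into {entity_name: [(field_name, type_name), ...]}."""
    model = {}
    for line in text.strip().splitlines():
        name, _, body = line.partition(":")
        model[name.strip()] = [tuple(part.split()) for part in body.split(",")]
    return model

def generate(model):
    """Code generator: emit Python source as text, to be executed later."""
    lines = []
    for name, fields in model.items():
        args = ", ".join(fname for fname, _ in fields)
        lines.append(f"class {name}:")
        lines.append(f"    def __init__(self, {args}):")
        for fname, ftype in fields:
            # Each attribute is converted to the type named in the DSL.
            lines.append(f"        self.{fname} = {ftype}({fname})")
        lines.append("")
    return "\n".join(lines)

def interpret(model, entity, record):
    """Interpreter: apply the model to a record directly, generating no code."""
    casts = {"str": str, "float": float}
    return {fname: casts[ftype](record[fname]) for fname, ftype in model[entity]}

model = parse(MODEL)

# Route 1: generate a separate program, then run it.
source = generate(model)
namespace = {}
exec(source, namespace)
measurement = namespace["Measurement"](probe="p1", value="2.5")

# Route 2: interpret the same model without generating any source.
row = interpret(model, "Measurement", {"probe": "p1", "value": "2.5"})
```

In this sketch, adding a field to an entity means editing one line of the model text. Both routes pick up the change without any hand-written class being touched, which is the kind of variation support the abstract argues for.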

About this article

Cite this article

Swertz, M., Jansen, R. Beyond standardization: dynamic software infrastructures for systems biology. Nat Rev Genet 8, 235–243 (2007). https://doi.org/10.1038/nrg2048

  • DOI: https://doi.org/10.1038/nrg2048
