  • Review Article

The model organism as a system: integrating 'omics' data sets

Key Points

  • Many 'omics' data sets are becoming available for various model organisms; together they can describe many aspects of the cell at a given time and/or under a given condition. They can be broadly classified as components data, which describe the specific molecular contents of the cell; interactions data, which detail the connectivity between cellular components; or functional-states data, which reveal the overall behaviour, or phenotype, of the cell or system in response to genetic and/or environmental perturbations.

  • Even though each of these genome-scale data types can be powerful on its own, researchers are gaining valuable additional insights into cellular phenomena through the integration of 'omics' data sets.

  • The computational tools that have been developed for integrating 'omics' data generally tackle three specific tasks: first, identifying the network scaffold by delineating the connections that exist between cellular components; second, decomposing the network scaffold into its constituent parts in an attempt to understand the overall network structure; and third, developing cellular or system models to simulate and predict the network behaviour that gives rise to particular cellular phenotypes.

  • In addition to the development of methods, many researchers are using 'omics' integration to drive studies that are aimed at delineating systems-wide behaviour. For example, many efforts have been devoted to using genome-scale data integration to completely map the cellular pathways that are responsible for the observed cellular responses to environmental perturbations or developmental events. In some cases, these studies have also led to the development of biomarkers, or patterns of cellular-component expression that are associated with medical disorders, such as various cancers.

  • Researchers are also using omics integration to address fundamental evolutionary questions that were previously beyond the scope and scale of standard techniques. Specifically, omics data-integration techniques have been used to examine cellular differences that are associated with speciation, and other studies have used them to study selective pressures that are likely to have arisen due to cellular-network structure.

  • The integration of omics data has primarily affected basic research efforts so far. Increasingly, however, this strategy is taking on significant roles in clinically relevant areas, as shown by its stimulation of the fields of toxicogenomics and nutrigenomics, which apply genome-scale technologies and integrative analyses to problems in toxicology and nutrition, respectively. Even though many challenges related to data quality and accessibility remain, researchers continue to work towards the ultimate goals of applying these strategies to drug development and personalized medicine.
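
The second task named in the key points, decomposing a network scaffold into its constituent parts, can be illustrated with network-motif enrichment. The following is a minimal sketch rather than the method of any particular study: the toy edge list, the function names and the simple null model (random networks with the same nodes and edge count) are all assumptions made for illustration. Published motif analyses commonly use degree-preserving randomization instead.

```python
import random
from itertools import permutations

# Hypothetical toy scaffold: directed regulator -> target edges.
# A -> B, B -> C and A -> C together form one feed-forward loop.
regulatory_edges = {
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
}


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    nodes = {node for edge in edges for node in edge}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edges and (y, z) in edges and (x, z) in edges
    )


def random_network(edges, rng):
    """Random directed network on the same nodes with the same number of
    edges and no self-loops (a simplified null model, assumed here)."""
    nodes = sorted({node for edge in edges for node in edge})
    candidates = [(a, b) for a in nodes for b in nodes if a != b]
    return set(rng.sample(candidates, len(edges)))


def motif_enrichment(edges, n_random=1000, seed=0):
    """Return (observed count, mean count in random networks, empirical p-value)."""
    rng = random.Random(seed)
    observed = count_feed_forward_loops(edges)
    null_counts = [
        count_feed_forward_loops(random_network(edges, rng))
        for _ in range(n_random)
    ]
    # Fraction of random networks containing at least as many motifs.
    p_value = sum(count >= observed for count in null_counts) / n_random
    return observed, sum(null_counts) / n_random, p_value


if __name__ == "__main__":
    observed, mean_null, p_value = motif_enrichment(regulatory_edges)
    print(f"observed={observed} mean_random={mean_null:.2f} p={p_value:.3f}")
```

A motif is called enriched when the observed count is high relative to the random networks; on a five-node toy network the result carries no biological meaning and only shows the mechanics.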

Abstract

Various technologies can be used to produce genome-scale, or 'omics', data sets that provide systems-level measurements for virtually all types of cellular components in a model organism. These data yield unprecedented views of the cellular inner workings. However, this abundance of information also presents many hurdles, the main one being the extraction of discernible biological meaning from multiple omics data sets. Nevertheless, researchers are rising to the challenge by using omics data integration to address fundamental biological questions that would increase our understanding of systems as a whole.


Figure 1: 'Omics' data are providing comprehensive descriptions of nearly all components and interactions within the cell.
Figure 2: 'Omics' data-integration approaches for identifying, decomposing and modelling cellular networks.
Figure 3: Network-motif enrichment: an example of network decomposition.
Figure 4: 'Omics'-data integration helps to address interesting biological questions on the systems level.

References

  1. Fleischmann, R. D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512 (1995).

    CAS  PubMed  Google Scholar 

  2. Ehrenman, G. Mining what others miss: highlighting the subtleties in 1012 bytes of data, technology tries to clear up its own complex mess. Mechanical Engineering-CIME 127, 26 (2005).

    Article  Google Scholar 

  3. Hays, C. L. What Wal-Mart Knows About Customers' Habits. New York Times (14 Nov 2004).

    Google Scholar 

  4. Hand, D. J., Blunt, G., Kelly, M. G. & Adams, N. M. Data mining for fun and profit. Stat. Sci. 15, 111–131 (2000).

    Article  Google Scholar 

  5. Kluger, Y., Yu, H., Qian, J. & Gerstein, M. Relationship between gene co-expression and probe localization on microarray slides. BMC Genomics 4, 49 (2003).

    Article  PubMed  PubMed Central  Google Scholar 

  6. Quackenbush, J. Data standards for 'omic' science. Nature Biotechnol. 22, 613–614 (2004). A short, incisive report that introduces some of the problems that the omics sciences face with regards to data quality and representation standards.

    Article  CAS  Google Scholar 

  7. Bader, G. D. & Hogue, C. W. Analyzing yeast protein–protein interaction data obtained from different sources. Nature Biotechnol. 20, 991–997 (2002).

    Article  CAS  Google Scholar 

  8. Ge, H., Walhout, A. J. & Vidal, M. Integrating 'omic' information: a bridge between genomics and systems biology. Trends Genet. 19, 551–560 (2003).

    Article  CAS  PubMed  Google Scholar 

  9. Liolios, K., Tavernarakis, N., Hugenholtz, P. & Kyrpides, N. C. The genomes on line database (GOLD) v.2: a monitor of genome projects worldwide. Nucleic Acids Res. 34, D332–D334 (2006).

    Article  CAS  PubMed  Google Scholar 

  10. Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 241–254 (2003).

    Article  CAS  PubMed  Google Scholar 

  11. Chimpanzee Sequencing And Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69–87 (2005).

  12. Delsuc, F., Brinkmann, H. & Philippe, H. Phylogenomics and the reconstruction of the tree of life. Nature Rev. Genet. 6, 361–375 (2005).

    Article  CAS  PubMed  Google Scholar 

  13. Tompa, M. et al. Assessing computational tools for the discovery of transcription factor binding sites. Nature Biotechnol. 23, 137–144 (2005).

    Article  CAS  Google Scholar 

  14. Brasch, M. A., Hartley, J. L. & Vidal, M. ORFeome cloning and systems biology: standardized mass production of the parts from the parts-list. Genome Res. 14, 2001–2009 (2004).

    Article  CAS  PubMed  Google Scholar 

  15. Hardiman, G. Microarray platforms — comparisons and contrasts. Pharmacogenomics 5, 487–502 (2004).

    Article  CAS  PubMed  Google Scholar 

  16. Harbers, M. & Carninci, P. Tag-based approaches for transcriptome research and genome annotation. Nature Methods 2, 495–502 (2005).

    Article  CAS  PubMed  Google Scholar 

  17. Li, L. & Akashi, K. Unraveling the molecular components and genetic blueprints of stem cells. Biotechniques 35, 1233–1239 (2003).

    Article  CAS  PubMed  Google Scholar 

  18. Rhodes, D. R. & Chinnaiyan, A. M. Integrative analysis of the cancer transcriptome. Nature Genet. 37, S31–S37 (2005).

    Article  CAS  PubMed  Google Scholar 

  19. Jenner, R. G. & Young, R. A. Insights into host responses against pathogens from transcriptional profiling. Nature Rev. Microbiol. 3, 281–294 (2005).

    Article  CAS  Google Scholar 

  20. Mata, J., Marguerat, S. & Bahler, J. Post-transcriptional control of gene expression: a genome-wide perspective. Trends Biochem. Sci. 30, 506–514 (2005).

    Article  CAS  PubMed  Google Scholar 

  21. Patterson, S. D. & Aebersold, R. H. Proteomics: the first decade and beyond. Nature Genet. 33 (Suppl.), 311–323 (2003).

    Article  CAS  PubMed  Google Scholar 

  22. Ghaemmaghami, S. et al. Global analysis of protein expression in yeast. Nature 425, 737–741 (2003).

    Article  CAS  PubMed  Google Scholar 

  23. Yates, J. R. 3rd, Gilchrist, A., Howell, K. E. & Bergeron, J. J. Proteomics of organelles and large cellular structures. Nature Rev. Mol. Cell Biol. 6, 702–714 (2005).

    Article  CAS  Google Scholar 

  24. Kuster, B., Schirle, M., Mallick, P. & Aebersold, R. Scoring proteomes with proteotypic peptide probes. Nature Rev. Mol. Cell Biol. 6, 577–583 (2005).

    Article  CAS  Google Scholar 

  25. Griffin, J. L. & Bollard, M. E. Metabonomics: its potential as a tool in toxicology for safety assessment and data integration. Curr. Drug Metab. 5, 389–398 (2004).

    Article  CAS  PubMed  Google Scholar 

  26. Nielsen, J. & Oliver, S. The next wave in metabolome analysis. Trends Biotechnol. 23, 544–546 (2005).

    Article  CAS  PubMed  Google Scholar 

  27. Dunn, W. B., Bailey, N. J. & Johnson, H. E. Measuring the metabolome: current analytical technologies. Analyst 130, 606–625 (2005).

    Article  CAS  PubMed  Google Scholar 

  28. Fridman, E. & Pichersky, E. Metabolomics, genomics, proteomics, and the identification of enzymes and their substrates and products. Curr. Opin. Plant Biol. 8, 242–248 (2005).

    Article  CAS  PubMed  Google Scholar 

  29. Markuszewski, M. J., Szczykowska, M., Siluk, D. & Kaliszan, R. Human red blood cells targeted metabolome analysis of glycolysis cycle metabolites by capillary electrophoresis using an indirect photometric detection method. J. Pharm. Biomed. Anal. 39, 636–642 (2005).

    Article  CAS  PubMed  Google Scholar 

  30. Wu, L. et al. Quantitative analysis of the microbial metabolome by isotope dilution mass spectrometry using uniformly 13C-labeled cell extracts as internal standards. Anal. Biochem. 336, 164–171 (2005).

    Article  CAS  PubMed  Google Scholar 

  31. Memelink, J. Tailoring the plant metabolome without a loose stitch. Trends Plant Sci. 10, 305–307 (2005).

    Article  CAS  PubMed  Google Scholar 

  32. Robertson, D. G. Metabonomics in toxicology: a review. Toxicol. Sci. 85, 809–822 (2005).

    Article  CAS  PubMed  Google Scholar 

  33. Gibney, M. J. et al. Metabolomics in human nutrition: opportunities and challenges. Am. J. Clin. Nutr. 82, 497–503 (2005).

    Article  CAS  PubMed  Google Scholar 

  34. Arita, M., Robert, M. & Tomita, M. All systems go: launching cell simulation fueled by integrated experimental biology data. Curr. Opin. Biotechnol. 16, 344–349 (2005).

    Article  CAS  PubMed  Google Scholar 

  35. Huh, W. K. et al. Global analysis of protein localization in budding yeast. Nature 425, 686–691 (2003).

    Article  CAS  PubMed  Google Scholar 

  36. Dupuy, D. et al. A first version of the Caenorhabditis elegans promoterome. Genome Res. 14, 2169–2175 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Guda, C. & Subramaniam, S. pTARGET: a new method for predicting protein subcellular localization in eukaryotes. Bioinformatics 21, 3963–3969 (2005).

    Article  CAS  PubMed  Google Scholar 

  38. Coulton, G. Are histochemistry and cytochemistry 'Omics'? J. Mol. Histol. 35, 603–613 (2004).

    PubMed  Google Scholar 

  39. Wenk, M. R. The emerging field of lipidomics. Nature Rev. Drug Discov. 4, 594–610 (2005).

    Article  CAS  Google Scholar 

  40. Shriver, Z., Raguram, S. & Sasisekharan, R. Glycomics: a pathway to a class of new and improved therapeutics. Nature Rev. Drug Discov. 3, 863–873 (2004).

    Article  CAS  Google Scholar 

  41. Levine, M. & Davidson, E. H. Gene regulatory networks for development. Proc. Natl Acad. Sci. USA 102, 4936–4942 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Mockler, T. C. et al. Applications of DNA tiling arrays for whole-genome analysis. Genomics 85, 1–15 (2005).

    Article  CAS  PubMed  Google Scholar 

  43. Buck, M. J. & Lieb, J. D. ChIP–chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83, 349–360 (2004).

    Article  CAS  PubMed  Google Scholar 

  44. Herring, C. D. et al. Immobilization of Escherichia coli RNA polymerase and location of binding sites by use of chromatin immunoprecipitation and microarrays. J. Bacteriol. 187, 6166–6174 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Pokholok, D. K., Hannett, N. M. & Young, R. A. Exchange of RNA polymerase II initiation and elongation factors during gene expression in vivo. Mol. Cell 9, 799–809 (2002).

    Article  CAS  PubMed  Google Scholar 

  46. Kim, T. H. et al. A high-resolution map of active promoters in the human genome. Nature 436, 876–880 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Li, Z. et al. A global transcriptional regulatory role for c-Myc in Burkitt's lymphoma cells. Proc. Natl Acad. Sci. USA 100, 8164–8169 (2003).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  49. Martone, R. et al. Distribution of NF-κB-binding sites across human chromosome 22. Proc. Natl Acad. Sci. USA 100, 12247–12252 (2003).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Cawley, S. et al. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116, 499–509 (2004).

    Article  CAS  PubMed  Google Scholar 

  51. Zhang, X. et al. Genome-wide analysis of cAMP-response element binding protein occupancy, phosphorylation, and target gene activation in human tissues. Proc. Natl Acad. Sci. USA 102, 4459–4464 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Pokholok, D. K. et al. Genome-wide map of nucleosome acetylation and methylation in yeast. Cell 122, 517–527 (2005).

    Article  CAS  PubMed  Google Scholar 

  53. Cusick, M., Klitgord, N., Vidal, M. & Hill, D. E. Interactome: gateway into systems biology. Hum. Mol. Genet. 14, R171–R181 (2005).

    Article  CAS  PubMed  Google Scholar 

  54. Fields, S. High-throughput two-hybrid analysis. The promise and the peril. FEBS J. 272, 5391–5399 (2005).

    Article  CAS  PubMed  Google Scholar 

  55. Ben-Hur, A. & Noble, W. S. Kernel methods for predicting protein–protein interactions. Bioinformatics 21 (Suppl. 1), i38–i46 (2005).

    Article  CAS  PubMed  Google Scholar 

  56. Pazos, F., Ranea, J. A., Juan, D. & Sternberg, M. J. Assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome. J. Mol. Biol. 352, 1002–1015 (2005).

    Article  CAS  PubMed  Google Scholar 

  57. Droit, A., Poirier, G. G. & Hunter, J. M. Experimental and bioinformatic approaches for interrogating protein–protein interactions to determine protein function. J. Mol. Endocrinol. 34, 263–280 (2005).

    Article  CAS  PubMed  Google Scholar 

  58. Butland, G. et al. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 433, 531–537 (2005).

    Article  CAS  PubMed  Google Scholar 

  59. Rain, J. C. et al. The protein–protein interaction map of Helicobacter pylori. Nature 409, 211–215 (2001).

    Article  CAS  PubMed  Google Scholar 

  60. Lacount, D. J. et al. A protein interaction network of the malaria parasite Plasmodium falciparum. Nature 438, 103–107 (2005).

    Article  CAS  PubMed  Google Scholar 

  61. Ito, T. et al. Roles for the two-hybrid system in exploration of the yeast protein interactome. Mol. Cell Proteomics 1, 561–566 (2002).

    Article  CAS  PubMed  Google Scholar 

  62. Formstecher, E. et al. Protein interaction mapping: a Drosophila case study. Genome Res. 15, 376–384 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Li, S. et al. A map of the interactome network of the metazoan C. elegans. Science 303, 540–543 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  64. Stelzl, U. et al. A human protein–protein interaction network: a resource for annotating the proteome. Cell 122, 957–968 (2005).

    Article  CAS  PubMed  Google Scholar 

  65. Scholtens, D., Vidal, M. & Gentleman, R. Local modeling of global interactome networks. Bioinformatics 21, 3548–3557 (2005).

    Article  CAS  PubMed  Google Scholar 

  66. Hahn, M. W., Conant, G. C. & Wagner, A. Molecular evolution in large genetic networks: does connectivity equal constraint? J. Mol. Evol. 58, 203–211 (2004).

    Article  CAS  PubMed  Google Scholar 

  67. Sprinzak, E., Sattath, S. & Margalit, H. How reliable are experimental protein–protein interaction data? J. Mol. Biol. 327, 919–923 (2003).

    Article  CAS  PubMed  Google Scholar 

  68. Roehrl, M. H., Wang, J. Y. & Wagner, G. A general framework for development and data analysis of competitive high-throughput screens for small-molecule inhibitors of protein–protein interactions by fluorescence polarization. Biochemistry 43, 16056–16066 (2004).

    Article  CAS  PubMed  Google Scholar 

  69. Bochner, B. R. New technologies to assess genotype–phenotype relationships. Nature Rev. Genet. 4, 309–314 (2003).

    Article  CAS  PubMed  Google Scholar 

  70. Bredel, M. & Jacoby, E. Chemogenomics: an emerging strategy for rapid target and drug discovery. Nature Rev. Genet. 5, 262–275 (2004).

    Article  CAS  PubMed  Google Scholar 

  71. Dykxhoorn, D. M. & Lieberman, J. The silent revolution: RNA interference as basic biology, research tool, and therapeutic. Annu. Rev. Med. 56, 401–423 (2005).

    Article  CAS  PubMed  Google Scholar 

  72. Tong, A. H. et al. Global mapping of the yeast genetic interaction network. Science 303, 808–813 (2004).

    Article  CAS  PubMed  Google Scholar 

  73. Sauer, U. High-throughput phenomics: experimental methods for mapping fluxomes. Curr. Opin. Biotechnol. 15, 58–63 (2004).

    Article  CAS  PubMed  Google Scholar 

  74. Li, H. & Wang, W. Dissecting the transcription networks of a cell using computational genomics. Curr. Opin. Genet. Dev. 13, 611–616 (2003).

    Article  CAS  PubMed  Google Scholar 

  75. Wang, W. et al. Inference of combinatorial regulation in yeast transcriptional networks: a case study of sporulation. Proc. Natl Acad. Sci. USA 102, 1998–2003 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  76. Bar-Joseph, Z. et al. Computational discovery of gene modules and regulatory networks. Nature Biotechnol. 21, 1337–1342 (2003). Introduces the GRAM algorithm that can be used to identify gene modules or groups of co-expressed genes that share a common transcriptional regulator. This approach is useful for inferring transcriptional-regulatory networks from omics data sets.

    Article  CAS  Google Scholar 

  77. Gat-Viks, I., Tanay, A. & Shamir, R. Modeling and analysis of heterogeneous regulation in biological networks. J. Comput. Biol. 11, 1034–1049 (2004).

    Article  CAS  PubMed  Google Scholar 

  78. Yeang, C. H. et al. Validation and refinement of gene-regulatory pathways on a network of physical interactions. Genome Biol. 6, R62 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  79. Jansen, R. et al. A Bayesian networks approach for predicting protein–protein interactions from genomic data. Science 302, 449–453 (2003).

    Article  CAS  PubMed  Google Scholar 

  80. Rhodes, D. R. et al. Probabilistic model of the human protein–protein interaction network. Nature Biotechnol. 23, 951–959 (2005). This study illustrates the use of a Bayesian classification strategy to predict the structure of molecular networks — orthologous protein–protein interactions, transcriptomics and genomics data were integrated to develop a Bayesian model that predicts 40,000 human protein–protein interactions.

    Article  CAS  Google Scholar 

  81. Yeger-Lotem, E. et al. Network motifs in integrated cellular networks of transcription-regulation and protein–protein interaction. Proc. Natl Acad. Sci. USA 101, 5934–5939 (2004). This work presents a methodology to decompose cellular networks into their constituent basic building blocks, or network motifs. Although the technique can be applied to networks of any type, this study focuses on the analysis of a S. cerevisiae network derived from genome-scale protein–protein- and protein–DNA-interaction data sets.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  82. Yeger-Lotem, E. & Margalit, H. Detection of regulatory circuits by integrating the cellular networks of protein–protein interactions and transcription regulation. Nucleic Acids Res. 31, 6053–6061 (2003).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  83. Zhang, L. V. et al. Motifs, themes and thematic maps of an integrated Saccharomyces cerevisiae interaction network. J. Biol. 4, 6 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  84. Luscombe, N. M. et al. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431, 308–312 (2004).

    Article  CAS  PubMed  Google Scholar 

  85. Han, J. D. et al. Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature 430, 88–93 (2004).

    Article  CAS  PubMed  Google Scholar 

  86. Tanay, A., Sharan, R., Kupiec, M. & Shamir, R. Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data. Proc. Natl Acad. Sci. USA 101, 2981–2986 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  87. Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18 (Suppl. 1), S233–S240 (2002).

    Article  PubMed  Google Scholar 

  88. Wong, S. L. et al. Combining biological networks to predict genetic interactions. Proc. Natl Acad. Sci. USA 101, 15682–15687 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  89. Kelley, R. & Ideker, T. Systematic interpretation of genetic interactions using protein networks. Nature Biotechnol. 23, 561–566 (2005).

    Article  CAS  Google Scholar 

  90. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. & Hattori, M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, D277–D280 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Price, N. D., Reed, J. L. & Palsson, B. O. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nature Rev. Microbiol. 2, 886–897 (2004). This review discusses the COBRA approach to modelling genome-scale molecular networks by integrating genome-scale data sets with a specific emphasis on the many recent analytical methods that are associated with these models for studying characteristics and capabilities of microorganisms.

    Article  CAS  Google Scholar 

  92. Reed, J. L., Famili, I., Thiele, I. & Palsson, B. O. Towards multidimensional genome annotation. Nature Rev. Genet. 7, 130–141 (2006).

    Article  CAS  PubMed  Google Scholar 

  93. Palsson, B. Two-dimensional annotation of genomes. Nature Biotechnol. 22, 1218–1219 (2004).

    Article  CAS  Google Scholar 

  94. Patil, K. R., Akesson, M. & Nielsen, J. Use of genome-scale microbial models for metabolic engineering. Curr. Opin. Biotechnol. 15, 64–69 (2004).

    Article  CAS  PubMed  Google Scholar 

  95. Patil, K. R. & Nielsen, J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl Acad. Sci. USA 102, 2685–2689 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  96. Covert, M. W., Knight, E. M., Reed, J. L., Herrgard, M. J. & Palsson, B. O. Integrating high-throughput and computational data elucidates bacterial networks. Nature 429, 92–96 (2004).

    Article  CAS  PubMed  Google Scholar 

  97. Papin, J. A. & Palsson, B. O. The JAK–STAT signaling network in the human B-cell: an extreme signaling pathway analysis. Biophys. J. 87, 37–46 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Longabaugh, W. J., Davidson, E. H. & Bolouri, H. Computational representation of developmental genetic regulatory networks. Dev. Biol. 283, 1–16 (2005). The reconstruction and modelling of developmental gene-regulatory networks is detailed by integrating various data types using the BioTapestry modelling software.

    Article  CAS  PubMed  Google Scholar 

  99. Saghatelian, A. & Cravatt, B. F. Global strategies to integrate the proteome and metabolome. Curr. Opin. Chem. Biol. 9, 62–68 (2005).

    Article  CAS  PubMed  Google Scholar 

  100. Begley, T. J., Rosenbach, A. S., Ideker, T. & Samson, L. D. Hot spots for modulating toxicity identified by genomic phenotyping and localization mapping. Mol. Cell 16, 117–125 (2004).

    Article  CAS  PubMed  Google Scholar 

  101. Lee, W. et al. Genome-wide requirements for resistance to functionally distinct DNA-damaging agents. PLoS Genet. 1, e24 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  102. Haugen, A. C. et al. Integrating phenotypic and expression profiles to map arsenic-response networks. Genome Biol. 5, R95 (2004).

    Article  PubMed  PubMed Central  Google Scholar 

  103. Kim, J. K. et al. Functional genomic analysis of RNA interference in C. elegans. Science 308, 1164–1167 (2005).

    Article  CAS  PubMed  Google Scholar 

  104. Tewari, M. et al. Systematic interactome mapping and genetic perturbation analysis of a C. elegans TGF-β signaling network. Mol. Cell 13, 469–482 (2004).

    Article  CAS  PubMed  Google Scholar 

  105. Boulton, S. J. et al. Combined functional genomic maps of the C. elegans DNA damage response. Science 295, 127–131 (2002).

    Article  CAS  PubMed  Google Scholar 

  106. Gunsalus, K. C. et al. Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature 436, 861–865 (2005). This study integrated transcriptomics, protein–protein interactions and RNAi-based phenomics to map the molecular network topology of genes associated with early embryogenesis in C. elegans . The resulting structure is used to infer potential network organizational and functional properties such as interacting molecular complexes and cellular-process crosstalk.

    Article  CAS  PubMed  Google Scholar 

  107. Oksman-Caldentey, K. M. & Saito, K. Integrating genomics and metabolomics for engineering plant metabolic pathways. Curr. Opin. Biotechnol. 16, 174–179 (2005).

    Article  CAS  PubMed  Google Scholar 

  108. Kristensen, C. et al. Metabolic engineering of dhurrin in transgenic Arabidopsis plants with marginal inadvertent effects on the metabolome and transcriptome. Proc. Natl Acad. Sci. USA 102, 1779–1784 (2005). This study used omics data integration to diagnose unexpected impacts of genomic manipulations on the phenotype of the organism. Metabolomic and transcriptomic data were integrated to assess the systems-wide impact of introducing exogenous high-flux pathways to A. thaliana.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Hirai, M. Y. et al. Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 101, 10205–10210 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  110. Ippolito, J. E. et al. An integrated functional genomics and metabolomics approach for defining poor prognosis in human neuroendocrine cancers. Proc. Natl Acad. Sci. USA 102, 9901–9906 (2005). The utility of integrating omics data to identify biomarkers is shown in this work, which integrated transcriptomics and metabolomics data to determine a molecular signature that is associated with poor-prognosis human neuroendocrine cancers.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  111. Yan, W. et al. System-based proteomic analysis of the interferon response in human liver cells. Genome Biol. 5, R54 (2004).

    Article  PubMed  PubMed Central  Google Scholar 

  112. Enard, W. et al. Intra- and interspecific variation in primate gene expression patterns. Science 296, 340–343 (2002).

    Article  CAS  PubMed  Google Scholar 

  113. Khaitovich, P. et al. Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 309, 1850–1854 (2005).

    Article  CAS  PubMed  Google Scholar 

  114. Khaitovich, P. et al. Regional patterns of gene expression in human and chimpanzee brains. Genome Res. 14, 1462–1473 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  115. Ihmels, J. et al. Rewiring of the yeast transcriptional network through the evolution of motif usage. Science 309, 938–940 (2005). Genomics and transcriptomics data are integrated to identify a cis -regulatory element associated with the evolutionary emergence of rapid anaerobic growth capacity in certain yeast species. This study highlights the potential of integrating omics data sets to address fundamental evolutionary questions.

    Article  CAS  PubMed  Google Scholar 

  116. Tanay, A., Regev, A. & Shamir, R. Conservation and evolvability in regulatory networks: the evolution of ribosomal regulation in yeast. Proc. Natl Acad. Sci. USA 102, 7203–7208 (2005).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Shields, R. MIAME, we have a problem. Trends Genet. 22, 65–66 (2006).

    Article  CAS  PubMed  Google Scholar 

  118. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003). One of the most widely used and broadly accessible software packages designed to facilitate omics data integration and analysis, known as Cytoscape, is detailed in this report.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  119. Hucka, M. et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531 (2003).

    Article  CAS  PubMed  Google Scholar 

  120. Novere, N. L. et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nature Biotechnol. 23, 1509–1515 (2005).

    Article  CAS  Google Scholar 

  121. Stierum, R., Heijne, W., Kienhuis, A., van Ommen, B. & Groten, J. Toxicogenomics concepts and applications to study hepatic effects of food additives and chemicals. Toxicol. Appl. Pharmacol. 207, 179–188 (2005).

    Article  CAS  PubMed  Google Scholar 

  122. Corthesy-Theulaz, I. et al. Nutrigenomics: the impact of biomics technology on nutrition research. Ann. Nutr. Metab. 49, 355–365 (2005).

    Article  CAS  PubMed  Google Scholar 

  123. Desiere, F. Towards a systems biology understanding of human health: interplay between genotype, environment and nutrition. Biotechnol. Annu. Rev. 10, 51–84 (2004).

    Article  CAS  PubMed  Google Scholar 

  124. Roche, H. M., Phillips, C. & Gibney, M. J. The metabolic syndrome: the crossroads of diet and genetics. Proc. Nutr. Soc. 64, 371–377 (2005).

    Article  CAS  PubMed  Google Scholar 

  125. Ibrahim, S. M. & Gold, R. Genomics, proteomics, metabolomics: what is in a word for multiple sclerosis? Curr. Opin. Neurol. 18, 231–235 (2005).

  126. Khalil, I. G. & Hill, C. Systems biology for cancer. Curr. Opin. Oncol. 17, 44–48 (2005).

  127. Nikolsky, Y., Nikolskaya, T. & Bugrim, A. Biological networks and analysis of experimental data in drug discovery. Drug Discov. Today 10, 653–662 (2005).

  128. Billings, P. R. et al. Ready for genomic medicine? Perspectives of health care decision makers. Arch. Intern. Med. 165, 1917–1919 (2005).

  129. Deeds, E. J., Ashenberg, O. & Shakhnovich, E. I. A simple physical model for scaling in protein–protein interaction networks. Proc. Natl Acad. Sci. USA 103, 311–316 (2006).

Ethics declarations

Competing interests

Bernhard Ø. Palsson serves on the scientific advisory board of Genomatica Inc.

Related links

FURTHER INFORMATION

Bernhard Palsson's laboratory

Glossary

Terabyte

A unit of computer-information-storage capacity that is equal to one trillion bytes or one thousand gigabytes.

Data mining

An analytical discipline that is focused on finding unsuspected relationships and summarizing often large observational data sets in new ways that are both understandable and useful to the data owner.

Omics data set

A generic term that describes the genome-scale data sets that are emerging from high-throughput technologies. Examples include whole-genome sequencing data (genomics) and microarray-based genome-wide expression profiles (transcriptomics).

Serial analysis of gene expression

(SAGE). An experimental technique for transcriptome analysis through the massive sequential analysis of short cDNA sequence tags. The cDNA tags are derived from cellular or tissue mRNA for which the corresponding genes can be identified, and the total count of cDNA tags for each gene represents an accurate measurement of its expression level.

Mass spectrometry

An analytical technique that identifies biochemical molecules (such as proteins, metabolites or fatty acids) on the basis of their mass and charge.

Vibrational spectroscopy

An analytical technique that can be used to investigate the composition of biological samples by the characteristic frequencies at which chemical bonds vibrate.

Metabolic engineering

An applied discipline that is devoted to the targeted improvement in cellular properties or metabolite production by experimental manipulation of specific metabolic or signal-transduction pathways.

In silico prediction

A general term that refers to a computational prediction that usually results from the analysis of a mathematical or computational model.

Histocytomics

A developing field that is scaling up the traditional techniques of histochemistry and cytochemistry, such that many cellular species can be identified and localized in a cell or tissue sample in a high-throughput manner.

Tiling array

A high-density microarray that contains evenly spaced, or 'tiled', sets of probes that span the genome or chromosome, and can be used in many experimental applications such as transcriptome characterization, gene discovery, alternative-splicing analysis, ChIP–chip, DNA-methylation analysis, DNA-polymorphism analysis, comparative genome analysis and genome resequencing.

ChIP–chip

A high-throughput experimental technique that combines chromatin immunoprecipitation (ChIP) and microarray technology (chip) that directly identifies protein–DNA interactions.

Power-law distribution

Networks whose node connectivity follows a power-law distribution, also known as scale-free networks, are non-uniform: most nodes have very few links, whereas a few so-called hub nodes have a very large number of links. Notably, many biological networks follow a power-law distribution, as does the internet.

Network scaffold

Refers to the structure of a network that specifies the components of the network and the interactions between them, and represents the end product of the network-reconstruction process.

Network module

A portion of a biological network that is composed of multiple molecular entities (such as genes, proteins or metabolites) that work together as a distinct unit within the cell, for example, in response to certain stimuli or as part of a developmental or differentiation programme.

Bayesian model

A probabilistic model that generally specifies the likelihood of an observation occurring, on the basis of the presence of various characteristics that are known or assumed to be associated with the observation according to prior information.
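
As a minimal sketch of the idea, the Python function below applies Bayes' rule to a single yes/no hypothesis (for example, whether two proteins interact). The function name and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from any study cited here.

```python
def posterior_probability(prior, p_obs_given_true, p_obs_given_false):
    """Bayes' rule for a binary hypothesis (illustrative sketch only).

    prior              -- probability of the hypothesis before seeing the data
    p_obs_given_true   -- likelihood of the observation if the hypothesis holds
    p_obs_given_false  -- likelihood of the observation if it does not hold
    """
    # Total probability of the observation under both alternatives.
    evidence = prior * p_obs_given_true + (1.0 - prior) * p_obs_given_false
    if evidence == 0.0:
        raise ValueError("observation has zero probability under both hypotheses")
    return prior * p_obs_given_true / evidence


# Hypothetical numbers: a rare event (prior 1%) supported by fairly specific evidence.
p = posterior_probability(prior=0.01, p_obs_given_true=0.9, p_obs_given_false=0.05)
print(round(p, 3))
```

Even with supportive evidence, the posterior in this example stays well below one half because the prior is small, which is why prior information matters in such models.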

Synthetic lethal

This term refers to the lethal or significantly impaired phenotype that results from combining mutations in two non-essential genes, each of which leaves the cell viable when mutated individually. Such an interaction possibly indicates that the two genes act within the same essential pathway or in parallel non-essential pathways.

Bipartite graph

A graph whose vertices can be partitioned into two disjoint sets such that no two vertices within the same set are adjacent. For example, one set can represent genes, and the other set can represent characteristics that describe the function(s) of those genes.
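
The definition can be checked mechanically. The sketch below, in Python, tests whether a graph is bipartite by trying to assign every vertex to one of two sets with a breadth-first search; the function name and the gene and function labels are hypothetical examples.

```python
from collections import deque


def is_bipartite(edges):
    """Return True if the undirected graph given by `edges` is bipartite."""
    # Build an adjacency map from the edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    side = {}  # vertex -> 0 or 1, the set it has been assigned to
    for start in adjacency:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in side:
                    # Place each neighbour in the opposite set.
                    side[neighbour] = 1 - side[node]
                    queue.append(neighbour)
                elif side[neighbour] == side[node]:
                    # Two adjacent vertices ended up in the same set.
                    return False
    return True


# Genes on one side, functional annotations on the other.
gene_function_edges = [("geneA", "kinase"), ("geneB", "kinase"), ("geneA", "transport")]
print(is_bipartite(gene_function_edges))
```

A triangle of three mutually connected vertices fails this test, because no split into two sets can keep every pair of adjacent vertices apart.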

Simulated annealing-based search algorithm

A global optimization technique that traverses a search space by testing random mutations on an individual solution, keeping all better solutions, and accepting worse solutions probabilistically on the basis of the difference in solutions and a decreasing temperature parameter.
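
A minimal Python sketch of this search strategy follows. The cooling schedule, the parameter defaults and the toy cost function are assumptions made for illustration, not the settings of any particular published method.

```python
import math
import random


def simulated_annealing(cost, neighbour, start, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Minimise `cost` by simulated annealing (illustrative sketch)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbour(current, rng)  # random mutation of the solution
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Keep every improvement; accept a worse solution with a probability
            # that shrinks as the cost gap grows and as the temperature falls.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling  # decreasing temperature parameter
    return best, best_cost


# Toy problem: find the x that minimises (x - 3)^2, starting far from the optimum.
best, best_cost = simulated_annealing(
    cost=lambda x: (x - 3.0) ** 2,
    neighbour=lambda x, rng: x + rng.uniform(-1.0, 1.0),
    start=10.0,
)
print(best, best_cost)
```

Accepting some worse solutions early on, while the temperature is high, is what lets the search escape local optima; as the temperature falls the procedure behaves more and more like a plain greedy search.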

Training set

A collection of data that has known characteristics and is used to develop a predictive model in data-mining and machine-learning applications (for example, in Bayesian-model approaches). The characteristics learned from the training set are used to make subsequent predictions about new data.

Log-odds scoring scheme

A statistical procedure that is designed to assess the significance of an observation by calculating a quantity that compares the observed frequency with the frequency that would be expected if the observation were random.
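
One common form of such a score is the base-2 logarithm of the ratio of observed to expected frequency. The Python sketch below assumes that form and assumes that "random" means two features occur independently; the function name and the counts are hypothetical.

```python
import math


def log_odds_score(joint_count, count_a, count_b, total):
    """Log2 ratio of the observed co-occurrence frequency of A and B
    to the frequency expected if A and B were independent."""
    if min(joint_count, count_a, count_b, total) <= 0:
        raise ValueError("all counts must be positive")
    observed = joint_count / total
    expected = (count_a / total) * (count_b / total)
    return math.log2(observed / expected)


# Hypothetical counts: A occurs in 50 of 100 samples, B in 20, both together in 20.
print(log_odds_score(joint_count=20, count_a=50, count_b=20, total=100))
```

A positive score means the observation occurs more often than chance would predict, a score near zero means it is consistent with chance, and a negative score means it is under-represented.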

Constraint-based reconstruction and analysis

(COBRA). A genome-scale modelling approach that involves: first, reconstructing biochemical reaction networks; then, applying constraints to the network; and finally, analysing the characteristics and capabilities of the network using various computational techniques.

Network reconstruction

The process of integrating different data sources to create a representation of the chemical events that underlie a biochemical reaction network.

Governing constraints

Biochemical networks and cellular systems are constrained by natural law. These governing constraints include physico-chemical constraints (such as enzyme turnover), topobiological constraints (such as cellular crowding), environmental constraints (such as nutrient availability) and regulatory constraints (such as gene repression in response to external signals).

Omics data integration

The simultaneous analysis of high-throughput genome-scale data that is aimed at developing models of biological systems to assess their properties and behaviour.

Biomarker

A distinctive biochemical indicator that is associated with a biological process or event (for example, the presence of a protein, or set of proteins, that is characteristic of cancerous cells).

Metabolic syndrome

An increasingly common, complex and multi-factorial disorder that is characterized by glucose intolerance, abdominal obesity, hypertension and abnormal cholesterol levels, and that increases an individual's risk of developing coronary heart disease and type 2 diabetes.

Personalized genomic medicine

The idea that genome-scale technologies will allow clinicians to apply treatment regimens that are tailored specifically to an individual patient on the basis of their genetic makeup and associated predispositions.

Gödel's incompleteness theorem

A prominent result from mathematical logic which states that, for any consistent formal theory in which basic arithmetical facts are provable, it is possible to construct an arithmetical statement that is true but neither provable nor refutable within the theory. Therefore, despite having all axioms available, certain truths may not be provable or readily apparent.

About this article

Cite this article

Joyce, A., Palsson, B. The model organism as a system: integrating 'omics' data sets. Nat Rev Mol Cell Biol 7, 198–210 (2006). https://doi.org/10.1038/nrm1857

  • DOI: https://doi.org/10.1038/nrm1857
