Letters to Nature

Nature 429, 92-96 (6 May 2004) | doi:10.1038/nature02456; Received 22 November 2003; Accepted 1 March 2004

Integrating high-throughput and computational data elucidates bacterial networks

Markus W. Covert1,2, Eric M. Knight1, Jennifer L. Reed1, Markus J. Herrgard1 & Bernhard O. Palsson1

  1. Bioengineering Department, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0412, USA
  2. Present address: Biology Division, California Institute of Technology, 1200 E. California Boulevard, Mail Code 147-75, Pasadena, California 91125, USA

Correspondence to: Bernhard O. Palsson1 (email: palsson@ucsd.edu)
The gene expression data are available online in GEO (http://www.ncbi.nlm.nih.gov/geo/), accession number GSE1121.

The flood of high-throughput biological data has led to the expectation that computational (or in silico) models can be used to direct biological discovery, enabling biologists to reconcile heterogeneous data types, find inconsistencies and systematically generate hypotheses (refs 1-3). Such a process is fundamentally iterative, where each iteration involves making model predictions, obtaining experimental data, reconciling the predicted outcomes with experimental ones, and using discrepancies to update the in silico model. Here we have reconstructed, on the basis of information derived from literature and databases, the first integrated genome-scale computational model of a transcriptional regulatory and metabolic network. The model accounts for 1,010 genes in Escherichia coli, including 104 regulatory genes whose products together with other stimuli regulate the expression of 479 of the 906 genes in the reconstructed metabolic network. This model is able not only to predict the outcomes of high-throughput growth phenotyping and gene expression experiments, but also to indicate knowledge gaps and identify previously unknown components and interactions in the regulatory and metabolic networks. We find that a systems biology approach that combines genome-scale experimentation and computation can systematically generate hypotheses on the basis of disparate data sources.
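
The reconciliation step of the iterative process described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the condition names, the growth/no-growth values and the helper names (Comparison, reconcile, discrepancies) are all hypothetical, chosen only to show how predicted and observed phenotypes might be paired and how disagreements could be collected as candidates for a model update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One predicted-versus-observed outcome for a single condition."""
    condition: str
    predicted: bool  # model predicts growth (True) or no growth (False)
    observed: bool   # experiment shows growth (True) or no growth (False)

    @property
    def agrees(self) -> bool:
        return self.predicted == self.observed


def reconcile(predictions: dict[str, bool],
              observations: dict[str, bool]) -> list[Comparison]:
    """Pair predictions with observations for conditions present in both."""
    shared = sorted(predictions.keys() & observations.keys())
    return [Comparison(c, predictions[c], observations[c]) for c in shared]


def discrepancies(comparisons: list[Comparison]) -> list[Comparison]:
    """Return the comparisons where model and experiment disagree."""
    return [c for c in comparisons if not c.agrees]


# Hypothetical toy data, invented for illustration (not taken from the paper).
predicted_growth = {
    "wild-type/glucose": True,
    "knockoutA/glucose": False,
    "knockoutB/acetate": True,
}
observed_growth = {
    "wild-type/glucose": True,
    "knockoutA/glucose": True,
    "knockoutB/acetate": True,
}

comparisons = reconcile(predicted_growth, observed_growth)
to_review = discrepancies(comparisons)
agreement = sum(c.agrees for c in comparisons) / len(comparisons)

print(f"agreement: {agreement:.2f}")
for c in to_review:
    print(f"review {c.condition}: predicted={c.predicted}, observed={c.observed}")
```

In this sketch each entry in to_review marks a condition where the model would need revisiting, for example by adding a missing regulatory rule or reaction, before the next round of predictions.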
