
Integrating high-throughput and computational data elucidates bacterial networks


The flood of high-throughput biological data has led to the expectation that computational (or in silico) models can be used to direct biological discovery, enabling biologists to reconcile heterogeneous data types, find inconsistencies and systematically generate hypotheses (refs 1–3). Such a process is fundamentally iterative, where each iteration involves making model predictions, obtaining experimental data, reconciling the predicted outcomes with experimental ones, and using discrepancies to update the in silico model. Here we have reconstructed, on the basis of information derived from literature and databases, the first integrated genome-scale computational model of a transcriptional regulatory and metabolic network. The model accounts for 1,010 genes in Escherichia coli, including 104 regulatory genes whose products together with other stimuli regulate the expression of 479 of the 906 genes in the reconstructed metabolic network. This model is able not only to predict the outcomes of high-throughput growth phenotyping and gene expression experiments, but also to indicate knowledge gaps and identify previously unknown components and interactions in the regulatory and metabolic networks. We find that a systems biology approach that combines genome-scale experimentation and computation can systematically generate hypotheses on the basis of disparate data sources.
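The predict, compare and update cycle described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' model or method: the gene names, Boolean rules, pathways and observations are all hypothetical, and growth is predicted by a simple "is any complete pathway expressed" check rather than by a constraint-based metabolic simulation.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy network (assumption: none of these names come from the paper).
# Each Boolean rule decides whether a gene is expressed in a given environment.
Rule = Callable[[Set[str]], bool]

RULES: Dict[str, Rule] = {
    "geneA": lambda env: "oxygen" in env,      # expressed only aerobically
    "geneB": lambda env: "oxygen" not in env,  # expressed only anaerobically
    "geneC": lambda env: True,                 # unregulated
}

# Assumption: growth is possible if every gene of at least one pathway is expressed.
PATHWAYS: List[Set[str]] = [{"geneA", "geneC"}, {"geneB", "geneC"}]

Observation = Tuple[FrozenSet[str], FrozenSet[str], bool]


def predict_growth(env: Set[str], knockouts: Set[str]) -> bool:
    """Apply the regulatory rules, drop knocked-out genes, then test the pathways."""
    expressed = {g for g, rule in RULES.items()
                 if g not in knockouts and rule(env)}
    return any(pathway <= expressed for pathway in PATHWAYS)


def find_discrepancies(observations: List[Observation]) -> List[Observation]:
    """Return the observations whose measured growth disagrees with the prediction."""
    return [(env, ko, grew) for env, ko, grew in observations
            if predict_growth(set(env), set(ko)) != grew]


# Hypothetical phenotyping data: (environment, knocked-out genes, observed growth).
OBSERVATIONS: List[Observation] = [
    (frozenset({"oxygen"}), frozenset(), True),
    (frozenset(), frozenset({"geneB"}), True),
    (frozenset({"oxygen"}), frozenset({"geneC"}), False),
]

if __name__ == "__main__":
    for env, ko, grew in find_discrepancies(OBSERVATIONS):
        print(f"mismatch: env={sorted(env)} knockouts={sorted(ko)} observed={grew}")
```

In this toy data set only the anaerobic geneB knockout disagrees with the prediction. In the iterative scheme, a mismatch of this kind is the signal to revisit a rule or add a missing interaction before the next round of predictions.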


Figure 1: Growth phenotype study.
Figure 2: Characterization of the regulatory network related to the aerobic–anaerobic shift.
Figure 3: Biological network elucidation by a model-centric approach.


  1. Ideker, T., Galitski, T. & Hood, L. A new approach to decoding life: systems biology. Annu. Rev. Genomics Hum. Genet. 2, 343–372 (2001)
  2. Palsson, B. O. The challenges of in silico biology. Nature Biotechnol. 18, 1147–1150 (2000)
  3. Kitano, H. Systems biology: a brief overview. Science 295, 1662–1664 (2002)
  4. Reed, J. L., Vo, T. D., Schilling, C. H. & Palsson, B. O. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 4, R54.1–R54.12 (2003)
  5. Bochner, B. R. New technologies to assess genotype–phenotype relationships. Nature Rev. Genet. 4, 309–314 (2003)
  6. Glasner, J. D. et al. ASAP, a systematic annotation package for community analysis of genomes. Nucleic Acids Res. 31, 147–151 (2003)
  7. Forster, J., Famili, I., Palsson, B. O. & Nielsen, J. Large-scale evaluation of in silico gene knockouts in Saccharomyces cerevisiae. Omics 7, 193–202 (2003)
  8. Edwards, J. S. & Palsson, B. O. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl Acad. Sci. USA 97, 5528–5533 (2000)
  9. Covert, M. W. & Palsson, B. O. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J. Biol. Chem. 277, 28058–28064 (2002)
  10. Herrgard, M. J., Covert, M. W. & Palsson, B. O. Reconciling gene expression data with known genome-scale regulatory network structures. Genome Res. 13, 2423–2434 (2003)
  11. Salmon, K. et al. Global gene expression profiling in Escherichia coli K12. The effects of oxygen availability and FNR. J. Biol. Chem. 278, 29837–29855 (2003)
  12. Ideker, T. et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929–934 (2001)
  13. Compan, I. & Touati, D. Anaerobic activation of arcA transcription in Escherichia coli: roles of Fnr and ArcA. Mol. Microbiol. 11, 955–964 (1994)
  14. Cotter, P. A., Melville, S. B., Albrecht, J. A. & Gunsalus, R. P. Aerobic regulation of cytochrome d oxidase (cydAB) operon expression in Escherichia coli: roles of Fnr and ArcA in repression and activation. Mol. Microbiol. 25, 605–615 (1997)
  15. Griffin, T. J. et al. Complementary profiling of gene expression at the transcriptome and proteome levels in Saccharomyces cerevisiae. Mol. Cell. Proteomics 1, 323–333 (2002)
  16. Reed, J. L. & Palsson, B. O. Thirteen years of building constraint-based in silico models of Escherichia coli. J. Bacteriol. 185, 2692–2699 (2003)
  17. Edwards, J. S., Ibarra, R. U. & Palsson, B. O. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nature Biotechnol. 19, 125–130 (2001)
  18. Price, N. D., Papin, J. A., Schilling, C. H. & Palsson, B. O. Genome-scale microbial in silico models: the constraints-based approach. Trends Biotechnol. 21, 162–169 (2003)
  19. Segre, D., Vitkup, D. & Church, G. M. Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl Acad. Sci. USA 99, 15112–15117 (2002)
  20. Burgard, A. P. & Maranas, C. D. Probing the performance limits of the Escherichia coli metabolic network subject to gene additions or deletions. Biotechnol. Bioeng. 74, 364–375 (2001)
  21. Burgard, A. P., Vaidyaraman, S. & Maranas, C. D. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog. 17, 791–797 (2001)
  22. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. The large-scale organization of metabolic networks. Nature 407, 651–654 (2000)
  23. Shen-Orr, S. S., Milo, R., Mangan, S. & Alon, U. Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet. 31, 64–68 (2002)
  24. Gutierrez-Rios, R. M. et al. Regulatory network of Escherichia coli: consistency between literature knowledge and microarray profiles. Genome Res. 13, 2435–2443 (2003)
  25. Bar-Joseph, Z. et al. Computational discovery of gene modules and regulatory networks. Nature Biotechnol. 21, 1337–1342 (2003)
  26. Covert, M. W., Schilling, C. H. & Palsson, B. Regulation of gene expression in flux balance models of metabolism. J. Theor. Biol. 213, 73–88 (2001)
  27. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474 (1997)
  28. Datsenko, K. A. & Wanner, B. L. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl Acad. Sci. USA 97, 6640–6645 (2000)
  29. Li, C. & Wong, W. H. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc. Natl Acad. Sci. USA 98, 31–36 (2001)
  30. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995)



We thank K. Stadsklev and A. Fleming for assistance with computation; Z. Zhang and A. Raghunathan for experimental assistance; the Perna and Blattner laboratories for access to the high-throughput phenotyping data in the ASAP database; and the NIH for funding and support. M.W.C. and B.O.P. designed the project and were involved in all phases of the study; E.M.K. carried out experiments; J.L.R. reconstructed the model, ran simulations and did the phenotyping analysis; M.J.H. did the statistical analysis of the gene expression data.

Author information


Corresponding author

Correspondence to Bernhard O. Palsson.

Ethics declarations

Competing interests

UCSD has licensed patent applications to a spin-off company, Genomatica, that may relate to the present paper. UCSD and some of the authors hold shares in Genomatica.

Supplementary information

Supplementary Notes

Text describing the contents of all the supplementary Excel files in more detail, together with a case-by-case study of inconsistent environments and strains from Figure 1 in the main text, and a completed MIAME checklist. (DOC 118 kb)

Supplementary Data 1

Regulatory Model Rules (iMC1010v1). A list of the genes accounted for by the model, together with the regulatory rules, if any. (XLS 126 kb)

Supplementary Data 2

Simulation Parameters. A detailed list of all parameters used to run the simulations described in the manuscript. (XLS 47 kb)

Supplementary Data 3

Regulatory Model Abbreviations. Abbreviations used in the model to represent metabolites or metabolic reactions. (XLS 81 kb)

Supplementary Data 4

Phenotype-Model Comparison. A more detailed version of the phenotype model comparison shown in Figure 1 of the main text. (XLS 220 kb)

Supplementary Data 5

Phenotype sensitivity analysis. Sensitivity analysis of the phenotype cutoff parameter. (XLS 19 kb)

Supplementary Data 6

Anaerobic-aerobic culture data. Growth, substrate uptake and by-product secretion of wild-type and six knockout E. coli strains under aerobic and anaerobic conditions. (XLS 18 kb)

Supplementary Data 7

Normalized Array Data. A table with all the dChip-normalized array data from our experiments. (XLS 9871 kb)

Supplementary Data 8

Detailed hypothesis list. A detailed list of the regulatory interaction hypotheses generated by this study. Includes new regulatory rules implemented in iMC1010v2. (XLS 54 kb)

Supplementary Data 9

qPCR cross-validation. Results of qPCR validation of various changes in gene expression from the microarray data set. (XLS 17 kb)


About this article

Cite this article

Covert, M., Knight, E., Reed, J. et al. Integrating high-throughput and computational data elucidates bacterial networks. Nature 429, 92–96 (2004).
