Synopsis

Subject Categories: Metabolic and regulatory networks | Cellular Metabolism

Molecular Systems Biology 3 Article number: 121  doi:10.1038/msb4100155
Published online: 26 June 2007
Citation: Molecular Systems Biology 3:121

A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information

Adam M Feist1, Christopher S Henry2, Jennifer L Reed1, Markus Krummenacker3, Andrew R Joyce1, Peter D Karp3, Linda J Broadbelt2, Vassily Hatzimanikatis4 & Bernhard Ø Palsson1

  1. Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
  2. Department of Chemical and Biological Engineering, McCormick School of Engineering and Applied Sciences, Northwestern University, Evanston, IL, USA
  3. Bioinformatics Research Group, SRI International, Ravenswood, CA, USA
  4. Laboratory of Computational Systems Biotechnology, Ecole polytechnique fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland

Correspondence to: Bernhard Ø Palsson, Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, Mail Code 0412, La Jolla, CA 92093, USA. Tel.: +1 858 534 5668; Fax: +1 858 822 3120; Email: bpalsson@bioeng.ucsd.edu

Received 20 December 2006; Accepted 12 April 2007; Published online 26 June 2007


Article highlights

  • We generated a genome-scale reconstruction of the model organism E. coli K-12 MG1655 that includes 1260 ORFs from the latest genome annotation, together with thermodynamic information for the reactions and compounds in the reconstruction. During the reconstruction process, we synchronized its contents with the EcoCyc/MetaCyc databases to include the most up-to-date information available for E. coli metabolism.
  • We characterized the reconstruction content with respect to previous reconstructions of E. coli metabolism and detailed five major enhancements over the previous version: increased scope, compartmentalization, increased pathway detail, incorporation of reaction thermodynamics and alignment with the EcoCyc database.
  • We delineated the steps necessary to utilize a reconstruction as a predictive model for constraint-based analyses and verified that the included content was able to predict observable outcomes when compared to growth experiments.
  • To determine and quantify the computational model's ability to predict cellular phenotypes, we examined the influence of specific parameters on flux balancing solutions and compared computational predictions with available high-throughput experimental data. Modeling predictions agreed well with both high-throughput growth and gene-essentiality data and disagreements provide specific areas for biological discovery and model refinement/expansion.


Synopsis

Genome sequencing and annotation, along with biochemical characterization of cellular machinery, have enabled reconstruction of cellular processes on the genomic scale since the turn of the millennium. Initial targets for reconstruction were primarily microorganisms, but with advancements in genomic sequencing technology and annotation techniques, reconstructions of higher-order species are appearing (Reed et al, 2006a). Of the available genome-scale reconstructions, those for metabolism are most prevalent due to the large body of work characterizing and cataloging metabolic processes. Similarly, the metabolic reconstruction of the bacterium E. coli is arguably the most advanced because it draws on the most complete body of data available for an organism's metabolism and growth behavior. Building on the rich history of E. coli metabolic reconstruction (Figure 1), we generated an updated genome-scale metabolic reconstruction for E. coli containing 1260 ORFs from the latest genome annotation (Riley et al, 2006). We characterize the iAF1260 reconstruction, detail its conversion to a computational model and demonstrate its application as a model to predict selected cellular phenotypes.

Figure 1

Classification of the ORFs, reactions and metabolites included in iAF1260. (A) Coverage of characterized ORFs from each of the COGs functional classes included in iAF1260 and five previous reconstructions. The percentage given is the total coverage accounted for in iAF1260 for each class. Some ORFs included in the reconstructions did not have a COG functional class assignment (see Supplementary information). (B) The number of reactions (both gene-associated and non-gene-associated) that are associated with ORFs from each COG functional class. Since ORFs can belong to multiple classes, the percent unique in each class is listed. Non-gene-associated reactions were assigned to a class manually. (C) The number of metabolites that participate in reactions from each functional class and the percent unique in each class. Other (OT) includes classes J, K, L, O, T, U, V (see Supplementary information). NC, no COG assignment.


Figure 1 characterizes the content of iAF1260 by categorizing the ORFs (i.e., genes), reactions and metabolites contained in the reconstruction in terms of their Clusters of Orthologous Groups (COGs) functional class. The figure also outlines the genetic content of five previous reconstructions of E. coli metabolism. The major areas of expansion for iAF1260 relative to previous work (i.e., the enhancements) fall into five categories: (i) an increased scope, (ii) compartmentalization of the reconstruction into three distinct cellular compartments (the cytosol, periplasm and extracellular space), (iii) increased pathway detail, (iv) incorporation of reaction thermodynamic information, and (v) alignment with the E. coli-specific database, EcoCyc. Each of these five major areas of expansion is discussed further in the text.

To demonstrate clearly how a genome-scale reconstruction can be utilized as a computational model for phenotypic predictions (e.g., flux balance analysis (FBA) calculations), we delineated the steps necessary for converting a reconstruction to a computational model (Figure 3). In summary, this process involves: (i) explicit assignment of the metabolites participating in a reaction (often incorporated into the reconstruction process), (ii) definition of a system boundary, (iii) conversion of the defined system into a mathematical format that forms the basis for a computational model, (iv) curation of the network, which often requires filling gaps in the reconstruction, and (v) determination of the strain-specific parameters for a particular organism or system (e.g., maintenance parameters).
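The mathematical format mentioned in step (iii) is not spelled out in this synopsis. As a sketch, FBA is commonly written as the linear program below, using standard notation that is assumed here rather than taken from the text: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c selects the objective (for example, flux through a biomass reaction), and the bounds on each flux encode the system boundary of step (ii) and the parameters of step (v).

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf T} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max} \quad \text{for every reaction } j
\end{aligned}
```

Under this formulation, an irreversible reaction would simply have a lower bound of zero, and the availability of a nutrient is set through the bounds on its exchange reaction.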

Figure 2

Thermodynamic properties of the reactions in iAF1260. (A) The distribution of estimated Δ_rG'^m values for the reactions in iAF1260. Δ_rG'^m could be estimated for 1996 reactions (96%) in the reconstruction. 64% of the represented reactions have a negative Δ_rG'^m, and 20% of the reactions have a Δ_rG'^m of approximately zero. This distribution of Δ_rG'^m values indicates that most reactions in the model are thermodynamically favorable at millimolar concentration conditions. (B) The range of possible Δ_rG' values for the reactions in iAF1260. Δ_rG' differs from Δ_rG'^o (orange diamonds) and Δ_rG'^m (black diamonds) due to variations in metabolite concentrations from the 1 M and 1 mM reference states, respectively. Metabolite concentrations typically range from 0.02 to 0.00001 M, resulting in a wide range of values for Δ_rG' (blue error bars). Taking uncertainty into account, the range of possible values for Δ_rG' can be extended (purple error bars). The Δ_rG' ranges were used to assess the feasibility and reversibility of the reactions in iAF1260; reactions for which a positive Δ_rG' is not possible are thermodynamically irreversible.
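The way a concentration range turns into a range of reaction free energies can be sketched in a few lines. The example below is a simplified illustration, not the estimation method used for iAF1260: it assumes the textbook relation in which the free energy equals the standard value plus RT times the sum of n_i ln(c_i), a temperature of 298.15 K, and it ignores estimation uncertainty, pH, ionic strength and transmembrane terms. The function names and the example reaction A -> B are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_range(dg0, stoich, c_min=1e-5, c_max=0.02, temp=298.15):
    """Return (min, max) of dG' = dG'0 + RT * sum(n_i * ln c_i).

    stoich maps a metabolite name to its stoichiometric coefficient
    (negative for reactants, positive for products). Concentrations
    are in mol/L and default to the 0.00001-0.02 M range from the text.
    """
    rt = R_KJ * temp
    lo = hi = dg0
    for n in stoich.values():
        # Products raise dG' when abundant; reactants lower it when abundant.
        lo += rt * n * math.log(c_min if n > 0 else c_max)
        hi += rt * n * math.log(c_max if n > 0 else c_min)
    return lo, hi


def delta_g_mm(dg0, stoich, temp=298.15):
    """dG' with every metabolite at the 1 mM reference state."""
    return dg0 + R_KJ * temp * sum(stoich.values()) * math.log(1e-3)


def is_irreversible_forward(dg0, stoich, **kwargs):
    """True when no allowed concentrations give a positive dG'."""
    return delta_g_range(dg0, stoich, **kwargs)[1] < 0


# Hypothetical reaction A -> B with an assumed standard value of -30 kJ/mol.
example = {"A": -1, "B": 1}
print(delta_g_range(-30.0, example))
print(is_irreversible_forward(-30.0, example))
```

For a reaction with one reactant and one product, the concentration term can shift the free energy by at most RT ln(2000), so only reactions whose standard value lies within that margin of zero can change sign under these assumptions.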


Figure 3

Utilizing iAF1260 as a predictive model. (A) A drawing of central metabolism and the electron transport system (ETS) included in iAF1260. Initially, the entire network is unconstrained. (B) Application of transcriptional regulatory effects restricts the total number of pathways, or routes, through which flux can pass in the network. (C) Further application of known reaction capacities can result in more accurate predictions. For example, the flux through the NADH dehydrogenase enzymes is split in a 1:1 ratio during a simulation to produce an optimal P/O ratio of approximately 1.4 (Gennis and Stewart, 1996; Noguchi et al, 2004). (D) The non-metabolic activity of the cell can be accounted for through maintenance parameters, which were approximated using experimental data under known media conditions. Chemostat data (see Materials and methods) were used (triangles), and the dotted line shows the modeling predictions with the appropriate maintenance parameters. (E) After the parameters are approximated, the model can be used to predict the growth rate (GR, circles), product formation (acetate, squares) and additional uptake rates (oxygen, triangles) under different environmental conditions (aerobic growth on succinate in this case).


After formulating a computational model based on the iAF1260 reconstruction, we utilized this model to predict and quantify the active pathways and probable system outputs under glucose aerobic conditions, a common E. coli laboratory growth condition. In this process, we addressed modeling issues specific to E. coli and the chosen growth conditions that must be handled for accurate phenotypic predictions using iAF1260: (i) transcriptional regulatory events, (ii) cellular maintenance costs, and (iii) reaction kinetic effects. We then applied the information from this process to study growth under a different condition, aerobic growth on succinate. The modeling predictions made using iAF1260 agreed well with reported experimental values under these conditions (Figure 3). Looking further at modeling predictions on a pathway-by-pathway and reaction-by-reaction basis, we compared results from growth on glucose to 13C-labeling experiments (Fischer et al, 2004) and found good agreement between predicted and observed values. A sensitivity analysis was also performed to determine how changes in key model parameters affect the computational results generated using FBA in conjunction with the iAF1260 model.
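The synopsis does not give the form of the maintenance costs. A common convention in constraint-based models, assumed here for illustration, splits the ATP demand into a growth-associated maintenance term (GAM, per unit of biomass formed) and a non-growth-associated term (NGAM, a constant rate), where μ is the specific growth rate:

```latex
q_{\mathrm{ATP}} = \mathrm{GAM}\cdot\mu + \mathrm{NGAM}
```

Under this convention, GAM is typically folded into the biomass reaction and NGAM is imposed as a lower bound on an ATP hydrolysis flux; the two values are then chosen so that predicted substrate uptake matches chemostat measurements across growth rates.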

In forming iAF1260, we incorporated thermodynamic information to provide another means of assessing reaction reversibility beyond what is stated in the primary literature and assignments made using general heuristic rules. This process, termed thermodynamic consistency analysis, utilized flux variability analysis (Mahadevan and Schilling, 2003) under 174 different carbon source conditions to identify reactions that operated in a thermodynamically infeasible direction during near-optimal growth. Specific examples are presented that describe how thermodynamically inconsistent reactions were altered for inclusion in iAF1260. Additionally, this analysis facilitated the further classification of reactions as essential, substitutable or blocked. Interestingly, a large number of the reactions in the reconstruction behaved uniformly regardless of the carbon source being utilized. Once the reactions that operated in thermodynamically infeasible directions according to the flux variability analysis had been identified and adjusted to remove all thermodynamic inconsistencies, we examined the calculated Δ_rG' values and further adjusted reaction directionality to be consistent with the reactions we predicted, with high likelihood, to be irreversible.
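The classification and consistency check described above can be sketched as a post-processing step on flux variability results. The criteria below are the usual reading of flux ranges and are an assumption, since the synopsis does not state the exact definitions used for iAF1260; the function names, tolerance and inputs are hypothetical. Each reaction is described by its minimum and maximum flux at near-optimal growth and by the bounds of its free energy range in kJ/mol.

```python
def classify_reaction(v_min, v_max, tol=1e-9):
    """Classify one reaction from its flux range at near-optimal growth."""
    if abs(v_min) <= tol and abs(v_max) <= tol:
        return "blocked"        # the reaction cannot carry flux
    if v_min > tol or v_max < -tol:
        return "essential"      # the range excludes zero, so flux is required
    return "substitutable"      # flux is possible but not required


def infeasible_directions(v_min, v_max, dg_min, dg_max, tol=1e-9):
    """List flux directions that the free energy range rules out.

    Forward flux (v > 0) needs some dG' < 0; reverse flux (v < 0)
    needs some dG' > 0.
    """
    flags = []
    if v_max > tol and dg_min >= 0:
        flags.append("forward")
    if v_min < -tol and dg_max <= 0:
        flags.append("reverse")
    return flags


# Hypothetical reaction: may run backwards in the model although
# its dG' range is strictly negative.
print(classify_reaction(-2.0, 3.0))
print(infeasible_directions(-2.0, 3.0, dg_min=-40.0, dg_max=-5.0))
```

A reaction flagged in this way would be a candidate for a tightened lower bound of zero, which corresponds to the kind of adjustment the paragraph describes.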

To demonstrate how iAF1260 can be used as a context for biological content, we analyzed two high-throughput data types for E. coli: growth phenotype data (http://www.biolog.com) and gene essentiality data (Figure 5) (Baba et al, 2006; Joyce et al, 2006). To do this, we utilized FBA to screen carbon, nitrogen, phosphorus and sulfur sources that could support simulated growth, and also computationally determined the essential ORFs for growth under glucose aerobic conditions. We compared our predictions to the high-throughput data sets and outlined the level of agreement between computational and experimental results. Whereas the agreements validate the content of the reconstruction and modeling methods (an overall agreement of 76% for growth phenotype and 92% for gene essentiality predictions), the disagreements provide areas for further investigation (included in the Supplementary Data). For the growth phenotypes, disagreements may indicate errors in the reconstruction, regulation that limits the utilization of pathways needed for growth, and/or areas where further biochemical characterization is needed (targeted areas for biological discovery). Disagreements in gene essentiality data point to specific areas where additional intracellular and transport reactions can be examined to rectify the disagreements or to identify a current limitation of the reconstruction and modeling methods (e.g., transcription and translation processes are not currently incorporated in the modeling scheme).
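An overall agreement figure of the kind quoted above can be computed by comparing the model's essentiality calls with the experimental calls ORF by ORF. The sketch below shows one plausible way to do this; it is not the authors' procedure, and the ORF names are placeholders rather than data from the paper.

```python
def essentiality_agreement(predicted_essential, experimental_essential, all_orfs):
    """Return (fraction of matching calls, counts per outcome)."""
    pred = set(predicted_essential)
    expt = set(experimental_essential)
    counts = {
        "both_essential": 0,
        "both_nonessential": 0,
        "model_only": 0,        # predicted essential, not observed
        "experiment_only": 0,   # observed essential, not predicted
    }
    for orf in all_orfs:
        in_pred, in_expt = orf in pred, orf in expt
        if in_pred and in_expt:
            counts["both_essential"] += 1
        elif in_pred:
            counts["model_only"] += 1
        elif in_expt:
            counts["experiment_only"] += 1
        else:
            counts["both_nonessential"] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("all_orfs must not be empty")
    matches = counts["both_essential"] + counts["both_nonessential"]
    return matches / total, counts


# Placeholder ORF names, for illustration only.
orfs = ["orfA", "orfB", "orfC", "orfD", "orfE"]
fraction, breakdown = essentiality_agreement({"orfA", "orfB"}, {"orfA", "orfC"}, orfs)
print(fraction, breakdown)
```

Keeping the two kinds of disagreement separate is useful because they suggest different follow-up: an ORF essential only in the model hints at an uncharacterized alternative route, while one essential only in the experiment hints at missing content or processes outside the model's scope.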

Figure 5

ORF essentiality predictions using iAF1260. This heat map characterizes the agreement between ORFs predicted to be essential using iAF1260 and those experimentally determined from Baba et al (2006) and Joyce et al (2006). The enlarged region details how each row corresponds to a computationally predicted essential ORF (188 total). The overall agreement between iAF1260 predictions and those found to be experimentally essential (overall, column 1) is shown along with a breakdown for ORFs found to be essential under rich media conditions (rich, column 2), under both glucose and glycerol minimal media conditions (shared, column 3) and under just glucose minimal medium conditions (glucose, column 4). ORFs are further grouped by their COG functional class (see Figure 1 for abbreviations; MU, ORF belongs to multiple COG classes). Dark blue indicates the condition under which each ORF was found to be essential. For example, folP was predicted to be an essential ORF for the biosynthesis of folate in iAF1260 under these conditions, but was not identified as essential by Baba et al (2006). This suggests the possibility of an alternative pathway for this step in E. coli that has yet to be characterized.


Finally, future directions for improvement of the metabolic reconstruction of E. coli are discussed, and as the field of systems biology expands, it is expected that iAF1260 will serve as a key component for the study of E. coli and related organisms by providing a comprehensive picture of cellular metabolism.


Acknowledgements

We thank Kenyon Applebee, Edward Chuong, Ingrid Keseler, Sean Nihalani, Alan Ruttenberg, Milton Saier, Jan Schellenberger and Jeremy Zucker for their help in the generation and analysis of the reconstruction. Studies performed at UCSD were supported by National Institutes of Health Grant GM057089. Bernhard Palsson and UCSD have a financial interest in Genomatica Inc. Although the NIH R01 GM057089 grant has been identified for conflict of interest management based on the overall scope of the project and its potential to benefit Genomatica Inc., the research findings included in this publication do not necessarily directly relate to the interests of Genomatica Inc.


References

  1. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008
  2. Fischer E, Zamboni N, Sauer U (2004) High-throughput metabolic flux analysis based on gas chromatography-mass spectrometry derived 13C constraints. Anal Biochem 325: 308–316
  3. Gennis RB, Stewart V (1996) Respiration. In Escherichia coli and Salmonella, Neidhardt FC (ed), pp 217–261. ASM Press: Washington, DC
  4. Joyce AR, Reed JL, White A, Edwards R, Osterman A, Baba T, Mori H, Lesely SA, Palsson BO, Agarwalla S (2006) Experimental and computational assessment of conditionally essential genes in Escherichia coli. J Bacteriol 188: 8259–8271
  5. Mahadevan R, Schilling CH (2003) The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng 5: 264–276
  6. Noguchi Y, Nakai Y, Shimba N, Toyosaki H, Kawahara Y, Sugimoto S, Suzuki E (2004) The energetic conversion competence of Escherichia coli during aerobic respiration studied by 31P NMR using a circulating fermentation system. J Biochem (Tokyo) 136: 509–515
  7. Reed JL, Famili I, Thiele I, Palsson BO (2006a) Towards multidimensional genome annotation. Nat Rev Genet 7: 130–141
  8. Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett III G, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D, Wanner BL (2006) Escherichia coli K-12: a cooperatively developed annotation snapshot-2005. Nucleic Acids Res 34: 1–9
