Protocol

A protocol for generating a high-quality genome-scale metabolic reconstruction

Nature Protocols volume 5, pages 93–121 (2010)

Abstract

Network reconstructions are a common denominator in systems biology. Bottom–up metabolic network reconstructions have been developed over the last 10 years. These reconstructions are structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. Converting a reconstruction into a mathematical format enables a wide range of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published, and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage, which may limit their predictive potential and their use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. This protocol thus serves as a manual for all stages of the reconstruction process.
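To make the "conversion into a mathematical format" concrete, the following is a minimal sketch, not taken from the protocol itself, of how a small reaction list can be turned into a stoichiometric matrix S and checked for steady-state mass balance (S·v = 0). The four-reaction network, its identifiers and the flux vector are hypothetical and chosen only for illustration.

```python
# Toy reconstruction: each reaction maps metabolite IDs to stoichiometric
# coefficients (negative = consumed, positive = produced).
# All reaction and metabolite names are hypothetical.
reactions = {
    "EX_A": {"A": 1},            # uptake of A
    "R1":   {"A": -1, "B": 1},   # A -> B
    "R2":   {"B": -1, "C": 2},   # B -> 2 C
    "EX_C": {"C": -1},           # secretion of C
}


def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction IDs, S) with one row per metabolite."""
    mets = sorted({m for stoich in reactions.values() for m in stoich})
    rxns = list(reactions)
    S = [[reactions[r].get(m, 0) for r in rxns] for m in mets]
    return mets, rxns, S


def mass_balance(S, v):
    """Compute S·v; an all-zero result means the flux vector is at steady state."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


mets, rxns, S = stoichiometric_matrix(reactions)
v = [1, 1, 1, 2]                 # candidate flux through EX_A, R1, R2, EX_C
residual = mass_balance(S, v)
print(mets, rxns, S, residual)
```

A genome-scale reconstruction follows the same idea with thousands of rows and columns; in practice this matrix would be built and analyzed with a dedicated toolbox rather than by hand.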

Acknowledgements

We would like to acknowledge R.M.T. Fleming, A. Feist and N. Jamshidi for their valuable discussions. We thank M. Abrahams, S.A. Becker and F.-C. Cheng for reading the paper. We also thank S. Burning for preparing the biomass reaction manual, as well as A. Bordbar and R.M.T. Fleming for providing Matlab code. I.T. was supported by National Institutes of Health (NIH) grant R01 GM057089.

Author information

Author notes

    • Ines Thiele

    Current address: Center for Systems Biology, Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland.

Affiliations

  1. Department of Bioengineering, University of California, San Diego, La Jolla, California, USA.

    • Ines Thiele
    • Bernhard Ø Palsson

Corresponding author

Correspondence to Bernhard Ø Palsson.

Supplementary information

PDF files

  1. Supplementary Method 1 | Supplemental tables, figures, and sample use of COBRA Toolbox functions presented in the protocol.

    The supplemental tables and figures provide additional information that can be helpful during the reconstruction process. We also included detailed examples of the use of various COBRA Toolbox commands needed during the network evaluation and debugging phase. Furthermore, the file includes a list of standards that have been commonly used in metabolic reconstructions (e.g., naming conventions).

Excel files

  1. Supplementary Method 2 | Extract of a curated reconstruction.

    This spreadsheet can be used as a starting point for the manual reconstruction. It contains all necessary columns for reaction and metabolite curation. The order of columns of the metabolite and reaction spreadsheets is important for importing the reconstruction into Matlab using the ‘xls2model’ function (Step 39). The example also highlights which information is obligatory.
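    Because the import depends on column order, it can help to verify a sheet's header before importing it. The sketch below is an illustration only: the expected column names are hypothetical placeholders and do not reproduce the layout that 'xls2model' actually requires, which should be taken from the supplementary spreadsheet itself.

```python
import csv
import io

# Hypothetical column order for a reaction sheet; replace with the order
# used in the supplementary spreadsheet before relying on this check.
EXPECTED_REACTION_COLUMNS = ["Abbreviation", "Name", "Reaction", "GPR"]


def header_mismatches(header, expected):
    """Return (position, expected name, found name) for each out-of-place column."""
    problems = []
    for i, want in enumerate(expected):
        got = header[i].strip() if i < len(header) else None
        if got != want:
            problems.append((i, want, got))
    return problems


# A sheet exported to CSV with two columns accidentally swapped.
sample = "Abbreviation,Reaction,Name,GPR\nR1,A -> B,Reaction one,g1\n"
header = next(csv.reader(io.StringIO(sample)))
problems = header_mismatches(header, EXPECTED_REACTION_COLUMNS)
for position, want, got in problems:
    print(f"column {position}: expected {want!r}, found {got!r}")
```

    An empty result means the header matches the expected order; any reported mismatch should be fixed in the spreadsheet before import.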

About this article

DOI

https://doi.org/10.1038/nprot.2009.203
