  • Review Article

Towards multidimensional genome annotation

Key Points

  • Sequencing efforts have provided us with detailed information about the genetic content of various organisms across all three domains of life. As genomic sciences continue to evolve we can anticipate that multiple dimensions in genome annotation will emerge as we characterize genome-scale functions. The expansion in dimensionality of genome annotation allows for the formalization of our knowledge about genomes, their attributes and functions.

  • A one-dimensional annotation provides information on the location of genes and any information on the known or putative function of gene products. A two-dimensional annotation uses information about the functional networks in a cell to specify the cellular components and their interactions.

  • Two-dimensional annotations of many cellular processes can be represented as biochemical transformations. These two-dimensional annotations serve as biochemically and genetically structured databases through which data can be analysed and from which computational models can be generated.

  • Manual, automated and iterative methods for generating two-dimensional reconstructions of cellular metabolism from one-dimensional annotations have been developed and can be applied to studying other cellular processes, such as signalling, transcription and translation.

  • Higher dimensions in genome annotation are beginning to appear, where the three-dimensional structural arrangement of genomes within the confines of a cell is accounted for and where changes in genome sequence over evolutionary time are tracked.

  • We currently have the methods and information needed to generate one-dimensional and two-dimensional annotations; as we learn more about the structural arrangement of genomes within the cell and how these genomes adaptively evolve, we can begin to generate higher levels of annotation.

Abstract

Our information about the gene content of organisms continues to grow as more genomes are sequenced and gene products are characterized. Sequence-based annotation efforts have led to a list of cellular components, which can be thought of as a one-dimensional annotation. With growing information about component interactions, facilitated by the advancement of various high-throughput technologies, systemic, or two-dimensional, annotations can be generated. Knowledge about the physical arrangement of chromosomes will lead to a three-dimensional spatial annotation of the genome and a fourth dimension of annotation will arise from the study of changes in genome sequences that occur during adaptive evolution. Here we discuss all four levels of genome annotation, with specific emphasis on two-dimensional annotation methods.

Figure 1: Four levels of annotation.
Figure 2: Model-guided network expansion.

Acknowledgements

The authors would like to thank T. Allen and S. Fong for useful comments on the manuscript. This work was funded in part by the US National Institutes of Health. B.O.P. serves on the scientific advisory board of Genomatica, Inc.

Author information

Corresponding author

Correspondence to Bernhard O. Palsson.

Ethics declarations

Competing interests

Jennifer L. Reed, Iman Famili, Ines Thiele & Bernhard O. Palsson. Towards multidimensional genome annotation. Nature Reviews Genetics 7, 130–141 (2006); doi:10.1038/nrg1769

Bernhard O. Palsson serves on the Scientific Advisory Board of Genomatica, Inc.

Related links

FURTHER INFORMATION

BRENDA — the comprehensive enzyme information system

Entrez Gene

ExPASy Proteomics Server

GENSCAN

GlimmerM

Human Metabolic Network Reconstruction web site

KEGG — Kyoto Encyclopedia of Genes and Genomes

MetaCyc

Pathway Tools

PSORTdb

PubChem

The GLIMMER homepage

TIGR — The Institute for Genomic Research

TransportDB

UniProtKB

Glossary

One-dimensional annotation

Details the position of genes within the genome and describes the cellular function of gene products.

Two-dimensional annotation

Accounts for the cellular components that are identified in a one-dimensional annotation as well as their chemical and physical interactions.

Network reconstruction

A description of the network components and their interactions.

Three-dimensional annotation

Details the spatial location of genes (rather than the gene products) within the cell as a result of genome packaging.

Four-dimensional annotation

Details changes in genome sequence that result from adaptive evolution.

Metabolite connectivity

The number of reactions a given metabolite participates in.
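As a minimal sketch of this definition, connectivity can be counted directly from a stoichiometric matrix. The three-metabolite network below, the reaction names and the function name `metabolite_connectivity` are hypothetical and chosen only for illustration; they are not taken from the article.

```python
# Hypothetical toy network (assumption, not from the article).
# Rows are metabolites, columns are reactions R1..R3:
#   R1: A -> B      R2: B -> C      R3: A + B -> C
metabolites = ["A", "B", "C"]
stoich = [
    [-1,  0, -1],  # A
    [ 1, -1, -1],  # B
    [ 0,  1,  1],  # C
]

def metabolite_connectivity(stoich, metabolites):
    """Count, for each metabolite, the reactions with a non-zero coefficient."""
    return {
        name: sum(1 for coeff in row if coeff != 0)
        for name, row in zip(metabolites, stoich)
    }

connectivity = metabolite_connectivity(stoich, metabolites)
```

In this toy network, B takes part in all three reactions, whereas A and C each take part in two.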

Systemic reactions

Mathematically derived reactions which represent overall or dominant types of chemical transformation in a given network.

Isozymes

Proteins encoded by different genes that catalyse the same reaction.

Computational model

A set of equations that mathematically represents a network reconstruction and is used to predict the behaviour of a system.

Precursor metabolites

Metabolites that are generated by catabolic pathways and used by anabolic pathways to generate biomass components.

Biomass components

The macromolecules (proteins, carbohydrates, lipids and nucleotides), vitamins, cofactors, metals and minerals that make up a cell.

Boolean rules

Logic statements that use Boolean operators (and, or, not) to evaluate the on/off state of a variable.
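The sketch below shows one way such rules might be evaluated. The nested-tuple rule format, the gene names (`geneA`, `geneB`, `geneR`) and the function `rule_is_on` are assumptions made for this example, not a format described in the article.

```python
def rule_is_on(rule, state):
    """Evaluate a Boolean rule against a dict of on/off (True/False) states.

    A rule is either a variable name (str) or a tuple whose first element
    is an operator: ("and", r1, r2, ...), ("or", r1, r2, ...) or ("not", r).
    """
    if isinstance(rule, str):
        return state[rule]
    op, *args = rule
    if op == "and":
        return all(rule_is_on(arg, state) for arg in args)
    if op == "or":
        return any(rule_is_on(arg, state) for arg in args)
    if op == "not":
        (arg,) = args  # "not" takes exactly one operand
        return not rule_is_on(arg, state)
    raise ValueError(f"unknown operator: {op}")

# Hypothetical rule: either isozyme gene suffices, unless a repressor is on.
rule = ("and", ("or", "geneA", "geneB"), ("not", "geneR"))

on_without_repressor = rule_is_on(
    rule, {"geneA": True, "geneB": False, "geneR": False}
)
on_with_repressor = rule_is_on(
    rule, {"geneA": True, "geneB": True, "geneR": True}
)
```

Here the "or" branch mirrors the isozyme case, in which either of two genes is enough, and the "not" branch switches the variable off whenever the hypothetical repressor is on.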

P/O ratio

The number of ATP molecules (P) that are formed per oxygen atom (O) consumed during respiration.

Network gap

One or more reactions that are missing from the network reconstruction owing to the lack of direct genetic or biochemical evidence.

Blocked reactions

Reactions that, at steady state, can have no net flux (reactions that involve dead-end metabolites are blocked reactions).
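One way to state this definition formally, assuming the usual constraint-based formulation: the stoichiometric matrix $S$, the flux vector $v$ and the bounds $v^{\min}$, $v^{\max}$ are symbols introduced here for illustration and are not defined in this glossary.

```latex
\[
  S\,v = 0, \qquad v^{\min} \le v \le v^{\max}
\]
\[
  \text{reaction } j \text{ is blocked}
  \iff
  \max_{v} v_j \;=\; \min_{v} v_j \;=\; 0
  \quad \text{subject to the constraints above.}
\]
```

Under these assumptions, a reaction that produces a metabolite with no consuming reaction cannot carry flux, because the steady-state balance for that metabolite forces its flux to zero.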

Pathway holes

Missing reactions from defined metabolic pathways such as glycolysis and amino-acid biosynthesis.

Dead-end metabolites

Metabolites that are either only produced or only consumed by the metabolic network (pathway holes, network gaps and blocked reactions involve dead-end metabolites).
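A minimal sketch of how dead-end metabolites could be found from a stoichiometric matrix is given below. The toy network, the `reversible` flags and both function names are hypothetical. The second function lists only the reactions that touch a dead-end metabolite, which is one source of blocked reactions rather than a complete test.

```python
def dead_end_metabolites(stoich, metabolites, reversible=None):
    """Return metabolites that are only produced or only consumed.

    stoich: rows are metabolites, columns are reactions (negative = consumed).
    reversible: optional list of booleans, one per reaction; a reversible
    reaction can both produce and consume each of its metabolites.
    """
    n_reactions = len(stoich[0])
    reversible = reversible or [False] * n_reactions
    dead = []
    for name, row in zip(metabolites, stoich):
        pairs = list(zip(row, reversible))
        produced = any(c > 0 or (c < 0 and rev) for c, rev in pairs)
        consumed = any(c < 0 or (c > 0 and rev) for c, rev in pairs)
        if produced != consumed:  # exactly one of the two holds
            dead.append(name)
    return dead

def reactions_touching(stoich, metabolites, targets):
    """Return indices of reactions with a non-zero coefficient for any target."""
    rows = [row for name, row in zip(metabolites, stoich) if name in targets]
    return [j for j in range(len(stoich[0])) if any(row[j] != 0 for row in rows)]

# Hypothetical toy network (assumption, not from the article):
#   R0: -> A   R1: A -> B   R2: B -> C   R3: C ->   R4: B -> D
metabolites = ["A", "B", "C", "D"]
stoich = [
    [1, -1,  0,  0,  0],  # A
    [0,  1, -1,  0, -1],  # B
    [0,  0,  1, -1,  0],  # C
    [0,  0,  0,  0,  1],  # D is produced by R4 but never consumed
]

dead = dead_end_metabolites(stoich, metabolites)
candidates = reactions_touching(stoich, metabolites, dead)
```

In this toy network D is a dead end, so R4 (index 4) cannot carry flux at steady state unless a reaction that consumes D is added.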

Flux-coupling analysis

A computational method that determines how fluxes through a pair of reactions are related.

About this article

Cite this article

Reed, J., Famili, I., Thiele, I. et al. Towards multidimensional genome annotation. Nat Rev Genet 7, 130–141 (2006). https://doi.org/10.1038/nrg1769

  • DOI: https://doi.org/10.1038/nrg1769
