Discovery of new enzymes and metabolic pathways by using structure and genome context


Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns [1]. We and others [2] are developing computation-guided strategies for functional discovery with ‘metabolite docking’ to experimentally derived [3] or homology-based [4] three-dimensional structures. Bacterial metabolic pathways are often encoded by ‘genome neighbourhoods’ (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by ‘predicting’ the intermediates in the glycolytic pathway in Escherichia coli [5]. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.
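The idea that docking against several proteins from one pathway reinforces a prediction can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the enzyme labels, the chemotype labels and every score below are hypothetical, and the only assumption taken from common docking practice is that lower (more negative) scores rank better. The sketch tallies the chemotypes among the top-ranked hits for each protein and reports those shared by all of them.

```python
from collections import Counter


def top_chemotypes(hits, top_n):
    """Count chemotype labels among the top_n best-scoring ligands.

    hits: list of (ligand_name, chemotype, docking_score) tuples.
    Lower scores are treated as better (assumption, see lead-in).
    """
    ranked = sorted(hits, key=lambda h: h[2])
    return Counter(chemotype for _, chemotype, _ in ranked[:top_n])


def shared_chemotypes(per_enzyme_hits, top_n):
    """Return chemotypes present in the top_n list of every enzyme."""
    sets = [set(top_chemotypes(h, top_n)) for h in per_enzyme_hits.values()]
    return set.intersection(*sets) if sets else set()


# Made-up scores for illustration only; not data from the study.
example = {
    "enzyme_A": [
        ("lig1", "betaine", -9.1),
        ("lig2", "sugar", -5.0),
        ("lig3", "betaine", -8.7),
        ("lig4", "nucleotide", -8.9),
    ],
    "enzyme_B": [
        ("lig1", "betaine", -7.5),
        ("lig2", "sugar", -7.9),
        ("lig5", "betaine", -8.2),
        ("lig4", "nucleotide", -4.0),
    ],
}

print(shared_chemotypes(example, top_n=2))
```

In this toy input each protein has one off-target chemotype in its top two hits, but only one chemotype appears for both, which is the kind of agreement that pathway context is expected to provide.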


Figure 1: Homology modelling and docking results for HpbD, HpbJ and HpbB1.
Figure 2: Genome contexts of HpbD in P. bermudensis and the orthologous genes in P. denitrificans.
Figure 3: Chemotype analysis of HpbD docking results.
Figure 4: Catabolic pathway for tHyp-B and kinetic constants for HpbD.

Accession codes


Protein Data Bank

Data deposits

The atomic coordinates and structure factors for apo Hyp-B 2-epimerase (HpbD) and tHyp-B-liganded HpbD are deposited in the Protein Data Bank under accession numbers 2PMQ and 4H2H, respectively.


  1. Schnoes, A. M., Brown, S. D., Dodevski, I. & Babbitt, P. C. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLOS Comput. Biol. 5, e1000605 (2009)
  2. Gerlt, J. A. et al. The Enzyme Function Initiative. Biochemistry 50, 9950–9962 (2011)
  3. Hermann, J. C. et al. Structure-based activity prediction for an enzyme of unknown function. Nature 448, 775–779 (2007)
  4. Song, L. et al. Prediction and assignment of function for a divergent N-succinyl amino acid racemase. Nature Chem. Biol. 3, 486–491 (2007)
  5. Kalyanaraman, C. & Jacobson, M. P. Studying enzyme-substrate specificity in silico: a case study of the Escherichia coli glycolysis pathway. Biochemistry 49, 4003–4005 (2010)
  6. Babbitt, P. C. et al. The enolase superfamily: a general strategy for enzyme-catalyzed abstraction of the α-protons of carboxylic acids. Biochemistry 35, 16489–16501 (1996)
  7. Gerlt, J. A., Babbitt, P. C. & Rayment, I. Divergent evolution in the enolase superfamily: the interplay of mechanism and specificity. Arch. Biochem. Biophys. 433, 59–70 (2005)
  8. Gerlt, J. A., Babbitt, P. C., Jacobson, M. P. & Almo, S. C. Divergent evolution in enolase superfamily: strategies for assigning functions. J. Biol. Chem. 287, 29–34 (2012)
  9. Tanabe, M. & Kanehisa, M. Using the KEGG database resource. Curr. Protocols Bioinformatics 38, 1.12.1–1.12.43 (2012)
  10. Hermann, J. C. et al. Predicting substrates by docking high-energy intermediates to enzyme structures. J. Am. Chem. Soc. 128, 15882–15891 (2006)
  11. Schiefner, A. et al. Cation-π interactions as determinants for binding of the compatible solutes glycine betaine and proline betaine by the periplasmic ligand-binding protein ProX from Escherichia coli. J. Biol. Chem. 279, 5588–5596 (2004)
  12. Goldmann, A. et al. Symbiotic plasmid genes essential to the catabolism of proline betaine, or stachydrine, are also required for efficient nodulation by Rhizobium meliloti. FEMS Microbiol. Lett. 115, 305–311 (1994)
  13. Burnet, M. W. et al. The stachydrine catabolism region in Sinorhizobium meliloti encodes a multi-enzyme complex similar to the xenobiotic degrading systems in other bacteria. Gene 244, 151–161 (2000)
  14. Wargo, M. J., Szwergold, B. S. & Hogan, D. A. Identification of two gene clusters and a transcriptional regulator required for Pseudomonas aeruginosa glycine betaine catabolism. J. Bacteriol. 190, 2690–2699 (2008)
  15. Daughtry, K. D. et al. Quaternary ammonium oxidative demethylation: X-ray crystallographic, resonance Raman, and UV-visible spectroscopic analysis of a Rieske-type demethylase. J. Am. Chem. Soc. 134, 2823–2834 (2012)
  16. Larsen, P. I., Sydnes, L. K., Landfald, B. & Strom, A. R. Osmoregulation in Escherichia coli by accumulation of organic osmolytes: betaines, glutamic acid, and trehalose. Arch. Microbiol. 147, 1–7 (1987)
  17. Hanson, A. D. et al. Osmoprotective compounds in the Plumbaginaceae: a natural experiment in metabolic engineering of stress tolerance. Proc. Natl Acad. Sci. USA 91, 306–310 (1994)
  18. Amin, U. S., Lash, T. D. & Wilkinson, B. J. Proline betaine is a highly effective osmoprotectant for Staphylococcus aureus. Arch. Microbiol. 163, 138–142 (1995)
  19. Bernard, T., Pocard, J.-A., Berrould, B. & Le Rudulier, D. Variations in the response of salt-stressed Rhizobium strains to betaines. Arch. Microbiol. 143, 359–364 (1986)
  20. Alloing, G., Travers, I., Sagot, B., Le Rudulier, D. & Dupont, L. Proline betaine accumulation and metabolism in alfalfa plants under sodium chloride stress. Exploring its compartmentalization in nodules. J. Bacteriol. 188, 6308–6317 (2006)
  21. Burg, M. B., Kwon, E. D. & Kultz, D. Regulation of gene expression by hypertonicity. Annu. Rev. Physiol. 59, 437–455 (1997)
  22. Kempf, B. & Bremer, E. Uptake and synthesis of compatible solutes as microbial stress responses to high-osmolality environments. Arch. Microbiol. 170, 319–330 (1998)
  23. Bar-Even, A. et al. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50, 4402–4410 (2011)
  24. Babbitt, P. C. & Gerlt, J. A. Understanding enzyme superfamilies. Chemistry as the fundamental determinant in the evolution of new catalytic activities. J. Biol. Chem. 272, 30591–30594 (1997)
  25. Watanabe, S. et al. Identification and characterization of D-hydroxyproline dehydrogenase and Δ1-pyrroline-4-hydroxy-2-carboxylate deaminase involved in novel L-hydroxyproline metabolism of bacteria: metabolic convergent evolution. J. Biol. Chem. 287, 32674–32688 (2012)
  26. White, C. E., Gavina, J. M., Morton, R., Britz-McKibbin, P. & Finan, T. M. Control of hydroxyproline catabolism in Sinorhizobium meliloti. Mol. Microbiol. 85, 1133–1147 (2012)
  27. Gloux, K. & Le Rudulier, D. Transport and catabolism of proline betaine in salt-stressed Rhizobium meliloti. Arch. Microbiol. 151, 143–148 (1989)
  28. Lenky, C. C., McEntyre, C. J. & Lever, M. Measurement of marine osmolytes in mammalian serum by liquid chromatography–tandem mass spectrometry. Anal. Biochem. 420, 7–12 (2012)
  29. Van Spanning, R. J. et al. A method for introduction of unmarked mutations in the genome of Paracoccus denitrificans: construction of strains with multiple mutations in the genes encoding periplasmic cytochromes c550, c551i, and c553i. J. Bacteriol. 173, 6962–6970 (1991)
  30. Matsson, M., Ackrell, B. A., Cochran, B. & Hederstedt, L. Carboxin resistance in Paracoccus denitrificans conferred by a mutation in the membrane-anchor domain of succinate:quinone reductase. Arch. Microbiol. 170, 27–37 (1998)
  31. Barber, A. E. & Babbitt, P. C. Pythoscape: a framework for generation of large protein similarity networks. Bioinformatics 28, 2845–2846 (2012)
  32. Katoh, K., Kuma, K., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005)
  33. Schrödinger Suite 2009 Protein Preparation Wizard; Epik version 2.0; Impact version 5.5; Prime version 2.1 (Schrödinger LLC, 2009)
  34. Schrödinger Suite 2009 LigPrep, version 2.3 (Schrödinger LLC, 2009)
  35. Kalyanaraman, C., Bernacki, K. & Jacobson, M. P. Virtual screening against highly charged active sites: identifying substrates of α–β barrel enzymes. Biochemistry 44, 2059–2071 (2005)
  36. Friesner, R. A. et al. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein–ligand complexes. J. Med. Chem. 49, 6177–6196 (2006); doi:10.1021/jm051256o
  37. Sauder, M. J. et al. High throughput protein production and crystallization at NYSGXRC. Methods Mol. Biol. 426, 561–575 (2008)
  38. Fox, B. G. & Blommel, P. G. Autoinduction of protein expression. Curr. Protocols Protein Sci. 5.23.1–5.23.18 (2009)
  39. Studier, F. W. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005)
  40. Leslie, A. G. The integration of macromolecular diffraction data. Acta Crystallogr. D Biol. Crystallogr. 62, 48–57 (2006)
  41. Evans, P. Scaling and assessment of data quality. Acta Crystallogr. D Biol. Crystallogr. 62, 72–82 (2006)
  42. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Cryst. 40, 658–674 (2007)
  43. Zwart, P. H. et al. Automated structure solution with the PHENIX suite. Methods Mol. Biol. 426, 419–435 (2008)
  44. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004)
  45. Schuttelkopf, A. W. & van Aalten, D. M. PRODRG: a tool for high-throughput crystallography of protein–ligand complexes. Acta Crystallogr. D Biol. Crystallogr. 60, 1355–1363 (2004)
  46. Simon, R., Priefer, U. & Puhler, A. A broad host range mobilization system for in vivo genetic engineering: transposon mutagenesis in Gram negative bacteria. Nature Biotechnol. 1, 784–791 (1983)
  47. Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method. Methods 25, 402–408 (2001)
  48. Erb, T. J. et al. A RubisCO-like protein links SAM metabolism with isoprenoid biosynthesis. Nature Chem. Biol. 8, 926–932 (2012)
  49. Chambers, S. T. & Kunin, C. M. Isolation of glycine betaine and proline betaine from human urine. Assessment of their role as osmoprotective agents for bacteria and the kidney. J. Clin. Invest. 79, 731–737 (1987)
  50. Oufir, M. et al. Simultaneous measurement of proline and related compounds in oak leaves by high-performance ligand-exchange chromatography and electrospray ionization mass spectrometry for environmental stress studies. J. Chromatogr. A 1216, 1094–1099 (2009)
  51. Li, C., Hill, R. W. & Jones, A. D. Determination of betaine metabolites and dimethylsulfoniopropionate in coral tissues using liquid chromatography–time-of-flight mass spectrometry and stable isotope-labeled internal standards. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 878, 1809–1816 (2010)



This research was supported by cooperative agreements from the US National Institutes of Health (U54GM093342, U54GM074945 and U54GM094662). Molecular graphics and analyses were performed with the University of California, San Francisco (UCSF) Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at UCSF (supported by National Institutes of Health P41-GM103311). Use of the Advanced Photon Source, an Office of Science User Facility operated for the US Department of Energy (DOE) Office of Science by Argonne National Laboratory, was supported by the US DOE under contract no. DE-AC02-06CH11357. Use of the Lilly Research Laboratories Collaborative Access Team (LRL-CAT) beamline at Sector 31 of the Advanced Photon Source was provided by Eli Lilly Company, which operates the facility.

Author information




S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. designed the research. S.Z., R.K., A.S., M.W.V., B.M.W., J.B.B., B.S.H. and R.D.S. performed the research. S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. analysed data. S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. wrote the paper.

Corresponding authors

Correspondence to Patricia C. Babbitt, Steven C. Almo, Jonathan V. Sweedler, John A. Gerlt, John E. Cronan or Matthew P. Jacobson.

Ethics declarations

Competing interests

M.P.J. is a consultant to Schrödinger LLC, which developed or licensed some of the software used in this study.

Supplementary information

Supplementary Information

This file contains Supplementary Tables 1-10, Supplementary Figures 1-13, a Supplementary Discussion and Supplementary References. (PDF 3010 kb)


About this article

Cite this article

Zhao, S., Kumar, R., Sakai, A. et al. Discovery of new enzymes and metabolic pathways by using structure and genome context. Nature 502, 698–702 (2013).
