Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns¹. We and others² are developing computation-guided strategies for functional discovery with ‘metabolite docking’ to experimentally derived³ or homology-based⁴ three-dimensional structures. Bacterial metabolic pathways are often encoded by ‘genome neighbourhoods’ (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by ‘predicting’ the intermediates in the glycolytic pathway in Escherichia coli⁵. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), namely 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.



Data deposits

The atomic coordinates and structure factors for apo Hyp-B 2-epimerase (HpbD) and tHyp-B-liganded HpbD are deposited in the Protein Data Bank under accession numbers 2PMQ and 4H2H, respectively.


  1. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLOS Comput. Biol. 5, e1000605 (2009)

  2. The Enzyme Function Initiative. Biochemistry 50, 9950–9962 (2011)

  3. Structure-based activity prediction for an enzyme of unknown function. Nature 448, 775–779 (2007)

  4. Prediction and assignment of function for a divergent N-succinyl amino acid racemase. Nature Chem. Biol. 3, 486–491 (2007)

  5. Studying enzyme-substrate specificity in silico: a case study of the Escherichia coli glycolysis pathway. Biochemistry 49, 4003–4005 (2010)

  6. The enolase superfamily: a general strategy for enzyme-catalyzed abstraction of the α-protons of carboxylic acids. Biochemistry 35, 16489–16501 (1996)

  7. Divergent evolution in the enolase superfamily: the interplay of mechanism and specificity. Arch. Biochem. Biophys. 433, 59–70 (2005)

  8. Divergent evolution in enolase superfamily: strategies for assigning functions. J. Biol. Chem. 287, 29–34 (2012)

  9. Using the KEGG database resource. Curr. Protocols Bioinformatics 38, 1.12.1–1.12.43 (2012)

  10. Predicting substrates by docking high-energy intermediates to enzyme structures. J. Am. Chem. Soc. 128, 15882–15891 (2006)

  11. Cation-π interactions as determinants for binding of the compatible solutes glycine betaine and proline betaine by the periplasmic ligand-binding protein ProX from Escherichia coli. J. Biol. Chem. 279, 5588–5596 (2004)

  12. Symbiotic plasmid genes essential to the catabolism of proline betaine, or stachydrine, are also required for efficient nodulation by Rhizobium meliloti. FEMS Microbiol. Lett. 115, 305–311 (1994)

  13. The stachydrine catabolism region in Sinorhizobium meliloti encodes a multi-enzyme complex similar to the xenobiotic degrading systems in other bacteria. Gene 244, 151–161 (2000)

  14. Identification of two gene clusters and a transcriptional regulator required for Pseudomonas aeruginosa glycine betaine catabolism. J. Bacteriol. 190, 2690–2699 (2008)

  15. Quaternary ammonium oxidative demethylation: X-ray crystallographic, resonance Raman, and UV-visible spectroscopic analysis of a Rieske-type demethylase. J. Am. Chem. Soc. 134, 2823–2834 (2012)

  16. Osmoregulation in Escherichia coli by accumulation of organic osmolytes: betaines, glutamic acid, and trehalose. Arch. Microbiol. 147, 1–7 (1987)

  17. Osmoprotective compounds in the Plumbaginaceae: a natural experiment in metabolic engineering of stress tolerance. Proc. Natl Acad. Sci. USA 91, 306–310 (1994)

  18. Proline betaine is a highly effective osmoprotectant for Staphylococcus aureus. Arch. Microbiol. 163, 138–142 (1995)

  19. Variations in the response of salt-stressed Rhizobium strains to betaines. Arch. Microbiol. 143, 359–364 (1986)

  20. Proline betaine accumulation and metabolism in alfalfa plants under sodium chloride stress. Exploring its compartmentalization in nodules. J. Bacteriol. 188, 6308–6317 (2006)

  21. Regulation of gene expression by hypertonicity. Annu. Rev. Physiol. 59, 437–455 (1997)

  22. Uptake and synthesis of compatible solutes as microbial stress responses to high-osmolality environments. Arch. Microbiol. 170, 319–330 (1998)

  23. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50, 4402–4410 (2011)

  24. Understanding enzyme superfamilies. Chemistry as the fundamental determinant in the evolution of new catalytic activities. J. Biol. Chem. 272, 30591–30594 (1997)

  25. Identification and characterization of D-hydroxyproline dehydrogenase and Δ1-pyrroline-4-hydroxy-2-carboxylate deaminase involved in novel L-hydroxyproline metabolism of bacteria: metabolic convergent evolution. J. Biol. Chem. 287, 32674–32688 (2012)

  26. Control of hydroxyproline catabolism in Sinorhizobium meliloti. Mol. Microbiol. 85, 1133–1147 (2012)

  27. Transport and catabolism of proline betaine in salt-stressed Rhizobium meliloti. Arch. Microbiol. 151, 143–148 (1989)

  28. Measurement of marine osmolytes in mammalian serum by liquid chromatography–tandem mass spectrometry. Anal. Biochem. 420, 7–12 (2012)

  29. A method for introduction of unmarked mutations in the genome of Paracoccus denitrificans: construction of strains with multiple mutations in the genes encoding periplasmic cytochromes c550, c551i, and c553i. J. Bacteriol. 173, 6962–6970 (1991)

  30. Carboxin resistance in Paracoccus denitrificans conferred by a mutation in the membrane-anchor domain of succinate:quinone reductase. Arch. Microbiol. 170, 27–37 (1998)

  31. Pythoscape: a framework for generation of large protein similarity networks. Bioinformatics 28, 2845–2846 (2012)

  32. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005)

  33. 2009 Protein Preparation Wizard; Epik version 2.0; Impact version 5.5; Prime version 2.1 (Schrödinger LLC, 2009)

  34. 2009 LigPrep, version 2.3 (Schrödinger LLC, 2009)

  35. Virtual screening against highly charged active sites: identifying substrates of α–β barrel enzymes. Biochemistry 44, 2059–2071 (2005)

  36. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein–ligand complexes. J. Med. Chem. 49, 6177–6196 (2006). doi:10.1021/jm051256o

  37. High throughput protein production and crystallization at NYSGXRC. Methods Mol. Biol. 426, 561–575 (2008)

  38. Autoinduction of protein expression. Curr. Protocols Protein Sci. 5.23.1–5.23.18 (2009)

  39. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005)

  40. The integration of macromolecular diffraction data. Acta Crystallogr. D Biol. Crystallogr. 62, 48–57 (2006)

  41. Scaling and assessment of data quality. Acta Crystallogr. D Biol. Crystallogr. 62, 72–82 (2006)

  42. Phaser crystallographic software. J. Appl. Cryst. 40, 658–674 (2007)

  43. Automated structure solution with the PHENIX suite. Methods Mol. Biol. 426, 419–435 (2008)

  44. Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004)

  45. PRODRG: a tool for high-throughput crystallography of protein–ligand complexes. Acta Crystallogr. D Biol. Crystallogr. 60, 1355–1363 (2004)

  46. A broad host range mobilization system for in vivo genetic engineering: transposon mutagenesis in Gram negative bacteria. Nature Biotechnol. 1, 784–791 (1983)

  47. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔC_T) method. Methods 25, 402–408 (2001)

  48. A RubisCO-like protein links SAM metabolism with isoprenoid biosynthesis. Nature Chem. Biol. 8, 926–932 (2012)

  49. Isolation of glycine betaine and proline betaine from human urine. Assessment of their role as osmoprotective agents for bacteria and the kidney. J. Clin. Invest. 79, 731–737 (1987)

  50. Simultaneous measurement of proline and related compounds in oak leaves by high-performance ligand-exchange chromatography and electrospray ionization mass spectrometry for environmental stress studies. J. Chromatogr. A 1216, 1094–1099 (2009)

  51. Determination of betaine metabolites and dimethylsulfoniopropionate in coral tissues using liquid chromatography–time-of-flight mass spectrometry and stable isotope-labeled internal standards. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 878, 1809–1816 (2010)



This research was supported by cooperative agreements from the US National Institutes of Health (U54GM093342, U54GM074945 and U54GM094662). Molecular graphics and analyses were performed with the University of California, San Francisco (UCSF) Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at UCSF (supported by National Institutes of Health P41-GM103311). Use of the Advanced Photon Source, an Office of Science User Facility operated for the US Department of Energy (DOE) Office of Science by Argonne National Laboratory, was supported by the US DOE under contract no. DE-AC02-06CH11357. Use of the Lilly Research Laboratories Collaborative Access Team (LRL-CAT) beamline at Sector 31 of the Advanced Photon Source was provided by Eli Lilly Company, which operates the facility.

Author information

Author notes

    • Suwen Zhao
    • Ritesh Kumar
    • Ayano Sakai
    • Matthew W. Vetting
    • B. McKay Wood

    These authors contributed equally to this work.


  1. Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143, USA

    • Suwen Zhao
    • Matthew P. Jacobson
  2. Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

    • Ritesh Kumar
    • Ayano Sakai
    • B. McKay Wood
    • Jonathan V. Sweedler
    • John A. Gerlt
    • John E. Cronan
  3. Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York 10461, USA

    • Matthew W. Vetting
    • Jeffery B. Bonanno
    • Brandan S. Hillerich
    • Ronald D. Seidel
    • Steven C. Almo
  4. Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California 94143, USA

    • Shoshana Brown
    • Patricia C. Babbitt
  5. Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

    • Jonathan V. Sweedler
    • John A. Gerlt
  6. Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

    • John A. Gerlt
    • John E. Cronan
  7. Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

    • John E. Cronan




S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. designed the research. S.Z., R.K., A.S., M.W.V., B.M.W., J.B.B., B.S.H. and R.D.S. performed the research. S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. analysed data. S.Z., R.K., A.S., M.W.V., B.M.W., S.B., J.B.B., B.S.H., R.D.S., P.C.B., S.C.A., J.V.S., J.A.G., J.E.C. and M.P.J. wrote the paper.

Competing interests

M.P.J. is a consultant to Schrödinger LLC, which developed or licensed some of the software used in this study.

Corresponding authors

Correspondence to Patricia C. Babbitt, Steven C. Almo, Jonathan V. Sweedler, John A. Gerlt, John E. Cronan or Matthew P. Jacobson.

Supplementary information

PDF files

  1. Supplementary Information

    This file contains Supplementary Tables 1-10, Supplementary Figures 1-13, a Supplementary Discussion and Supplementary References.
