Abstract
The design of biocatalytic reaction systems is highly complex owing to the dependence of the estimated kinetic parameters on the enzyme, the reaction conditions, and the modeling method. Consequently, reproducibility of enzymatic experiments and reusability of enzymatic data are challenging. We developed the XML-based markup language EnzymeML to enable storage and exchange of enzymatic data such as reaction conditions, the time course of the substrate and the product, kinetic parameters and the kinetic model, thus making enzymatic data findable, accessible, interoperable and reusable (FAIR). The feasibility and usefulness of the EnzymeML toolbox are demonstrated in six scenarios, for which data and metadata of different enzymatic reactions are collected and analyzed. EnzymeML serves as a seamless communication channel between experimental platforms, electronic lab notebooks, tools for modeling of enzyme kinetics, publication platforms and enzymatic reaction databases. EnzymeML is open and transparent, and invites the community to contribute. All documents and code are freely available at https://enzymeml.org.
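As a rough illustration of what bundling reaction conditions, a time course, kinetic parameters and a kinetic model into one XML document can look like, the following minimal Python sketch uses only the standard library. The element names, attribute names and the helper functions `build_record` and `read_time_course` are hypothetical and are not the EnzymeML schema or its API; the numerical values are placeholders, not measured data.

```python
import xml.etree.ElementTree as ET


def build_record(conditions, time_course, kinetic_params, model):
    """Assemble a toy XML record (hypothetical layout, not the EnzymeML schema)."""
    root = ET.Element("enzymeRecord")

    # Reaction conditions, e.g. pH or temperature, each with value and unit.
    cond = ET.SubElement(root, "conditions")
    for name, (value, unit) in conditions.items():
        ET.SubElement(cond, "condition", name=name, value=str(value), unit=unit)

    # Time course of one species as (time, concentration) points.
    tc = ET.SubElement(root, "timeCourse", species="substrate",
                       timeUnit="min", concUnit="mM")
    for t, c in time_course:
        ET.SubElement(tc, "point", t=str(t), c=str(c))

    # Kinetic model (rate law as text) and its estimated parameters.
    km = ET.SubElement(root, "kineticModel", equation=model)
    for name, (value, unit) in kinetic_params.items():
        ET.SubElement(km, "parameter", name=name, value=str(value), unit=unit)

    return root


def read_time_course(root):
    """Return the stored time course as a list of (time, concentration) floats."""
    return [(float(p.get("t")), float(p.get("c")))
            for p in root.find("timeCourse")]


# Placeholder values for illustration only.
record = build_record(
    conditions={"pH": (7.4, "dimensionless"), "temperature": (25.0, "C")},
    time_course=[(0.0, 10.0), (5.0, 7.5), (10.0, 5.5)],
    kinetic_params={"vmax": (0.6, "mM/min"), "Km": (2.0, "mM")},
    model="vmax * S / (Km + S)",
)

# Serialize to text and parse again, as a second tool reading the file would.
xml_text = ET.tostring(record, encoding="unicode")
restored = ET.fromstring(xml_text)
```

Because data and metadata travel in the same document, a program that receives `xml_text` can recover the time course with `read_time_course(restored)` together with the conditions under which it was measured.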

Data availability
The availability of all data is listed in Supplementary Table 11.
Code availability
The availability of all code is listed in Supplementary Table 11.
Acknowledgements
S.L., H.D., J.P.R. and J.P. were supported by the German Research Foundation under Germany’s Excellence Strategy (EXC 2075, grant 390740016) and by the German Federal Ministry of Education and Research (grant 01DG17027). J.D.S. was supported by the German Research Foundation under Germany’s Excellence Strategy (EXC 2186, grant 390919832). A.W. and U.W. were supported by the Klaus Tschira Foundation and the German Federal Ministry of Education and Research within de.NBI (031A540). T.K. and S.N. were supported by the National Research Foundation of South Africa (grants 105889 and 112099). J.M.R. was supported by the National Research Foundation of South Africa (grant 120859). A.H. and J.W. were partially funded by the Sino-Danish Center for Education and Research and the Technical University of Denmark. C.E.L. and A.S.B. were supported by the US Food and Drug Administration, Center for Drug Evaluation and Research, and Office of Pharmaceutical Quality, through grant U01FD006484. C.E.L. also acknowledges funding by the US National Science Foundation through the Graduate Research Fellowship Program (grant DGE-1650044). F.T.B. was supported by the German Federal Ministry of Education and Research within de.NBI (031L0104A). STRENDA and STRENDA DB are funded by the Beilstein-Institut. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
S.M., J.D.S., M.F.P., C.E.L., A.V.H. and S.N. contributed data to the scenarios. D.R., P.M., A.S.B., J.M.W. and T.K. supervised the scenarios. F.T.B. and J.M.R. contributed their kinetic modeling platforms. D.I., A.W. and U.W. contributed database platforms. C.K., N.S. and S.S. contributed to the conceptualization of EnzymeML and to the development of the protocols. S.L., H.D. and J.R. implemented the EnzymeML workflows and analyzed data. J.P. supervised the development and application of EnzymeML workflows and prepared the draft of the manuscript with input from all authors. All authors approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Shelley Copley, Kenneth Johnson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lauterbach, S., Dienhart, H., Range, J. et al. EnzymeML: seamless data flow and modeling of enzymatic data. Nat Methods 20, 400–402 (2023). https://doi.org/10.1038/s41592-022-01763-1