Defining informative priors for ensemble modeling in systems biology

Tsigkinopoulou, Areti; Hawari, Aliah; Uttley, Megan; Breitling, Rainer

doi:10.1038/s41596-018-0056-z

Protocol
Published: 23 October 2018


Nature Protocols volume 13, pages 2643–2663 (2018)


Abstract

Ensemble modeling in molecular systems biology requires the reproducible translation of kinetic parameter data into informative probability distributions (priors), as well as approaches that sample parameters from these distributions without violating the thermodynamic consistency of the overall model. Although a number of pioneering frameworks for ensemble modeling have been published, the issue of generating informative priors has not yet been addressed. Here, we present a protocol that aims to fill this gap. This protocol discusses the collection of parameter values from a diverse range of sources (literature, databases and experiments), assessment of their plausibility, and creation of log-normal probability distributions that can be used as informative priors in ensemble modeling. Furthermore, the protocol enables sampling from the generated distributions while maintaining thermodynamic consistency. Once all parameter values have been retrieved from literature and databases, the protocol can be implemented within ~5–10 min per parameter. The aim of this protocol is to facilitate the design and use of informative distributions for ensemble modeling, especially in fields such as synthetic biology and systems medicine.


**Fig. 1: Workflow of the protocol from parameter collection to generation of probability distributions and confirmation of thermodynamic consistency for interconnected parameters.**

**Fig. 2: Calculation of the weighted median of two unitless parameter values (modes), 10³ and 10⁶ (log-transformed to 6.9078 and 13.82, respectively), with varying weights and uncertainty.**

**Fig. 5: Estimated probability distributions for parameters K_D1 and k₁⁻, plotted on a log scale.**

**Fig. 6: The bivariate distribution for parameters k₁ and k_on1 and the two marginal distributions.**
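The idea of sampling interconnected parameters without breaking thermodynamic consistency can be illustrated with a minimal sketch. It is not the protocol's own procedure (the protocol's functions are written in MATLAB and are not reproduced here). It assumes a hypothetical binding step with an association rate constant `k_on`, a dissociation rate constant `k_off`, and a dissociation constant `K_D = k_off / k_on`, and it assumes independent log-normal priors on the two rate constants; all names and numerical values are illustrative. Sampling two of the three parameters and deriving the third guarantees that every sample satisfies the constraint, and the implied prior on `K_D` is again log-normal.

```python
import math
import random
import statistics


def sample_consistent_triplet(mu_on, sigma_on, mu_off, sigma_off, rng):
    """Draw (k_on, k_off, K_D) so that K_D = k_off / k_on holds exactly.

    mu_* and sigma_* are the mean and standard deviation of ln(k_*),
    i.e. the usual parameters of a log-normal distribution.
    """
    log_k_on = rng.gauss(mu_on, sigma_on)
    log_k_off = rng.gauss(mu_off, sigma_off)
    log_kd = log_k_off - log_k_on  # the constraint is linear in log space
    return math.exp(log_k_on), math.exp(log_k_off), math.exp(log_kd)


def implied_kd_prior(mu_on, sigma_on, mu_off, sigma_off):
    """Log-normal parameters of K_D implied by independent priors on k_on, k_off."""
    return mu_off - mu_on, math.sqrt(sigma_on**2 + sigma_off**2)


# Illustrative (made-up) priors: k_on around 1e6, k_off around 1e-2.
prior = (math.log(1e6), 0.5, math.log(1e-2), 0.8)
rng = random.Random(0)
samples = [sample_consistent_triplet(*prior, rng) for _ in range(20000)]

mu_kd, sigma_kd = implied_kd_prior(*prior)
log_kd_values = [math.log(kd) for _, _, kd in samples]
print("implied prior on ln K_D:", mu_kd, sigma_kd)
print("sampled ln K_D mean/sd: ", statistics.fmean(log_kd_values),
      statistics.stdev(log_kd_values))
```

If independent literature data also exist for the derived parameter, the implied distribution can be compared with that prior; a large mismatch signals that the three priors cannot all be sampled independently.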



Acknowledgements

We thank F. Del Carratore and the Synthetic Biology Research Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM) for providing technical support. This work received funding from the UK Biotechnology and Biological Sciences Research Council (BB/M000354/1, BB/M017702/1 (R.B.)) and the European Union’s Horizon 2020 Research and Innovation Programme (grant agreement no. 720793, the H2020 TOPCAPI project (R.B.)).

Author information

Authors and Affiliations

Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester, United Kingdom
Areti Tsigkinopoulou, Aliah Hawari & Rainer Breitling
Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, United Kingdom
Megan Uttley


Contributions

A.T. designed and developed the mathematical strategy and the MATLAB functions of the protocol and wrote the manuscript. A.H. designed the first step of the protocol, concerning the criteria of the assignment of weights; performed tests with diverse case studies; and provided feedback for the improvement of the protocol and the manuscript. M.U. tested the protocol and provided insightful advice on the improvement of the manuscript and the computational functions. R.B. supervised the project and wrote the manuscript.

Corresponding author

Correspondence to Rainer Breitling.

Ethics declarations

Competing interests

The authors declare that they have no competing interests as defined by Nature Research, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Properties of the standard normal and log-normal distributions (μ = 0 and σ = 1).

For the normal distribution, the standard deviation (σ) is additive, and 68.27% of the probability density is contained within the confidence interval [μ−σ, μ+σ]. For the log-normal distribution, the geometric standard deviation is multiplicative and describes a confidence interval around the geometric mean of the distribution, which contains 68.27% of the probability density. The Spread (or multiplicative standard deviation) describes the confidence interval around the mode of the distribution, which contains this fraction of the density. The geometric standard deviation and the Spread are equally valid ways to describe our uncertainty about a parameter, and each has its advantages for some applications. For the protocol, the main advantage in using the Mode and the Spread is the fact that the Spread is symmetric around the most likely value (mode), in the same way as the standard deviation of a normal distribution (i.e., the probability density at each endpoint of the interval is identical). This is not the case for the geometric standard deviation, as shown in the figure. As a result, it is more intuitive to specify and communicate our uncertainty about a parameter by using the confidence interval around the mode, rather than that around the median. As can be seen in the figure, the most likely parameter values might not even be included in the confidence interval around the median, which is clearly undesirable when specifying the range of plausible values.
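One way to read the definition above is sketched here: given a Mode and a Spread, find the log-normal parameters μ and σ such that the interval [Mode/Spread, Mode×Spread] holds 68.27% of the probability. This is an illustrative Python sketch, not the protocol's MATLAB implementation; the function names, the bisection solver, and the example values are assumptions. It uses the standard fact that the mode of a log-normal distribution is exp(μ − σ²).

```python
import math

COVERAGE = 0.6827  # probability mass inside [mode / spread, mode * spread]


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_mass(sigma: float, log_spread: float) -> float:
    """Mass of a log-normal inside [mode / spread, mode * spread].

    In log space the mode sits at mu - sigma**2, so the interval is
    [mu - sigma**2 - log_spread, mu - sigma**2 + log_spread]; mu cancels.
    """
    upper = (log_spread - sigma**2) / sigma
    lower = (-log_spread - sigma**2) / sigma
    return normal_cdf(upper) - normal_cdf(lower)


def lognormal_from_mode_spread(mode: float, spread: float,
                               coverage: float = COVERAGE):
    """Return (mu, sigma) of ln(X) for a log-normal with the given mode and spread."""
    if mode <= 0.0 or spread <= 1.0:
        raise ValueError("mode must be positive and spread must exceed 1")
    log_spread = math.log(spread)
    # The mass decreases monotonically in sigma, so bisection finds the root.
    lo, hi = 1e-9, 1.0
    while interval_mass(hi, log_spread) > coverage:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if interval_mass(mid, log_spread) > coverage:
            lo = mid
        else:
            hi = mid
    sigma = 0.5 * (lo + hi)
    mu = math.log(mode) + sigma**2  # from mode = exp(mu - sigma**2)
    return mu, sigma


# Example with made-up values: most likely value 1000, plausible within a factor of 10.
mu, sigma = lognormal_from_mode_spread(1000.0, 10.0)
print(mu, sigma)
```

With these parameters the probability density is the same at Mode/Spread and at Mode×Spread, which is the symmetry property described in the caption.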

Supplementary Figure 2 Schematic representation of the model.

Blue arrows correspond to the maintenance of the ATP/ADP ratio by direct assignment (in combination with reaction 13), rather than by differential equations, in the published model. Likewise, the cytosolic glycerol levels are kept at zero by direct assignment, corresponding to rapid export of glycerol.

Supplementary Figure 3 Effect of reducing TPI on the steady-state flux of glucose, pyruvate and glycerol.

Replicated results matching Fig. 3b of the published model (Helfert et al., Biochem. J. (2001)).

Supplementary Figure 4 Plots of the initial priors (red lines) and the samples from the final distributions (green histograms), along with the P values of the K-S test.

For the parameter K_m⁺ the adjusted distribution is also included (blue line).

Supplementary Figure 5 Pairwise correlations between the sampled parameter values.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–5, Supplementary Software 1–7, Supplementary Results 1 and 2, and Supplementary Tables 1–7

Supplementary Software 8

Supplementary Software scripts


About this article

Cite this article

Tsigkinopoulou, A., Hawari, A., Uttley, M. et al. Defining informative priors for ensemble modeling in systems biology. Nat Protoc 13, 2643–2663 (2018). https://doi.org/10.1038/s41596-018-0056-z


Published: 23 October 2018
Issue Date: November 2018
DOI: https://doi.org/10.1038/s41596-018-0056-z

This article is cited by

dynamAedes: a unified modelling framework for invasive Aedes mosquitoes
- Daniele Da Re
- Wim Van Bortel
- Matteo Marcantonio
Parasites & Vectors (2022)
