Scalable reaction network modeling with automatic validation of consistency in Event-B

Sanwal, Usman; Hoang, Thai Son; Petre, Luigia; Petre, Ion

doi:10.1038/s41598-022-05308-6

Download PDF

Article
Open access
Published: 25 January 2022

Scalable reaction network modeling with automatic validation of consistency in Event-B

Usman Sanwal¹,
Thai Son Hoang²,
Luigia Petre³ &
…
Ion Petre^4,5

Scientific Reports volume 12, Article number: 1287 (2022) Cite this article

968 Accesses
1 Citations
Metrics details

Subjects

Abstract

Constructing a large biological model is a difficult, error-prone process. Small errors in writing a part of the model cascade to the system level and their sources are difficult to trace back. In this paper we extend a recent approach based on Event-B, a state-based formal method with refinement as its central ingredient, allowing us to validate for model consistency step-by-step in an automated way. We demonstrate this approach on a model of the heat shock response in eukaryotes and its scalability on a model of the $\mathsf {ErbB}$ signaling pathway. All consistency properties of the model were proved automatically with computer support.

Reconciling qualitative, abstract, and scalable modeling of biological networks

Article Open access 26 August 2020

ModelBricks—modules for reproducible modeling improving model annotation and provenance

Article Open access 08 October 2019

A scalable method for parameter-free simulation and validation of mechanistic cellular signal transduction network models

Article Open access 10 January 2020

Introduction

Biological processes are large, complex, concurrent systems of biochemical reactions. It is remarkably difficult to capture all the necessary details of such a system in a single all-encompassing modeling step¹: details that are critical in some parts of the model are neutral in other parts of it; modules build upon each other in a structured way; there can be several levels of detail. An effective solution that has been proposed for this problem^2,3,4,5,6,7 is to use model refinement: gradually adding details to a model while preserving its consistency. This splits the modeling processes into two stages: build first a simplified, abstract version of the model (and verify/ensure its consistency) and then add details to it step-by-step in a way that ensures that the model remains consistent. At each refinement step, one can focus only on the new elements that are introduced and on their consistency with the previous model (Fig. 1). This approach allows to also separate the reasoning about the system under development into smaller steps. Model refinement has been introduced in biomodeling in several different frameworks, such as ODE-based modeling⁸, Boolean networks⁷, process algebras^3,9, rule-based modeling^5,10, and Petri nets^11,12,13. The key challenge in deploying this method in practice is verifying the consistency of the initial/basic model and ensuring that the model remains consistent in each step of the refinement. This is very difficult as the model size increases to hundreds/thousands of variables influencing each other through concurrent processes. A small error in writing a part of the model (say, in a variable name or in the multiplicity of a reactant or of a product) cascades into system-level inaccuracies between the basic and the refined versions of the model, whose sources are very difficult to identify. There are several other approaches to axiomatic network generation, based on graph theory, including the reaction mechanisms generator^14,15 and the automated reaction generation^16,17.

To address the problem of ensuring the model consistency throughout the construction of the model we recently proposed¹⁸ using Event-B for biomodeling. Event-B¹⁹ is a formal method for system modeling, with its original motivations rooted in the specifications of complex software systems. It is based on set theory, with refinement at its core, and with a focus on mathematical proofs that the different refinement levels of a model are consistent. Crucially, proving the consistency is done in Event-B at the level of its basic events, rather than on the system as a whole. This allows the modeler to identify easily the source of model inconsistencies in the events whose proof obligations fail to get discharged. Event-B is accompanied by the software toolset Rodin²⁰.

In this paper we extend the Event-B-based modeling technique¹⁸ to give the first scalable demonstration of Event-B-based biomodeling. We show that simple mathematical functions can be used to hide some of the details of the refinement, leading to a reduction in the number of events of the model. We show that a model of the heat shock response can be described through 17 events, instead of the 57 events in an earlier model¹⁸. More drastically, we show that a model of the $\mathsf {ErbB}$ signaling pathway consisting of 1320 reactions can be written through only 242 events. The models were implemented using Rodin²⁰ and all proof obligations related to demonstrating the model consistency were automatically discharged in Rodin. All Event-B models discussed in the paper are available at²¹.

Results

From a reaction network to an Event-B model

We model reaction networks as sets of biochemical reactions, where each reaction specifies its reactants, products, and possibly inhibitors and catalyzers. For simplicity, we consider each reversible reaction in our methodology as two irreversible reactions. With these assumptions, a reaction r can be written as a rewriting rule of the form:

$$\begin{aligned} r: \ \ \ m_1X_1 + m_2X_2 + \cdots + m_nX_n \rightarrow m'_1X_1 + m'_2X_2 + \cdots + m'_nX_n, \end{aligned}$$

(1)

where $\mathcal{S} = \{X_1,\ldots ,X_n\}$ is the set of reactants and $m_1, \ldots , m_n, m'_1, \ldots , m'_n \in \mathbb {N}$ are non-negative integers.

Reaction networks can be modeled straightforwardly in Event-B¹⁸: every reactant is modeled by a variable and every reaction is modeled by an event. Invariants ensure the consistency of the model as well as other biological properties of interest, for instance the mass conservation rule that requires that the number of certain reactants remains constant in the system. The general form of an Event-B model corresponding to reaction (1) is presented in Table 1; more reactions in the network model lead to more events in the Event-B model¹⁸. $X_1, X_2$ $, \ldots ,X_n$ are the variables of the model, whose initial values are set in the initialization event. For each reaction r of the reaction network, we specify in its guard that it must have enough of each reactant in order for the reaction to be enabled, while the action of the event specifies the changes to happen to each variable.

Table 1 The general form of an Event-B model for a reaction network.

Full size table

The data refinement of a model is about adding details to it by replacing generic variables with more specific versions of them. The refined model can include the reactions of the basic model where the refined variables replace the generic ones in all (or some) possible combinations. New reactions between the refined variables can also be added. Obviously, this easily leads to a combinatorial explosion in the size of the refined model. To see this, consider a binary binding reaction $A+B\rightarrow C$. When variable A is refined into two specific versions of it $\{A_0,A_1\}$, B into $\{B_0,B_1,B_2\}$, this leads to a refinement of C into $\{C_{i,j}\mid i=0,1, j=0,1,2\}$. The reaction is replaced by six refined versions of it: $A_i+B_j\rightarrow C_{i,j}$, for all $0\le i\le 1$, $0\le j\le 2$.

The case of a homogeneous binary reaction (also called a dimerization reaction) is similar: reaction $2A\rightarrow C$ and a refinement of A into $\{A_0,A_1\}$ leads to a refinement of C into $\{C_0,C_1,C_2\}$ and three refined versions of the reaction: $A_i+A_j\rightarrow C_{i+j}$, with $i,j\in \{0,1\}$.

Size-preserving reaction network refinement in Event-B

We propose in this paper a new approach to reaction network refinement in Event-B that keeps the number of variables and events unchanged. To refine a variable X into l versions of it $\{X_1,X_2,\ldots ,X_l\}$ we replace the integer variable X with a function rX defined on a domain set partitioned into l singleton sets $\{X_1\}$, $\{X_2\}$, $\ldots $, $\{X_l\}$ and taking integer values. Distinguishing between the l versions of variable X can be done by checking the argument of function rX for the partition it belongs to. All of these are easily implementable in Event-B. The complexity of the refinement is transferred in this way to using functions defined on partitioned sets, rather than using integer variables, and in specifying more complex event guards to check the argument of the function. Crucially, this makes writing large refined models possible and it leaves checking the model consistency feasible, at least on the $\mathsf {ErbB}$ model we tested on, with more than 1000 reactions in its refined version. We discuss this technique in concrete details in the case of the binary reactions: binding $A+B\rightarrow C$ and dimerization $2A\rightarrow C$.

Consider the binding reaction $A + B \rightarrow C$. Assume that the variable A is refined into two versions, $A_0$ and $A_1$ and the variable C is refined into $C_0,C_1$. The binding reaction is then refined into two versions of it $A_i+B\rightarrow C_i$, with $i=0,1$. Rather than writing two events in the refined Event-B model, one for each of them, we can write a single encompassing event in the following way.

We replace variables A and C with the functions $rA:\{\{A_0\}, \{A_1\}\}\rightarrow \mathbb {N}$ and $rC:\{\{C_0\}, \{C_1\}\}\rightarrow \mathbb {N}$, resp., where $A_0,A_1,C_0,C_1$ are new constants. The refinement of the binding reaction can be described through a single event, shown in Table 2. The key observation is that distinguishing between the two refinements of the reaction is done in a uniform way through the guards of the event. The main guard in this case is @grd3, which models that the specific form $A_0$ of variable A corresponds to the specific form $C_0$ of variable C and that $A_1$ corresponds to $C_1$. These two cases are covered in @grd3 through parameters a and c, a technique that allows a uniform specification of the two refined versions of the reaction through @act1 and @act3. The gluing invariants specifying the conditions for the consistency of this refinement are $A = rA(A_0) + rA(A_1)$ and $C = rC(C_0) + rC(C_1)$.

Table 2 Size-preserving refinement of a binding reaction in Event-B.

Full size table

A dimerization reaction $2A \rightarrow C$ can be refined in the same way.

Assume that the variable A is refined into two versions $A_0$ and $A_1$ and the variable C is refined into three versions $C_0$, $C_1$, and $C_2$. The dimerization reaction is then refined into three versions of it $A_i+A_j\rightarrow C_{i+j}$, with $0\le i\le j\le 1$. All three of them can be described through a single event, shown in Table 3.

Table 3 Size-preserving refinement of a dimerization reaction in Event-B.

Full size table

The event is written similarly as for the binding reaction: variable A is replaced with a function $rA:\{\{A_0\}, \{A_1\}\}\rightarrow \mathbb {N}$ and variable C is replaced by function $rC:\{\{C_0\}, \{C_1\}, \{C_2\}\}\rightarrow \mathbb {N}$. The consistency of the refinement is specified through the gluing invariants shown in Table 3.

A compact Event-B model for the heat shock response

The heat shock response is a well conserved cellular mechanism to react to sudden increases in temperature, in an effort to protect the cell against damage to cellular structures and essential functions²². It also has a key role in oncogenesis and cell death^23,24. The reaction network model²⁵ describing the heat shock response is shown in Table 4 and it consists of 10 variables and 17 irreversible reactions. It was described through an Event-B model¹⁸ with each (uni-directional) reaction leading to its own event in the model. We are interested in a refinement of this model where the heat shock factor ($\mathsf {hsf}$) variable is replaced with two variants $\mathsf {\mathsf {rhsf}}^{(0)}$ and $\mathsf {\mathsf {rhsf}}^{(1)}$, corresponding to the phosphorylation status of site S230 of the heat shock factor. Consequently, this leads us to also introduce simultaneously the refinement of all the other forms of $\mathsf {hsf}$:

$$\begin{aligned}&\mathsf {hsp:hsf}\rightarrow \{\mathsf {\mathsf {\mathsf {hsp}:\mathsf {rhsf}}}^{(0)}, \mathsf {\mathsf {\mathsf {hsp}:\mathsf {rhsf}}}^{(1)}\},\\&\mathsf {hsf}_{2}\rightarrow \{\mathsf {\mathsf {rhsf}}_{2}^{(0)}, \mathsf {\mathsf {rhsf}}_{2}^{(1)}, \mathsf {\mathsf {rhsf}}_{2}^{(2)}\},\\&\mathsf {hsf}_{3}\rightarrow \{\mathsf {\mathsf {rhsf}}_{3}^{(0)}, \mathsf {\mathsf {rhsf}}_{3}^{(1)}, \mathsf {\mathsf {rhsf}}_{3}^{(2)}, \mathsf {\mathsf {rhsf}}_{3}^{(3)}\},\\&\mathsf {hsf}_{3}:hse\rightarrow \{\mathsf {\mathsf {rhsf}}_{3}^{(0)}:\mathsf {hse}, \mathsf {\mathsf {rhsf}}_{3}^{(1)}:\mathsf {hse}, \mathsf {\mathsf {rhsf}}_{3}^{(2)}:\mathsf {hse}, \mathsf {\mathsf {rhsf}}_{3}^{(3)}:\mathsf {hse}\}. \end{aligned}$$

This refinement leads to 22 variables and 57 reactions in the reaction network model. We avoid this increase in the model size using our approach: the refined model can be written in Event-B through 10 variables and 17 events, the same as in the basic model. All proof obligations were discharged automatically in Rodin (some of them required a change of the default theorem prover).

Table 4 The molecular model for the eukaryotic heat shock response proposed in ²⁵.

Full size table

The comparison between the size of the basic and the refined variants of the $\mathsf {HSR}$ model is in Table 5. The full model can be downloaded at²¹.

Table 5 The heat shock response and the $\mathsf {ErbB}$ models statistics.

Full size table

A compact Event-B model for the ErbB signaling pathway

The $\mathsf {ErbB}$ signaling pathway is a very well studied evolutionary pathway, essential in the growth and expansion of organs and of the central nervous system^26,27. Its main role is to induce, through the cellular membrane, a signal instigating the cell’s growth and differentiation. This pathway is often overly active in various types of cancer and has been used for a long time as a therapeutic target²⁸.

The reaction network model^29,30,31 of the $\mathsf {ErbB}$ signaling pathway consists of a basic model consisting of 242 reactions, extended then with the phosphorylation details of the epidermal growth factor receptor ($\mathsf {EGFR}$) and the epidermal growth factor ($\mathsf {EGF}$) to 1320 reactions. The basic model was described in Event-B³² and we extend it here to the full model. The epidermal growth factor receptor is refined into the four receptor members of the $\mathsf {ErbB}$ family and the epidermal growth factor is refined into two variants:

$$\begin{aligned}&\mathsf {EGFR}\rightarrow \{\mathsf {ErbB1}, \mathsf {ErbB2}, \mathsf {ErbB3}, \mathsf {ErbB4}\}; \\&\mathsf {EGF}\rightarrow \{\mathsf {EGF}, \mathsf {HRG}\}. \end{aligned}$$

This leads to a refinement of $\mathsf {EGF-EGFR}$ into the following eight variants:

$$\begin{aligned}&\mathsf {EGF-EGFR}\rightarrow \{\mathsf {EGF-ErbB1}, \mathsf {EGF-ErbB2}, \mathsf {EGF-ErbB3}, \mathsf {EGF-ErbB4},\\&\mathsf {HRG-ErbB1}, \mathsf {HRG-ErbB2},\mathsf {HRG-ErbB3}, \mathsf {HRG-ErbB4}\}. \end{aligned}$$

A similar refinement is also applied for their phosphorylated versions and for their homodimers.

They were described in our Event-B model through 242 events, as many as the basic model. All proof obligations were discharged automatically by Rodin (some of them required a change of the default theorem prover), with the comparison between the basic and the refined variants shown in Table 5. The full model can be downloaded at²¹.

Discussion

Modeling complex biological processes is a challenging systems engineering task. A solution to addressing its complexity is to use refinement and start modeling from a conceptually simpler (more abstract) model that is consistent. We can then gradually add more details in a correctness-by-construction approach, so that the most detailed (concrete) model is mathematically proved to be consistent. In this approach, the challenge is to prove both the consistency of the basic model and that of the refinement process. The Event-B-based approach we propose in this paper addresses both of these challenges. Through the Rodin toolset, the models get a much needed automatic computer-aided verification of their consistency. The method we proposed, to implement the data refinement through the use of integer-valued functions, allowed us to preserve the size of the models in terms of the number of variables and events, with the increase in complexity reflected in the more complex guards, in the higher number of invariants, and in the switch from integer variables to integer-valued functions. This led to more proof obligations. Nevertheless, in the proof-of-concept examples we modeled, all the proof obligations were automatically discharged in Rodin, albeit some of them needed a switch from the default theorem prover to another one in the Rodin installation. Even in this case, no input from the user was needed in proving the model consistency.

Additional refinements and replacing refinement relations with more detailed ones can be done in the same way. They only involve replacing some integer variables with integer-valued functions as we discussed in this paper, or replacing the domain set of an integer-valued function with a larger domain set (reflecting the more detailed refinement). This means that the number of variables and events will remain the same (the number of proof obligations will increase). The refinement of the $\mathsf {ErbB}$ reaction network, going from 242 reactions to 1320 reactions, left unchanged the number of variables and events, but increased the number of proof obligations by about 50%. The number of variables and events of the model will certainly increase if other operations than refinement are used in the model construction, such as model composition or model union.

Specifying the refinement relation in a consistent manner is essential for our approach. This means that refining a variable A standing for a molecular species should lead to the corresponding refinements of all complexes that it is part of: complexes A : B, dimers $A_2$, trimers $A_3$, etc. Deducing the full consistent refinement relation based on specifying the refinement of a single variable can be done based on⁶ by specifying the relationship between the compound variables and the basic variables.

Scalable modeling remains a challenging problem. The approach we introduced here is a partial solution to the problem in the case of modeling based on data refinement. Event-B has most characteristics required for a mathematical modeling language to facilitate model reuse in systems biology¹: it is modular, human-readable, it is hybrid (it allows for continuous and for discrete mathematics semantics, aspects beyond the focus of this paper), open, and declarative (but not graphical). Significant is also the bridge between the reaction network modeling and Event-B-based modeling, which brings to biomodeling a new framework for computer-aided verification of model consistency. This is relevant to a wide variety of models, beyond the two proof-of-concept examples in this paper.

Methods and models

Event-B

Event-B³³ is a state-based formal method, building on the earlier formalisms the B-method³⁴ and the action systems³⁵. The system state in Event-B is described by the values of variables and the state changes are modeled using events. The variable types and the model properties that must hold during system execution are defined as invariants. The initial system state is described with a specific event named Initialisation. An event can contain parameters, a guard and an action. The guard is a logical predicate on the variables and parameters, describing the conditions under which the action is enabled to take place. The action describes the updates it yields to the variables when it is executed. If two or more events are enabled at the same time, then one is non-deterministically chosen and executed. If two events do not update each other’s variables, then they can be executed in any order and we can consider that they are executed in parallel. The variables and events in an Event-B model are contained in a machine, also referred to as the “dynamic part” of the model. An Event-B machine can see one or more contexts, also known as the “static part” of the model. A context contains definitions of constants, sets, as well as axioms about them. A general structure of an Event-B model, made out of machine M and context C is presented in Table 6.

Table 6 General structure of machine M and context C in Event-B.

Full size table

A key concept in formal modeling with Event-B is that of refinement¹⁹, which allows the modeler to start from a simple model of the system and then gradually introduce more details, in the form of new events, variables or context data. Event-B has two types of refinements: superposition refinement and data refinement. Superposition refinement^3,4 is the term used when new variables and events are added to the existing model. The validity of model is preserved by making sure that newly added variables and events do not contradict the previous events in any of the preceding models. In data refinement², some variables in the abstract machine are replaced by other variables in the refined machine; in this case, a gluing invariant is added in the refined machine to define the relation between the abstract variables and the newly introduced, concrete ones. Refinement in Event-B has been used to model numerous protocols and systems, including smart cash card systems³⁶, vehicle platoons³⁷, topology discovery in graphs³⁸, self-recovery in sensor-actor networks³⁹, spacecraft systems⁴⁰, coordination in peer-to-peer networks⁴¹, smart grid recoverability⁴², proactive routing in wireless networks⁴³, reaction networks¹⁸, etc.

Event-B benefits from the tool support of the Eclipse-based Rodin platform²⁰. Rodin allows to edit the model, to prove properties of the model, to animate the model and even allows model checking. Proving in Event-B employs several proof engines to automatically prove the different properties of the model. Rodin automatically generates all proof obligations in the form of sequents; these need to be discharged in order for the different properties (e.g., invariance, termination, or refinement) to hold. The internal provers attempt to discharge these proof obligations automatically. The remaining ones can be tackled using the interactive prover, with input from the modeler, for instance by adding extra assumptions or choosing different proof strategies. The model properties are proved at the level of the events, rather than on the systems level. Having a property that cannot be proved for some event can point out to errors in that event. The modeler then has a chance to edit the model to address the issue. Such interleaving between modeling and proving is an important aspect of working with Event-B and the Rodin platform and is quite similar to the compilation of programs²⁰.

The heat shock response

The heat-shock response is a central, well-conserved, cellular-level regulatory mechanism^44,45. Proteins are folded in three-dimensional shapes and the fold determines whether it can achieve its functionality (e.g., bind to a certain site on a DNA molecule or on another protein). Protein folding is a dynamical process, continuously influenced by many factors, such as chemical modifications of the amino-acids forming the protein (e.g., phosphorylation, acetylation, sumoylation) and properties of the environment (e.g., temperature, radiation, heavy metals). Misfolded proteins quickly form large protein bundles that are detrimental to the normal physiology of a cell and eventually lead to cell death. The heat shock response is one of the stress response mechanisms of a cell, aiming to limit the accumulation of misfolded proteins and assisting misfolded proteins to regain their natural fold. The heat shock response synthesizes a group of proteins called heat shock proteins ($\mathsf {hsp}$s) that act as molecular chaperones for the misfolded proteins and support their recovery from stress. This is achieved either by repairing the damaged proteins or by degrading them, thus restoring protein homeostasis and promoting cell survival. Without such a mechanism, misfolded proteins will form plaque, which is the hallmark of many neurological diseases.

The basic model we discuss for the eukaryotic heat shock response²⁵ is summarized in Table 4. When the temperature increases, proteins $\mathsf {prot}$ begin to misfold, namely transform into $\mathsf {mfp}$ [reaction (10)]. The heat shock proteins have a high affinity to bind to the misfolded proteins, acting as chaperones and forming $\mathsf {hsp:mfp}$ complexes [reaction (11)]. Then, the complex $\mathsf {hsp:mfp}$ can transform back into the original protein $\mathsf {prot}$, freeing the heat shock factor protein $\mathsf {hsp}$ too [reaction (12)]. The $\mathsf {hsp}$ is synthesized as follows. The heat shock factor ($\mathsf {hsf}$) binds in a trimmer form to the $\mathsf {hsp}$’s gene promoter (called the heat shock element $\mathsf {hse}$) [reactions (1)–(3) in Table 4]. The formed $\mathsf {hsf}_{3}:hse$ then produces the $\mathsf {hsp}$ proteins [reaction (4)]. These tend to combine with $\mathsf {hsf}$ and stay in inactive state as $\mathsf {hsp:hsf}$ complexes [right arrow in reaction (5), as well as reactions (6)–(8)]. Once the temperature increases and more $\mathsf {hsp}$ are becoming chaperons for $\mathsf {mfp}$, less are available for forming $\mathsf {hsp:hsf}$ complexes and the balance changes: the left arrow in the reaction (5) is activated. Finally, $\mathsf {hsp}$s can also degrade [reaction (9)].

The phosphorylation level of the heat shock factors has a key contribution to its activation and consequently to the effectiveness of the heat shock response⁴⁶. One site in particular, S230, becomes phosphorylated only during heat shock response and drives its activity⁴⁷. Following the model of⁴⁸, we introduce two variants of $\mathsf {hsf}$: one where S230 is unphosphorylated and the other where S230 is phosphorylated. This cascades into several variants of all variables that $\mathsf {hsf}$ is a part of: $\mathsf {hsf}_{2}$ (phosphorylation level 0, 1, or 2), $\mathsf {hsf}_{3}$ and $\mathsf {hsf}_{3}:hse$ (phosphorylation level 0, 1, 2, or 3), and $\mathsf {hsp:hsf}$ (phosphorylation level 0 or 1). These phosphorylation-based variants replace the generic variables in all reactions, in all possible combinations, leading to an increase in the size of the model⁴⁸.

The ErbB signaling pathway

We discuss briefly here the key functionality of the $\mathsf {ErbB}$ signaling pathway^29,30,31 using a highly simplified presentation. The epidermal growth factors ($\mathsf {EGF}$) are a family of proteins that signal to cells to grow and differentiate. They do that by binding to ligand proteins embedded in the cellular membrane—the epidermal growth factor receptors ($\mathsf {EGFR}$). Once bound, the complex dimerizes and then gets phosphorylated. This then activates other ($\mathsf {MAPK}$ and $\mathsf {ERK}$) signaling pathways. All of these activations are done step by step through a cascade of reactions, whose effect is the activation of some proteins, that then participate in other reactions activating other proteins, etc.

The model of the $\mathsf {ErbB}$ signaling pathway²⁹ that we follow in this paper is a revised version of the two earlier models^30,31. It is first presented on a more generic level, along the lines briefly described above. This initial model consists however of 242 irreversible reactions. The full model is then introduced essentially by differentiating between the four members of the $\mathsf {EGFR}$ family ($\mathsf {ErbB1}$ (also known as $\mathsf {EGFR}$), $\mathsf {ErbB2}$, $\mathsf {ErbB3}$, $\mathsf {ErbB4}$) and the two members of the $\mathsf {EGF}$ family ($\mathsf {EGF}$ and $\mathsf {HRG}$). Adding these details leads to many more variables in the model. For example, the complex $\mathsf {EGF}:\mathsf {EGFR}$ is replaced by 8 different variants of it. The full reaction network model²⁹ has 1320 reactions.

Data availability

All models are available at²¹.

References

Schölzel, C., Blesius, V., Ernst, G. & Dominik, A. Characteristics of mathematical modeling languages that facilitate model reuse in systems biology: A software engineering perspective. NPJ Syst. Biol. Appl. 7, 27. https://doi.org/10.1038/s41540-021-00182-w (2021).
Article PubMed PubMed Central Google Scholar
Hoare, C., Jifeng, H. & Sanders, J. Prespecification in data refinement. Inf. Process. Lett. 25, 71–76. https://doi.org/10.1016/0020-0190(87)90224-9 (1987).
Article MathSciNet MATH Google Scholar
Back, R.-J. & Sere, K. Superposition refinement of reactive systems. Formal Asp. Comput. 8, 324–346 (1996).
Article Google Scholar
Katz, S. A superimposition control construct for distributed systems. ACM Trans. Program. Lang. Syst. 15, 337–356. https://doi.org/10.1145/169701.169682 (1993).
Article Google Scholar
Murphy, E., Danos, V., Féret, J., Krivine, J. & Harmer, R. Rule-Based Modeling and Model Refinement, chap. 4, 83–114 (John Wiley & Sons, Ltd, 2010). https://onlinelibrary.wiley.com/doi/pdf/10.1002/9780470556757.ch4.
Gratie, C. & Petre, I. Complete characterization for the fit-preserving data refinement of mass-action reaction networks. Theor. Comput. Sci. 641, 11–24. https://doi.org/10.1016/j.tcs.2016.03.027 (2016).
Article MathSciNet MATH Google Scholar
Paulevé, L., Kolčák, J., Chatain, T. & Haar, S. Reconciling qualitative, abstract, and scalable modeling of biological networks. Nat. Commun. 11, 4256. https://doi.org/10.1038/s41467-020-18112-5 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Iancu, B., Czeizler, E., Czeizler, E. & Petre, I. Quantitative refinement of reaction models. Int. J. Unconv. Comput. 8, 529–550 (2012).
MATH Google Scholar
Aceto, L. & Hennessy, M. Towards action-refinement in process algebras. In $[$1989$]$Proceedings. Fourth Annual Symposium on Logic in Computer Science, 138–145. https://doi.org/10.1109/LICS.1989.39168 (1989).
Danos, V., Feret, J., Fontana, W., Harmer, R. & Krivine, J. Rule-based modelling and model perturbation. In Transactions on Computational Systems Biology XI, 116–137 (Springer, 2009).
Suzuki, I. & Murata, T. A method for stepwise refinement and abstraction of petri nets. J. Comput. Syst. Sci. 27, 51–76. https://doi.org/10.1016/0022-0000(83)90029-6 (1983).
Article MathSciNet MATH Google Scholar
Padberg, J. & Urbášek, M. Rule-based refinement of petri nets: A survey. In Ehrig, H., Reisig, W., Rozenberg, G. & Weber, H. (eds.) Petri Net Technology for Communication-Based Systems: Advances in Petri Nets, 161–196. https://doi.org/10.1007/978-3-540-40022-6_9 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2003).
Gratie, D.-E. & Petre, I. Hiding the combinatorial state space explosion of biomodels through colored petri nets. Ann. Univ. Buchar. LXI, 23–41 (2014).
MathSciNet Google Scholar
Van Geem, K. M. et al. Automatic reaction network generation using RMG for steam cracking of n-hexane. AIChE J. 52, 718–730. https://doi.org/10.1002/aic.10655 (2006).
Article CAS Google Scholar
Liu, M. et al. Reaction mechanism generator v3.0: Advances in automatic mechanism generation. J. Chem. Inf. Model. 61, 2686–2696. https://doi.org/10.1021/acs.jcim.0c01480 (2021).
Article CAS PubMed Google Scholar
Orlova, Y., Kryven, I. & Iedema, P. D. Automated reaction generation for polymer networks. Comput. Chem. Eng. 112, 37–47. https://doi.org/10.1016/j.compchemeng.2018.01.022 (2018).
Article CAS Google Scholar
Orlova, Y., Gambardella, A. A., Harmon, R. E., Kryven, I. & Iedema, P. D. Finite representation of reaction kinetics in unbounded biopolymer structures. Chem. Eng. J. 405, 126485. https://doi.org/10.1016/j.cej.2020.126485 (2021).
Article CAS Google Scholar
Sanwal, U., Petre, L. & Petre, I. Stepwise construction of a metabolic network in Event-B. Comput. Biol. Med. 91, 1–12 (2017).
Article Google Scholar
Abrial, J.-R. Modeling in Event-B: System and Software Engineering 1st edn. (Cambridge University Press, New York, 2010).
Book Google Scholar
Abrial, J.-R. et al. Rodin: An open toolset for modelling and reasoning in Event-B. STTT 12, 447–466 (2010).
Article Google Scholar
Sanwal, U., Hoang, T. S., Petre, L. & Petre, I. Event-B models for the heat shock response and the ErbB signaling pathway. https://combio.org/wp-content/uploads/2021/08/Event-B_Model_Erbb_HSR.zip (2021).
Richter, K., Haslbeck, M. & Buchner, J. The heat shock response: Life on the verge of death. Mol. Cell 40, 253–266. https://doi.org/10.1016/j.molcel.2010.10.006 (2010).
Article CAS PubMed Google Scholar
Jolly, C. & Morimoto, R. I. Role of the heat shock response and molecular chaperones in oncogenesis and cell death. JNCI J. Natl. Cancer Inst. 92, 1564–1572. https://doi.org/10.1093/jnci/92.19.1564 (2000).
Article CAS PubMed Google Scholar
Gomez-Pastor, R., Burchfiel, E. T. & Thiele, D. J. Regulation of heat shock transcription factors and their roles in physiology and disease. Nat. Rev. Mol. Cell Biol. 19, 4–19. https://doi.org/10.1038/nrm.2017.73 (2018).
Article CAS PubMed Google Scholar
Petre, I. et al. A simple mass-action model for the eukaryotic heat shock response and its mathematical validation. Nat. Comput. 10, 595–612 (2011).
Article MathSciNet Google Scholar
Citri, A. & Yarden, Y. EGF-ERBB signalling: Towards the systems level. Nat. Rev. Mol. Cell Biol. 7, 505–516. https://doi.org/10.1038/nrm1962 (2006).
Article CAS PubMed Google Scholar
Hynes, N. E. & MacDonald, G. ERBB receptors and signaling pathways in cancer. Curr. Opin. Cell Biol. 21, 177–184. https://doi.org/10.1016/j.ceb.2008.12.010 (2009).
Article CAS PubMed Google Scholar
Santarpia, L., Lippman, S. M. & El-Naggar, A. K. Targeting the MAPK-RAS-RAF signaling pathway in cancer therapy. Expert Opin. Ther. Targets 16, 103–119. https://doi.org/10.1517/14728222.2011.645805 (2012).
Article CAS PubMed PubMed Central Google Scholar
Hornberg, J. J. et al. Control of MAPK signalling: From complexity to what really matters. Oncogene 24, 5533–5542 (2005).
Article CAS Google Scholar
Kholodenko, B. N., Demin, O. V., Moehren, G. & Hoek, J. B. Quantification of short term signaling by the epidermal growth factor receptor. J. Biol. Chem. 274, 30169–30181 (1999).
Article CAS Google Scholar
Schoeberl, B., Eichler-Jonsson, C., Gilles, E. D. & Müller, G. Computational modeling of the dynamics of the map kinase cascade activated by surface and internalized EGF receptors. Nat. Biotechnol. 20, 370–375 (2002).
Article Google Scholar
Iancu, B., Sanwal, U., Gratie, C. & Petre, I. Refinement-based modeling of the ErbB signaling pathway. Comput. Biol. Med. 106, 91–96. https://doi.org/10.1016/j.compbiomed.2019.01.016 (2019).
Article CAS PubMed Google Scholar
Abrial, J.-R. & Hallerstede, S. Refinement, decomposition, and instantiation of discrete models: Application to Event-B. Fundamenta Informaticae 77(1—-2), 1–28 (2007).
MathSciNet MATH Google Scholar
Abrial, J.-R. The B-book: Assigning Programs to Meanings (Cambridge University Press, New York, 1996).
Book Google Scholar
Back, R.-J. & Kurki-Suonio, R. Decentralization of process nets with centralized control. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, PODC ’83, 131–142, https://doi.org/10.1145/800221.806716 (ACM, New York, NY, USA, 1983).
Butler, M. & Yadav, D. An incremental development of the mondex system in Event-B. Form. Asp. Comput. 20, 61–77. https://doi.org/10.1007/s00165-007-0061-4 (2007).
Article Google Scholar
Lanoix, A. Event-B specification of a situated multi-agent system: Study of a platoon of vehicles. In Theoretical Aspects of Software Engineering, 297–304. https://doi.org/10.1109/TASE.2008.39 (IEEE Computer Society, Los Alamitos, CA, USA, 2008).
Hoang, T. S., Kuruma, H., Basin, D. & Abrial, J.-R. Developing topology discovery in Event-B. In Leuschel, M. & Wehrheim, H. (eds.) Integrated Formal Methods, vol. LNCS 5423, 1–19. https://doi.org/10.1007/978-3-642-00255-7_1 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2009).
Kamali, M., Laibinis, L., Petre, L. & Sere, K. Self-recovering sensor-actor networks. In Proceedings Ninth International Workshop on the Foundations of Coordination Languages and Software Architectures, FOCLASA 2010, Paris, France, 4th September 2010., 47–61. https://doi.org/10.4204/EPTCS.30.4 (2010).
Salehi Fathabadi, A., Rezazadeh, A. & Butler, M. Applying atomicity and model decomposition to a space craft system in Event-B. In Bobaru, M., Havelund, K., Holzmann, G. J. & Joshi, R. (eds.) NASA Formal Methods, 328–342. https://doi.org/10.1007/978-3-642-20398-5_24 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2011).
Petre, L., Sandvik, P. & Sere, K. Node coordination in peer-to-peer networks. In Coordination Models and Languages - 14th International Conference, COORDINATION 2012, Stockholm, Sweden, June 14-15, 2012. Proceedings, 196–211. https://doi.org/10.1007/978-3-642-30829-1_14 (2012).
Horsmanheimo, S. et al. On proving recoverability of smart electrical grids. In NASA Formal Methods - 6th International Symposium, NFM 2014, Houston, TX, USA, April 29 - May 1, 2014. Proceedings, 77–91. https://doi.org/10.1007/978-3-319-06200-6_6 (2014).
Kamali, M., Höfner, P., Kamali, M. & Petre, L. Formal analysis of proactive, distributed routing. In Software Engineering and Formal Methods - 13th International Conference, SEFM 2015, York, UK, September 7-11, 2015. Proceedings, 175–189. https://doi.org/10.1007/978-3-319-22969-0_13 (2015).
Powers, M. V. & Workman, P. Inhibitors of the heat shock response: Biology and pharmacology. FEBS Letters 581, 3758 – 3769. https://doi.org/10.1016/j.febslet.2007.05.040 (2007). Cellular Stress.
Voellmy, R. Transduction of the stress signal and mechanisms of transcriptional regulation of heat shock/stress protein gene expression in higher eukaryotes. Crit. Rev. Eukaryot. Gene Expr. 4, 357–401 (1994).
CAS PubMed Google Scholar
Kline, M. P. & Morimoto, R. I. Repression of the heat shock factor 1 transcriptional activation domain is modulated by constitutive phosphorylation. Mol. Cell Biol. 17, 2107–2115. https://doi.org/10.1128/MCB.17.4.2107 (1997).
Article CAS PubMed PubMed Central Google Scholar
Holmberg, C. I. et al. Phosphorylation of serine 230 promotes inducible transcriptional activity of heat shock factor 1. EMBO J. 20, 3800–3810. https://doi.org/10.1093/emboj/20.14.3800 (2001).
Article CAS PubMed PubMed Central Google Scholar
Czeizler, E., Rogojin, V. & Petre, I. The phosphorylation of the heat shock factor as a modulator for the heat shock response. IEEE/ACM Trans. Comput. Biol. Bioinform. 9, 1326–1337. https://doi.org/10.1109/TCBB.2012.66 (2012).
Article PubMed Google Scholar

Download references

Acknowledgements

Ion Petre was partially supported by the Romanian Ministry of Education and Research, CCCDI—UEFISCDI, project number PNIII-P2-2.1-PED-2019-2391, within PNCDI III.

Author information

Authors and Affiliations

Mälardalen University, Västerås, Sweden
Usman Sanwal
School of Electronics and Computer Science, University of Southampton, Southampton, UK
Thai Son Hoang
Computer Science, Abo Akademi University, 20500, Turku, Finland
Luigia Petre
Department of Mathematics and Statistics, University of Turku, 20014, Turku, Finland
Ion Petre
National Institute for Research and Development in Biological Sciences, 060031, Bucharest, Romania
Ion Petre

Authors

Usman Sanwal
View author publications
You can also search for this author in PubMed Google Scholar
Thai Son Hoang
View author publications
You can also search for this author in PubMed Google Scholar
Luigia Petre
View author publications
You can also search for this author in PubMed Google Scholar
Ion Petre
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

U.S., L.P. and I.P designed the study. U.S., T.S.H. and L.P. wrote and analyzed the models. U.S. and I.P. collected and curated the data. All authors analyzed the results. They were all involved in drafting the article, and have reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Ion Petre.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sanwal, U., Hoang, T.S., Petre, L. et al. Scalable reaction network modeling with automatic validation of consistency in Event-B. Sci Rep 12, 1287 (2022). https://doi.org/10.1038/s41598-022-05308-6

Download citation

Received: 13 August 2021
Accepted: 08 December 2021
Published: 25 January 2022
DOI: https://doi.org/10.1038/s41598-022-05308-6

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.