Computing all persistent subspaces of a reaction-diffusion system

An algorithm is presented for computing a reaction-diffusion partial differential equation (PDE) system for all possible subspaces that can hold a persistent solution of the equation. For this, all possible sub-networks of the underlying reaction network that are distributed organizations (DOs) are identified. Recently it has been shown that a persistent subspace must be a DO. The algorithm computes the hierarchy of DOs starting from the largest by a linear programming approach using integer cuts. The underlying constraints use elementary reaction closures as minimal building blocks to guarantee local closedness and global self-maintenance, required for a DO. Additionally, the algorithm delivers for each subspace an affiliated set of organizational reactions and minimal compartmentalization that is necessary for this subspace to persist. It is proved that all sets of organizational reactions of a reaction network, as already DOs, form a lattice. This lattice contains all potentially persistent sets of reactions of all constrained solutions of reaction-diffusion PDEs. This provides a hierarchical structure of all persistent subspaces with regard to the species and also to the reactions of the reaction-diffusion PDE system. Here, the algorithm is described and the corresponding Python source code is provided. Furthermore, an analysis of its run time is performed and all models from the BioModels database as well as further examples are examined. Apart from the practical implications of the algorithm the results also give insights into the complexity of solving reaction-diffusion PDEs.


Introductory example
As an example, we consider the following reaction network describing the role of microRNAs in osteoarthritis 23 .It is a micro-RNA transcription-factor interaction model.The set of species is and the set of reactions thus n = 9 and m = 11 for this example.
Generally, the equation of reaction number j can be described by with natural numbers a ij , b ij , j = 1, . . ., m which can be zero.For the reaction r j the set of species s i with a ij > 0 is called support of r j , shortly supp(r j ) , and the set of species s i with b ij > 0 the products of r j , shortly prod(r j ) .For a subset S S of species, we say that it supports a reaction r ∈ R , if supp(r) S.
A reversible reaction is a reaction in which the conversion of reactants to products and the conversion of products to reactants occur simultaneously.In this work, reversible reactions are represented by a pair of two separate reactions, which are both active or inactive at the same time.From the set of reactions the so-called stoichiometric matrix N ∈ R n×m with its elements n ij = b ij − a ij , i = 1, . . ., n, j = 1, . . ., m is derived.TF1 : Transcription factor 1 TF2 : Transcription factor for miR synthesis miR : micro RNA miR_gene : gene of micro RNA Sink : EmptySet Signal : signal of TF1 transcription miR_gene_TF2 : mir_gene_TF2 complex miR_gene_TF1 : mir_gene_TF1 complex TF1_mRNA : TF1_mRNA complex

Distributed organizations (DOs)
The previously defined organizations were generalized towards so-called distributed organizations (DOs), which are introduced and broadly discussed in 20,21 .
(5) Nv ≥ 0, www.nature.com/scientificreports/ 2. there is a vector v ∈ R m + , v ≥ 0, such that 3. and there is a feasible flux vi ∈ R m + , vi ≥ 0, with respect to each subset S i , i = 1, . . ., k , with Collectively, we call the Eqs.( 8) and ( 9) the self-maintenance property of a DO.We say "D is distributed to the compartments S i ", "the compartments S i form a compartmentalization (or distribution) of D" or " S i is a compart- ment of D".When listing the elements of the subsets S i , i = 1, . . ., k, of species, we use a vertical notation, for example, if D is distributed to S 1 = {s 1 , s 2 } and S 2 = {s 1 , s 3 } , we write If a DO exhibits a distribution to only one subset of species, then this DO is an organization in the sense of COT.
Otherwise, we call it a "genuine DO".
Mathematically, the significance of DOs is proven by the fact that the set of persistent species of every solution of a reaction-diffusion system is always a DO 20 .
From a given reaction network the set of DOs can be computed without the need for any knowledge about the kinetics (reaction constants, kinetic laws applied, etc.).The set of DOs is always a lattice 20 .The lattice of DOs of the micro-RNA transcription-factor interaction model 23 is shown in Fig. 1.
The left-hand side of Fig. 2 shows the main definitions of this subsection together with their relations to one another.
Note, that these definitions are based upon subsets of species.In the next section, these definitions are complemented by the definitions and theorems (displayed on the right-hand side of Fig. 2) necessary for the algorithm to compute DOs.The ladder are based upon subsets of reactions.

Novel theoretical concepts
In this section, the new definitions which are necessary to formulate the algorithm are stated.

Set of organizational reactions (SORs)
In principle, this is a transfer of the species-based definitions from above to a reaction-based approach.The first idea is to transfer Definition 1 of a closure of a subset of species to reactions.Definition 5 (Elementary Reaction Closure (ERC)) Given a reaction network (S , R ) and a reaction r ∈ R , we call the set  23 .The 17 vertices of the lattice represent the DOs of the model.Each vertex displays the species of the respective DO.The boxes mark DOs that are organizations whereas the 6 ellipses mark genuine DOs.Species that do not appear in any DO that is a subset of that DO are marked green.The smallest DO of the lattice (which always has to be an organization) is at the bottom of the lattice and is empty since there is no inflow reaction in this example.Note, that in this figure there is no information contained about which reactions are active in the DOs.Note also, that each of the species TF2, miR_gene and Sink alone does not trigger any reaction.Therefore, these species create multiple DOs that are non-reactive (like the empty set).At the top of the lattice is the biggest DO.For this example, it contains all species of the model.www.nature.com/scientificreports/ of reactions the elementary reaction closure (ERC) of r , that is, the set of reactions that are activated as soon as r is active.
By design, the ERC of a reaction is unique.In the implementations of the algorithm, the computation of the ERC of a reaction is realized by the function create_ERCs().For the micro-RNA transcription-factor interaction model, Table 1 shows the ERC of each reaction.
In the following, the recursive construction of the ERC of r 3 is described as an example.The species miR_gene and TF2 are required to run reaction r 3 , which produces species miR_gene_TF2 .This in turn triggers the reac- tions r 4 and r 5 .This results in the addition of the species miR_gene and miR.miR triggers the reaction r 6 , thus the reaction r 6 is added at last.Since no further reactions are supported by the listed species set {miR_gene, TF2, } the ERC of r 3 is r 3 , r 4 , r 5 , r 6 .Definition 6 transfers DOs, which were defined for species sets, to sets of active reactions.Table 1.ERC of each reaction of the micro-RNA transcription-factor interaction model 23 .Here the elements of each ERC are listed in the order of their computation even though mathematically an ERC is a set.

Lemma 1 (Unique Set of Overproduction)
To each SOR(v) belongs a unique biggest set of species that can be overproduced by the set of reactions contained in SOR(v).
Proof There can be a number of flux vectors v1 , . . ., vk ∈ R m , each tracing to SOR(v) = {r j ∈ R : vj > 0} , but with different sets of overproduced species.Unifying all these sets of overproduced species results in a unique biggest set of overproduced species.
Roughly speaking, for a given DO, there can be multiple corresponding SORs, and conversely, for a given SOR, there can be multiple corresponding DOs.More precisely: 1. Given a reaction network (S , R ) and a DO D S , there can be several vectors v1 , . . ., vk ∈ R m + , through which a subset D is a DO.Nevertheless, the sets SOR(v 1 ), . . ., SOR(v k ) of reactions can be different from each other.These differences describe the different potential behaviors of the DO D in terms of the active reactions.An example of a DO performing several SORs is indicated by a red link between SORs in the hasse diagram.2. Given a reaction network (S , R ) , a set D S of species and a vector v ∈ R m + such that D is a DO through v .Then there can be different DOs all exhibiting the same behavior, that is, the same set SOR(v) of active reac- tions.These DOs include one, say D min , which is minimal in terms of its number of species and its number of compartments.The other DOs are unions of D min and further non-reacting species.

Lemma 2 (SORs form a lattice) The set of all SORs of a given reaction network (S , R ) forms a lattice.
Proof A lattice is a partially ordered set in which every two elements have a unique supremum (a least upper bound) and a unique infimum (a greatest lower bound).

Partial order of the set of SORs:
The subset relation for sets provides a partial order.

Unique supremum:
Thereby a compartmentalization of D is assumed, where the compartments of D 1 and D 2 are simply put next to each other disjointly without chang- ing or merging them in any way.With that, the minimality of R sup follows trivially since no new reaction is activated and thus no further species can be attained.The union is finite and it is performed disjointly like the union of R 1 and R 2 to get a unique supremum.The existence of R inf follows from the fact that there exists a unique minimal SOR R min of the reaction network.This is the set of inflow reactions together with the reactions in the ERCs of the inflow reactions.The reactions of R min are included in any SOR.Note that R min can be empty.The uniqueness of R inf can be proven easily by contradiction: If there were two different greatest infima, unifying them would give a greater one in contradiction to the assumption.

Unique infimum:
The lattice of SORs of the micro-RNA transcription-factor interaction model 23 is shown in Fig. 3.
For models of Influenza-A and SARS-CoV-2 virus infection dynamics, in 24 resp. 25it was shown how such lattices can be used to better understand model dynamics and and also to compare different models.Lemma 3 provides an equivalent definition of SORs that use ERCs to guarantee closedness and does not explicitly refer to DOs.It is applied in the implementation of the algorithm to compute SORs.Lemma 3 (Equivalent Definition of SORs) Given a reaction network (S , R ) , a subset R R of reactions is a SOR if and only if there is a vector v ∈ R m + with the following two properties: 1. Self-maintenance: N v ≥ 0 and 2. Closedness: R = ∪{ERC(r j ) : vj > 0}.
Proof Following Definition 6 it suffices to show that there as an appropriate corresponding DO D = ∪ k i=1 S i with closed subsets S 1 , . . ., S k S of species and feasible fluxes v1 , . . ., vk with respect to each of species are defined which are closed by definition.Thus, there is an equivalence between the set of species subsets S i and the set of elementary reaction closures ERC(r j ).
For each j = 1, . . ., k , the number is defined which is the number of species subsets S i supporting the reaction r j .Furthermore, for each subset S i , i = 1, . . ., k, and each reaction r j ∈ R, j = 1, . . ., m, the number.
is defined which equals one if r j is supported by S i , and which equals 0 if r j is not supported by S i .Now, for each subset S i , i = 1, . . ., k, and each reaction r j ∈ R , j = 1, . . ., m, the feasible fluxes are defined by

Maximal compartments (MCs) and minimal compartmentalizations
For real-life systems, evolutionary aspects such as efficiency are important.Efficiency can be realized by minimizing the number of compartments such systems exhibit.Definition 7 provides a description of the maximal compartments which can be realized for a given DO-SOR pair.
Definition 7 (Maximal Compartment (MC) and Minimal Compartmentalization of a DO-SOR pair) Given a reaction network (S , R ) , a DO D S and one of its associated SORs R R , then: ments is minimal, that is, there is no compartmentalization of the DO-SOR pair (D, R) with a number of compartments lower than l.
The algorithm presented in this work can compute all MCs of a DO-SOR pair.The algorithm exploits the fact, that an MC is closed and does not support a reaction outside of the SOR.From the set of MCs of a DO-SOR pair, the algorithm can also compute minimal compartmentalization by selecting a subset of the set of MCs of the DO-SOR pair.
In the micro-RNA transcription-factor interaction model 23 the application of MCs on the SOR {TF1_degradation, TF1_mRNA_degradation, TF1_transcription, TF1_translation, miR_degradation, miR_gene _TF2_binding, miR_gene_TF2_release, miR_synthesis} can be observed.as the vertex {R10, R2, R3, R4, R5, R6, R7, R8, R9} in the lattice of SORs, see Fig. 3.The algorithm produces 3 MCs: The MCs 1 and 2 are able to perform the SOR resulting in a minimal number of compartments of 2. These compartments are not unique.One can remove the species TF2 from MC2 without impacting the active reactions.The reactions supported in an MC can also be impacted by removing a species of its support, but only if the support of these reactions is also in another active compartment.
The right-hand side of Fig. 2 shows the new definitions of this subsection and relates them to those from the Preliminaries, which can be found on the left-hand side.
Persistence theorem for SORs and further implications for the dynamics Theorem 4 is a persistence theorem, which traces the persistence theorem for species (Theorem 3.25 in 20 ) and finally transfers it to reactions.
Lipschitz continuous on every bounded subset of R n + and Vol.:(0123456789) and ∂ 2 ĉ ∂x 2 are each continuous with respect to x and t, • |ĉ(x, t)| < K for all x ∈ , t ≥ 0 for a number K ∈ R , that is, ĉ is bounded, then the set of persistent species of the solution ĉ is a DO and the occurring compartmentalization is described by a flux vector v ∈ R m + which is derived by double-integration with respect to x and t as defined in Lemma 3.23 of 20 .From the definition of SORs (Definition 6) follows, that the set taken as the set of persistent reactions of the solution ĉ , is a SOR.Therefore, by definition, a reaction is persistent if and only if the set of species in its support is persistent.As shown in 21 , Theorem 4 does not only hold true for RDS but can also be transferred to the special cases of ODE and patch-like systems.Note that the concept of persistence used here is a new one that was defined in 20 .Note also that it is possible to extend Theorem 4 to other boundary conditions, for example, by adding appropriate reactions to the reaction network as it was indicated in 26 .
Theorem 4 also implies that a of reactions of the reaction network, which is not a SOR, can not persist.In other words, such a SOR must define a transient state of any solution of any RDS with that underlying reaction network.
From 20 another implication for the dynamics, not in the long term but right after leaving the initial states, can be derived.Lemma 3.21 in 20 states that for each compartment, the closure of the initially present species appears and does not vanish within a finite time.Transferred to reactions this means that right after leaving the initial state, in each compartment, the whole ERC of the initially active reactions is activated.Thus the set of reactions that persist is always a subset of the ERC of the initially active reactions.When solving for several solutions there is a huge save in complexity when using Gurobi because of the use of a solution pool.
From the examples stated in Section "Interpreting a lattice of SORs" it will be clear, that the functions all_ SORs() and all_DOs() exhibit the largest differences between the worst case runtime analyzed here and the actual runtimes.The latter is strongly impacted not only by the number of boolean variables but also by the number of constraints of the actual reaction network.

Computing all distributed organizations (DOs)
• Function: all_DOs() • Input: list of reactions • Output: list of DOs Like for SOR computation, all ERCs are computed.Then the largest DO is computed by mixed-integer linear programming (MILP).All remaining DOs are obtained successively by integer cuts.
For the MILP there are m continuous variables representing the flux vector v ∈ R m + and m discrete variables b ∈ {0, 1} m denoting whether reaction r j is active in the DO ( b j = 1 ) and n discrete variables e j denoting whether species s i is an element of the DO ( e i = 1 ).Thus, the search space is R m + × {0, 1} m × {0, 1} n .To find the largest DO the following objective function is used subject to the following constraints: Note that the variables are coupled such that if a reaction is present ( b j = 1 ) also all species involved as reactants or products must be present ( e i = 1 ).Further, note that the constraints "species closure" ensure that a reaction must be present ( b j = 1 ) if it is supported in the single species closure of a species that is present ( e i = 1).
Finally note that in the current implementation, Gurobi's solution pool is used, which generates all DO-SOR pairs.Recall that one DO can have many SORs.From these solutions, only the DOs are returned.Internally, however, the tool stores all SORs in a DO-SOR dictionary for later use.
The worst case time complexity of all_DOs() is given in since compared to all_SORs() there is a further boolean variable for each of the n species.

Computing all maximal compartments (MCs)
• Function: create_MCs() • Input: Reaction network, SOR, DO • Output: list of MCs (species sets) The function create_MCs() computes the (unique) maximal compartments of a given SOR.It works on a list of candidates for the MCs, which is initialized by the set of species of the DO and then broken down through the following four steps.First, for each compartment it is checked if it supports an inactive reaction, that is, a reaction not contained in the SOR.If this is the case, the compartment is split into k compartments, where k is the order of the reaction.Each compartment is missing exactly one species of the support for the reaction.The statements regarding the time complexity are with the condition of disjoint supports of all reactions resulting in k • m ≤ n The time complexity of this first step is given in O (n • k m+1 ).
After that, compartment candidates, which are subsets of other valid candidates, are eliminated: Next, the candidates are checked for closedness.Each reaction, which extends the present species set can not occur in this set and is therefore inactivated by updating the MCs in the same way as done for the inactive reactions, that is, by splitting it.This check of closure is repeated until none of the compartments is altered.The total time complexity of the check for closure is given in The maximal compartments (MCs) listed in MC_list are used in an ILP to find the minimal number of compartments needed for the affiliated SOR.For each MC ∈ MC_list there is a discrete variable b MC ∈ {0, 1} denoting whether the MC is part of the minimal compartmentalization ( b MC = 1 ) or not ( b MC = 0 ).The ILP, which is a set cover problem 29 , uses the objective function: subject to the following constraints: The first constraint ensures that all species are covered by the MCs and the second constraint guarantees that each reaction of the given SOR is active in at least one MC.With this, a minimal set of MCs, a minimal compartmentalization, is found such that all species in species(SOR) are covered and each reaction of the SOR can run in at least one MC.Since the problem resembles the set cover problem, which is proven NP-complete, it is possible, to solve this problem in polynomial time and the LP seems to be the most efficient way to solve this.
Solving the ILP is of exponential time complexity, that is, given in

Analysis of the models of the bio models database
This section provides analyses of the models of the BioModels Database 30 performed by using the algorithms presented in the previous chapter.At the end of this section, the lattice of SORs of an artificial reaction network, which is not contained in the BioModels database, is interpreted to exemplify which information can be drawn from it.
SORs, organizations and genuine DOs 932 models of the BioModels Database were transformed into a proper reaction network using the SBML reader of the libsbml package.These are analyzed in this section.To keep the study tractable, only the reactants and products of a reaction are considered, more precisely, the information contained in other elements of an SBMLmodel is ignored, for example, modifiers, rules, events, kinetic laws, or whether a species is flagged as constant.
The github contains the file biomodels.csvwith all extracted information of each model.In total, the algorithm computed 1'019'600 SORs for the 932 models, of which 218'870 can be represented by an organization, that is about 21.5% .All other SORs represent what are called genuine DOs, which are SORs that cannot be represented by organizations.In Fig. 5, all models are shown according to the number of their SORs and the fraction of SORs representing organizations.
Even though the majority of models do not exhibit any genuine DO, about four-fifths of all SORs are genuine DOs.More precisely, the fractions of models containing at least one genuine DO appear to increase with increasing number of SORs: 0% (for models with only one SOR), 14% (for models with two SORs), 17 , 36 , 36 , 34 , 57 , 59 , 57 , 79% (for models with 1001 to 50,000 SORs).The reason for this is that with an increasing number of SORs also the number of possibilities for creating new genuine DOs (by combining SORs separately with each other) increases.

Separability of supports and order of reaction
Now the prerequisites for genuine DOs are studied on the level of reactions.This will make clear why most of the models cannot exhibit any genuine DO.Separability of the support of a reaction is a necessary condition for genuine DOs since it is a prerequisite for disabling a reaction.In turn, separability of the support of a reaction is possible only if the order of a reaction is greater than one, that is, the number of species of its support is at least two.The higher the order of a reaction the more ways there are to construct a genuine DO by distributing species and this in turn increases the probability of the appearance of a genuine DOs.Figures 6 and 7 provide an overview of the orders of reactions of the analyzed models of the BioModels Database.
Log-scaled distribution of all reactions of the analyzed models of Only 14% of all reactions have an order bigger than 1 and thus can be deactivated by separating their supports.Most of the models do not contain reactions of higher order.Thus most of the models cannot exhibit a genuine DO.This, together with the above-mentioned observation of genuine DOs often resulting from combining others, explains why only 237 of the 932 analyzed models contain at least one SOR that is a genuine DO.In the following the structure of the computed SORs is studied, i.e., their compartmentalizations or, more precisely, their numbers of compartments.

Minimal compartmentalizations
For SORs requiring at least two compartments, that is, for genuine DOs, Fig. 8 provides an overview of the distribution of the minimum numbers of required compartments across the models (subfigure (a)) resp.the SORs (subfigure (b)).
Even though most models get along with one compartment for all their SORs there is a non-negligible number of models each containing at least one genuine DO (Fig. 8a).More precisely, SORs were found with their minimum number of required compartments occupying all numbers in the range from one to five (Fig. 8b).Here, the number of SORs decreases monotonously with an increasing minimum number of required compartments.Models that have at least one SOR that is a genuine DO (i.e., that can not be represented by an organization) are marked dark blue and the others are marked light blue.The fraction equals zero for models with only one SOR confirming the theorem stating that every model has at least one organization 20 .Figure 6.Log-scaled distribution of all reactions of the analyzed models of the BioModels Database with respect to their order.Reactions of order 1 dominate the set of models.Only 22.18% of the reactions have an order bigger than one and therefore allow for separating the species into several compartments and attaining a genuine DO.
Probably the reason for this is that the more compartments are needed, the more reactions have to be disabled simultaneously which in turn is even harder the more reactions have to be disabled.

Runtime and timeout analyses
While analysing the BioModels Database, all 3 parts of the algorithm resulted in large running times for some models respectively.The calculation of ERC only came close to the timeout cap in two exceptionally large models with more than 1500 reactions (BIOMD0000000255,BIOMD0000000595).The calculation of SORs and DOs was a problem for 2 models with more than 500 reactions and reactions of a high order (BIOMD0000000255 and BIOMD0000000496).Figure 10 provides an overview of how the runtime of SOR computation for each model relates to the number of species, the number of reactions, the number of SORs, and the overall number of constraints of the LP used to compute the SORs.The upper limit of the number of SORs is given as parameter of the LP.The default is 50.000.Heuristic algorithms using a combinatorial build up, could be used to grasp all these basic SORs.A similiar approach is used for an alternative computation of the DOs from the set of SORs.This function is available in the iterate_over_DB file.Most of the models not fully processed arose from the computation of all minimal numbers of compartments.Figure 9 reveals the distribution of the maximum number of MCs across the models and its correlation to computational timeouts.
As the maximum number of MCs increases, the number of models becomes smaller and the probability of a computation timeout increases, making the maximum number of MCs a good indicator of runtime.Heuristic approaches or greedy algorithms governing the set cover problem could be implemented in that case (Fig. 10).

Interpreting a lattice of SORs
To conclude this chapter, an artificial 3-generator model is introduced to exemplify the output of the algorithm presented in this paper and to demonstrate how to interpret it.When limited to organizations, such outputs can be organized in Hasse diagrams, which were used previously to compare and hierarchize Influenza-A and SARS-CoV-2 infection dynamics models 24,25 .By generalizing organizations to DOs and SORs in this study, it was shown that the output of the algorithm is based on lattices of DOs resp.SORs.The 3 generators, A1, A2 and A3, produce their corresponding products B1, B2 and B3 through the reactions R1, R2 and R3, respectively.The generator A2 is harmed by exposure to the products of the other generators, that is, B3 (reaction R4) and B1 (reaction R5).Thereby, also B1 is destroyed (R5).The harming of the production of B2 is counteracted by the reaction R6, which transforms B1 and B3 to two B2.Then there is an asymmetric reaction R7 erasing A1 and A2 when they meet.R7 is compensated by the last two reactions, namely R8 together with R9, which (through the production of the intermediate species D1) destroy the generators A1 and A3 and multiply B1 and thus also compensate for the destruction of B1 through R5.All nine reactions of the artificial model are listed below.
The lattice of all SORs of this artificial 3-generator model computed with compute_all_SORs() together with further information about the minimal compartmentalizations is shown in Fig. 11.
Only the largest SOR as well as the four smallest SORs of the artificial 3-generator model represent organizations whereas the intermediate SORs represent genuine DOs, which display different levels of compartmentalization of one and the same set of species.All three generators can exist independently of each other and build up the first level of reactive compartments (2nd line of the lattice when viewed from the bottom).They can be combined to build the first level of SORs that can only be obtained as genuine DOs as can be seen from the round shape of the nodes.
The outflow reaction R7 can then be activated by putting the generators A1 and A2 in one and the same compartment.Optional reactions like these can inflate the number of SORs, but here this is not the case since the support of R7 is a subset of the supports of the reactions that are required for larger SORs, which in turn keeps the lattice slim and easier to interpret.
The three SORs directly below the largest SOR require at least three compartments.This is quite a rare occurrence when compared with the results of the BioModels database (see Fig. 8a).This is because there has to be (product interaction benefiting B2) R7 : A1 + A2 −→ ∅ (generator interaction weakening A1 and A2) (production of B1, A2 compensates R5, R7) a specific pattern of reactions so that neither compartment can be merged with another one.To make these 3 possible merges impossible, a minimum number of higher-order reactions is required.Note, that the artificial 3-generator model is the default network of the algorithm if no SBML file is handed over to it.

Conclusion
In this work, an algorithm was presented for the first time that allows computing all persistent subspaces of a reaction-diffusion system (RDS) from the underlying reaction network alone, i.e., without knowledge of kinetic details.For this purpose, the theory developed in 20 has been extended from the species (or variable) level of the RDS to the reaction level by appropriate mathematical definitions.The main connections between these new definitions were presented and proved, culminating in the persistence Theorem 4 for SORs, which characterizes all persistent subspaces of an RDS at the level of reactions and allows for describing their inner structure in a mathematically precise way.
Then, the algorithm was presented which allows computing all SORs and DOs of an RDS by linear programming, as well as describing their possible internal structure in terms of compartmentalizations.An upper bound for the runtime was given for each part of the algorithm.An implementation of the algorithm in Python including documentation for the application is freely available in the Github https:// github.com/ Woitk eL/ dorga nalys is.
Finally, the algorithm was applied to all models of the BioModels Database to analyze them with respect to the occurrence of genuine DOs and their structure and to practically test the performance of the algorithm.This confirmed both the importance of DOs in general and that of genuine DOs for the analysis of RDS, as well as demonstrating the practicality of the algorithm and its implementation.
Thus, a framework is now available that allows RDS to be studied both mathematically-theoretically through a kind of generalized fixed point analysis and practically by computing all DOs or SORs to analyze, evaluate, construct, and compare RDS models from different disciplines such as chemistry, biochemistry, ecology, or sociology.
Future research tasks following this work include e.g. a further extension and generalization of the mathematical theory developed here e.g. for unconstrained systems, the application of the algorithm developed here to concrete models from different disciplines including a more detailed analysis and interpretation of the models of the BioModels Database especially with respect to the function of genuine DOs and their biological meaning, and a more detailed analysis of the runtime and complexity of the presented algorithm including possible improvements.

Figure 11.
The lattice of the 13 SORs of the artificial 3-generator model.For each SOR S, the corresponding DO D with the minimal compartmentalization is considered to provide the following information about S in its node: in the first line: reactions of S and overproduced (OP) species, in the second line: species of D and in the third line: minimal compartmentalization.A reaction resp.species of S is marked green if it is new, that is, if it does not appear in any SOR that is a subset of S. If there is only one compartment, then D is an organization and the border of the node is a rectangle (instead of an ellipse).Two SORs are linked by a line if their reactions form subsets and there is no other SOR between them.Such a line is red to indicate that the two linked SORs have the same set of species in their respective DOs.This marks different activation/deactivation patterns of reactions solely caused by different compartmentalizations of one and the same set of species.

Figure 1 .
Figure 1.Lattice of DOs of micro-RNA transcription-factor interaction model from23 .The 17 vertices of the lattice represent the DOs of the model.Each vertex displays the species of the respective DO.The boxes mark DOs that are organizations whereas the 6 ellipses mark genuine DOs.Species that do not appear in any DO that is a subset of that DO are marked green.The smallest DO of the lattice (which always has to be an organization) is at the bottom of the lattice and is empty since there is no inflow reaction in this example.Note, that in this figure there is no information contained about which reactions are active in the DOs.Note also, that each of the species TF2, miR_gene and Sink alone does not trigger any reaction.Therefore, these species create multiple DOs that are non-reactive (like the empty set).At the top of the lattice is the biggest DO.For this example, it contains all species of the model.

( 11 ) 3 Figure 2 .
Figure 2. Overview of the main definitions of this work and their mutual interrelations.The boxes highlighted in gray are linked to the respective parts of the text where they are described.On the left-hand side, the items based on species are listed as presented in the Preliminaries of this work.On the right-hand side, the new reaction-based terms are listed as introduced in the Results of this work.The term MC stands for the maximal compartment and is defined in Definition 7. https://doi.org/10.1038/s41598-023-44244-xwww.nature.com/scientificreports/

Figure 5 .
Figure 5. Log-scaled numbers of models according to their number of SORs and DOs resp.organizations.Models that have at least one SOR that is a genuine DO (i.e., that can not be represented by an organization) are marked dark blue and the others are marked light blue.The fraction equals zero for models with only one SOR confirming the theorem stating that every model has at least one organization20 .

Figure 7 .
Figure 7. Distribution of the highest order of reactions for all analyzed models of the BioModels Database.There are only a few models containing only zero-order reactions.Models of reaction order at most 1 are found to dominate.51.20% of the models do not have any reaction of orders bigger than one and therefore are not capable of exhibiting a genuine DO.

Figure 8 .
Figure 8.For all SORs requiring at least two compartments, that is genuine DOs: Distribution of the minimum number of required compartments across models (a) and across SORs (b).The 27 models causing timeout contain about half of all SORs.In fact, each model that causes a timeout contains well over 30,000 SORs.

Figure 9 .
Figure 9.Each model of the BioModels Database, that contains at least one genuine DO, belongs to a set of SORs.Each of these SORs has minimal compartmentalization, which is a minimal set cover of the SOR consisting of a selection of maximal compartments (MCs) of this SOR.This diagram plots for each model the maximum number of MCs whose size is a very likely bottleneck of the algorithm causing a computational timeout (red lines).

Figure 10 .
Figure 10.Runtime for each model depends on the following parameters: number of species, number of reactions, number of SORs and number of constraints applied in the LP to compute the SORs.Of these parameters, the number of SORs in a model appears to be the strongest indicator of runtime.
Feasible, that is, for every c ∈ R n + the vector v(c) is a feasible flux with respect to the set {s ∈ S : c s > 0} of species present in c.