Derivation of stationary distributions of biochemical reaction networks via structure transformation

Long-term behaviors of biochemical reaction networks (BRNs) are described by steady states in deterministic models and stationary distributions in stochastic models. Unlike deterministic steady states, stationary distributions capturing inherent fluctuations of reactions are extremely difficult to derive analytically due to the curse of dimensionality. Here, we develop a method to derive analytic stationary distributions from deterministic steady states by transforming BRNs to have a special dynamic property, called complex balancing. Specifically, we merge nodes and edges of BRNs to match in- and out-flows of each node. This allows us to derive the stationary distributions of a large class of BRNs, including autophosphorylation networks of EGFR, PAK1, and Aurora B kinase and a genetic toggle switch. This reveals the unique properties of their stochastic dynamics such as robustness, sensitivity, and multi-modality. Importantly, we provide a user-friendly computational package, CASTANET, that automatically derives symbolic expressions of the stationary distributions of BRNs to understand their long-term stochasticity.

where λ k andλk are the propensities of stochastic models (i.e., continuous-time Markov chains) for {S, C, R} and {S,C,R}, respectively. Let p n (t) andp n (t) be the probabilities of the continuous-time Markov chains associated with {S, C, R} and {S,C,R} being at state n, respectively. Then, p n (t) is the solution of CME given by: where r andr are the total numbers of reactions in the original and the translated reaction networks, respectively, and the third equality follows from Eq. (S1). Thus, p n (t) andp n (t) are the solution of the same CME because the CME associated with the translated network is given by dp n dt =r k =1λk (n − (ν k −νk))p n−(ν k −νk) −r k =1λk (n)p n .
Therefore, p n (t) ≡p n (t) for all n ∈ Z d ≥0 as long as they have the same initial condition.

SUPPLEMENTARY NOTE 2: PROPENSITY FACTORIZATION
As we discussed in the main text, to derive a stationary distribution, all the propensities λ k (n) should be factorized as for some constants κ k > 0 and functions θ(n) > 0 and ω(n) ≥ 0 on a set Γ = {n | n ≥ b}, where λ k (n) > 0 if and only if n ≥ ν k + b in Γ meaning that ν k + b represents the minimal copy numbers for the reaction to occur. This factorization form generalizes the previous result, Eq. (32) in [2], which requires the following factorization: Note that if ω(n) = 1 θ(n) and Γ = Z d ≥0 (i.e., b = 0), Eq. (S2) is equivalent to Eq. (S3).

Propensity factorization by solving recurrence relations
To obtain the desired factorization for given propensity functions, we need to solve recurrence relations. Here, we illustrate how to solve the following factorization equations of Fig. 1b.

Propensities following generalized mass action kinetics can be factorized without solving recurrence relations
If the propensities follow a sort of generalized stochastic mass action kinetics, then their factorization can be easily obtained. In what follows, we use the convention that −1 j=0 a j = 1 for any {a j }. Theorem 1. Let a biochemical reaction network {S, C, R} be given, and let λ k (n) be the propensity functions of the associated continuous-time Markov chain. Suppose that there exists a vector b such that for n ∈ Γ = {n | n ≥ b}, we have λ k (n) > 0 if n ≥ ν k + b and λ k (n) = 0 otherwise for each k = 1, 2, . . . , r. We further assume that there exist constants α k > 0 and functions f i : . . , r and all n ∈ Γ, where ν ki is the ith entry of the source complex vector ν k . Then λ k (n) = κ k θ(n)ω(n−ν k )1 {n≥ν k } on the Γ with Proof. For n ∈ Γ,

Remark 1.
A similar condition was introduced by Eq. (27) in the previous study [2], which is a special case of the generalized mass action kinetics with b = 0. Furthermore, the associated function f i (n i ) can be thought as the "rate of association" of the ith species as pointed out in [9]. Such rate of association of the species have been also used in order to generalize the Horn-Jackson-Feinberg theory to deterministic non-mass action system [11].
Remark 2. The example in Fig. 3g does not follow the generalized mass action kinetics because the propensity of the reaction A → A P , α 1 n A + α 2 n A n A P , depends on both n A and n A P , but the latter is not supported in the source complex. Nevertheless, it can be re-expressed as α 1 n A + α 2 n A (T 0 − n A ) using the conservation law: n A + n A P = T 0 so it now follows the generalized mass action kinetics.

SUPPLEMENTARY NOTE 3: DERIVATION OF STATIONARY DISTRIBUTIONS
If a network is weakly reversible and has zero deficiency, the deterministic mass action model with any rate constants always possesses a CBE c ∈ R d >0 [4,7,8]. Then the existence of the propensity factorization ensures that the stationary distribution of the associated continuous-time Markov chain can be derived analytically. In what follows, x y means there is at least one i such that x i < y i .
Theorem 2. For a given biochemical reaction network {S, C, R}, suppose that there exists a complex balanced equilibrium c ∈ R d >0 for the deterministic mass action model for {S, C, R} with rate constants {κ k }. Let λ k (n) be the propensity functions of the continuous-time Markov chain associated with the network {S, C, R}. Suppose that there exists a vector b such that for n ∈ Γ = {n | n ≥ b}, we have λ k (n) > 0 if n ≥ ν k + b and λ k (n) = 0 otherwise for each k = 1, 2, . . . , r. Assume further that the propensity functions λ k (n) can be factorized as where θ and ω are non-negative functions such that θ(n) > 0 if n ∈ Γ. Then the Γ is a closed subset, and the stochastic model admits a stationary measure that can be written as If π is summable over the state space, then the measure π can be normalized to be a stationary distribution.
Proof. We first show that Γ is closed, i.e., there is no transition from Γ to Γ c . Assume that there exists a reaction from n ∈ Γ to n + ν k − ν k ∈ Γ c with λ k (n) > 0. This implies that n i + ν ki − ν ki < b i for some i, and thus That is, n ν k + b, and n ≥ b since n ∈ Γ. Therefore, λ k (n) = 0 from the definition of b. This contradicts to λ k (n) > 0, so Γ is closed. We turn to showing that the given π solves the CME at steady state: After the substitution of π(n) = c n θ(n) and Eq. (S8), the Eq. (S10) We can show that the RHS and LHS are equal by using that c ∈ R d >0 is a CBE and thus for any fixed complex z ∈ C, Multiplying c −z ω(n − z)1 {n≥z} on the both sides, we have Hence, by summing these up for all z ∈ C and multiplying c n both sides, we get because every reaction has exactly one complex as its source complex and product complex. Since the right hand side of Eq. (S13) is equal to the RHS (Eq. (S12)), we need to show that the left hand side of Eq. (S13) is equal to the LHS (Eq. (S11)). If k ∈ K Γ (n) c , then n − (ν k − ν k ) ∈ Γ c meaning that n i − ν ki ≤ n i − (ν ki − ν ki ) < b i for some i and thus n − ν k ∈ Γ c . Since the network has the complex balanced equilibrium, the network is weakly reversible [7]. Thus, there exists a reactionk whose source complex νk is equal to ν k . For suchk, by the definition of b, λk(n) = κkθ(n)ω(n − νk)1 {n≥νk} = 0 because n ≥ b and n − νk ∈ Γ c (i.e., n − νk b). Since θ(n) > 0, furthermore, ω(n − νk)1 {n≥νk} = ω(n − ν k )1 {n≥ν k } = 0. Hence, the left hand side of Eq. (S13) is equal to the Eq. (S11) and therefore, Eq. (S11) and Eq. (S12) are equal.
Example 1. Here, we illustrate Theorem 2 with the translated EGFR network in Fig. 3a: The CBE c = (c A , c A P ) of the deterministic mass action model satisfies the balancing equations for the complex 0: For this example, b = (1, 0) since λ k (n) > 0 for n ≥ ν k + (1, 0), and λ k (n) = 0 for n such that n ≥ (1, 0) and n ν k + (1, 0) for all k = 1, . . . , 4. Letting f A (n A ) = n A (n A − 1) and f Ap = n A P , the condition in Theorem 1 holds. Therefore, the propensity factorization λ k (n) = κ k θ(n)ω(n − ν k )1 {n≥ν k } is given by By Theorem 2, a stationary measure π(n A , n A P ) is given by where M is a normalizing constant. π(n A , n A P ) is indeed a stationary distribution because it is summable over Γ as follows: Moreover, since Γ is closed, the states (n A , n A P ) ∈ Γ c are transient as each state in Γ c is reachable to Γ with either This implies that the stochastic process can visit the state n ∈ Γ c only finitely many times. Indeed, if (n A , n A P ) ∈ Γ c , i.e., n A = 0, then only the reactions 0 → A, A P → A and A P → 0 can occur in Γ c . Thus, the process moves along one of the stoichiometric vectors (1, 0), (1, −1), and (0, −1), then it eventually escapes from Γ c to Γ and never comes back to Γ c . Due to the transient states, Theorem 6.6, which is the most general version of the previous work [2], is not applicable to this example. Let us compute the marginal means and variances of n A and n A P . For n A P , the marginal distribution which is a Poisson distribution with rate α1 α3 , so both the mean and the variance are α1 α3 . Thus, the coefficient of variation (CV) is α3 α1 , and the Fano factor, defined as the variance divided by the mean, is always 1. For n A , the marginal distribution , and M 2 is the normalizing constant.
where I m (x) is the modified Bessel function of the first kind. Then the mean and variance are given by The CV and the Fano factor are also given by can be determined by the single parameter u0 = α 1 α 2 (1 + α 4 α 3 ). The CV attains the maximum at u0 ≈ 1.8, and the Fano factor monotonically increases. Since the Fano factor is always less than 1, this distribution is sub-Poissonian.
Remark 3. By the basic Markov properties [10], Theorem 2 not only provides a stationary measure but also characterizes the status of states. Since π(n) > 0 for each n ∈ Γ, every state in Γ is recurrent as long as the process is non-explosive (i.e., well-defined for all time t). Since Γ is a closed set, furthermore, if a state n ∈ Γ c is reachable to Γ, then n is transient. Another special case is that a state n is isolated state, i.e., λ k (n) = 0 for all k. In this case, the set of the single element {n} is an irreducible subset, and π(n) = 1 on {n} is a stationary distribution, obviously. Remark 4. This theorem does not require nor imply the irreducibility of Γ. If Γ is not irreducible then for an irreducible subset Γ g ⊂ Γ, we consider a measure π Γg = π| n∈Γg restricted on Γ g . This measure is still a stationary measure and can be a stationary distribution on Γ g as long as π Γg is summable over Γ g . For instance, in the example in Fig. 3d, Γ is not an irreducible set, but a union of irreducible subsets Since the stationary measure π obtained by Theorem 2 is summable on Γ T0 for each T 0 , the restricted stationary measure π Γ T 0 can be normalized to be a stationary distribution on the irreducible subset. In this remark, we prove the irreducibility of Fig. 3d. We say that for a Markov chain X, state y is accessible from state x and write x → a y if X can reach y starting from x with positive probability, that is, the probabilities P (X(t) = y | X(0) = x) > 0 for some t > 0. We say that states x and y communicate and write x ↔ c y if x → a y and y → a x. For the Markov chain X(t) associated with a translated network, we denote by λ C1→C2 and η the propensity of a reaction C 1 → C 2 and the stoichiometric vector of the reaction, respectively. Note that if λ C1→C2 (x) > 0 then x → a x + η because for a sufficiently small ∆t, By using this, for the translated network in Fig. 3d, we have (n A , n A P , n A P P ) ↔ c (n A − 1, n A P + 1, n A P P ) and Similarly, for the translated network in Fig. 1 As ↔ c forms an equivalent class, any two states in Z 2 ≥0 communicate and hence Z 2 ≥0 is irreducible. The irreducibilities of the other two examples in Fig.  3a, g can be shown in similar ways.
Remark 5. By introducing a new factor ω(n), Theorem 2 generalizes Theorem 6.6 in [2], which is the most general version of the previous work [2]. In particular, if the function ω(n) for the factorization is θ(n) −1 then Theorem 2 becomes equivalent to Theorem 6.6 in [2]. This generalization is demonstrated by three biologically relevant examples in Applying our theoretical framework (Fig. 1) has two practical difficulties. Translating a given network to a weakly reversible deficiency zero network (Fig. 1a) is not straightforward as prohibitively many candidates of translated networks often exist. Furthermore, unless propensity functions follow the generalized mass action kinetics, it is challenging to check whether the factorization condition holds (Fig. 1b) as it requires to solve associated recurrence relations. Thus, we have developed an open-source and publicly available MATLAB code (GitHub repository: http://github.com/Mathbiomed/CASTANET) that performs our theoretical analysis to derive stationary distributions. Specifically, the package checks two conditions: whether a given BRN can be made weakly reversible and of zero deficiency after network translation, and whether the propensities of the translated network can be factorized as in Eq. (S2). If these two conditions are satisfied, the package then calculates the analytic formula for a stationary distribution. One can easily run our code by simply entering the source complexes, product complexes, and propensity functions of reactions. 3 Step 3) Search all weakly reversible and deficiency zero translated networks. ...
Step 5) Compute CBE and derive an analytic formula for the stationary distribution.
Step 4) For each translated network, check the propensity factorization Eq. (1) with .

Manual for the code
We explain how to enter the input and run the code CRN_main.m with our example in Fig. 1a as follows.
Step 3) Run the section 'Performing Network translation' to obtain translated networks.
[Solution,Index] = CRN_translation(sources, products, 2); The third input argument 2 means that the function searches all translated networks whose reaction order is at most two (i.e., bimolecular). It can be changed to any other positive integer. The ith row of Solution contains the source and product complexes of the ith translated network. Each element in the ith row of Index is a set of indices indicating which reactions in the original network are merged to form a reaction in the ith translated BRN. For instance, if the ij entry of Index is {1, 3}, then this indicates that the first and third reactions in the original BRN were merged to form the jth reaction in the ith translated network. As the outcome, the number of identified weakly reversible deficiency zero translated networks is reported: The number of weakly reversible deficiency zero translated networks is 2.
Step 4) Run the section 'Performing Propensity factorization' to identify a translated network whose propensity functions satisfy the factorization condition (Eq. (S2)) among all the translated networks obtained in Step 3. Specifically, for each translated network, the code constructs a candidate (θ c (n)) for the function θ(n) and checks the factorization conditions with the candidate θ c (n). This candidate is necessarily the desired function θ(n) if there exists a factorization (see the next subsection for details). The key function CRN_theta_construction() provides the candidate, and CRN_check_factorization_condition() examines whether the factorization condition (Eq. (S2)) holds. Since the example in Fig. 1 has a translated network satisfying the factorization condition, the code successfully finds the factorization and displays the following line: The factorization condition holds for the translated network 1! If none of the translated networks have the desired propensity factorization, the code displays the following line: No translated network satisfying the factorization condition is identified.
Step 5) If a translated network having the propensity factorization is found in Step 4, run the section 'Compute CBE and derive a stationary distribution pi(n)' to compute a complex balanced equilibrium and analytically derive a stationary distribution. Then the code will provide the translated network and its stationary distribution formula, which is the same as the stationary distribution of the original network.

Output)
We have five outputs: source_trans, product_trans, Index_trans, lambda_trans, and pi. The first two outputs represent the stochiometric vectors of the source complexes and the product complexes in the translated network identified in Step 4. Index_trans is the row of Index corresponding to the translated network. lambda_trans contains the propensity functions of the translated network, and pi is a symbolic expression for the stationary distribution. , (alpha2ˆ(n2 -n1)*alpha4ˆn1*gamma((alpha1 + alpha4*n1)/alpha4))/ (alpha3ˆn2*gamma(n1 + 1)*factorial(n2)*gamma(alpha1/alpha4))) Here, piecewise(condition, value) means conditionally defined function in MATLAB. If condition holds then value is the function output. Since MATLAB symbolic expression uses gamma(a+k+1)/gamma(a+1) to represent (a + 1)(a + 2) · · · (a + k) and the domain of the gamma function does not contain the non-positive integers, the above complicated expression appears. The above expression certainly the same to the theoretical result.
Note that, technically, this is a formula for a stationary measure, not a distribution. To get the stationary distribution, the stationary measure needs to be normalized so that the sum of the probabilities over the state space is one, which is possible only when the formula is summable over the state space.

Underlying algorithm of the code
The code generates all possible network translations under a user-defined maximum reaction order (e.g., 1, 2, and 3 for unimolecular, bimolecular, and termolecular reaction networks, respectively). Among these translated networks, the code identifies weakly reversible deficiency zero networks. However, because there are sometimes prohibitively many translated networks (e.g., 864 candidates for the example in Fig. 1 with maximum reaction order 3), checking weak reversibility and deficiency for all translated network is extremely inefficient. In particular, it greatly increases computational cost to check weak reversibility and count the number of linkage classes via a connected components search algorithm (Tarjan's algorithm [12]). Thus, before performing this calculation, we first simply check whether a translated network can have a desired property by using the following necessary conditions of being a weakly reversible deficiency zero network, which greatly reduce computational cost.
Theorem 3 (Necessary condition of a network to be weakly reversible after network translation). For a given biochemical reaction network, if there exists a weakly reversible translated network of the given network then for each species index i, one of the following two conditions holds.
1. The ith coordinates of the stoichiometric vectors of all reactions, ν ik − ν ik , are zero. In other words, 2. There exist both positive and negative numbers among the ith coordinates of the stoichiometric vectors of all reactions. In other words, the set {ν i1 − ν i1 , ν i2 − ν i2 , . . . , ν ir − ν ir } has both positive and negative elements.
Proof. This can be proved by contradiction. Suppose that there exists a species index i such that the set of ith coordinates of all the stoichiometric vectors has only non-negative (non-positive) elements and has at least one positive (negative) element. Note that this is the negation of the above statement. Since the set of stoichiometric vectors is invariant under network translation, the weakly reversible translated network also has the same set of stoichoimetric vectors as the original network does. Note that every reaction in a weakly reversible network belongs to a closed cycle. However, the reaction whose stoichiometric vector has positive ith coordinate cannot be contained in any closed cycle because the ith coordinate of every other stoichiometric vector is non-negative.
If the necessary condition is not satisfied for a given BRN, we do not need to check weak reversibility of all translated networks, which greatly reduces computational cost.
Theorem 4 (Necessary condition of a network to have zero deficiency). For a given biochemical reaction network, let n, l, and s be the number of complexes, the number of linkage classes, and the dimension of the subspace spanned by the stoichiometric vectors, respectively. If the deficiency of the network is zero, then s + 1 ≤ n ≤ 2s.
Proof. The deficiency δ is given by n − l − s. Since the deficiency of the given network is zero, n = l + s. As long as the given network is not the empty network (i.e. S = C = R = ∅), there exists at least one linkage class, so 1 ≤ l. Each linkage class consists of at least two complexes so n(= l + s) ≥ 2l and thus l ≤ s. Therefore, s + 1 ≤ n = l + s ≤ 2s.
This condition might appear useless because n can vary among translated network. Nevertheless, s is invariant under network translation, and furthermore, we no longer need to calculate the number of linkage classes l for the condition in Theorem 4, which requires to perform the graph search algorithm and thus takes high computational cost. Hence, by avoiding this calculation for translated networks that do not satisfy the necessary condition, we can greatly reduce computational cost.
After identifying a weakly reversible and of zero deficiency network, the code checks whether each translated network satisfies the factorization condition (Eq. (S2)) by constructing the explicit formula for a candidate θ c (n) for the function θ(n) in Eq. (S2). The candidate θ c (n) can be constructed as follows. We first derive the recurrence relation of the function θ(n). If there exists a desired propensity factorization, from the factorization conditions for reactions i and j at n − ν j and n − ν i , By dividing the above equations, we get the following recurrence relation: Let n 0 be a state satisfying λ k (n 0 ) > 0 for all k. Let us consider a sequence of pairs of source complexes, (ν a1 , ν b1 ), (ν a2 , ν b2 ), . . . , (ν am , ν bm ), such that This means that n 0 → n 0 + (ν a1 − ν b1 ) → · · · → n 0 + m j=1 (ν aj − ν bj ) = n is a path from n 0 to n. For a pair of n and n 0 , in order to identify this type of paths, we use the row-style Hermite normal form, an analogue of reduced echelon form for a matrix over the integers. Specifically, non-zero rows of the Hermite normal form of the matrix whose jth row is ν j+1 − ν 1 form a basis for the lattice as ν i − ν j = (ν i − ν 1 ) − (ν j − ν 1 ). Thus, by using this basis, one can identify a path from n 0 to n [3,5]. Note that such a path is not unique in general but it is enough to consider a single path (see below). We multiply Eq. (S14) along this path, and then we obtain (S16) Note that the RHS of Eq. (S16) can be obtained by substituting n in Eq. (S14) with n + ν i and one can assume θ(n 0 ) = 1 without loss of generality as the factorization holds up to constants. For a special case, CASTANET uses a more efficient way to identify θ(n). Specifically, if the propensity functions of a translated network follow the generalized stochastic mass action kinetics and all the complexes consist of zero or one species, then CASTANET constructs the function θ(n) by using Eq. (S7).
Based on this expression for θ(n), the code constructs the candidate θ c (n) using Eq. (S16) with a path from n 0 to n. If there exists a desired function θ(n) for propensity factorization then it must be the same as the candidate θ c (n) because both should be expressed as Eq. (S16). Thus, the propensity factorization condition (Eq. (S2)) can be checked using θ c (n). In the package, CRN_find_elementary_path() generates a path from n 0 to n, CRN_theta_construction() constructs θ c (n), and CRN_check_factorization_condition() examines the factorization condition. The functions CRN_find_elementary_path(), CRN_find_elementary_function() and CRN_solve_sym_linear() are auxiliary functions to preprocess the input variables into appropriate forms for the construction code. See README.md in the GitHub repository for details. Finally, a complex balanced equilibrium of the deterministic mass action model for the translated network is determined by solving the algebraic equation for the complex balanced equilibrium with the function CRN_compute_cbe(). More examples can be found in https://github.com/Mathbiomed/CASTANET. All kinetics are assumed to follow the mass action kinetics. All rate constants are set to be one to reduce complexity while arbitrary rate constants are allowed to derive stationary distributions.   is sufficiently small compared to that of the original trans-autophosphorylation reaction. a Trans-autophosphorylation reaction A + AP → AP + AP is added to the original network (Fig. 3a). Due to this modification, although the modified network can be translated to a weakly reversible deficiency zero network, a desired propensity factorization does not exist. b Nevertheless, the stationary distributions π(nA) of the modified network (γ = 0.05, 0.1, 0.2, gray lines) calculated from stochastic simulations can be closely approximated by the stationary distribution of the original network (γ = 0, colored line), which is derived by our method (Fig. 3b, c). c Even the stationary distributions π(nA P ) of the modified network are nearly identical to those of the original network. d Similar to a, adding A + AP → AP + AP makes their propensities unable to be factorized as in Eq.
(S2) even after network translation. e, f, g The stationary distributions, π(nA), π(nA P ), and π(nA P P ) of the modified network can be approximated by the stationary distributions of the original network (γ = 0), which are obtained by our method (Fig.   3e, f). h Another trans-autophosphorylation reaction A + A → A + AP is added to the original network (Fig. 3g). i,j The stationary distributions π(nA) and π(nA P ) of the modified network are nearly identical to the stationary distributions of the original network (γ = 0). Note that the stationary distribution of the modified network in h can still be analytically derived because there exists a desired propensity factorization for this modified network after network translation. For each example, 10 5 simulations were performed using the Gillespie algorithm. See Fig. 3 for the values of parameters.  Fig. 4. See Fig. 4 for the values of parameters.