## Introduction

A standard approach to mathematical modeling of biochemical reaction networks (BRNs) is to use ordinary differential equations (ODEs), whose variables represent concentrations of molecules1. However, this deterministic description, while convenient for computation, by its nature cannot capture the inherent randomness of BRNs. In particular, the long-term behavior of ODE systems is characterized by steady states or other attractors, rather than by the stationary distributions statistically observed in real biological systems. As cell biology moves away from bulk averages to single-cell measurements, a focus has shifted to the study of such stationary distributions2,3. They can be described by various stochastic approaches1,4. In particular, stationary distributions can be described as steady-state solutions of the chemical master equation (CME), which has been widely used to describe the time evolution of the probabilities for the numbers of chemical species in BRNs such as gene regulatory networks and signaling pathways5.

Since the CME is a differential equation with infinitely many variables, its steady-state solution (i.e., the stationary distribution) can be found analytically only for simple cases, such as linear reaction networks6 or birth-death processes7. Unlike the CME, its deterministic counterpart is a finite dimensional ODE, whose steady-state solutions are relatively easier to calculate. An interesting question, therefore, is whether there is a systematic way of using these deterministic steady states for characterizing the stationary distribution of the stochastic counterpart. There is a positive answer to this question for special networks, called complex balanced networks.

A result from queuing theory8, reinterpreted in the context of BRNs9 through the connection between Petri nets and BRNs10, shows that for complex balanced networks whose kinetics are described by mass action reactions, stationary distributions can be characterized in terms of jointly distributed Poisson random variables with parameters corresponding to deterministic steady states. An independent proof of this result, together with deep applications to CMEs, was developed by Anderson, Craciun, and Kurtz11. Complex balancing is difficult to check and depends on rate constant values. However, beautiful work by Horn, Jackson, and Feinberg12,13,14 has shown that all networks that have the special structural properties of weak reversibility and zero deficiency are complex balanced, independently of rate constants. Weak reversibility of a network means that the network is a union of closed reaction cycles, and the deficiency of a network is the number of dependent closed reaction cycles, which can be easily checked. Satisfying these two structural properties is a simple condition to derive the stationary distribution of network under mass action reactions with the method in ref.11.

As various BRNs such as networks of several reversible reactions (e.g., A + B ↔ C ↔ 0) or cyclic reactions (e.g., A → B → C → A) are weakly reversible and deficiency zero, their stationary distributions can be analytically derived15,16,17,18,19,20. These have been used to characterize the stochasticity of various systems, including a genetic oscillator21 and a competitive inhibition enzyme kinetics model22. Unfortunately, the majority of BRNs do not have the special network structure. For instance, only ~0.36% of the Erdös-Rényi random networks of two species with up to bimolecular reactions have a deficiency of zero when the edge probability is 0.5, and the fraction decreases to zero as the number of species increases23. Moreover, from a biological standpoint, even simple networks are unlikely to be weakly reversible if they include a bimolecular reaction whose reverse reaction is unimolecular (e.g., autophosphorylation and dephosphorylation).

Here, we develop a framework to derive stationary distributions for a class of networks which do not have the special structure (i.e., weakly reversibility and zero deficiency) by modifying their structures via network translation24,25. Specifically, by simply merging reactions with a common stoichiometric vector and shifting reactions in the networks, we are able to change their structure to be weakly reversible and deficiency zero while preserving their stochastic dynamics. This allows us to derive the stationary distributions of a large class of BRNs including autophosphorylation networks of EGFR, PAK1, and Aurora B kinase. This derivation reveals key reactions determining the autophosphorylation status, which can seldom be done with a purely numerical approach. Furthermore, we describe how the stochastic dynamics of more complex BRNs can be tracked when our method is applicable for only their subnetworks. Importantly, we provide a user-friendly computational package CASTANET (Computational package for deriving Analytical STAtionary distributions of biochemical reaction networks with NEtwork Translation) that automatically derives the stationary distributions of submitted BRNs via our method. This will provide an effective tool to analyze the stochasticity of BRNs.

## Results

### Obtaining the desired network structure via network translation

As mentioned in the introduction, the stationary distributions of the stochastic mass action models for BRNs can be derived with any choice of rate constants using the previous method11 if and only if the networks have two structural properties: weak reversibility and zero deficiency. However, even very simple networks such as the one shown in Fig. 1a-left fail to satisfy the two properties. Weak reversibility means that if there exists a path from a complex (i.e., a node in the reaction graph) to another complex, then there is a reverse path from the second one back to the first one. Because there is no path from A + A to A while there is a path from A to A + A, the network in Fig. 1a-left is not weakly reversible. The deficiency of a network is a non-negative integer index calculated by subtracting both the number of linkage classes (i.e., connected components in the reaction graph) and the dimension of the subspace spanned by the stoichiometric vectors from the number of complexes. The deficiency of the network in Fig. 1a-left is one. Therefore, the previous method11 cannot be used to derive its stationary distribution.

Two different reactions, 0 → A and A → A + A, have the same stoichiometric vector (1, 0) because both reactions produce one molecule of A (Fig. 1a-left). Thus, these two reactions can be merged by unifying the source complexes 0 and A into 0 and summing the propensities of both reactions (Fig. 1a). This procedure is known as network translation24,25, which was proposed to investigate deterministic systems. This procedure is also applicable to stochastic systems as it preserves the stochastic dynamics (see Supplementary Note 1 for details). For instance, the propensities of the production of A are α1 + α4nA in both the original (Fig. 1a-left) and the translated network (Fig. 1a-right). Although the network translation is simple, it can effectively change the structure of the network to be a weakly reversible deficiency zero network.

### Propensity factorization is required

Even though the translated network is weakly reversible and of zero deficiency, the new model no longer follows mass action kinetics since the propensity of the reaction 0 → A is not constant (Fig. 1a-right). In this case, previously, it was known that the method in refs.11,26 is still applicable if the non-mass action propensity functions can be factorized as a certain form. However, the propensity functions of this translated network do not have the certain form. Thus, we generalize the previous factorization form so that stationary distributions can be derived for a larger class of BRNs. Specifically, we show that all the propensities of the translated network $${\tilde{\lambda }}_{k}({\bf{n}})$$ need to be factorized as

$${\tilde{\lambda }}_{k}({\bf{n}})={\kappa }_{k}\theta ({\bf{n}})\omega ({\bf{n}}-{\nu }_{k}){{\bf{1}}}_{\{{\bf{n}}\ge {\nu }_{k}\}}$$
(1)

for some constants κk > 0 and functions θ(n) > 0 and ω(n) ≥ 0 on a set Γ = {nn ≥ b} where the $${\nu }_{k}$$ is the source complex vector of the kth reaction, the inequality is coordinate-wise, and the b needs to be chosen so that $${\tilde{\lambda }}_{k}({\bf{n}})\,> \, 0$$ if and only if there are sufficient reactants (i.e., n ≥ $${\nu }_{k}$$ + b) in Γ. For the translated network (Fig. 1a-right), $${\tilde{\lambda }}_{k}({\bf{n}})\,> \, 0$$ if and only if n ≥ $${\nu }_{k}$$ like mass action kinetics, and thus b = (0, 0) and $${{\Gamma }}={{\mathbb{Z}}}_{\ge 0}^{2}$$.

Propensity functions satisfying the factorization condition (Eq. (8)) include a generalized mass action kinetics (Eq. (7)). For instance, if a source complex is 0, A, A + A, or A + B, propensity functions following the generalized mass action kinetics are proportional to 1, fA(nA), fA(nA)fA(nA − 1), or fA(nA)fB(nB), respectively. Note that if the fi’s are identity functions then the propensities follow standard mass action kinetics (Eq. (6))11,27. The propensity functions following the generalized mass action kinetics can be easily factorized with $$\theta ({\bf{n}})=\omega {({\bf{n}})}^{-1}=\mathop{\prod }\nolimits_{i = 1}^{d}\mathop{\prod }\nolimits_{j = {b}_{i}+1}^{{n}_{i}}{f}_{i}(j)$$, where d is the number of the constitutive chemical species (see Eq. (9) for details). However, the translated network (Fig. 1a-right) does not follow the generalized mass action kinetics (Eq. (7)) because the propensity function of the reaction 0 → A, α1 + α4nA, is not proportional to 1 (i.e., it is not constant). Thus, we need to solve recurrence relations as described in Supplementary Note 2 to identify the propensity factorization (Fig. 1b):

$${\kappa }_{k} ={\alpha }_{k}, \\ \theta ({\bf{n}}) =\frac{{\alpha }_{1}+{\alpha }_{4}{n}_{A}}{{\alpha }_{1}}{\mathop{\prod }\limits_{j=1}^{{n}_{A}}}\left(\frac{{\alpha }_{1}j}{{\alpha }_{1}+{\alpha }_{4}j}\right){\mathop{\prod }\limits_{j=1}^{{n}_{B}}}j,\\ {\text{and}}\,\omega ({\bf{n}}) =\frac{{\alpha }_{1}+{\alpha }_{4}{n}_{A}}{{\alpha }_{1}}\frac{1}{\theta ({\bf{n}})}$$
(2)

### Derivation of stationary distribution

After identifying κk, θ(n), and ω(n) via the propensity factorization, we need to find a complex balanced equilibrium (CBE) of the deterministic mass action model with rate constants {κk} for the translated network (Fig. 1c). The CBE is a steady state at which for each complex ν, the in-flow rate to ν is equal to the out-flow rate from ν12. For instance, based on the deterministic model in Fig. 1c, the complex balance conditions for the complexes 0, A, and A + B are κ3cAcB = κ1, κ1 = κ2cA, and κ2cA = κ3cAcB, respectively. By solving these equations, we can obtain the CBE, (cA, cB) = (κ1/κ2, κ2/κ3). Note that the existence of a CBE is guaranteed because we translate a network to be weakly reversible and deficiency zero13.

Finally, using the function θ(n) (Fig. 1b) and the CBE (cA, cB) (Fig. 1c), we can derive the stationary distribution of the stochastic model for the translated network, which is the same as that of the original network, as follows:

$$\pi ({n}_{A},{n}_{B})=M\frac{{c}_{A}^{{n}_{A}}{c}_{B}^{{n}_{B}}}{\theta ({n}_{A},{n}_{B})}$$
(3)

for nA ≥ 0, nB ≥ 0 where M is the normalizing constant so that the sum of the stationary distribution is one (see Methods for details). In this example, the distribution π(n) is obtained on $${{\Gamma }}={{\mathbb{Z}}}_{\ge 0}^{2}$$. This state space is closed as proved in Supplementary information, and it is irreducible (i.e., every state is reachable from every other state; see Supplementary Note 3 for details). On the other hand, if an irreducible state space is a proper subset of Γ, possibly due to a conservation law, then the normalizing constant M is chosen so that the sum of π(n) over the subset is one.

### Computational package, CASTANET

Applying our theoretical framework (Fig. 1) has two practical difficulties. Translating a given network to a weakly reversible deficiency zero network (Fig. 1a) is not straightforward as prohibitively many candidates of translated networks often exist. Furthermore, it is challenging to check whether the factorization condition holds (Fig. 1b) as it requires to solve associated recurrence relations. Thus, we have developed a user-friendly, open-source, and publicly available computational package, “CASTANET (https://github.com/Mathbiomed/CASTANET),” that automatically performs network translation and propensity factorization and derives stationary distributions (Fig. 2a). With this package, we were able to easily identify hundreds of BRNs and derive analytic forms of their stationary distributions. We have provided some of them in Fig. 2b and Supplementary Figs. 3 and 4. To use this package, users only need to enter the source complexes, product complexes, and propensity functions of reactions.

### Stationary distributions of autophosphorylation networks

Our theoretical framework and especially CASTANET extend the class of BRNs whose stationary distributions can be analytically derived using CBEs (Figs. 1 and 2). This class includes various autophosphorylations networks (Fig. 3) that are not weakly reversible due to autophosphorylation reactions, which occur in intermolecular (trans), intramolecular (cis), or mixed manners28,29.

Asymmetric trans-autophosphorylation occurs if two monomers form a homodimer and one of them acts as an ‘enzyme’ and phosphorylates the other. This type of autophosphorylation occurs in the epidermal growth factor receptor (EGFR), which triggers signal transduction for cell proliferation30. The key regulatory reactions for EGFR include its synthesis, trans-autophosphorylation, dephosphorylation, and degradation (Fig. 3a-left). The asymmetric trans-autophosphorylation is a reaction that transforms the complex A + A to A + AP. The dephosphorylation reaction is not the reverse of the previous reaction; instead, it occurs from the complex AP to the complex A. Thus, the network is not weakly reversible. However, CASTANET automatically identifies a weakly reversible deficiency zero translated network and its propensity factorization (Fig. 3a-right) and then derives the analytic form of stationary distribution (Fig. 3b) that matches what is calculated with stochastic simulations (Fig. 3c).

In addition, having the formula (Fig. 3b) allows us to easily understand the long-term behavior of the system, something that is not possible with a purely computational approach. For instance, $$\pi ({n}_{{A}_{P}})$$ in Fig. 3c is the Poisson distribution with rate α1/α3. This indicates that the synthesis (α1) and degradation rates (α3) of A are the determinants of the long-term status of $${n}_{{A}_{P}}$$, which is surprisingly robust to the changes of phosphorylation (α2) and dephosphorylation rates (α4). Furthermore, π(nA) is solely determined by $$\frac{{\alpha }_{1}}{{\alpha }_{2}}(1+\frac{{\alpha }_{4}}{{\alpha }_{3}})$$, and its moments can be calculated with the modified Bessel functions (see Supplementary Note 3 for details). This allows us to identify that the stationary distribution of nA is sub-Poissonian, and its coefficient of variation attains the maximum at $$\frac{{\alpha }_{1}}{{\alpha }_{2}}(1+\frac{{\alpha }_{4}}{{\alpha }_{3}})\approx 1.8$$ (Supplementary Fig. 1).

Trans- and cis-autophosphorylation can occur sequentially. For example, p21-activated kinase 1 (PAK1), which regulates cell motility and morphology, phosphorylates a threonine residue in the kinase domain in a trans manner asymmetrically (A + A → A + AP)28,31 and then phosphorylates a serine residue in the regulatory domain of itself in a cis manner (AP → APP)32,33 (Fig. 3d-left). While the original network of PAK1 is not weakly reversible (Fig. 3d-left), CASTANET identifies a weakly reversible deficiency zero translated network (Fig. 3d-right) and derives the analytic form of stationary distribution (Fig. 3e) that matches the simulation result (Fig. 3f).

Both trans- and cis-autophosphorylation can occur simultaneously as in Aurora B kinase, which controls mitotic progression34. In an Aurora B kinase network, cis-autophosphorylation (A → AP) promotes rapid trans-autophosphorylation (A + AP → AP + AP), which forms a positive feedback in the system34. For this network, CASTANET successfully applies our method to derive the analytic form of stationary distribution (Fig. 3g–i).

While mass action kinetics are commonly used to describe autophosphorylations34,35 as in our examples, the Michaelis–Menten function and Hill function are also often used36,37. Moreover, they are also used to describe the effects of phosphatases on dephosphorylation and proteasomes on degradation38,39, which EGFR, PAK1, and Aurora B kinase undergo40,41,42,43,44,45. Even when the mass action propensities in the networks (Fig. 3a, d, and g-left) are replaced with the Michaelis-Menten or Hill functions, their stationary distributions can still be derived with the same approach (Supplementary Fig. 5).

When the presented networks are extended by adding reactions, our methods might not be applicable. For instance, if an additional trans-autophosphorylation (A + AP → AP + AP) is added to the example in Fig. 3a, although it can be translated to a weakly reversible deficiency zero network, their propensities cannot be factorized as in Eq. (1). Thus, the stationary distribution of the extended network cannot be derived by our method. However, it can be approximated by the stationary distribution of the original network if the rate constant of the added reaction is small enough (see Supplementary Fig. 6 for details). Such approximation works for the extended networks of the other networks (Fig. 3d, g) as well. This indicates that if the stationary distributions of core subnetworks, which consist of dominant reactions, can be derived by our method, then it could be used to approximate the stationary distributions of their more complex parent networks.

### Translation of fast subnetworks reveals both the fast and slow dynamics of a multi-timescale system

As the number of nodes (i.e., complexes) of networks increases, the networks are less likely to be a weakly reversible deficiency zero network even after network translation, and thus our method is less likely to be applicable. However, such large networks commonly consist of reactions occurring at different time scales46. In this case, if we can derive the conditional stationary distributions of only fast subnetworks with our method, both the fast and slow dynamics of the full network can be accurately captured.

For gene regulatory networks, if the promoter kinetics (i.e., binding and unbinding of transcription factors to promoters) are fast, the fast subnetwork is a simple reversible binding network (i.e., weakly reversible and of zero deficiency), and thus its stationary distribution can be easily calculated21. On the other hand, when the promoter kinetics are slow, the fast subnetwork includes a complex protein reaction network whose stationary distribution is challenging to derive. This can occur for a variety of reasons, e.g., the presence of nucleosomes in eukaryotic cells usually slows down the binding and unbinding of transcription factors20.

A genetic toggle switch with the slow promoter kinetics, which consists of a pair of genes GA and GB, is an example of such multi-timescale system (Fig. 4a). The genes GA and GB express proteins A and B, respectively. Subsequently, these proteins undergo asymmetric trans-autophosphorylation, and they mutually repress gene expression by binding to the promoter region of each other’s gene. The binding and unbinding of the phosphorylated proteins occur at a slower time scale. Therefore, the entire network can be divided into the fast subnetwork consisting of gene expression, phosphorylation, dephosphorylation, and degradation and the slow subnetwork consisting of binding and unbinding (Fig. 4b). While the fast subnetworks for A and B are neither weakly reversible nor deficiency zero, they can be translated to weakly reversible deficiency zero networks (Fig. 4c-top). The propensity functions of these translated networks can be factorized as in Eq. (8), so their stationary distributions conditioned on the slow variables can be derived with CASTANET (Fig. 4c). Depending on the slowly changing gene states, the distributions of the proteins A, AP, B, and BP can dramatically change.

When the gene states slowly change, the fast variables rapidly equilibrate to the conditional stationary distributions determined by the current gene states (Fig. 4c). Thus, the weighted average of the conditional stationary distributions with the probabilities of the corresponding gene states accurately approximates the full (i.e., unconditional) stationary distribution under timescale separation (i.e., ε 1)20. For instance, the full stationary distribution of AP can be approximated as

$$\pi ({n}_{{A}_{P}})\approx {\rho }_{act}\pi ({n}_{{A}_{P}}| {n}_{{G}_{act}^{A}}=1)+{\rho }_{rep}\pi ({n}_{{A}_{P}}| {n}_{{G}_{rep}^{A}}=1)$$
(4)

where ρact and ρrep are the probabilities that the gene GA is active and repressed, respectively. The ρact becomes larger as the dissociation constant between GA and its repressor BP is larger, and the number of repressor BP is smaller. The ρact can be calculated by identifying the eigenvector of the matrix consisting of the dissociation constant, and the conditional stationary moments of the repressors obtained from Fig. 4c20, and ρrep = 1 − ρact (see Supplementary Note 5 for details). Thus, using Eq. (4), we can accurately capture the bimodal stationary distribution of the protein AP (Fig. 4d), leading to phenotypic heterogeneity in isogenic populations. Similarly, the full bimodal stationary distributions of the other fast variables A, B, and BP can also be accurately captured (Supplementary Fig. 7). Note that these bimodalities cannot be captured by the corresponding deterministic model, which predicted monostability. Such mismatches between the stochastic and deterministic model have been frequently observed in the presence of timescale separation1,20,47.

The conditional stationary distributions of the fast variables, obtained by using our approach (Fig. 4c), allow us to capture the slow dynamics of the full system as well. On the slow time scale, the slow variables are unlikely to be changed, but the fast variables rapidly equilibrate to their conditional stationary distributions for the given slow variables. Thus, by replacing the fast variables in the propensity functions of the slow reactions with their quasi-steady states (QSSs): conditional stationary moments, we can obtain the reduced model21,48. For the toggle switch system, the QSSs of the fast variables AP and BP can be computed from their conditional stationary distributions (Fig. 4c). Then by replacing the fast variables AP and BP with their QSSs, we can obtain the reduced model with only the slow variables, the active and repressed genes (Fig. 4e). This reduced model accurately captures the slow dynamics of the full model: the binding and unbinding of the repressors to the genes. Both the full and the reduced models yield nearly identical distributions of the residence time of the repressor BP, which quantifies how long the repressor maintains its binding to the gene GA (Fig. 4e). Because the reduced model does not simulate the fast reactions, which incur a large computational cost in the full model simulation, computation time decreases by 99.9998%.

## Discussion

In this study, we have developed a framework and its computational package that analytically derive stationary distributions of a large class of BRNs. Specifically, we showed that the stationary distribution of a BRN can be derived if two conditions are satisfied: the network can be transformed to a weakly reversible deficiency zero network via network translation (Fig. 1a) and the propensity functions of the translated network satisfies the generalized factorization property of mass action kinetics, identified in this study (Fig. 1b). We found that these conditions are satisfied in numerous BRNs including various autophosphoryaltion networks by using CASTANET (Fig. 2, Supplementary Figs. 3 and 4). Furthermore, even when only a subnetwork of more complex BRNs satisfies the conditions, the stochastic dynamics can often be captured. That is, the stationary distribution of the subnetwork consisting of dominant reactions, derived with our method, can accurately approximate the stationary distribution of its parent network (Supplementary Fig. 6). Furthermore, the derivation of the stationary distribution of a fast subnetwork is enough to capture both the slow and fast stochastic dynamics of its multi-timescale parent network (Fig. 4). With these analytically derived stationary distributions of BRNs, their long-term stochastic behaviors such as their dependence on rate constants can be effectively investigated, and the likelihood function of parameters for Bayesian inference can also be derived2.

Our work focused on the derivation of steady-state solutions of the CME using the underlying network structure following previous studies11,26. However, the CME is not usually used to capture cell division, which should be taken into account to describe single cell behavior in general. Thus, it would be interesting in future work to extend our method to the population balance equation49,50, which describes stochastic cell population dynamics (e.g., cell division) as well as intracellular dynamics. This extension could be accomplished by averaging stationary distributions from cell populations after a stationary distribution of each cell is derived by our method.

We have translated a network to have the desired structural properties (i.e., weak reversibility and zero deficiency) by merging reactions with a common stoichiometric vector (Figs. 1a and 3g) and shifting a reaction preserving its stoichiometric vector (Fig. 3a, d). While the idea underlying this procedure is simple, it greatly extends the class of networks whose structure can be changed to the desired one. For instance, when the edge probability is $$\frac{1}{2}$$, the fraction of deficiency zero networks among Erdös-Rényi random networks with two species and at most bimolecular reactions increases more than six times after network translation. The identification of such translation, which is not simple, can be done automatically by the provided computational package, CASTANET. In particular, to efficiently search translated networks, in CASTANET, we use the necessary conditions for network translation toward weakly reversible and deficiency zero networks, derived in this study (see Supplementary Note 4 for details).

Furthermore, CASTNET performs the propensity factorization of translated networks, which is required to derive the stationary distributions of networks with non-mass action kinetics. In this study, by extending the previous factorization condition11,26 to ours (Eq. (8)), we have been able to derive stationary distributions of various BRNs (Figs. 1, 2, and 3, Supplementary Figs. 3 and 4). Although the factorization condition with non-mass action kinetics have been rarely investigated11,26 due to its complexity and lack of motivation, our work motivates studies on it as translated networks typically follow non-mass action kinetics. To cover more weakly reversible deficiency zero translated networks, we aim to further generalize our factorization conditions, and accordingly, we will update our computational package CASTANET.

By changing the network structure while preserving the stochastic dynamics via network translation, we have been able to use the theory, applicable to weakly reversible deficiency zero networks11, to understand the stochastic dynamics of a larger class of networks with non-zero deficiencies. Similarly, by translating a network to have a deficiency of one, it would be possible to show that the networks have the properties of a network with a deficiency of one, such as absolute concentration robustness: the steady state value of a species is invariant to the overall input of the system51,52,53. Furthermore, network translation of stochastic BRNs can also be used to identify stochastic properties of networks based on their structures, such as positive recurrence54 and extinction52,55,56.

## Methods

### Biochemical reaction network

BRN is a graphical representation of a given biochemical system12,14,57,58. It consists of the triple $$\{{\mathcal{S}},{\mathcal{C}},{\mathcal{R}}\}$$ where $${\mathcal{S}}=\{{S}_{1},\ldots ,{S}_{d}\}$$ is the set of interacting species, $${\mathcal{C}}=\{{C}_{1},\ldots ,{C}_{m}\}$$ is the set of complexes, and $${\mathcal{R}}=\{{\nu }_{1}\to {\nu }_{1}^{\prime},\ldots ,{\nu }_{r}\to {\nu }_{r}^{\prime}\}$$ is the set of reactions. A complex is a non-negative linear combination of species (i.e., Ci = ai1S1 +  + aidSd), which is also represented as a d-dimensional non-negative integer-valued vector (ai1, …, aid). A reaction is an ordered pair of complexes. This allows the BRN to be represented as a directed graph ($${\mathcal{C}},{\mathcal{R}}$$), where complexes are nodes and reactions are directed edges. Hence, a reaction $${R}_{j}:= {\nu }_{j}\to {\nu }_{j}^{\prime}$$, where $${\nu }_{j}$$ and $${\nu }_{j}^{\prime}$$ are the source and product complexes of the jth reaction, respectively. The vector $${\nu }_{j}^{\prime}-{\nu }_{j}$$ is called a stoichiometric vector of the jth reaction, which describes the relative change in the number of molecules of reactants and products between the sides of each reaction. A linkage class is a connected component of the network when all reactions are regarded as undirected edges. Weak reversibility means that if there is a sequence of reactions from a complex Ci to another complex Cj then there must be a sequence of reactions from Cj to Ci. The deficiency δ is the integer index defined as $$| {\mathcal{C}}| -l-s$$, where $$| {\mathcal{C}}|$$ is the number of complexes, l is the number of linkage classes, and s is the dimension of the subspace spanned by all stoichiometric vectors (Fig. 1a).

#### Complex balanced equilibrium

CBE of the deterministic mass action model for a BRN with rate constants {κk} is the steady state $${\bf{c}}\in {{\mathbb{R}}}_{ \,{> }\,0}^{d}$$ which satisfies the following equality for each complex $$z\in {\mathcal{C}}$$ (Fig. 1c):

$$\mathop{\sum}\limits_{k:\, {\nu }_{k}^{\prime}=z}{\kappa }_{k}{{\bf{c}}}^{{\nu }_{k}}=\mathop{\sum}\limits_{k:\,{\nu }_{k}=z}{\kappa }_{k}{{\bf{c}}}^{{\nu }_{k}}$$
(5)

where the $${\kappa }_{k}{{\bf{c}}}^{{\nu }_{k}}={\kappa }_{k}{c}_{1}^{{\nu }_{k1}}{c}_{2}^{{\nu }_{k2}}\cdot \cdots \cdot {c}_{d}^{{\nu }_{kd}}$$ is the rate function of the kth reaction following the deterministic mass action kinetics, and $${\nu }_{ki}$$ is the ith entry of $${\nu }_{k}$$12. The LHS is the sum of rate functions over reactions whose product complex is z (i.e., $${\nu }_{k}^{\prime}=z$$), and the RHS is the sum of rate functions over reactions whose source complex is z (i.e., $${\nu }_{k}$$ = z). In other words, at CBE, the in-and out-flows create a balance for each complex. The deterministic mass action model for a BRN possesses a CBE regardless of rate constants if and only if the BRN is weakly reversible and deficiency zero13. Furthermore, even when a BRN has non-zero deficiency and is weakly reversible, the deterministic mass action model for the BRN possesses a CBE with specific choice of rate constants13.

#### Stochastic model of biochemical reaction networks

We model a BRN as a continuous-time Markov chain (CTMC) for an isothermal well-stirred system with a constant volume. The state of the CTMC at time t, $${\bf{n}}(t)=({n}_{1},\ldots ,{n}_{d})\in {{\mathbb{Z}}}_{\ge 0}^{d}$$, represents the copy number of each species. Each reaction is associated with a propensity function:

$${\lambda }_{k}:{{\mathbb{Z}}}_{\ge 0}^{d}\to {{\mathbb{R}}}_{\ge 0},\ k=1,\ldots ,r.$$

Specifically, λk(n) is the probability that the kth reaction occurs in a short interval of length dt if the state at the beginning of the interval was n. Using the propensity functions, we can derive the CME, which describes the time evolution of the probability of the model:

$$\frac{d{p}_{{\bf{n}}}}{dt}=\mathop{\sum }\limits_{k=1}^{r}{\lambda }_{k}({\bf{n}}-({\nu }_{k}^{\prime}-{\nu }_{k})){p}_{{\bf{n}}-({\nu }_{k}^{\prime}-{\nu }_{k})}-\mathop{\sum }\limits_{k=1}^{r}{\lambda }_{k}({\bf{n}}){p}_{{\bf{n}}}$$

for $${\bf{n}}\in {{\mathbb{Z}}}_{\ge 0}^{d}$$, where pn(t) denotes the probability that the state of the system equals $${\bf{n}}\in {{\mathbb{Z}}}_{\ge 0}^{d}$$ at time t. A stationary distribution π(n) of a given CTMC is the steady-state solution of the CME that satisfies the following infinite equation:

$$\mathop{\sum}\limits_{k}{\lambda }_{k}({\bf{n}}-({\nu }_{k}^{\prime}-{\nu }_{k}))\pi ({\bf{n}}-({\nu }_{k}^{\prime}-{\nu }_{k}))=\mathop{\sum}\limits_{k}{\lambda }_{k}({\bf{n}})\pi ({\bf{n}}).$$

It means that if the CTMC is initialized with its stationary distribution, the vector of probabilities p(t) will stay constant for all time t > 0.

The stochastic mass action propensity functions are assumed to be proportional to the number of ways in which species can combine to form the source complex. Hence, the kth propensity function with the rate constant αk can be written as:

$${\lambda }_{k}({n}_{1},\ldots ,{n}_{d})={\alpha }_{k}\mathop{\prod }\limits_{i=1}^{d}\frac{{n}_{i}!}{({n}_{i}-{\nu }_{ki})!}{{\bf{1}}}_{\{{n}_{i}\ge {\nu }_{ki}\}}.$$
(6)

Additionally, the propensity functions can have a more generalized form as follows:

$${\lambda }_{k}({n}_{1},\ldots ,{n}_{d})={\alpha }_{k}\mathop{\prod }\limits_{i=1}^{d}{f}_{i}({n}_{i}){f}_{i}({n}_{i}-1)\cdots {f}_{i}({n}_{i}-({\nu }_{ki}-1)){{\bf{1}}}_{\{{n}_{i}\ge {\nu }_{ki}\}}$$
(7)

where functions $${f}_{i}:{{\mathbb{Z}}}_{\ge 0}\to {{\mathbb{R}}}_{\ge 0}$$. For instance, the translated network in Fig. 3a-right follows this form as $${f}_{A}({n}_{A})={n}_{A}({n}_{A}-1)$$ and $${f}_{{A}_{P}}({n}_{{A}_{P}})={n}_{{A}_{P}}$$. This is called ‛generalized’ stochastic mass action kinetics since if fi’s are identity functions then it is equivalent to the stochastic mass action kinetics (Eq. (6)).

### Network translation

Network translation is a procedure to transform a BRN $$\{{\mathcal{S}},{\mathcal{C}},{\mathcal{R}}\}$$ to another one $$\{{\mathcal{S}},\tilde{{\mathcal{C}}},\tilde{{\mathcal{R}}}\}$$ that satisfies the condition: the sum of propensities of a set of reactions sharing the same stoichiometric vector remains identical (Fig. 1a). That is, for each vector $$\gamma \in {{\mathbb{Z}}}^{d}$$, the propensity functions of the original and the translated networks, λk(n) and $${\tilde{\lambda }}_{\tilde{k}}({\bf{n}})$$, satisfy the following:

$$\mathop{\sum}\limits_{k:\,{\nu }_{k}^{\prime}-{\nu }_{k}=\gamma }{\lambda }_{k}({\bf{n}})=\mathop{\sum}\limits_{\tilde{k}:\,{\tilde{\nu }}_{\tilde{k}}^{\prime}-{\tilde{\nu }}_{\tilde{k}}=\gamma }{\tilde{\lambda }}_{\tilde{k}}({\bf{n}})$$

for all $${\bf{n}}\in {{\mathbb{Z}}}_{\ge 0}^{d}$$, where $${\nu }_{k}^{\prime}-{\nu }_{k}$$ and $${\tilde{\nu }}_{\tilde{k}}^{\prime}-{\tilde{\nu }}_{\tilde{k}}$$ are the stoichiometric vectors of the kth and $$\tilde{k}$$th reactions of the first and second models, respectively. For example, merging several reactions sharing a common stoichiometric vector into one reaction is network translation. Similarly, shifting reactions preserving their stoichiometric vectors is also an instance of network translation (e.g., A + B → 2B to A → B). Network translation can change the structural properties of BRNs, such as weak reversibility and the deficiency, but it preserves the associated CME, i.e., stochastic dynamics (see Supplementary Note 1 for details).

### Propensity factorization

To derive the stationary distribution with our approach (Fig. 1), all the propensities $${\tilde{\lambda }}_{k}({\bf{n}})$$ should be factorized as

$${\tilde{\lambda }}_{k}({\bf{n}})={\kappa }_{k}\theta ({\bf{n}})\omega ({\bf{n}}-{\nu }_{k}){{\bf{1}}}_{\{{\bf{n}}\ge {\nu }_{k}\}}$$
(8)

for some constants κk > 0 and functions θ(n) > 0 and ω(n) ≥ 0 on a set $${{\Gamma }}=\{{\bf{n}}\,| \,{\bf{n}}\ge {\bf{b}}\}\subset {{\mathbb{Z}}}_{\ge 0}^{d}$$ at which $${\tilde{\lambda }}_{k}({\bf{n}})\,> \, 0$$ if n ≥ $${\nu }_{k}$$  + b and $${\tilde{\lambda }}_{k}({\bf{n}})=0$$ otherwise. For the stochastic mass action kinetics (Eq. (6)), b = 0 as $${\tilde{\lambda }}_{k}({\bf{n}})\,> \, 0$$ if and only if n ≥ $${\nu }_{k}$$ . For the translated network in Fig. 3a, b = (1, 0) because each propensity is positive if and only if n ≥ $${\nu }_{k}$$  + (1, 0) in $${{\Gamma }}=\{{\bf{n}}| ({n}_{A},{n}_{{A}_{P}})\,\ge\, (1,0)\}$$.

While the propensity factorization can be calculated by solving recurrence relations (see Supplementary Note 2 for details), it can be obtained without solving recurrence relations if all the propensities of a given network follow the generalized mass action kinetics (Eq. (7)). In this case, the factorization can be easily obtained as

$${\kappa }_{k}:= {\alpha }_{k},\ \theta ({\bf{n}}):= \mathop{\prod }\limits_{i=1}^{d}\mathop{\prod }\limits_{j={b}_{i}+1}^{{n}_{i}}{f}_{i}(j),\,\text{and}\,\ \omega ({\bf{n}}):= \frac{1}{\theta ({\bf{n}})}{{\bf{1}}}_{\{{\bf{n}}\ge {\bf{b}}\}},$$
(9)

where $$\mathop{\prod }\nolimits_{j = 0}^{-1}{a}_{j}=1$$ for any {aj} (see Supplementary Note 2 for details).

### Derivation of stationary distribution

If a network is weakly reversible and deficiency zero so that it has a CBE $${\bf{c}}\in {{\mathbb{R}}}_{ \,{> }\,0}^{d}$$13 and propensity function λk(n) can be factorized as in Eq. (8) on Γ, a stationary measure of the network can be derived as

$$\pi ({\bf{n}})=\left\{\begin{array}{ll}\frac{{{\bf{c}}}^{{\bf{n}}}}{\theta ({\bf{n}})}&\,\text{if}\,\ {\bf{n}}\in {{\Gamma }},\\ 0&\,\text{if}\,\ {\bf{n}}\in {{{\Gamma }}}^{\text{c}}.\end{array}\right.$$

Supplementary Note 3 provides the proof and detailed illustration. By scaling this stationary measure with the normalizing constant, which is the reciprocal of the summation of π(n) over the irreducible state space, the stationary distribution on the irreducible state space can be obtained. For instance, the normalizing constant for the stationary distribution (Fig. 3e) is calculated by summing π(n) over the irreducible state space $$\{({n}_{A},{n}_{{A}_{P}},{n}_{{A}_{PP}})\,| \,{n}_{A}\ge 1,{n}_{A}+{n}_{{A}_{P}}+{n}_{{A}_{PP}}={T}_{0}\}$$. While computing the normalizing constants is sometimes challenging, a symbolic computation approach using Wilf-Zeilberger theory can be used for the stochastic mass action model59.

### Computational package, CASTANET

We have developed a user-friendly, open-source, and publicly available computational package, CASTANET, that performs the network translation and propensity factorization automatically (Fig. 2a). The package checks two conditions: whether a given BRN can be made weakly reversible and of zero deficiency after network translation, and whether the propensities of the translated network can be factorized as in Eq. (8). If these two conditions are satisfied, CASTANET then calculates the analytic formula for a stationary distribution.

To efficiently search weakly reversible deficiency zero translated networks, we derived their necessary conditions (see Supplementary Note 4 for details) and incorporated them in the package. Furthermore, CASTANET constructs a candidate for the factorization function θ(n) in symbolic expression, which allows us to check propensity factorization condition without checking infinite combinations (see Supplementary Note 4).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.