## Abstract

Homeostasis of protein concentrations in cells is crucial for their proper functioning, requiring steady-state concentrations to be stable to fluctuations. Since gene expression is regulated by proteins such as transcription factors (TFs), the full set of proteins within the cell constitutes a large system of interacting components, which can become unstable. We explore factors affecting stability by coupling the dynamics of mRNAs and proteins in a growing cell. We find that mRNA degradation rate does not affect stability, contrary to previous claims. However, global structural features of the network can dramatically enhance stability. Importantly, a network resembling a bipartite graph with a lower fraction of interactions that target TFs has a higher chance of being stable. Scrambling the *E. coli* transcription network, we find that the biological network is significantly more stable than its randomized counterpart, suggesting that stability constraints may have shaped network structure during the course of evolution.

## Introduction

Cells require different protein levels to survive in different external environments. The expression of these proteins within the cell are therefore highly regulated. An important regulatory mechanism involves transcription factors (TFs), which are themselves proteins that can either up or downregulate the transcription of mRNAs coding for other proteins by binding to enhancer or promoter regions of the regulated gene^{1}. Despite the importance of maintaining desired protein concentrations within cells, factors affecting the stability of these concentrations to perturbations have received little attention.

One approach of studying the stability of such systems with a large number of interacting components was introduced by May in the 1970s in the context of complex ecological communities^{2}. The idea is that in a *n*-species community, the dynamics of the abundances *N*_{i} of each species may in general be described by a set of ordinary differential equations:

for *i* = 1, 2, ..., *n*, with corresponding steady-state solution \({N}_{i}^{ss}\) such that \({f}_{i}({\overrightarrow{{\bf{N}}}}^{ss})=0\) ∀ *i*. The dynamics of small perturbations about this steady-state \({x}_{i}(t)={N}_{i}(t)-{N}_{i}^{ss}\), when linearized about \({N}_{i}^{ss}\), has the form:

where **A** is the Jacobian matrix with elements \({A}_{ij}={\left(\frac{\partial {f}_{i}}{\partial {N}_{j}}\right)}^{ss}\). If all the eigenvalues of **A** have a negative real part, the system relaxes back to the steady-state upon perturbations and the steady-state is said to be stable; if any of the eigenvalues have a positive real part, the steady-state is unstable as the system will move away from it (exponentially fast) when infinitesimally perturbed. To construct **A**, one would need to precisely know the functions *f*_{i}, which is often hard to obtain. May’s approach was to model **A** as a random matrix with independent, identically distributed off-diagonal elements (with mean 0, standard deviation *σ*, and fraction of non-zero elements *C*) and constant diagonal elements—*a*. In the context of ecology, *σ* reflects the average interaction strength between species, *C* is the density of interactions or the probability that any two species interact, while *a* is the self-regulation term which sets the relaxation time-scale of the system if there were no other pairwise interactions. From random matrix theory (RMT) and in particular the circular law for matrix eigenvalue distributions^{3,4}, this system is stable if and only if \(a \, > \, \sigma \sqrt{nC}\). This implies that the system becomes unstable above some critical size, and that increasing *a* stabilizes the system and allows for stronger interactions between species.

This approach has also been used to analyze other large interacting systems. In particular, it has been used to argue why weak repressions by microRNAs, thought of as effectively increasing the degradation rate of mRNAs, confer stability to gene regulatory networks^{5,6}. However, such a framework does not take into account the functional form of *f*_{i} and in particular that the matrix elements often depend on the steady-state solutions themselves. These details of the model can be important—for example, when competition for resources between ecological species are explicitly modeled (using MacArthur’s consumer-resource model), even when the interactions (i.e., preferences of each species for the different resources) are completely random, the spectrum of the Jacobian matrix that represents effective pairwise interaction between species is no longer circular (but rather, follows the Marchenko-Pastur distribution)^{7}. Furthermore, transcriptional regulatory networks are not random but instead have distinct structural features. The structure of interaction networks has been known to affect stability in other models^{7,8,9,10}. However, how these features affect the stability of gene regulatory networks has not been explored.

Here, by analyzing a model that takes into account the transcription of mRNAs from genes, translation of mRNAs into proteins, and transcriptional regulation by proteins, we investigate the stability of this large system of coupled mRNAs and proteins in growing cells and find that while the mRNA degradation rate can affect relaxation rate back to steady-state levels, it does not affect whether the system is stable. Instead, stability can depend strongly on the global structural features of the interaction network. In particular, given the same number of proteins, TFs, number of interactions, and regulation strengths, a network with a lower fraction of interactions that target TFs has a higher chance of being stable. In the limit where there are no TF–TF interactions i.e. all TFs regulate proteins that are not TFs, it is possible for the system to remain stable for arbitrarily large system sizes, unlike random networks which become unstable when system size becomes too large. By scrambling the *E. coli*. transcription network, we find that the topology of real networks can stabilize the system since the randomized network with the same number of regulatory interactions is often unstable. These findings suggest that constraints imposed by system stability may have played a significant role in shaping the existing regulatory network during the evolutionary process. By carrying out the analysis for different physiological states the cell can be in (corresponding to different sets of dynamical equations) and with different choices of parameter distributions, we also show that our main results and conclusions are robust to the details of the model.

## Results

### The model

Gene expression involves two major steps: transcription and translation (Fig. 1a). Transcription is the process in which mRNA is synthesized by RNA polymerase using DNA as a template. The transcription rate of a gene *i* therefore depends on the number of RNA polymerases *n* and its effective gene copy number *g*_{i} which takes into account both its copy number and how strongly RNA polymerase can bind to the promoter of that gene^{11}. Due to the presence of TFs, \({g}_{i}(\overrightarrow{{\bf{c}}})\) can in general depend on the set of protein concentrations \(\overrightarrow{{\bf{c}}}\) (Fig. 1a). We assume that multiple TFs acting on the same gene act independently, with their effects stacking multiplicatively. This allows for both OR- and AND-gate-like combinatorial effects^{12}, and can emerge from a thermodynamic model of TF binding (Supplementary Note 1). Therefore, we adopt the following form for transcriptional regulation throughout the paper:

where *g*_{i0} is the effective gene copy number of *i* if it were unregulated (randomly drawn from a uniform distribution), and *γ*_{ij} controls the type and strength of regulation, i.e., how much gene expression of *i* changes in the presence of the TF *j*. In particular, *γ*_{ij} > 0 if *j* upregulates *i* and −1 ≤ *γ*_{ij} < 0 if *j* downregulates *i*. For each regulatory interaction, we assume that the fold-change Ω_{ij} is drawn from a uniform distribution between 1 and \({{{\Omega }}}_{\max }\), such that

since this would allow *g*_{i}(*c*_{j}) to increase (if *j* upregulates *i*) or decrease (if *j* downregulates *i*) by a factor of Ω_{ij} in the limit of high *c*_{j}. In Supplementary Note 5, we show that the main results do not depend on the particular distribution *P*(Ω) used.

Motivated by experimental measurements of the relationship between TF input and gene expression output showing a sigmoidal functional form of *f*_{ij}(*c*_{j})^{13,14}, we take it to be a Hill function

with the Hill coefficient *h* > 0.

Following ref. ^{11}, we assume a threshold number *n*_{c} of RNA polymerases above which the gene copy number is limiting the transcription rate (Fig. 1b). When this is the case, the transcription rate is proportional to *g*_{i} and is independent of *n*. If instead *n* < *n*_{c}, it is the RNA polymerases that are limiting, in which case the different genes have to compete for the limited pool of RNA polymerases. The transcription rate of a gene *i* is then proportional to both *n* and the fraction of RNA polymerases working on that gene, the gene allocation fraction:

Denoting the number of different genes by *N*, the dynamics of mRNA *m*_{i} for *i* = 1, ...*N* can therefore be described by the following equation:

where *k*_{m} characterizes the transcription rate of a single RNA polymerase, *τ*_{m} is the mRNA lifetime, and *n*_{s} is the maximum number of RNA polymerases per gene.

Similarly for the process of translation where ribosomes make proteins using mRNA as a template, the translation rate depends on the number of ribosomes *r* and the mRNA copy number *m*_{i}. As for RNA polymerases, there is also a threshold number of ribosomes *r*_{c} above which mRNA number is limiting and below which ribosomes are limiting (Fig. 1c). The dynamics of protein numbers *p*_{i} for *i* = 1, ..., *N*, with *p*_{N−1} = *n* corresponding to RNA polymerases and *p*_{N} = *r* corresponding to ribosomes, are therefore given by:

where *k*_{p} characterizes the translation rate of a single ribosome, *τ*_{p} is the protein lifetime, and *r*_{s} is the number of ribosomes per mRNA when ribosomes are in excess.

Depending on whether the RNA polymerases and ribosomes are limiting, there are four different cellular phases (Fig. 1b, c). The regime where *n* ≥ *n*_{c} and *r* ≥ *r*_{c} (phase 3 of the model, where the production rate of mRNAs and proteins are proportional to gene and mRNA copy numbers respectively) has been widely studied^{15,16,17}, but has been shown to be inconsistent with experimental observations in wild-type cells showing the exponential growth of protein levels^{18,19}. Instead, the regime where *n* < *n*_{c} and *r* < *r*_{c} (phase 1 of the model) is the one where wild-type fission yeast^{18} and mammalian cells appear to be in^{19}. We therefore focus on this phase for the rest of the paper. Note, however, that the phase 3 regime has been experimentally observed in defective budding yeast and mammalian cells that are excessively large^{20}, whereas the regime where RNA polymerases are in excess (*n* ≥ *n*_{c}) while ribosomes are limiting (*r* < *r*_{c}) (phase 2 of the model) has been observed in mutant fission yeast^{18}. We will address these two phases in the SI. The regime where *n* < *n*_{c} and *r* ≥ *r*_{c} (phase 4 of the model) is biologically unrealistic as ribosomes are typically more expensive to make compared to other proteins and hence having excess ribosomes while RNA polymerases are limited would be inefficient^{21,22}. This regime is therefore not considered.

It will be convenient to consider the dynamics of the concentrations of mRNAs \({c}_{mi}=\frac{{m}_{i}}{V}\) and proteins \({c}_{i}=\frac{{p}_{i}}{V}\). In bacteria^{23,24} and mammalian cells^{25}, the volume of the cell *V* is approximately proportional to the total protein mass. Hence, we assume for simplicity that each protein has the same mass and set the cell density to be 1, such that *V* = ∑_{i}*p*_{i}. The dynamics for concentrations in phase 1 are then given by:

where *c*_{mT} = ∑_{i}*c*_{mi} is the total concentration of all mRNAs and \(\frac{1}{\tau }=\frac{1}{{\tau }_{m}}-\frac{1}{{\tau }_{p}}\) is the difference between mRNA and protein degradation rates (which can be positive or negative). A summary of the list of model parameters can be found in Supplementary Table 1.

While these equations govern the dynamics of average concentrations and hence do not capture stochastic effects inherent in gene expression and in the binomial sampling of molecules during cell division, these fluctuations do not affect the average steady-state concentrations if the number of molecules is large (see Supplementary Note 2, Supplementary Fig. 1). In fact, these fluctuations can be considered as perturbations about steady-state values, and we investigate the stability of the system to such perturbations in the rest of the paper.

### Effects of network features and topology on stability of the system

To study how properties of the transcriptional regulatory network affect the stability of the system, we first consider the regime where the lifetime of mRNAs is much shorter than that of proteins, which is typically true for wild-type cells^{26}. In this limit of fast mRNA degradation, the relaxation dynamics of mRNA is much faster than that of proteins such that \(\frac{\mathrm{d}{c}_{mi}}{\mathrm{d}t}\approx 0\) at all times. Eliminating the fast process (by substituting the steady-state mRNA concentrations \({c}_{mi}=\frac{{k}_{m}{c}_{n}}{{k}_{p}{c}_{r}+\frac{1}{\tau }}{\phi }_{i}(\overrightarrow{{\bf{c}}})\) obtained from Eq. (9) into Eq. (10)), the dynamics of protein concentrations can be written as a set of *N* ODEs:

The stability of the system therefore depends only on the eigenvalues of the *N* × *N* Jacobian matrix \({\bf{A}}={k}_{p}{c}_{r}^{ss}({\bf{M}}-{\bf{I}})\), where we define the interaction matrix

with the steady-state protein concentrations given by \({c}_{i}^{ss}={\phi }_{i}({\overrightarrow{{\bf{c}}}}^{ss})\) (from Eq. (11)).

Denoting *λ*_{M} as the eigenvalues of **M**, the system is stable as long as the maximal real part of these eigenvalues \({\lambda }_{M,{r}_{\max }}\) is smaller than 1 (such that all eigenvalues of **A** have a negative real part). It is therefore useful to understand the structure of **M** by breaking it into two parts using Eq. (6):

where

captures the direct interactions between proteins, while

is a rank-1 matrix that captures the indirect interactions arising from competition for ribosomes.

It can be shown that both the structure of **M** (Eq. (13)) and the fact that stability only depends on **M** still hold in the other phases, despite the exact equations for protein dynamics being different (see Supplementary Note 3). Therefore, even though the simulations in the rest of this section are carried out in phase 1, our findings and conclusions also apply to the other phases.

#### Stability of the system scales with \(\sqrt{N}\) for random regulatory networks

We start by exploring the stability of fully random regulatory networks, which we take to be our null model.

Since the maximum eigenvalue of a random matrix depends on the standard deviation of its elements, we first carry out a naive estimate of how the elements of **M** scale with *N*. With *g*_{i}(*c*) given by Eq. (3),

Biologically, TF concentrations are often comparable to the values of dissociation constants *K*_{d} for DNA binding^{26}. Therefore, since *c*_{j} ~ 1/*N*, we also choose *K*_{ij} ~ 1/*N* (Eq. (5)), which would allow cells to maintain the full range of gene expression response. From Eq. (5), this implies that *f*_{ij} ~ *O*(1) and \(\frac{\partial {f}_{ij}}{\partial {c}_{j}} \sim N\), and hence *M*_{1} and *M*_{2} also scale with *N* (Eqs. (14), (15)). We therefore expect *M*_{ij} ~ *O*(1) (Eq. (13)), and hence (from RMT), for \({\lambda }_{M,{r}_{\max }}\) to scale approximately as \(\sqrt{N}\) for random interaction networks. \({\lambda }_{M,{r}_{\max }}\) also increases with the strength of the interactions *γ*, implying that the system will become unstable either when *N* exceeds a critical number or the regulation strength becomes too high. However, this argument neglects correlations between the elements of **M**, which could potentially be relevant. In fact, we will see in the later sections that the structure of **M** (Eq. (13)) plays an important role in influencing the stability of the system.

Therefore, to test if this scaling relation holds, we constructed networks of a specified interaction density *ρ* by randomly selecting *ρ**N*^{2} interactions from the *N*(*N* − 1) possibilities (where we have assumed that ribosomes cannot act as TFs), and choose half of the interactions to be upregulating with the remaining half being downregulating.

By taking the ensemble average over the randomly drawn networks, we indeed recover the \(\sqrt{N}\) scaling (Fig. 2a), which is also robust to the fraction of up- and downregulatory interactions (see Supplementary Note 4, Supplementary Fig. 2a) and the distribution of fold-changes *P*(Ω) (see Supplementary Note 5, Supplementary Fig. 3). For sufficiently large *N* or \({{{\Omega }}}_{\max }\), we can no longer find the fixed point of the system. Nevertheless, by simulating the dynamics, we find that for interaction networks of a given *N* and *ρ*, we get oscillatory, followed by chaotic behavior as \({{{\Omega }}}_{\max }\) is increased (Fig. 2b). Similar phenomena have also been described and analyzed in models of neural networks^{27} and ecological systems^{28}. While certain biochemical circuits have been known to generate oscillations such as in the cell cycle and the circadian clock, the oscillatory dynamics observed here is of a different nature—it does not come about from any specific fine-tuning of the network but, rather, emerges from having a large number of randomly and strongly interacting genes.

However, transcriptional regulatory networks are typically not random. Instead, they are enriched for distinct structural features such as the following motifs: feedforward loops (FFL), single-input module (SIM), and dense overlapping regulons (DOR) which do not contain any loops besides autoregulatory ones^{1,29}. In the next few subsections, we therefore explore the effects of network topology on system stability.

#### Random directed acyclic networks can also be unstable

Since transcription networks as a whole resemble directed acyclic graphs (DAGs)^{1,29}, we explore the stability of such networks.

In systems where the Jacobian matrix reflects the presence of direct interactions between components, the elements of the Jacobian matrix *A*_{ij} is 0 if *j* does not influence or regulate *i*. In such cases, if there are no interaction loops involving 2 or more components (e.g., E regulates F which also regulates E), **A** can be written as a triangular matrix for such a DAG and the eigenvalues are the diagonal elements of the matrix, i.e., the self-regulation loops. The system is therefore stable if there are no auto-activation among the components, i.e., there are no positive elements along the diagonal of **A**.

In our case, the presence of indirect interactions captured by the additional **M**_{2} matrix (Eq. (13)) implies that even if the regulation network is a DAG, the stability of the system is not determined solely by the self-regulation loops. Instead, we find that if we draw DAGs randomly (constructed by adding a connection only if the resultant network is still acyclic, Fig. 2c), even if there are no self interactions, the largest eigenvalue still scales approximately with \(\sqrt{N}\), suggesting that it is still possible for such a network to go unstable. Nevertheless, there is a negative offset in \({\lambda }_{M,{r}_{\max }}\) compared to the fully random case (Fig. 2a), implying that the lack of loops does help to stabilize the system.

#### Bipartite structure can maintain stability of large networks

A commonly found motif in the *Escherichia coli* transcription network is the dense-DORs which consist of a set of regulators that combinatorially control a set of output genes^{1,29,30}. There are several of these DORs in *E. coli*, each with hundreds of output genes, and they appear to occur in a single layer, i.e., there is no DOR at the output of another DOR. Such a structure can be thought of as a bipartite graph in which there are two types of nodes representing TFs and non-transcription factors (non-TFs), and every directed edge go from a TF to a non-TF. Since such graphs do not contain any regulatory loops (and are therefore also DAGs), we expect them to be more stable than random networks. However, they are a specific subset of DAGs in which none of the TFs are themselves regulated. This is also a key difference between these networks and bipartite, mutualistic networks commonly studied in ecological models^{9,10}. In this subsection, we investigate the stability of such networks.

To study this problem, we first group proteins into two categories: *q* TFs and *N*−*q* non-TFs, such that for any general network the components of the Jacobian matrix have the following structure:

where **T**_{1} (**T**_{2}) is a *q* × *q* matrix representing the direct (indirect) effect of TFs on TFs while **R**_{1} (**R**_{2}) is a (*N* − *q*) × *q* matrix representing the direct (indirect) effect of TFs on non-TFs, with their elements defined previously (Eqs. (13)–(15)). The non-zero eigenvalues of **M** are therefore the eigenvalues of the sub-matrix **Q** with elements:

When the network is sparse, each TF only regulates a small fraction of the total number of genes. Since *c*^{ss} ~ 1/*N*, the strength of indirect interactions are therefore typically much weaker than that of direct interactions (i.e., the non-zero elements of **M**_{2} are much smaller in magnitude than that of **M**_{1}, Eqs. (14), (15)).

When constructing random bipartite networks, we only allow TFs to regulate non-TFs (Fig. 3a), implying that **T**_{1} = **0**. The matrix **Q** therefore only consists of weak indirect interactions, and we expect the maximal eigenvalue to be smaller than that of random networks and DAGs. Moreover, since in this case **Q** is of rank-1, it has a unique real eigenvalue *λ*_{Q,b} which can be shown to be (see Supplementary Note 6):

where \(\frac{\partial \mathrm{log}\,{g}_{T}}{\partial {c}_{i}}=\mathop{\sum }\nolimits_{j = 1}^{N}{c}_{j}\frac{\partial \mathrm{log}\,{g}_{j}}{\partial {c}_{i}}\) as defined in Eq. (15) are the elements of the **M**_{2} matrix (and therefore small when the interaction density is low). The maximum eigenvalue of the interaction matrix **M** is then given by \({\lambda }_{M,b}=\max ({\lambda }_{Q,b},0)\), since 0 is also an eigenvalue of **M** (see Eqs. (17, 18)).

This expression (Eq. (20)) implies that unlike for fully random networks and random DAGs, the stability of bipartite networks can depend strongly on the ratio of up- and downregulating interactions (see Supplementary Note 4). In particular, there is a limit on the total strength of down-regulation (relative to that of up-regulation) for the system to be stable. For example, if the majority of the interactions are upregulating, *λ*_{Q,b} should be negative and hence *λ*_{M,b} must be 0. On the other hand, *λ*_{M,b} must be positive when the fraction of downregulations is sufficiently high. This tendency for inhibitory (activating) interactions to destabilize (stabilize) the system comes from the indirect effect that a regulator has on itself: a slight increase in the concentration of an inhibitor from its steady-state value will reduce the gene copy number and hence mRNA levels of the regulated gene. The mRNAs of the inhibitor therefore make up a larger fraction of the total mRNA in the cell. Since all mRNAs compete for the shared pool of ribosomes, this in turn causes the inhibitor concentrations to increase further. This positive feedback also exists in the other phases, although its physical origin may be different (see Supplementary Note 4, Supplementary Fig. 2b).

Indeed, by numerically constructing multiple instances of a bipartite network and varying the fraction of inhibitory interactions *P*_{neg}, we find that *λ*_{M,b} = 0 when *P*_{neg} is below a critical value that is approximately (but slightly greater than) 0.5 (Fig. 3b). Importantly, within this regime, the value of *λ*_{M,b} = 0 is independent of both *N* and the strength of interactions \({{{\Omega }}}_{\max }\) (Fig. 3b, c). This suggests that such a bipartite network structure can help to maintain and enhance the stability of the system, especially for large *N* and \({{{\Omega }}}_{\max }\).

#### Scrambling the interactions of E. coli transcriptional regulatory network can destabilize the system

Real transcription networks, however, are not strictly bipartite graphs—there are autoregulatory elements as well as TFs that regulate other TFs. To investigate how relevant network stability is to biological networks, we obtained the *E. coli* transcriptional regulatory network from ref. ^{31}. The network consists of *u* = 5654 regulatory interactions (of which *u*_{p} = 3187 are upregulating), with *q* = 211 TFs regulating *N* = 2274 genes. We compared its stability with that of randomly constructed networks with the same *N*, density of interactions \(\rho =\frac{u}{{N}^{2}}\approx 0.0011\), and ratio of positive (activating) to negative (inhibitory) regulation.

We first explored two different ways of scrambling the original network: (1) randomly choosing *u* directed connections out of the *N*(*N* − 1) possible connections, and (2) fixing the number of TFs *q* and randomly choosing *u* directed connections out of *q**N* possibilities. The second method of scrambling is motivated by the fact that *q* ≪ *N* and the stability of the system is governed solely by the *q* × *q* matrix **Q** representing how TFs affect TFs (Eq. (19)). For each drawn interaction network, we randomly choose *u*_{p} of the interactions to be upregulating (*γ*_{ij} > 0) and the rest to be downregulating (*γ*_{ij} < 0). We draw the fold-change Ω_{ij} of each regulatory interaction from a uniform distribution between 1 and \({{{\Omega }}}_{\max }=1000\). This choice of \({{{\Omega }}}_{\max }\) is motivated by the fact that TFs have been shown experimentally to change target protein levels by 100–1000 fold^{13}.

We find that with the real network, the system always converges to a stable fixed-point regardless of the regulation strengths (Fig. 4a). In contrast, for the randomly constructed networks (both with and without keeping *q* fixed), the probability of the system becoming unstable drastically increases when the interactions become too strong (Fig. 4a). This loss of a stable fixed point can give rise to either an oscillatory (Fig. 4b) or chaotic behavior (Fig. 4c). This suggests that for typical regulation strengths and density, the interaction network cannot be random, and that certain structural features of real networks are important for stability.

#### Network stability depends on the density of TF–TF interactions

Since it is the maximal eigenvalue of the *q* × *q* sub-matrix **Q** (Eq. (19)) that determines the stability of the system, and direct regulatory interactions are typically stronger than the indirect background effects, we expect a higher density of direct interactions among TFs to destabilize the system. This suggests that what matters for stability is not only the number of TFs and the total number of regulatory interactions, but also the fraction of those interactions that target TFs.

We therefore analyzed the composition of regulatory interactions in the *E. coli* transcription network, and found that there are (i) *u*_{s} = 134 self-regulations (of which 42 are activating), (ii) *u*_{t} = 373 TF-other TF interactions (of which 201 are activating), and (iii) *u*_{n} = *u* − *u*_{s} − *u*_{t} = 5148 TF-nonTF interactions (of which 2944 are activating) (Fig. 5a). In comparison, the scrambling method that maintained both the number of TFs and the total number of interactions gives a smaller number of self-interactions (〈*u*_{s}〉 = 2.5) and a larger number of direct TF-other TF interactions (〈*u*_{t}〉 = 522) than in the real network.

To investigate if this could be the origin of the enhanced stability of the *E. coli* regulatory network, we tried another scrambling method with the composition of the interactions kept fixed. In particular, after setting the first *q* = 211 (out of *N* = 2274) proteins to be TFs, we randomly draw the numbers of interaction pairs within the three categories (self, TF-otherTF, and TF-nonTF) by choosing each TF and its target separately. The sign of the interactions are then randomly assigned while maintaining the fraction of positive/negative interactions within each of these categories. We find that this scrambling procedure, which fixes the composition of regulatory interactions (in addition to *N*, *q,* and *ρ*), significantly increases the probability of the network having a stable fixed point (Fig. 4a).

Direct interactions among TFs can either be auto-regulatory loops or TFs regulating other TFs. We explored the effects of both of these factors, and found that assuming up- and downregulations to be equally likely, a random network is almost always stable when the density of TF-other TF interactions \({\rho }_{q}=\frac{{u}_{t}}{q(q-1)}\) is sufficiently low (Fig. 5b). Above this threshold value of *ρ*_{q}, the probability of the system not exhibiting a stable steady-state increases with *ρ*_{q} (Fig. 5b). This effect is observed regardless of the number of self-interactions or whether *u*_{n} is kept fixed (Fig. 5b).

While this implies that systems with a small number of TF–TF interactions are almost always stable, it does not mean that having a high density of TF–TF interactions will necessarily lead to an unstable system. This can be seen from the fact the probability of the system is stable does not drop sharply with *ρ*_{q} (Fig. 5b)—there are still systems with a relatively high density of TF–TF interactions that are still stable. This suggests that in the high *ρ*_{q} regime, the details of the interactions become important. For such a network with a large number of TF–TF interactions to be stable, the type and strength of those interactions will need to be more fine-tuned.

The phenomenon that a small *ρ*_{q} promotes stability is consistent with the stability of bipartite networks (*ρ*_{q} = 0) and the fact that direct regulatory interactions are typically much stronger than the indirect background interactions. Nevertheless, since **Q** (which has contributions from both **T**_{1} and **T**_{2}, Eq. (19)) is not a sparse matrix even when *ρ*_{q} is small, we do not expect the maximal eigenvalue \({\lambda }_{M,{r}_{\max }}\) to scale with *ρ*_{q} the way it does for a *q* × *q* random matrix with density *ρ*_{q}. Indeed, we find numerically that the presence of **T**_{2} can affect \({\lambda }_{M,{r}_{\max }}\) (Supplementary Note 7, Supplementary Fig. 4), suggesting that the indirect coupling between proteins can also play a role in influencing the stability of the system.

### Effect of degradation rates on protein level stability

So far, we have been working in the limit of fast mRNA degradation, where the stability of the system is governed only by the interaction matrix **M** (Eq. (12)). In this regime, since **M** is independent of degradation rates 1/*τ*_{m} and 1/*τ*_{p} (see Eqs. (12, 6, 3)), these do not affect whether the system is stable. The relaxation rates are also independent of *τ*_{m} and *τ*_{p}, with the relaxation rate in the absence of interactions given by (from Eq. (11)):

However, it is not clear if this insensitivity (of both stability and relaxation rates) to *τ*_{m} and *τ*_{p} still holds outside of the *τ*_{m} ≪ *τ*_{p} regime. Within the framework of RMT, a more negative self-regulation term typically increases the relaxation rate and hence has a stabilizing effect^{2}. Here, we ask if this is the case by investigating how mRNA and protein degradation rates affect the stability of the system and its relaxation timescale. In particular, can faster mRNA degradation rates help to stabilize a system that would otherwise be unstable if mRNAs degrade too slowly?

#### Values of mRNA and protein degradation rates do not affect whether the system is stable

To investigate how the degradation rates of proteins and mRNAs affect the stability of the system when *τ*_{m} is not too small, here we consider the full set of 2*N* equations (Eqs. (9, 10)) and study how the eigenvalues of the (2*N* × 2*N*) Jacobian matrix **J** varies with *τ*_{m} and *τ*_{p}.

To compare the relaxation rates of the full system with the protein relaxation rates when there are no interactions, we work with the transformed Jacobian matrix:

For an arbitrary regulatory network with a corresponding interaction matrix **M** (Eq. (12)), we find that the eigenvalues \(\tilde{\lambda }\) of \(\tilde{{\bf{J}}}\) are given by (see Supplementary Note 3):

where *λ*_{M} are the eigenvalues of **M** as before, and *ω* is a dimensionless quantity given by:

which reflects the difference between mRNA and protein degradation rates \(\left(\frac{1}{\tau }=\frac{1}{{\tau }_{m}}-\frac{1}{{\tau }_{p}}\right)\).

Since on average cell volume increases exponentially with rate (see Eq. (8)):

a growing cell has to satisfy the condition \(\frac{1}{{\tau }_{p}{k}_{p}{\phi }_{r}} \, < \, 1\). Therefore, since *τ*_{m }≥ 0, we have *ω* ≥ −1. The expression for \(\tilde{\lambda }\) (Eq. (23)) therefore implies that the system is stable if and only if \({\lambda }_{M,{r}_{\max }}\le 1\), regardless of the value of *τ*_{m} and *τ*_{p} (Fig. 6a). We find that despite differences in the details of the model, this conclusion still holds in the other phases (see Supplementary Note 3).

Therefore, unlike what has been argued in the literature and what one might expect from RMT, changing mRNA nor protein degradation rates has no effect on whether the overall system is stable. If steady-state protein concentrations are unstable because \({\lambda }_{M,{r}_{\max }}\) is too large (e.g., when interactions are too strong), increasing mRNA or protein degradation rates can never help to stabilize the system.

Importantly, this finding also implies that our results for how structural features of the transcription network affects stability holds outside the regime of fast mRNA degradation, since stability only depends on **M**.

#### Increasing mRNA degradation rate can improve response times, but only up to some limit

Besides system stability, another quantity of biological interest is the response time of the system to perturbations, which is especially relevant for cells experiencing changes in nutrient conditions^{32,33}. Since this relaxation timescale is determined by the slowest eigenvalue of the Jacobian matrix, here we discuss how the maximal real part of the eigenvalues \({\tilde{\lambda }}_{{r}_{\max }}\) changes with *τ*.

The expression for \(\tilde{\lambda }\) (Eq. (23)) implies that when the system is stable (\({\lambda }_{M,{r}_{\max }}<1\)), the rate at which the system relaxes to steady-state initially increases as *ω* increases from −1, but eventually plateau off − in the *ω* → *∞* limit (where *τ*_{m} ≪ *τ*_{p}), \(\tilde{\lambda }\to {\lambda }_{M}-1\) (Eq. (23), Fig. 6a). This implies that there is some benefit to having fast mRNA degradation in terms of response times, but once mRNA degrades much faster than proteins, further increasing mRNA degradation rate no longer affects the response time of the system. The eigenvalue spectrum in this *τ*_{m} ≪ *τ*_{p} limit appears to consist of two circular regions, one for the dynamics of mRNAs and the other for that of proteins (Fig. 6b), reminiscent of the RMT’s circular law. Increasing *τ*_{m} only shifts the eigenvalues corresponding to the mRNA sector and hence does not affect \({\tilde{\lambda }}_{{r}_{\max }}\). This is consistent with the fact that when *τ*_{m} ≪ *τ*_{p}, the dynamics of the overall system is governed only by the protein sector (Eq. (11)). Therefore, the slowest relaxation rate back to steady-state levels depends only on **M** and increasing mRNA degradation rate no longer improves the response time.

## Discussion

In systems with a large number of interacting components, the question of stability is often an important one, as results from RMT predict instability when the system size *N* becomes too large or interactions become too strong. In the context of gene expression, transcriptional regulation is crucial for cells to adapt to different environmental conditions by changing their gene expression levels. It is therefore important for transcriptional regulatory networks (TRNs) to be able to accommodate a large number of regulatory interactions without the system going unstable. However, we find here that similar to the intuition provided by RMT, \(\lambda \sim \sqrt{N}\) for a fully random regulation network, suggesting that the system will go unstable as the number of genes exceeds a threshold. In fact, based on typical values for the density of actual regulatory networks and interaction strengths, we find that the system has a high probability of being unstable if the TRN is randomly constructed.

Besides the number of genes, and the density and strengths of interactions, there are other factors that can affect the stability of the system, one of which is the network topology. This aspect is particularly relevant in this system since TRNs are far from being random but instead consist of recurring motifs. While the properties of these specific motifs have been widely studied and shown to be important for specific functions such as adaptation, robustness, and fast response to environmental changes^{1,29,30}, how they contribute to the overall stability of the network is less clear. We find here that global structural features of the network, which are fundamentally shaped by many of these motifs, can play a huge role in determining the stability of the system. In particular, given the same number of proteins, TFs, interaction density, and regulation strengths, a network that resembles a bipartite graph with a lower density of TF-otherTF interactions *ρ*_{q} has a higher chance of being stable. The significance of *ρ*_{q} fundamentally arises because of two main factors: (i) the eigenvalues of the Jacobian matrix and hence the stability of the system about its steady-state are governed only by the TF sector (i.e., how perturbations in TF concentrations affect TFs), and (ii) for a sparse regulatory network, the indirect background interactions arising from competition for ribosomes between different genes are typically much weaker than the direct regulatory interactions.

TRNs are also known to be scale-free, having a power-law out-degree distribution. This is consistent with the fact that most TFs only regulate a small number of genes, but there are TFs (often referred to as master regulators) that regulate a very large number of genes. Within a more abstract model of gene regulatory dynamics, the presence of these outgoing hubs has been shown to significantly increase the probability of the system reaching a stable target phenotype when the interaction strengths are allowed to vary while the network topology is kept fixed^{34}. Here, we find that having a low *ρ*_{q} can already significantly stabilize the system without the need to control the degree distributions. Nevertheless, having just a few master regulators may contribute to the network having a low *ρ*_{q} if for instance most of the regulations on TFs are carried out by the master regulators (and non-master regulators predominantly regulate non-TFs).

Besides the structural features of the network, another factor that could affect stability is the degradation rates of mRNA and proteins. Based on RMT, one may expect faster degradation to stabilize the system. This has in fact been argued to be the case^{5,6}. However, by taking into account the dynamics of protein concentrations and how it couples to the dynamics of mRNA levels, we find that this is not the case. Instead, the stability of the system depends solely on the regulatory network and the strengths of those regulations—if the system is unstable, it will be unstable regardless of how fast mRNA or protein degrades. This highlights the importance of taking into account key aspects of the interactions (through the form of the dynamical equations) when analyzing the stability of large coupled systems, similar in spirit to studies of ecological models where explicitly considering interactions mediated through competition for nutrients can give drastically different results from assuming random pairwise interactions between species^{7}. This prediction can also potentially be tested in the lab by varying the degradation rates of mRNAs (e.g., by using genetically modified RNases) or proteins (e.g., by using genetically modified proteases) in the cell and observing the dynamics of protein concentrations.

From an evolutionary perspective, there are many possible factors (such as the range of gene expression levels, environmental conditions, response time^{32,33}, level of unwanted crosstalk^{35}, etc.) that drive the addition or removal of regulatory connections. Our findings suggest that in addition to these considerations, another fundamental factor is the stability of the overall network. For example, there could be many ways of achieving a certain task such as allowing the cell to switch between two desired gene expression levels in two different nutrient conditions, but the only ones that can survive are those that also maintain the stability of the system. In other words, the stability of the system may have played a role in shaping current existing regulatory networks through the evolutionary process. Our approach can therefore provide insights into the design and evolutionary constraints for a functional regulatory network, which may potentially be useful for guiding the construction of synthetic genetic circuits^{36,37,38}. In the future, the ability to experimentally engineer a large, random regulatory circuit within cells could also allow testing of the results we have described.

In addition to transcriptional regulation, gene expression is also regulated at the post-transcriptional (e.g., through small-RNAs or micro-RNAs) and post-translational (e.g., through post-translational modifications) level. Our framework can be extended to take into account these effects (see Supplementary Note 8 for an example). How the stability of the system is affected by the coupling between these different forms of regulation with potentially different network structures is an interesting question that we leave for future work. Besides stability (determined by the eigenvalues of **J**), in the future, it could also be instructive to investigate the spread of perturbations within the regulatory network (i.e., the eigenvectors of **J**). This is analogous to the study of how concentration perturbations propagate in protein–protein interaction networks within the cell^{39}.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

The *E. coli*. transcriptional regulatory network data that support the findings of this study is available in the supplementary files of the paper (ref. ^{31}): https://doi.org/10.1073/pnas.1702581114. This data used for analysis is also available in a MATLAB data file on GitHub repository^{40}: https://github.com/yipeiguo/TRNstability.

## Code availability

All simulations and data analysis are carried out using codes written in MATLAB R2019a. These can be found on GitHub repository^{40}: https://github.com/yipeiguo/TRNstability.

## Change history

### 09 February 2021

A Correction to this paper has been published: https://doi.org/10.1038/s41467-021-21415-w

## References

Alon, U.

*An Introduction to Systems Biology: Design Principles of Biological Circuits*. (CRC press, 2019).May, R. M. Will a large complex system be stable?

*Nature***238**, 413 (1972).Ginibre, J. Statistical ensembles of complex, quaternion, and real matrices.

*J. Math. Phys.***6**, 440–449 (1965).Girko, V. L. Circular law.

*Theory Probab. Appl.***29**, 694–706 (1985).Chen, Y., Shen, Y., Lin, P., Tong, D. & Zhao, Y. et al. Gene regulatory network stabilized by pervasive weak repressions-microrna functions revealed by the may-wigner theory.

*Natl Sci. Rev.***6**, 1176–1188 (2019).Zhao, Y., Shen, X., Tang, T. & Wu, C.-I. Weak regulation of many targets is cumulatively powerful-?an evolutionary perspective on microrna functionality.

*Mol. Biol. Evolution***34**, 3041–3046 (2017).Cui, W., Marsland, W. III & Mehta, P. Diverse communities behave like typical random ecosystems. Preprint at https://www.biorxiv.org/content/10.1101/596551v2.full (2019).

Allesina, S. & Tang, S. Stability criteria for complex ecosystems.

*Nature***483**, 205 (2012).Thébault, E. & Fontaine, C. Stability of ecological communities and the architecture of mutualistic and trophic networks.

*Science***329**, 853–856 (2010).Okuyama, T. & Holland, J. N. Network structural properties mediate the stability of mutualistic communities.

*Ecol. Lett.***11**, 208–216 (2008).Lin, J. & Amir, A. Homeostasis of protein and mrna concentrations in growing cells.

*Nat. Commun.***9**, 4496 (2018).Buchler, N. E., Gerland, U. & Hwa, T. On schemes of combinatorial transcription logic.

*Proc. Natl Acad. Sci. USA***100**, 5136–5141 (2003).Kuhlman, T., Zhang, Z., Saier, M. H. & Hwa, T. Combinatorial transcriptional control of the lactose operon of escherichia coli.

*Proc. Natl Acad. Sci. USA***104**, 6043–6048 (2007).Kim, H. D. & Oashea, E. K. A quantitative model of transcription factor-activated gene expression.

*Nat. Struct. Mol. Biol.***15**, 1192 (2008).Paulsson, J. Models of stochastic gene expression.

*Phys. Life Rev.***2**, 157–175 (2005).Shahrezaei, V. & Swain, P. S. Analytical distributions for stochastic gene expression.

*Proc. Natl Acad. Sci. USA***105**, 17256–17261 (2008).Thattai, M. & Oudenaarden, A. Van Intrinsic noise in gene regulatory networks.

*Proc. Natl Acad. Sci. USA***98**, 8614–8619 (2001).Zhurinsky, J. et al. A coordinated global control over cellular transcription.

*Current Biol.***20**, 2010–2015 (2010).Schmidt, E. E. & Schibler, U. Cell size regulation, a mechanism that controls cellular rna accumulation: consequences on regulation of the ubiquitous transcription factors oct1 and nf-y and the liver-enriched transcription factor dbp.

*J. Cell Biol.***128**, 467–483 (1995).Neurohr, G. E. et al. Excessive cell growth causes cytoplasm dilution and contributes to senescence.

*Cell***176**, 1083–1097 (2019).Reuveni, S., Ehrenberg, M. & Paulsson, J. Ribosomes are optimized for autocatalytic production.

*Nature***547**, 293 (2017).Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z. & Hwa, T. Interdependence of cell growth and gene expression: origins and consequences.

*Science***330**, 1099–1102 (2010).Kubitschek, H. E., Baldwin, W. W., Schroeter, S. J. & Graetzer, R. Independence of buoyant cell density and growth rate in escherichia coli.

*J. Bacteriol.***158**, 296–299 (1984).Basan, M. et al. Inflating bacterial cells by increased protein synthesis.

*Mol. Syst. Biol.***11**, 836 (2015).Crissman, H. A. & Steinkamp, J. A. Rapid, simultaneous measurement of dna, protein, and cell volume in single cells from large mammalian cell populations.

*J. Cell Biol.***59**, 766 (1973).Milo, R. & Phillips, R.

*Cell Biology by the Numbers*. (Garland Science, 2015).Sompolinsky, H., Crisanti, A. & Sommers, H.-J. Chaos in random neural networks.

*Phys. Rev. Lett.***61**, 259 (1988).Roy, F., Biroli, G., Bunin, G. & Cammarota, C. Numerical implementation of dynamical mean field theory for disordered systems: application to the lotka-volterra model of ecosystems.

*J. Phys. A: Math. Theor.***52**, 484001 (2019).Shen-Orr, S. S., Milo, R., Mangan, S. & Alon, U. Network motifs in the transcriptional regulation network of escherichia coli.

*Nat. Genet.***31**, 64 (2002).Alon, U. Network motifs: theory and experimental approaches.

*Nat. Rev. Genet.***8**, 450–461 (2007).Fang, X., Sastry, A., Mih, N., Kim, D. & Tan, J. et al. Global transcriptional regulatory network for

*Escherichia coli*robustly connects gene expression to transcription factor activities.*Proc. Natl Acad. Sci. USA***114**, 10286–10291 (2017).van Heerden, J. H. et al. Lost in transition: start-up of glycolysis yields subpopulations of nongrowing cells.

*Science***343**, 1245114 (2014).Erickson, D. W. et al. A global resource allocation strategy governs growth transition kinetics of

*Escherichia coli*.*Nature***551**, 119–123 (2017).Schreier, H. I., Soen, Y. & Brenner, N. Exploratory adaptation in large random networks.

*Nat. Commun.***8**, 1–9 (2017).Friedlander, T., Prizak, R., Guet, C. C., Barton, N. H. & Tkačik, G. Intrinsic limits to gene regulation by global crosstalk.

*Nat. Commun.***7**, 12307 (2016).Adamala, K. P., Martin-Alarcon, D. A., Guthrie-Honea, K. R. & Boyden, E. S. Engineering genetic circuit interactions within and between synthetic minimal cells.

*Nat. Chem.***9**, 431 (2017).Ellis, T., Wang, X. & Collins, J. J. Diversity-based, model-guided construction of synthetic gene networks with predicted functions.

*Nat. Biotechnol.***27**, 465–471 (2009).Noireaux, V., Maeda, Y. T. & Libchaber, A. Development of an artificial cell, from self-organization to computation and self-reproduction.

*Proc. Natl Acad. Sci. USA***108**, 3473–3480 (2011).Maslov, S., Sneppen, K. & Ispolatov, I. Spreading out of perturbations in reversible reaction networks.

*N. J. Phys.***9**, 273 (2007).Guo, Y. & Amir, A. Exploring the effect of network topology, mRNA and protein dynamics on gene regulatory network stability.

*GitHub repository*, DOI:10.5281/zenodo.4294914, (2020).

## Acknowledgements

We thank Rui Fang, Jie Lin, Haim Sompolinsky, Grace Zhang, David Nelson, Naama Brenner, and Guy Bunin for useful discussions and feedback. This research was supported by the National Science Foundation through MRSEC DMR 14-20570, the Kavli Foundation, and the NSF CAREER 1752024.

## Author information

### Authors and Affiliations

### Contributions

Y.G. and A.A. designed research, performed research, and wrote the paper.

### Corresponding author

## Ethics declarations

### Competing interests

All authors declare no competing interests.

## Additional information

**Peer review information** *Nature Communications* thanks Tamar Friedlander and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary information

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Guo, Y., Amir, A. Exploring the effect of network topology, mRNA and protein dynamics on gene regulatory network stability.
*Nat Commun* **12, **130 (2021). https://doi.org/10.1038/s41467-020-20472-x

Received:

Accepted:

Published:

DOI: https://doi.org/10.1038/s41467-020-20472-x

## Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.