On chaotic dynamics in transcription factors and the associated effects in differential gene regulation

The control of proteins by a transcription factor with periodically varying concentration exhibits intriguing dynamical behaviour. Even though it is accepted that transcription factors vary their dynamics in response to different situations, insight into how this affects downstream genes is lacking. Here, we investigate how oscillations and chaotic dynamics in the transcription factor NF-κB can affect downstream protein production. We describe how it is possible to control the effective dynamics of the transcription factor by stimulating it with an oscillating ligand. We find that chaotic dynamics modulates gene expression and up-regulates certain families of low-affinity genes, even in the presence of extrinsic and intrinsic noise. Furthermore, this leads to an increase in the production of protein complexes and the efficiency of their assembly. Finally, we show how chaotic dynamics creates a heterogeneous population of cell states, and describe how this can be beneficial in multi-toxic environments.

the rate at which inactive IKK is made neutral. All the parameters used in the NF-κB model are found in the Supplementary  Table I: Default values of parameters in the model. The first 9 are from Ref. [1] and the next 4 from Ref. [2].
[IKK]tot and [A20] were chosen in order to obtain sustained spiky oscillations with frequency in the range 0.3-1 hr −1 when [T N F ] is kept fixed at 0.5 (the actual frequency obtained with these values is ν0 = 1/1.8 hr −1 .) N n = k N in (N tot − N n ) K I K I + I − k Iin I N n K N + N n (1) T N F = 0.5 + Asin( 2π T t) A schematic version of this network is shown in Figure 1A, and a more complete discussion is presented in [3]. To describe the impact of dynamically varying concentrations of a transcription factor, we turn to the description of production from genes that are influenced by this. We describe the protein production of a protein P i with the equations:ṁ Here, the m i represents the mRNA of gene i, and P i represents the protein level of gene i. All parameter values in the downstream model can be found in Supplementary Table II: This means that the transcription factors will enhance the mRNA production through sigmoid shaped curves as seen in Fig. 1E, which we term stimulation profiles. We stress that in the sections to follow, the Hill functions are not important in themselves, and the results are valid for genes that have different sigmoidal shaped stimulation profiles.
In the Gillespie algorithm we consider a volume V, with a spatially uniform mixture of N chemical species that can react through M different reactions, R 1 ...R M . The number of each of the species is denoted X 1 ...X N . At t = 0, we thus consider the initial number of molecules and calculates all reactions. The first goal is now to calculate the PDF, for the time until the next reaction occur We evaluate the probability that the next reaction is of type , and it occurs in the time-interval [t + τ, t + τ + dt].
We therefore consider: Therefore we want to describe P not (τ ) in terms of the rates. For an infinitesimal time step the probability for no reaction to appear is: We can thus define τ ≡ n · dt and then: This means that we should generate a random number according to the exponential distribution, and a random number according to a uniform distribution. Here one can use the transformation method, and we can then create the update process, where at each step we jump a step in time τ to next reaction, and picks the reaction according to r. Schematically the Gillespie algorithm can be described as: • Pick two random numbers, ν 1 and ν 2 . Calculate time until next reaction: Pick the next reaction: • Update the system according to the chosen reaction.
In this way the system can be updated, and adjusting the reactions in each time step. In Supplementary Figure   1A, we show simulations of the oscillations in NF-κB using the Gillespie algorithm, for the same initial conditions but in different volumes which results in different noise levels.

Transcription Factors and Hill Functions
The way that transcription factors regulate transcription is still far from well understood. However it is well established that not always only one transcription factor is needed, but rather a complex of several transcription factors act together in order to efficiently bind the polymerase and start transcription. In Supplementary Figure   1B we show a schematic figure, where three transcription factors are needed in order to bind the polymerase to the promoter region. To establish an equation for this, we consider the binding of n Transcription factors T to a position in on the genome E. We can write this process as: From this we can define the dissociation constant as : Now we want to consider the probability to be bound: Then we can insert the definition of the dissociation constant to obtain a Hill function on the form: In Supplementary Figure 1C we see the different output of the Hill function for different values of the parameters K and n. This equation is typically used to describe the production of mRNA for a system, since we describe the probability to bind to a promoter. Note that as the Hill coefficient (above n, often it is defined as h) increases, a switch like behaviour emerges for the output function and that the higher the value of K, the longer before the curve starts to rise. Many times in experiments, switch-like behaviours are found, and here the Hill equation is usually used to model this; also even though the theoretical value of n, is unknown [4]. One point to mention is here that even though a complex might consist for instance of 3 molecules, from experiments this value vary. One example is the binding of haemoglobin which should bind 4 molecules, but where experiments give a Hill coefficient of around 3.0.
The constant K, should be understood as a measure for the concentration of transcription factors to occupy half of the binding sites. In general Hill functions are extremely useful to model gene regulatory functions, because they have many of the experimentally-observed required characteristics [5,6], and suggest a natural way to implement sigmoidal signals to the mathematical description of genetic networks.

Protein Levels at Other Parameters
In addition to what is found in Figure 2, it is important to consider the profiles for the average number of proteins for other TNF frequencies in order to test the validity and robustness of the presented results. In Supplementary  oscillated TNF as a Van der Pol oscillator with Langevin noise. This is described by the dynamical system: With η(t) being uncorrelated white noise. Thereby we tested different TNF waveforms and calculated the ratio of protein in the chaotic regime vs protein in the flat TNF regime (Supplementary Figure 2I). Here we find that for all the tested wave forms, chaotic up regulates the protein production with at least a factor of 2. We note that the highest effect is found when TNF is oscillated as a square pulse.
In this work the main focus have been on two genes defined by a specific value of h and K. We found (Supplementary Figure 2J) that for for increasing hill coefficients and decreasing affinity the chaotic had an increasing ratio compared to the limit cycle regime. Thus for a large interval of parameters, chaotic dynamics results in increased protein levels and if we consider genes even higher values of h and K (we cannot go much lower that Gene 1), the results are even more significant and the production for a gene with h = 6 and K = 7µM has a 6 fold increase in chaos versus for flat TNF (Supplementary Figure 2H). Finally we tested how the distribution of NF-κB acutally changes in the different regimes and therefore give rise to the profiles found in Fig The equation for the m RN A is where (23) Because the NF-kB concentration N(t) is periodic in time with period T, therefore f(t) is also periodic with period T. To begin, let us examine the case where f(t) is composed of only a single mode with a non-zero average σ: In that case, the solution of the equation for m RN A is: In general of course f(t) is a Fourier sum of many modes, but since the equation for m RN A is linear in m, the resultant solution is: Here, the k = 0 term takes care of the σ term in the previous single-mode case. The argument can be easily extended to the protein, because the dP/dt equation is also linear in P and so each mode in m(t) will contribute additively to P(t) (just like each mode in f(t) contributes additively to m(t)). In order to understand how average expression of the gene depends on various parameters, we simply need to look at the k = 0 term, because averaged over time: Now we consider the two cases where K is very small and very large respectively. For small K we expand the fraction: Therefore this will be dominated by the smallest values of N(t). As we see in Supplementary Figure 2K these occurs in the chaotic state, and therefore the average production in total will be smaller for chaotic than for the limit cycle case. If we instead consider the case with very large K, we make the expansion This will clearly be dominated by the largest values of N(t) and as can be seen in Supplementary Figure 2K these also occurs for the chaotic state. In this case the chaotic dynamics thus increases the average production of genes with large K. These arguments explains the results we find in Fig. 2 in the main text from an analytical point of view in the extreme cases for the affinities of genes.

Overview of Complex Formation
In the following the ratios shown in Fig. 3G-I are shown. Note that each column i defined by the number og HAGs in the complex.
For the presented results in Fig. 3G we used the numbers presented in the following matrix: For the presented results in Fig. 3H we used the numbers presented in the following matrix: For the presented results in Fig. 3I we used the numbers presented in the following matrix: 0/n 1/n 2/n 3/n 4/n 5/n 6/n 7/n 8/n 9/n 10/n

Toxicity at Other Parameters
To explore the results shown in Figure 4, we tested the distribution of proteins inside the limit cycle and chaotic dynamics. In Supplementary Figure 3A-D we show how the distribution of proteins inside the chaotic regime is much broader than for the oscillatory dynamcis. We fixed the production ratios so the average sum of the two proteins should be equal (Supplementary Figure 3C), but in the chaotic region the mixture is broader and therefore the product of the two proteins is significantly higher in the chaotic region. This is one of the explanations of why chaos increases the survival rate when several external stresses are present.
In addition to what is found in Figure 4, we consider the death rates for a series of different parameters. First, we checked the results for another TNF frequency In Supplementary Figure 3E-H we observe the situations similar to  Figure 3G-H. These results might seem surprising, but there is a simple intuition of these, following the slightly simplified explanation. We defined the rate at which a cell die to be: As we defined in the text we normalized the proteins, and by adjusting their production profiles we fixed the number of total proteins. This can be seen in Supplementary Figure 3C (above), where we see that the sum of proteins has the same mean, even though the spread in the chaotic distribution is larger. Expanding the death rate (and just setting h=1), we find that: Here we note that the terms in the nominator and denominator are almost identical, but there is a product of the proteins only occurring in the denominator. Looking at the distribution of this (Supplementary Figure 3D) we see that this is significantly larger for the chaotic dynamics, since there is a mixture of represented proteins. Therefore the denominator will be larger in the chaotic regime and thus the death rate will in general be lower for the cells if  Supplementary Figure 3: Population heterogeneity and survival rate at different parameters. A) Heat map showing Protein levels during stochastic simulation when dynamics is a limit cycle. B) Heat map showing Protein levels during stochastic simulation when dynamics is chaotic. C) Sum of proteins in stochastic simulations when dynamics is in chaotic and limit cycle regime. D) Product of proteins in stochastic simulations when dynamics is in chaotic and limit cycle regime. E) Remaining cells vs time (Drug is added at T=4000 min). D1 = 3000, D2 = 3000. Below the ratio of living cells for chaos divided by limit cycle or modehopping dynamics. F) Same as A withḊ1+2 = N (0, 100.0) and D1+2(0) = 3000. The panel below shows a specific trajectory on this pattern. In general D1 is above D2 50% of the times and vice versa. Panels same as E. G) Same as A with D1+2(t) = 3000 + 1500 · sin( t 5000 + Ω) Panels same as E. H) Same as A with D1+2(t) = 7000 if sin( t 5000 + Ω) > 0.95 and otherwise D1+2(t) = 3000. Here h = 2. Panels same as E. I) Same as A with D1+2(t) = 3000 + 2500 · sin( t 5000 + Ω) Panels same as E. J) Same as A with D1+2(t) = 10000 if sin( t 5000 + Ω) > 0.95 and otherwise D1+2(t) = 2000 Here h = 2. Panels same as E.