Sources, propagation and consequences of stochasticity in cellular growth

Growth impacts a range of phenotypic responses. Identifying the sources of growth variation and their propagation across the cellular machinery can thus unravel mechanisms that underpin cell decisions. We present a stochastic cell model linking gene expression, metabolism and replication to predict growth dynamics in single bacterial cells. Alongside we provide a theory to analyse stochastic chemical reactions coupled with cell divisions, enabling efficient parameter estimation, sensitivity analysis and hypothesis testing. The cell model recovers population-averaged data on growth-dependence of bacterial physiology and how growth variations in single cells change across conditions. We identify processes responsible for this variation and reconstruct the propagation of initial fluctuations to growth and other processes. Finally, we study drug-nutrient interactions and find that antibiotics can both enhance and suppress growth heterogeneity. Our results provide a predictive framework to integrate heterogeneous data and draw testable predictions with implications for antibiotic tolerance, evolutionary and synthetic biology.


Supplementary Note 1: Stochastic modelling of single cell lineages
We lay out a stochastic framework that tracks the biochemical reactions occurring in a single cell over many cycles of growth, DNA replication and cell division. For generality, we consider R reactions involving N molecular species S 1 , ..., where r = 1, .., R and ν ± ir are the stoichiometric coefficients and a r is the reaction propensity. We describe the stochastic dynamics of intracellular molecule numbers x(t) = (x 1 (t), x 2 (t), . . . , x N (t)) T along a cell lineage by a jump process dx = R r=1 ν r dY r (t) − z D,x,p D dD(t). (2) The first term describes the stochastic dynamics of the intracellular reactions, where Y r are the number of occurrences of the r th reaction, which follows an inhomogeneous Poisson processes with mean t 0 ds a r (x(s)). The vector ν r = (ν + 1r − ν − 1r , . . . , ν + N r − ν − N r ) T yields the change in molecule numbers in the r th reaction. The second term describes the cell divisions and depends on the number of molecules lost in a cell division, z D,x,p D . The process D(t) counts the number of cell divisions, which will be described below. We will refer to such systems as reaction-division systems.
We assume that each molecule is partitioned independently at cell division with probability p D corresponding to the inherited mass fraction of the daughter cell. The sequence p 1 , p 2 , . . . , p D(t) consists of independent and identically distributed random variables. We do not distinguish between the two daughters, from which follows that Pr(p D ) = Pr(1 − p D ), and hence E[p D ] = 1/2 for every cell division D. The number of molecules lost in a cell division, which here is denoted by z 1,x,p1 , z 2,x,p2 , ..., z D(t),x,p D (t) , are independent random vectors following a binomial distribution To model the number of cell divisions, the counting process D(t), we assume that divisions follow initiation of DNA replication after a fixed period τ C+D . Assuming that all origins are replicated simultaneously with rate a R+1 , we have since division follows after each initiation event, where I(t) is the total number of these events and P (t) is a unit rate Poisson process. F (t) denotes the number of fork generations given by F (t) = F 0 + I(t) − D(t), while the number of replication origins is 2 F . Analogous to Donachie's observation 1 , we assume that replication initiation is a tightly controlled process occurring after the concentration of origins (O = 2 F /M ) is diluted below a critical threshold denoted by O c . We therefore set the propensity of replication initiation to where H(·) is the Heaviside step function. Because we focus on the effects of intrinsic fluctuations of the intracellular reactions, we assume that α is sufficiently large such that initiation of DNA replication follows instantaneously after crossing the threshold.

Supplementary Note 2: Stochastic analysis of reaction-division systems
In this note we derive a jump diffusion approximation for the intracellular concentrations of reaction-division systems in the form of Langevin equations. We then apply a small noise approximation from which we obtain closed form equations for mean and variances of the process. Finally, we use this result to determine the time-averaged statistics of concentrations and single-cell growth rates, and provide a noise decomposition that allows us to identify the sources of their variations.
To analyse the dynamics of the intracellular concentrations, we define the total cell mass M and change variables from molecule numbers to mass concentrations via where is the vector of molecular masses. The change of variables allows us to formulate a jump-diffusion approximation of the stochastic process in which reactions change the concentrations of all species continuously while cell divisions interrupt the dynamics by discontinuous jumps. Strictly speaking such an approximation is valid in the limit of large molecules numbers 2 , or equivalently in the limit of large mass M , and therefore we only consider the leading order terms here. In the following, we consider the contributions of biochemical reactions and cell divisions separately and then combine these effects.

Contributions of biochemical reactions
First, consider the process (2) in between any two cell divisions (dD(t) = 0) that we will approximate for large M (t). Setting a r ( where dW r (t) are independent Gaussian white noise sources. Using this relation in Eq. (2) and applying Ito's formula 3 to Eq. (6), we find that dX =(νf (X) − λ(X)X)dt where we neglected terms of order O(M −1 ) and ν ir = (ν r ) i is the stoichiometric matrix. The drift term in the above equation depends on the instantaneous growth rate λ(X) that evaluates to It can be verified that Eq. (8) obeys conservation of total-mass fraction m T X = 1. Moreover, when all reactions are mass-conserving, m T ν r = 0 for r = 1, .., R, there is no cell growth, i.e., λ(X) = 0, and we retain the standard chemical Langevin equation 4 .

Contributions of cell divisions
We consider a situation where the concentrations X change via cell divisions. Denoting the division times by t 1 , .., t D(t) , the jumps of the concentration process are where we use the shorthand notation and assume right-continuous paths. If the fluctuations are small, it is sufficient to focus only the first two jump moments.
The change in mass is given by ∆M (t D ) = −m T z D , where z D represents the vector of molecule numbers lost in a division and is distributed according to Eq. (3). Since m T X(t − D ) = 1 and the expectation values are The total cell mass is thus halved on average, as expected. For obtaining the second jump moment, and the law of total variance to obtain which depends explicitly on the intracellular concentrations and the coefficient of variation CV 2 p = Var[p D ]/E 2 [p D ] of the inherited mass fraction.
Obtaining the concentration jump moments is more elaborate because their jumps depend on the mass before and after division, M (t D ) − 1) and then set M (t D ) = p D M (t − D ) in the resulting expression, which is correct to leading order in the inverse mass. In effect, the mean concentrations do not change at cell division for i = 1, . . . , N . A similar calculation gives the second jump moments Note that although molecules are partitioned independently, the concentration process becomes correlated because the total mass fraction couples these quantities. Specifically, the change of mass-fractions satisfies E[m T ∆X(∆X) T m|X, M ] = 0 and thus the total mass-fraction, m T X = 1, is conserved at cell division. To evaluate the expectation involving p D we consider small variations about the mean, as it is typically the case for symmetrically dividing cells (see also discussion below). We then obtain E (1−p D ) p D ≈ 1 + CV 2 p in terms of CV p . For simplicitly, we will adopt this limit from here on.
Finally, we define the sequence of random vectors ξ 1 , . . . , ξ D and random variables ζ 1 , . . . , ζ D with zero mean that denote the mass-independent part of the partitioning errors, as well the zeromean random variables η 1 , . . . , η D that describe variation in the inherited volume fraction. These random variables are mutually uncorrelated and their components satisfy E[ζ D1,i ζ D2,j |X, M ] = 0, and for any two division indices D 1 , D 2 . The contribution of cell divisions can then be written as Note that in the main text we omit the dependence on CV 2 p in the first two expressions of Eq. (15), because biologically relevant division errors are smaller than 10% 5-7 , such that CV 2 p 1. Hence this source of variation has negligible effects on intracellular concentrations of symmetrically dividing cells. For lineages of asymmetrically dividing cells, such as budding yeast, the full dependence of Eqs. (12) and (14) has to be taken into account.

Langevin equations and growth rate
Combining now the contributions from intracellular reactions, Eq. (8), and divisions, Eq. (16), we obtain the following coupled Langevin equations dX =(νf (X) − λ(X)X)dt where X obeys conservation of total mass-fraction, m T X = 1, as explained above. Finally, to obtain a characterisation of the growth rate we apply Ito's formula to ln M . Neglecting terms of The first line of the above equation is the logarithmic change in the instantaneous mass in between cell divisions. Informally, we write Λ(t) = d dt ln M . Using the law of total variance, we can decompose the growth rate variation into the following contributions The first term denotes the contribution due to variations in intracellular concentrations, while the second term stems from the reactions contributing to biomass synthesis. For our purpose, the second term can be neglected because of averaging over the large number of these reactions that occur over one division cycle. To see this, assume for instance that these reactions scale as f r ∼ γ. The first term of the above equation is then of order γ 2 , while the second term is of order γ and thus contributes less to the total variation for large γ.
Small noise approximation of reaction-division systems Stochastic differential equations are notoriously difficult to solve analytically. We will therefore employ a small noise approximation 8 which yields closed equations for the means and variances of intracellular concentrations. In the limit of large M , we find that Eq. (17) reduces to a set of rate equations For finite but large M ,X equals the mean concentrations andM equals the mean cell mass. Interestingly, we observe that the dynamics ofX is independent ofM and of the division events. Thus the concentrationsX reach a steady state given by the solution of The mean cell massM increases exponentially between cell divisions Since in the deterministic limit, cell divisions must occur after fixed time intervals of τ = ln 2/λ, the constant M 0 , denoting the mass at cell birth, can be obtained from the delayed effect of initiation This is analogous to Donachie's result 1,9 , which states that the mean cell mass increase exponentially with the mean growth rate.
To investigate the stochastic component of the concentration dynamics, we separate the process X(t) into a deterministic partX, and a stochastic component (t) as follows To leading order in M 0 , the procedure yields the small noise approximation, which reads where J (X) is the Jacobian of the rate equations (20).
with the noise matrices for intracellular reactions and cell divisions defined as respectively.

Estimating lineage-averaged noise statistics
To obtain a representative measure of concentration variation, we employ statistics averaged over the cell lineage. While the mean concentrations are constant, its covariance can be obtained from the time-averaged solution of Eq. (26). To this end, we employ the Laplace transform of the covariance matrix Σ defined asΣ obtain the time-averaged covariance matrixΣ using the limit Note that since Σ(t) is periodic, the second equality corresponds to the average over a division cycle. For brevity, we introduce the linear operator LΣ = JΣ +ΣJ T . Using the fact that in the deterministic limit d dt D(t) = Solving forΣ(s) and multiplying by s we find whereM (T ) = 2M 0 . By taking the limit s → 0 of the above equation we obtain the time-averaged covarianceΣ. In the same limit, the first term in the square brackets vanishes, while the second and third terms are periodic functions, which evaluate to and Combining the last three equations we find thatΣ satisfies a linear set of equations, also known as the Lyapunov matrix equation, which read and determine the time-averaged noise statistics.

Estimation of growth rate fluctuations
Assuming small fluctuations we can linearise whereΣ is obtained from the solution of Eq. (35). Following the same arguments as above, we can neglect the contributions of the mass production to growth rate Λ and thus we use only the first term of Eq. (19). We then find Identifying the sources of growth variations To analyse the sources of growth variations, we follow Komorowski et al. 10 who investigated the effects of protein degradation using the linear noise approximation of the Chemical Master Equation. Along the same lines, we note that the matrix D, as defined in Eq. (28), can be rewritten as a sum over reactions D = R r=1 D r . We, therefore, decompose the coefficient of variation into additive contributions CV 2 , each of which satisfy the equations The matrices D 0 = λ(X)Γ and D r = ν r −Xm T ν r f r (X) ν r −Xm T ν r T for r = 1, . . . , R denote the respective noise matrices for cell divisions and biochemical reactions. Note that in contrast to Komorowski et al. 10 , we explicitly account for effects of cell divisions and dilution.

Supplementary Note 3: Parameter estimation
We follow the parametrisation of Weiße et al. 11 in large parts, which has been obtained from literature values and parameter optimisation. These values describe population averages, but they do not account for the doubling of mass in single cells. Specifically, intracellular association propensities are inversely proportional to cell volume, which we accounted for by scaling the Michaelis-Menten constants in units of mass concentrations. Additionally, we chose the critical origin concentration O c to closely match absolute protein levels 12 . To further match absolute mRNA levels 12 , we scaled the maximum elongation rate γ max from 11 by a factor of 10, which takes into account multiple ribosome-bindings to mRNAs (with ribosome density 1 per 30 codons 13 and assuming 300 codons per mRNA).
We estimated the absolute transcription rates w r,q,e,t and effective resource levels ζ, which scales the scales the thresholds of transcription and translation elongation (see Supplementary Tab. 1), using an adaptive MCMC sampler 14 . Using the SNA we evaluated mean growth rate as a function of nutrient quality n s describing number of resource molecules produced per nutrient molecule. Inverting this relation allowed us to determine ribosome levels and growth rate fluctuations for a given mean growth rate as well as to estimate the maximum growth rate for high nutrient qualities. We set the objective function as a weighted least square using (i) ribosomal mass fractions measured in Ref. 15 with weights given by the reciprocal of the variance over replicates 15 , (ii) coefficients of variation of single-cell growth rates measured in Refs. 16 and 17, and (iii) constraining maximum growth rate to 3.75 doublings per hour. Because error bars for (ii) and (iii) were not available, we assumed standard deviations of 0.025 and 0.1, respectively, for all data points.
The sampled parameter distributions are shown in Supplementary Fig. 1. We used medians of marginal distributions as parameter estimates, which are given in Supplementary Tab. 1. These values were consistent with the modal values of the sampled distributions of pairs of parameters ( Supplementary Fig. 1) indicating that all parameters were identifiable. The identifiability of the parameters depends crucially on the constraints imposed by single-cell data, i.e. growth rate variances, which we demonstrate using profile likelihoods 18,19 (Supplementary Fig. 2).

Supplementary Note 4: Reduced model for ribosome limitation
The reactions of the reduced model are: for y ∈ {r, q}. The propensities of mRNA degradation, ribosome binding and unbinding are the same as in the full model but only comprise the r-and q-proteins. Transcriptional and translational propensities depend on the resource a as in the main text, but a constant supply of resources with rate v m , Eq. (43), replaces nutrient uptake and metabolism. The profile likelihood is given by the maximum likelihood for a fixed value of ζ, a measure of absolute resource levels. Using population average data 15 that constrain only the mean behaviour of the model, the profile is flat (red line) indicating that the parameter is not identifiable. In contrast, including single-cell data 16,17 that also constrain growth rate variances the profile is clearly peaked (blue line) such that the parameter becomes identifiable within a 95% confidence interval (dashed grey line). Also shown is the maximum likelihood estimate (dark grey) and the median of the sampled parameter distributions (light grey line). Likelihood maximisation was performed using the Nelder-Mead method 20 .     24 . Ribosomal transcripts (purple for 1 dbl/h, teal for 2 dbl/h) are the most abundant mRNAs of the transcriptome (grey). Absolute abundances are obtained by normalising with the total mRNA numbers given in 12 (cf. Fig. 2a of the main text). (f) The reduced model (blue) fails to predict the measured total r-mRNA abundances in E. coli (red). The r-mRNA levels required to fit (c) and (d) are too small in the reduced model. Therefore, ribosome expression cannot account for the observed growth variation. In contrast, the model presented in the main text is in excellent agreement with the data 24 and supports the assumed nutrient-uptake and catabolic limitations.   Supplementary Fig. 7). For stochastic simulations with symmetric divisions, we draw the inherited volume fraction from a symmetric Beta distribution with a coefficient of variation of 5%. The parameter ζ scales the thresholds of transcription, θr and θnr, and translation elongation Kγ and hence is a measure of total resource levels. Obtained by parameter optimisation (see Supplementary Note 3). ♦Same as in Ref. 11, †same as in Ref. 11 but rescaled to account for adjusted resource levels; aa unit of amino acid. ‡For mixed nutrient-uptake and catabolic limitations value is rescaled to 600 min −1 .