Stochastic gene expression in Arabidopsis thaliana

Although plant development is highly reproducible, some stochasticity exists. This developmental stochasticity may be caused by noisy gene expression. Here we analyze the fluctuation of protein expression in Arabidopsis thaliana. Using the photoconvertible KikGR marker, we show that the protein expressions of individual cells fluctuate over time. A dual reporter system was used to study extrinsic and intrinsic noise of marker gene expression. We report that extrinsic noise is higher than intrinsic noise and that extrinsic noise in stomata is clearly lower in comparison to several other tissues/cell types. Finally, we show that cells are coupled with respect to stochastic protein expression in young leaves, hypocotyls and roots but not in mature leaves. Our data indicate that stochasticity of gene expression can vary between tissues/cell types and that it can be coupled in a non-cell-autonomous manner.

In the experiments we can estimate the auto-correlation of the gene expression in Arabidopsis. To be able to compare the estimation with the result from the two-stage gene expression model, we need to calculate the non-stationary auto-correlation between protein abundance in case of extrinsic noise. At time t 0 = 0 KikG is converted from green to red, i.e., for t > t 0 one has a red and a green fluorescent protein population. The red protein population decays to zero due to protein degradation and the green population is building up from zero. We therefore consider a stochastic process with initial condition n(t 0 ) = 0. We denote the value at time t 2 conditioned on the value at time t 1 by: n(t 2 )|n(t 1 ) and the expectation value of n at time t 2 conditioned on the value n(t 1 ) at time t 1 by hn(t 2 )|n(t 1 )i. If we denote by P (n 2 , t 2 |n 1 , t 1 ) the non-stationary conditional probability density this reads explicitly: The correlation between n(t 2 ) and n(t 1 ) is given by: hh(n(t 2 )|n(t 1 ))(n(t 1 )|n(t 0 ))ii = X n 2 X n 1 n 2 n 1 P (n 2 , t 2 |n 1 , t 1 )P (n 1 , t 1 |n 0 , t 0 ) hh·ii means averaging over all possible n 2 and over all possible n 1 . We consider the two-stage protein expression model shown in Figure 1A of the main text. If the protein degradation rate d 1 is much lower than the mRNA degradation rate d 0 the conditional probability reads in zero-order approximation in = d 0 /d The function P 0 denotes the probability distribution for n 1 = 0 initial proteins given by: (1 a n) k k!
where (a) n is the Pochhammer symbol 1 , the parameters and the function f are defined as: In general the parameters will slightly vary from cell to cell and also over time (timedependent extrinsic noise). For simplicity, we consider cell-to-cell variability in the transcription and translation rates ⌫ 0 and ⌫ 1 and keep the degradation rates constant (note 1 that in the simulations we only vary the translation rate ⌫ 1 ). This means the transcription and translation rates vary from cell to cell, but are otherwise constant. To calculate the auto-correlation, one has to average over the extrinsic noise: c(t 2 , t 1 ; t 0 ) ⌘ hhh(n(t 2 )|n(t 1 ))(n(t 1 )|n(t 0 ))iii e hhn(t 2 )|n(t 0 )ii e hhn(t 1 )|n(t 0 )ii e p [hh(n(t 2 )|n(t 0 )) 2 ii e hhn(t 2 )|n(t 0 )ii 2 e ] [hh(n(t 1 )|n(t 0 )) 2 ii e hhn(t 1 )|n(t 0 )ii 2 e ] Note that the above expression depends on the initial time t 0 . Using the above stated approximations we find after some algebra for the auto-correlation: This is the non-stationary autocorrelation for a death-birth process with extrinsic noise. It is exact, the correction of order O( 1 ) vanishes. To see why this is the case, write a = ⌫ 0 /d 0 and b = ⌫ 1 /d 1 1 and take the limit ! 1.
• Taking also the limit V 2 ! 0 yields: which is the non-stationary autocorrelation for a death-birth process with constant parameters. For t

The birth-death process as a lower bound
In the following we prove that the auto-correlation of the birth-death process provides a lower bound for the auto-correlation of the two-stage model with extrinsic noise. We start by rewriting Eq. (2): It is su cient to show that 1 for all possible parameter combinations. Because 1 f 21  1 we have: We have

Simulation of the KikGR experiment
In order to test our analytical results we simulated the KikGR experiment. For sake of simplicity we kept all parameters constant besides the translational rate ⌫ 1 . We simulated 10 5 trajectories of a single reporter according to: 1. Draw the translation rate ⌫ 1 from a log-normal distribution with h⌫ 1 i = 45 h 1 and Simulate the reporter until stationary state with zero protein initial condition and mRNA initial condition drawn from the Poisson distribution 3. Compute the non-stationary auto-correlation.
A typical simulation result is shown in Figure 1B. After the bleach at time t 0 = 0 the red proteins decay and the green proteins build up until being in stationary state. In Figure  1C we compare c 0 to the simulation results for di↵erent CVs of the protein translation rate ⌫ 1 . In Supplementary Figure 24A we compare the zero-order approximation for the auto-correlation given in Eq. (2) with simulation results for finite . We also show c

Estimation of the protein half-life
In order to obtain an estimate for c 0 we estimated the protein degradation rate by measuring the fluorescent of the remaining red fluorescence proteins after conversion. Note that we were only interested in obtaining a rough estimate of the lifetime of the protein. For simplicity we ignored bleaching e↵ects; by this we may overestimate the degradation rate, by means of which our estimate is an upper bound of the true rate. To obtain the estimate we processed the data as follows. For each cell we have three measurements: x 0 the fluorescence directly after the conversion (t 0 = 0), x 1 (t 1 = 3 h) and x 2 (t 2 = 6 h) after the conversion. The amount of red proteins at time t 0 is given by x(t) = x 0 e d 1 t . We transformed the data logarithmically and calculated for each cell an optimal degradation value by minimising: with respect to d 1 . Because the logarithm is a strictly monotonic function this transformation preserves the optimum in absence of noise and measurements errors, but simplifies its calculation considerably. As long as the noise is small the optima (with and without transformation) are very similar. We incur for the optimal d 1 : We find for d 1 : (d 1 ) 0.5 = 0.095 h 1 (median),d 1 = 0.091 h 1 (mean), and = 0.023 h 1 (SD). Using d 1 = 0.09 h 1 together with t 2 = 6 h, t 1 = 3 h and t 0 = 0 h in the expression for the auto-correlation we find c 0 ⇡ 0.6. Thus the theoretical prediction for the autocorrelation is 1 c 0.6.

Supplementary Note 2: Spatial correlation
Di↵usion-like transport Movement of molecules between cells can induce a spatial correlation. To see this, we briefly investigate a simplified system. We consider a protein which synthesized with rate i in cell i, is degraded with rate and is transported across cell boundaries with rate µ. We assume that µ and are homogeneous and constant over the grid. The synthesis, however, are stochastic variables. For sake of simplicity, we consider them to be time independent, i.e., the synthesis rates i vary from cell to cell, but are otherwise constant. The concentration of the protein on the tissue can be described by: where x is a vector of the concentrations of the protein on the cellular grid, C is the connectivity matrix of the grid and b is a vector with the production rates i . The connectivity matrix is given by C ii = |⌦ i | and C ij = 1 for i 6 = j if cell i is connected to cell j and C ij = 0 otherwise. |⌦ i | denotes the number of connected neighbors of cell i. For low mobility the contribution to the correlation due protein mobility will be proportional to " = µ/ . This can be seen as follows. In steady state the solution for x can be written formally: In case " ⌧ 1 we can expand x in terms of ": (N is the number of cells on the grid) from which follows: For the covariance between two cells we find: In case of no spatial correlation between the synthesis rates all cov(b n ,b m ) vanish; however, the covariance between two cells would be non-zero and proportional to " .., N}). Since we are interested in the covariance between the expression 6 systems of two neighbouring cells (i.e., cov(b n ,b m )), it is important to find an estimate for ". To obtain this we consider the case in which we have only one cell producing the protein (the donor cell), all other cells are receptor cells, i.e.,b 1 = / ,b i = 0 for i 6 = 1. In this case we have: x 1 =b 1 "|⌦ 1 |b 1 and x i = "b 1 , from which follows: By building the ratio of the measured fluorescence of a receptor cell to the measured fluorescence of the donor cell, i.e., by determining the fraction of fluorescence obtained by the receptor cell, one can find an approximation for ".

Characterization of localization and movement of YFP tags
We aimed to choose CFP and YFP tags that minimize intercellular movement and that localize completely to the nucleus to enable accurate measurements. We compared three tags: YFP, 2xNLS-YFP and NLS-2xYFP that was published to be cell-autonomous. Arabidopsis leaves were transiently transformed by biolistic transformations. Free YFP was localized in the cytoplasm and the nucleus (Supplementary Figure 1A). As compared to NLS-2xYFP, 2xNLS-YFP showed a stronger nuclear localization, while NLS-2xYFP was still prominently localized in the cytoplasm (Supplementary Figure 1B, C). Unexpectedly, all YFP variants tested showed the ability to move to adjacent cells. To obtain an estimate for " we have to build the ratio between the donor and the receptor cells. A precise value for " would be di cult to obtain, but for our purpose it is su cient to obtain an upper bound. This simplifies the experimental analysis considerably. For each donor cell we only determine the ratio between the receptor cell with the highest fluorescence and the donor cell. This systematically overestimate the rate, but provides a su cient upper bound. Free YFP showed the highest movement (" = 0.08 (+0.019/ 0.022), basic bootstrap confidence intervalls). 2xNLS-YFP (" = 0.0218 (+0.0061/ 0.0066)) and NLS-2xYFP (" = 0.0047 (+0.0016/ 0.0018)) both revealed much lower values (Supplementary Figure  1D). We decided to use 2xNLS-YFP for our analysis of the spatial correlation as this fusion exhibits only slightly higher movement rates, but enabled us to precisely measure the fluorescence in individual cells due to its strong nuclear localization.

Pearson correlation between neighbouring cells
The half-life of the used reporter CFP and YFP (24 h) 7 is of the same order as the cell division time of epidermal cells on young leaves (33 h) 5 . Due to this the stochastic gene expression system is never in stationary state on young growing leaves. This gives rise to an extra term when calculating the covariance from a single reporter. The daughter cells inherit mRNA and protein from their mother cell, by which the initial conditions are the same (or at least very similar). In order to analyse the consequence of this we write: where N 1 and N 2 denote the protein amounts directly after cell division and z 1 and z 2 denote the extrinsic stochastic processes in cell 1 and cell 2, resp. h·i N means averaging over all possible initial conditions. We assume that the intrinsic processes in cell 1 and cell 2 are independent in a mechanistic sense, i.e., they do not influence each other in any way (directly or indirectly through feedbacks to the environment), which implies they are statistically independent for a given realisation of sample extrinsic noise z 1 and z 2 . Further, we assume the stochastic processes are the same in all cells, i.e., Therefore, we can rewrite the covariance: We investigate now the case where the stochastic extrinsic processes z i and z j are identical. If the initial conditions N 1 and N 2 are independent we find: However, if the initial conditions are identical as it is in case of cell division, one obtains (again for identical z 1 and z 2 ): In a growing tissue there will be always next neighbour cells which are daughter cells and others which are not. Note that in stationary state the expressions given in Eq. (4) and (5) are identical, because all dependence on the initial conditions is decayed. Since on young developing leaves the gene expression systems are not in stationary state it is better to estimate the spatial correlation of the extrinsic noise by using the dual reporter system correlating CFP from cell 1 to YFP in cell 2: with c i and y i denoting the two channels of the dual reporter system of cell i. The initial mRNA and protein abundance of CFP are identical in the two daughter cells and the same is true for YFP, but the initial conditions of CFP are di↵erent and independent of YFP and vice versa. We arrive at: which yields for identical stochastic extrinsic processes z 1 and z 2 Eq. (4). We therefore define Pearson's correlation coe cient between two neighbouring cells: To analyse the experimental data we define for each set of cells from a given leaf: where N are the number of cells on the particular leave, the neighbours of cell i are given by the set ⌦ i and the number of neighbours by |⌦ i |. Note that r is also scale invariant, i.e., a scaling factor between c and y does not change r.
we arrive at the expressions 3 : We used h·i as a short hand notation for the average over extrinsic and intrinsic stochasticity.
In the experiment one cannot measure the abundance of the reporter directly, but rather quantifies the fluorescence. When marking the region of interest (ROI) one may mark pixels which do not belong to the true ROI or miss pixels which do. If we denote the average intensity of the true ROI with N pixels as c and the average intensity of the extra or missed n pixels as s (note that n can be positive or negative) we can write for the measured signal m: where p is the proportionally factors between fluorescence and abundance. The average intensity s of the extra or missed pixels is related to the average intensity c of the nucleus.
We therefore write s = c(1 + ⌘) where ⌘ is a stochastic number with |⌘| < 1, hc⌘i = hcih⌘i and h⌘i = 0 () hsi = hci). With this we rewrite Eq. (10): with the definition: It is important to realise that n, the number of erroneously marked or missed pixels and s the average intensity of these pixels are uncorrelated stochastic numbers. Due to this we obtain h" m i = 0 and hmi = phci. Altogether, we can write for the measured CFP and YFP signals: " c and " y denote the technical error due to stochasticity of the equipment (in general di↵erent for c and y). Because there is no bias in the experiment h" c i = h" y i = 0 holds. The proportionally factors between fluorescence and abundance p c and p y are in general di↵erent. This causes a problem, when one estimates the intrinsic noise from data. To see this we ignore the errors in Eqs. (13) and (14) and calculate the intrinsic noise using Eq.
We therefore normalise the data with the mean and again (for the sake of the argument) ignore the errors: The last equality holds if hci = hyi, which is true for the dual reporter system. In presence of technical and measurement errors the estimated (apparent) intrinsic noise is given by: with the total technical error: For the estimated (apparent) extrinsic noise we find: In both cases the apparent noise overestimates the true intrinsic and extrinsic noise, respectively. The factors (relative error) by which the intrinsic and extrinsic noise is overestimated are given by: In order to estimate the technical error generated by the used instruments (h" 2 c i, h" 2 y i), a yellow autofluorescent plastic slide (ChromaTechnologies, https://www.chroma.com/ products/accessories/92001-autofluorescent-plastic-slides) was imaged with identical CFP and YFP settings and laser intensities as used for noise measurements. Because fluorochromes do not bleach and di↵use in this type of slides, only one region of interest was imaged 221 times to avoid variation in our measurements due to inhomogeneities. Because we do not have any additional errors besides the technical error we can write: m s c = p s c s+" c and m s y = p s y s + " y , where s is the signal from the yellow autofluorescent plastic slide. We find: Taken together we obtain estimates for the technical errors: In contrast to the technical errors coming from the microscopy equipment the measurement error is the same for both channels per analysed cell. To obtain an estimate for " m , we performed the following series of experiments. The ROI were on three leaves marked three times. This means that the technical errors and the CFP and YFP amount are constant for the three subsequent measurements of the nuclei. The estimator for covariance between the measured signal for CFP and YFP reads: m in c/y are the signals obtained from the i th measurements of the n th nucleus. Using Eqs. (13) and (14) we find: It follows for the average over su ciently many nuclei: We also find: From Eqs. (21) and (22) one obtains: We estimate h" 2 m i by averaging over 100 nuclei on 3 di↵erent leaves (i.e., 300 nuclei in total): The measurement noise is two orders of magnitude larger than the technical errors from the microscopy equipment and thus dominates the relative errors made by estimating the intrinsic and extrinsic noise. The error of the extrinsic noise is the same for all measurements while the error of the intrinsic noise depends on the actual measured apparent intrinsic noise (Eq. 17). We find f ext = 2.0 ⇥ 10 3 and f int < 2.1 ⇥ 10 3 for all measured values of the intrinsic noise. We conclude that in both cases the error introduced due to mistakes made by marking of the ROI and fluctuations in the technical equipment are negligible.

13
Supplementary Note 4: Spatial correlation due to cell division When cells divide the mRNA and protein content of the mother cell is inherited to the daughter cells. When calculating the CFP-YFP cross-correlation of the dual reporter system one avoids to introduce a correlation due to identical initial conditions. However, even though the initial conditions of CFP and YFP of the daughter cells are di↵erent they come from the same distribution, the underlying protein distribution of the mother cell. This induces a correlation which will decay with time. To show this we start with Eq. (6) and make the dependence on the extrinsic noise process z 0 of the mother cell explicit: If we assume for simplicity that z 0 , z 1 and z 2 are independent we find: At t = 0, right after cell division the expression above yields 2 ext , while for t ! 1 the gene expression of the daughter cells are independent from the stochastic processes in the mother cell, therefore expect we cov(c 1 , y 2 ) ! 0 for large t. It follows that for the correlation (defined by Eq. 7) between daughter cells r(t)  1 with r(t = 0) = 1 and lim t!1 r(t) = 0 holds. To determine r(t) we calculate the non-stationary covariance between two daughter cells as well as 2 ext using the two-stage gene expression model. Because we only wish to estimate the contribution of cell division to the overall spatial correlation, we employ a simple cell division model. At time t = 0 an exact copy of the mother cell is produced. Thereby we avoid all complication introduced by cell growth. We obtain:

Simulation of the stochastic gene expression in daughter cells
We calculate the correlation between two cells that originate from the same mother cell. The daughter cells inherit all rates from the mother cell, beside the translational rate ⌫ 1 . In order to estimate how much of the correlation between cells stems from mRNA and protein inheritance, we assume that the translation rates of the daughter cells are in general di↵erent to each other and also di↵er from the translational rate of the mother cell. The dual reporter system is simulated as follows: 1. Draw the translation rate ⌫ 0 1 of the mother cell from a log-normal distribution with h⌫ 0 1 i = 500 h 1 and Var(⌫ 0 1 ) = 1000 h 2 . 2. mRNA and protein of CFP and YFP of the mother cell are drawn from the steady state distributions of the two-stage gene expression model 6 . We set = 12.5 and for this the deviations of the analytical protein steady state distribution and the true are small 6 .
3. Draw the translation rates ⌫ 1 1 and ⌫ 2 1 of the daughters cell from a log-normal distribution with h⌫ 1 1 i = h⌫ Estimating the contribution of cell division to the measured next neighbour correlation As shown above cell division introduces a correlation between neighbouring cells. This means that even though there are no further correlated processes, one may find a correlation due to inheritance of mRNA and protein. It is therefore important to estimate the value of this contribution. When we calculate the correlation between cells and their next neighbours we do not know which cells are daughter cells, but to make progress we reason that for any given cell two cells out of its neighbouring cells are progeny cells from the last two cell division events. Because we also do not know when after the cell division we observe the cells, we argue that all times within the intervals [0, T ] for the last cell division and [T, 2T ] from the next to last cell division are equally likely, where T is the inverse cell division rate. For young leaves we have T = 33 h 5 . We average r(t) over these time intervals: and find r 1 = 0.69 and r 2 = 0.11. We used d 1 = 0.029 h 1 as the degradation rate for CFP and YFP 7 . On average the cells have five neighbouring cells. Given that two from these five cells are correlated to the center cell through inheritance of mRNA and protein content and the others are uncorrelated we arrive at our final estimate for the contribution of cell division to the measure next-neighbour correlation: r = (r 1 + r 2 )/5 ⇡ 0.16.