Stochastic modelling reveals mechanisms of metabolic heterogeneity

Phenotypic variation is a hallmark of cellular physiology. Metabolic heterogeneity, in particular, underpins single-cell phenomena such as microbial drug tolerance and growth variability. Much research has focussed on transcriptomic and proteomic heterogeneity, yet it remains unclear if such variation permeates to the metabolic state of a cell. Here we propose a stochastic model to show that complex forms of metabolic heterogeneity emerge from fluctuations in enzyme expression and catalysis. The analysis predicts clonal populations to split into two or more metabolically distinct subpopulations. We reveal mechanisms not seen in deterministic models, in which enzymes with unimodal expression distributions lead to metabolites with a bimodal or multimodal distribution across the population. Based on published data, the results suggest that metabolite heterogeneity may be more pervasive than previously thought. Our work casts light on links between gene expression and metabolism, and provides a theory to probe the sources of metabolite heterogeneity.


INTRODUCTION
Cellular heterogeneity is ubiquitous across all domains of life. In microbes, clonal populations display phenotypic variability as a result of multiple factors such as fluctuations in the microenvironment, stochasticity in gene expression, or asymmetric partitioning at cell division [1][2][3] . Variability is well recognised at the transcriptional and translational levels. Yet various single-cell phenomena result from the emergence of distinct metabolic states within a clonal population. For example, metabolic heterogeneity plays a key role in antibiotic tolerance [4][5][6] , heterogeneous nutrient uptake 7,8 , and variations in growth rate 9,10 . It has also been shown that nutrient shifts can cause populations to split into two 11,12 or more 13 subpopulations with distinct growth abilities. The emergence of subpopulations has been theorised as a bet-hedging strategy that gives an evolutionary advantage for survival in adverse environments 4,14 .
A central challenge to quantify metabolic variability is the lack of techniques for measuring metabolites with singlecell resolution 15 . In contrast to single-cell measurements of protein expression, for which powerful reporter systems have been developed 16,17 , quantification of metabolites in single-cells remains a major challenge. Some of the techniques employed so far include Förster resonance energy transfer (FRET) sensors 18 , metabolite-responsive transcription factors 19,20 , RNA sensors 21 , and mass-spectrometry 22 , yet most of these technologies are in early stages of development 15 . As a result, metabolic heterogeneity is typically quantified indirectly via measurements of metabolic enzymes or growth rate in single-cells 9,12,23 .
Our objective in this paper is to characterise heterogeneity in metabolites as a result of stochastic enzyme expression and catalysis. Metabolic models traditionally assume that enzymatic reactions behave deterministically on the basis that both enzymes and metabolites appear in high molecule numbers 24 . However, single-cell proteomics in Escherichia a) Corresponding author: d.oyarzun@ed.ac.uk coli show that metabolic enzymes are as variable as any other member of the proteome 17 , while metabolomics data suggest that average metabolite abundances span several orders of magnitude 25 . The few datasets on single-cell metabolite abundance already suggest substantial variability in some metabolites in E. coli 19,26 . Such evidence casts doubt on the traditional assumption of metabolism being a purely deterministic process, suggesting a link between fluctuations in enzyme expression and metabolites.
The role of stochastic gene expression in protein variability has been well studied 2,3,[27][28][29] , but the impact of such randomness on metabolic reactions remains much less understood. Various theoretical studies have analysed the impact of fluctuations in the supply and consumption of metabolites [30][31][32] , or the propagation of enzyme noise to a metabolite 33 . However, despite the advancing experimental evidence of stochasticity in metabolism, mathematical models still lack the sufficient detail to integrate the processes that are known to shape protein heterogeneity, such as stochastic promoter switching and transcriptional bursting.
In this paper we propose a model for metabolite heterogeneity in single-cells. The model integrates stochasticity in enzyme catalysis 24 and expression 27 , two well-established processes that so far have been studied in isolation. Our approach includes a stochastic formulation of various relevant mechanisms in enzymatic reactions, including reversible catalysis, stochastic switching of promoter activity, fluctuations in mRNA transcripts, and consumption of the enzymatic product by downstream processes. We probe the model for various sources of stochasticity using simulations and analytical solutions for the stationary distribution of the metabolite. The analysis reveals intricate patterns of heterogeneity that translate into bimodal and multimodal distributions for the number of metabolite molecules. These phenomena arise from the interplay between a lowly abundant enzyme and its catalytic parameters. Under the separation of timescales typical of metabolic reactions, we show that metabolite distributions can be accurately approximated by a Poisson Mixture Model across large regions of the parameter space. The mixture model can be readily adapted to a wide class of gene expression models and provides a quanti-tative tool to predict metabolite variability from enzyme measurements in single-cells.

Stochastic model of an enzymatic reaction
We consider a model that combines enzyme kinetics and enzyme expression into a single stochastic description ( Figure  1A). The model includes an enzymatic reaction with standard Michaelis-Menten kinetics, in which substrate and enzyme bind reversibly to form a complex that undergoes reversible catalysis into a metabolite. We assume that enzyme expression follows the well established three-stage model for gene expression 3,27 , where a single copy gene switches stochastically between an inactive state (D off ) and active state (D on ). In the active state, mRNAs are transcribed and translated into protein. The model also includes consumption of the metabolite by downstream pathways, degradation of mRNA transcripts, and dilution by growth of all species. Since metabolic reactions operate far from thermodynamic equilibrium, we assume that the substrate pool remains constant so that the system reaches a non-zero flux, e.g. when the substrate is a highly abundant extracellular carbon source or a slowly varying intracellular metabolite. The model reactions are shown in equations (R1)-(R9) in the Methods section.
To investigate the emergence of metabolic heterogeneity, we need to compute the stationary probability distribution of metabolite molecules (n p ) for relevant combinations of model parameters. Figure 1B shows a typical simulation of the model obtained with Gillespie's algorithm 34 . A key challenge for such simulations, however, is the multiscale nature of enzymatic reactions: not only do metabolic reactions operate in a much faster timescale (milliseconds) than enzyme expression (tens of minutes) 30,35,36 , but also the average number of enzymes is much lower than the number of metabolites. These multiple scales result in reaction propensities that differ by several orders of magnitude, thus leading to extremely slow simulations which make the exploration of the parameter space infeasible. An alternative is to use simulation algorithms that exploit the separation of scales to increase computational speed, such as tau-leaping or slow-scale approximations 37 . Yet in our case it is unclear how such numerical approximations impact the predictions drawn from the simulations.
To determine the impact of genetic and catalytic parameters on metabolic heterogeneity, we obtained an analytic approximation for the distribution of metabolite molecules that can be evaluated efficiently without expensive stochastic simulations. Our solution allows the exploration of parameter space to characterise the different regimes promoting metabolic heterogeneity. The approximation follows from exploiting time scale separation in the Chemical Master Equation of the stochastic process 38 . In physiological regimes the model has three timescales: a fast metabolic time scale, in which substrate and enzyme bind and unbind; an intermediate time scale associated with the catalysis of the metabolite (n p ); and a slow timescale associated with the expression of the enzyme and dilution by cell growth.
The total amount of enzyme (free and substrate-bound, denoted as n e and n c respectively) varies in the slowest timescale, and therefore the binding/unbinding of substrate and enzyme equilibrates quickly. As a result, in the timescale of gene expression, the metabolite can be assumed to depend directly on the the total enzyme n etot = n e + n c rather than on n e and n c individually. Under this approximation, it is convenient to use the law of total probability: (1) The formula in (1) decomposes the distribution of metabolite P(n p ) into stochasticity originating from enzyme expression, P(n etot ), and from fluctuations in the catalytic reaction itself, described by the conditional distribution of metabolite given the amount of total enzyme, P(n p |n etot ). In the timescale of metabolite fluctuations, the total enzyme can be assumed to be in a quasi-stationary state. Further, exploiting the fast binding/unbinding between substrate and enzyme, we showed that the metabolite follows a birth-death process with effective propensities (details in Methods section): where E(n e |n etot , n p ) and E(n c |n etot , n p ) are the conditional expectations of the free enzyme (n e ) and complex (n c ) given the total enzyme and metabolite. In equation (2), n s is the constant number of substrate molecules, the parameters k 1 , k −1 , k cat , and k rev are the rate constants of the Michaelis-Menten mechanism (defined in Figure 1A), and k c is an effective first-order rate constant of metabolite consumption by downstream pathways. The conditional distribution needed in the model (1) can then be computed explicitly: with Poisson parameter and (λ ∞ , K) are two effective kinetic parameters The parameters λ ∞ and K are in units of molecules/cell and depend on the interplay between substrate abundance, enzyme kinetics, and downstream processes.   Table I.
As illustrated in Figure 1B, the distribution in (1) is a Poisson Mixture Model 39-41 (PMM) that convolves the enzyme distribution P(n etot ) with various Poisson modes P(n p |n etot ) arising from the catalytic activity. In our model, the analytical distribution of the total enzyme abundance follows the standard solution of the three-stage model for gene expression 27 , which can be computed explicitly in terms of model parameters. In certain limits, the three-stage model produces approximately Gamma or Normal distributions depending on the the mean expression level and the half lives of mRNAs and proteins 3,27 .
The decomposition in equation (1) shows that the PMM is not limited to the model for gene expression we have considered here. Other models may be used, either by using closedform expressions for P(n etot ), or by inferring the enzyme distribution directly from single-cell protein expression data such as flow cytometry or single-cell microscopy 1,17 . The PMM thus provides a versatile tool to predict metabolite heterogeneity from modelled or measured enzyme heterogeneity viewed as an upstream source of variation 41 . Figure 1C shows that the PMM distribution provides a good approximation to Gillespie simulations computed with typical parameter values.

Qualitative features of the Poisson Mixture Model
At the heart of the metabolite PMM is the interplay between variability from gene expression and that originating from enzyme kinetics. Specifically, the Poisson parameter λ(n etot ) in equation (4) controls the density and dispersion of the Poisson modes, which in turn shape the overall pattern of variability. As shown in Figure 1B, there are several cases of interest. For example, for irreversible reactions (k rev = 0), the Poisson parameter simplifies to which scales linearly with the enzyme abundance and thus the Poisson modes have equidistant means. In reversible reactions, on the other hand, the Poisson parameter saturates and causes the Poisson modes to concentrate around λ ∞ . This effect is stronger for strong reversibility (high k rev ), in which case the kinetic parameter K is small. Note also that in either case, as the enzyme number n etot grows, the Poisson modes spread out since λ(n etot ) controls both their mean and variance. From the construction of the PMM in (1), we observe that the enzyme distribution weighs the various Poisson modes, potentially producing metabolite distributions that are unimodal, bimodal or even multimodal. For example, for highly expressed reversible enzymes, the distribution P(n etot ) is nonnegligible when n etot is large. Hence most Poisson modes do not contribute to the final metabolite distribution, except the mode centred at λ ∞ , which leads to a unimodal metabolite distribution with a mean close to the deterministic average.
Conversely, for lowly expressed enzymes, there is a nonnegligible probability of enzymes not being expressed, and thus the first term of the PMM, i.e. P(0)Poisson(n p , 0), causes the metabolite distribution to peak at zero. However, the metabolite distribution may also display a second peak  Figure 1C) cover a large fraction of the parameter space. We identified two regimes in which metabolites are bimodal: in the switching-induced regime, bimodality propagate from the enzyme to metabolite. In the catalytically-induced regime, bimodality originates from a lowly abundant enzyme and the strong separation of timescales between expression and catalysis. The small panels show model predictions for a fixed kinetic parameter K = 0.1333 molecules, and increasing λ∞ = {300, 3000, 30000} molecules, obtained by increasing the turnover rate constant kcat. (B) Exact simulations for two parameter sets verify the predictions drawn from the PMM approximation. We simulated over a long time horizon to obtain accurate estimates for stationary distributions; insets shown only a small portion of the time courses. The parameter values for the promoter switching rates are indicated in panel A and we fixed λ∞ = 500 molecules. Both types of bimodality can be clearly distinguished in the time courses, but we note that they lead to almost identical distributions for the metabolite. In both cases, the PMM provides an accurate approximation for the stationary distributions.
at λ ∞ if, for example, the λ(n etot ) parameter causes many Poisson modes to concentrate around λ ∞ . This results in a bimodal metabolite distribution, whereby an isogenic population splits into metabolite producers and non-producers. Similar reasoning can be used to understand the emergence of multimodal metabolite distributions, which correspond to three or more subpopulations with varying metabolic activities. This qualitative analysis suggests that metabolic subpopulations can emerge even in cases where enzymes display unimodal distributions across the population. Crucially, this also indicates that metabolic subpopulations emerge through mechanisms that do not follow trivially from transcriptional heterogeneity alone, as we explore in more detail in next section.

Mechanisms for metabolic bimodality
First, we explored the impact of stochastic promoter switching on the emergence of metabolite bimodality. Figure 2A shows the summary of calculations when evaluating the PMM for variations in two key characteristics of the promoter across several orders of magnitude: the switching time scale ((k on + k off )/δ) and the promoter activity (k on /k off ), for various values of the kinetic parameter λ ∞ . We found three qualitatively distinct parameter regimes for the metabolite distribution that emerge from the combination of stochastic switching and catalysis: (1) a regime where both enzyme and metabolite have unimodal distributions, akin to the results shown earlier in Figure 1C; (2) a regime where both enzyme and metabolite have bimodal distributions; and (3) a regime in which the enzyme is unimodal but the metabolite is bimodal.
It can be shown that the deterministic version of our model in equations (R1)-(R9) has a single steady state. Hence regime (1) can be thought of as a stochastic correction consisting of unimodal distributions around a deterministic steady state. This is the expected behaviour under the traditional assumptions of high abundance of enzyme and metabolite molecules.
The other two regimes, however, correspond to alternative routes of noise-induced bimodality that cannot be explained using deterministic models [42][43][44] . Regime (2) is a highly stochastic regime dominated by the slow stochastic switching of the promoter, which drives and entrains the metabolic response. Hence we term it switching-induced bimodality. Slowly switching promoters are known to produce bimodal gene expression 29,41 , and thus this regime corresponds to a case in which bimodality propagates from enzymes to metabolites. Figure 2A shows that this behaviour appears robustly for slow switching and high promoter activity across values of λ ∞ kinetic parameter.
Regime (3), the second route for metabolite bimodality, originates from a unimodal but weakly expressed enzyme (low k on /k off ) expressed from fast switching promoters. In this case, the birth of a small number of enzyme molecules is sufficient to kick-start catalysis and make it rapidly settle in a quasi stationary regime. This distinct phenomenon is a result of the separation of time scales between enzyme expression and catalysis, and we refer to it as catalyticallyinduced bimodality. From Figure 2A, we observe that this form of bimodality appears for a narrow range of promoter switching parameters corresponding to fast switching genes with medium to low promoter activity. This behaviour disappears altogether for a low λ ∞ parameter, for example in case of strong reversibility.
To validate the predictions of the PMM approximation, we ran full Gillespie simulations over a long time horizon for different parameter sets. Figure 2B shows the simulation time courses and resulting histograms. For switching-induced bimodality, we observe how slowly switching promoters causes a single cells to lack the enzyme over several cell cycles, a period during which the metabolite is not produced. In the case of catalytic-induced bimodality, however, fast switching combined with a low average expression level cause the metabolite abundance to drop for shorter but more frequent intervals. In both cases the PMM provides an excellent approximation to the bimodal histograms obtained from the stochastic simulations. Furthermore, we observe that the bimodal metabolite distributions both regimes are almost indistinguishable from each other, yet they are produced by enzymes with substantially different time courses and distributions. These regimes therefore correspond to distinct forms of bimodality, arising from fundamentally different mechanisms.

Emergence of metabolic multimodality
To explore the emergence of multimodality, we examined the analytical formula of the PMM in (1) to identify kinetic regimes associated with distinct enzyme distributions. A nec-essary condition for the emergence of multiple modes is that the Poisson components do not overlap and are sufficiently spaced from each other. From the definition of the λ(n etot ) parameter in (5), this happens when the kinetic parameter K is large. As discussed earlier, depending on the distribution of the enzyme, the Poisson modes may appear or cancel in the final metabolite distribution. We thus swept the parameter K and evaluated the PMM across various enzyme expression levels, including low expression with a skewed distribution and high expression with a normally distributed enzyme.
As shown in Figure 3, we found intricate patterns of multimodal distributions, depending on the interplay between the heterogeneity of the enzyme, P(n etot ), and the enzyme kinetics encapsulated by the K parameter. Multimodality appears when the enzyme expression levels are low as compared to the parameter K. For instance, the values of K in Figure 3 are approximately 5-, 20-, and 100-fold those used in the bimodal examples in Figure 2. For enzymes expressed at intermediate levels, in the order of tens of molecules/cell on average, we found metabolite distributions that are unimodal but highly skewed. In the case of highly expressed enzymes, metabolites followed approximately normal distributions for a wide range of kinetic parameters.
The predictions are confirmed by Gillespie simulations of the full stochastic model, which display a striking match with the PMM approximation, even for complex multimodal distributions. The simulation time courses (shown in the insets of Figure 3) show that the multiple modes for weakly expressed enzymes correspond to cells remaining in a fixed metabolic state over the scale of the cell cycle but fluctuate across other states over longer time scales. For intermediate enzyme expression and large values for K, the metabolite does not settle in the quasi-stationary states and displays a long-tailed distribution. A decrease in K suppresses the tail of the distribution driving the PMM towards an approximately normal distribution. Altogether, these results indicate that the relation between enzyme expression and the kinetic parameters λ ∞ and, in particular, K are key determinants for the emergence of multimodality. This underscores the utility of the PMM to guide the prediction of qualitative and quantitative features of metabolite distributions for a wide range of parameter combinations.

DISCUSSION
Metabolic reactions are the powerhouse of living systems, fuelling the activity and dynamics of most cellular functions. Yet metabolism has been traditionally considered as a static process isolated from the rest of the cellular machinery. Currently, the accepted notion is that due to the large number of molecules involved, metabolism is a deterministic process at the cellular level, modulated by potentially random extrinsic factors 12,45 . Here we integrated enzyme kinetics and enzyme expression to propose a theoretical model for variability of metabolites in single cells. The model suggests that cell-tocell metabolite variation can also arise as a result of intrinsic sources such as stochastic fluctuations in enzyme expression.
The majority of work on non-genetic heterogeneity has focused on stochastic gene expression and the resulting variability in protein levels 1,2 . This has produced a wealth of singlecell data and models to understand the variability in transcription and translation observed in clonal populations. Metabolite heterogeneity, however, remains poorly understood theoretically and has been observed only indirectly (e.g., through measurements of metabolic enzymes 12,23 or growth rate 9 ) due to the lack of techniques to measure metabolite abundance in single cells.
Using the separation of time scales characteristic of metabolic reactions, we found that the stationary distribution of a metabolite follows a Poisson Mixture Model (PMM). The PMM can be efficiently evaluated across large domains of the parameter space and provides excellent approximations to the distributions computed from full stochastic simulations. Importantly, the model can be readily adapted to include different stochastic models for enzyme expression, beyond the threestage model considered here 3,46 , or even stochastic and timedependent enzyme expression modelled as upstream drives 41 . The model can also be parameterised from experimentally measured distributions for enzyme levels in single-cells 17 . In combination with the enzyme kinetic parameters, the PMM could provide a powerful tool to predict metabolite variability from single-cell protein data obtained with established techniques such as flow cytometry or time-lapse microscopy.
We found complex patterns of metabolite heterogeneity depending on the interplay between the timescale of promoter activation/deactivation, the enzyme expression level, and the enzyme kinetics. The model predicts that bimodal and multimodal metabolite distributions can emerge in various parameter regimes. In such regimes single-cells spend several cell cycles in a constant metabolic state, but in timescales as long as tens of cell cycles, they switch stochastically across different states. Such long-term fluctuations in single cells result in highly heterogeneous populations containing several subgroups of metabolically distinct cells.
Bimodal metabolic phenotypes have been observed as a result of transcriptional regulation 14,23 , post-translational control 11 and stochastic effects triggered by environmental shifts 12 . Our model reveals two distinct regimes in which metabolites display bimodality. One regime, which we call switching-induced bimodality, corresponds to the intuitive case in which a bimodal enzyme produces a bimodal metabolite. In agreement with previous studies on stochastic gene expression, this type of bimodality appears as a result of slow switching between promoter states 29,47 . In addition, we identified a fundamentally different mechanism of catalyticallyinduced bimodality, in which a unimodal enzyme produces a bimodal distribution of metabolite. This phenomenon results from a combination of slow fluctuations of a weakly expressed enzyme and the comparatively faster timescale of enzyme catalysis. Catalytic timescales are typically in the order of seconds or faster, so that slow fluctuations in enzyme expression levels produce two quasi-stationary metabolic states in single cells. At a population level, this leads to two distinct subpopulations of metabolite producers and non-producers.
As shown in Figure 4A, single-cell measurements in E. coli suggest that metabolic enzymes appear in low copy numbers across most cellular pathways 17 . In the specific growth conditions of that experiment, the data did not reveal bimodal expression of enzymes, which precludes the emergence of switching-induced bimodality in the metabolites they catalyse. However, as illustrated by the three representative distributions in Figure 4A, a number of enzymes have a low mean and a long-tailed distribution, akin to those required for catalytically-induced bimodality and multimodality. This suggests that enzyme distributions found in nature have the characteristics needed for the emergence of subpopulations with two or more distinct metabolite abundances.
Further requirements for metabolite bimodality and multimodality involve conditions on the parameters λ ∞ and K in equation (5). However, their computation requires rate constants (k 1 , k −1 and k rev ) that are rarely measured separately, and instead enzymology data typically provides values for k cat and K M = (k cat + k −1 )/k 1 only 48 . In the Methods we show that the ratio λ ∞ /K can be expressed as λ ∞ /K = × k cat / k c , where is the saturation level of the enzyme and k c is the first-order rate constant of metabolite consumption. As illustrated in Figure 4B, the ratio λ ∞ /K corresponds to a straight line in a (λ ∞ , K)-space, and a specific enzyme (i.e. with specific values for k 1 , k −1 and k rev ) corresponds to a single point on the line. In Figure 4B we compare model predictions for a lowly abundant enzyme with different λ ∞ /K ratios computed for k cat constants measured in E. coli 49 . Considering the large spread in measured k cat values, of up to seven orders of magnitude, plus the multiple combinations of metabolite consumption rates and enzyme saturation, the analysis suggests that catalytically-induced bimodality and multimodality are plausible within physiological regimes. Further validations of our predictions require measuring metabolite distributions directly, but this is still subject to a number of challenges in single-cell measurement technologies 15 . Our analysis shows that metabolite heterogeneity depends on a delicate interplay between enzyme expression and enzyme kinetics. It is reasonable to expect that energy-critical enzymes, such as those in central carbon metabolism, filter away fluctuations through post-translational regulatory mechanisms commonly found in metabolism. However, this may not be the case in pathways that are dynamically regulated in response to changes in the environment or cellular context. For example, transcriptional regulation in response to nutrient shifts may steer enzyme levels into regimes of low copy numbers where heterogeneity may dominate the resulting phenotypes. Such mechanism has been already shown to produce growth bimodality in the gluconeogenic switch of E. coli 12 , while a similar mechanism could underpin the large variability observed in single-cell measurements of S-Adenosyl methionine 21 . Noise-induced phenomena also have implications for the design of dynamic control systems for heterologous pathways, which are focus of much research in synthetic biology and metabolic engineering 51 .
In our efforts to build a theory that includes components shared by most enzymatic reactions, we have purposely overlooked a number of processes that can shape metabolic activity. For example, we have not addressed the impact of feedback mechanisms that control enzyme activity, including e.g. product inhibition and allostery, or transcriptional mechanisms that control enzyme expression in response to metabolites. Since post-translational regulation operates on timescales much shorter than enzyme expression, with similar timescale separation arguments it should be possible to express the metabolite distribution as a mixture model akin to ours. In such case the mixture components are not necessarily Poisson and their distribution will depend on the particular mechanism under study. We expect that bimodal and multimodal responses are likely to emerge in this setting, but the precise parameter conditions would have to be studied on a case-by-case basis. Transcriptional feedback control can also display various mechanisms depending on the particular pathway under study. One common motif relies on transcription factors (TF) that up-or down-regulate enzyme expression upon binding to a specific metabolite 20 . These mechanisms have been shown to play important roles on metabolic activity 52 , but they also bring to the fore subtle questions that require detailed examination, for example, on the role of fluctuations coming from TF expression itself, or the impact of negative TF autoregulation 53 . Our study paves the way for these and other questions to be addressed and raises exciting prospects for the future research in metabolic heterogeneity.
In this paper we laid theoretical foundations to study metabolism in conjunction with stochastic enzyme expression. We brought together classic models for gene expression and enzyme kinetics, and discovered a rich array of distinct stochastic phenomena that underpin the emergence of metabolic subpopulations. Our theory provides a quantitative basis to draw testable hypotheses on the sources of metabolite heterogeneity, which together with the ongoing efforts in single-cell metabolite measurements, will help to re-think metabolism as an active source of phenotypic variation.

Stochastic modelling and simulation
We built a fully stochastic model for the reaction scheme describing a metabolic reaction coupled with gene expression (Fig. 1A): All reactions are assumed to follow mass action kinetics. Model simulations were computed with Gillespie's algorithm 34 Table I. The parameters are selected in a physiologically realistic range respecting the scale separation of molecule numbers between mRNA (∼ 1 − 5 molecules), total enzyme abundance (∼ 100 molecules) and metabolites (∼ 1000 molecules).
To identify bimodality in Figure 2, we detected the existence of one or two peaks in a distribution and defined it as bimodal if the height of the smaller peak is a least 10% of the larger peak and the trough between peaks is at most 10% of the height of the smaller peak.   Figure 1A. We use realistic enzyme kinetic parameters 49 and fast promoter switching according to measured ranges 54 . The dilution rate δ constant corresponds to a doubling rate of approximately 46 minutes, typical in the Escherichia coli bacterium. The distributions in Figure 1C were

Analytical expressions for the metabolite distribution
To derive an analytic approximation for the probability to observe n p metabolites in a cell, we first use the law of total probability as shown in (1).
Distribution of the total enzyme -Because free enzymes and complexes degrade at the same rate and n etot = n e + n c is conserved by the metabolic reaction, in the slow timescale the enzyme distribution P(n etot ) follows the standard solution 27 of the three-stage model for gene expression: where Γ is the Gamma function and 2 F 1 is the ordinary hypergeometric function. The parameters are γ = (k on +k off )/δ and α ± = a + γ ± (a + γ) 2 − 4ak off /δ /2, with a = k tx /δ Conditional distribution for the metabolite -To compute the second term in (1), we observe that enzyme expression occurs on a much longer timescale than enzyme kinetics, and thus metabolites can be considered to be in a quasiequilibrium state of the catalytic reactions (R1)-(R2) and metabolite consumption (R6).
To explicitly compute the mixture components P(n p |n etot ), we assume that reversible binding between substrate and enzyme in reaction (R1) is much faster than the catalytic step and metabolite consumption 35 . In this limit, the metabolite number evolves according to the effective reactions: where k eff cat and k eff rev are effective propensities averaged over the fast fluctuating variables n c and n e : k eff cat = k cat E(n c |n etot , n p ), k eff rev = k rev E(n e |n etot , n p ), where E denotes the expectation operator. The derivation of the effective propensities in (9) corresponds to a particular case of a more general methodology for timescale separation in stochastic chemical systems 29,55,56 . Note that since the total enzyme levels are conserved in the catalytic timescale, it follows that E(n e |n etot , n p ) + E(n c |n etot , n p ) = n etot .
To derive the conditional expectations in (9), we write the first-order moment equation for the free enzyme n e , which according from equations (R1)-(R2) reads d dt E(n e |n etot , n p ) = − k 1 n s E(n e |n etot , n p ) + k −1 E(n c |n etot , n p ) + k cat E(n c |n etot , n p ) − k rev E(n e |n etot , n p )n p .
Under the assumption that the reversible binding of substrate and enzyme is much faster than the other processes, the first two terms dominate the right hand side of equation (11) and determine the enzyme-complex quasi-equilibrium. Equating these two terms and using the conservation relation in (10), we obtain E(n e |n etot , n p ) ≈ k −1 (k −1 + k 1 n s ) n etot , E(n c |n etot , n p ) ≈ k 1 n s k −1 + k 1 n s n etot , and thus both conditional expectations depend on n etot and are independent of the metabolite abundance. Therefore, the reactions in (8) correspond to a birth-death process with a zero-th order birth propensity k eff cat and two linear death propensities. The mixture components P(n p |n etot ) are thus Poissonian with parameter λ(n etot ) as shown in (3)-(4).

Comparison of PMM predictions and measured kinetic parameters
The PMM depends on the effective parameters λ ∞ and K, which are functions of five rate constants (k cat , k 1 , k −1 , k c and k rev ). Most of these parameters are not available, except k cat and K M = (k cat + k −1 )/k 1 . From equation (5) it follows which allows the computation of λ ∞ /K for measured k cat values in different saturation conditions and consumption rate constants. The red line in Figure 4B represents the λ ∞ / K ratio for a saturated enzyme ( = 1), fast consumption (k c = 100 × δ) and the median k cat ≈ 16.5 s −1 in E. coli 49 .
The top grey line is the case without consumption, i.e. metabolites are diluted by cell growth (k c = δ). Lower saturation ( = 0.2) moves the red line down the vertical axis (bottom grey line). The enzyme distribution in Figure 4B was produced with promoter switching parameters {k on , k off } = {1.56, 3} × 10 −4 s −1 , k tx = 0.025 s −1 , and k tl = 0.2 s −1 .
The boundaries between unimodal, bimodal and multimodal distributions were computed as follows. Unimodal distributions are those with a single maximum. Bimodal distributions were detected as in Figure 2. Multimodal distributions are those with at least one additional peak higher than a threshold of 1 × 10 −4 and the trough between neighbouring peaks at most 90 % of its height.